ERP on DB

Saturday, May 25, 2024

EBS forms failed by CrowdStrike

EBS Forms in our financial applications suddenly stopped working. The message on the web page was:

Failure of Web Server bridge:

No backend server available for connection: timed out after 10 seconds or idempotent set to OFF or method not idempotent.

The error message does not reveal the real problem. When I checked the services at the OS level, I saw that the Oracle EBS Forms service was not running, and the startup script $ADMIN_SCRIPTS_HOME/adstrtal.sh reported errors:

Forms service failed to start.

The Node Manager is already up.

ERROR: Unable to start up the managed server forms_server1

Server specific logs are located at $EBS_DOMAIN_HOME/servers/forms_server1/logs

05/13/24-20:56:26 :: admanagedsrvctl.sh: exiting with status 1

A Java error appears in the Forms log file $EBS_DOMAIN_HOME/servers/forms_server1/logs/forms_server1.out:

<May 13, 2024 8:56:25 PM EDT> <Emergency> <Store> <BEA-280060> <The persistent store "_WLS_forms_server1" encountered a fatal error, and it must be shut down: weblogic.store.PersistentStoreFatalException: [Store:280020]There was an error while reading from the log file

weblogic.store.PersistentStoreFatalException: [Store:280020]There was an error while reading from the log file

at weblogic.store.io.file.FileStoreIO.open(FileStoreIO.java:128)

at weblogic.store.internal.PersistentStoreImpl.recoverStoreConnections(PersistentStoreImpl.java:435)

at weblogic.store.internal.PersistentStoreImpl.open(PersistentStoreImpl.java:423)

at weblogic.store.admin.AdminHandler.activate(AdminHandler.java:126)

at weblogic.store.admin.FileAdminHandler.activate(FileAdminHandler.java:207)

Truncated.

Caused By: java.io.EOFException: premature EOF: expected=512, actual=126

at weblogic.store.io.file.StoreFile.readBulk(StoreFile.java:316)

at weblogic.store.io.file.Heap.readStoreFile(Heap.java:1142)

at weblogic.store.io.file.Heap.getNextRecoveryFile(Heap.java:1226)

at weblogic.store.io.file.Heap.open(Heap.java:373)

at weblogic.store.io.file.FileStoreIO.open(FileStoreIO.java:117)

Truncated.

It seems WebLogic failed to read a file, but the log did not say which one. I knew that the Linux admins had just done server maintenance and rebooted the server after applying the monthly patches and security updates at the OS level. That was the only recent change in the application environment.

After searching around, I found the Java errors match the description in Oracle Doc ID 3017110.1 ( Managed Forms Server Fails To Start - Displaying Message: FAILED_NOT_RESTARTABLE - ERROR: <BEA-280061> The persistent store "_WLS_forms_server1" could not be deployed: weblogic.store.PersistentStoreFatalException [Store:280020] ).

The document points out the problem is caused by CrowdStrike, which locks a Forms file in $EBS_DOMAIN_HOME/servers/forms_server#/data/store/default.

CrowdStrike is installed in /opt/CrowdStrike. It is owned by root, and it is running constantly on the Linux server.

$ ps -ef | grep falcon-sensor

root 1081 1079 0 May13 ? 00:22:23 falcon-sensor

The problem can be fixed temporarily by a workaround:

1. Delete or rename the .DAT file below (my guess is that CrowdStrike does not like the file name and so locks it)

$ cd $EBS_DOMAIN_HOME/servers/forms_server1/data/store/default

$ ls -altr

total 1028

drwxr-xr-x 4 user group 40 Sep 13 2023 ..

-rw-r--r-- 1 user group 1049088 May 13 20:51 _WLS_FORMS_SERVER1000000.DAT

drwxr-xr-x 2 user group 42 May 13 20:56 .

$ rm _WLS_FORMS_SERVER1000000.DAT

2. Restart the services cleanly with

$ADMIN_SCRIPTS_HOME/adstrtal.sh

$ADMIN_SCRIPTS_HOME/admanagedsrvctl.sh start forms_server1
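Renaming the store file instead of deleting it keeps a copy in case someone wants to inspect it later. A minimal sketch of step 1 done that way; the function name set_aside_store_files and the ".bad_<timestamp>" suffix are my own illustration and do not come from the Oracle note:

```shell
# Move every WebLogic file-store .DAT file in a directory aside by
# appending a timestamp suffix, so the managed server creates a new one.
# set_aside_store_files is a hypothetical helper name.
set_aside_store_files() {
    store_dir=$1
    stamp=$(date +%Y%m%d_%H%M%S)
    for f in "$store_dir"/_WLS_*.DAT; do
        [ -f "$f" ] || continue          # no match: the glob stays literal
        mv "$f" "$f.bad_$stamp" && echo "renamed: $f"
    done
}

# Usage (as the applications owner, with the Forms server stopped):
# set_aside_store_files "$EBS_DOMAIN_HOME/servers/forms_server1/data/store/default"
```

The renamed copies can be removed once Forms has been confirmed to start normally.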

The permanent fix is for the Security team to roll back the CrowdStrike change (and not apply it again until CrowdStrike fixes the problem), because the new update wrongly touches an Oracle Forms data file during its scan.

Saturday, April 20, 2024

Scripts for start & stop EBS services

When a server reboots for maintenance or after unexpected downtime, we want EBS services to be brought down and up automatically. Sometimes we also want to schedule an EBS downtime with a cron job. For this automation I wrote two shell scripts that call the EBS start/stop scripts. They also write log files for tracing each script's last run.

Assume that a solid $HOME/.profile for setting up R12.2 environment variables (such as $isMaster) and a file $HOME/xxx_scripts/.EBSpassenv holding key passwords exist on the server.

$ more .EBSpassenv

export APPS_PWD=apps#@PWD

export SYSTEM_PWD=system%_PWD

export WLS_ADMIN=wls$%^PWD
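Because this file holds clear-text passwords, it is worth making sure only its owner can read it. A small sketch of such a check; the function name is_private_file is hypothetical, and mode 600 is my assumption of a sensible setting rather than something the EBS scripts require:

```shell
# Return 0 only if the file is readable/writable by its owner alone
# (mode -rw-------). is_private_file is a hypothetical helper name.
is_private_file() {
    [ -f "$1" ] || return 1
    mode=$(ls -ld "$1" | cut -c1-10)     # first 10 characters: type + permissions
    [ "$mode" = "-rw-------" ]
}

# Usage before sourcing the password file:
# is_private_file "$HOME/xxx_scripts/.EBSpassenv" || chmod 600 "$HOME/xxx_scripts/.EBSpassenv"
```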

A Linux admin can create a short script owned by root in directory /etc/init.d (or /etc/rc.d/init.d), which will be called during server reboot to execute auto_stopall.sh for automatic shutdown and auto_startall.sh for automatic startup.
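The core of such a root-owned script could look like the sketch below. The account name in EBS_USER, the path in SCRIPT_DIR, the function name ebs_service and the DRY_RUN switch are all assumptions for illustration; the init-script header lines that the admins need for their run levels are left out:

```shell
# Sketch of the dispatch logic for an init script that calls the two
# reboot scripts. EBS_USER and SCRIPT_DIR are assumed values.
EBS_USER=${EBS_USER:-applmgr}
SCRIPT_DIR=${SCRIPT_DIR:-/u02/app/xxx_scripts/reboot_scripts}

ebs_service() {
    case "$1" in
        start) script=auto_startall.sh ;;
        stop)  script=auto_stopall.sh ;;
        *)     echo "Usage: ebs_service {start|stop}" >&2; return 2 ;;
    esac
    # "su - user" runs the script from a login shell of the EBS owner
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "su - $EBS_USER -c $SCRIPT_DIR/$script"   # only show the command
    else
        su - "$EBS_USER" -c "$SCRIPT_DIR/$script"
    fi
}

# In the real init script the last line would be:  ebs_service "$1"
```

Setting DRY_RUN=1 prints the command instead of running it, which is handy for checking the paths before the next reboot.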

============ script auto_startall.sh ============

# Start all EBS services

DT=$(date +"%h %d, %y %H:%M")

RUNLOG="$HOME/xxx_scripts/reboot_scripts/reboot_start.log"

RUNLOG_ERR="$HOME/xxx_scripts/reboot_scripts/reboot_start_Error.log"

if [ -f $RUNLOG ]; then

mv $RUNLOG ${RUNLOG}_old

fi

if [ -f $RUNLOG_ERR ]; then

mv $RUNLOG_ERR ${RUNLOG_ERR}_old

fi

exec 1>$RUNLOG

exec 2>$RUNLOG_ERR

sleep 2

echo "Running at $DT"

. $HOME/.profile

. $HOME/xxx_scripts/.EBSpassenv

ps -ef | grep $LOGNAME # check current status of EBS services

# for R12.1

# $ADMIN_SCRIPTS_HOME/adstrtal.sh apps/$APPS_PWD@$TWO_TASK

# for R12.2

if [ "$isMaster" = "enabled" ]; then ## $isMaster is defined in .profile

{ echo apps ; echo "$APPS_PWD" ; echo "$WLS_ADMIN" ; } | $ADMIN_SCRIPTS_HOME/adstrtal.sh @ -mode=allnodes -nopromptmsg

else

{ echo apps ; echo "$APPS_PWD" ; echo "$WLS_ADMIN" ; } | $ADMIN_SCRIPTS_HOME/adstrtal.sh @ -msimode -nopromptmsg

fi

echo 'sleep 10 seconds'

sleep 10

exit 0

============= end ============

NOTES for concurrent (CM) server/node:

1. Even if the WLS AdminServer is not started and running on the Primary node, adstrtal.sh will fully start concurrent managers on a CM node (where only s_batch_status is "enabled" in CONTEXT_FILE).

2. If the CM node crashed and services were not stopped gracefully, adstrtal.sh may not be able to start concurrent managers on the CM node next time. Instead, all FNDLIBR processes may start and run on the Primary node (even though s_batch_status is "disabled" there). If that happens, you have to stop services on all nodes, and then run adstrtal.sh on the CM node first to start concurrent services there. Running " adcmctl.sh start " on its own will not do much in R12.2.

3. If adstpall.sh has trouble stopping CM processes, running " adcmctl.sh stop " may help.

========== script auto_stopall.sh =========

# Stop all EBS services. It may take 10 minutes for all apps processes shutdown.

DT=$(date +"%h %d, %y %H:%M")

RUNLOG="$HOME/xxx_scripts/reboot_scripts/reboot_stop.log"

RUNLOG_ERR="$HOME/xxx_scripts/reboot_scripts/reboot_stop_Error.log"

if [ -f $RUNLOG ]; then

mv $RUNLOG ${RUNLOG}_old

fi

if [ -f $RUNLOG_ERR ]; then

mv $RUNLOG_ERR ${RUNLOG_ERR}_old

fi

exec 1>$RUNLOG

exec 2>$RUNLOG_ERR

echo "Running at $DT"

. $HOME/.profile

. $HOME/xxx_scripts/.EBSpassenv

ps -ef | grep $LOGNAME

echo "shutting down ..."

# for R12.1

# $ADMIN_SCRIPTS_HOME/adstpall.sh apps/$APPS_PWD

{ echo apps ; echo "$APPS_PWD" ; echo "$WLS_ADMIN" ; } | $ADMIN_SCRIPTS_HOME/adstpall.sh @ -nopromptmsg

echo 'sleep 20 seconds'

sleep 20

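PNUM=$(ps -ef | grep $LOGNAME | egrep -i 'FNDLIB|FNDSM' | wc -l)

if [ $PNUM -gt 1 ]; then

echo 'sleep 90 seconds more...'

sleep 90

fi

# only check upper case and assume $TWO_TASK is in the $ORACLE_HOME path

PNUM=$(ps -ef | grep $LOGNAME | grep $TWO_TASK | egrep 'FNDLIB|FNDSM' | wc -l)

if [ $PNUM -gt 1 ]; then

echo 'sleep 30 seconds'

sleep 30

fi

PNUM=$(ps -ef | grep $LOGNAME | grep $TWO_TASK | egrep 'FNDLIB|FNDSM' | wc -l)

if [ $PNUM -gt 1 ]; then

echo 'sleep 30 seconds more ...'

sleep 30

fi

PNUM=$(ps -ef | grep $LOGNAME | grep $TWO_TASK | egrep 'FNDLIB|FNDSM' | wc -l)

if [ $PNUM -gt 1 ]; then

echo 'sleep 15 seconds more ...'

sleep 15

fi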

ps -ef | grep $LOGNAME

exit 0

============= end ============
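The chain of sleep-and-check steps in auto_stopall.sh can also be written as one polling loop with a timeout. This is only a sketch of that idea, not what the script above does; the function name wait_for_zero and its three arguments are my own:

```shell
# Poll until a counting command reports zero, or give up after max_wait
# seconds. wait_for_zero is a hypothetical helper name.
#   $1 = command that prints a number
#   $2 = maximum seconds to wait
#   $3 = seconds between checks
wait_for_zero() {
    count_cmd=$1; max_wait=$2; interval=$3
    waited=0
    while [ "$(eval "$count_cmd")" -gt 0 ]; do
        if [ "$waited" -ge "$max_wait" ]; then
            return 1                     # still running after the timeout
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Usage: wait up to 3 minutes for FNDLIBR processes to exit.
# The [F] bracket keeps grep from counting its own command line.
# wait_for_zero "ps -ef | grep -c '[F]NDLIBR'" 180 15 || echo "CM still running"
```

With a loop like this the script stops waiting as soon as the processes are gone instead of always sleeping for the full interval.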

Saturday, April 6, 2024

Use .profile in Linux to customize the shell prompt

When you have many EBS instances in a multi-node environment, it is very useful to have the Linux prompt display the current user ID, the server name and the current path. A custom .profile saved under $HOME works very well for me. Its colors tell whether you are on an Admin node or not, and whether you are in a production environment or not (assuming the last character of a production server's name is "p").

For a Linux account, the environment variable $HOME is defined in the file /etc/passwd. But if the account was created in AD (Active Directory), the default value of $HOME is defined in the "Home Directory" section of AD. To avoid any confusion, the value of $HOME can be set explicitly by "export" in the file ~applMgr/.profile.

Our EBS applMgr accounts use Korn shell which uses two startup files under $HOME, the .profile and the .kshrc. During a session start, .profile is first read once, then .kshrc (if it exists) is read by each new ksh. e.g. :

$ echo $SHELL

/bin/ksh

$ echo $0

-ksh

$ which ksh

/usr/bin/ksh

$ more .kshrc

alias ftp="print 'Reminder: Use sftp instead of \\\ftp'"

echo "This is .kshrc"

$ ksh

This is .kshrc

$ ftp

Reminder: Use sftp instead of \ftp

============= $HOME/.profile =============

PATH=/bin:/usr/bin:/usr/local/bin

export PATH

MANPATH=/usr/share/man:/usr/local/share/man

export MANPATH # for man manual

EDITOR=/bin/vi

export EDITOR

# ENV=$HOME/.kshrc

# export ENV

. /u02/app/EBSPROD/EBSapps.env RUN # R12.2 env file

. /u02/app/xxx_scripts/.EBSpassenv # password file (custom)

isMaster="no"

if [ -n "$APPS_VERSION" ] && [ "${APPS_VERSION:0:4}" = "12.2" ]

then

s_status=$(cat $CONTEXT_FILE | grep -i s_adminserverstatus)

isMaster="${s_status:60:7}"

if [ "$isMaster" = "enabled" ] # on admin/primary node

then

if [ "$(echo -n ${HOSTNAME%%.*} | tail -c 1)" != "p" ]

# last character of server name is not "p" => non-production server

then

PS1=$'\n\e[0;31m$USER@${HOSTNAME%%.*}[$TWO_TASK]\e[m$PWD\n-->$ '

else # on production server: Red, and Green color on PWD

PS1=$'\n\e[0;31m$USER@${HOSTNAME%%.*}[$TWO_TASK]\e[m\E[32m$PWD \E[0m\n-->$ '

fi

else # on other node(s)

if [ "$(echo -n ${HOSTNAME%%.*} | tail -c 1)" != "p" ]

# on non-production server

then

PS1=$'\n$USER@${HOSTNAME%%.*}[$TWO_TASK]$PWD\n-->$ '

else # on production server

PS1=$'\n$USER@${HOSTNAME%%.*}[$TWO_TASK]\E[32m$PWD \E[0m\n-->$ '

fi

fi

fi

alias rm='rm -i'

stty erase ^?

umask u=rwx,g=rwx,o=rx

================ end =================
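The .profile above reads the admin server status from a fixed character position (60) in the context file line, which breaks if the indentation of that line changes. A more tolerant sketch is shown below. It assumes the element sits on a single line in the form <tag oa_var="s_name" ...>value</tag>; the function names ctx_value and is_prod_host are hypothetical:

```shell
# Print the value of a context variable from a context file.
#   $1 = context file, $2 = variable name such as s_adminserverstatus
# ctx_value is a hypothetical helper name.
ctx_value() {
    sed -n 's/.*oa_var="'"$2"'"[^>]*>\([^<]*\)<.*/\1/p' "$1" | head -n 1
}

# Return 0 when the short host name ends with "p" (the production
# naming convention assumed in this post).
is_prod_host() {
    short=${1%%.*}                       # strip the domain part
    case "$short" in
        *p) return 0 ;;
        *)  return 1 ;;
    esac
}

# Usage inside .profile:
# isMaster=$(ctx_value "$CONTEXT_FILE" s_adminserverstatus)
# if is_prod_host "$HOSTNAME"; then echo "production"; fi
```

The case pattern does the same job as the echo/tail pipeline without starting extra processes at every login.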

On an Admin node in production env, the prompt looks like this:

applMgr@server_1p[EBSPROD]/u02/app

-->$

applMgr@server_1p[EBSPROD]/u02/app

-->$ echo $USER

applMgr

applMgr@server_1p[EBSPROD]/u02/app

-->$ echo $TWO_TASK

EBSPROD

applMgr@server_1p[EBSPROD]/u02/app

-->$ cd $TWO_TASK

applMgr@server_1p[EBSPROD]/u02/app/EBSPROD

-->$ echo $HOME

/u02/app

applMgr@server_1p[EBSPROD]/u02/app/EBSPROD

-->$ ls

EBSapps.env fs1 fs2 fs_ne

Wednesday, March 6, 2024

script to check if a password is expiring

The environment variable $HOME for a Linux account is defined by file /etc/passwd if the account was not created in AD (Active Directory). Each account has an entry line in file /etc/passwd. For example, I can get my account's password expiration date by:

$ echo $HOME

/u02/app

$ whoami

applmgr

$ grep applmgr /etc/passwd

applmgr:x:50378:102:Oracle EBS ID - J Y:/u02/app:/bin/ksh

$ expstr=$( chage -l $(whoami) | grep "^Password expires" | awk -F: '{ print $(NF) }' | sed -e 's/^ *//g; s/ *$//g;' )

$ echo "password for account `whoami` will expire on $expstr"

password for account applmgr will expire on Jul 30, 2025

But, if the account was created by Windows AD (Active Directory), the variable $HOME is defined in AD by "Home Directory" (Note: an entry in .profile or such could change $HOME to a different path immediately after login). ADHelp search page may show info:

Unix Account

Home Directory: /users/applmgr

Login Shell: /bin/ksh

In that case, "chage" will give a different result:

$ echo $HOME

/users/applmgr

$ expstr=$( chage -l $(whoami) | grep "^Password expires" | awk -F: '{ print $(NF) }' | sed -e 's/^ *//g; s/ *$//g;' )

chage: user 'applmgr' does not exist in /etc/passwd
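A script can test for this case before it calls chage. A minimal sketch; the function name is_local_account is hypothetical, and the optional second argument exists only so the function can be tried against a copy of the passwd file:

```shell
# Return 0 if the user has an entry in the local passwd file, which is
# where chage looks. is_local_account is a hypothetical helper name.
#   $1 = account name, $2 = passwd file (default /etc/passwd)
is_local_account() {
    passwd_file=${2:-/etc/passwd}
    grep -q "^$1:" "$passwd_file"
}

# Usage:
# if is_local_account "$(whoami)"; then
#     chage -l "$(whoami)"
# else
#     echo "$(whoami) is not a local account; check its expiry in AD"
# fi
```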

For an important account created in Linux (vs. an AD account), I wrote a script to email a warning before its password expires. It can be run by a cron job, such as

30 12 * * * /path/to/xxxx_scripts/checkPWDexpire.sh 2>&1

============= script checkPWDexpire.sh =============

let secs_per_day=60*60*24

nowtime=$( date +%s )

expstr=$( chage -l $(whoami) | grep "^Password expires" | awk -F: '{ print $(NF) }' | sed -e 's/^ *//g; s/ *$//g;' )

echo "DEBUG: expstr is $expstr"

if [ "$expstr" = "never" ]; then

echo "Password never expires.";

exit 0;

fi

exptime=$( date --date "$expstr" +%s )

if [ -z "$exptime" ] || [ "$exptime" -lt 1 ]; then

echo "Something is wrong.";

exit 255; # Or, email a message out

fi

if [ "$exptime" -lt "$nowtime" ]; then

echo "Password already expired.";

exit 1; # Or, email a message out

fi

secs_til_exp=$(expr $exptime - $nowtime)

days_til_exp=$(expr $secs_til_exp / $secs_per_day)

echo "Password expires in $days_til_exp days."

if [ "$days_til_exp" -lt 6 ]; then

# send email out

echo "Please reset password manually and update 3rd party environments." | mailx -s "`whoami` on `uname -n` will expire in $days_til_exp days" me@email.com

# or

# mailx -s "`whoami` on `uname -n` will expire in $days_til_exp days" -a aFile.log me@email.com < aFile.log

else

echo "All is fine.";

fi

exit ;

============== end =====================
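The date arithmetic in the script can be pulled into one small function, which makes it easy to try with a fixed "now" value. This is a sketch only: the function name days_until is hypothetical, and like the script it relies on GNU date for the --date option:

```shell
# Print the whole number of days from "now" until the given date string,
# truncated toward zero. Requires GNU date. days_until is a hypothetical
# helper name.
#   $1 = date string as printed by chage, e.g. "Jul 30, 2025"
#   $2 = "now" in epoch seconds (optional, defaults to the current time)
days_until() {
    exp_epoch=$(date --date "$1" +%s) || return 1   # unparsable date
    now_epoch=${2:-$(date +%s)}
    echo $(( (exp_epoch - now_epoch) / 86400 ))
}

# Usage:
# days_left=$(days_until "$expstr") || echo "cannot parse: $expstr"
```

A non-zero return status tells the caller that date could not parse the string, so the "Something is wrong" case can be handled without looking at the number.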

"chage" Linux command:

If OS user applmgr is granted sudo, it can act as root to check another account's status or change password status.

$ sudo su -

[sudo] password for applmgr:

Last login: Mon Mar 28 03:22:57 EDT xxxx

Hostname: server_name.domain.com

OS: Red Hat Enterprise Linux release 8.10 (Ootpa)

Arch: x86_64

[root@server_name ~]# chage -E -1 batch_mgr # -1 <== number

Note: passing the number -1 as the Expire Date (-E) only removes the account expiration date; it does not un-expire the password.

[root@server_name ~]# chage -l batch_mgr # -l <== --list

Last password change : Feb 14, 2023

Password expires : May 15, 2023

Password inactive : Jun 14, 2023

Account expires : never

Minimum number of days between password change : 7

Maximum number of days between password change : 90

Number of days of warning before password expires : 7

[root@server_name ~]# chage -M -1 batch_mgr

Note: passing the number -1 as MAX DAYS (-M) removes the check on password validity, which turns off the password aging properties. Now batch_mgr can use its existing password to log in.

[root@server_name ~]# chage -l batch_mgr

Last password change : Feb 14, 2023

Password expires : never

Password inactive : never <= never be deactivated due to inactivity

Account expires : never

Minimum number of days between password change : 7

Maximum number of days between password change : -1

Number of days of warning before password expires : 7

[root@server_name ~]# chage -l applmgr # b/c applmgr was originally created in AD

chage: user 'applmgr' does not exist in /etc/passwd

To change a user's password as root:

[root@server_name ~]# passwd batch_mgr

Enter new UNIX password:

... ...

Saturday, February 17, 2024

Shell script for renewing ssl certificate

My post Re-new R12.2 ssl certificate has details on how to renew a certificate. A shell script helps a lot when there are many EBS instances waiting for renewal. I wrote the script below, which takes only one minute to renew the cert on each node after the certificate has been renewed on the Venafi website and downloaded/copied to the Linux server.

As of today, we still have difficulties using .yaml file to extract certificate from Venafi server to Linux server automatically. We tried to set up a "push" way on Venafi website to do the automation. But if the password is changed on the Linux account, the push will fail.

============= Script renew_cert.sh ============

# Script for renewing ssl certificate after new cert file is saved to Linux server

walletpwd='putPWDhere'

# walletpwd='tttest'

walletloc=$HOME/xxx/Certs_Renew # path where the Venafi cert file is saved

walletname='ewallet.p12' # Must name Venafi cert file to this name

certname='cwallet.sso'

echo "cert at: $walletloc"

echo "cert name: $walletname"

echo $walletpwd

cd $walletloc

errorC=`env| grep RUN_BASE | wc -l`

if [ $errorC -lt 1 ]; then

echo "No R12.2 environment"

exit 1

fi

# . $HOME/EBSQA/EBSapps.env RUN

orapki() { $FMW_HOME/oracle_common/bin/orapki "$@"; } # a function also works in shells that do not expand aliases in scripts

orapki wallet display -wallet $walletloc/$walletname -pwd $walletpwd > viewCert.log

errorC=`egrep -i 'PKI-' viewCert.log | wc -l`

echo "Error: $errorC"

if [ $errorC -gt 0 ]; then

echo "The password is incorrect or the Venafi cert file is incorrect."

exit 2

fi

DT=`date +"%h_%d_%y_%H%M"`

if [ -f $certname ]; then

mv $certname ${certname}_${DT}

fi

orapki wallet create -wallet $walletloc/$walletname -pwd $walletpwd -auto_login

if [ ! -f $certname ]; then

echo "Failure in getting new cert file. Exiting."

exit 3

fi

echo " "

echo "Copy cert file to directories ..."

cd $NE_BASE/inst/$CONTEXT_NAME/certs # save a copy in this folder

if [ -d Apache ]; then

mv Apache Apache_${DT}

fi

mkdir Apache

cd Apache

pwd

cp -p $walletloc/$walletname ${walletname}

cp -p $walletloc/$certname ${certname}

iName=$(tr < $CONTEXT_FILE '<>' ' ' | awk '/"s_ohs_instance"/ {print $(NF-1)}' )

SUBiName=${iName%?????}

cd $FMW_HOME/webtier/instances/$iName/config/OPMN/opmn/wallet

pwd

if [ -f $certname ]; then

mv $certname ${certname}_${DT}

fi

cp -p $walletloc/$certname ${certname}

cd $FMW_HOME/webtier/instances/$iName/config/OHS/$SUBiName/keystores/default

pwd

if [ -f $certname ]; then

mv $certname ${certname}_${DT}

fi

cp -p $walletloc/$certname ${certname}

cd $FMW_HOME/webtier/instances/$iName/config/OHS/$SUBiName/proxy-wallet

pwd

if [ -f $certname ]; then

mv $certname ${certname}_${DT}

fi

cp -p $walletloc/$certname ${certname}

echo " "

echo "Recycle Apache service..."

cd $ADMIN_SCRIPTS_HOME

./adopmnctl.sh stop

sleep 10

./adopmnctl.sh status

./adapcctl.sh start

./adopmnctl.sh status

echo "Paths for log files:"

echo $FMW_HOME/webtier/instances/$iName/diagnostics/logs/OHS/$SUBiName

echo $FMW_HOME/webtier/instances/$iName/diagnostics/logs/OPMN/opmn

============ End ==========
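The script repeats the same "back up the old file, then copy the new one" step for three directories. That step could be written once as a function, as in the sketch below; the name install_with_backup is hypothetical, and this is an alternative layout rather than what renew_cert.sh does today:

```shell
# If dest_dir already holds a file with the same name as src, rename it
# with a suffix, then copy src in, preserving mode and timestamps.
# install_with_backup is a hypothetical helper name.
#   $1 = source file, $2 = destination directory, $3 = suffix (e.g. $DT)
install_with_backup() {
    src=$1; dest_dir=$2; stamp=$3
    name=$(basename "$src")
    if [ -f "$dest_dir/$name" ]; then
        mv "$dest_dir/$name" "$dest_dir/${name}_$stamp" || return 1
    fi
    cp -p "$src" "$dest_dir/$name"
}

# Usage with the variables from renew_cert.sh:
# for d in \
#     "$FMW_HOME/webtier/instances/$iName/config/OPMN/opmn/wallet" \
#     "$FMW_HOME/webtier/instances/$iName/config/OHS/$SUBiName/keystores/default" \
#     "$FMW_HOME/webtier/instances/$iName/config/OHS/$SUBiName/proxy-wallet"
# do
#     install_with_backup "$walletloc/$certname" "$d" "$DT" || exit 4
# done
```

Because the function returns a non-zero status when the rename or the copy fails, the loop can stop before Apache is recycled with a half-updated set of wallets.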

Run the script to renew certificate on each node:

$ ./renew_cert.sh

cert at: $HOME/temp/Certs_Renew

cert name: ewallet.p12

putPWDhere

Error: 0

Oracle PKI Tool : Version 11.1.1.9.0

Copy cert file to directories ...

/u04/app/EBSQA/fs_ne/inst/EBSQA_nodeName/certs/Apache

$FMW_HOME/webtier/instances/EBS_web_EBSQA_OHS1/config/OPMN/opmn/wallet

$FMW_HOME/webtier/instances/EBS_web_EBSQA_OHS1/config/OHS/EBS_web_EBSQA/keystores/default

$FMW_HOME/webtier/instances/EBS_web_EBSQA_OHS1/config/OHS/EBS_web_EBSQA/proxy-wallet

Recycle Apache service ...

You are running adopmnctl.sh version 120.0.12020000.2

Stopping Oracle Process Manager (OPMN) and the managed processes ...

opmnctl stopall: stopping opmn and all managed processes...

adopmnctl.sh: exiting with status 0

adopmnctl.sh: check the logfile $LOG_HOME/appl/admin/log/adopmnctl.txt for more information ...

You are running adopmnctl.sh version 120.0.12020000.2

Checking status of OPMN managed processes...

opmnctl status: opmn is not running.

adopmnctl.sh: exiting with status 0

adopmnctl.sh: check the logfile $LOG_HOME/appl/admin/log/adopmnctl.txt for more information ...

You are running adapcctl.sh version 120.0.12020000.6

Starting OPMN managed Oracle HTTP Server (OHS) instance ...

adapcctl.sh: exiting with status 0

adapcctl.sh: check the logfile $LOG_HOME/appl/admin/log/adapcctl.txt for more information ...

You are running adopmnctl.sh version 120.0.12020000.2

Checking status of OPMN managed processes...

Processes in Instance: EBS_web_ARQA_OHS1

--------------------------------+--------------------+---------+---------

ias-component | process-type | pid | status

--------------------------------+--------------------+---------+---------

EBS_web_EBSQA | OHS | 14542 | Alive

adopmnctl.sh: exiting with status 0

adopmnctl.sh: check the logfile $LOG_HOME/appl/admin/log/adopmnctl.txt for more information ...

Paths for log files:

$FMW_HOME/webtier/instances/EBS_web_EBSQA_OHS1/diagnostics/logs/OHS/EBS_web_EBSQA

$FMW_HOME/webtier/instances/EBS_web_EBSQA_OHS1/diagnostics/logs/OPMN/opmn

Check files in folder $HOME/xxx/Certs_Renew:

$ ls

renew_cert.sh

ewallet.p12

ewallet.p12.lck

cwallet.sso.lck

viewCert.log

cwallet.sso

NOTES: there are also cert files in $EBS_DOMAIN_HOME/opmn/EBS_web_EBSQA_OHS1/wallet and $EBS_DOMAIN_HOME/opmn/EBS_web_EBSQA_OHS1/EBS_web/wallet, but I do not know what uses them.
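One way to see whether those extra copies are the same wallet as the renewed one is to compare checksums. A small sketch; the function name list_wallet_copies is hypothetical:

```shell
# Print a cksum line (checksum, size, path) for every cwallet.sso found
# under a directory, so copies matching the renewed wallet can be spotted.
# list_wallet_copies is a hypothetical helper name.
list_wallet_copies() {
    find "$1" -type f -name cwallet.sso -exec cksum {} \;
}

# Usage:
# cksum "$HOME/xxx/Certs_Renew/cwallet.sso"
# list_wallet_copies "$EBS_DOMAIN_HOME/opmn"
```

A copy whose checksum differs from the renewed wallet was not replaced by the script, which at least narrows down which files to ask about.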