Saturday, May 25, 2024

EBS forms failed by CrowdStrike

EBS Forms in our financial applications suddenly does not work. The message on the webpage is
Failure of Web Server bridge:
No backend server available for connection: timed out after 10 seconds or idempotent set to OFF or method not idempotent. 
 
The error message does not tell the true problem. When checking into services on OS level, I saw Oracle EBS Forms service was not running and also saw errors from startup script $ADMIN_SCRIPTS_HOME/adstrtal.sh:

Forms service failed to start. 
The Node Manager is already up.
ERROR: Unable to start up the managed server forms_server1
Server specific logs are located at $EBS_DOMAIN_HOME/servers/forms_server1/logs
05/13/24-20:56:26 :: admanagedsrvctl.sh: exiting with status 1

Java error exists in Forms log file $EBS_DOMAIN_HOME/servers/forms_server1/logs/forms_server1.out

<May 13, 2024 8:56:25 PM EDT> <Emergency> <Store> <BEA-280060> <The persistent store "_WLS_forms_server1" encountered a fatal error, and it must be shut down: weblogic.store.PersistentStoreFatalException: [Store:280020]There was an error while reading from the log file
weblogic.store.PersistentStoreFatalException: [Store:280020]There was an error while reading from the log file
        at weblogic.store.io.file.FileStoreIO.open(FileStoreIO.java:128)
        at weblogic.store.internal.PersistentStoreImpl.recoverStoreConnections(PersistentStoreImpl.java:435)
        at weblogic.store.internal.PersistentStoreImpl.open(PersistentStoreImpl.java:423)
        at weblogic.store.admin.AdminHandler.activate(AdminHandler.java:126)
        at weblogic.store.admin.FileAdminHandler.activate(FileAdminHandler.java:207)
        Truncated.
Caused By: java.io.EOFException: premature EOF: expected=512, actual=126
        at weblogic.store.io.file.StoreFile.readBulk(StoreFile.java:316)
        at weblogic.store.io.file.Heap.readStoreFile(Heap.java:1142)
        at weblogic.store.io.file.Heap.getNextRecoveryFile(Heap.java:1226)
        at weblogic.store.io.file.Heap.open(Heap.java:373)
        at weblogic.store.io.file.FileStoreIO.open(FileStoreIO.java:117)
        Truncated.

Seems WebLogic failed to open a file, but the log did not say which file. I knew that Linux Admins just did server maintenance and rebooted server after they applied monthly patches and Security updates on OS level. That was the only change in the application environment recently.

After searching around, I found the Java errors match the description in Oracle Doc ID 3017110.1 ( Managed Forms Server Fails To Start - Displaying Message: FAILED_NOT_RESTARTABLE - ERROR: <BEA-280061> The persistent store "_WLS_forms_server1" could not be deployed: weblogic.store.PersistentStoreFatalException [Store:280020] ). 

The document points out the problem is caused by CrowdStrike, which locks a Forms file in $EBS_DOMAIN_HOME/servers/forms_server#/data/store/default.

CrowdStrike is installed in /opt/CrowdStrike. It is owned by root, and it is running constantly on the Linux server.
$ ps -ef | grep falcon-sensor
root      1081  1079  0 May13 ?        00:22:23 falcon-sensor

The problem can be fixed temporarily by a workaround:

1. Delete/rename below .DAT file (I guess CrowdStrike does not like the file name and so locks it)
$ cd $EBS_DOMAIN_HOME/servers/forms_server1/data/store/default
$ ls -altr
total 1028
drwxr-xr-x 4 user group      40 Sep 13  2023 ..
-rw-r--r-- 1   user group  1049088 May 13 20:51 _WLS_FORMS_SERVER1000000.DAT
drwxr-xr-x 2 user group     42 May 13 20:56 .
$ rm _WLS_FORMS_SERVER1000000.DAT

2. Re-start services cleanly by
$ADMIN_SCRIPTS_HOME/adstrtal.sh
Or
$ADMIN_SCRIPTS_HOME/admanagedsrvctl.sh start forms_server1

The permanent fix is that the Secure team rolls back the CrowdStrike change (and applies it again until CrowdStrike fixes the problem), because its new update touches an Oracle Forms data file wrongly during its scan. 

No comments: