Wednesday, June 29, 2022

FS_CLONE failed with SSL cert file and ADOP troubleshooting

When I ran "adop phase=FS_CLONE" to sync the RUN file system on /fs1 to the PATCH file system on /fs2 on a concurrent node where Web/Forms services are not enabled, it failed with a generic error (taken from "adopscanlog -latest=yes"):

FUNCTION: main::runFSCloneApply [ Level 1 ]
ERRORMSG: $COMMON_TOP/adopclone_nodeName/bin/adclone.pl did not go through successfully.

Log file $NE_BASE/EBSapps/log/adop/6/xxx/fs_clone/xxm4p/TXK_SYNC_create/fsclone_apply/ohsT2PApply/CLONE2022-10-14_16-38-02_839829939.log points to an error on /fs2. It seems fs_clone had a problem starting the Apache service on /fs2.

Error Message  :1  
[PLUGIN][OHS] - ERROR - Oct 14, 2022 16:38:27 - CLONE-26009   OHS T2P failed.
[PLUGIN][OHS] - CAUSE - Oct 14, 2022 16:38:27 - CLONE-26009   Unable to start OS component.
[PLUGIN][OHS] - ACTION - Oct 14, 2022 16:38:27 - CLONE-26009   Check clone log and error file and ohs log file $PATCH_BASE/FMW_Home/webtier/instances/EBS_web_OHS3/diagnostics/logs/OHS/EBS_web/console~OHS~1.log for root cause.
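The ACTION text names the OHS console log to check next. A minimal Python sketch for pulling that path out of a CLONE-26009 message, assuming the wording "ohs log file <path>" exactly as it appears in the excerpt above; ohs_log_path is a hypothetical helper, not part of any Oracle tooling:

```python
import re
from typing import Optional

# Matches the path that follows the phrase "ohs log file" in the ACTION text.
_OHS_LOG = re.compile(r"ohs log file\s+(\S+)")


def ohs_log_path(message: str) -> Optional[str]:
    """Return the OHS log path named in a CLONE-26009 ACTION message, if any."""
    match = _OHS_LOG.search(message)
    return match.group(1) if match else None


# Sample text copied from the log excerpt in this post.
action = (
    "[PLUGIN][OHS] - ACTION - Oct 14, 2022 16:38:27 - CLONE-26009   "
    "Check clone log and error file and ohs log file "
    "$PATCH_BASE/FMW_Home/webtier/instances/EBS_web_OHS3/diagnostics/logs/"
    "OHS/EBS_web/console~OHS~1.log for root cause."
)
print(ohs_log_path(action))
```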

After I copied a good cwallet.sso file, which is required for enabling SSL, to the right directories, FS_CLONE worked. Most likely, the certificate in cwallet.sso had expired and the wallet had not been updated.

So the fs_clone process tests starting the Apache service on /fs2 even when Web services are not enabled on the RUN file system /fs1 of that node.

Looking further, I found this message in log file $PATCH_BASE/inst/apps/EBSDEV_xxm4p/logs/appl/rgf/TXK/txkSetAppsConf_10141635.log on /fs2:
ERROR: The value <$PATCH_BASE/FMW_Home/webtier/instances/EBS_web_OHS3> for s_ohs_instance_loc in <CONTEXT_FILE on /fs2>

This message is false and misleading.

Troubleshoot ADOP error

Another FS_CLONE failure, this time on the primary node, gave the same error:
FUNCTION: main::runFSCloneApply

To troubleshoot it, I ran "adopscanlog" to get the list of errors. Note that "adopscanlog -latest=yes" may not show the most important errors at the top of its output; it seems to list errors in timestamp order.
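Since the scan summary may not put the most relevant error first, one option is to read an individual log top to bottom and collect its error lines in file order. A minimal Python sketch, assuming the log is plain text and that the lines of interest contain the literal text "ERROR:" (as in the excerpts in this post); error_lines is a hypothetical helper, not part of adop:

```python
from typing import Iterable, List, Tuple


def error_lines(lines: Iterable[str], marker: str = "ERROR:") -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line containing `marker`, in file order."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if marker in line:
            hits.append((number, line.strip()))
    return hits


# In-memory sample; the two ERROR lines are copied from a log quoted in this post,
# the first line is made up as filler.
sample = [
    "START: some earlier step\n",
    "ERROR: Number of servers are not in sync between Run and Patch Context.\n",
    "ERROR: Managed Servers would be in-sync at apply phase\n",
]
for number, text in error_lines(sample):
    print(f"{number}: {text}")
```

To scan a real log, pass an open file object instead of the sample list, for example error_lines(open(path, errors="replace")).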

Log file $NE_BASE/EBSapps/log/adop/6/20221201_155422/fs_clone/nodeName/TXK_SYNC_create/fsclone_stage/FSCloneStageAppsTier_12011556.log has these errors:

ERROR: Number of servers are not in sync between Run and Patch Context.
ERROR: Managed Servers would be in-sync at apply phase

The error matches the one in Doc ID 1634239.1 ("Number of servers are not in sync between Run and Patch Context" While fs_clone of patching (WLS 17495356) Cycle). I ran the SQL statements from that document and found the difference: the patch context has no managed server ports, while the run context does. So the fix is to make them match.

SQL> SELECT
extractValue(XMLType(TEXT),'//oa_service_name[@oa_var="s_adminservername"]'),
extractValue(XMLType(TEXT),'//oacore_server_ports'),
extractValue(XMLType(TEXT),'//forms_server_ports'),
extractValue(XMLType(TEXT),'//oafm_server_ports'),
extractValue(XMLType(TEXT),'//forms-c4ws_server_ports'),
extractValue(XMLType(TEXT),'//oaea_server_ports')
from fnd_oam_context_files
where name not in ('TEMPLATE','METADATA')
and (status is null or status !='H')
and EXTRACTVALUE(XMLType(TEXT),'//file_edition_type')='patch'
and CTX_TYPE = 'A';
--------------------------------------------------------------------------------
AdminServer
null  (blank)
null
null
null
null

SQL> SELECT
extractValue(XMLType(TEXT),'//oa_service_name[@oa_var="s_adminservername"]'),
extractValue(XMLType(TEXT),'//oacore_server_ports'),
extractValue(XMLType(TEXT),'//forms_server_ports'),
extractValue(XMLType(TEXT),'//oafm_server_ports'),
extractValue(XMLType(TEXT),'//forms-c4ws_server_ports'),
extractValue(XMLType(TEXT),'//oaea_server_ports')
from fnd_oam_context_files
where name not in ('TEMPLATE','METADATA')
and (status is null or status !='H')
and EXTRACTVALUE(XMLType(TEXT),'//file_edition_type')='run'
and CTX_TYPE = 'A';
--------------------------------------------------------------------------------
AdminServer
oacore_server1:7205
forms_server1:7405
oafm_server1:7605
forms-c4ws_server1:7805
null

AdminServer
oacore_server1:7205
forms_server1:7405
oafm_server1:7605
forms-c4ws_server1:7805
null
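The comparison of the two result sets can also be done mechanically. A minimal Python sketch, assuming the query output has been copied by hand into dictionaries keyed by the XML element name; diff_server_ports is a hypothetical helper, and the values are the ones shown above:

```python
from typing import Dict, Optional, Tuple

PortMap = Dict[str, Optional[str]]


def diff_server_ports(run: PortMap, patch: PortMap) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {element: (run_value, patch_value)} for every element that differs."""
    keys = sorted(set(run) | set(patch))
    return {k: (run.get(k), patch.get(k)) for k in keys if run.get(k) != patch.get(k)}


# Values copied from the run-edition query output in this post.
run_ctx = {
    "oacore_server_ports": "oacore_server1:7205",
    "forms_server_ports": "forms_server1:7405",
    "oafm_server_ports": "oafm_server1:7605",
    "forms-c4ws_server_ports": "forms-c4ws_server1:7805",
    "oaea_server_ports": None,
}
# The patch-edition query returned null for every port column.
patch_ctx = {key: None for key in run_ctx}

for element, (run_value, patch_value) in diff_server_ports(run_ctx, patch_ctx).items():
    print(f"{element}: run={run_value!r} patch={patch_value!r}")
```

Elements that are null on both sides, such as oaea_server_ports here, are treated as matching and are not reported.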

Also, log file $NE_BASE/EBSapps/log/adop/6/20221201_155422/fs_clone/nodeName/TXK_SYNC_create/fsclone_apply/FSCloneApplyAppsTier_12011608.log has these errors at its end:

configProperty id = Server2
Count for NodeIterator nextNode = 3
ERROR: Managed Server's are not in sync between file system context and DB context
ERROR: Update Moveplan Fail

START: Inside exitClone....
Updating status INCOMPLETE for ApplywlsTechStack
START: Updating status INCOMPLETE for action ApplywlsTechStack
END: Updated status INCOMPLETE for action ApplywlsTechStack

The error matches the description in Doc ID 2296036.1 (EBS R12.2: FS_CLONE Failing With ERROR: Update Moveplan Fail). But the machine list under 'Domain Structure' => 'Environment' => 'Machines' in the WebLogic Admin Console is correct, so this document did not help. Note: only web/forms nodes are listed under Machines.

Log file $PATCH_BASE/inst/apps/EBSDEV_nodeName/logs/appl/rgf/TXK/txkSetAppsConf_08241646.log on /fs2 gives a different error, without detail. I believe it is irrelevant:

Error in getting Context Value for s_ohs_instance_loc
Config Option : standaloneohsconfig
