Tuesday, June 28, 2022

Abort an ADOP patching cycle

When trying to run through the adop cycle before database 19c upgrade, it failed on PREPARE:
$ adop phase=prepare
... ...
Initializing.
    Run Edition context  : $CONTEXT_FILE
    Patch edition context: ... ...
    Patch file system free space: 50.28 GB

Validating system setup.
    Node registry is valid.
    Log: $APPLRGF/TXK/verifyssh.log
    Output: $APPLRGF/TXK/out.xml
    Remote execution is operational.
... ...
Running prepare phase on node(s): ... ...
    [UNEXPECTED]adop phase=prepare on all the nodes returned failure.
    [UNEXPECTED]Unable to continue processing.

The log file does not give any specific reason. The SQL statement just before the error suggests it failed while checking the status of a CUSTOM TOP.

$ vi txkADOPPreparePhaseSynchronize.log
... ...
==========================
Inside getPatchStatus()...
==========================
SQL Command: SELECT status||',' FROM ad_adop_session_patches 
WHERE  --... ...
bug_number = 'ADSPLICE_docXX'

patch_status = Y
Updated patch_status = Y
EXIT STATUS: 1

=============================
Inside evalADPATCHStatus()...
=============================
message_status: ERROR
Adsplice action did not go through successfully.
*******FATAL ERROR*******
PROGRAM : ($AD_TOP/patch/115/bin/txkADOPPreparePhaseSynchronize.pl)
TIME    : Wed Jul 20 14:00:36 2022
FUNCTION: main::execADSPLICE [ Level 1 ]
ERRORMSG: Adsplice action did not go through successfully.

I ran the statement below to get an idea of what the problem could be, then tried to de-register the bad CUSTOM TOP docXX (as I did not know what exactly was bad):
SQL> select * FROM ad_adop_session_patches where bug_number like 'ADSPLICE%'
order by adop_session_id desc, end_date desc;

$ perl $AD_TOP/bin/adDeregisterCustomProd.pl

Enter the APPS username: apps
Enter the APPS password:
Enter the Custom Application to De-Register: docXX

Enter the Application Id to De-Register: 20268
Script adjava oracle.apps.ad.splice.adCustProdSplicer mode=uninstall options=inputpair logfile=customderegister.log
.. ...
Performing Validations for De-Register

AD Custom Product Splicer error:
Patching cycle in progress - run this utility from patch file system
You can only run it from run file system when not patching

AD Custom Product Splicer is completed with errors.
Please see the below log file for more details.

I believe I also tried to run "adop phase=fs_clone" at this stage; it did not do anything and failed very quickly. I did not want to, and had no confidence to, de-register a CUSTOM TOP from the patch file system. So, to get through the ADOP patch cycle, the only choice was to abort the patch session first, using the three steps below:

$ adop phase=abort
.. ...
Validating system setup.
    Node registry is valid.
    Log: $APPLRGF/TXK/verifyssh.log
    Output: $APPLRGF/TXK/out.xml
    Remote execution is operational.

Checking for existing adop sessions.
    Continuing with existing session [Session ID: 7].
        Session Id            :   7
        Prepare phase status  :  FAILED
        Apply phase status    :    NOT COMPLETED
        Cutover  phase status :  NOT COMPLETED
        Abort phase status    :    NOT COMPLETED
        Session status        :       FAILED
The above session will be aborted. Do you want to continue [Y/N]? Y
===================================================================
ADOP (C.Delta.12)
Session ID: 7
Node: node1Name
Phase: abort
Log: $NE_BASE/EBSapps/log/adop/7/20220X21_170727/adop.log
=====================================================================

Verifying existence of context files in database.

Checking if adop can continue with available nodes in the configuration.
    Log: $NE_BASE/EBSapps/log/adop/7/20220721_170727/abort/node1Name
        txkADOPEvalSrvStatus.pl returned SUCCESS

Creating list of nodes where abort needs to be run.
    The abort phase needs to be run on node: node1Name
    The abort phase needs to be run on node: node2Name
    The abort phase needs to be run on node: node3Name

Running abort phase on node(s): [node1Name,node2Name and node3Name].
    Output: $NE_BASE/EBSapps/log/adop/7/20220721_170727/abort/remote_execution_result_level1.xml
    Log: $NE_BASE/EBSapps/log/adop/7/20220721_170727/abort/node1Name
        txkADOPEvalSrvStatus.pl returned SUCCESS

Generating node-specific status report.
    Output: $NE_BASE/EBSapps/log/adop/7/20220721_170727/adzdnodestat.out

Summary report for current adop session:
    Node node1Name: Completed successfully
       - Abort status:      Completed successfully
    Node node2Name: Completed successfully
       - Abort status:      Completed successfully
    Node node3Name: Completed successfully
       - Abort status:      Completed successfully
    For more details, run the command: adop -status -detail
adop exiting with status = 0 (Success)

$ adop phase=cleanup cleanup_mode=full
... ...
The cleanup phase completed successfully.
adop exiting with status = 0 (Success)

$ adop -status

Enter the APPS password:
Connected.
==============================================================
ADOP (C.Delta.12)
Session Id: 7
Command: status
Output: $NE_BASE/EBSapps/log/adop/7/20220X21_172115/adzdshowstatus.out
===============================================================
Node Name       Node Type  Phase      Status           Started              Finished             Elapsed
--------------- ---------- ---------- ---------------- -------------------- -------------------- ------------
node1Name       master     PREPARE    SESSION ABORTED  2022/0X/20 13:38:49  2022/0X/21 12:54:19  23:15:30
                           APPLY      SESSION ABORTED
                           FINALIZE   SESSION ABORTED
                           CUTOVER    SESSION ABORTED
                           CLEANUP    COMPLETED        2022/0X/21 17:13:58  2022/0X/21 17:14:48  0:00:50
node2Name       slave      PREPARE    SESSION ABORTED  2022/0X/20 13:38:49  2022/0X/21 12:55:01  23:16:12
                           APPLY      SESSION ABORTED
                           FINALIZE   SESSION ABORTED
                           CUTOVER    SESSION ABORTED
                           CLEANUP    COMPLETED        2022/0X/21 17:13:58  2022/0X/21 17:14:48  0:00:50
node3Name       slave      PREPARE    SESSION ABORTED  2022/0X/20 13:38:49  2022/0X/21 12:55:00  23:16:11
                           APPLY      SESSION ABORTED
                           FINALIZE   SESSION ABORTED
                           CUTOVER    SESSION ABORTED
                           CLEANUP    COMPLETED        2022/0X/21 17:13:58  2022/0X/21 17:14:48  0:00:50
File System Synchronization Type: Full

INFORMATION: Patching cycle aborted, so fs_clone will run automatically on node2Name,node3Name nodes in prepare phase of next patching cycle.
adop exiting with status = 0 (Success)

$ adop phase=fs_clone
... ...
The fs_clone phase completed successfully.
adop exiting with status = 0 (Success)

When I ran "adop -status" again, it did not mention "fs_clone" but gave the same status as in previous one.

Then I stopped the Apps services and de-registered the custom top on all nodes. This time it worked:
$ ./adstpall.sh apps/appsPWD
$ perl $AD_TOP/bin/adDeregisterCustomProd.pl   (all three nodes)

$ adstrtal.sh apps/appsPWD

After that, the patch cycle ran through successfully on the current run file system:
$ adop phase=prepare
$ adop phase=actualize_all
$ adop phase=finalize finalize_mode=full
$ adop phase=cutover
On the new run file system:
$ adop phase=cleanup cleanup_mode=full 

To re-create the CUSTOM TOP on all three nodes (first, make sure the three adsplice config files in $APPL_TOP/admin are good on all nodes; see the quick check sketched after the commands below):

$ ./adstpall.sh apps/appsPWD
$ adsplice 
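
Before running adsplice, a quick sanity check of those control files can be done as below (a sketch; the exact file names for custom top docXX are assumptions based on the standard adsplice control files, i.e. newprods.txt plus the product and territory definition files):

$ cd $APPL_TOP/admin
$ ls -l newprods.txt docxxprod.txt docxxterr.txt      (file names assumed; adjust to your custom top)
$ grep -i docxx newprods.txt                          (confirm the custom product entry before running adsplice)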


In another case, the CUTOVER phase failed with the error below. I had no idea how to fix it, so I ran the ABORT phase to get out of it.
$NE_BASE/EBSapps/log/adop/5/20210228_114950/cutover/validate/node1Name/ADOPValidations_detailed.log: 
-------------------------------------------------------------------------------------------------------------------
Lines #(51-55):
Checking the value of s_active_webport...

ERROR: The value of s_active_webport are different on RUN & PATCH Context files.
The Value present in RUN Context file = 4484
The Value present in PATCH Context file = 8041

I think if I had run "adop phase=fs_clone" BEFORE the PREPARE phase, it might have screened out or avoided the issue, so I would not have had to run ABORT (which is time-consuming).
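
In hindsight, such a mismatch can be spotted before cutover by comparing the variable in both context files, roughly as below (a sketch; the patch edition context file path is assumed to follow the standard $PATCH_BASE/inst layout):

$ grep s_active_webport $CONTEXT_FILE
$ grep s_active_webport $PATCH_BASE/inst/apps/$CONTEXT_NAME/appl/admin/$CONTEXT_NAME.xml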

Monday, June 20, 2022

ADOP hits synchronization error after patching

After applying patches separately to each node in downtime mode, "adop phase=fs_clone" (and "adop phase=prepare") failed with the same error:
  Checking for existing adop sessions.
    No pending session exists.
    Starting new adop session.
    [UNEXPECTED]The following nodes are not synchronized: node2Name, node3Name
    You must synchronize the nodes before continuing
    [UNEXPECTED]Unrecoverable error occurred. Exiting current adop session.

ADOP quits quickly and does not post any error in the log file.

The query below shows that PATCHRUN_ID was wrong on two nodes for the patch:
SQL> select bug_number, patchrun_id, node_name, adop_session_id 
from  AD_ADOP_SESSION_PATCHES 
where bug_number in ('33207251'); 

BUG_NUMBER  PATCHRUN_ID  NODE_NAME    ADOP_SESSION_ID
----------  -----------  -----------  ---------------
33207251             -1  node2Name                  4
33207251          29420  primaryName                4
33207251             -1  node3Name                  4

It seems patch 33207251 was included in and applied with the January 2022 CPU patch 33487428, but it was then applied again and something went wrong during that run. Sometimes the keystore file for Java signing is wrong on a node, and that fails the patching.

The fix was to re-apply patch 33207251 on node2Name and node3Name separately:

$ adop phase=apply apply_mode=downtime patches=33207251 allnodes=no action=nodb restart=yes options=forceapply,nodatabaseportion
... ...
Validating credentials.
Initializing.
    Run Edition context  : /.../xxx.xml
    Patch edition context: /../xxx.xml

Warning: Ignoring 'abandon' parameter as no failed previous patching cycle was found.
Warning: Ignoring 'restart' parameter as no failed previous patching cycle was found.
    Patch file system free space: 43.73 GB

Validating system setup.
    Node registry is valid.

Checking for existing adop sessions.
    Application tier services are down.
    Continuing with the existing session [Session ID: 6].
... ...
Applying patch 33207251.
    Log: $NE_BASE/EBSapps/log/adop/... /33207251/log/u33207251.log
... ...
The apply phase completed successfully.
adop exiting with status = 0 (Success)

$ egrep -i 'error|fail|ora-' u33207251.log

After the re-apply, table AD_ADOP_SESSION_PATCHES was updated, and then "adop phase=fs_clone" worked successfully.

FS_CLONE option: force=yes/no [default: no]
       Use force=yes to restart a previously failed fs_clone command from the beginning.
       By default, fs_clone restarts from where it left off.

Before applying the patch again, I had tried to update table AD_ADOP_SESSION_PATCHES manually with a SQL statement (shown below for reference), but that did not make "adop phase=fs_clone" move forward.
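
For reference, the manual update I tried looked roughly like the one below (illustrative only; updating AD tables directly is generally unsupported, the PATCHRUN_ID value is simply copied from the primary node row, and in my case it did not help):

SQL> update ad_adop_session_patches
       set patchrun_id = 29420
     where bug_number = '33207251'
       and node_name in ('node2Name', 'node3Name');
SQL> commit;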

NOTES: 
If necessary, FS_CLONE can be run separately on each node by
$ adop phase=fs_clone allnodes=no action=nodb force=yes

But if it hits the error below, you have to run "adop phase=fs_clone" from the primary node:
[UNEXPECTED]The admin server for the patch file system is not running.        
Start the patch file system admin server from the admin node and then rerun fs_clone.

After "adop phase=fs_clone" completed successfully, "adopscanlog -latest=yes" may still show various errors, such as ORA- error.

Saturday, June 18, 2022

FS_CLONE failed with directory error

I ran "adop phase=fs_clone" and it failed on the 2nd node of a two-node instance. The error is

$ adopscanlog -latest=yes
... ...
$NE_BASE/EBSapps/log/adop/.../fs_clone/node2Name/TXK_SYNC_create/txkADOPPreparePhaseSynchronize.log:
-------------------------------------------------------------------------------------------------------------
Lines #(345-347):
... ...
FUNCTION: main::removeDirectory [ Level 1 ]
ERRORMSG: Failed to delete the directory $PATCH_BASE/EBSapps/comn.

When that happened, some folders had already been deleted from the PATCH file system by FS_CLONE, which is kind of scary. The only way to get over this is to address the root cause and then re-run FS_CLONE.

The error matches the description in Doc ID 2690029.1 (ADOP: Fs_clone fails with error Failed to delete the directory). The root cause is that developers copied files (under their own OS accounts) into directories owned by applMgr, or concurrent jobs wrote logs into folders under CUSTOM TOPs; in most cases it happens under CUSTOM TOPs. As a result, applMgr has no permission to remove those files. The fix is to ask the OS system admin to find them and change their owner to applMgr, or to delete the offending files as the file owner.

$ cd $PATCH_BASE/EBSapps/comn
$ find . ! -user applMgr                                          (then, log in as the file owner to delete them)
$ ls -lR | grep -v applMgr | more                                 (optional: to see the detailed list)
$ find . -user wrong_userID -exec chown applMgr:userGroup {} \;

After the fix at the OS level, I tried to run "adop phase=fs_clone allnodes=no force=yes" on the 2nd node directly and got this error:
[UNEXPECTED]The admin server for the patch file system is not running.        
Start the patch file system admin server from the admin node and then rerun fs_clone.

There are two options to make it work. One is to run "adop phase=fs_clone force=yes" on the primary node; it seems to understand that fs_clone already worked on the 1st node and quickly progresses to running it on the 2nd node. The other is to start the WLS Admin Server on the primary node and then run "adop phase=fs_clone allnodes=no force=yes" on the 2nd node, as sketched below.
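
The second option, in commands (see the June 5 entry below for the full walkthrough):

On the primary node, from the patch file system:
$ ./adadminsrvctl.sh start forcepatchfs
On the 2nd node, from the run file system:
$ adop phase=fs_clone allnodes=no action=nodb force=yes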

A normal FS_CLONE log looks like this:

$ adop phase=fs_clone
... ...
Running fs_clone on admin node: [node1Name].
    Output: $NE_BASE/EBSapps/log/adop/.../fs_clone/remote_execution_result_level1.xml
    Log: $NE_BASE/EBSapps/log/adop/.../fs_clone/node1Name
        txkADOPEvalSrvStatus.pl returned SUCCESS

Running fs_clone on node(s): [node2Name].
    Output: $NE_BASE/EBSapps/log/adop/.../fs_clone/remote_execution_result_level2.xml
    Log: $NE_BASE/EBSapps/log/adop/.../fs_clone/node2Name
        txkADOPEvalSrvStatus.pl returned SUCCESS

Stopping services on patch file system.

    Stopping admin server.
You are running adadminsrvctl.sh version 120.10.12020000.11
Stopping WLS Admin Server...
Refer $PATCH_BASE/inst/apps/$CONTEXT_NAME/logs/appl/admin/log/adadminsrvctl.txt for details
AdminServer logs are located at $PATCH_BASE/FMW_Home/user_projects/domains/EBS_domain/servers/AdminServer/logs
adadminsrvctl.sh: exiting with status 0
adadminsrvctl.sh: check the logfile $PATCH_BASE/inst/apps/$CONTEXT_NAME/logs/appl/admin/log/adadminsrvctl.txt for more information ...

    Stopping node manager.
You are running adnodemgrctl.sh version 120.11.12020000.12
The Node Manager is already shutdown
NodeManager log is located at $PATCH_BASE/FMW_Home/wlserver_10.3/common/nodemanager/nmHome1
adnodemgrctl.sh: exiting with status 2
adnodemgrctl.sh: check the logfile $PATCH_BASE/inst/apps/$CONTEXT_NAME/logs/appl/admin/log/adnodemgrctl.txt for more information ...

Summary report for current adop session:
    Node node1Name:
       - Fs_clone status:   Completed successfully
    Node node2Name:
       - Fs_clone status:   Completed successfully
    For more details, run the command: adop -status -detail
adop exiting with status = 0 (Success)

NOTES:
In one instance, FS_CLONE took 8 hours on the first node, with no log entries or updates during that time. It just stayed frozen for hours!

Sunday, June 5, 2022

Run FS_CLONE on each node separately

In some situations, FS_CLONE fails on one node and you have to make it work on the other node(s).

Note the FS_CLONE parameter force=yes/no [default: no]:
       - Use force=yes to restart a previously failed fs_clone command from the beginning.
       - By default, fs_clone restarts from where it left off.

$ adop -status
... ...
Node Name       Node Type  Phase      Status      Started              Finished             Elapsed
--------------- ---------- ---------- ----------- -------------------- -------------------- ------------
node2Name       slave      FS_CLONE   FAILED      2022/04/26 08:10:47                       3:18:01
primaryName     master     FS_CLONE   COMPLETED   2022/04/26 08:10:47  2022/04/26 08:54:44  0:43:57

After the problem is fixed, one way is to simply run it again on the primary node; fs_clone will start from where it failed.

$ adop phase=fs_clone
... ...
Checking if adop can continue with available nodes in the configuration.
    Log: $NE_BASE/EBSapps/log/adop/.../fs_clone/node1Name
        txkADOPEvalSrvStatus.pl returned SUCCESS
Skipping configuration validation on admin node: [primaryNode]

Validating configuration on node(s): [node2Name].
    Output: $NE_BASE/EBSapps/log/adop/.../fs_clone/validate/remote_execution_result_level2.xml
    Log: $NE_BASE/EBSapps/log/adop/.../fs_clone/primaryName
        txkADOPEvalSrvStatus.pl returned SUCCESS

Starting admin server on patch file system.

Running fs_clone on node(s): [node2Name].
    Output: $NE_BASE/EBSapps/log/adop/...fs_clone/remote_execution_result_level2.xml
    Log: $NE_BASE/EBSapps/log/adop/.../fs_clone/primaryName
        txkADOPEvalSrvStatus.pl returned SUCCESS
... ...

The other way is to run it only on the failed node. Here are the steps to run FS_CLONE on each node separately.

1. Run fs_clone on primary node

$ adop -status

Enter the APPS password:
Connected.
====================================================
ADOP (C.Delta.12)
Session Id: 12
Command: status
Output: $NE_BASE/EBSapps/log/adop/12/.../adzdshowstatus.out
====================================================
Node Name       Node Type  Phase      Status       Started              Finished             Elapsed
--------------- ---------- ---------- ------------ -------------------- -------------------- ------------
primaryName     master     APPLY      ACTIVE       2022/03/07 09:09:36  2022/03/09 11:42:38  50:33:02
                           CLEANUP    NOT STARTED
node2Name       slave      APPLY      ACTIVE       2022/03/07 09:47:44  2022/03/09 11:47:34  49:59:50
                           CLEANUP    NOT STARTED

$ adop phase=fs_clone allnodes=no action=db
... ...
Checking for pending cleanup actions.
    No pending cleanup actions found.

Blocking managed server ports.
    Log: $NE_BASE/EBSapps/log/adop/.../fs_clone/primaryName/txkCloneAcquirePort.log

Performing CLONE steps.
    Log: $NE_BASE/EBSapps/log/adop/.../fs_clone/primaryName

Beginning application tier FSCloneStage - Thu May 18 13:54:06 2022
... ...
Log file located at $INST_TOP/admin/log/clone/FSCloneStageAppsTier_05181354.log
Completed FSCloneStage...
Thu May 18 14:01:38 2022

Beginning application tier FSCloneApply - Thu May 18 14:04:11 2022
... ...
Log file located at $INST_TOP/admin/log/clone/FSCloneApplyAppsTier_05181404.log
Target System Fusion Middleware Home set to $PATCH_BASE/FMW_Home
Target System Web Oracle Home set to $PATCH_BASE/FMW_Home/webtier
Target System Appl TOP set to $PATCH_BASE/EBSapps/appl
Target System COMMON TOP set to $PATCH_BASE/EBSapps/comn

Target System Instance Top set to $PATCH_BASE/inst/apps/$CONTEXT_NAME
Report file located at $PATCH_BASE/inst/apps/$CONTEXT_NAME/temp/portpool.lst
The new APPL_TOP context file has been created : $CONTEXT_FILE on /fs2/
contextfile=$CONTEXT_FILE on /fs2/
Completed FSCloneApply...
Thu May 18 14:28:29 2022

Resetting FARM name...
runDomainName: EBS_domain
patchDomainName: EBS_domain
targets_xml_loc: $PATCH_BASE/FMW_Home/user_projects/domains/EBS_domain/sysman/state/targets.xml
Patch domain is not updated, no need to reset FARM name.

Releasing managed server ports.
    Log: $NE_BASE/EBSapps/log/adop/.../fs_clone/primaryName/txkCloneAcquirePort.log

Synchronizing snapshots.

Generating log report.
    Output: $NE_BASE/EBSapps/log/adop/.../fs_clone/primaryName/adzdshowlog.out

The fs_clone phase completed successfully.
adop exiting with status = 0 (Success)

$ adop -status
... ...
Node Name       Node Type  Phase      Status       Started              Finished             Elapsed
--------------- ---------- ---------- ------------ -------------------- -------------------- ------------
node2Name       slave      FS_CLONE   NOT STARTED  2022/05/18 13:24:35                       1:66:38
primaryName     master     FS_CLONE   COMPLETED    2022/05/18 13:24:35  2022/05/18 14:28:39  1:64:04

2. On the primary node, confirm the WLS Admin Server is running on the PATCH file system and that its WebLogic Console page is accessible from a browser. If it is not, start it, because FS_CLONE on the 2nd node needs it to avoid the error:
[UNEXPECTED] The admin server for the patch file system is not running
Note: you cannot start/stop the WLS Admin Server on a non-primary node, because that gives the message:
adadminsrvctl.sh should be run only from the primary node primaryName

$ echo $FILE_EDITION
patch
$ cd $ADMIN_SCRIPTS_HOME

$ ./adadminsrvctl.sh start forcepatchfs
$ ./adadminsrvctl.sh status
... ...
 The AdminServer is running

$ ./adnodemgrctl.sh status
... ...
The Node Manager is not up.

$ grep s_wls_adminport $CONTEXT_FILE
<wls_adminport oa_var="s_wls_adminport" oa_type="PORT" base="7001" step="1" range="-1" label="WLS Admin Server Port">7032</wls_adminport>

Test WebLogic page from a browser: primaryNode.domain.com:7032/console
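
If no browser is handy, the same check can be done from the command line (a sketch; adjust the host name and port to your instance):

$ curl -sIk http://primaryNode.domain.com:7032/console | head -1      (expect an HTTP 200 or 302 response)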

3. Run fs_clone on the 2nd node (use the "force=yes" option in most cases to start from the beginning)

$ echo $FILE_EDITION
run
$ export CONFIG_JVM_ARGS="-Xms1024m -Xmx2048m"   (optional)
$ adop phase=fs_clone allnodes=no action=nodb force=yes
... ...
Releasing managed server ports.
Synchronizing snapshots.
Generating log report.
    Output: $NE_BASE/EBSapps/log/adop/.../fs_clone/node2Name/adzdshowlog.out

The fs_clone phase completed successfully.
adop exiting with status = 0 (Success)

$ adop -status
... ...
Node Name       Node Type  Phase      Status       Started              Finished             Elapsed
--------------- ---------- ---------- ------------ -------------------- -------------------- ------------
node2Name       slave      FS_CLONE   COMPLETED    2022/05/18 13:24:35  2022/05/18 15:02:02  1:97:27
primaryName     master     FS_CLONE   COMPLETED    2022/05/18 13:24:35  2022/05/18 14:28:39  1:64:04

4. On the primary node, stop the WLS Admin Server

$ echo $FILE_EDITION
patch
$ ./adadminsrvctl.sh stop
$ ./adnodemgrctl.sh stop          (stop the Node Manager if it is running)

$ ps -ef | grep fs2        <== if PATCH is on directory fs2

Thursday, May 19, 2022

Apache (OHS) in R12.2 failed to stop and refused to start

After Linux server patching and a reboot, alerts were sent out that the httpd process for OHS (Oracle HTTP Server) in a production instance was not running on the server. I tried to stop/start it without luck. Strangely, it was tied to a PID owned by root (or to a PID that did not exist).

$ ps -ef | grep httpd         <== No httpd running

$ adapcctl.sh start
$ adopmnctl.sh status
Processes in Instance: EBS_web_EBSPROD_OHS1
---------------------------------+--------------+-------+---------
ias-component                    | process-type |   pid | status
---------------------------------+--------------+-------+---------
EBS_web_EBSPROD                  | OHS          |   919 | Stop

$ ps -ef | grep 919 (or, process 919 does not exist)
root       919     2  0 09:32 ?        00:00:00 [xxxxxx]

$ iName=$(tr < $CONTEXT_FILE '<>' '  ' | awk '/"s_ohs_instance"/ {print $(NF-1)}' )    (get the OHS instance name, e.g. EBS_web_EBSPROD_OHS1)
$ SUBiName=${iName%?????}                    (strip the last 5 characters, e.g. "_OHS1", to get the component name)
$ cd $FMW_HOME/webtier/instances/$iName/diagnostics/logs/OHS/$SUBiName

The log file shows many lines of messages like:
--------
22/0X/05 02:47:03 Stop process
--------
$FMW_HOME/webtier/ohs/bin/apachectl stop: httpd (no pid file) not running
--------
22/0X/05 02:48:03 Stop process
--------
$FMW_HOME/webtier/ohs/bin/apachectl hardstop: httpd (no pid file) not running

The httpd.pid file should reside in this log folder; its location is defined in httpd.conf under $FMW_HOME/webtier/instances/$iName/config/OHS/$SUBiName (or $IAS_ORACLE_HOME/instances/$iName/config/OHS/$SUBiName) in R12.2. I believe the problem was that httpd.pid was removed BEFORE "adapcctl.sh stop" fully completed, maybe due to a Linux server crash or power-off. Normally, "adapcctl.sh stop" checks the file and then removes it. Because the file was missing, adapcctl.sh failed its status check and refused to start Apache.
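
To confirm where Apache expects the PID file, the PidFile directive in httpd.conf can be checked (a sketch, assuming the standard OHS PidFile directive):

$ grep -i pidfile $FMW_HOME/webtier/instances/$iName/config/OHS/$SUBiName/httpd.conf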

Additionally, opmn logs can be found in $FMW_HOME/webtier/instances/$iName/diagnostics/logs/OPMN/opmn

The workaround:

1. Stop/kill all opmn processes (keeping WLS-related processes running is fine)
$ sh $ADMIN_SCRIPTS_HOME/adopmnctl.sh stop
$ ps -ef | grep opmn

2. Create an empty file
$ cd $FMW_HOME/webtier/instances/$iName/diagnostics/logs/OHS/$SUBiName
$ touch httpd.pid

3. Clear a folder (important step) 
$ cd $FMW_HOME/webtier/instances/$iName/config/OPMN/opmn
$ ls -al states
-rw-r----- 1 user group 19 Jun 21 18:57 .opmndat
-rw-r----- 1 user group 579 Jun 21 18:54 p1878855085
$ mv states states_BK
$ mkdir states
$ ls -al states

4. Now, starting Apache should work
$ ./adapcctl.sh start
$ ./adopmnctl.sh status
$ ps -ef | grep httpd | wc -l
4                  <== 3 httpd.worker processes running

5. Make sure everything works
$ ./adstpall.sh apps/appsPWD
$ ./adstrtal.sh apps/appsPWD
$ ./adopmnctl.sh status

When Apache (OHS) starts up, it writes the process ID (PID) of the parent httpd process to the httpd.pid file. While Apache is running, the httpd.pid file should exist and not be empty.
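
A quick post-start verification (a sketch, using the same $iName and $SUBiName as above):

$ cd $FMW_HOME/webtier/instances/$iName/diagnostics/logs/OHS/$SUBiName
$ cat httpd.pid                          (should contain the parent httpd PID)
$ ps -fp $(cat httpd.pid)                (that process should exist and be owned by applMgr)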