Wednesday, September 29, 2021

Apply R12.2 patch using ADOP

Adop downtime mode does not start an online patching cycle. It applies a patch directly to the RUN file system and completes more quickly than in online mode, but at the cost of increased system downtime. There is no option to abort a failed patch and return to the existing RUN filesystem in downtime mode.

"hotpatch=yes" means the application services stay up and running while the patch is applied, like hotpatch mode in R12.1 adpatch. It also applies the patch directly to the RUN filesystem. The next time Online Patching mode is used, run the "adop phase=prepare" command first; the adop config change detector will then find that the RUN filesystem has had patches applied in hotpatch mode, and will sync the PATCH filesystem as part of the prepare phase (Doc ID 1928798.1). An Oracle Support engineer advised never to try hotpatch mode unless it is specified in the patch readme or another document. For example, applying 33600809 (R12.AD.C.Delta.14) in hotpatch mode will fail and cause problems.

1. Pre steps

To make sure ADOP works on a multi-node instance, run "adop -validate" first, which includes execution of the line below:
$ perl $AD_TOP/patch/115/bin/txkRunSSHSetup.pl verifyssh -contextfile=$CONTEXT_FILE -hosts=2nd_node

$ vi /etc/oraInst.loc
$ adop -validate
... ...
========================================
ADOP (C.Delta.12)
Node: master_node
Command: validate
Log: $NE_BASE/EBSapps/log/adop/16/.../validate/adopConsole.log
========================================
Checking for existing patching cycle.
    No existing patching cycle exists

Verifying SSH connection to all nodes.
    Log: $LOG_HOME/appl/rgf/TXK/verifyssh.log
    Output: $LOG_HOME/appl/rgf/TXK/out.xml
    Remote execution is operational.

Running adop validations on Admin node: [master_node].
    Log: master_node:$NE_BASE/EBSapps/log/adop/.../validate/master_node
    Output: $NE_BASE/EBSapps/log/adop/.../validate/remote_execution_result_level1.xml
        txkADOPEvalSrvStatus.pl returned SUCCESS
Running adop validations on node(s): [2nd_node and ].
    Output: $NE_BASE/EBSapps/log/adop/.../validate/remote_execution_result_level2.xml
        txkADOPEvalSrvStatus.pl returned SUCCESS
adop exiting with status = 0 (Success)

NOTES: it will start the WLS Admin Server

$ adop -status

Enter the APPS password:
Connected.
=======================================================
ADOP (C.Delta.12)
Session Id: 16
Command: status
Output: $NE_BASE/EBSapps/log/adop/.../adzdshowstatus.out
=======================================================
Node Name       Node Type  Phase           Status          Started              Finished             Elapsed
--------------- ---------- --------------- --------------- -------------------- -------------------- ------------
Master_node    master   APPLY           ACTIVE         2021/0X/02 17:53:08  2021/0X/09 15:04:17  65:11:09
                                      CLEANUP     NOT STARTED
2nd_node         slave     APPLY           ACTIVE         2021/0X/02 18:22:12  2021/0X/09 15:22:16  65:00:04
                                      CLEANUP     NOT STARTED

File System Synchronization Type: None
adop exiting with status = 0 (Success)
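When checking many nodes, a small awk filter over the adzdshowstatus.out file that "adop -status" writes (its path is shown in the Output line above) can flag problem rows. A minimal sketch, with sample text embedded so it can be run anywhere:

```shell
# Sketch: flag failed nodes in adop status output. The sample below
# mirrors the layout shown above; in real use, run the same awk filter
# over the adzdshowstatus.out file written by `adop -status`.
status_output='Node Name       Node Type  Phase           Status
--------------- ---------- --------------- ---------------
master_node     master     APPLY           ACTIVE
2nd_node        slave      APPLY           FAILED'

echo "$status_output" | awk '/FAILED/ { print "attention: " $0 }'
```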

2. Apply patch to multiple nodes in one command line

The line below, run on the Master node, will apply AP patch 32385168 (for example) to all nodes, even though the $APPL_TOP file structure is not shared:

$ echo $FILE_EDITION
run
$ cd $PATCH_TOP
$ unzip p32385168_R12.AP.C_R12_GENERIC.zip
NOTES: Copy the zip file to each node and run "unzip p32385168_R12.AP.C_R12_GENERIC.zip" there, to avoid an ADOP failure on that node.
     
$ adop phase=apply apply_mode=downtime patches=32385168 patchtop=/path/to/$PATCH_TOP
... ...
===========================================================
ADOP (C.Delta.12)
Session ID: 16
Node: master_node
Phase: apply
Log: $NE_BASE/EBSapps/log/adop/16/2021xxxx_142226/adop.log
============================================================
Verifying existence of context files in database.
Checking for failed nodes in the configuration.

Checking if adop can continue with available nodes in the configuration.
    Log: $NE_BASE/EBSapps/log/adop/16/.../apply/master_node
        txkADOPEvalSrvStatus.pl returned SUCCESS

Applying <32385168> patch(es) on admin node: [master_node].
    Output: $NE_BASE/EBSapps/log/adop/16/.../apply/remote_execution_result_level1.xml
    Log: $NE_BASE/EBSapps/log/adop/16/.../apply/master_node
        txkADOPEvalSrvStatus.pl returned SUCCESS

Applying <32385168> patch(es) on node(s): [2nd_node].
Running in Serial
    Output: $NE_BASE/EBSapps/log/adop/16/.../apply/remote_execution_result_level2.xml
    Log: $NE_BASE/EBSapps/log/adop/16/.../apply/master_node
        txkADOPEvalSrvStatus.pl returned SUCCESS
Summary report for current adop session:
     Node 2nd_node: Completed successfully
          - Apply status: Completed successfully
     Node master_node: Completed successfully
          - Apply status: Completed successfully
     For more details, run the command: adop -status -detail

Use this SQL statement to confirm the patch was just applied to ALL nodes:
SQL> select ADOP_SESSION_ID, BUG_NUMBER, STATUS, APPLIED_FILE_SYSTEM_BASE, ADPATCH_OPTIONS, round((end_date-start_date)*24*60, 1) EXEC_TIME, AUTOCONFIG_STATUS,  DRIVER_FILE_NAME, NODE_NAME, END_DATE, CLONE_STATUS
from ad_adop_session_patches
order by end_date desc;

Specifying "patchtop=" seems to be important; otherwise, ADOP may fail when applying the patch to the remote node.
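To avoid the misleading failure described next, a quick pre-flight check that each patch has an unzipped directory under $PATCH_TOP can help. This is my own generic sketch (check_patch_dirs is not an Oracle tool); in real use, point it at $PATCH_TOP and run it on every node, e.g. over ssh:

```shell
# Sketch: confirm each patch number has an unzipped directory under
# PATCH_TOP before invoking adop. check_patch_dirs is a hypothetical
# helper, not an Oracle utility.
check_patch_dirs() {
  _top=$1; shift
  _missing=0
  for _p in "$@"; do
    if [ ! -d "$_top/$_p" ]; then
      echo "missing: $_top/$_p"
      _missing=1
    fi
  done
  return $_missing
}

# Demo against a throwaway directory standing in for $PATCH_TOP.
demo_top=$(mktemp -d)
mkdir "$demo_top/32385168"                         # unzipped patch present
check_patch_dirs "$demo_top" 32385168 31211521 \
  || echo "unzip the missing patch before running adop"
rm -rf "$demo_top"
```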

If the patch's folder (from the .zip file) does not exist in $PATCH_TOP on the 2nd node, or the patch was already applied there, ADOP may give a misleading error:
    [ERROR]     adop phase=apply failed on Node: "2nd_node"
    Log: $NE_BASE/EBSapps/log/adop/.../apply/2nd_node
        --------------------------------
        Summary of unavailable services:
        --------------------------------
        Group Name:
                Batch Processing Services
        Individual Services enabled in the group:
                OracleTNSListenerAPPS_EBSDEV_2nd_node
                OracleConcMgrEBSDEV_2nd_node
                Oracle Fulfillment Server EBSDEV_2nd_node

After this failure, the SQL statement showed the patch was applied to just one node. After making a fix, I applied it to the 2nd node only (with the allnodes=no action=nodb options).

In one situation, the patch failed on one of three nodes (though the SQL showed it was not applied to any node).
    Summary report for current adop session:
          Node node1Name: Completed successfully
                - Apply status:      Completed successfully
          Node node2Name: Completed successfully
               - Apply status:      Completed successfully
          Node node3Name: Failed
               - Apply status:      Failed
         For more details, run the command: adop -status -detail
I ran "adop phase=apply apply_mode=downtime patches=33xxxxxx patchtop=$NE_BASE/EBSapps/patch restart=yes" on the Master node to re-apply it. adop works out the session state and applies the patch only to the failed node node3Name.
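That retry pattern is simple enough to script. Below is a sketch in which run_adop is a stub I made up in place of the real adop binary (it fails on the first call to simulate a failed node), so the flow can be read and run anywhere; the flags mirror the commands above:

```shell
# Sketch of the restart=yes retry pattern. run_adop is a stub standing
# in for the real `adop` command; the first call "fails" to simulate
# a failed node.
attempts=0
run_adop() {
  attempts=$((attempts + 1))
  echo "would run: adop $*"
  [ "$attempts" -gt 1 ]          # first attempt fails, later ones succeed
}

# Patch numbers are from the example in section 3; PATCH_TOP comes
# from the environment in real use.
args="phase=apply apply_mode=downtime patches=32768426,31211521 patchtop=$PATCH_TOP"
if ! run_adop $args; then
  echo "apply failed; re-running with restart=yes"
  run_adop $args restart=yes
fi
```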

Because a patch can be applied to multiple nodes with one command line, I believe the "passwordless ssh" setup between the nodes in my multi-node instance is good. But somehow, when I tried to set up ssh by running the Oracle script, I always got errors (which I ignored):

$ perl $AD_TOP/patch/115/bin/txkRunSSHSetup.pl enablessh -contextfile=$CONTEXT_FILE -hosts=master_node,2nd_node
Enter SSH User password for the OS user applmgr:
Log: $LOG_HOME/appl/rgf/TXK/enablessh.log

Error in setting up ssh equivalence
FAILED: enableSSH

SEVERE: com.jcraft.jsch.JSchException: Algorithm negotiation fail
at com.jcraft.jsch.Session.receive_kexinit(Session.java:510)
at com.jcraft.jsch.Session.connect(Session.java:285)
at com.jcraft.jsch.Session.connect(Session.java:149)
at oracle.sysman.prov.ssh.RunCommand.runCommand(RunCommand.java:134)
at oracle.sysman.prov.ssh.SSHSetup.runCommandHelper(SSHSetup.java:2350)
at oracle.sysman.prov.ssh.SSHSetup.validateRemoteScp(SSHSetup.java:643)
at oracle.sysman.prov.ssh.SSHConnectivity.startSetup(SSHConnectivity.java:201)
at oracle.sysman.prov.ssh.SSHConnectivity.main(SSHConnectivity.java:360)
... ...
I am not sure what the real problem or root cause is. It may be because the instance uses a non-shared $APPL_TOP (s_shared_file_system => false).
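For what it is worth, JSch's "Algorithm negotiation fail" is the generic exception an older JSch client raises when it and the SSH server cannot agree on a key-exchange (or cipher) algorithm, which is common when the server is a newer OpenSSH with legacy algorithms disabled. If that is the cause here, one workaround (which I have not verified on this instance) is to re-enable a legacy key exchange on the target node's sshd:

```
# /etc/ssh/sshd_config on the target node -- append a legacy kex
# algorithm to the server defaults (OpenSSH 7.x "+" syntax),
# then restart sshd.
KexAlgorithms +diffie-hellman-group14-sha1
```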

3.  Apply patches on each node separately. I used the steps below to apply two AP patches to two nodes.

On Master node: 

$ echo $FILE_EDITION
run
$ cd $PATCH_TOP
$ unzip p32768426_R12.AP.C_R12_GENERIC.zip
$ unzip p31211521_R12.AP.C_R12_GENERIC.zip
$ adop phase=apply apply_mode=downtime patches=32768426,31211521 allnodes=no action=db

(If it fails somewhere, you have to fix the error and then try again with the command:
$ adop phase=apply apply_mode=downtime patches=32768426,31211521 allnodes=no action=db restart=yes)

On 2nd node:

$ cd $PATCH_TOP
$ adop phase=apply apply_mode=downtime patches=32768426,31211521 allnodes=no action=nodb
(Or, this line also works: $ adop phase=apply apply_mode=downtime patches=32768426,31211521 allnodes=no action=nodb options=nocompiledb,nocompilejsp,nogenerateportion )
... ...
Applying patch 32768426.
    Log: /path/to/2nd_node/32768426/log/u32768426.log
Applying patch 31211521.
    Log: /path/to/2nd_node/31211521/log/u31211521.log

Generating post apply reports.
Generating log report.
    Output: /path/to/2nd_node/adzdshowlog.out

The apply phase completed successfully.
adop exiting with status = 0 (Success)

Then, the SQL statement verifies that the patches were applied to both nodes.

4.  If you have to apply R12.2 patches in online mode (i.e., without "downtime" mode) to a multi-node instance and have to run ADOP on each node separately, more fiddly steps and more time are needed:

PREPARE Phase: Run this on both the servers
==================================
on Master Node:  $ adop phase=prepare allnodes=no action=db 
on 2nd Node:       $ adop phase=prepare allnodes=no action=nodb

Apply Phase: Run this on both the servers
=================================
on Master Node:  $ adop phase=apply patches=<Patch Number> allnodes=no action=db 
on 2nd Node:       $ adop phase=apply patches=<Patch Number> allnodes=no action=nodb options=nocompiledb,nocompilejsp,nogenerateportion 

Finalize:  Run only on Master node
==================================
$ adop phase=finalize allnodes=no action=db

Cutover: Run this on both the servers
==================================
on Master Node:  $ adop phase=cutover allnodes=no action=db mtrestart=no
Notes: You will see messages saying that it is waiting for the second node; you then need to run the adop command on the second node, and only then will the cutover session complete.
on 2nd Node:       $ adop phase=cutover allnodes=no action=nodb mtrestart=no

Before running the cleanup phase, start a new OS session or source the environment file on the Primary server, and then run the cleanup command below.

Cleanup: Run on Master node
================================
$ adop phase=cleanup cleanup_mode=full
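The per-node sequence above can be sketched end-to-end as a driver script. This is only an illustration: run_local and run_remote are stubs I made up that print each command instead of executing adop (a real version would invoke adop locally and over ssh), and the patch number placeholder is kept as in the text:

```shell
# Sketch: the online patching cycle across two nodes, with stubs that
# print each command instead of executing it.
run_local()  { echo "master_node: adop $*"; }
run_remote() { echo "2nd_node: adop $*"; }

patches="<Patch Number>"   # placeholder, as above

run_local  "phase=prepare allnodes=no action=db"
run_remote "phase=prepare allnodes=no action=nodb"
run_local  "phase=apply patches=$patches allnodes=no action=db"
run_remote "phase=apply patches=$patches allnodes=no action=nodb options=nocompiledb,nocompilejsp,nogenerateportion"
run_local  "phase=finalize allnodes=no action=db"
run_local  "phase=cutover allnodes=no action=db mtrestart=no"    # waits for 2nd node
run_remote "phase=cutover allnodes=no action=nodb mtrestart=no"
run_local  "phase=cleanup cleanup_mode=full"                     # in a fresh session
```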
