Monday, February 1, 2016

R12 concurrent manager troubleshooting

Concurrent manages may not work properly due to different reasons.  To check if any concurrent jobs are running, run script afcmrrq.sql at $FND_TOP/sql. See Doc ID 2089560.6 (How To Tell If Concurrent Managers For A Particular SID Are Running) for other scripts.

1. After instance refresh or clone, concurrent managers fail to start.
When old (or source) server info is still saved in the database tables, concurrent managers get confused and will not start properly, even after below two lines to clean source info before running adconfig on database server:
$ sqlplus apps/passwd
SQL> @cmclean.sql         ( <= only before R12.2 )
SQL> EXEC FND_CONC_CLONE.SETUP_CLEAN;
SQL> commit;

SQL> select * from fnd_nodes;   (shall have 0 row. But one row after DBA runs adconfig script)
SQL> select unique(node_name) from fnd_concurrent_queues;
SQL> select * from fnd_concurrent_processes;
SQL> select * from fnd_conflicts_domain;

I saw message lines in log files under $APPLCSF/log:
CONC-SM TNS FAIL
Routine AFPEIM encountered an error while starting concurrent manager FNDSCH with library $APPL_TOP/fnd/12.0.0/bin/FNDSCH.

Check that your system has enough resources to start a concurrent manager process. Contact your system ad : 06-DEC-2015 09:58:42

The best fix is to apply patch 18539575 to fix a bug in R12.1 (See Doc ID 1646026.1). After this patch, concurrent services start normally in my refreshed instances.

One time, when the CM failed to start, I see message in the log file: 
Routine AFPEIM encountered an error while starting concurrent manager STANDARD with library $APPL_TOP/fnd/12.0.0/bin/FNDLIBR.
Check that your system has enough resources to start a concurrent manager process. Contact your sys : 30-DEC-2015 15:08:42
Starting Internal Concurrent Manager Concurrent Manager : 30-DEC-2015 15:08:42
CONC-SM TNS FAIL
 : ICM failed to start for target node1Name.  Review ICM log files for additional information.
                     Process monitor session ended : 30-DEC-2015 15:08:42


I realized that node1Name is the wrong node. So, I ran sql to update the data (with caution!):

SQL> select unique(node_name) from fnd_concurrent_queues;
NODE_NAME
------------------------------
node1Name
node2Name


SQL> select count(*) from fnd_concurrent_queues;
  COUNT(*)
----------------
        45


SQL> create table fnd_concurrent_queues_BK_jan as select * from fnd_concurrent_queues;
SQL> update fnd_concurrent_queues set NODE_NAME='node2Name'
where NODE_NAME='node1Name';

3 rows updated.

SQL> select unique(node_name) from fnd_concurrent_queues;
NODE_NAME
------------------------------
node2Name

SQL> commit;

After the table was updated, all concurrent mangers started on server node2Name.

2. When all concurrent manager processes were not fully shutdown or not properly shutdown in some situation (e.g. database connections were interrupted), CM services may fail on re-start. The number of processes in "Actual" and "Target" may not match with message " System Hold, Fix Manager before resetting counters ". In this case, first try shall be to use OAM to re-start individuals. For example (in Oracle Applications Manager version 2.3.1):

Site Map => Generic Services
On radio button select Output Post Processor (or, Conflict Resolution Manager ) => View Detail => Start (or verify) in the dropdown on the top => Go
(click on Next10) Generic Service Component Container => Workflow Agent Listener Service => Start in the dropdown on the top => Go

Site Map => Request Processing Manager
On radio button select Standard Manager => View Status tab on the upper => Start => Ok. Then click on Service Instances on the top to return to manager list.

3. When try to open a Request log or output file in GUI, it gives error:

An error occurred while attempting to establish an Applications File Server connection with the
node FNDFS_NODE2NAME. There may be a network configuration problem, or the TNS listener
on node FNDFS_NODE2NAME may not be running. Please contact your system administrator.

Here, Node2NAME is the concurrent server. After you make sure the listener (tnslsnr) is started, the first try is to ping the tnsname on Node1 (web/forms server): $ tnsping FNDFS_NODE2NAME
If you can not ping it, the real problem could be in $TNS_ADMIN/tnsnames.ora file on Node1. In one case, "domain.com" was not attached to Node2NAME in entries in tnsnames.ora file on Node1 server. After I re-ran autoconfig on all nodes, the problem got fixed.

Another place to check, make sure Profile options " RRA:% " have nothing strange.

If it works, below line shall list in "Concurrent => Manager => Administrator":
NAME                                   Node      Actual    Target   Status
Service Manager: Node1    Node1     1            0          {blank}

For Service Manager, if the Status shows "Target node/queue unavailable", clicking on "Restart" or OAM tries may not help. 

UPDATE in Sept 2020: Recently, seems the R12.1 clone script on RHEL7 did catch all info during the clone (maybe due to database connection error at the begining). After a clone completion and an adautocfg.sh run on all nodes, the tnsnames.ora file on CM node missed some entries. After 2nd run of adautocfg.sh on all nodes, some entries of tnsnames.ora on CM node missed "domain.com". tnsnames.ora file on CM node matches the original tnsnames.ora prior to clone until 3rd run of adautocfg.sh completed. 

4. Sometimes you are in a hurry to stop CM services but the OS processes keep running. You may see in the GUI form the "Actual" is non-zero while the "Target" is zero, which means there are still some requests are running. If you click "Terminate" the concurrent manager on the GUI, it will NOT kill its OS process. So this will not really speed up. The best way is to Find what are the running requests and then cancel the requests.

5. One time, Output Post Processor did not start. I checked file $APPLCSF/log/FNDOPPxxxxx.txt and saw error
Exception in static block of jtf.cache.CacheManager. Stack trace is: oracle.apps.jtf.base.resources.FrameworkException:
IAS Cache initialization failed. The Distributed Caching System failed to initialize on port: 12351. The list of hosts in the distributed caching system is: 157.121.49.41 157.121.53.42 157.121.53.43 . The port 12351 should be free on each host running the JVMs.


That means port 12351 is in use. After this port became free, Output Post Processor started.

6. One way to check if CM files on file system are good or not, run below report for a test. If all work, it will generate and save Active Users report for you even when all EBS services are shutdown at OS level.

$INST_TOP/ora/10.1.2/bin/appsrwrun.sh userid=apps/appsPWD mode=character report=$FND_TOP/reports/US/FNDSCURS.rdf \
batch=yes destype=file desname=./areport.out desformat=$FND_TOP/reports/HPL pagesize=132x66 traceopts=trace_all tracefile=areport.trc tracemode=trace_replace 

7. Below error may indicate that some node name is not registered in FND_NODES table. autoconfig may not run on all nodes after a cleaning.

List of errors encountered:
.............................................................................

_ 1 _
Concurrent Manager cannot find error description for CONC-System Node
Name not Registered

Contact your support representative.
.............................................................................


List of errors encountered:
.............................................................................

_ 1 _
Routine AFPCAL received failure code while parsing or running your
concurrent program CPMGR

Review your concurrent request log file for more detailed information.
Make sure you are passing arguments in the correct format.


8. If you do not want concurrent managers to run on a node at all, use two parameters in $CONTEXT_FILE to turn off them:

<oa_service_group_status oa_var="s_batch_status">disabled</oa_service_group_status><oa_service_group_status oa_var="s_other_service_group_status">disabled</oa_service_group_status>

I noticed once below error in a manager log file, while Concurrent processing works fine. I believe the error showed up after CM services got started on a wrong node by wrong values in $CONTEXT_FILE when the node got bounced from a crash. Oracle Support said it is a database issue. But I did not do anything on tables and the error went away by itself after above two values were set to "disabled" in $CONTEXT_FILE on that node.

Routine &ROUTINE has attempted to start the internal concurrent manager. The ICM is already running. Contact you system administrator for further assistance.afpdlrq received an unsuccessful result from PL/SQL procedure or function FND_DCP.Request_Session_Lock.
Routine FND_DCP.REQUEST_SESSION_LOCK received a result code of 1 from the call to DBMS_LOCK.Request.
Possible DBMS_LOCK.Request result ORACLE error 1036 in tag_db_session

Cause: tag_db_session failed due to ORA-01036: illegal variable name/number.

The SQL statement being executed at the time of the error was: ... and was executed from the file &ERRFILE.Call to establish_icm failed
The Internal Concurrent Manager has encountered an error.

Review concurrent manager log file for more detailed information. : 26-JUL-2015 10:52:55 -
Shutting down Internal Concurrent Manager : 26-JUL-2015 10:52:55

List of errors encountered:
.............................................................................
_ 1 _
Routine AFPCSQ encountered an ORACLE error. .
Review your error messages for the cause of the error. (=<POINTER>)
.............................................................................

List of errors encountered:
.............................................................................
_ 1 _
Routine AFPCAL received failure code while parsing or running your
concurrent program CPMGR
Review your concurrent request log file for more detailed information.
Make sure you are passing arguments in the correct format.
.............................................................................
The EBSXXXX_0726@EBSXXXX internal concurrent manager has terminated with status 1 - giving up.

Friday, January 15, 2016

Place all pending concurrent requests on hold

During a long maintenance of multiple tasks, apps services may be brought down and up for two or three times. You may want to hold all scheduled jobs from execution until all tasks are completed.

1. Create a temp table to hold all request IDs
SQL> create table apps.FND_CONC_REQ_S_V_onhold as
select decode(phase_code,
              'P','Pending',
                  'Running') Phase_code,
       decode(status_code, 
              'R', 'Normal',
              'W', 'Paused',
              'Q', 'Standby',
               'I', 'Scheduled') Status_code,
     decode(hold_flag,'Y','Yes','No') Hold_flag,
     request_id,
     substr(requestor,1,10) usern_name,
     to_char(requested_start_date,'MM/DD/YYYY HH24:MI:SS') req_start_date,
     substr(program,1,50) program
from FND_CONC_REQ_SUMMARY_V
where phase_code in ('P')  -- in Pending
and status_code != 'W'      -- not in Paused
and hold_flag != 'Y'            -- not on hold
and to_date(requested_start_date,'DD-MON-YY')
between to_date(sysdate,'DD-MON-YY')
       and to_date(sysdate+2,'DD-MON-YY')
order by phase_code desc, requested_start_date, request_id
;

2. Update table  fnd_concurrent_requests to hold them
SQL> UPDATE fnd_concurrent_requests
SET hold_flag = 'Y'
WHERE phase_code = 'P'
and status_code in ('Q','I')  -- Standby or Scheduled
and hold_flag = 'N'
and request_id in (
select request_id from apps.FND_CONC_REQ_S_V_onhold);

3. Check/confirm any running requests before shutting down concurrent managers, if you like.
SQL> SELECT decode(phase_code,
              'P','Pending',
                  'Running') Phase_code,
       decode(status_code, 
              'R', 'Normal',
              'W', 'Paused',
              'Q', 'Standby',
              'I', 'Scheduled') Status_code,
     decode(hold_flag,'Y','Yes','No') Hold_flag,
     request_id,
     substr(requestor,1,10) usern_name,
     to_char(requested_start_date,'MM/DD/YYYY HH24:MI:SS') req_start_date,
     to_char(actual_start_date,'MM/DD/YYYY HH24:MI:SS') actual_start_date,
     substr(program,1,50) program
  FROM fnd_conc_req_summary_v
  WHERE phase_code in ('R', 'P')  order by program;

4. After maintenance, put on-hold requests back to scheduler
SQL> UPDATE fnd_concurrent_requests
SET hold_flag = 'N'
WHERE phase_code = 'P'
and status_code in ('Q','I')  -- Standby or Scheduled
and hold_flag = 'Y'
and request_id in (
select request_id from apps.FND_CONC_REQ_S_V_onhold);

SQL> drop table apps.FND_CONC_REQ_S_V_onhold;  -- optional


UPDATE in 2022: In R12.2, when used below line to hold concurrent jobs for a period of time for database upgrade, and after upgrade logged onto GUI forms to cancel the HOLD, some concurrent jobs got "No Manager" status.

SQL> update  apps.fnd_concurrent_requests 
set hold_flag = 'Y' 
where status_code in ('Q','I') and hold_flag = 'N';

The problem is column EDITION_NAME of table fnd_concurrent_requests was populated with invalid data. The fix is to find the current edition and run an update to the column.  

SQL> select edition_name from all_editions;
EDITION_NAME
-----------------------
ORA$BASE

SQL> SELECT sys_context('USERENV', 'CURRENT_EDITION_NAME') FROM dual;

SQL> update fnd_concurrent_requests
set edition_name='ORA$BASE'
where request_id in (228633x, 228633x, 228632x, 228632x, 228624x, 228624x)
;

Concurrent Processing: What to Do When Concurrent Requests Are Not Processing and Have Inactive No Manager Status? (Doc ID 1311526.1)

Friday, January 8, 2016

Disable responsibilities to restrict EBS site access

During business events such as month-end close, only a small group of users are allowed to access EBS website. This can not be accomplished by placing it in maintenance mode. But we can disable most responsibilities to restrict and manage users' access without changing users' accounts.

-- disable responsibilities. Assume GL_SuperUser, AR_SuperUser are two special responsibilities
SQL> update FND_RESPONSIBILITY   
set end_date = sysdate   
where application_id = 30003   
and RESPONSIBILITY_ID in (select responsibility_id   
from FND_RESPONSIBILITY_TL   
where application_id = 30003   
and Responsibility_name not in ('GL_SuperUser', 'AR_SuperUser'))   
;   
Commit;

-- After maintenance event, run sql to enable the responsibilities (without bouncing Apps services)
SQL> update FND_RESPONSIBILITY
set end_date = null
where application_id = 30003   
and RESPONSIBILITY_ID in (select responsibility_id   
from FND_RESPONSIBILITY_TL   
where application_id = 30003   
and Responsibility_name not in ('GL_SuperUser', 'AR_SuperUser'))   
;   
Commit;

When a responsibility was created, it would be associated with an Application. Below query will find how many responsibilities were created for each application.
SQ> select a.application_id, b.application_short_name, count(*)
from FND_RESPONSIBILITY a , fnd_application b
where a.application_id = b.application_id
group by a.application_id, application_short_name
order by a.application_id;

application_id application_short_name count(*)
---------------  ----------------------------  ---------
0            FND               24
1            SYSADMIN    3
99          OAM              1
101        SQLGL           27
140        OFA               7
... ... ...
9001      MTH               2
9004      INL                 1
30003    PUGL             20     <== custom stuff
30004    PUAR             3
30005    PUFND           4

Additionally, below queries can answer questions:

1) the list of responsibilities for a user
2) the list of responsibilities for an application

SQL> SELECT user_name, frt.RESPONSIBILITY_NAME, furg.end_date, fr.application_id
FROM
fnd_user_resp_groups furg,
FND_RESPONSIBILITY fr,
fnd_responsibility_tl frt,
fnd_user fu
WHERE fu.user_id = furg.user_id
AND furg.responsibility_id = fr.RESPONSIBILITY_ID
AND frt.responsibility_id = fr.RESPONSIBILITY_ID
AND fu.user_name = 'EBSuserName'
-- AND fr.application_id = 30003
ORDER BY 1,2;

How many users use an application with each responsibility?
SQL> SELECT fr.application_id, frt.RESPONSIBILITY_NAME, count(*)
FROM
fnd_user_resp_groups furg,
FND_RESPONSIBILITY fr,
fnd_responsibility_tl frt,
fnd_user fu
WHERE fu.user_id = furg.user_id
AND furg.responsibility_id = fr.RESPONSIBILITY_ID
AND frt.responsibility_id = fr.RESPONSIBILITY_ID
group by fr.application_id, frt.RESPONSIBILITY_NAME
ORDER BY 1,2;

Monday, January 4, 2016

Files holding EBS R12.1 configuration info

In addition to $CONTEXT_FILE, $APPL_TOP/customXXXX.env, and other standard files, such as appsweb.cfg for Forms, there are other files hold configuration data for the customization of EBS instance. Good understanding where are those files helps instance refresh and troubleshooting.

======================================
Files shall be in $FND_TOP/admin/template/custom
======================================

1. $FND_TOP/admin/template/custom/rwbuilder_conf_1012.tmp
This file is used by autoconfig to generate file $INST_TOP/ora/10.1.2/reports/conf/rwbuilder.conf. See Doc ID 1322704.1
NOTES:  When creating a custom folder, there is a risk that a patch may later bring a newer version of template file.  AutoConfig may fail with conflicts error because the patch only copies the template files to upper (standard) folder. In this case, the template file in custom folder has to be re-created.

For example, a concurrent job completed with Error :
REP-0069: Internal error
REP-57054: In-process job terminated:Finished successfully but output is voided


To correct this, a workaround on setting up a value in rwbuilder.conf file as follows:
(a) Create a custom directory under $FND_TOP/admin/template:
      $ mkdir $FND_TOP/admin/template/custom
(b) Copy the file $FND_TOP/admin/template/rwbuilder_conf_1012.tmp to this custom directory.
(c) Change line in file rwbuilder_conf_1012.tmp in custom directory from
      <property name="cacheSize" value="0"/>
     to
     <property name="cacheSize" value="50"/>
(d) Save it and then run autoconfig. rwbuilder.conf will get generated and kept with new content.

2. $FND_TOP/admin/template/ssl_conf_1013.tmp
This file is used to generate $INST_TOP/ora/10.1.3/Apache/Apache/conf/ssl.conf (same as $ORA_CONFIG_HOME/10.1.3/Apache/Apache/conf/ssl.conf)

For example, use it to enable TLS1.0 (Doc ID 1937646.1).

Also use one line to rotate ssl_engine_log file (daily):
Replace
   SSLLog      %s_logs_dir%/ora/10.1.3/Apache/ssl_engine_log
with
   SSLLog   "|%s_weboh_oh%/Apache/Apache/bin/rotatelogs %s_logs_dir%/ora/10.1.3/Apache/ssl_engine_log %s_ohs_log_rotation_time%"

3. $FND_TOP/admin/template/httpd_conf_1013.tmp
This file is used to generate file $INST_TOP/ora/10.1.3/Apache/Apache/conf/httpd.conf

Add lines to this file to call a shell script .sh file on OS by a custom URL ebsSiteName.domain.com/CUST/bin for a custom webpage or button on a Form.
Find the true path by "grep -w s_at $CONTEXT_FILE" for s_at and replace it in below lines

  <IfModule mod_alias.c>
    ScriptAlias /cgi-bin/ "%s_weboh_oh%/Apache/Apache/cgi-bin/"
    ScriptAlias /CUST/bin/ "%s_at%/cust/html/bin/"   <== to define a directory on file system
    #
    # "%s_weboh_oh%/Apache/Apache/cgi-bin" should be changed to whatever your ScriptAliased
    # CGI directory exists, if you have that configured.
    #
    <Directory "%s_weboh_oh%/Apache/Apache/cgi-bin">
      ... ... ...
    </Directory>

    <Directory "%s_at%/cust/html/bin">                    <== add those lines. (replace s_at)
        AllowOverride None
        Options None
        Order allow,deny
        Allow from all
    </Directory>
  </IfModule>

4. $FND_TOP/admin/template/oracle_apache_conf_1013.tmp
This file is used to generate file $INST_TOP/ora/10.1.3/Apache/Apache/conf/oracle_apache.conf

For example, add one line:
include "%s_weboh_oh%/Apache/modplsql/conf/plsql.conf"    

NOTES:  Script adtmplreport.sh can find template file for Apache configuration files.
 $ cd  $AD_TOP/bin
 $ adtmplreport.sh contextfile=$CONTEXT_FILE target=$INST_TOP/ora/10.1.3/Apache/Apache/conf/ssl.conf
 It will generate a log file showing which template file is for ssl.conf.

Other files, such as
- forms_server_xml_1013.tmp
- orion_web_xml_1013.tmp

==========================================
AUTOCONFIG does not modify below files, but read them
==========================================

5. $IAS_ORACLE_HOME/Apache/modplsql/conf/dads.conf
Add below lines to this DAD file to hold the password for custom PLSQL webpages.

<Location /pls/EBSQA>
SetHandler pls_handler
Order allow,deny
Allow from All
AllowOverride None
PlsqlDatabaseUsername apps
PlsqlDatabasePassword appsPWD
plsqlDatabaseConnectString dbServerName.domain.com:1562:EBSQA
PlsqlAuthenticationMode Basic
PlsqlDefaultPage fnd_web.ping
</Location>

6. Files in $INST_TOP/certs/Apache folder. 
They are ssl certificates if ssl is enabled for the site.

7. $APPL_TOP/admin/adsign.txt and $APPL_TOP/admin/adkeystore.dat
They are used for Java Signing.

8. $FND_TOP/fndenv.env
 $CUSTOM_TOP will be defined in this file.
… … … 
# Call devenv.env to set up development environment
        . ${FND_TOP}/${APPLUSR}/devenv.env
#Begin Customizations
MYGL_TOP=$APPL_TOP/mygl;   export MYGL_TOP
#End Customizations

===============================
AUTOCONFIG modify and re-create them
===============================
9. $INST_TOP/ora/10.1.2/forms/server/default.env
This file holds custom Forms location.
… … …
#Begin Customizations
MYGL_TOP=$APPL_TOP/mygl
#End Customizations


Sunday, January 3, 2016

Building an Index ONLINE

Creating an index on a busy table may fail when user's DML is using the table (in 11G database).

SQL> CREATE INDEX owner.index_IDX1 ON owner.table_name(col1, clm_id, col2) TABLESPACE tablespace_name;
    *
ERROR at line 1:
 ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired


Specify ONLINE to indicate that DML operations on the table will be allowed during creation of the index. Oracle will create an extra log table to keep track of any data changes without locking the table. There are restrictions on Online Index Building. One of them is that parallel DML is not supported during online index building. Oracle Database returns an error, if you specify ONLINE and then issue parallel DML statements.

Besides, other options, such as PARALLEL and NOLOGGING, will speed up the index creation.

SQL> CREATE INDEX owner.index_IDX1 ON owner.table_name(col1, clm_id, col2) TABLESPACE tablespace_name PARALLEL ( DEGREE 4 ) nologging online;
Index created.

With parallel, Oracle may automatically spawn parallel executions for every single SELECT statement. After that, the two options can be turned off by below statement:

SQL> alter index owner.index_IDX1 logging noparallel;

Extra notes: statement to turn on monitoring index usage (usually shall exclude PK from monitoring):

SQL> alter index owner.index_IDX1 monitoring usage ;