Sunday, January 15, 2023

EBS session status in the database & on OS

When an OS process uses high CPU or other OS resources, use below line to find its start time:
$  ps -eo pid,lstart,cmd | grep 21846
74063 Tue Dec 19 12:29:31 2022 grep --color=auto 21486
21486 Tue Dec 19 07:39:01 2022 frmweb server webfile=HTTP-0,0,1,default

If a forms process (frmweb) exists on OS level but does not have a responding session in the database, it is a run-away and it can be terminated.  

Below SQL statement will tell and monitor what an EBS session is doing in the database after its process ID is identified (e.g. 21846) on OS level of EBS apps server. 

SQL> SELECT to_char(S.logon_time, 'DD-MON-RRRR HH24:MI:SS') "logon_time", to_char(S.PREV_EXEC_START, 'DD-MON-RRRR HH24:MI:SS') "last op started", 
    client_identifier, module, action, status, machine, round(last_call_et/60,1) "Minutes in current status",
    sql_id, blocking_session, process, sid, serial#,
    STATE,
    ROUND((S.WAIT_TIME_MICRO/1000000)/60/60,1) "total wait hrs",
    DECODE(S.TIME_REMAINING_MICRO,'-1', 'indefinite', '0','Wait timed out',NULL,'session not waiting') "remaining wait time", 
    DECODE(S.TIME_SINCE_LAST_WAIT_MICRO, '0','Still waiting') "current wait status",
    S.time_since_last_wait_micro
   FROM v$session s
WHERE process = '21846' 
-- order by client_identifier
;
  
  BLOCKING_SESSION - Tell if it is blocked by another session.
  STATE - The state of the wait event.
  WAIT_TIME_MICRO - The wait time of the wait event in microseconds.
  TIME_REMAINING_MICRO - The time remaining in microseconds before the wait event times out.
  TIME_SINCE_LAST_WAIT_MICRO - The amount of time in microseconds elapsed from the last wait to the current wait.

Normally, a session for a running concurrent request will show the SQL_ID.

An oacore process may have many sessions in the database. To find which process ID is for oacore, try "$ ps -ef | grep oacore " on OS level of apps server, or run below line in the database (if XX is part of a CUSTOM top name).

SQL> select distinct process, machine from v$session 
where action like '%XX%' order by machine;

When an OACORE process taking high CPU, find the process ID and run following commands :

1) ps -eLo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm | grep PID
==> which thread is taking CPU.  Most of them should show 0.0 (which is the CPU column).

I did see an OACORE process (e.g. 13030) supports 125 JVM threads in one of my instances. So, it is not good to simply kill an oacore process.
$ ps -eLo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm | grep 13030 | wc -l
125

2) jmap -histo <jvm_process_id> > /tmp/ <jvm_process_id>.histo
==> will dump heap usage details.  

3) jmap -dump:format=b,file= <jvm_process_id>.jmap <jvm_process_id>
==> will generate an actual heap dump, binary format

4) kill -3 <jvm_process_id>
==> output goes to 12.2 oacore_server1.out. It will generate stack trace, run twice a minute apart.
kill -3 <jvm_process_id>

5) lsof -p <jvm_process_id> > <jvm_process_id>.log
==> List of open ports/files

6) SELECT audsid, module, last_call_et, action from gv$session where process = '&jvm_process_id';
==> corresponding DB sessions

7) log / incident files from that oacore :

$EBS_DOMAIN_HOME/servers/oa*/logs/*
$EBS_DOMAIN_HOME/servers/oa*/adr/diag/ofm/EBS_domain*/oacore*/incident/incdir*/*

No comments: