Wednesday, November 19, 2008

Find a foreign key references to a table

Foreign key can be created in the table creation or by "alter table ..."

CREATE TABLE my_table
(
  column1 datatype null/not null,
  column2 datatype null/not null,
  ...

  CONSTRAINT constrint_fk
    FOREIGN KEY (column1, column2, ... column_n)
    REFERENCES parent_table (col_1, col_2, ... col_n)
);

ALTER TABLE my_table ADD (
  CONSTRAINT constrint_fk
    FOREIGN KEY (column1, column2, ... column_n)
    REFERENCES parent_table (col_1, col_2, ... col_n)
  enable validate);

Here, my_table is the child table to the foreign key constrint_fk. Columns (col_1, col_2, ... col_n) shall be the primary key or an unique index of the parent_table. Columns (column1, column2, ... column_n) shall be an index of my_table to avoid 'eng: TM' locks in a busy database when the parent_table gets DELETE or UPDATE operations frequently.

When I tried to truncate a table before refreshing it, I got error "ORA-02266: unique/primary keys in table referenced by enabled foreign keys". Below query will identify all foreign keys referenced by child tables:

SELECT DISTINCT a.table_name CHILD_NAME, a.constraint_name F_KEY, a.status, a.last_change
  FROM dba_constraints a, dba_constraints b
WHERE a.OWNER = UPPER('<owner>')
   AND   a.r_constraint_name = b.constraint_name
   AND   a.constraint_type = 'R'

   AND   b.constraint_type in ('P', 'U')
   AND   b.table_name = UPPER('<MY_TABLE>');

Then, run "alter table owner.CHILD_NAME disable constraint F_KEY;" to disable all constraints (foreign keys). After that, I can truncate table MY_TABLE. But if I want to drop the table, I need to drop the constraints first.

One table can have many foreign keys referencing multiple parent tables. Below query finds all foreign keys that links MY_TABLE to all parent tables:

SELECT c.table_name, c.constraint_name, c.constraint_type, c2.table_name P_table, c2.constraint_name P_constraint
  FROM dba_constraints c JOIN dba_constraints c2 ON (c.r_constraint_name = c2.constraint_name)
 WHERE  c.OWNER = UPPER('<owner>')
     AND c.table_name =UPPER('<MY_TABLE>')
     AND c.constraint_TYPE = 'R';


When I delete/update data in table MY_TABLE, I do not need to disable foreign keys to all parent table generated from above query.

Tuesday, November 18, 2008

hanganalyze and systemstate

Oracle Support defines a "true" database hang as "an internal deadlock or a cyclical dependency between two or more processes." When dealing with DML locks (that is, enqueue type TM), Oracle is able to detect this dependency and roll back one of the processes to break the cyclical condition. On the other hand, when this situation occurs with internal kernel-level resources (such as latches or pins), Oracle is usually unable to automatically detect and resolve the deadlock.

To troubleshoot the hung database with the hang in progress, Oracle provides utilities to use kernel calls to identify blocking and waiting sessions.

- Hanganalyze

Hanganalyze is a newer utility to get a trace, and can be invoked from SQL *Plus or through oradebug (which is available when connected as SYS in the SQL *Plus utility).

SQL> connect / as sysdba
Connected.
SQL> oradebug hanganalyze 3
Hang Analysis in /u01/app/oracle/admin/biqa/udump/biqa_ora_1962226.trc
SQL> oradebug hanganalyze 3
Hang Analysis in /u01/app/oracle/admin/biqa/udump/biqa_ora_1962226.trc
SQL> exit

or

SQL > alter session set events 'immediate trace name hanganalyze level 3';

Various levels and trace information:

1-2 Only hanganalyze output, no process dump at all
3 Level 2 + Dump only processes thought to be in a hang (IN_HANG state)
4 Level 3 + Dump leaf nodes (blockers) in wait chains (LEAF, LEAF_NW, IGN_DMP state)
5 Level 4 + Dump all processes involved in wait chains (NLEAF state)
10 Dump all processes (IGN state)

In RAC environment, above command is slightly different because an lock in an instance may be from processes of aonther instance. See Metalink Note 175006.1.

- SYSTEMSTATE Dump

In a database hang situation, Oracle Support may ask systemstate dumps to diagnose the root cause of the problem. It is important to take at least three of them a few minutes apart, on all instances of your database, to capture evidence shows whether a resource is still being held from one time to the next.

1. By oradebug
SQL> conn / as sysdba
Connected.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug unlimit;
Statement processed.
SQL> oradebug dump systemstate 266
Statement processed.
SQL> oradebug dump systemstate 266
Statement processed.
SQL> exit

In a RAC, oradebug allows you to dump the global system state by connecting to one node "SQL> oradebug -g all dump systemstate 266". The -g option will dump system states for all the instances. The SYSTEMSTATE dump/trace file can be found in the user_dump_dest directory on the instance where the dump was generated.

2. By Sql*Plus
It is recommended to reconnect every time so that it gets a new process ID and also the new trace files.
SQL> ALTER SESSION SET MAX_DUMP_FILE_SIZE=UNLIMITED;
-- Take the 1st systemstate level 266:
SQL> alter session set events 'immediate trace name SYSTEMSTATE level 266';
-- Take the 2nd systemstate level 266:
SQL> alter session set events 'immediate trace name SYSTEMSTATE level 266';
-- Take the 3rd systemstate level 266:
SQL> alter session set events 'immediate trace name SYSTEMSTATE level 266';

The dump file size depends on the size of the System Global Area (SGA), the number of sessions logged in, and the workload on the system. If using systemstate level 266 takes much longer (hours) than expected to generate the dump file, then end the systemstate dump and try level 258 (or even 10).

- SQL *Plus with the prelim option

When the entire database is hung and you cannot connect to SQL *Plus, you can try invoking SQL *Plus with the prelim option if you're using Oracle 10g or later. This attaches the process to the Oracle instance and no SQL commands are run. No login triggers or no pre-processing is done, and no SQL queries are allowed to run.

$sqlplus -prelim
SQL*Plus: Release 10.2.0.1.0 - Production on Wed Nov 9 11:42:23 2005 Copyright (c) 1982, 2005, Oracle. All rights reserved.
Enter user-name: / as sysdba
SQL>

Monday, October 6, 2008

DBVERIFY

DBVERIFY is an Oracle utility for verifying the structure in data files when you experiences ORA-01578 / ORA-08103 or any other kind of corrupt messages.

During a database recovery, I gor error from creating control file:

ORA-01503: CREATE CONTROLFILE failed
ORA-01565: error in identifying file
'/t10/app/oracle/oradata/lawtest/lawts_accons_ptn_idx_7_1.dbf'
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 5: I/O error
Additional information: 2

DBVERIFY provided great evidence to System Admin that the copy of some datafiles is a bad copy by OS command, bacause DBVERIFY showed no error on other files:

$ dbv file=lawts_accons_ptn_idx_7_1.dbf blocksize=16384
DBVERIFY: Release 9.2.0.6.0 - Production on Sun Apr 27 15:03:36 2008
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
DBV-00100: Specified FILE (lawts_accons_ptn_idx_7_1.dbf) not accessible

$ dbv file=lawts_acbudhdr_ptn_idx_max_1.dbf blocksize=16384
DBVERIFY: Release 9.2.0.6.0 - Production on Sun Apr 27 15:04:11 2008
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
DBVERIFY - Verification starting : FILE = lawts_acbudhdr_ptn_idx_max_1.dbf
DBVERIFY - Verification complete

Total Pages Examined : 16384
Total Pages Processed (Data) : 0
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 9996
Total Pages Failing (Index): 0
Total Pages Processed (Other): 120
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 6268
Total Pages Marked Corrupt : 0
Total Pages Influx : 0
Highest block SCN : 36435995208 (8.2076256840)

Blocksize is specified by parameter DB_BLOCK_SIZE and is necessary in running DBV.

If ASM storage is used, we need to specify USERID to get authenticated on ASM Instance. The below script will get lines for running DBV on all datafiles of a database on ASM:
select 'dbv file=' || name || ' blocksize='|| block_size || ' USERID=sys/&SYS_PASSWORD logfile=' ||
substr(name, instr(name, '/', -1, 1) +1) || '.' || file# || '.log'
from v$datafile;

Wednesday, October 1, 2008

Place a blackout in Grid Control

There are two ways to place a blackout in Grid Control.
- From the Grid Control UI (Console), using Setup to create a blackout or using the icon on each Target page.
- From the command line at the agent home.

Here are the commands for creating a blackout:

- Place a blackout at host level. In $AGENT_HOME/bin
$ ./emctl start blackout test_dev3 -nodelevel -d 12:00
Oracle Enterprise Manager 10g Release 10.2.0.1.0.
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
Blackout test_dev3 added successfully
EMD reload completed successfully

$./emctl status blackout
Oracle Enterprise Manager 10g Release 10.2.0.1.0.
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
Blackoutname = Blackout-Sep 11 2008 11:47:59 PM
Targets = (bidev.corpnet.com:oracle_database,)
Time = ({2008-9-11|20:50:2|3600 Sec,|} )
Expired = True

Blackoutname = test_dev3(NODE_LEVEL)
Targets = (dbdev:host,)
Time = ({2008-10-01|11:09:33|720 Min,|} )
Expired = False

- Place a blackout on a target / database. In $AGENT_HOME/bin
$ ./emctl start blackout test_dev3 bidev.corpnet.com:oracle_database -d 03:00
Oracle Enterprise Manager 10g Release 10.2.0.1.0.
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
Blackout test_dev3 added successfully
EMD reload completed successfully

./emctl status blackout
Oracle Enterprise Manager 10g Release 10.2.0.1.0.
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
Blackoutname = Blackout-Sep 11 2008 11:47:59 PM
Targets = (bidev.corpnet.com:oracle_database,)
Time = ({2008-9-11|20:50:2|3600 Sec,|} )
Expired = True

Blackoutname = test_dev3
Targets = (bidev.corpnet.com:oracle_database,)
Time = ({2008-10-01|12:50:49|180 Min,|} )
Expired = False

- To stop the blackout, run
$ ./emctl stop blackout test_dev3
Oracle Enterprise Manager 10g Release 10.2.0.1.0.
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
Blackout test_dev3 stopped successfully
EMD reload completed successfully

- Additional notes
1. After "emctl start blackout test_dev3 ..." is executed, it will add an entry in $AGENT_HOME/sysman/emd/blackouts.xml. When "emctl stop blackout test_dev3 ..." is run, the entry is removed from the file. It seems if the blackout expires before the stop command is run, the entry will stay in the file for ever.
2. If you do not use the "-d" option in running "emctl start blackout ..." command, the GC console (Setup --> Blackouts) will list the blackout with "Ended" status and "Indefinite" duration immediately. It seems the blackout without "-d" option stops out quickly. It may be a bug in Grid Control 10.2.0.1.0.
3. Checking the blackout status from the repository table MGMT_BLACKOUTS.
SQL> select * from sysman.MGMT_BLACKOUTS
where BLACKOUT_NAME like 'test_dev3%';

Tuesday, September 30, 2008

CRS related commands

Oracle provides useful commands to manage CRS (Cluster Ready Services):

crs_stat -t --> Shows HA resource / service status (hard to read)
crsstat --> Ouptut of crs_stat -t formatted nicely (see Metalink note 259301.1)
crs_stop -all --> Stops all registered resources (but keeps CRS running!)
crs_start -all --> Starts all registered resources
crsctl check crs --> Verifies CSS,CRS,EVM functioning
crsctl stop crs --> Stops crs and all other services
crsctl start crs --> Starts crs and all other services

Two commands
crsctl disable crs --> Prevents CRS from starting on reboot
crsctl enable crs --> Enables CRS start on reboot
will update the file /etc/oracle/scls_scr/Node_name/root/crsstart which contains the string “enable” or “disable” as appropriate.

Command "ps -ef | grep d.bin" will check three main background processes. They are normally started by init during the operating system boot process. They can be started and stopped manually by issuing the command /etc/init.d/init.crs {start | stop | enable | disable} (or /etc/init.crs {start | stop | enable | disable})

oracle 2498 2091 0 Aug 18 - 8:18 /u01/crs/oracle/product/crs/bin/evmd.bin
root 2580 1927 0 Aug 18 - 705:21 /u01/crs/oracle/product/crs/bin/crsd.bin reboot
oracle 2662 2542 0 Aug 18 - 557:13 /u01/crs/oracle/product/crs/bin/ocssd.bin
root 2785 2951 0 Aug 18 - 1:59 /u01/crs/oracle/product/crs/bin/oprocd.bin run -t 1000 -m 500 -f

Here is a short description of each of the CRS daemon processes (Note 259301.1):

CRSD:
- Engine for HA operation
- Manages 'application resources'
- Starts, stops, and fails 'application resources' over
- Spawns separate 'actions' to start/stop/check application resources
- Maintains configuration profiles in the OCR
- Stores current known state in the OCR.
- Runs as root
- Is restarted automatically on failure
OCSSD:
- OCSSD is part of RAC and Single Instance with ASM
- Provides access to node membership
- Provides group services
- Provides basic cluster locking
- Integrates with existing vendor clusteware, when present
- Can also runs without integration to vendor clustware
- Runs as Oracle.
- Failure exit causes machine reboot. --> This is a feature to prevent data corruption in event of a split brain.
EVMD:
- Generates events when things happen
- Spawns a permanent child evmlogger
- Evmlogger, on demand, spawns children
- Scans callout directory and invokes callouts.
- Runs as Oracle.
- Restarted automatically on failure

Once the above processes are running, they will automatically start the following services in the following order if they are enabled.

- The nodeapps (gsd, VIP, ons, listener) are brought online.
- The ASM instances are brought online.
- The database instances are brought online.
- Any defined services are brought online.

Outputs from running some commands:

[root:/dssdb1]# crsctl stop crs
Stopping resources.
This could take several minutes.
Successfully stopped CRS resources.
Stopping CSSD.Shutting down CSS daemon.
Shutdown request successfully issued.
[root:/dssdb1]#

[root:/dssdb1]# crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly
[root:/dssdb1]#

[oracle:/dssdb1]$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
[oracle:/dssdb1]$

If you see message other than above lines, you may use "ps -ef | grep d.bin" and "crs_stat -t" to check the status on each daemon and service.

Oracle provides an utility CLUVFY in $ORACLE_HOME/bin to verify the CRS as well:
[oracle:/dssdb1]$ cluvfy stage -post crsinst -n all -verbose