Oracle Database Administration: 2007

Wednesday, December 19, 2007

Foreign Key Constraint Error ORA-02298

While trying to create FK on cust_status got the following error.

ALTER TABLE CUST_STATUS ADD (CONSTRAINT CUST_STATUS_FK FOREIGN KEY (LOGIN) REFERENCES USERS(LOGIN))
*
ERROR at line 1:
ORA-02298: cannot validate (SCOTT.CUST_STATUS_FK) - parent keys not found

SQL> desc cust_status
Name Null? Type
----------------------------------------- -------- ----------------------------
CUSTID NOT NULL NUMBER(8)
TIMESTAMP NOT NULL DATE
STATUS NOT NULL VARCHAR2(35)
LOGIN VARCHAR2(20) ID NUMBER

SQL> desc users
Name Null? Type
----------------------------------------- -------- ----------------------------
LOGIN NOT NULL VARCHAR2(20) PASSWORD VARCHAR2(30)
LAST_NAME VARCHAR2(50)
FIRST_NAME VARCHAR2(50)
EMAIL VARCHAR2(100)
ACTIVE CHAR(1)

To find out which rows are the problem rows,

SQL> select custid,login from cust_status a
where not exists (select 'x' from users where login = a.login);

You can delete the data using...

SQL> delete from cust_status
where login is null or login not in ( select login from users )

Now you should be able to create the FK constraint.

Friday, December 7, 2007

How to test Oracle Apps after a patch

How to test Oracle Applications after any maintenance, for example patch

Access the Application Home Page (http://host_name:port)
This verifies Oracle HTTP Server is up and running.

Login to Apps as "SYSADMIN"
If you can get to that page, it verifies JSERV is up and running as this page is served by JServ

Once you are able to login to the Apps, this verifies your connection to the database is working.

Click on Help button on the top right portion, if you can access the help page, it verifies you connection from the front end to the database is working as that page is served by the database.

Click on the Concurrent Manager, if you get to the page where it shows all the jobs, it verifies Forms is up and running. Try submitting a job and see if it runs successfully, it proves Concurrent Manager is up and running.

Thursday, December 6, 2007

DG Failover Steps

Purpose : Failover to standby database in case the primary database server crashes.
Assumptions : Primary Database SID : CHDP1
Standby Database SID : CHDS1
flashback is on

Now the Primary datbase server crashes and no longer accessible.

Step by Step Instructions.

Login to standby database (CHDS1)
SQL> alter database recover managed standby database finish force;
SQL> alter database commit to switchover to primary;

In very urgent situations when you can not wait for some of the logs to be applied.
SQL> alter database activate standby database;

Now at this point in time your standby database (CHDS1) becomes primary databse.

After few hours, your original primary database server (which has CHDP1 database) comes back up and you wanted to make it (CHDP1) a standby database.

Login to CHDS1

SQL> select to_char(standby_became_primary_scn) from v$database;
TO_CHAR(STANDBY_BECAME_PRIMARY_SCN)
--------------------------------------------------
1234567

Login to CHDP1
SQL> startup mount;
SQL> flashback database to scn 1234567;
SQL> alter database convert to physical standby;
SQL> shutdown immediate;
SQL> startup mount;

Wednesday, December 5, 2007

Check Kernel Parameters before installing software

To check Kernel parameter for Linux before installing oracle(you can add additional stuff as needed)
----- start check_OS_linux.sh -----------
echo "Checking kernel parameters..."
/sbin/sysctl -a 2>&1 | grep sem | grep -v error
/sbin/sysctl -a 2>&1 | grep shm | grep -v error
/sbin/sysctl -a 2>&1 | grep file-max | grep -v error
/sbin/sysctl -a 2>&1 | grep ip_local_port_range | grep -v error
/sbin/sysctl -a 2>&1 | grep rmem_ | grep -v error
/sbin/sysctl -a 2>&1 | grep wmem_ | grep -v error

echo "Checking OS components ..."
rpm -q binutils
rpm -q libaio
rpm -q gcc
rpm -q libstdc++
rpm -q libstdc++-devel
rpm -q gcc-c++
rpm -q glibc
rpm -q gnome-libs
rpm -q make
rpm -q pdksh
rpm -q sysstat
----------------- end script check_OS_linux.sh -----------------

To check kernel parameters for Solaris

---------------- start script check_OS_solaris.sh ----------------
/sbin/sysctl -a 2>&1 | grep sem | grep -v error

/sbin/sysctl -a 2>&1 | grep shm | grep -v error

/sbin/sysctl -a 2>&1 | grep file-max | grep -v error

/sbin/sysctl -a 2>&1 | grep ip_local_port_range | grep -v error

/sbin/sysctl -a 2>&1 | grep rmem_ | grep -v error

/sbin/sysctl -a 2>&1 | grep wmem_ | grep -v error
----------------- end script check_OS_solaris.sh ----------------

RMAN Backup types

Full
A backup of a datafile that includes every allocated block in the file being backed up. A full backup of a datafile can be an image copy, in which case every data block is backed up. It can also be stored in a backup set, in which case datafile blocks not in use may be skipped, according to certain rules.

A full backup cannot be part of an incremental backup strategy; that is, it cannot be the parent for a subsequent incremental backup.

Incremental
An incremental backup is either a level 0 backup, which includes every block in the file except blocks compressed out because they have never been used, or a level 1 backup, which includes only those blocks that have been changed since the parent backup was taken.

A level 0 incremental backup is physically identical to a full backup. The only difference is that the level 0 backup is recorded as an incremental backup in the RMAN repository, so it can be used as the parent for a level 1 backup.

Saturday, November 17, 2007

How to setup VIP (Virtual IP) in a DG setup

I will write soon...

How to start databases in a DG setup

Scenario : Power outage at your data center brought down the databases in the DG configuration.

Solution : Start up the primary database
SQL> startup
: startup mount the standby database
SQL> startup mount

Login to Primary node and start dgmgrl
DGMGRL> connect sys/password
DGMGRL> enable configuration;
DGMGRL> enable fast_start failover;
DGMGRL> show configuration;

It should show you the primary database and physical standby database

Thursday, November 1, 2007

Some Important Linux/Unix Commands

How to find bit level in Unix
$ isainfo -kv

In Linux
$ uname -a
gives you OS, version and bit level.

Find out KDE Desktop version:
konqueror --version

Find out Gnome Desktop version:
gnome-panel --version

Find out Mozilla browser version:
mozilla --version

Find out Firefox browser version:
firefox --version

Find out current Language:
set | egrep '^(LANG|LC_)'

Find out disk space usage:
df -h

Find/Estimate file space usage:
du -h

Find out version of Linux glibc:
ls -l /lib/libc-*.so /lib/libc.so*

Find out user limits:
ulimit -a

Find out installed device drivers (modules)
lsmod

Find out information about an X server:
xdpyinfo
It can find out:
• Name of display:
• Version number
• Vendor name (such as The XFree86 Project)
• Vendor release number
• And XFree86 version number

Find out information about Linux CPU
cat /proc/cpuinfo

Find out information about Linux Memory
cat /proc/meminfo
OR
free -m
OR
free -g

Find out user shell name:
ps -p $$ | tail -1 | awk '{ print $4 }'

Dump Linux kernel variables
/sbin/sysctl -a

Find out running Linux kernel version:
uname -mrs
uname -a
cat /proc/version

Dump or display memory information and swap information:
free -m

Network card and IP address information:
ifconfig -a
ifconfig -a|less

Debian / Ubuntu Linux network configuration file (all interface eth0,eth1,…ethN)
more /etc/network/interfaces
Redhat / CentOS / Fedora Linux network configuration file (eth0)
more /etc/sysconfig/network-scripts/ifcfg-eth0
Note replace eth1 for 2nd network card and so on.

Display routing information
route -n
route

Display list of all open ports
netstat -tulpn

View login related logs
tail -f /var/log/secure
vi /var/log/secure
grep 'something' /var/log/secure
View mail server related logs
tail -f /var/log/maillog
vi /var/log/maillog
grep 'something' /var/log/maillog

Find how long the system has been running
uname
w

Show who is logged on and what they are doing
w
who

Display list of tasks
top

Display all running process
ps aux
ps aux | grep process-name

Display list of all installed software on Redhat / CentOS / Fedora
rpm -qa
rpm -qa | grep 'software-name'
rpm -qa | less

Display list of all installed software on Debian / Ubuntu
dpkg --list

Once information collected it can be easily send as an email to help desk. You can use all above command to gathers information about a remote Linux system over secure ssh session (see related functions that gathers up information about a Linux and FreeBSD system). Best part is all above commands runs in non privileged

Wednesday, October 31, 2007

Update rows using replace in SQL*Plus

For Example

SQL> desc rk
Name Null? Type
----------------------------------------- -------- ----------------------------
CONTENT_VALUE_SMALL VARCHAR2(512 CHAR)

SQL> select * from rk;

http://i.a.rk.net/v5cache/TBS/veryf/images/i0/3telecom_expo_400_304.jpg

Now you want to replace rk.net with xyz.net

SQL> update rk set CONTENT_VALUE_SMALL=replace(CONTENT_VALUE_SMALL,'http://i.a.rk','http://i.a.xyz');

Manual installation of Oracle Text 10g

Manual installation of Text 10gR1 (10.1.0.x) and (10.2.0.x)

1. Text dictionary, schema name CTXSYS, is created by calling following script from SQL*Plus connected as SYSDBA:

SQL> connect SYS/password as SYSDBA
SQL>@?/ctx/admin/catctx.sql CTXSYS SYSAUX TEMP NOLOCK

Where:CTXSYS - is the ctxsys user password
SYSAUX - is the default tablespace for ctxsys
TEMP - is the temporary tablespace for ctxsys
LOCK|NOLOCK - ctxsys user account will be locked or not

2. The next step is to install appropriate language-specific default preferences.To manually install US default preferences, for example, log into sqlplus as CTXSYS, and run 'drdefus.sql' as described below:

SQL> connect CTXSYS/password
SQL>@?/ctx/admin/defaults/drdefus.sql

Tuesday, October 30, 2007

CPIO Command

Extract the file:
cpio -idmv < file_name

Thursday, October 25, 2007

Find out last sql statement issued from the same session

select sql_text from v$sql
where hash_value = ( select prev_hash_value from v$session
where sid = (select sid from v$mystat where rownum = 1))
/

Thursday, September 27, 2007

How to find object/schema size

Find SCHEMA SIZE or USER SIZE

SELECT tablespace_name,
Sum(bytes)/1024/1024 AS total_size_mb
FROM dba_segments
WHERE owner = Upper('&User_Name')
GROUP BY owner, rollup(tablespace_name)

Find OBJECT Size

select segment_name, sum(bytes/(1024*1024)) size_mb from user_segments
where segment_type = 'TABLE' and segment_name='&SEGMENT_NAME'
group by segment_name

Simple Unix Commands AWK, SED etc....

Find files greater than 1M

$> find . -depth -xdev -size +1000000c -print

Delete files older than 3 days

$> find . -mtime +3 -print -exec rm * {} ;

To find out what is running on a port

$> lsof -i tcp grep port number

Change contents of file from lower case to upper case

$> cat x.lst tr [a-z] [A-Z] > x.sql

AWK COMMANDS

To print the first two fields in opposite order, enter:

$> awk '{ print $2, $1 }' chapter1

To display all lines between the words start and stop, including "start" and "stop", enter:

$> awk '/start/,/stop/' chapter1

To run an awk command program, sum2.awk, that processes the file, abc1, enter:

$> awk -f sum2.awk abc1

contents of file ‘abc1’
2 2 2 2
3 3 3 3
4 4 4 4
5 5 5 5
------- Start of Program sum2.awk ------
{
sum += $2
}
END {
print "Sum: ", sum;
print "Average:", sum/NR;
}
------- End of Program sum2.awk ------
Explanation: The first action adds the value of the second field of each line to the variable sum. All variables are initialized to the numeric value of 0 (zero) when first referenced. The pattern END before the second action causes those actions to be performed after all of the input file has been read. The NR special variable, which is used to calculate the average, is a special variable specifying the number of records that have been read.

****************************************************************************************
Repeat everyline in the file and prefix it with some word

Example :

Source : a.sh

abc
def
ghi

Output : b.sh

echo abc
tnsping abc
echo def
tnsping def
echo ghi
tnsping ghi

Solution

for line in `cat a.sh`
do
echo echo "$line"
echo tnsping $line
done >> b.sh

*********************************************************************************

Bring the 1st line at the end of 2nd line and then 3rd line at the end of 4th line and so on…..

Example :

Input

ADMINDB1
SYSTEM/manager@
ADSERVD1
SYSTEM/manager@

Output

SYSTEM/manager@ADMINDB1
SYSTEM/manager@ADSERVD1

Solution

awk 'NR % 2 == 0' testfile > file1
awk 'NR % 2 ' testfile > file2
paste file1 file2 tr -d 't'

******************************************************

Wednesday, September 26, 2007

DBMS_JOB

Here's few examples of how to submit the jobs.

The following would create a job to refresh a MV everyday at Midnight starting from tomorrow

SQL> variable jno number;
SQL> exec dbms_job.submit(:jno,'DBMS_MVIEW.REFRESH(''MV_TESTING'',''C'');',trunc(sysdate)+1,'sysdate+1');

Here's another example

declare
jno number;
begin
dbms_job.submit(job => jno,
what => 'my_proc(abc);'
next_date => sysdate+1,
interval => 'sysdate+1');
end;
/

next_date = > next date/time you want to run the job to run. An argument of DATE datatype must be passed in.

Example : to_date('09/30/2007 11:24 AM','MM/DD/YYYY HH:MI AM');
OR
sysdate+1 Same time tomorrow
trunc(sysdate)+1 Midnight tomorrow
trunc(sysdate)+19/24 7 pm (1900 Hrs) today

interval => at which frequency the job should be run . The argument passed is in varchar2.
Example
'sysdate+1' 24 hours after the job's current run time
'trunc(sysdate)+1' Midnight after the day the job is run
'trunc(sysdate)+19/24' 7 PM on the day the job is run

Monday, September 24, 2007

Data Guard Tips

Shutting Down a Physical Standby Database

To shut down a physical standby database, use the SQL*Plus SHUTDOWN command. If the database is performing managed recovery, you must cancel managed recovery operations before issuing the SHUTDOWN command.

If the primary database is up and running, defer the archive log destination on the primary database and perform a log switch operation (to make the defer operation take effect) before shutting down the standby database. Otherwise, log transport services will not be able to transmit redo data to this standby site.

The following steps show you how to shut down a standby database:

Find out if the standby database is performing managed recovery. If the MRP0 or MRP process exists, then the standby database is performing managed recovery.

You can find out which instance is the apply-instance from the ps -eaf | grep mrp command at the OS Level OR using SQL*Plus at the standby database

SQL> select b.host_name,b.INSTANCE_NAME,a.PROCESS,a.STATUS
from gv$managed_standby a, gv$instance b where a.inst_id=b.inst_id

Cancel managed recovery operations.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Shut down the standby database.
SQL> SHUTDOWN IMMEDIATE;

Monday, September 10, 2007

Generate Test Data

Came across this website for generating test data.

http://www.generatedata.com/#about

http://www.sqledit.com/dg/

http://www.igs-edv.de/tdg_e/

Where is your Query stuck

If you know the process ID you can find the SID, SERIAL#, SQL_ID and other details using the following query: (process id can be taken from the unix TOP command)

-- To find the SID, username and SQL_ID associated with a process id taken from unix
select a.sid,a.serial#,a.username,a.status,a.sql_id
from v$session a,v$process b
where b.spid='&PROCESS_ID'
and a.paddr=b.addr
/

You can take a look at the actual explain_plan by executing the following

select operation,options,object_name,object_alias,object_type
from v$sql_plan where sql_id=''

Select from v$session_longops to find out where your query is

select qcsid,sid,username,opname,target,sofar,totalwork,units,start_time,
time_remaining,elapsed_seconds,message
from v$session_longops
order by qcsid
/

Thursday, September 6, 2007

Refreshing a Materialized View owned by other user

Suppost USER_A owns the MV and USER_B wants the privileges to execute it.

While trying to refresh a materialized view owned by other users you might get

SQL> EXECUTE DBMS_MVIEW.REFRESH('SALES_SUMMARY','F');
ORA-01031: insufficient privileges
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2255
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2461
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2430
ORA-06512: at line 2

In order to refresh a materialized view owned by other user, you must have the following privileges in addition to privileges on objects owned by USER_A which are being used in the MV.

SQL> GRANT ALTER ANY MATERIALIZED VIEW TO &USER_B

The DBMS_MVIEW package can manually invoke either a fast refresh or a complete refresh. F means Fast Refresh and C means Complete Refresh:

EXECUTE DBMS_MVIEW.REFRESH('SALES_SUMMARY','F');

**********
The privileges required to create a materialized view log directly relate to the privileges necessary to create the underlying objects associated with a materialized view log.

If you own the master table, you can create an associated materialized view log if you have the CREATE TABLE privilege.

If you are creating a materialized view log for a table in another user's schema, you must have the CREATE ANY TABLE and COMMENT ANY TABLE system privileges, as well as either the SELECT object privilege on the master table or the SELECT ANY TABLE system privilege.

In either case, the owner of the materialized view log must have sufficient quota in the tablespace intended to hold the materialized view log or must have the UNLIMITED TABLESPACE system privilege.

Tuesday, September 4, 2007

How does one clone database with RMAN

How does one clone/duplicate a database with RMAN?

The first step to clone or duplicate a database with RMAN is to create a new INIT.ORA and password file (use the orapwd utility) on the machine you need to clone the database to. Review all parameters and make the required changed. For example, set the DB_NAME parameter to the new database's name.

Secondly, you need to change your environment variables, and do a STARTUP NOMOUNT from sqlplus. This database is referred to as the AUXILIARY in the script below.

Lastly, write a RMAN script like this to do the cloning, and call it with "rman cmdfile dupdb.rcv":

connect target sys/secure@origdb

connect catalog rman/rman@catdb

connect auxiliary /

run {

set newname for datafile 1 to '/ORADATA/u01/system01.dbf';

set newname for datafile 2 to '/ORADATA/u02/undotbs01.dbf';

set newname for datafile 3 to '/ORADATA/u03/users01.dbf';

set newname for datafile 4 to '/ORADATA/u03/indx01.dbf';

set newname for datafile 5 to '/ORADATA/u02/example01.dbf';

allocate auxiliary channel dupdb1 type disk;

set until sequence 2 thread 1;

duplicate target database to dupdb

logfile

GROUP 1 ('/ORADATA/u02/redo01.log') SIZE 200k REUSE,

GROUP 2 ('/ORADATA/u03/redo02.log') SIZE 200k REUSE;

}

The above script will connect to the "target" (database that will be cloned), the recovery catalog (to get backup info), and the auxiliary database (new duplicate DB). Previous backups will be restored and the database recovered to the "set until time" specified in the script.

Notes: the "set newname" commands are only required if your datafile names will different from the target database.

The newly cloned DB will have its own unique DBID.

PL/SQL Procedure to update values

PL/SQL Procedure to update values in a table based on values taken from another table.

EXAMPLE :
SQL> desc change_log

Name Null? Type
----------------------------------------- -------- ----------------------------
LOG_ID NOT NULL NUMBER(11)
POLL_ID NOT NULL NUMBER(11)
QUESTION_ID NOT NULL NUMBER(11)
ANSWER_ID NOT NULL NUMBER(11)
COUNT NOT NULL NUMBER(11)
TIME NOT NULL DATE

SQL> desc answers

Name Null? Type
----------------------------------------- -------- ----------------------------
ANSWER_ID NOT NULL NUMBER(11)
QUESTION_ID NOT NULL NUMBER(11)
POLL_ID NOT NULL NUMBER(11)
ANSWER_TEXT NOT NULL VARCHAR2(512)
ANSWER_VOTES NOT NULL NUMBER(11)

## Update answers table based on certain conditons on the change_log table.

EXECUTE DBMS_SESSION.SET_NLS('NLS_DATE_FORMAT','''DD-MON-YYYY HH24:MI:SS''');
SET SERVEROUTPUT ON;
execute dbms_output.enable(1000000000);

Declare
Begin
for rec in (
select sum(NVL(change_log.count,0)) as count_total , POLL_ID,QUESTION_ID,ANSWER_ID
from change_log
group by POLL_ID,QUESTION_ID,ANSWER_ID
)
LOOP
begin
update answers
set answers.ANSWER_VOTES= NVL(rec.count_total,0)
where answers.ANSWER_ID=rec.ANSWER_ID
and answers.QUESTION_ID=rec.QUESTION_ID
and answers.POLL_ID=rec.POLL_ID
and answers.answer_id=3;
exception when Others then
dbms_output.put_line ( rec.ANSWER_ID || ' - ' ||rec.ANSWER_ID||' - '||rec.POLL_ID);
end;
END LOOP;
End;
/

Load Test Data

Many times development teams asks DBAs to populate test data so that they can perform some performance test.

Below you will find some examples to load test data.

SQL> create table sales
(trans_date date, cust_id int, sales_amount number );

SQL> insert /*+ APPEND */ into sales
select trunc(sysdate,'year')+mod(rownum,366) TRANS_DATE,
mod(rownum,100) CUST_ID,
abs(dbms_random.random)/100 SALES_AMOUNT
from all_objects;
SQL> commit;

SQL> begin
for i in 1 .. 4
loop
insert /*+ APPEND */ into sales
select trans_date, cust_id, abs(dbms_random.random)/100
from sales;
commit;
end loop;
end;
/
PL/SQL procedure successfully completed.

SQL> commit;

## SELECT A RANDOM VALUE ##
select cust_id from (select cust_id from sales order by dbms_random.value) where rownum=1;

Some other commands using DBMS_RANDOM to generate and populate values

SELECT dbms_random.normal FROM dual;
SELECT ABS(dbms_random.normal) FROM dual;
SELECT (1+ABS(MOD(dbms_random.random,100000))) FROM dual; ==>> force output to +ve values only
SELECT dbms_random.value(2, 3) FROM dual; ==>> Get a value between 2 and 3
## Random String ##
dbms_random.string(opt IN CHAR, len IN NUMBER)
RETURN VARCHAR2;

select dbms_random.string('A', 12) from dual; ==>> random alphanumeric values

opt seed values:
'a','A' alpha characters only (mixed case)
'l','L' lower case alpha characters only
'p','P' any printable characters
'u','U' upper case alpha characters only
'x','X' any alpha-numeric characters (upper)

-- create test data == another example ==
CREATE TABLE test (
col1 VARCHAR2(20),
col2 VARCHAR2(20));

DECLARE
x VARCHAR2(20);
y VARCHAR2(20);
BEGIN
FOR i IN 1..100
LOOP
x := dbms_random.string('A', 20);
y := dbms_random.string('A', 20);

INSERT INTO test
(col1, col2)
VALUES
(x,y);
END LOOP;
COMMIT;
END;
/

Monday, July 30, 2007

RAC Architecture Overview

RAC Architecture Overview

Let's begin with a brief overview of RAC architecture.

A cluster is a set of 2 or more machines (nodes) that share or coordinate resources to perform the same task.

A RAC database is 2 or more instances running on a set of clustered nodes, with all instances accessing a shared set of database files.

Depending on the O/S platform, a RAC database may be deployed on a cluster that uses vendor clusterware plus Oracle's own clusterware (Cluster Ready Services), or on a cluster that solely uses Oracle's own clusterware.

Thus, every RAC sits on a cluster that is running Cluster Ready Services. srvctl is the primary tool DBAs use to configure CRS for their RAC database and processes.

Cluster Ready Services and the OCR

Cluster Ready Services, or CRS, is a new feature for 10g RAC. Essentially, it is Oracle's own clusterware. On most platforms, Oracle supports vendor clusterware; in these cases, CRS interoperates with the vendor clusterware, providing high availability support and service and workload management. On Linux and Windows clusters, CRS serves as the sole clusterware. In all cases, CRS provides a standard cluster interface that is consistent across all platforms.

CRS consists of four processes (crsd, occsd, evmd, and evmlogger) and two disks:
the Oracle Cluster Registry (OCR), and the voting disk.
CRS manages the following resources:
The ASM instances on each node
Databases
The instances on each node
Oracle Services on each node

The cluster nodes themselves, including the following processes, or "nodeapps":
VIP
GSD
The listener
The ONS daemon

CRS stores information about these resources in the OCR. If the information in the OCR for one of these resources becomes damaged or inconsistent, then CRS is no longer able to manage that resource. Fortunately, the OCR automatically backs itself up regularly and frequently.
Interacting with CRS and the OCR: srvctl
srvctl is the tool Oracle recommends that DBAs use to interact with CRS and the cluster registry. Oracle does provide several tools to interface with the cluster registry and CRS more directly, at a lower level, but these tools are deliberately undocumented and intended only for use by Oracle Support. srvctl, in contrast, is well documented and easy to use. Using other tools to modify the OCR or manage CRS without the assistance of Oracle Support runs the risk of damaging the OCR.
Using srvctl
Even if you are experienced with 9i srvctl, it's worth taking a look at this section; 9i and 10g srvctl commands are slightly different.
srvctl must be run from the $ORACLE_HOME of the RAC you are administering. The basic format of a srvctl command is
srvctl [options]
where command is one of
enabledisablestartstoprelocatestatusaddremovemodifygetenvsetenvunsetenvconfig
and the target, or object, can be a database, instance, service, ASM instance, or the nodeapps.
The srvctl commands are summarized in this table:
Table 1. Summary of srvctl commands.
Command
Targets
Description
srvctl addsrvctl modifysrvctl remove
databaseinstanceservicenodeapps
srvctl add / remove adds/removes target's configuration information to/from the OCR.
srvctl modify allows you to change some of target's configuration information in the OCR without wiping out the rest.
srvctl relocate
service
Allows you to reallocate a service from one named instance to another named instance.
srvctl config
databaseservicenodeappsasm
Lists configuration information for target from the OCR.
srvctl disablesrvctl enable
databaseinstanceserviceasm
srvctl disable disables target, meaning CRS will not consider it for automatic startup, failover, or restart. This option is useful to ensure an object that is down for maintenance is not accidentally automatically restarted.
srvctl enable reenables the specified object.
srvctl getenvsrvctl setenvsrvctl unsetenv
databaseinstanceservicenodeapps
srvctl getenv displays the environment variables stored in the OCR for target.
srvctl setenv allows these variables to be set, and unsetenv unsets them.
srvctl startsrvctl statussrvctl stop
databaseinstanceservicenodeappsasm
Start, stop, or display status (started or stopped) of target.
As you can see, srvctl is a powerful utility with a lot of syntax to remember. Fortunately, there are only really two commands to memorize: srvctl -help displays a basic usage message, and srvctl -h displays full usage information for every possible srvctl command.

Examples for using srvctl

Example 1. Bring up the MYSID1 instance of the MYSID database.
[oracle@myserver oracle]$ srvctl start instance -d MYSID -i MYSID1

Example 2. Stop the MYSID database: all its instances and all its services, on all nodes.
[oracle@myserver oracle]$ srvctl stop database -d MYSID

Example 3. Stop the nodeapps on the myserver node. NB: Instances and services also stop.
[oracle@myserver oracle]$ srvctl stop nodeapps -n myserver

Example 4. Add the MYSID3 instance, which runs on the myserver node, to the MYSID clustered database.
[oracle@myserver oracle]$ srvctl add instance -d MYSID -i MYSID3 -n myserver

Example 5. Add a new node, the mynewserver node, to a cluster.[oracle@myserver oracle]$ srvctl add nodeapps -n mynewserver -o $ORACLE_HOME -A 149.181.201.1/255.255.255.0/eth1
(The -A flag precedes an address specification.)

Example 6. To change the VIP (virtual IP) on a RAC node, use the command
[oracle@myserver oracle]$ srvctl modify nodeapps -A new_address

Example 7. Find out whether the nodeapps on mynewserver are up.[oracle@myserver oracle]$ srvctl status nodeapps -n mynewserverVIP is running on node: mynewserverGSD is running on node: mynewserverListener is not running on node: mynewserverONS daemon is running on node: mynewserver

Example 8. Disable the ASM instance on myserver for maintenance.
[oracle@myserver oracle]$ srvctl disable asm -n myserver
Debugging srvctl
Debugging srvctl in 10g couldn't be easier. Simply set the SRVM_TRACE environment variable.
[oracle@myserver bin]$ export SRVM_TRACE=true
Let's repeat Example 6 with SRVM_TRACE set to true:[oracle@myserver oracle]$ srvctl status nodeapps -n mynewserver/u01/app/oracle/product/10.1.0/jdk/jre//bin/java -classpath /u01/app/oracle/product/10.1.0/jlib/netcfg.jar:/u01/app/oracle/product/10.1.0/jdk/jre//lib/rt.jar:/u01/app/oracle/product/10.1.0/jdk/jre//lib/i18n.jar:/u01/app/oracle/product/10.1.0/jlib/srvm.jar:/u01/app/oracle/product/10.1.0/jlib/srvmhas.jar:/u01/app/oracle/product/10.1.0/jlib/srvmasm.jar:/u01/app/oracle/product/10.1.0/srvm/jlib/srvctl.jar -DTRACING.ENABLED=true -DTRACING.LEVEL=2 oracle.ops.opsctl.OPSCTLDriver status nodeapps -n mynewserver[main] [19:53:31:778] [OPSCTLDriver.setInternalDebugLevel:165] tracing is true at level 2 to file null[main] [19:53:31:825] [OPSCTLDriver.:94] Security manager is set[main] [19:53:31:843] [CommandLineParser.parse:157] parsing cmdline args[main] [19:53:31:844] [CommandLineParser.parse2WordCommandOptions:900] parsing 2-word cmdline[main] [19:53:31:866] [GetActiveNodes.create:212] Going into GetActiveNodes constructor...[main] [19:53:31:875] [HASContext.getInstance:191] Module init : 16[main] [19:53:31:875] [HASContext.getInstance:216] Local Module init : 19...[main] [19:53:32:285] [ONS.isRunning:186] Status of ora.ganges.ons on mynewserver is trueONS daemon is running on node: mynewserver[oracle@myserver oracle]$

Pitfalls

A little impatience when dealing with srvctl can corrupt your OCR, ie, put it into a state where the information for a given object is inconsistent or partially missing. Specifically, the srvctl remove command provides the -f option, to allow you to force removal of an object from the OCR. Use this option judiciously, as it can easily put the OCR into an inconsistent state.

Restoring the OCR from an inconsistent state is best done with the assistance of Oracle Support, who will guide you in using the undocumented $CRS_HOME/bin/crs_* tools to repair it. The OCR can also be restored from backup.

Error messages

srvctl errors are PRK% errors, which are not documented in the 10gR1 error messages manual. However, for those with a Metalink account, they are documented on Metalink here.
Conclusion

srvctl is a powerful tool that will allow you to administer your RAC easily and effectively. In addition, it provides a valuable buffer between the DBA and the OCR, making it more difficult to corrupt the OCR.

Global Cache Service GCS and Global Enqueue Service GES together manage the Cache Fusion processes, resource transfer and resource escalation among the instances. Enqueues are internal oracle locks; GCS to handle the buffer cache across all instances.

GCS and GES together maintain Global Resource Directory (GRD). GRD remains in the memory and is stored on all instances. GRD records current status of the data blocks.
It is distributed across all instances in a cluster and is located in the variable or shared pool section of the SGA.

RAC Processes : LMON – Global Enqueue Service Monitor (LMON) monitors the entire cluster. It manages instance and process failures and associated recovery for GCS and GES.

LMDx – Global Enqueue Service Daemon (LMD) is the lock agent. It also handles dead lock detection and remote enqueue requests.

LMSx – Global Cache Service Processes are the processes that handle remote GCS messages. Can have upto 10 in number. LMS is interconnect process and monitors the block transfer between instances.

LCKx – It manages Global Enqueue requests and the cross-instance broadcast.

DIAG – Diagnosability Daemon – Monitors the health of the instance and captures the data for instance process failures. Can be in non-rac database too after 9i.

The use of single SPFILE provide administrative ease and SPFILE has to be located on a clustered file system.

OCR contains cluster and database configuration information for RAC and CRS, instance and

Voting Disk is a file on a shared cluster system; used to maintain cluster integrity or cluster membership.

CRS starts up first and at that time ASM is not up, so we can not have OCR and voting disk on ASM.
$CRS_HOME/log contains the alert log and

ASM – has its init.ora file which tells which instances it has.

Nodeapps : VIP/GSD/Listener/Oracle Notification Service (ONS).

VIP (virtual IP) – CRS resource associated with an ip address.

TAF – Transparent Application Failover

ASM

ASM stores the metadata that is required to make available the files stored within the ASM storage system to non-ASM oracle databases.

Two additional Processes : RBAL – Rebalancer Process to rebalance activity for ASM disk groups and ARBn – Actual rebalancer process to rebalance of data extent movements.

wmos=
(DESCRIPTION=
(LOAD_BALANCE=yes)
(FAILOVER=on)
(ADDRESS= (PROTOCOL=tcp)(HOST=wmsdevrac1-vip.acme.com)(PORT=1526))
(ADDRESS= (PROTOCOL=tcp)(HOST=wmsdevrac2-vip.acme.com)(PORT=1526))
(CONNECT_DATA=
(SERVICE_NAME=wmos)
(FAILOVER_MODE=
(TYPE=select)
(METHOD=basic))))

Overview of Transparent Application Failover
Uncommitted insert, update, and delete commands are rolled-back and must be resubmitted after reconnection. Again, use of the OCI packages should be utilized to have the DML operations reissued.

The Oracle Net process carries out TAF functionality. The failover is configured in the tnsnames file. The TAF settings are placed in the net service name area, within the connect_data section of the tnsnames, using the failover_mode and instance_role parameters.

Load Balancing

The listener connection load-balancing feature improves connection performance by balancing the number of active connections among multiple dispatchers and instances. In a single-instance environment, the listener selects the least-loaded dispatcher to handle the incoming client requests. In an Oracle Database 10g Real Application Clusters environment, connection load balancing also has the capability to balance the number of active connections among multiple instances. Due to dynamic service registration, a listener is always aware of all instances, and in the case of the multi-threaded server (MTS), a listener is aware of the dispatchers, regardless of their locations. Depending on the load information, a listener decides which instance, and if the multi-threaded server (shared server) is configured, it decides which dispatcher to send the incoming client request. In a MTS configuration, a listener selects a dispatcher in the following order: