Monday 20 June 2011

Oracle database backup, restore and recovery - notes to self

Backup, Restore and Recovery with Oracle 
(Based on notes from Ganesh's friend largely based on Oracle docs)


Backup
A database backup is a copy of the data stored in a database. There are two types of backup: physical backups and logical backups. 
A physical backup is a copy of the files that belong to a database at a specific point in time. It can be either a cold backup or a hot backup. 
A logical backup does not copy the files; instead, it extracts the data into a specific file format (for example, with exp). 
A backup is needed to restore the data in a database if it gets damaged. 
Some structural changes, as well as direct-path loads, also call for a fresh backup. 


Consistent and inconsistent backups


Consistent backup
A consistent backup exhibits the following three properties: 
All headers of datafiles that belong to writable tablespaces have the same checkpoint SCN. 
These datafiles contain no changes past this checkpoint SCN. 
Lastly, the SCNs in the datafile headers match the checkpoint information in the controlfiles. 


Inconsistent backup
An inconsistent backup is one in which some files contain changes that were made after the files were checkpointed. A recovery is needed to make the backup consistent. 
A hot backup produces an inconsistent backup, as does a backup taken after the database crashed (or was shut down with shutdown abort). 
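As a minimal sketch of a hot backup of a single tablespace (the tablespace name users and the file paths are examples only, not from the notes above):

```sql
-- Hot (online) backup of one tablespace - a sketch, not a full procedure.
ALTER TABLESPACE users BEGIN BACKUP;
-- While the tablespace is in backup mode, copy its datafiles at OS level,
-- e.g. from a shell: cp /u01/oradata/ORCL/users01.dbf /backup/
ALTER TABLESPACE users END BACKUP;
-- Archive the current redo log so the backup can later be recovered:
ALTER SYSTEM ARCHIVE LOG CURRENT;
```

Any backup taken this way is inconsistent and needs recovery (redo applied) before the restored datafiles are usable.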


Restore
A restore is the copying of backed-up data to its original location in order to recover the database. 
Usually, a file is restored after a media failure. However, there are other situations where a file is restored, for example a point-in-time recovery. 


Recovery
Recovery is the process of bringing the database "up to date" since the last backup (whose data files would have been restored). All database changes are applied to the restored database files by "rolling forward" through the archived redo log files. 
It is also possible to recover (incompletely) if the current redo log or the controlfiles are lost. 
In rare cases, the unarchived (online) redo log files can also be applied during recovery, if that makes sense in the scenario.


Types of recovery
There are four types of recovery: 
• Media/Datafile media recovery 
• Crash recovery 
Oracle performs an automatic crash recovery when the instance is started after a shutdown abort or after a crash. 
A crash recovery uses 'only' the online redo log to recover the online datafiles. 
Crash recovery is the recovery of a database in a single-instance configuration or an Oracle Real Application Clusters configuration in which all instances have crashed. In contrast, instance recovery is the recovery of one failed instance by a live instance in an Oracle Real Application Clusters configuration.
• Instance recovery 
Instance recovery is used only in a RAC environment. It is the application of the online redo log by one instance after it detects the crash of another instance.
• Disaster recovery 


There are at least five options to be considered for disaster recovery when running Oracle: 
• OPS (Oracle Parallel Server) 
• Replication 
• Standby databases 
• OS oriented solutions 
• Architectural solutions 


A recovery (if it is not a crash recovery) can be either a 
• Complete recovery, or an 
• Incomplete recovery, also known as point-in-time recovery (PITR) or database point-in-time recovery (DBPITR)
In an incomplete recovery, not all redo generated since the last backup is applied. 
Flashback can sometimes be used as an alternative to point-in-time recovery.
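As a sketch of that alternative (assuming flashback logging is enabled and the timestamp is an example; syntax is the SQL*Plus/SQL form):

```sql
-- Flashback as a PITR alternative - requires a flashback-enabled database.
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
FLASHBACK DATABASE TO TIMESTAMP
  TO_TIMESTAMP('2004-03-21 22:59:04', 'YYYY-MM-DD HH24:MI:SS');
-- As after an incomplete recovery, open with RESETLOGS:
ALTER DATABASE OPEN RESETLOGS;
```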


Performing backup and recovery


Backups and recovery can be performed either 
• with RMAN (the recovery manager) or 
• using SQL*Plus and operating system file copy commands (such as cp on Unix and copy on Windows). 
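For comparison, a minimal RMAN session might look like this (a sketch only; the connect string and any configuration are omitted):

```
$ rman target /
RMAN> BACKUP DATABASE PLUS ARCHIVELOG;
-- later, with the database mounted, to restore and recover:
RMAN> RESTORE DATABASE;
RMAN> RECOVER DATABASE;
```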


recover syntax


recover database until cancel
recover database until time '2004-03-21:22:59:04'
recover database until change 123456


recover datafile 'filename' until cancel
recover datafile 'filename' until time '2004-03-21:22:59:04'
recover datafile 'filename' until change 123456


recover tablespace ts_name until cancel
recover tablespace ts_name until time '2004-03-21:22:59:04'
recover tablespace ts_name until change 123456


recover database using backup controlfile


Until time
Performs an incomplete recovery (= point-in-time recovery). 
The format of the time is 'YYYY-MM-DD:HH24:MI:SS' 


Until change
Performs an incomplete recovery. 
The number that follows until change is an SCN. Recovery stops just before the specified SCN, so to recover up to and including SCN 999, use recover until change 1000. 


Standby database
Recovers a standby database. 


Using backup controlfile
Don't use the current controlfiles; use backed-up controlfiles instead. 


Useful SQL to run in a recovery scenario


select substr(name, 1, 50), status from v$datafile;
select substr(name,1,40), recover, fuzzy, checkpoint_change# from v$datafile_header;
select name, open_mode, checkpoint_change#, ARCHIVE_CHANGE# from v$database;
select GROUP#,THREAD#,SEQUENCE#,MEMBERS,ARCHIVED,STATUS,FIRST_CHANGE# from v$log;
select GROUP#,substr(member,1,60) from v$logfile;
select * from v$recover_file;
select distinct status from v$backup;
select  hxfil FILENUMBER,fhsta STATUS,fhscn SCN,fhrba_Seq SEQUENCE from x$kcvfh;
select fhsta, count(*) from X$KCVFH group by fhsta;
select * from v$log;
select * from v$logfile;

Thursday 9 June 2011

SQL syntax for banded count of counts

This is a simple example of SQL syntax for banded count of counts (works on Teradata 12.0)


select case when a.num_occ > 10000 then 'a. > 10000'
                      when a.num_occ > 5000    then 'b. > 5000'
                      when a.num_occ > 2000    then 'c. > 2000'
                      when a.num_occ > 1000    then 'd. > 1000'
                      when a.num_occ > 500     then 'e. > 500'
                      when a.num_occ > 250     then 'f. > 250'
                      when a.num_occ > 100     then 'g. > 100'
                      when a.num_occ > 50      then 'h. > 50'
                      when a.num_occ > 10      then 'i. > 10'
                      when a.num_occ > 1       then 'j. >1'
                      else 'k. = 1'
                      end
            , count(*), sum(a.num_occ)
from (select id, count(*) as num_occ
          from sandbox.events
          group by id) a
group by 1
order by 1 desc

Friday 3 June 2011

Teradata data migration from one Teradata system to another using arcmain

Problem

Need to migrate data from Teradata system SRC (i.e. SouRCe) to Teradata system TGT (i.e. TarGeT).

Approach

Create a pair of named pipes called $SRC_FIFO and $TGT_FIFO.
Link these together by cat'ing $SRC_FIFO and outputting into $TGT_FIFO and running in the background like this:

cat $SRC_FIFO > $TGT_FIFO &

The Producer for $SRC_FIFO is the arcmain process extracting data from SRC.
The Consumer for $SRC_FIFO is the cat command above i.e. cat $SRC_FIFO

The Producer for $TGT_FIFO is the input to $TGT_FIFO from the cat command above i.e. > $TGT_FIFO.
The Consumer for $TGT_FIFO is the arcmain process copying data into TGT.
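The same plumbing can be demonstrated without Teradata at all (the pipe names below are hypothetical; echo stands in for the arcmain ARCHIVE producer and the final cat for the arcmain COPY consumer):

```shell
# Demonstrate the producer/consumer FIFO chain used by the migration script.
SRC_FIFO=/tmp/demo_fifo_src.$$
TGT_FIFO=/tmp/demo_fifo_tgt.$$
mkfifo "$SRC_FIFO" "$TGT_FIFO"

# Link the two pipes in the background, exactly as the script does.
cat "$SRC_FIFO" > "$TGT_FIFO" &

# Producer: write into the source pipe in the background.
echo "row data flowing through" > "$SRC_FIFO" &

# Consumer: read from the target pipe.
RESULT=$(cat "$TGT_FIFO")
wait
rm -f "$SRC_FIFO" "$TGT_FIFO"
echo "$RESULT"
```

Everything written into $SRC_FIFO comes out of $TGT_FIFO, which is all the two arcmain processes need.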

Supporting Files

A single file is required with a list of tables and their associated databases in the SRC system with the following format:

databaseX.tableA
databaseY.tableB
...
databaseZ.tableC

Note: in this version, there should be no whitespace on any line in the file.
In the code, this file is referenced as $ARC_OBJ_LIST.
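The script turns each db.table line into arcmain syntax with awk. For example (hypothetical database and table names, mirroring the SRC branch of create_arcmain_script in the listing below):

```shell
# Each "database.table" line becomes a quoted ("db"."table"), entry for
# the ARCHIVE command; %c with the value 34 prints a double-quote.
OUT=$(printf 'databaseX.tableA\ndatabaseY.tableB\n' |
  awk -F'.' '{ printf("(%c%s%c.%c%s%c),\n", 34, $1, 34, 34, $2, 34) }')
echo "$OUT"
```

This prints one entry per input line, e.g. ("databaseX"."tableA"), on the first line.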

Command line

To run the code, use something like:

./mig_test2dev.ksh ARC_OBJ_LIST=/tmp/example_archive_file_list 2>&1 | tee mig_test2dev.log

where the list of objects to be migrated is in the file /tmp/example_archive_file_list

Code

The code can be found at the end of this post.

Potential Problems

It is tempting to improve the throughput of both arcmain processes by increasing the SESSIONS=NN argument, but I have had "general communication" errors when setting this argument to higher values.

Future improvements

This script was coded quickly and there are parts that need improvement, especially to make it generic enough to run in any environment.
  • Allow whitespace in the file with the list of database tables (see example below)
  • Add a break-point to a staging area to support migration across remote sites
    • Add compression support to the arcmain procedures
    • Add chunking of arcmain archives (breaking into xGB chunks)
    • Add copying across a network via SSH
  • Add a check to determine whether the db.table already exists in the TGT system
    • If so, allow the option of suffixing with a YYYYMMDD date or an NNN version number
  • GUI - there is already a product to do this - so probably not
Use and Support

Happy for the code to be used (at your own risk).
There is no support for this code - though would be happy to assist if able.

Code Listing


$ cat mig_test2dev.ksh
#!/bin/ksh
# Program: mig_test2dev.ksh
# Description: Teradata data migration script.
# Moves data between TD system SRC (ie SouRCe) and TD system TGT (ie TarGeT)
# Version: 1.0 gml 31-May-2011 Initial version
# ---------------------------------------------------------------------------------------

err_msg() {

echo "ERROR: in $PRG - $1"
echo "usage: mig_test2dev.ksh ARC_OBJ_LIST="
echo "example: ./mig_test2dev.ksh ARC_OBJ_LIST=/tmp/example_archive_file_list 2>&1 | tee mig_test2dev.log"
exit 1

}

# Remove remnant FIFOs
cleanup_fifo() {

if [ -n "$DEBUG" ]
then
set -x
fi

SRC_OR_TGT=$1
# clean up existing ARC_FIFO
if [ -p "${ARC_FIFO}_${SRC_OR_TGT}" ]
then
rm ${ARC_FIFO}_${SRC_OR_TGT}
fi

}

# create the arcmain scripts
create_arcmain_script() {

if [ -n "$DEBUG" ]
then
set -x
fi

SRC_OR_TGT=$1
THE_ARCMAIN_SCRIPT=${ARCMAIN_SCRIPT}_${NOW}_${SRC_OR_TGT}.arc
TMP_ARC_SCRIPT=${ARC_OBJ_LIST}.$$ # To create a temp script file
THE_ARC_FIFO=${ARC_FIFO}_${SRC_OR_TGT} # The FIFO is named in the arcmain script (instead of a filename)

if [ "$SRC_OR_TGT" = "SRC" ]
then
LOGON_STR="${TD_SRC}/${TD_SRC_LOGON}"
ARC_CMD="ARCHIVE"
cat ${ARC_OBJ_LIST} | awk -F'.' ' { printf("(%c%s%c.%c%s%c),\n",34,$1,34,34,$2,34) } ' > ${TMP_ARC_SCRIPT}
else # [ "$SRC_OR_TGT" = "TGT" ]
LOGON_STR="${TD_TGT}/${TD_TGT_LOGON}"
ARC_CMD="COPY"
cat ${ARC_OBJ_LIST} | awk -F'.' -v v_tgt_db=$TGT_DB -v v_tgt_tab_suffix=$TGT_TAB_SUFFIX ' { printf("(%c%s%c.%c%s%s%c) (FROM(%c%s%c.%c%s%c)),\n",34,v_tgt_db,34,34,$2,v_tgt_tab_suffix,34,34,$1,34,34,$2,34) } ' > ${TMP_ARC_SCRIPT}
fi


# build the arcmain_script from the arcmain_script_master
if [ -n "${TMP_ARC_SCRIPT}" ]
then
cat <<EOF >${THE_ARCMAIN_SCRIPT}
logon ${LOGON_STR};
${ARC_CMD} DATA TABLES
EOF
cat ${TMP_ARC_SCRIPT} >> ${THE_ARCMAIN_SCRIPT}
cat <<EOF >>${THE_ARCMAIN_SCRIPT}
RELEASE LOCK,
FILE=${THE_ARC_FIFO};
LOGOFF;
EOF

rm ${TMP_ARC_SCRIPT}

else
err_msg "No TMP_ARC_SCRIPT (${TMP_ARC_SCRIPT}) produced"
fi

}

# Main Program

# evaluate any arguments like X=A Y=B ... Z=C
eval $@

# if the DEBUG flag is set, then show the result/impact of every line
if [ -n "$DEBUG" ]
then
set -x
fi

# set up variables
PRG=`basename $0`
NOW=`date +"%Y%m%d%H%M%S"` # used (with $$) to allow multiple executions of this
TD_DEV=10.172.6.41 # Dev TD system
TD_TEST=10.172.6.71 # Test TD system
TD_SRC=${TD_SRC:-${TD_TEST}}
TD_TGT=${TD_TGT:-${TD_DEV}}
TD_SRC_LOGON=${TD_SRC_LOGON:-"ARCUSER1,xxxx"}
TD_TGT_LOGON=${TD_TGT_LOGON:-"ARCUSER1,yyyy"}
TD_SRC_LOGON_STR="${TD_SRC}/${TD_SRC_LOGON}"
TD_TGT_LOGON_STR="${TD_TGT}/${TD_TGT_LOGON}"
TGT_DB=${TGT_DB:-"ST3NF03_DB01"}
TGT_TAB_SUFFIX=${TGT_TAB_SUFFIX:-"001"}  # use suffix to avoid overwriting - needs to be more sophisticated

if [ -z "${ARC_OBJ_LIST}" ]
then
err_msg "The ARC_OBJ_LIST argument is missing"
fi

if [ ! -f "${ARC_OBJ_LIST}" ]
then
err_msg "ARC_OBJ_LIST (${ARC_OBJ_LIST}) file does not exist"
fi

HOME_DIR=/home/glourei1 # my home dir
SCRIPT_DIR=${HOME_DIR}/bin # script dir
BASE_ARC_OBJ_LIST=`basename ${ARC_OBJ_LIST}`
LOCK_FILE=$SCRIPT_DIR/${PRG}_${BASE_ARC_OBJ_LIST}.lck # lock file to ensure only one of these scripts runs at a time

# add - SRC_/TGT_ARC_FIFO_DIR
ARC_FIFO=FIFO
SRC_ARC_FIFO=${ARC_FIFO}_SRC
TGT_ARC_FIFO=${ARC_FIFO}_TGT
ARCMAIN_SCRIPT=$SCRIPT_DIR/arcmain_script
SRC_ARCMAIN_SCRIPT=${ARCMAIN_SCRIPT}_${NOW}_SRC.arc
TGT_ARCMAIN_SCRIPT=${ARCMAIN_SCRIPT}_${NOW}_TGT.arc
#let Teradata decide - was getting errors using high numbers
#SRC_NUM_SESSIONS="SESSIONS=8"
#TGT_NUM_SESSIONS="SESSIONS=4"

# This ensures only one of this processes runs at a time
if [ -f $LOCK_FILE ]
then
echo "ERROR: Already running $PRG"
echo " Protected by the lock file $LOCK_FILE"
exit 1
else
touch $LOCK_FILE
fi

# cleanup legacy fifos if required
cleanup_fifo SRC
cleanup_fifo TGT

# create the SRC and TGT arcmain scripts
create_arcmain_script SRC
create_arcmain_script TGT

# Set up the FIFOs
# arcmain ARCHIVE script writes as the Producer part of the SRC_ARC_FIFO
# which is Consumed by the cat below and written as the Producer to the
# TGT_ARC_FIFO which is Consumed by the arcmain COPY into the Target (TGT) database
mkfifo $SRC_ARC_FIFO
mkfifo $TGT_ARC_FIFO
cat ${SRC_ARC_FIFO} > ${TGT_ARC_FIFO} &

# run the arcmain archive
echo "Starting source arcmain ..."
( arcmain $SRC_NUM_SESSIONS <$SRC_ARCMAIN_SCRIPT; SRC_RC=$?; [ $SRC_RC -ne 0 ] && err_msg "arcmain archive failed [SRC_RC=${SRC_RC}]" ) &
sleep 30

# run the arcmain copy/restore
echo "Starting target arcmain ..."
( arcmain $TGT_NUM_SESSIONS <$TGT_ARCMAIN_SCRIPT; TGT_RC=$?; [ $TGT_RC -ne 0 ] && err_msg "arcmain copy/restore failed [TGT_RC=${TGT_RC}]" ) &

# need to wait till all the background processes have completed
wait

# clean up
rm $LOCK_FILE

# all's well - fin
exit 0

Thursday 2 June 2011

Nested SQL Query syntax for Teradata (Teradata syntax error 3707)

In Oracle, a query like this will work fine:

SELECT num_occurrences, COUNT(*)
FROM (SELECT xxxx, COUNT(*) num_occurrences
      FROM mytable
      GROUP BY xxxx)
GROUP BY num_occurrences


In Teradata (v12 at least), one must alias the nested query like this:


SELECT num_occurrences, COUNT(*)
FROM (SELECT xxxx, COUNT(*) num_occurrences
      FROM mytable
      GROUP BY xxxx) a
GROUP BY num_occurrences

If not, you get a message like this ...

3707: Syntax error, expected something like a name or a Unicode delimited identifier or an 'UDFCALLNAME' keyword between ')' and the 'group' keyword


To answer Christi/Christophe's question in the comment below ...

select count(*) from
(select ctm_job, count(*)
from BCRTEC.x_ctm_job
group by ctm_job
having count(*) > 1);


I think this should just work once aliased.
Try this (note the column alias mycount for count(*) in the inner query, and the subquery alias a) ...

select count(*) from
(select ctm_job, count(*) mycount
 from BCRTEC.x_ctm_job
 group by ctm_job
 having count(*) > 1) a