Tuesday, 19 July 2011

Simple Tara Teradata Netbackup backup shell script - handles multiple Jobs

Here is a simple script to run Tara Teradata Netbackup backups.


In the example below, I am backing up 3 jobs in 2 batches as follows:

  • On its own
    • 2650_FullBackup_6Stream_4week_ret.arc
  • The following two in parallel
    • 2650_FullBackup_2Stream_4week_ret.arc
    • TD_5550_FULLBACKUP_4STR.arc

Run this from the command line like this:
/usr/local/bin/run_tara_job.ksh DEBUG=0 JOB_NAME=2650_FullBackup_6Stream_4week_ret.arc#:2650_FullBackup_2Stream_4week_ret.arc:TD_5550_FULLBACKUP_4STR.arc# >/var/misc/log/run_tara_job.log 2>&1


Or set up a cron job like this:
0 19 19 7 * /usr/local/bin/run_tara_job.ksh DEBUG=0 JOB_NAME=2650_FullBackup_6Stream_4week_ret.arc#:2650_FullBackup_2Stream_4week_ret.arc:TD_5550_FULLBACKUP_4STR.arc# >/var/misc/log/run_tara_job.log 2>&1

Note - you will need the Job Names from the Tara scripts (typically find these in /opt/teradata/tara/server/sr directory) or from the Tara GUI.


#!/bin/ksh -x
# Program: run_tara_job.ksh
# Description: Run Tara backup job.
# Parameters: JOB_NAME=
# Version: 0.1 gml 19-Jul-2011 Initial version


SEP=":"
SUBSEP="#"              # take care not to pick a delimiter that interferes with greps
DEBUG=0
eval $@


BKP_ROOT_DIR=/opt/teradata/tara/server/bin
TARA_BKP_PASSWD_FILE=/opt/teradata/tara/server/bin/tara.pwd


if [ -z "$JOB_NAME" ]
then
  echo "ERROR: No JOB_NAME arg. It is mandatory"
  exit 1
fi


JOB_NAME=`echo $JOB_NAME | tr "$SEP" " "`




cd $BKP_ROOT_DIR


#exit 0


for JOB in `echo $JOB_NAME`
do
  WAIT_FLAG=`echo $JOB | grep -c "\${SUBSEP}"`
  if [ $WAIT_FLAG -gt 0 ]
  then
    JOB=`echo $JOB | tr -d "$SUBSEP"`
    echo $JOB
  fi
  echo "Starting job $JOB at `date`"
  # note -w will wait for tara job to complete fully
  if [ $DEBUG -eq 0 ]
  then
    ./taralaunch -u administrator -e $TARA_BKP_PASSWD_FILE -j $JOB -w &
  else
    echo "./taralaunch -u administrator -e $TARA_BKP_PASSWD_FILE -j $JOB -w" &
  fi
  if [ $WAIT_FLAG -gt 0 ]
  then
    wait
  fi
done

How to extract (CSV) data from Teradata via bteq script wrapped in a Unix shell script

There is a better, far simpler way to extract data from Teradata than my previous blog entry.
Ignore it. Use this one.

The example example below extracts data from a Teradata table or view into a pipe delimited file via bteq in a Unix shell script.

I had a serious hassle eliminating the first two lines of output - i.e. the column heading and the minuses underlining the heading. Initially a colleague Arturo told me to replace the ".export report file ..." row with ".export data file ..." but this caused the output to have hidden control characters in it, like these marked in red below ... I assume for Teradata-to-Teradata data transfers.

^P^@^N^@-1||Null Value
^W^@^U^@125|BUSINESS|Business
^[^@^Y^@149|INDIVIDUAL|Individual



He then found that the TITLE could be removed using the (TITLE '') phrase highlighted in the script in bold red below.



#!/bin/ksh -x
# Program: td_bteq_extract_better.ksh
# Description: To extract data from TD via bteq without any headings or "report formatting"
# Version: 1.0 gml 19-Jul-2011 Initial version from Arturo Gonzalez
# -----------------------------------------------------------------------------------------


LOGON_STR="prodtd/glourei1,xxxxxx;"     # Teradata logon acct
TMP=${HOME}/tmp                         # extract directory
FILENAME=test2                          # output will go in ${FILENAME}2.dat
SEP='|'                                 # pipe separator
DATAFILE=${TMP}/${FILENAME}.dat        # extract file


> $DATAFILE                             # Otherwise if run more than once, it will append records to $DATAFILE


bteq >/dev/null 2>&1 <<EOF
.logon $LOGON_STR
.export report file = $DATAFILE         -- .export report file = $DATAFILE will result in column header and header underlining - 2 unwanted rows
.set recordmode off                     -- not sure what this does - leaving it out seems to make no difference
-- an arbitrary example SQL query
SELECT TRIM(COALESCE(Acct_Type_Id,-99)) || '${SEP}' ||   -- Without the TRIM, the numerics will be right justified and have leading blanks
       TRIM(COALESCE(Acct_Type_Cd,'??')) || '${SEP}' ||
       TRIM(COALESCE(Acct_Type_Name,'??')) (TITLE '')
FROM nuc_user_view.acct_type
ORDER BY Acct_Type_id;
.export reset
.logoff
.quit
EOF

Friday, 15 July 2011

Deprecated entry

This entry is deprecated.


See my next blog entry for a much simpler way to extract (CSV) data from Teradata via bteq script wrapped in a Unix shell script.


This first attempt was built in frustration to eliminate the first two lines of output - i.e. the column heading and the minuses underlining the heading (very mainframe-ish) ... A colleague Arturo told me to replace the ".export report file ..." row with ".export data file ..." but this caused the output to have hidden control characters in it, like these marked in red below ... I assume for Teradata-to-Teradata data transfers.

^P^@^N^@-1||Null Value
^W^@^U^@125|BUSINESS|Business
^[^@^Y^@149|INDIVIDUAL|Individual


This example does show the FIFO technique which can be a useful, if elaborate, way to solve problems. 


#!/bin/ksh -x
# Program: td_bteq_extract.ksh
# Description: To extract data from TD via bteq without any headings or "report formatting"
# Version: 1.0 gml 15-Jul-2011 Initial version with help from A. Gonzalez
# -----------------------------------------------------------------------------------------

LOGON_STR="prodtd/username,passwd;"   # Teradata logon acct
TMP=${HOME}/tmp                         # extract directory
FILENAME=test                           # output will go in ${FILENAME}.dat via the FIFO
SEP='|'                                 # pipe separator
FIFO=${TMP}/${FILENAME}.fifo
DATAFILE=${TMP}/${FILENAME}.dat         # extract file

> $DATAFILE                             # Otherwise if run more than once, it will append records to $DATAFILE

mkfifo $FIFO                            # create FIFO
cat $FIFO | egrep -v '^ZZZZZZZZZZ|^---------' > $DATAFILE &
bteq >/dev/null 2>&1 <<EOF
.logon $LOGON_STR
.export report file = $FIFO             -- .export report file = $DATAFILE will result in column header and header underlining - 2 unwanted rows
.set recordmode off                     -- not sure what this does - leaving it out seems to make no difference

-- an arbitrary example SQL query
SELECT TRIM(COALESCE(Acct_Type_Id,-99)) || '${SEP}' ||   -- Without the TRIM, the numerics will be right justified and have leading blanks
       TRIM(COALESCE(Acct_Type_Cd,'??')) || '${SEP}' ||
       TRIM(COALESCE(Acct_Type_Name,'??')) AS "ZZZZZZZZZZ"
FROM nuc_user_view.acct_type
ORDER BY Acct_Type_id;
.export reset
.logoff
.quit
EOF

# cleanup 
rm $FIFO

Teradata Tutorial - Useful Link (note to self)

For useful syntax and notes re Teradata, look here
 http://readvitamin.com/teradata/


To remove leading '0's in Teradata 
select trim(leading '0' from col) from table_A;


Add one day to date field

select cast('28/02/2012' as date format 'dd/mm/yyyy' ) + INTERVAL '1' day;

Monday, 4 July 2011

Performance comparisons between Teradata Enterprise Class system and DW Appliance

See this old post on the Teradata forums.



Why can't Teradata run their benchmark suite to determine a TPerf system performance rating  for their DW Appliances as they have done for their DW Appliances since the 1990s? I haven't had a convincing reason for not doing this.


Also - why is it not possible to run WorkLoad Management on their DW appliances?


Does anyone have a good set of benchmark tests for Teradata databases with data?
Has anyone measured the relative performance of the enterprise class systems against the DW appliances?


Let me know if you have experience in this area. If/when I get round to doing something in this area, I will publish the results. 

Question: Anyone built UDFs to handle regexp functions

Was talking to Teradata CTO, Stephen Brobst this week (21/11/11) and he confirmed these functions will be available in Teradata v. 14 ...
Just wonder how long we'll take to upgrade to this version ... 
Teradata should backport these to v.12 and v.13 for slow-to-upgrade organisations


Original post ...


Just wondering if anyone out there has built Teradata UDFs to support Oracle-like regex functions to perform regular expression string manipulations.


Oracle has 4 very useful functions: 

  • regexp_like - to match regular expression patterns (used in the WHERE clause as per LIKE)
  • regexp_replace - to replace the (nth occurring from position i - default is 1st from 1st) matching regular expression - this is sed-like in functionality with a similar \1..\9 replacement substitution mechanism
  • regexp_substr - to return the (nth occurring from position i - default is 1st from 1st) matching regular expression pattern in the source (e.g. column)
  • regexp_instr - to determine the position of the (nth occurring starting from the position i - default is 1st from 1st) matching regular expression in the source (e.g. column)
Also - can anyone point me to a good guide on building UDFs? And can these be developed and tested on Teradata Express?