Jerome Rajan

DataStage Orchadmin Reference Guide

February 25, 2018

export DSHOME=$(cat /.dshome)
. $DSHOME/dsenv

export LD_LIBRARY_PATH=$APT_ORCHHOME/lib
export APT_CONFIG_FILE=$DSHOME/../Configurations/default.apt
export PATH=$DSHOME/bin:$APT_ORCHHOME/bin:$PATH
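
Before running orchadmin, it can help to confirm that the variables exported above actually resolved. The following is a minimal sketch and not part of the original setup: the function name check_orch_env is hypothetical, and it assumes that DSHOME, APT_ORCHHOME and APT_CONFIG_FILE are the only values worth checking.

```shell
# Hypothetical helper (not part of DataStage): report which of the
# variables exported above are unset, and whether the configuration
# file they point to is readable.
check_orch_env() {
    rc=0
    for var in DSHOME APT_ORCHHOME APT_CONFIG_FILE; do
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "missing: $var is not set" >&2
            rc=1
        fi
    done
    # orchadmin reads the parallel configuration from this file.
    if [ -n "${APT_CONFIG_FILE:-}" ] && [ ! -r "$APT_CONFIG_FILE" ]; then
        echo "unreadable: $APT_CONFIG_FILE" >&2
        rc=1
    fi
    return "$rc"
}
```

Calling check_orch_env right after the export lines returns a non-zero status when something is missing, so it can guard a script with "check_orch_env || exit 1".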

NAME
orchadmin - delete, copy, describe and dump ORCHESTRATE files

SYNOPSIS
orchadmin command [ -options… ] descriptor-files…

orchadmin command [-help] # prints help message for one command

orchadmin [-help] # prints help message for all commands

orchadmin -f command-file # executes commands from specified file

orchadmin - # executes commands from standard input

DESCRIPTION
orchadmin executes commands which delete, copy, and describe
ORCHESTRATE files. These commands may be given on the command
line or read from a file or the standard input.

command            delete, copy, truncate, dump, describe, diskinfo
                   or check.

-f command-file    Path of a file containing orchadmin commands.
                   The file may have multiple commands separated
                   by semicolons. A command may be spread over
                   multiple lines. C and C++ style comments and
                   csh style quotation marks are allowed.

-                  Read commands from the standard input as if it
                   were a command file.

-help | -h         Write usage information to the standard output.
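
As an illustration of the command-file form, here is a minimal sketch that writes such a file. The function name write_orch_commands and the dataset name staging.ds are hypothetical, and the exact file syntax is inferred from the description above (semicolons between commands, C and C++ style comments, commands spread over several lines), so treat it as an assumption rather than verified behaviour.

```shell
# Hypothetical helper: write a small orchadmin command file to the
# path given as the first argument. staging.ds is a made-up dataset.
write_orch_commands() {
    cat > "$1" <<'EOF'
/* inspect, then empty, a staging dataset */
describe -s staging.ds;
truncate staging.ds;
// a command may be spread over several lines
dump -n 5
     -name staging.ds
EOF
}

# Usage, assuming orchadmin is on PATH:
#   write_orch_commands /tmp/cleanup.cmds
#   orchadmin -f /tmp/cleanup.cmds      # read commands from the file
#   orchadmin - < /tmp/cleanup.cmds     # read the same commands from stdin
```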

In addition there are the following NLS related options:

-input_charset map-name     Specifies the encoding of option values.
-output_charset map-name    Specifies the encoding of orchadmin output.
-os_charset map-name        Specifies the encoding of data passed to or
                            received from the operating system via
                            “char *”.
-escaped                    Allows command line characters to be presented
                            in a two-byte Unicode hex format.

COMMAND: copy | cp source-descriptor-file target-descriptor-file

Copy the schema, contents and preserve-partitioning flag of the
specified ORCHESTRATE dataset. If the preserve-partitioning
flag is set, the copy will have the same number of partitions and
record order as the original. If the target file already exists,
it will be truncated first. If the preserve-partitioning flag of
the source file is set and the target file already exists, it must
have the same number of partitions as the source file.

The copy command has no options. A warning message is issued if
the target does not already exist. This is a bug, not a feature.
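
Because copy truncates a target that already exists, a cautious wrapper can refuse to run in that case. This is a minimal sketch only: safe_orch_copy is a hypothetical name, and it assumes that testing the descriptor paths with the shell is enough to decide whether a dataset exists.

```shell
# Hypothetical wrapper: copy a dataset only when the target descriptor
# does not exist yet, because "copy" truncates an existing target.
safe_orch_copy() {
    src=$1
    dst=$2
    if [ ! -e "$src" ]; then
        echo "source descriptor not found: $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "refusing to overwrite existing target: $dst" >&2
        return 1
    fi
    orchadmin cp "$src" "$dst"
}

# Usage: safe_orch_copy big.ds big_backup.ds
```

With this wrapper the target never exists beforehand, so the warning described above is expected on every successful run.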

COMMAND: delete | del | rm [ -options… ] descriptor-files…

Delete the specified descriptor files and all of their data files.

OPTIONS:
-f    Force. Proceed even if some partitions of the
      dataset are on nodes that are inaccessible from the
      current configuration file. This will leave orphan
      data files on those nodes. They must be deleted by
      some other means.

-x    Use the system config file rather than the one
      stored in the dataset.

EXAMPLE:
Delete all datasets in the current directory that end in “.ds”.

orchadmin rm *.ds
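
Note that the shell, not orchadmin, expands *.ds. In a POSIX shell an unmatched pattern is passed through literally, so the command above would be asked to delete a file named "*.ds". A minimal sketch of a guard follows; the function name delete_all_datasets is hypothetical.

```shell
# Hypothetical helper: delete every *.ds descriptor in a directory
# (default: the current one), skipping the call when nothing matches.
delete_all_datasets() {
    dir=${1:-.}
    set -- "$dir"/*.ds
    # An unmatched glob is left as the literal pattern in POSIX shells.
    if [ ! -e "$1" ]; then
        echo "no .ds descriptors in $dir" >&2
        return 0
    fi
    orchadmin rm "$@"
}

# Usage: delete_all_datasets /data/project/datasets
```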

COMMAND: truncate [ -options… ] descriptor-files…

Remove data from the specified datasets.

OPTIONS:
-f            Force. Proceed even if some partitions of the
              dataset are on nodes that are inaccessible from the
              current configuration file. This will leave orphan
              data files on those nodes. They must be truncated by
              some other means.

-x            Use the system config file rather than the one
              stored in the dataset.

-n segment    Leave this many segments. The default is 0.

EXAMPLE:
Truncate big.ds to 10 segments:

orchadmin truncate -n 10 big.ds

Remove all data from small.ds:

orchadmin truncate small.ds

COMMAND: dump [ -options… ] descriptor-files…

Dump the specified ORCHESTRATE parallel files as text to the
standard output. If no options are specified, all records are
dumped in order from the first record of the first partition to
the last record of the last partition. Each field value is
followed by a space, and each record is followed by a newline.
Specific top-level fields may be dumped with the -field option.

OPTIONS:
-field name    Dump the specified top-level field. The default is
               to dump all fields. This option can occur multiple
               times. Each occurrence adds to the list of fields.

-name          Precede each value by its field name and a colon.

-n numrec      Limit the number of records dumped per partition.
               The default is not to limit.

-part N        Dump only the specified partition. The default is
               to dump all partitions.

-p period      Dump every Nth record in a partition, where N is
               the period, starting with the first record not
               skipped (see -skip). The period must be greater
               than 0. The default is 1.

-skip N        Skip the first N records in each partition. The
               default is 0.

-x             Use the system config file rather than the one
               stored in the dataset.

If an option occurs multiple times, the last one takes effect.
The -field option is an exception: each occurrence adds to the
list of fields to be dumped.

EXAMPLES:
Dump all records of all partitions of a parallel file named
small.ds. Precede each value by its field name and a colon.

orchadmin dump -name small.ds

Dump the value of the customer field of the first 99 records
of partition 0 of big.ds.

orchadmin dump -part 0 -n 99 -field customer big.ds
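
The -skip, -p and -n options can be combined to pull a thinned sample from a large partition. The wrapper below is a minimal sketch: dump_sample is a hypothetical name, and it only uses the options documented above. How -n interacts with -p (a limit on records dumped, not on records read) is my reading of the option text, not something verified here.

```shell
# Hypothetical wrapper: print a thinned sample of one partition.
# Arguments: dataset, partition, records to skip, period, record limit.
dump_sample() {
    # The period must be greater than 0, as stated for the -p option.
    if [ "$4" -le 0 ]; then
        echo "period must be greater than 0" >&2
        return 1
    fi
    orchadmin dump -name -part "$2" -skip "$3" -p "$4" -n "$5" "$1"
}

# Usage: skip 100 records of partition 0 of big.ds, then take every
# 10th record, dumping at most 20 records.
#   dump_sample big.ds 0 100 10 20
```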

COMMAND: describe | lp | lf | ls | ll [ -options… ] descriptor-files…

Print a report about each of the specified parallel files.
lp = describe -p; lf = describe -f; ls = describe -s; ll = describe -l

OPTIONS:
-p    List partitioning information (except for datafile info).

-c    Print the stored config file, if any.

-f    List the data files.

-s    Print the schema.

-x    Use the system config file rather than the one
      stored in the dataset.

-e    Describe segments individually.

-v    Describe all segments, valid or otherwise.

-d    Print numbers exactly, not in pretty form.

-l    Means -p -f -s -e -v -c.

EXAMPLE:
List the partitioning info, data files and schema of file1 and file2.

orchadmin ll file1 file2
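
The short forms are convenient in scripts. As a minimal sketch, the hypothetical function export_schemas below saves the schema of each dataset to a text file next to its descriptor; the .schema.txt naming is an assumption made for this example.

```shell
# Hypothetical helper: write each dataset's schema to <name>.schema.txt.
export_schemas() {
    for ds in "$@"; do
        # "ls" is the documented shorthand for "describe -s".
        orchadmin ls "$ds" > "${ds%.ds}.schema.txt" || return 1
    done
}

# Usage: export_schemas customers.ds orders.ds
```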

COMMAND: diskinfo [ -a | -np nodepool | -n node… ] diskpool

Print a report about the specified disk pool.

OPTIONS:
-a     Print information for all nodes.

-np    Print information for just the specified node pool.

-n     Print information for the specified nodes.

-q     Print summary of information only.

If no options are supplied, the default node pool is used.

EXAMPLE:
Describe disk pool pool1 in node pool bignodes.

orchadmin diskinfo -np bignodes pool1

COMMAND: check

Check the configuration file for any problems. This command has
no options.
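
A possible use of check is as a pre-flight step before other commands. The sketch below is built on an assumption that this guide does not confirm: that orchadmin check exits with a non-zero status when it finds a problem. The function name run_if_config_ok is hypothetical.

```shell
# Hypothetical wrapper: run an orchadmin command only if the
# configuration check passes.
# ASSUMPTION: "orchadmin check" returns non-zero on problems.
run_if_config_ok() {
    if ! orchadmin check; then
        echo "configuration check failed: ${APT_CONFIG_FILE:-unset}" >&2
        return 1
    fi
    orchadmin "$@"
}

# Usage: run_if_config_ok ll file1 file2
```

If check turns out to report problems only in its output, the test would need to inspect that text instead of the exit status.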
