NAME

rwcat - Concatenate SiLK Flow files into single stream

SYNOPSIS

  rwcat [--output-path=PATH] [--note-add=TEXT] [--note-file-add=FILE]
        [--print-filenames] [--byte-order={big | little | native}]
        [--ipv4-output] [--compression-method=COMP_METHOD]
        [--site-config-file=FILENAME]
        {[--xargs] | [--xargs=FILENAME] | [FILE [FILE...]]}

  rwcat --help

  rwcat --version

DESCRIPTION

rwcat reads SiLK Flow records and writes the records in the standard binary SiLK format to the specified output-path; rwcat writes the records to the standard output when stdout is not the terminal and --output-path is not provided.

rwcat reads SiLK Flow records from the files named on the command line or from the standard input when no file names are specified and --xargs is not present. To read the standard input in addition to the named files, use - or stdin as a file name. If an input file name ends in .gz, the file is uncompressed as it is read. When the --xargs switch is provided, rwcat reads the names of the files to process from the named text file or from the standard input if no file name argument is provided to the switch. The input to --xargs must contain one file name per line.

rwcat does not copy the invocation history and annotations (notes) from the header(s) of the source file(s) to the destination file. The --note-add or --note-file-add switch may be used to add a new annotation to the destination file.

OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.

--output-path=PATH

Write the binary SiLK Flow records to PATH, where PATH is a filename, a named pipe, the keyword stderr to write the output to the standard error, or the keyword stdout or - to write the output to the standard output. If PATH names an existing file, rwcat exits with an error unless the SILK_CLOBBER environment variable is set, in which case PATH is overwritten. When PATH ends in .gz, the output is compressed using the library associated with gzip(1). If this switch is not given, the output is written to the standard output. Attempting to write the binary output to a terminal causes rwcat to exit with an error.

--note-add=TEXT

Add the specified TEXT to the header of the output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.

--note-file-add=FILENAME

Open FILENAME and add the contents of that file to the header of the output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.

--byte-order=ENDIAN

Set the byte order for the output SiLK Flow records. The argument is one of the following:

native

Use the byte order of the machine where rwcat is running. This is the default.

big

Use network byte order (big endian) for the output.

little

Write the output in little endian format.

--ipv4-output

Force the output to contain only IPv4 flow records. When this switch is specified, IPv6 flow records that contain addresses in the ::ffff:0:0/96 prefix are converted to IPv4 and written to the output, and all other IPv6 records are ignored. When SiLK has not been compiled with IPv6 support, rwcat acts as if this switch were always in effect.

--compression-method=COMP_METHOD

Specify the compression library to use when writing output files. If this switch is not given, the value in the SILK_COMPRESSION_METHOD environment variable is used if the value names an available compression method. When no compression method is specified, output to the standard output or to named pipes is not compressed, and output to files is compressed using the default chosen when SiLK was compiled. The valid values for COMP_METHOD are determined by which external libraries were found when SiLK was compiled. To see the available compression methods and the default method, use the --help or --version switch. SiLK can support the following COMP_METHOD values when the required libraries are available.

none

Do not compress the output using an external library.

zlib

Use the zlib(3) library for compressing the output, and always compress the output regardless of the destination. Using zlib produces the smallest output files at the cost of speed.

lzo1x

Use the lzo1x algorithm from the LZO real time compression library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead.

snappy

Use the snappy library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead. Since SiLK 3.13.0.

best

Use lzo1x if available, otherwise use snappy if available, otherwise use zlib if available. Only compress the output when writing to a file.

Print the names of input files and the number of records each file contains as the files are read.

--site-config-file=FILENAME

Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, rwcat searches for the site configuration file in the locations specified in the "FILES" section.

--xargs
--xargs=FILENAME

Read the names of the input files from FILENAME or from the standard input if FILENAME is not provided. The input is expected to have one filename per line. rwcat opens each named file in turn and reads records from it as if the filenames had been listed on the command line.

--help

Print the available options and exit.

--version

Print the version number and information about how SiLK was configured, then exit the application.

EXAMPLES

In the following examples, the dollar sign ($) represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the back slash (\) is used to indicate a wrapped line.

To combine the results of several rwfilter(1) runs---stored in the files run1.rw, run2.rw, ... runN.rw---together to create the file combined.rw, you can use:

 $ rwcat --output=combined.rw  *.rw

If the shell complains about too many arguments, you can use the UNIX find(1) function and pipe its output to rwcat:

 $ find . -name '*.rw' -print                   \
   | rwcat --xargs --output=combined.rw

ENVIRONMENT

SILK_CLOBBER

The SiLK tools normally refuse to overwrite existing files. Setting SILK_CLOBBER to a non-empty value removes this restriction.

SILK_COMPRESSION_METHOD

This environment variable is used as the value for --compression-method when that switch is not provided. Since SiLK 3.13.0.

SILK_CONFIG_FILE

This environment variable is used as the value for the --site-config-file when that switch is not provided.

SILK_DATA_ROOTDIR

This environment variable specifies the root directory of data repository. As described in the "FILES" section, rwcat may use this environment variable when searching for the SiLK site configuration file.

SILK_PATH

This environment variable gives the root of the install tree. When searching for configuration files, rwcat may use this environment variable. See the "FILES" section for details.

FILES

${SILK_CONFIG_FILE}
${SILK_DATA_ROOTDIR}/silk.conf
/data/silk.conf
${SILK_PATH}/share/silk/silk.conf
${SILK_PATH}/share/silk.conf
/usr/share/silk/silk.conf
/usr/share/silk.conf

Possible locations for the SiLK site configuration file which are checked when the --site-config-file switch is not provided.

SEE ALSO

rwfilter(1), rwfileinfo(1), silk(7), gzip(1), find(1), zlib(3)

BUGS

Although rwcat will read from the standard input, this feature should be used with caution. rwcat will treat the standard input as a single file, as it has no way to know when one file ends and the next begins. The following will not work:

 $ cat run1.rw run2.rw | rwcat --output=combined.rw     # WRONG!

The header of run2.rw will be treated as data of run1.rw, resulting in corrupt output.