NAME

rwset - Generate binary IPset files of unique IP addresses

SYNOPSIS

  rwset {--sip-file=FILE | --dip-file=FILE
         | --nhip-file=FILE | --any-file=FILE [...]}
        [--record-version=VERSION] [--invocation-strip]
        [--note-strip] [--note-add=TEXT] [--note-file-add=FILE]
        [--print-filenames] [--copy-input=PATH]
        [--compression-method=COMP_METHOD]
        [--ipv6-policy={ignore,asv4,mix,force,only}]
        [--site-config-file=FILENAME]
        {[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}

  rwset --help

  rwset --version

DESCRIPTION

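rwset reads SiLK Flow records and generates one to four binary IPset files. In a single pass, rwset can create one of each type of its possible outputs, which are IPset files containing the unique source IP addresses, the unique destination IP addresses, the unique next-hop IP addresses, or the unique source and destination IP addresses.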

The output files must not exist prior to invoking rwset. To write an IPset file to the standard output, specify stdout or - as the output file name. rwset will complain if you attempt to write the IPset to the standard output and standard output is connected to the terminal. Only one IPset file may be written to the standard output.
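As a minimal sketch of the single-pass behavior described above, the shell function below writes the source, destination, and combined IPsets for one input file using only the output switches listed in the SYNOPSIS. The function name flow_ipsets and the output file suffixes are hypothetical, and the sketch assumes that none of the output files exist yet.

```shell
# Hypothetical wrapper: build three IPsets from one pass over a flow file.
# $1 is the input SiLK Flow file; $2 is a prefix for the output file names.
flow_ipsets() {
    rwset --sip-file="$2.sip.set" \
          --dip-file="$2.dip.set" \
          --any-file="$2.any.set" \
          "$1"
}
```

For example, flow_ipsets flows.rw march01 would ask rwset to create march01.sip.set, march01.dip.set, and march01.any.set while reading flows.rw only once.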

rwset reads SiLK Flow records from the files named on the command line or from the standard input when no file names are specified and --xargs is not present. To read the standard input in addition to the named files, use - or stdin as a file name. If an input file name ends in .gz, the file is uncompressed as it is read. When the --xargs switch is provided, rwset reads the names of the files to process from the named text file or from the standard input if no file name argument is provided to the switch. The input to --xargs must contain one file name per line.

IPset files are in a binary format that efficiently stores a set of IP addresses. The file only stores the presence of an IP address; no volume information (such as a count of the number of times the IP address occurs) is maintained. To store volume information, use rwbag(1).

Use rwsetcat(1) to see the IP addresses in a binary IPset file. To create a binary IPset file from a list of IP addresses, use rwsetbuild(1). rwsettool(1) allows you to perform set operations on binary IPset files. To determine if an IP address is a member of a binary IPset, use rwsetmember(1).

To list the IPs that appear in the SiLK Flow file flows.rw, the command

 $ rwset --sip-file=stdout flows.rw | rwsetcat

is faster than rwuniq(1), but rwset does not report the number of flow records or compute byte and packet counts.

OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.

At least one of the following output switches is required; multiple output switches can be given, but an output switch cannot be repeated.

--sip-file=FILE

Store the unique source IP addresses in the binary IPset file FILE. rwset will write the IPset file to the standard output when FILE is stdout or - and the standard output is not a terminal.

--dip-file=FILE

Store the unique destination IP addresses in the binary IPset file FILE. rwset will write the IPset file to the standard output when FILE is stdout or - and the standard output is not a terminal.

--nhip-file=FILE

Store the unique next-hop IP addresses in the binary IPset file FILE. rwset will write the IPset file to the standard output when FILE is stdout or - and the standard output is not a terminal.

--any-file=FILE

Store the unique source and destination IP addresses in the binary IPset file FILE. rwset will write the IPset file to the standard output when FILE is stdout or - and the standard output is not a terminal.

Only one of the above switches may use stdout as the name of the file.

rwset supports these additional switches:

--record-version=VERSION

Specify the format of the IPset records that are written to the output. VERSION may be 2, 3, 4, 5 or the special value 0. When the switch is not provided, the SILK_IPSET_RECORD_VERSION environment variable is checked for a version. The default version is 0.

0

Use the default version for an IPv4 IPset and an IPv6 IPset. Use the --help switch to see the versions used for your SiLK installation.

2

Create a file that may hold only IPv4 addresses and is readable by all versions of SiLK.

3

Create a file that may hold IPv4 or IPv6 addresses and is readable by SiLK 3.0 and later.

4

Create a file that may hold IPv4 or IPv6 addresses and is readable by SiLK 3.7 and later. These files are more compact than version 3 and often more compact than version 2.

5

Create a file that may hold only IPv6 addresses and is readable by SiLK 3.14 and later. When this version is specified, IPsets containing only IPv4 addresses are written in version 4. These files are usually more compact than version 4.
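The two ways of selecting a record version can be sketched as follows. Both function names are hypothetical, and version 4 is chosen only as an illustration; the first function passes the switch, while the second sets SILK_IPSET_RECORD_VERSION for a single invocation.

```shell
# Hypothetical wrapper: request record version 4 with the switch.
# $1 is the input flow file; $2 is the output IPset file.
compact_sip_set() {
    rwset --record-version=4 --sip-file="$2" "$1"
}

# Hypothetical alternative: supply the version through the environment
# for this one command instead of using --record-version.
compact_sip_set_env() {
    SILK_IPSET_RECORD_VERSION=4 rwset --sip-file="$2" "$1"
}
```

Because the switch takes precedence over the environment variable, the first form is the safer choice in scripts that may inherit an unknown environment.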

--invocation-strip

Do not record any command line history: do not copy the invocation history from the input files to the output file, and do not record the current command line invocation in the output. The invocation may be viewed with rwfileinfo(1).

--note-strip

Do not copy the notes (annotations) from the input files to the output file. Normally notes from the input files are copied to the output.

--note-add=TEXT

Add the specified TEXT to the header of every output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.
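A small sketch of adding an annotation and then inspecting it: the function name annotated_sip_set is hypothetical, and rwfileinfo is run with no switches so that it prints whatever header information it shows by default.

```shell
# Hypothetical wrapper: create a source-IP set with one annotation, then
# display the header of the new file.
# $1 is the input flow file, $2 the output IPset file, $3 the note text.
annotated_sip_set() {
    rwset --note-add="$3" --sip-file="$2" "$1" && rwfileinfo "$2"
}
```

For example: annotated_sip_set flows.rw sip.set "source IPs seen on the border sensor".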

--note-file-add=FILENAME

Open FILENAME and add the contents of that file to the header of every output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.

--print-filenames

Print to the standard error the names of input files as they are opened.

--copy-input=PATH

Copy all binary SiLK Flow records read as input to the specified file or named pipe. PATH may be stdout or - to write flows to the standard output as long as no IPset file is being written there.
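As a sketch of how --copy-input can keep a pipeline going, the hypothetical function below writes the source IPset to a file and forwards the unmodified flow records on the standard output. It assumes that the IPset itself is not being written to the standard output.

```shell
# Hypothetical wrapper: save the source IPs to a file and pass the flow
# records through unchanged on the standard output.
# $1 is the input flow file; $2 is the output IPset file.
sip_set_and_forward() {
    rwset --sip-file="$2" --copy-input=stdout "$1"
}
```

The output of sip_set_and_forward can then be piped into another SiLK tool that reads flow records from its standard input, such as rwuniq(1).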

--ipv6-policy=POLICY

Determine how IPv4 and IPv6 flows are handled when SiLK has been compiled with IPv6 support. When the switch is not provided, the SILK_IPV6_POLICY environment variable is checked for a policy. If it is also unset or contains an invalid policy, the POLICY is mix. When SiLK has not been compiled with IPv6 support, IPv6 flows are always ignored, regardless of the value passed to this switch or in the SILK_IPV6_POLICY variable. The supported values for POLICY are:

ignore

Ignore any flow record marked as IPv6, regardless of the IP addresses it contains. Only IP addresses contained in IPv4 flow records will be added to the IPset(s).

asv4

Convert IPv6 flow records that contain addresses in the ::ffff:0:0/96 netblock (that is, IPv4-mapped IPv6 addresses) to IPv4 and ignore all other IPv6 flow records.

mix

Process the input as a mixture of IPv4 and IPv6 flow records. When the input contains IPv6 addresses outside of the ::ffff:0:0/96 netblock, this policy is equivalent to force; otherwise it is equivalent to asv4.

force

Convert IPv4 flow records to IPv6, mapping the IPv4 addresses into the ::ffff:0:0/96 netblock.

only

Process only flow records that are marked as IPv6. Only IP addresses contained in IPv6 flow records will be added to the IPset(s).

Regardless of the IPv6 policy, when all IPv6 addresses in the IPset are in the ::ffff:0:0/96 netblock, rwset treats them as IPv4 addresses and writes an IPv4 IPset. When any other IPv6 addresses are present in the IPset, the IPv4 addresses in the IPset are mapped into the ::ffff:0:0/96 netblock and rwset writes an IPv6 IPset.
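A minimal sketch of selecting a policy on the command line: the hypothetical function below uses asv4, so only IPv4 flow records and IPv6 flow records with IPv4-mapped addresses contribute to the set. Any other value from the list above could be substituted.

```shell
# Hypothetical wrapper: build a source-IP set under the asv4 policy.
# $1 is the input flow file; $2 is the output IPset file.
ipv4_sip_set() {
    rwset --ipv6-policy=asv4 --sip-file="$2" "$1"
}
```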

--compression-method=COMP_METHOD

Specify the compression library to use when writing output files. If this switch is not given, the value in the SILK_COMPRESSION_METHOD environment variable is used if the value names an available compression method. When no compression method is specified, output to the standard output or to named pipes is not compressed, and output to files is compressed using the default chosen when SiLK was compiled. The valid values for COMP_METHOD are determined by which external libraries were found when SiLK was compiled. To see the available compression methods and the default method, use the --help or --version switch. SiLK can support the following COMP_METHOD values when the required libraries are available.

none

Do not compress the output using an external library.

zlib

Use the zlib(3) library for compressing the output, and always compress the output regardless of the destination. Using zlib produces the smallest output files at the cost of speed.

lzo1x

Use the lzo1x algorithm from the LZO real time compression library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead.

snappy

Use the snappy library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead. Since SiLK 3.13.0.

best

Use lzo1x if available, otherwise use snappy if available, otherwise use zlib if available. Only compress the output when writing to a file.

--site-config-file=FILENAME

Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, rwset searches for the site configuration file in the locations specified in the "FILES" section.

--xargs
--xargs=FILENAME

Read the names of the input files from FILENAME or from the standard input if FILENAME is not provided. The input is expected to have one filename per line. rwset opens each named file in turn and reads records from it as if the filenames had been listed on the command line.
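One way to feed --xargs is sketched below. The function name dir_sip_set and the *.rw naming pattern are assumptions made for illustration; find prints one path per line, which matches the one-name-per-line input that --xargs expects.

```shell
# Hypothetical wrapper: collect the source IPs from every *.rw file under
# a directory. --xargs is given without "=FILENAME", so rwset reads the
# file names from its standard input.
# $1 is the directory to search; $2 is the output IPset file.
# File names containing newlines are not handled by this sketch.
dir_sip_set() {
    find "$1" -type f -name '*.rw' | rwset --xargs --sip-file="$2"
}
```

Writing the list to a text file first and passing it as --xargs=FILENAME is equivalent and makes it easier to review which files were processed.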

--help

Print the available options and exit.

--version

Print the version number and information about how SiLK was configured, then exit the application.

EXAMPLES

In the following examples, the dollar sign ($) represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the backslash (\) is used to indicate a wrapped line.

rwset is intended to work tightly with rwfilter(1). For example, consider generating two IPsets: the first file, low_packet_tcp.set, contains the source IP addresses for incoming flow records (that is, the external hosts) where the record has no more than three packets in its sessions. The second IPset file, high_packet_tcp.set, contains the external IPs for records with four or more packets.

The first set, for TCP traffic on 2003/03/01, can be generated with:

 $ rwfilter --start-date=2003/03/01:00 --end-date=2003/03/01:23     \
        --proto=6 --packets=1-3 --pass=stdout                       \
   | rwset --sip-file=low_packet_tcp.set

The second set can be generated with:

 $ rwfilter --start-date=2003/03/01:00 --end-date=2003/03/01:23    \
        --proto=6 --packets=4- --pass=stdout                       \
   | rwset --sip-file=high_packet_tcp.set

ENVIRONMENT

SILK_IPSET_RECORD_VERSION

This environment variable is used as the value for --record-version when that switch is not provided. Since SiLK 3.7.0.

SILK_IPV6_POLICY

This environment variable is used as the value for --ipv6-policy when that switch is not provided.

SILK_CLOBBER

The SiLK tools normally refuse to overwrite existing files. Setting SILK_CLOBBER to a non-empty value removes this restriction.
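A sketch of overwriting an existing IPset for a single command: the function name rebuild_sip_set is hypothetical, and the value 1 is arbitrary since any non-empty value is described as sufficient.

```shell
# Hypothetical wrapper: regenerate an IPset even when the output file
# already exists, by setting SILK_CLOBBER for this one invocation only.
# $1 is the input flow file; $2 is the output IPset file.
rebuild_sip_set() {
    SILK_CLOBBER=1 rwset --sip-file="$2" "$1"
}
```

Setting the variable per command, rather than exporting it for the whole session, limits the risk of overwriting other files by accident.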

SILK_COMPRESSION_METHOD

This environment variable is used as the value for --compression-method when that switch is not provided. Since SiLK 3.13.0.

SILK_CONFIG_FILE

This environment variable is used as the value for --site-config-file when that switch is not provided.

SILK_DATA_ROOTDIR

This environment variable specifies the root directory of the data repository. As described in the "FILES" section, rwset may use this environment variable when searching for the SiLK site configuration file.

SILK_PATH

This environment variable gives the root of the install tree. When searching for configuration files, rwset may use this environment variable. See the "FILES" section for details.

FILES

${SILK_CONFIG_FILE}
${SILK_DATA_ROOTDIR}/silk.conf
/data/silk.conf
${SILK_PATH}/share/silk/silk.conf
${SILK_PATH}/share/silk.conf
/usr/share/silk/silk.conf
/usr/share/silk.conf

Possible locations for the SiLK site configuration file which are checked when the --site-config-file switch is not provided.
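To see which of these locations exists on a given system, a helper along the following lines may be useful. The function name find_silk_conf is hypothetical, and the sketch assumes that the locations are checked in the order listed above, which this page does not state explicitly; it is not a description of rwset's own lookup code.

```shell
# Hypothetical helper: print the first existing candidate for the site
# configuration file, trying the locations in the order listed in FILES.
# Candidates that depend on an unset or empty variable are skipped.
find_silk_conf() {
    for f in \
        "${SILK_CONFIG_FILE:-}" \
        "${SILK_DATA_ROOTDIR:+$SILK_DATA_ROOTDIR/silk.conf}" \
        /data/silk.conf \
        "${SILK_PATH:+$SILK_PATH/share/silk/silk.conf}" \
        "${SILK_PATH:+$SILK_PATH/share/silk.conf}" \
        /usr/share/silk/silk.conf \
        /usr/share/silk.conf
    do
        if [ -n "$f" ] && [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}
```

The function returns a non-zero status when none of the candidates exists, so it can be used directly in a shell conditional.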

SEE ALSO

rwsetbuild(1), rwsetcat(1), rwsettool(1), rwsetmember(1), rwfilter(1), rwfileinfo(1), rwbag(1), rwuniq(1), silk(7), zlib(3)

NOTES

Prior to SiLK 3.0, an IPset file could not contain IPv6 addresses and the record version was 2. The --record-version switch was added in SiLK 3.0 and its default was 3. In SiLK 3.6, an argument of 0 was allowed and made the default. Version 4 was added in SiLK 3.7 as was support for the SILK_IPSET_RECORD_VERSION environment variable. Version 5 was added in SiLK 3.14.