

NAME

rwflowpack - Collects flow data and stores it in binary SiLK Flow files


SYNOPSIS

  rwflowpack --sensor-configuration=FILE_PATH [--packing-logic=PLUGIN]
        { --log-destination=DESTINATION
          | --log-directory=DIR_PATH [--log-basename=BASENAME]
          | --log-pathname=FILE_PATH }
        [--byte-order=ENDIAN] [--compression-method=COMP_METHOD]
        [--pack-interfaces] [--no-file-locking]
        [--flush-timeout=VAL] [--file-cache-size=VAL]
        [--site-config-file=FILENAME] [--log-level=LEVEL]
        [--log-sysfacility=NUMBER] [--pidfile=FILE_PATH] [--no-daemon]
        [--input-mode=MODE] [--output-mode=MODE]
        MODE_SPECIFIC_SWITCHES

To collect flow data over the network or by directory polling (default):

  rwflowpack ... [--input-mode=stream] [--sensor-name=SENSOR]
        [--polling-interval=NUMBER] [--archive-directory=DIR_PATH]
        [--error-directory=DIR_PATH] ...

To collect from a file containing NetFlow v5 PDUs:

  rwflowpack ... --input-mode=pdufile --netflow-file=FILE_PATH
        [--sensor-name=SENSOR] [--archive-directory=DIR_PATH]
        [--error-directory=DIR_PATH] ...

To collect from local files containing flows created by flowcap(8):

  rwflowpack ... --input-mode=fcfiles --incoming-directory=DIR_PATH
        [--polling-interval=NUMBER] [--archive-directory=DIR_PATH]
        [--error-directory=DIR_PATH] ...

To store the SiLK Flow files on the local machine (default):

  rwflowpack ... [--output-mode=local-storage]
        --root-directory=DIR_PATH ...

To forward the SiLK Flow files to a remote machine:

  rwflowpack ... --output-mode=sending --sender-directory=DIR_PATH
        --incremental-directory=DIR_PATH ...

Help options:

  rwflowpack --sensor-configuration=FILE_PATH [--packing-logic=PLUGIN]
        { --verify-sensor-config | --verify-sensor-config=VERBOSE }
  rwflowpack --help
  rwflowpack --version


DESCRIPTION

rwflowpack is a daemon that collects NetFlow v5 or IPFIX (Internet Protocol Flow Information eXport) data, converts the data to the SiLK Flow record format, categorizes each flow (e.g., as incoming or outgoing), and stores the data in binary flat files within a directory tree, with one file per hour-category-sensor tuple.

Processing of IPFIX is only available when SiLK is compiled with support for libfixbuf, which is available from http://tools.netsa.cert.org/.

See the sensor.conf(5) manual page or the SiLK Installation Handbook for an explanation of how SiLK categorizes flows and converts data to the SiLK format.


OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
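
The abbreviation rule above can be illustrated with a small sketch. This is not how rwflowpack parses its arguments internally; the function name resolve_option and the OPTIONS subset are hypothetical and chosen only for illustration.

```python
def resolve_option(name, known_options):
    """Resolve a possibly abbreviated option name.

    An exact match always wins; otherwise the abbreviation must be a
    prefix of exactly one known option.
    """
    if name in known_options:
        return name
    matches = [opt for opt in known_options if opt.startswith(name)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown option: {name}")
    raise ValueError(f"ambiguous option {name}: {', '.join(sorted(matches))}")


# A subset of the switches listed in the SYNOPSIS (illustration only).
OPTIONS = [
    "log-destination", "log-directory", "log-basename",
    "log-pathname", "log-level", "log-sysfacility",
    "help", "version",
]

print(resolve_option("log-dest", OPTIONS))
```

With this subset, log-dest resolves to a single switch, whereas log-d is rejected because it is a prefix of both log-destination and log-directory.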

General Configuration

The following switch is required:

--sensor-configuration=FILE_PATH

Give the path to the configuration file that rwflowpack will consult to determine whether a record represents an incoming or outgoing flow. The complete syntax of the configuration file is described in the sensor.conf(5) manual page; see also the SiLK Installation Handbook.

The following set of switches is optional:

--packing-logic=PLUGIN

Specify the plug-in that rwflowpack should load, where the plug-in provides functions that determine into which class and type each flow record will be categorized and the format of the files that rwflowpack will write. When SiLK has been configured with hard-coded packing logic (i.e., when --enable-packing-logic was specified to the configure script), this switch will not be present on rwflowpack. A default value for this switch may be specified in the silk.conf(5) site configuration file (see the description of the --site-config-file switch). When PLUGIN contains a slash (/), rwflowpack assumes the path to PLUGIN is correct. Otherwise, rwflowpack will attempt to find the file in $SILK_PATH/lib/silk, $SILK_PATH/share/lib, $SILK_PATH/lib, and in these directories parallel to the application's directory: lib/silk, share/lib, and lib. If rwflowpack does not find the file, it assumes the plug-in is in the current directory. To force rwflowpack to look in the current directory first, specify --packing-logic=./PLUGIN. When the SILK_PLUGIN_DEBUG environment variable is non-empty, rwflowpack prints status messages to the standard error as it tries to open the plug-in.

--byte-order=ENDIAN

Set the byte order for newly created SiLK Flow files. When appending records to an existing file, the byte order of the file is maintained. The argument is one of the following:

native

Use the byte order of the machine where rwflowpack is running. This is the default.

big

Use network byte order (big endian) for the flow files.

little

Write the flow files in little endian format.

--compression-method=COMP_METHOD

Set the compression method for newly created SiLK Flow files to COMP_METHOD. When appending records to an existing file, the compression method of the file is maintained.

In addition to the packing (shrinking) of the flow records that SiLK normally does, rwflowpack can use an external library to further reduce the size of the records on disk. The list of available compression methods and the default method are set when SiLK is compiled (the --help and --version switches print the available and default compression methods) and depend on which supported libraries are found. SiLK can support the following:

none

Do not compress the SiLK Flow records using an external library.

zlib

Use the zlib(3) library for compressing the flow records.

lzo1x

Use the lzo1x algorithm from the LZO real-time compression library for compressing the flow records.

best

Use whichever available method gives the best compression in general, though not necessarily the best for this particular file.

--pack-interfaces

Allow one to override the default file output format of the packed SiLK Flow files that rwflowpack writes. When this switch is present, rwflowpack writes additional information into the packed files: the router's SNMP input and output interfaces and the next-hop IP address. The extra data produced by this switch is useful for determining why traffic is being stored in certain files.

Note that this switch will only affect newly created files. New records will always be appended to an existing file in the file's current output format to maintain file integrity.

--no-file-locking

Do not use advisory write locks. Normally, rwflowpack will attempt to obtain a write lock on the data files prior to writing records to them; these locks prevent two instances of rwflowpack from writing to the same data file. However, not all file systems support advisory write locks, and this switch must be used when writing data to such a file system.

--flush-timeout=VAL

Set the timeout for flushing any in-memory records to disk to VAL seconds. If not specified, the default is 2 minutes (120 seconds). When using local storage mode, this value specifies how often the files are flushed to disk to ensure that any records in memory are written to disk. When using sending output mode, this value specifies how often to close the files and move them from the incremental-directory to the sender-directory.

--file-cache-size=VAL

Set the maximum number of data files to have open for writing at any one time to VAL. If not specified, the default is 64 files.

--site-config-file=FILENAME

Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the root of the data directory specified in the --root-directory switch; the directory specified in the SILK_DATA_ROOTDIR environment variable (sending mode only); the data root directory that is compiled into SiLK (sending mode only); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application's directory.
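
The search order described above can be sketched as follows. This is an illustration only, not SiLK's implementation: the function site_config_candidates and its parameters are hypothetical, and "parallel to the application's directory" is assumed here to mean a sibling of the directory that holds the rwflowpack executable.

```python
import os
from pathlib import PurePosixPath


def site_config_candidates(app_path, root_directory=None,
                           output_mode="local-storage", switch_value=None,
                           compiled_root=None, env=None):
    """Return the locations to check for the site configuration, in order."""
    env = os.environ if env is None else env
    # --site-config-file takes precedence over everything else.
    if switch_value:
        return [switch_value]
    # A non-empty SILK_CONFIG_FILE names the file itself.
    if env.get("SILK_CONFIG_FILE"):
        return [env["SILK_CONFIG_FILE"]]

    dirs = []
    if root_directory:
        dirs.append(PurePosixPath(root_directory))
    if output_mode == "sending":
        if env.get("SILK_DATA_ROOTDIR"):
            dirs.append(PurePosixPath(env["SILK_DATA_ROOTDIR"]))
        if compiled_root:
            dirs.append(PurePosixPath(compiled_root))
    if env.get("SILK_PATH"):
        base = PurePosixPath(env["SILK_PATH"])
        dirs += [base / "share" / "silk", base / "share"]
    # Assumption: the install prefix is the parent of the executable's directory.
    prefix = PurePosixPath(app_path).parent.parent
    dirs += [prefix / "share" / "silk", prefix / "share"]
    return [str(d / "silk.conf") for d in dirs]


for path in site_config_candidates("/usr/local/sbin/rwflowpack",
                                   root_directory="/data",
                                   env={"SILK_PATH": "/opt/silk"}):
    print(path)
```

The paths /usr/local/sbin/rwflowpack, /data, and /opt/silk are placeholders.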

Logging and Daemon Configuration

One of the following mutually-exclusive switches is required:

--log-destination=DESTINATION

Specify the destination where logging messages are written. When DESTINATION begins with a slash /, it is treated as a file system path and all log messages are written to that file; there is no log rotation. When DESTINATION does not begin with /, it must be one of the following strings:

none

Messages are not written anywhere.

stdout

Messages are written to the standard output.

stderr

Messages are written to the standard error.

syslog

Messages are written using the syslog(3) facility.

both

Messages are written to the syslog facility and to the standard error (this option is not available on all platforms).

--log-directory=DIR_PATH

Use DIR_PATH as the directory where the log files are written. DIR_PATH must be a complete directory path. The log files have the form

  DIR_PATH/LOG_BASENAME-YYYYMMDD.log

where YYYYMMDD is the current date and LOG_BASENAME is the application name or the value passed to the --log-basename switch when provided. The log files will be rotated: at midnight local time a new log will be opened and the previous day's log file will be compressed using gzip(1). (Old log files are not removed by rwflowpack; the administrator should use another tool to remove them.) When this switch is provided, a process ID (PID) file will also be written in this directory unless the --pidfile switch is provided.
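
As a minimal sketch of the naming scheme above, the following helper builds the log file path for a given day. The function name log_file_path and the directory /var/log/silk are hypothetical; the default basename assumes the application name is rwflowpack.

```python
from datetime import date
from pathlib import PurePosixPath


def log_file_path(log_dir, day, basename="rwflowpack"):
    """Build DIR_PATH/LOG_BASENAME-YYYYMMDD.log for the given day."""
    return str(PurePosixPath(log_dir) / f"{basename}-{day:%Y%m%d}.log")


print(log_file_path("/var/log/silk", date(2003, 10, 4)))
```

A cleanup job for old logs could use the same pattern to decide which files to remove.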

--log-pathname=FILE_PATH

Use FILE_PATH as the complete path to the log file. The log file will not be rotated.

The following set of switches is optional:

--log-level=LEVEL

Set the severity of messages that will be logged. The levels from most severe to least are: emerg, alert, crit, err, warning, notice, info, debug. The default is info.
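
The ordering of the levels can be sketched as below. The sketch assumes syslog-style filtering, in which a message is logged when it is at least as severe as the configured level; the manual page does not spell this out, and the function name is_logged is hypothetical.

```python
# Levels from most severe to least, as listed above.
LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def is_logged(message_level, threshold="info"):
    """Return True when message_level is at least as severe as threshold."""
    return LEVELS.index(message_level) <= LEVELS.index(threshold)


print(is_logged("warning"))  # more severe than the default, info
print(is_logged("debug"))    # less severe than the default, info
```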

--log-sysfacility=NUMBER

Set the facility that syslog(3) uses for logging messages. This switch takes a number as an argument. The default is a value that corresponds to LOG_USER on the system where rwflowpack is running. This switch produces an error unless --log-destination=syslog is specified.

--log-basename=LOG_BASENAME

Use LOG_BASENAME in place of the application name for the files in the log directory. See the description of the --log-directory switch.

--pidfile=FILE_PATH

Set the complete path to the file in which rwflowpack writes its process ID (PID) when it is running as a daemon. No PID file is written when --no-daemon is given. When this switch is not present, no PID file is written unless the --log-directory switch is specified, in which case the PID is written to DIR_PATH/rwflowpack.pid, where DIR_PATH is the argument to --log-directory.

--no-daemon

Force rwflowpack to stay in the foreground; it does not become a daemon. This is useful for debugging.

Input and Output Mode

rwflowpack supports multiple ways of getting data (the input mode) and storing data (the output mode).

--input-mode=MODE

Determine how rwflowpack will gather data. The default input MODE is stream. The available modes are

stream

rwflowpack uses the probes in the sensor configuration file that specify a network port, a unix domain socket, or a polling directory. For these probes, rwflowpack opens the ports and/or begins processing data files in the named directories.

pdufile

rwflowpack reads NetFlow v5 PDUs from a file, where the file's format is that created by NetFlow Collector: The file's size must be an integer multiple of 1464, where each 1464 byte chunk contains a 24 byte NetFlow v5 header and space for thirty 48 byte NetFlow records. The number of valid records per chunk is specified in the header.

fcfiles

rwflowpack polls a local directory for files that were created by the flowcap(8) daemon. Typically flowcap runs on a separate machine near a router or other flow meter that is generating NetFlow v5 or IPFIX records. flowcap collects the records, compresses them, and stores them on its local disk. For the fcfiles input mode, the files are moved between the flowcap and rwflowpack machines by separate programs, typically the rwsender(8) and rwreceiver(8) daemons. In this mode, rwflowpack ignores the probe definitions in the sensor configuration file.

--output-mode=MODE

Determine what rwflowpack will do with the data as it is packed into SiLK binary files. The default output MODE is local-storage. The available modes are

local-storage

rwflowpack writes the data on the local machine into a directory tree with a specific structure.

sending

rwflowpack writes the data into a temporary location on the local disk. A separate program, rwsender(8), moves the data from the local machine to remote machines, where rwreceiver(8), working in concert with rwflowappend(8), writes the data into a directory tree with a specific structure.

Stream Collection Switches (--input-mode=stream)

When the --input-mode switch is set to stream or when the switch is not provided, rwflowpack expects to receive stream(s) of NetFlow or IPFIX data over the network and/or files of NetFlow v5 records by polling directories on the local machine. rwflowpack will open a port for every probe specified in the sensor.conf(5) configuration file that has a listen-on-port attribute, and poll every directory for probes that list a poll-directory. See the next section for a description of the NetFlow v5 file format.

The following switches are optional:

--sensor-name=SENSOR

Cause rwflowpack to ignore all probes in the sensor configuration file except the probes for SENSOR. Only data for this sensor will be collected. This allows a common configuration file to be used by multiple rwflowpack invocations while each rwflowpack instance collects data for only a single sensor. There must be a sensor definition for SENSOR in the configuration file. When this switch is not present, rwflowpack will collect and pack data for all sensors.

--polling-interval=NUMBER

Specify the number of seconds rwflowpack will wait between queries of each poll-directory. This defaults to 15 seconds.

--archive-directory=DIR_PATH

When using probes that specify a poll-directory, this switch names the full path of the directory root to which the NetFlow files will be moved after they have been processed by rwflowpack. If this switch is not provided, the original NetFlow source files are deleted. Under DIR_PATH, a directory tree for each year, month, day, and hour (based on the current UTC time) is created, making the full path to a file DIR_PATH/YEAR/MONTH/DAY/HOUR/FILE. Removing files from the archive-directory is not the job of rwflowpack; the system administrator should implement a separate cron(8) job to clean this directory.
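
The archive layout described above can be sketched as follows. The function name archive_path and the example paths are hypothetical, and the zero-padded, numeric form of the YEAR, MONTH, DAY, and HOUR components is an assumption; the manual page gives only the order of the components.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def archive_path(archive_dir, filename, now=None):
    """Build DIR_PATH/YEAR/MONTH/DAY/HOUR/FILE from the current UTC time.

    `now` must be a timezone-aware datetime; it defaults to the current time.
    """
    now = datetime.now(timezone.utc) if now is None else now.astimezone(timezone.utc)
    # Assumption: components are zero-padded decimal numbers.
    parts = [f"{now:%Y}", f"{now:%m}", f"{now:%d}", f"{now:%H}"]
    return str(PurePosixPath(archive_dir).joinpath(*parts, filename))


stamp = datetime(2003, 10, 4, 13, 5, tzinfo=timezone.utc)
print(archive_path("/data/archive", "pdu-file.dat", stamp))
```

Because the tree is keyed on the time of processing rather than on the flow timestamps, a cleanup job can prune whole day or hour directories by age.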

--error-directory=DIR_PATH

When using probes that specify a poll-directory, this switch names the full path of the directory to which problem files are moved. Problem files are those that rwflowpack cannot open or that are not of the correct format (non-NetFlow files). If this switch is not provided, problem files remain in place and cause rwflowpack to exit. Unlike the --archive-directory, files are stored directly in this directory, that is DIR_PATH/FILE.

PDU File Collection Switches (--input-mode=pdufile)

In this mode, rwflowpack processes a single file of NetFlow v5 data. Typically these files are generated by a NetFlow collector. rwflowpack will not become a daemon in this mode; instead it will remain in the foreground, process the NetFlow file, and exit.

The NetFlow v5 file has a particular format: The file's length should be an integer multiple of 1464 bytes, where 1464 is the maximum length of the NetFlow v5 PDU. Each 1464-byte block should contain the 24-byte NetFlow v5 header and space for thirty 48-byte flow records, even if data for only one NetFlow record is valid.
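
A quick sanity check of such a file can be sketched as below. The block arithmetic (24 + 30 x 48 = 1464) comes from the description above; reading the version and record count as the first two big-endian 16-bit fields of the header follows the standard NetFlow v5 export format rather than this manual page. The function name count_valid_records is hypothetical.

```python
import struct

PDU_BLOCK_SIZE = 1464   # maximum length of a NetFlow v5 PDU
HEADER_SIZE = 24        # NetFlow v5 header
RECORD_SIZE = 48        # one NetFlow v5 flow record
MAX_RECORDS = 30        # record slots per block


def count_valid_records(data):
    """Validate the block layout and return the total number of valid records."""
    if len(data) % PDU_BLOCK_SIZE != 0:
        raise ValueError("file length is not a multiple of 1464 bytes")
    total = 0
    for offset in range(0, len(data), PDU_BLOCK_SIZE):
        # The header starts with the version and the record count (network byte order).
        version, count = struct.unpack_from("!HH", data, offset)
        if version != 5:
            raise ValueError(f"block at offset {offset} is not NetFlow v5")
        if count > MAX_RECORDS:
            raise ValueError(f"block at offset {offset} claims {count} records")
        total += count
    return total


# Build a synthetic two-block file in memory; each header claims 2 valid records.
block = struct.pack("!HH", 5, 2) + bytes(PDU_BLOCK_SIZE - 4)
print(count_valid_records(block * 2))
```

Running a check like this before invoking rwflowpack with --input-mode=pdufile may help catch truncated files early.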

The --netflow-file switch is required in this mode; it specifies the NetFlow file to process. Any value specified in the read-from-file command in the sensor.conf file is ignored; the value is typically set to /dev/null. The --sensor-name switch is also required in pdufile mode unless the sensor.conf file contains a single sensor.

The following switches are available in PDU File mode:

--netflow-file=FILE_PATH

Name the full path of the file from which rwflowpack reads NetFlow v5 PDUs. This switch is required in PDU File mode.

--sensor-name=SENSOR

Cause rwflowpack to ignore all probes in the sensor configuration file except the probes for SENSOR. There must be a sensor definition for SENSOR in the configuration file. This switch is required in this mode unless the sensor.conf file only defines a single sensor.

--archive-directory=DIR_PATH

Name the full path of the directory to which the NetFlow file will be moved after it has been processed by rwflowpack. If this switch is not provided, the original NetFlow source file is not modified, moved, or deleted. Under DIR_PATH, a directory tree for each year, month, day, and hour (based on the current UTC time) is created, making the full path to a file DIR_PATH/YEAR/MONTH/DAY/HOUR/FILE. Removing files from the archive-directory is not the job of rwflowpack; the system administrator should implement a separate cron(8) job to clean this directory.

--error-directory=DIR_PATH

Name the full path of the directory to which the NetFlow file will be moved if it cannot be opened or if it is not a NetFlow v5 file. If this switch is not provided, a bad source file remains in place. Unlike the --archive-directory, files are stored directly in this directory, that is DIR_PATH/FILE.

Flowcap Files Collection Switches (--input-mode=fcfiles)

When the --input-mode=fcfiles switch is provided, rwflowpack will process files created by another SiLK daemon called flowcap(8). Typically flowcap runs near a router or other flow meter that is generating NetFlow v5 or IPFIX records. flowcap collects the records, compresses them, and stores them on the local disk. These files are transferred between the flowcap machine and rwflowpack machine by external programs (typically the rwsender(8) and rwreceiver(8) daemons). rwflowpack polls a local directory for these files, and then processes the files to generate SiLK Flow files.

In fcfiles mode, rwflowpack ignores the probe definitions in the sensor.conf file, since flowcap has already labeled the files with the probe where the flows were collected. rwflowpack still uses the sensor definitions in sensor.conf.

rwflowpack from SiLK-1.0 and later will process files created by flowcap from SiLK-0.11.x or earlier, but you need to take care in setting up the sensor.conf file. See the NOTES section below and sensor.conf(5).

When operating in flowcap files input mode, the first of the following switches is required:

--incoming-directory=DIR_PATH

Name the full path of the directory which rwflowpack will monitor for files created by flowcap. Once processed by rwflowpack, files are moved from this directory to the archive-directory, if it has been specified.

--polling-interval=NUMBER

Specify the number of seconds rwflowpack will wait between polls of the incoming-directory for new files created by flowcap. If not given, the default value is 15 seconds.

--archive-directory=DIR_PATH

Name the full path of the directory used to store the files after rwflowpack has processed them. If this switch is not provided, the files are deleted. Under DIR_PATH, a directory tree for each year, month, day, and hour (based on the current UTC time) is created, making the full path to a file DIR_PATH/YEAR/MONTH/DAY/HOUR/FILE. Removing files from the archive-directory is not the job of rwflowpack; the system administrator should implement a separate cron(8) job to clean this directory.

--error-directory=DIR_PATH

Name the full path of the directory to which problem files are moved. Problem files are those that rwflowpack cannot open, are not of the correct format, or contain an unrecognized probe name. If this switch is not provided, problem files remain in place and cause rwflowpack to exit. Unlike the --archive-directory, files are stored directly in this directory, that is DIR_PATH/FILE.

Local Storage Mode Switches (--output-mode=local-storage)

Once rwflowpack has collected data, categorized it, and written it into files, it can do one of two things with the files:

  1. Store the files on the local disk in a well-defined location.

  2. Transfer the files to another machine and store them in a well defined location (see sending mode below).

(The data files must be stored in a well-defined location so that rwfilter(1) can find them. To see rwfilter's idea of the well-defined location, run rwfilter --version.)

The default output-mode is to store the files on the local disk (i.e., local-storage). When operating in this mode, the following switch is required:

--root-directory=DIR_PATH

Name the full path of the directory under which the files containing the packed SiLK Flow records will be stored. rwflowpack will create subdirectories below DIR_PATH based on the data received.

Sending Mode Storage Switches (--output-mode=sending)

To transfer the packed SiLK Flow files to another machine, specify the --output-mode=sending switch and invoke rwsender(8) to transfer the files. When rwflowpack is used with rwsender, the following two switches must be provided:

--incremental-directory=DIR_PATH

Name the full path of the directory under which packed SiLK files will initially be created. Files in this directory are considered to be incomplete; any files in this directory will be removed when rwflowpack is started. Once complete, files are moved from this directory to the sender-directory.

--sender-directory=DIR_PATH

Name the full path of the directory under which completed incremental files are stored while awaiting action by rwsender. The rwsender is responsible for removing files from this directory.
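
As a rough sketch, the switches required for sending mode can be assembled into an argument list like the one below. The helper sending_mode_argv and all paths are hypothetical; only switches documented on this page are used, and --log-directory is picked arbitrarily from the three mutually exclusive logging switches.

```python
def sending_mode_argv(sensor_conf, log_dir, incremental_dir, sender_dir):
    """Assemble an rwflowpack command line for --output-mode=sending."""
    return [
        "rwflowpack",
        f"--sensor-configuration={sensor_conf}",
        f"--log-directory={log_dir}",
        "--output-mode=sending",
        f"--incremental-directory={incremental_dir}",
        f"--sender-directory={sender_dir}",
    ]


argv = sending_mode_argv(
    "/etc/silk/sensor.conf",       # placeholder paths
    "/var/log/silk",
    "/var/silk/incremental",
    "/var/silk/sender",
)
print(" ".join(argv))
```

The resulting list could be passed to subprocess.run on a machine where rwflowpack is installed; rwsender must be configured separately to watch the sender directory.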

Help Options

--verify-sensor-config
--verify-sensor-config=VERBOSE

Verify that the syntax of the sensor configuration file is correct and then exit rwflowpack. If the file is incorrect or if it does not define any sensors, an error message is printed and rwflowpack exits abnormally. If the file is correct and no argument is provided to the --verify-sensor-config switch, rwflowpack simply exits with status 0. If an argument (other than the empty string and 0) is provided to the switch, the names of the probes and sensors found in the sensor configuration file are printed to the standard output, and then rwflowpack exits.

--help

Print the available options and exit.

--version

Print the version number and information about how SiLK was configured, then exit the application.


ENVIRONMENT

SILK_CONFIG_FILE

This environment variable is used as the value for the --site-config-file when that switch is not provided.

SILK_DATA_ROOTDIR

When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwflowpack looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.

SILK_PATH

This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwflowpack checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share.

SILK_PLUGIN_DEBUG

When set to 1, rwflowpack prints status messages to the standard error as it tries to open the packing logic plug-in.


FILES

The root of the directory tree that contains the packed, binary SiLK Flow files is set by the --root-directory switch; this directory is called the SILK_DATA_ROOTDIR. Immediately underneath it are subdirectories corresponding to the traffic categories (directions) discussed above. Under these are directories representing the year, month, and day in YYYY/MM/DD format. That is

  $SILK_DATA_ROOTDIR/in/{$YEAR}/{$MONTH}/{$DAY}/*
  $SILK_DATA_ROOTDIR/inweb/{$YEAR}/{$MONTH}/{$DAY}/*
  $SILK_DATA_ROOTDIR/innull/{$YEAR}/{$MONTH}/{$DAY}/*
  $SILK_DATA_ROOTDIR/out/{$YEAR}/{$MONTH}/{$DAY}/*
  $SILK_DATA_ROOTDIR/outweb/{$YEAR}/{$MONTH}/{$DAY}/*
  $SILK_DATA_ROOTDIR/outnull/{$YEAR}/{$MONTH}/{$DAY}/*

For example, output web files for October 4th, 2003 are recorded in $SILK_DATA_ROOTDIR/outweb/2003/10/04/

The names of the files in these directories include all of this information, and are written in the form:

flowType-sensorName_YYYYMMDD.HH

where flowType encodes the category and sensorName is the sensor on which the flow was collected.
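
The directory and file naming above can be sketched as follows. The function name data_file_path is hypothetical, and the flowType value "ow" in the example is a made-up placeholder: the manual page says only that flowType encodes the category, not what the encoding is.

```python
from datetime import datetime
from pathlib import PurePosixPath


def data_file_path(root, category, flow_type, sensor, hour):
    """Build ROOT/category/YYYY/MM/DD/flowType-sensorName_YYYYMMDD.HH."""
    directory = PurePosixPath(root) / category / f"{hour:%Y}" / f"{hour:%m}" / f"{hour:%d}"
    return str(directory / f"{flow_type}-{sensor}_{hour:%Y%m%d.%H}")


# Outgoing web traffic for sensor S2 during the 13:00 hour of October 4th, 2003.
print(data_file_path("/data", "outweb", "ow", "S2", datetime(2003, 10, 4, 13)))
```

Since each file covers one hour, category, and sensor, a path built this way identifies exactly one data file.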


SEE ALSO

SiLK Installation Handbook, sensor.conf(5), silk.conf(5), flowcap(8), rwfilter(1), rwflowappend(8), rwreceiver(8), rwsender(8), silk(7), syslog(3), cron(8)


NOTES

rwflowpack in SiLK-1.0 and later can process files created by flowcap from SiLK-0.11.x or earlier when you properly name the probes in your sensor.conf file. To communicate information about where a flow is collected, flowcap writes information into each file's header, and when rwflowpack processes the file, it uses this information to find entries in the sensor.conf file. Since SiLK-1.0, this information is the name of the probe where the flow was collected. Files created by flowcap prior to the SiLK-1.0 release contain the sensor name and probe name in the file's header. When rwflowpack reads these files, it converts the sensor and probe names to the value sensor_probe and attempts to find that probe in the sensor.conf file. For flowcap files with no explicit probe name, rwflowpack looks for a probe named sensor.

For example, when the sensor.conf file used by pre-1.0 flowcap contains these entries

 sensor-probe S2
    probe-type ipfix
    probe-name ipfix
    protocol tcp
    ...
 sensor-probe S4
    probe-type netflow
    protocol udp
    ...

you should create the following entries in the sensor.conf used by rwflowpack:

 probe S2_ipfix ipfix
    protocol tcp
    ...
 end probe
 sensor S2
    ipfix-probes S2_ipfix
    ...
 end sensor
 probe S4 netflow
    protocol udp
    ...
 end probe
 sensor S4
    netflow-probes S4
    ...
 end sensor

SiLK comes with the update-sensor-conf script that will re-write the sensor.conf file to be compatible with the new format.
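
The naming rule for pre-1.0 flowcap files can be summarized in a short sketch. The function legacy_probe_name is hypothetical and only mirrors the rule stated in these notes; it is not part of SiLK or of the update-sensor-conf script.

```python
def legacy_probe_name(sensor, probe=None):
    """Return the probe name rwflowpack looks up for a pre-1.0 flowcap file.

    With an explicit probe name the result is sensor_probe; without one,
    the sensor name itself is used.
    """
    return f"{sensor}_{probe}" if probe else sensor


print(legacy_probe_name("S2", "ipfix"))  # S2_ipfix, as in the example above
print(legacy_probe_name("S4"))           # S4, as in the example above
```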