NAME
rwflowpack - Collects flow data and stores it in binary SiLK Flow files
SYNOPSIS
rwflowpack --sensor-configuration=FILE_PATH [--packing-logic=PLUGIN]
{ --log-destination=DESTINATION
| --log-directory=DIR_PATH [--log-basename=BASENAME]
| --log-pathname=FILE_PATH }
[--byte-order=ENDIAN] [--compression-method=COMP_METHOD]
[--pack-interfaces] [--no-file-locking]
[--flush-timeout=VAL] [--file-cache-size=VAL]
[--site-config-file=FILENAME] [--log-level=LEVEL]
[--log-sysfacility=NUMBER] [--pidfile=FILE_PATH] [--no-daemon]
[--input-mode=MODE] [--output-mode=MODE]
MODE_SPECIFIC_SWITCHES
To collect flow data over the network or by directory polling (default):
rwflowpack ... [--input-mode=stream] [--sensor-name=SENSOR]
[--polling-interval=NUMBER] [--archive-directory=DIR_PATH]
[--error-directory=DIR_PATH] ...
To collect from a file containing NetFlow v5 PDUs:
rwflowpack ... --input-mode=pdufile --netflow-file=FILE_PATH
[--sensor-name=SENSOR] [--archive-directory=DIR_PATH]
[--error-directory=DIR_PATH] ...
To collect from local files containing flows created by flowcap(8):
rwflowpack ... --input-mode=fcfiles --incoming-directory=DIR_PATH
[--polling-interval=NUMBER] [--archive-directory=DIR_PATH]
[--error-directory=DIR_PATH] ...
To store the SiLK Flow files on the local machine (default):
rwflowpack ... [--output-mode=local-storage]
--root-directory=DIR_PATH ...
To forward the SiLK Flow files to a remote machine:
rwflowpack ... --output-mode=sending --sender-directory=DIR_PATH
--incremental-directory=DIR_PATH ...
Help options:
rwflowpack --sensor-configuration=FILE_PATH [--packing-logic=PLUGIN]
{ --verify-sensor-config | --verify-sensor-config=VERBOSE }
rwflowpack --help
rwflowpack --version
DESCRIPTION
rwflowpack is a daemon that collects NetFlow v5 or IPFIX (Internet Protocol Flow Information eXport) data, converts the data to the SiLK Flow record format, categorizes each flow (e.g., as incoming or outgoing), and stores the data in binary flat files within a directory tree, with one file per hour-category-sensor tuple.
Processing of IPFIX is only available when SiLK is compiled with support for libfixbuf, which is available from http://tools.netsa.cert.org/.
See the sensor.conf(5) manual page or the SiLK Installation Handbook for an explanation of how SiLK categorizes flows and converts data to the SiLK format.
OPTIONS
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
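The abbreviation rule above can be illustrated with a minimal Python sketch. It is not rwflowpack's actual option parser; the function name resolve_option and the short OPTIONS list are hypothetical and exist only for this illustration.

```python
def resolve_option(name, known):
    """Return the full option name for a possibly abbreviated NAME."""
    if name in known:
        return name  # an exact match always wins
    # Otherwise the abbreviation must be a prefix of exactly one option.
    matches = [opt for opt in known if opt.startswith(name)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown option: --{name}")
    raise ValueError(f"ambiguous option --{name}: {', '.join(sorted(matches))}")


# A small, illustrative subset of the switches documented on this page.
OPTIONS = ["log-destination", "log-directory", "log-level", "help", "version"]
```

With this subset, log-l resolves to log-level, while log-d is rejected because it matches both log-destination and log-directory.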
General Configuration
The following switch is required:
- --sensor-configuration=FILE_PATH
-
Give the path to the configuration file that rwflowpack will consult to determine whether a record represents an incoming or outgoing flow. The complete syntax of the configuration file is described in the sensor.conf(5) manual page; see also the SiLK Installation Handbook.
The following set of switches is optional:
- --packing-logic=PLUGIN
-
Specify the plug-in that rwflowpack should load, where the plug-in provides functions that determine into which class and type each flow record will be categorized and the format of the files that rwflowpack will write. When SiLK has been configured with hard-coded packing logic (i.e., when --enable-packing-logic was specified to the configure script), this switch will not be present on rwflowpack. A default value for this switch may be specified in the silk.conf(5) site configuration file (see the description of the --site-config-file switch). When PLUGIN contains a slash (/), rwflowpack assumes the path to PLUGIN is correct. Otherwise, rwflowpack will attempt to find the file in $SILK_PATH/lib/silk, $SILK_PATH/share/lib, $SILK_PATH/lib, and in these directories parallel to the application's directory: lib/silk, share/lib, and lib. If rwflowpack does not find the file, it assumes the plug-in is in the current directory. To force rwflowpack to look in the current directory first, specify --packing-logic=./PLUGIN. When the SILK_PLUGIN_DEBUG environment variable is non-empty, rwflowpack prints status messages to the standard error as it tries to open the plug-in.
- --byte-order=ENDIAN
-
Set the byte order for newly created SiLK Flow files. When appending records to an existing file, the byte order of the file is maintained. The argument is one of the following:
- native
-
Use the byte order of the machine where rwflowpack is running. This is the default.
- big
-
Use network byte order (big endian) for the flow files.
- little
-
Write the flow files in little endian format.
- --compression-method=COMP_METHOD
-
Set the compression method for newly created SiLK Flow files to COMP_METHOD. When appending records to an existing file, the compression method of the file is maintained.
-
In addition to the packing (shrinking) of the flow records that SiLK normally does, rwflowpack can use an external library to further reduce the size of the records on disk. The list of available compression methods and the default method are set when SiLK is compiled (the --help and --version switches print the available and default compression methods) and depend on which supported libraries are found. SiLK can support the following:
- none
-
Do not compress the SiLK Flow records using an external library.
- zlib
-
Use the zlib(3) library for compressing the flow records.
- lzo1x
-
Use the lzo1x algorithm from the LZO real-time compression library for compressing the flow records.
- best
-
Use whichever available method gives the best compression in general, though not necessarily the best for this particular file.
- --pack-interfaces
-
Override the default output format of the packed SiLK Flow files that rwflowpack writes. When this switch is present, rwflowpack writes additional information into the packed files: the router's SNMP input and output interfaces and the next-hop IP address. The extra data produced by this switch is useful for determining why traffic is being stored in certain files.
-
Note that this switch will only affect newly created files. New records will always be appended to an existing file in the file's current output format to maintain file integrity.
- --no-file-locking
-
Do not use advisory write locks. Normally, rwflowpack will attempt to obtain a write lock on the data files prior to writing records to them; these locks prevent two instances of rwflowpack from writing to the same data file. However, not all file systems support advisory write locks, and this switch must be used when writing data to such a file system.
- --flush-timeout=VAL
-
Set the timeout for flushing any in-memory records to disk to VAL seconds. If not specified, the default is 2 minutes (120 seconds). When using local storage mode, this value specifies how often the files are flushed to disk to ensure that any records in memory are written to disk. When using sending output mode, this value specifies how often rwflowpack closes the files and moves them from the incremental-directory to the sender-directory.
- --file-cache-size=VAL
-
Set the maximum number of data files to have open for writing at any one time to VAL. If not specified, the default is 64 files.
- --site-config-file=FILENAME
-
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the root of the data directory specified in the --root-directory switch; the directory specified in the SILK_DATA_ROOTDIR environment variable (sending mode only); the data root directory that is compiled into SiLK (sending mode only); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application's directory.
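The lookup order described for --site-config-file can be sketched in Python as follows. This is only an illustration of the order given in the text, not the code SiLK uses. The function name, its parameters, and the reading of "parallel to the application's directory" as a sibling of the directory holding the executable are assumptions.

```python
import os
import posixpath


def site_config_candidates(switch_value=None, root_directory=None,
                           app_dir="/usr/local/bin", sending_mode=False,
                           compiled_root=None, env=None):
    """List, in order, the locations that would be checked for silk.conf."""
    env = os.environ if env is None else env
    if switch_value:                      # --site-config-file=FILENAME
        return [switch_value]
    if env.get("SILK_CONFIG_FILE"):       # value includes the file name
        return [env["SILK_CONFIG_FILE"]]
    dirs = []
    if root_directory:                    # --root-directory
        dirs.append(root_directory)
    if sending_mode:                      # checked in sending mode only
        dirs.extend(d for d in (env.get("SILK_DATA_ROOTDIR"), compiled_root) if d)
    if env.get("SILK_PATH"):
        dirs.append(posixpath.join(env["SILK_PATH"], "share", "silk"))
        dirs.append(posixpath.join(env["SILK_PATH"], "share"))
    # Assumption: "parallel" means a sibling of the executable's directory.
    parent = posixpath.dirname(posixpath.normpath(app_dir))
    dirs.append(posixpath.join(parent, "share", "silk"))
    dirs.append(posixpath.join(parent, "share"))
    return [posixpath.join(d, "silk.conf") for d in dirs]
```

The default app_dir of /usr/local/bin is a placeholder; pass the directory where rwflowpack is actually installed.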
Logging and Daemon Configuration
One of the following mutually-exclusive switches is required:
- --log-destination=DESTINATION
-
Specify the destination where logging messages are written. When DESTINATION begins with a slash (/), it is treated as a file system path and all log messages are written to that file; there is no log rotation. When DESTINATION does not begin with /, it must be one of the following strings:
- none
-
Messages are not written anywhere.
- stdout
-
Messages are written to the standard output.
- stderr
-
Messages are written to the standard error.
- syslog
-
Messages are written using the syslog(3) facility.
- both
-
Messages are written to the syslog facility and to the standard error (this option is not available on all platforms).
- --log-directory=DIR_PATH
-
Use DIR_PATH as the directory where the log files are written. DIR_PATH must be a complete directory path. The log files have the form
-
DIR_PATH/LOG_BASENAME-YYYYMMDD.log
-
where YYYYMMDD is the current date and LOG_BASENAME is the application name or the value passed to the --log-basename switch when provided. The log files will be rotated: at midnight local time a new log will be opened and the previous day's log file will be compressed using gzip(1). (Old log files are not removed by rwflowpack; the administrator should use another tool to remove them.) When this switch is provided, a process ID (PID) file will also be written in this directory unless the --pidfile switch is provided.
- --log-pathname=FILE_PATH
-
Use FILE_PATH as the complete path to the log file. The log file will not be rotated.
The following set of switches is optional:
- --log-level=LEVEL
-
Set the severity of messages that will be logged. The levels from most severe to least are: emerg, alert, crit, err, warning, notice, info, debug. The default is info.
- --log-sysfacility=NUMBER
-
Set the facility that syslog(3) uses for logging messages. This switch takes a number as an argument. The default is a value that corresponds to LOG_USER on the system where rwflowpack is running. This switch produces an error unless --log-destination=syslog is specified.
- --log-basename=LOG_BASENAME
-
Use LOG_BASENAME in place of the application name for the files in the log directory. See the description of the --log-directory switch.
- --pidfile=FILE_PATH
-
Set the complete path to the file in which rwflowpack writes its process ID (PID) when it is running as a daemon. No PID file is written when --no-daemon is given. When this switch is not present, no PID file is written unless the --log-directory switch is specified, in which case the PID is written to LOGPATH/rwflowpack.pid.
- --no-daemon
-
Force rwflowpack to stay in the foreground---it does not become a daemon. Useful for debugging.
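The file names produced by --log-directory can be illustrated with a short Python sketch. It only formats names according to the description above. The function names are hypothetical, and the default base name of rwflowpack assumes that the "application name" is the literal string rwflowpack.

```python
import datetime
import posixpath


def log_file_path(log_directory, day, log_basename="rwflowpack"):
    """Build DIR_PATH/LOG_BASENAME-YYYYMMDD.log for the given date."""
    return posixpath.join(log_directory, f"{log_basename}-{day:%Y%m%d}.log")


def default_pid_path(log_directory):
    """PID file location used when --pidfile is absent but --log-directory is set."""
    return posixpath.join(log_directory, "rwflowpack.pid")


print(log_file_path("/var/log/silk", datetime.date(2003, 10, 4)))
print(default_pid_path("/var/log/silk"))
```

The directory /var/log/silk is a placeholder chosen for the example.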
Input and Output Mode
rwflowpack supports multiple ways of getting data (the input mode) and storing data (the output mode).
- --input-mode=MODE
-
Determine how rwflowpack will gather data. The default input MODE is stream. The available modes are:
- stream
-
rwflowpack uses the probes in the sensor configuration file that specify a network port, a unix domain socket, or a polling directory. For these probes, rwflowpack opens the ports and/or begins processing data files in the named directories.
- pdufile
-
rwflowpack reads NetFlow v5 PDUs from a file, where the file's format is that created by NetFlow Collector: The file's size must be an integer multiple of 1464, where each 1464 byte chunk contains a 24 byte NetFlow v5 header and space for thirty 48 byte NetFlow records. The number of valid records per chunk is specified in the header.
- fcfiles
-
rwflowpack polls a local directory for files that were created by the flowcap(8) daemon. Typically flowcap runs on a separate machine near a router or other flow meter that is generating NetFlow v5 or IPFIX records. flowcap collects the records, compresses them, and stores them on its local disk. For the fcfiles input mode, the files are moved between the flowcap and rwflowpack machines by separate programs, typically the rwsender(8) and rwreceiver(8) daemons. In this mode, rwflowpack ignores the probe definitions in the sensor configuration file.
- --output-mode=MODE
-
Determine what rwflowpack will do with the data as it is packed into SiLK binary files. The default output MODE is local-storage. The available modes are:
- local-storage
-
rwflowpack writes the data on the local machine into a directory tree with a specific structure.
- sending
-
rwflowpack writes the data into a temporary location on the local disk. A separate program, rwsender(8), moves the data from the local machine to remote machines where rwreceiver(8), working in concert with rwflowappend(8), will write the data into a directory tree with a specific structure.
Stream Collection Switches (--input-mode=stream)
When the --input-mode switch is set to stream or when the switch
is not provided, rwflowpack expects to receive stream(s) of NetFlow
or IPFIX data over the network and/or files of NetFlow v5 records by
polling directories on the local machine. rwflowpack will open a
port for every probe specified in the sensor.conf(5) configuration
file that has a listen-on-port attribute, and poll every directory
for probes that list a poll-directory. See the next section for a
description of the NetFlow v5 file format.
The following switches are optional:
- --sensor-name=SENSOR
-
Cause rwflowpack to ignore all probes in the sensor configuration file except the probes for SENSOR. Only data for this sensor will be collected. This allows a common configuration file to be used by multiple rwflowpack invocations while each rwflowpack instance collects data for only a single sensor. There must be a sensor definition for SENSOR in the configuration file. When this switch is not present, rwflowpack will collect and pack data for all sensors.
- --polling-interval=NUMBER
-
Specify the number of seconds rwflowpack will wait between queries of each poll-directory. This defaults to 15 seconds.
- --archive-directory=DIR_PATH
-
When using probes that specify a poll-directory, this switch names the full path of the directory root to which the NetFlow files will be moved after they have been processed by rwflowpack. If this switch is not provided, the original NetFlow source files are deleted. Under DIR_PATH, a directory tree for each year, month, day, and hour (based on the current UTC time) is created, making the full path to a file DIR_PATH/YEAR/MONTH/DAY/HOUR/FILE. Removing files from the archive-directory is not the job of rwflowpack; the system administrator should implement a separate cron(8) job to clean this directory.
- --error-directory=DIR_PATH
-
When using probes that specify a poll-directory, this switch names the full path of the directory to which problem files are moved. Problem files are those that rwflowpack cannot open or are not of the correct format (non-NetFlow files). If this switch is not provided, problem files remain in place and cause rwflowpack to exit. Unlike the --archive-directory, files are stored directly in this directory, that is DIR_PATH/FILE.
PDU File Collection Switches (--input-mode=pdufile)
In this mode, rwflowpack processes a single file of NetFlow v5 data. Typically these files are generated by a NetFlow collector. rwflowpack will not become a daemon in this mode; instead it will remain in the foreground, process the NetFlow file, and exit.
The NetFlow v5 file has a particular format: The file's length should be an integer multiple of 1464 bytes, where 1464 is the maximum length of the NetFlow v5 PDU. Each 1464-byte block should contain the 24-byte NetFlow v5 header and space for thirty 48-byte flow records, even if data for only one NetFlow record is valid.
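The layout above implies 24 + 30 * 48 = 1464 bytes per block. The following Python sketch checks a file's contents against that layout and reports the number of valid records declared in each header. It is an illustration, not rwflowpack's reader. The function name is hypothetical, and the sketch relies on the standard NetFlow v5 header starting with a 16-bit version followed by a 16-bit record count, both in network byte order.

```python
import struct

HEADER_LEN = 24        # NetFlow v5 header
RECORD_LEN = 48        # one NetFlow v5 flow record
RECORDS_PER_PDU = 30   # space reserved in every block
PDU_LEN = HEADER_LEN + RECORDS_PER_PDU * RECORD_LEN  # 1464 bytes


def valid_record_counts(data):
    """Return the record count declared in the header of each 1464-byte block."""
    if len(data) % PDU_LEN != 0:
        raise ValueError(f"length {len(data)} is not a multiple of {PDU_LEN}")
    counts = []
    for offset in range(0, len(data), PDU_LEN):
        # The first two header fields are the version and the record count.
        version, count = struct.unpack_from("!HH", data, offset)
        if version != 5 or count > RECORDS_PER_PDU:
            raise ValueError(f"bad NetFlow v5 header at offset {offset}")
        counts.append(count)
    return counts
```

A caller could read the whole file with open(path, "rb").read() and pass the bytes to valid_record_counts before handing the file to rwflowpack.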
The --netflow-file switch is required in this mode; it specifies
the NetFlow file to process. Any value specified in the
read-from-file command in the sensor.conf file is ignored; the
value is typically set to /dev/null. The --sensor-name switch
is also required in pdufile mode unless the sensor.conf file
contains a single sensor.
The following switches are available in PDU File mode:
- --netflow-file=FILE_PATH
-
Name the full path of the file from which rwflowpack reads NetFlow v5 PDUs. This switch is required in PDU File mode.
- --sensor-name=SENSOR
-
Cause rwflowpack to ignore all probes in the sensor configuration file except the probes for SENSOR. There must be a sensor definition for SENSOR in the configuration file. This switch is required in this mode unless the sensor.conf file only defines a single sensor.
- --archive-directory=DIR_PATH
-
Name the full path of the directory to which the NetFlow file will be moved after it has been processed by rwflowpack. If this switch is not provided, the original NetFlow source file is not modified, moved, or deleted. Under DIR_PATH, a directory tree for each year, month, day, and hour (based on the current UTC time) is created, making the full path to a file DIR_PATH/YEAR/MONTH/DAY/HOUR/FILE. Removing files from the archive-directory is not the job of rwflowpack; the system administrator should implement a separate cron(8) job to clean this directory.
- --error-directory=DIR_PATH
-
Name the full path of the directory to which the NetFlow file will be moved if it cannot be opened or if it is not a NetFlow v5 file. If this switch is not provided, a bad source file remains in place. Unlike the --archive-directory, files are stored directly in this directory, that is DIR_PATH/FILE.
Flowcap Files Collection Switches (--input-mode=fcfiles)
When the --input-mode=fcfiles switch is provided, rwflowpack will process files created by another SiLK daemon called flowcap(8). Typically flowcap runs near a router or other flow meter that is generating NetFlow v5 or IPFIX records. flowcap collects the records, compresses them, and stores them on the local disk. These files are transferred between the flowcap machine and rwflowpack machine by external programs (typically the rwsender(8) and rwreceiver(8) daemons). rwflowpack polls a local directory for these files, and then processes the files to generate SiLK Flow files.
In fcfiles mode, rwflowpack ignores the probe definitions in the sensor.conf file since flowcap labeled the files with the probe where the flows were collected. rwflowpack will use the sensor definitions in sensor.conf.
rwflowpack from SiLK-1.0 and later will process files created by flowcap from SiLK-0.11.x or earlier, but you need to take care in setting up the sensor.conf file. See the NOTES section below and sensor.conf(5).
When operating in flowcap files input mode, the first of the following switches is required:
- --incoming-directory=DIR_PATH
-
Name the full path of the directory which rwflowpack will monitor for files created by flowcap. Once processed by rwflowpack, files are moved from this directory to the archive-directory, if it has been specified.
- --polling-interval=NUMBER
-
Specify the number of seconds rwflowpack will wait between polls of the incoming-directory for new files created by flowcap. If not given, the default value is 15 seconds.
- --archive-directory=DIR_PATH
-
Name the full path of the directory used to store the files after rwflowpack has processed them. If this switch is not provided, the files are deleted. Under DIR_PATH, a directory tree for each year, month, day, and hour (based on the current UTC time) is created, making the full path to a file DIR_PATH/YEAR/MONTH/DAY/HOUR/FILE. Removing files from the archive-directory is not the job of rwflowpack; the system administrator should implement a separate cron(8) job to clean this directory.
- --error-directory=DIR_PATH
-
Name the full path of the directory to which problem files are moved. Problem files are those that rwflowpack cannot open, are not of the correct format, or contain an unrecognized probe name. If this switch is not provided, problem files remain in place and cause rwflowpack to exit. Unlike the --archive-directory, files are stored directly in this directory, that is DIR_PATH/FILE.
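The difference between the two destinations, a time-based tree under the archive-directory versus a flat error-directory, can be sketched in Python as follows. The function names are hypothetical, and zero-padding of the month, day, and hour components is an assumption; the text above does not state how those components are formatted.

```python
import datetime
import posixpath


def archive_path(archive_directory, filename, now=None):
    """Build DIR_PATH/YEAR/MONTH/DAY/HOUR/FILE from the current UTC time."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    # Assumption: MONTH, DAY, and HOUR are zero-padded to two digits.
    return posixpath.join(archive_directory, f"{now:%Y}", f"{now:%m}",
                          f"{now:%d}", f"{now:%H}", filename)


def error_path(error_directory, filename):
    """Problem files are stored directly in the directory: DIR_PATH/FILE."""
    return posixpath.join(error_directory, filename)
```

Because the archive tree is keyed on the time of processing rather than on the flow timestamps, a cleanup job can remove whole hour or day directories by age.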
Local Storage Mode Switches (--output-mode=local-storage)
Once rwflowpack has collected data, categorized it, and written it into files, it can do one of two things with the files:
-
Store the files on the local disk in a well-defined location.
-
Transfer the files to another machine and store them in a well-defined location (see sending mode below).
(The data files must be stored in a well-defined location so that rwfilter(1) can find them. To see rwfilter's idea of the well-defined location, run rwfilter --version.)
The default output-mode is to store the files on the local disk (i.e., local-storage). When operating in this mode, the following switch is required:
- --root-directory=DIR_PATH
-
Name the full path of the directory under which the files containing the packed SiLK Flow records will be stored. rwflowpack will create subdirectories below DIR_PATH based on the data received.
Sending Mode Storage Switches (--output-mode=sending)
To transfer the packed SiLK Flow files to another machine, specify the --output-mode=sending switch and run rwsender(8) to transfer the files. When rwflowpack is used with rwsender, the following two switches must be provided:
- --incremental-directory=DIR_PATH
-
Name the full path of the directory under which packed SiLK files will initially be created. Files in this directory are considered to be incomplete; any files in this directory will be removed when rwflowpack is started. Once complete, files are moved from this directory to the sender-directory.
- --sender-directory=DIR_PATH
-
Name the full path of the directory under which completed incremental files are stored while awaiting action by rwsender. The rwsender is responsible for removing files from this directory.
Help Options
- --verify-sensor-config
- --verify-sensor-config=VERBOSE
-
Verify that the syntax of the sensor configuration file is correct and then exit rwflowpack. If the file is incorrect or if it does not define any sensors, an error message is printed and rwflowpack exits abnormally. If the file is correct and no argument is provided to the --verify-sensor-config switch, rwflowpack simply exits with status 0. If an argument (other than the empty string and 0) is provided to the switch, the names of the probes and sensors found in the sensor configuration file are printed to the standard output, and then rwflowpack exits.
- --help
-
Print the available options and exit.
- --version
-
Print the version number and information about how SiLK was configured, then exit the application.
ENVIRONMENT
- SILK_CONFIG_FILE
-
This environment variable is used as the value for the --site-config-file when that switch is not provided.
- SILK_DATA_ROOTDIR
-
When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwflowpack looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
- SILK_PATH
-
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwflowpack checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share.
- SILK_PLUGIN_DEBUG
-
When set to 1, rwflowpack prints status messages to the standard error as it tries to open the packing logic plug-in.
FILES
The root of the directory tree that contains the packed, binary SiLK Flow files is set by the --root-directory switch; this directory is called the SILK_DATA_ROOTDIR. Immediately underneath it are subdirectories corresponding to the traffic categories (directions) discussed above. Under these are directories representing the year, month, and day in YYYY/MM/DD format. That is
$SILK_DATA_ROOTDIR/in/{$YEAR}/{$MONTH}/{$DAY}/*
$SILK_DATA_ROOTDIR/inweb/{$YEAR}/{$MONTH}/{$DAY}/*
$SILK_DATA_ROOTDIR/innull/{$YEAR}/{$MONTH}/{$DAY}/*
$SILK_DATA_ROOTDIR/out/{$YEAR}/{$MONTH}/{$DAY}/*
$SILK_DATA_ROOTDIR/outweb/{$YEAR}/{$MONTH}/{$DAY}/*
$SILK_DATA_ROOTDIR/outnull/{$YEAR}/{$MONTH}/{$DAY}/*
For example, output web files for October 4th, 2003 are recorded in $SILK_DATA_ROOTDIR/outweb/2003/10/04/
The names of the files in these directories include all of this information, and are written in the form:
flowType-sensorName_YYYYMMDD.HH
where flowType encodes the category and sensorName is the sensor on which the flow was collected.
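The directory layout and file name form above can be combined in a small Python sketch. The function name is hypothetical, and the flowType value "ow" used in the example is a made-up placeholder; the actual flowType strings are determined by the packing logic and are not listed on this page.

```python
import datetime
import posixpath


def data_file_path(root, category, flow_type, sensor, hour_start):
    """Build ROOT/CATEGORY/YYYY/MM/DD/flowType-sensorName_YYYYMMDD.HH."""
    directory = posixpath.join(root, category, f"{hour_start:%Y}",
                               f"{hour_start:%m}", f"{hour_start:%d}")
    filename = f"{flow_type}-{sensor}_{hour_start:%Y%m%d}.{hour_start:%H}"
    return posixpath.join(directory, filename)


# "ow" and "S2" are placeholders for a flowType and a sensor name.
print(data_file_path("/data", "outweb", "ow", "S2",
                     datetime.datetime(2003, 10, 4, 13)))
```

This mirrors the October 4th, 2003 example: the file lands under /data/outweb/2003/10/04/.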
SEE ALSO
SiLK Installation Handbook, sensor.conf(5), silk.conf(5), flowcap(8), rwfilter(1), rwflowappend(8), rwreceiver(8), rwsender(8), silk(7), syslog(3), cron(8)
NOTES
rwflowpack in SiLK-1.0 and later can process files created by flowcap from SiLK-0.11.x or earlier when you properly name the probes in your sensor.conf file. To communicate information about where a flow is collected, flowcap writes information into each file's header, and when rwflowpack processes the file, it uses this information to find entries in the sensor.conf file. Since SiLK-1.0, this information is the name of the probe where the flow was collected. Files created by flowcap prior to the SiLK-1.0 release contain the sensor name and probe name in the file's header. When rwflowpack reads these files, it converts the sensor and probe names to the value sensor_probe and attempts to find that probe in the sensor.conf file. For flowcap-files with no explicit probe name, rwflowpack looks for a probe named sensor.
For example, when the sensor.conf file used by pre-1.0 flowcap contains these entries
sensor-probe S2
probe-type ipfix
probe-name ipfix
protocol tcp
...
sensor-probe S4
probe-type netflow
protocol udp
...
You should create the following entries in the sensor.conf used by rwflowpack:
probe S2_ipfix ipfix
protocol tcp
...
end probe
sensor S2
ipfix-probes S2_ipfix
...
end sensor
probe S4 netflow
protocol udp
...
end probe
sensor S4
netflow-probes S4
...
end sensor
SiLK comes with the update-sensor-conf script that will re-write the sensor.conf file to be compatible with the new format.
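The naming rule described in these notes can be written as a one-line Python function, shown here only to make the mapping explicit. The function name is hypothetical and this is not code taken from SiLK or from update-sensor-conf.

```python
def legacy_probe_name(sensor, probe=None):
    """Probe name rwflowpack looks up for a pre-1.0 flowcap file header."""
    # With an explicit probe name the two are joined as sensor_probe;
    # without one, the sensor name alone is used as the probe name.
    return f"{sensor}_{probe}" if probe else sensor


print(legacy_probe_name("S2", "ipfix"))  # S2_ipfix
print(legacy_probe_name("S4"))           # S4
```

These are the two probe names, S2_ipfix and S4, that appear in the converted sensor.conf entries above.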


