NAME
rwflowpack - Collects flow data and stores it in binary SiLK Flow files
SYNOPSIS
rwflowpack --packing-logic=DYNLIB --sensor-configuration=FILE_PATH
{ --log-destination=DESTINATION
| --log-directory=DIR_PATH [--log-basename=BASENAME]
| --log-pathname=FILE_PATH }
[--byte-order=ENDIAN] [--compression-method=COMP_METHOD]
[--site-config-file=FILENAME] [--flush-timeout=VAL]
[--pack-interfaces] [--no-file-locking] [--log-level=LEVEL]
[--log-sysfacility=NUMBER] [--pidfile=FILE_PATH] [--no-daemon]
[--input-mode=MODE] [--output-mode=MODE]
MODE_SPECIFIC_SWITCHES
To collect NetFlow v5 or IPFIX data from the network (default):
rwflowpack ... [--input-mode=stream] [--sensor-name=SENSOR] ...
To collect from files containing NetFlow v5 PDUs:
rwflowpack ... --input-mode=pdufile --netflow-file=FILE_PATH
[--sensor-name=SENSOR] [--archive-directory=DIR_PATH] ...
To collect from local files containing flows created by flowcap(8):
rwflowpack ... --input-mode=fcfiles --incoming-directory=DIR_PATH
[--polling-interval=NUMBER] [--archive-directory=DIR_PATH] ...
To collect from a remote flowcap process:
rwflowpack ... --input-mode=flowcap [--flowcap-port=NUMBER]
--flowcap-address=IP_ADDR[:NUMBER][,IP_ADDR[:NUMBER]]
--work-directory=DIR_PATH --valid-directory=DIR_PATH
[--archive-directory=DIR_PATH] ...
To store the SiLK Flow files on the local machine (default):
rwflowpack ... [--output-mode=local-storage]
--root-directory=DIR_PATH ...
To forward the SiLK Flow files to a remote machine:
rwflowpack ... --output-mode=sending --sender-directory=DIR_PATH
--incremental-directory=DIR_PATH
...
DESCRIPTION
rwflowpack is a daemon that collects NetFlow v5 or IPFIX (Internet Protocol Flow Information eXport) data, converts the data to the SiLK Flow record format, categorizes each flow (e.g., as incoming or outgoing), and stores the data in binary flat files within a directory tree, with one file per hour-category-sensor tuple.
See the SiLK Installation Handbook for an explanation of how SiLK categorizes flows and converts data to the SiLK format.
OPTIONS
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
General Configuration
The following switch is required:
- --packing-logic=DYNLIB
- Specify the plug-in that rwflowpack should load, where the plug-in provides functions that determine into which class and type each flow record will be categorized and the format of the files that rwflowpack will write. When SiLK has been configured with hard-coded packing logic (i.e., when --enable-packing-logic was specified to the configure script), this switch will not be present on rwflowpack. A default value for this switch may be specified in the silk.conf(5) site configuration file. When DYNLIB contains a slash (/), rwflowpack assumes the path to DYNLIB is correct. Otherwise, rwflowpack will attempt to find the file in $SILK_PATH/lib/silk, $SILK_PATH/share/lib, $SILK_PATH/lib, and in these directories parallel to the application's directory: lib/silk, share/lib, and lib. If rwflowpack does not find the file, it assumes the plug-in is in the current directory. To force rwflowpack to look in the current directory first, specify --packing-logic=./DYNLIB. When the SILK_DYNLIB_DEBUG environment variable is non-empty, rwflowpack prints status messages to the standard error as it tries to open the plug-in.
- --sensor-configuration=FILE_PATH
- Give the path to the configuration file that rwflowpack will consult to determine whether a record represents an incoming or outgoing flow. The complete syntax of the configuration file is described in the sensor.conf(5) manual page; see also the SiLK Installation Handbook.
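The search order described for --packing-logic can be sketched as follows. This is only an illustration, not rwflowpack's actual code: the function name, its parameters, and the plug-in file name in the usage note are hypothetical, and the sketch assumes that "parallel to the application's directory" means siblings of the directory that holds the rwflowpack binary.

```python
import posixpath

# Subdirectories named in the --packing-logic description, in search order.
PLUGIN_SUBDIRS = ("lib/silk", "share/lib", "lib")


def candidate_plugin_paths(dynlib, silk_path=None, app_dir=None):
    """Return the locations that would be tried for DYNLIB, in order.

    silk_path stands in for $SILK_PATH and app_dir for the directory
    holding the rwflowpack binary (both are assumptions of this sketch).
    """
    # A name containing a slash is used exactly as given.
    if "/" in dynlib:
        return [dynlib]

    candidates = []
    if silk_path:
        for sub in PLUGIN_SUBDIRS:
            candidates.append(posixpath.join(silk_path, sub, dynlib))
    if app_dir:
        # Assumed meaning of "parallel": siblings of the binary's directory.
        parent = posixpath.dirname(app_dir.rstrip("/"))
        for sub in PLUGIN_SUBDIRS:
            candidates.append(posixpath.join(parent, sub, dynlib))
    # Last resort: the bare name, i.e. the current directory.
    candidates.append(dynlib)
    return candidates
```

For instance, calling candidate_plugin_paths("packlogic.so", silk_path="/opt/silk", app_dir="/usr/local/sbin") lists /opt/silk/lib/silk/packlogic.so first and the bare name packlogic.so last.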
The following set of switches is optional:
- --byte-order=ENDIAN
- Set the byte order for newly created SiLK Flow files. When appending records to an existing file, the byte order of the file is maintained. The argument is one of the following:
native
- Use the byte order of the machine where rwflowpack is running. This is the default.
big
- Use network byte order (big endian) for the flow files.
little
- Write the flow files in little endian format.
- --compression-method=COMP_METHOD
- Set the compression method for newly created SiLK Flow files to COMP_METHOD. When appending records to an existing file, the compression method of the file is maintained.
-
In addition to the packing (shrinking) of the flow records that SiLK normally does, rwflowpack can use an external library to further reduce the size of the records on disk. The list of available compression methods and the default method are set when SiLK is compiled (the --help and --version switches print the available and default compression methods) and depend on which supported libraries are found. SiLK can support the following:
- none
- Do not compress the SiLK Flow records using an external library.
- zlib
- Use the zlib(3) library for compressing the flow records.
- lzo1x
- Use the lzo1x algorithm from the LZO real-time compression library for compressing the flow records.
- best
- Use whichever available method gives the best compression in general, though not necessarily the best for this particular file.
- --site-config-file=FILENAME
- Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the root of the data directory specified in the --root-directory switch; the directory specified in the SILK_DATA_ROOTDIR environment variable (sending mode only); the data root directory that is compiled into SiLK (sending mode only); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application's directory.
- --flush-timeout=VAL
- Set the timeout for flushing any in-memory records to disk to VAL seconds. If not specified, the default is 10 minutes (600 seconds). When using local storage mode, this value specifies how often the files are flushed to disk to ensure that any records in memory are written to disk. When using sending output mode with a stream input mode, this value specifies how often to move the files from the incremental-directory to the sender-directory.
- --pack-interfaces
- Allow one to override the default file output format of the packed SiLK Flow files that rwflowpack writes. When this switch is present, rwflowpack writes additional information into the packed files: the router's SNMP input and output interfaces and the next-hop IP address. The extra data produced by this switch is useful for determining why traffic is being stored in certain files.
-
Note that this switch will only affect newly created files. New records will always be appended to an existing file in the file's current output format to maintain file integrity.
- --no-file-locking
- Do not use advisory write locks. Normally, rwflowpack will attempt to obtain a write lock on the data files prior to writing records to them; these locks prevent two instances of rwflowpack from writing to the same data file. However, not all file systems support advisory write locks, and this switch must be used when writing data to such a file system.
Logging and Daemon Configuration
One of the following mutually-exclusive switches is required:
- --log-destination=DESTINATION
- Specify the destination where logging messages are written. When DESTINATION begins with a slash (/), it is treated as a file system path and all log messages are written to that file; there is no log rotation. When DESTINATION does not begin with /, it must be one of the following strings:
none
- Messages are not written anywhere.
stdout
- Messages are written to the standard output.
stderr
- Messages are written to the standard error.
syslog
- Messages are written using the syslog(3) facility.
both
- Messages are written to the syslog facility and to the standard error (this option is not available on all platforms).
- --log-directory=DIR_PATH
- Use DIR_PATH as the directory where the log files are written. DIR_PATH must be a complete directory path. The log files have the form
-
DIR_PATH/LOG_BASENAME-YYYYMMDD.log
-
where YYYYMMDD is the current date and LOG_BASENAME is the application name or the value passed to the --log-basename switch when provided. The log files will be rotated: at midnight local time a new log will be opened and the previous day's log file will be compressed using gzip(1). (Old log files are not removed by rwflowpack; the administrator should use another tool to remove them.) When this switch is provided, a process-ID file (PID) will also be written in this directory unless the --pidfile switch is provided.
- --log-pathname=FILE_PATH
- Use FILE_PATH as the complete path to the log file. The log file will not be rotated.
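As a small illustration of the DIR_PATH/LOG_BASENAME-YYYYMMDD.log naming used with --log-directory, the sketch below builds the name of the log file for a given day. The function name and the example directory are hypothetical; the default basename assumes the application name is rwflowpack.

```python
from datetime import date
import posixpath


def daily_log_path(dir_path, basename="rwflowpack", day=None):
    """Build DIR_PATH/LOG_BASENAME-YYYYMMDD.log for the given local date."""
    if day is None:
        day = date.today()
    # date objects accept strftime-style format specifications.
    return posixpath.join(dir_path, f"{basename}-{day:%Y%m%d}.log")
```

With dir_path set to /var/log/silk and day set to date(2003, 10, 4), the result is /var/log/silk/rwflowpack-20031004.log.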
The following set of switches is optional:
- --log-level=LEVEL
- Set the severity of messages that will be logged. The levels from most severe to least are: emerg, alert, crit, err, warning, notice, info, debug. The default is info.
- --log-sysfacility=NUMBER
- Set the facility that syslog(3) uses for logging messages. This switch takes a number as an argument. The default is a value that corresponds to LOG_USER on the system where rwflowpack is running. This switch produces an error unless --log-destination=syslog is specified.
- --log-basename=LOG_BASENAME
- Use LOG_BASENAME in place of the application name for the files in the log directory. See the description of the --log-directory switch.
- --pidfile=FILE_PATH
- Set the complete path to the file in which rwflowpack writes its process ID (PID) when it is running as a daemon. No PID file is written when --no-daemon is given. When this switch is not present, no PID file is written unless the --log-directory switch is specified, in which case the PID is written to DIR_PATH/rwflowpack.pid, where DIR_PATH is the argument to --log-directory.
- --no-daemon
- Force rwflowpack to stay in the foreground---it does not become a daemon. Useful for debugging.
Input and Output Mode
rwflowpack supports multiple ways of getting (the input mode) and storing (the output mode) data.
- --input-mode=MODE
- Determine how rwflowpack will gather data. The default input MODE is stream. The available modes are:
stream
- rwflowpack opens a port for every network-listening probe specified in the sensor configuration file. These ports expect to receive either IPFIX data or NetFlow v5 PDUs as generated by a router or other flow meter.
pdufile
- rwflowpack reads NetFlow v5 PDUs from a file. The file's size must be an integer multiple of 1464, where each 1464 byte chunk contains a 24 byte NetFlow v5 header and space for thirty 48 byte NetFlow records. The number of valid records per chunk is specified in the header.
fcfiles
- rwflowpack polls a local directory for files that were created by the flowcap(8) daemon. Typically flowcap runs on a separate machine near a router or other flow meter that is generating NetFlow v5 or IPFIX records. flowcap collects the records, compresses them, and stores them on its local disk. For the fcfiles input mode, the files are moved between the flowcap and rwflowpack machines by separate programs, typically the rwsender(8) and rwreceiver(8) daemons.
flowcap
- rwflowpack connects over a TCP socket to a machine at a remote location running the flowcap daemon. rwflowpack transfers the SiLK Flow files that flowcap has collected to the local disk for processing.
- --output-mode=MODE
- Determine what rwflowpack will do with the data as it is packed into SiLK binary files. The default output MODE is local-storage. The available modes are:
local-storage
- rwflowpack writes the data on the local machine into a directory tree with a specific structure.
sending
- rwflowpack writes the data into a temporary location on the local disk. A separate program, rwsender(8), moves the data from the local machine to remote machines where rwreceiver(8), working in concert with rwflowappend(8), will write the data into a directory tree with a specific structure. Note that rwflowpack may have been built without support for sending mode.
Stream Collection Switches (--input-mode=stream)
When the --input-mode switch is set to stream or when the switch
is not provided, rwflowpack expects to receive stream(s) of NetFlow
or IPFIX data over the network. rwflowpack will open a port for
every probe specified in the sensor.conf(5) configuration file that
has a listen-on-port attribute.
The following switch is optional:
- --sensor-name=SENSOR
- Cause rwflowpack to ignore all probes in the sensor configuration file except the probes for SENSOR. Only data for this sensor will be collected. This allows a common configuration file to be used by multiple rwflowpack invocations while each rwflowpack instance collects data for only a single sensor. There must be a sensor definition for SENSOR in the configuration file. When this switch is not present, rwflowpack will collect and pack data for all sensors.
PDU File Collection Switches (--input-mode=pdufile)
Instead of reading flows from the network, rwflowpack can process files containing NetFlow data. These files are typically generated by a NetFlow collector.
To make rwflowpack process a file generated by a NetFlow collector,
pass it the --input-mode=pdufile switch. The file must have a
particular format: The file's length should be an integer multiple of
1464 bytes, where 1464 is the maximum length of the NetFlow v5 PDU.
Each 1464-byte block should contain the 24-byte NetFlow v5 header and space
for 30 48-byte flow records, even if data for only one NetFlow record
is valid.
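The file layout described above can be checked with a short script before handing a file to rwflowpack. The sketch below is not part of SiLK; the function name is hypothetical, and it relies on the standard NetFlow v5 header layout (not stated in this page) in which the first two 16-bit big-endian fields are the version and the record count.

```python
import struct

PDU_SIZE = 1464      # 24-byte header + 30 * 48-byte records
MAX_RECORDS = 30


def count_valid_records(data):
    """Return the number of valid flow records in a PDU file's contents.

    Raises ValueError when the data does not match the expected layout.
    """
    if len(data) % PDU_SIZE != 0:
        raise ValueError("length is not a multiple of 1464 bytes")

    total = 0
    for offset in range(0, len(data), PDU_SIZE):
        # Version and count are the first two big-endian 16-bit fields.
        version, count = struct.unpack_from(">HH", data, offset)
        if version != 5:
            raise ValueError(f"block at offset {offset} is not NetFlow v5")
        if count > MAX_RECORDS:
            raise ValueError(f"block at offset {offset} claims {count} records")
        total += count
    return total
```

To use it on a file, read the file in binary mode and pass the bytes to count_valid_records; a ValueError indicates that the file does not have the expected shape.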
Although rwflowpack can get the names of the NetFlow files
from the read-from-file attributes in the sensor configuration
file, it is more common to set read-from-file to /dev/null, and
to pass the name of the NetFlow file on the command line with the
--netflow-file switch. This simplifies scripting; otherwise, the
sensor configuration file would have to be rewritten for each run.
In this mode, rwflowpack will not become a daemon; it will remain in the foreground, process the NetFlow file(s), and exit.
The following switches are all optional:
- --sensor-name=SENSOR
- Cause rwflowpack to ignore all probes in the sensor configuration file except the probes for SENSOR. See above for a full description of this switch.
- --netflow-file=FILE_PATH
- Name the full path of the file from which rwflowpack reads NetFlow v5 PDUs.
- --archive-directory=DIR_PATH
- Name the full path of the directory to which NetFlow files will be moved after they have been processed by rwflowpack. If this switch is not provided, the original NetFlow source files are not modified, moved, or deleted. Removing files from the archive-directory is not the job of rwflowpack; the system administrator should implement a separate cron(8) job to clean this directory.
Flowcap Collection Switches (--input-mode=flowcap)
When the --input-mode=flowcap switch is provided, rwflowpack will process files created by another SiLK daemon called flowcap(8). Typically flowcap runs near a router or other flow meter that is generating NetFlow v5 or IPFIX records. flowcap collects the records, compresses them, and stores them on the local disk. rwflowpack will contact the machine where flowcap is running and transfer the files via TCP to its local disk; rwflowpack then processes the files to generate SiLK Flow files.
rwflowpack from SiLK-1.0 will process files created by flowcap from previous releases of SiLK, but you need to take care in setting up the sensor.conf file. See the NOTES section below and sensor.conf(5).
When operating in flowcap input mode, the first four of the following switches are required:
- --flowcap-address=IP_ADDR[:NUMBER][,IP_ADDR[:NUMBER]] ...
- Specify the host addresses of the flowcap servers. rwflowpack will attempt to contact these machines on the ports given. If no port is specified for an address, it will use the port specified by the --flowcap-port switch.
- --flowcap-port=NUMBER
- Specify the default port on which rwflowpack will attempt to contact the flowcap servers.
- --work-directory=DIR_PATH
- Name the full path of the directory used to store files as they are being received from flowcap. The files in this directory are incomplete; any files in this directory will be removed when rwflowpack is started. Once complete, files are moved from this directory to the valid-directory.
- --valid-directory=DIR_PATH
- Name the full path of the directory used to store files that have been successfully received from flowcap but which have not yet been processed by rwflowpack. Once processed by rwflowpack, files are moved from this directory to the archive-directory, if it has been specified.
- --archive-directory=DIR_PATH
- Name the full path of the directory used to store files after rwflowpack has processed them. If this switch is not provided, the files are deleted. Removing files from the archive-directory is not the job of rwflowpack; the system administrator should implement a separate cron(8) job to clean this directory.
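The way the --flowcap-address list combines with --flowcap-port can be illustrated with the sketch below. It is an assumption-laden illustration rather than rwflowpack's parser: the function name is hypothetical, only IPv4 addresses or host names are handled, and raising an error when no port can be determined is a choice of this sketch, not documented behavior.

```python
def parse_flowcap_addresses(spec, default_port=None):
    """Split IP_ADDR[:NUMBER][,IP_ADDR[:NUMBER]]... into (host, port) pairs.

    default_port plays the role of the --flowcap-port value.
    """
    servers = []
    for item in spec.split(","):
        host, sep, port = item.strip().partition(":")
        if not host:
            raise ValueError(f"missing address in {item!r}")
        if sep:
            # An explicit port on the address wins.
            port_number = int(port)
        elif default_port is not None:
            # Otherwise fall back to the default port.
            port_number = default_port
        else:
            raise ValueError(f"no port given for {host!r}")
        servers.append((host, port_number))
    return servers
```

For example, the specification 10.0.0.1:9000,10.0.0.2 with a default port of 8000 yields the pairs (10.0.0.1, 9000) and (10.0.0.2, 8000).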
Flowcap Files Collection Switches (--input-mode=fcfiles)
When the --input-mode=fcfiles switch is provided, rwflowpack will process files created by another SiLK daemon called flowcap(8). This mode is similar to the flowcap input mode, except rwflowpack does not contact flowcap directly. Instead, the files are transferred between the machines by external programs (typically the rwsender(8) and rwreceiver(8) daemons) and rwflowpack polls a directory for these files. rwflowpack processes the files to generate SiLK Flow files.
rwflowpack from SiLK-1.0 will process files created by flowcap from previous releases of SiLK, but you need to take care in setting up the sensor.conf file. See the NOTES section below and sensor.conf(5).
When operating in flowcap files input mode, the first of the following switches is required:
- --incoming-directory=DIR_PATH
- Name the full path of the directory which rwflowpack will monitor for files created by flowcap. Once processed by rwflowpack, files are moved from this directory to the archive-directory, if it has been specified.
- --polling-interval=NUMBER
- Specify the number of seconds rwflowpack will wait between polls of the incoming-directory for new files created by flowcap. This defaults to 15 seconds.
- --archive-directory=DIR_PATH
- Name the full path of the directory used to store the files after rwflowpack has processed them. If this switch is not provided, the files are deleted. Removing files from the archive-directory is not the job of rwflowpack; the system administrator should implement a separate cron(8) job to clean this directory.
Local Storage Mode Switches (--output-mode=local-storage)
Once rwflowpack has collected data, categorized it, and written it into files, it can do one of two things with the files:
- Store the files on the local disk in a well-defined location.
- Transfer the files to another machine and store them in a well-defined location (see sending mode below).
(The data files must be stored in a well-defined location so that rwfilter(1) can find them. To see rwfilter's idea of the well-defined location, run rwfilter --version.)
The default output-mode is to store the files on the local disk (i.e., local-storage). When operating in this mode, the following switch is required:
- --root-directory=DIR_PATH
- Name the full path of the directory under which the files containing the packed SiLK Flow records will be stored. rwflowpack will create subdirectories below DIR_PATH based on the data received.
Sending Mode Storage Switches (--output-mode=sending)
To transfer the packed SiLK Flow files to another machine, specify the --output-mode=sending switch and run rwsender(8) to transfer the files. When rwflowpack is used with rwsender, the following two switches must be provided:
- --incremental-directory=DIR_PATH
- Name the full path of the directory under which packed SiLK files will initially be created. Files in this directory are considered to be incomplete; any files in this directory will be removed when rwflowpack is started. Once complete, files are moved from this directory to the sender-directory.
- --sender-directory=DIR_PATH
- Name the full path of the directory under which completed incremental files are stored while awaiting action by rwsender. The rwsender is responsible for removing files from this directory.
ENVIRONMENT
- SILK_CONFIG_FILE
- This environment variable is used as the value for the --site-config-file when that switch is not provided.
- SILK_DATA_ROOTDIR
- When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwflowpack looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
- SILK_PATH
- This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwflowpack checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share.
- SILK_DYNLIB_DEBUG
- When set to 1, rwflowpack prints status messages to the standard error as it tries to open the packing logic plug-in.
FILES
The root of the directory tree that contains the packed, binary SiLK Flow files is set by the --root-directory switch; this directory is called the SILK_DATA_ROOTDIR. Immediately underneath it are subdirectories corresponding to the traffic categories (directions) discussed above. Under these are directories representing the year, month, and day in YYYY/MM/DD format. That is
$SILK_DATA_ROOTDIR/in/{$YEAR}/{$MONTH}/{$DAY}/*
$SILK_DATA_ROOTDIR/inweb/{$YEAR}/{$MONTH}/{$DAY}/*
$SILK_DATA_ROOTDIR/innull/{$YEAR}/{$MONTH}/{$DAY}/*
$SILK_DATA_ROOTDIR/out/{$YEAR}/{$MONTH}/{$DAY}/*
$SILK_DATA_ROOTDIR/outweb/{$YEAR}/{$MONTH}/{$DAY}/*
$SILK_DATA_ROOTDIR/outnull/{$YEAR}/{$MONTH}/{$DAY}/*
For example, output web files for October 4th, 2003 are recorded in $SILK_DATA_ROOTDIR/outweb/2003/10/04/
The names of the files in these directories include all of this information, and are written in the form:
flowType-sensorName_YYYYMMDD.HH
where flowType encodes the category and sensorName is the sensor on which the flow was collected.
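Putting the directory layout and the file naming together, the sketch below builds the full path of an hourly data file. The function name and its parameters are hypothetical, and the flow_type string is passed in separately because this page does not spell out how flowType is derived from the category.

```python
from datetime import datetime
import posixpath


def data_file_path(root, category, flow_type, sensor, hour):
    """Build ROOT/category/YYYY/MM/DD/flowType-sensorName_YYYYMMDD.HH."""
    file_name = f"{flow_type}-{sensor}_{hour:%Y%m%d.%H}"
    return posixpath.join(
        root, category, f"{hour:%Y}", f"{hour:%m}", f"{hour:%d}", file_name
    )
```

For the outgoing web example above, a root of /data, the category outweb, and an hour of datetime(2003, 10, 4, 13) place the file under /data/outweb/2003/10/04/.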
SEE ALSO
SiLK Installation Handbook, sensor.conf(5), silk.conf(5), flowcap(8), rwfilter(1), rwflowappend(8), rwreceiver(8), rwsender(8), silk(7), syslog(3), cron(8)
NOTES
rwflowpack in SiLK-1.0 can process files created by flowcap from previous releases of SiLK when you properly name the probes in your sensor.conf file. To communicate information about where a flow is collected, flowcap writes information into each file's header, and when rwflowpack processes the file, it uses this information to find entries in the sensor.conf file. As of SiLK-1.0, this information is the name of the probe where the flow was collected. Files created by flowcap prior to the SiLK-1.0 release contain the sensor name and probe name in the file's header. When rwflowpack reads these files, it converts the sensor and probe names to the value sensor_probe and attempts to find that probe in the sensor.conf file. For flowcap-files with no explicit probe name, rwflowpack looks for a probe named sensor.
For example, when the sensor.conf file used by pre-1.0 flowcap contains these entries
sensor-probe S2
probe-type ipfix
probe-name ipfix
protocol tcp
...
sensor-probe S4
probe-type netflow
protocol udp
...
You should create the following entries in the sensor.conf used by rwflowpack:
probe S2_ipfix ipfix
protocol tcp
...
end probe
sensor S2
ipfix-probes S2_ipfix
...
end sensor
probe S4 netflow
protocol udp
...
end probe
sensor S4
netflow-probes S4
...
end sensor
SiLK-1.0 comes with the update-sensor-conf script that will re-write the sensor.conf file to be compatible with the new format.
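The naming rule described in these notes can be expressed as a short function, which may help when renaming probes by hand. This is an illustrative sketch with a hypothetical function name; it is not the update-sensor-conf script.

```python
def legacy_probe_name(sensor, probe=None):
    """Return the probe name rwflowpack looks up for a pre-1.0 flowcap file.

    With an explicit probe name the result is sensor_probe; without one,
    the sensor name itself is used.
    """
    return f"{sensor}_{probe}" if probe else sensor
```

Applied to the example above, sensor S2 with probe ipfix maps to S2_ipfix, and sensor S4 with no probe name maps to S4.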


