Use of the SiLK system and related source code is subject to the terms of the following licenses:
GNU Public License (GPL) Rights pursuant to Version 2, June 1991
Government Purpose License Rights (GPLR) pursuant to DFARS 252.227.7013
NO WARRANTY
ANY INFORMATION, MATERIALS, SERVICES, INTELLECTUAL PROPERTY OR OTHER PROPERTY OR RIGHTS GRANTED OR PROVIDED BY CARNEGIE MELLON UNIVERSITY PURSUANT TO THIS LICENSE (HEREINAFTER THE "DELIVERABLES") ARE ON AN "AS-IS" BASIS. CARNEGIE MELLON UNIVERSITY MAKES NO WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED AS TO ANY MATTER INCLUDING, BUT NOT LIMITED TO, WARRANTY OF FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, INFORMATIONAL CONTENT, NONINFRINGEMENT, OR ERROR-FREE OPERATION. CARNEGIE MELLON UNIVERSITY SHALL NOT BE LIABLE FOR INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES, SUCH AS LOSS OF PROFITS OR INABILITY TO USE SAID INTELLECTUAL PROPERTY, UNDER THIS LICENSE, REGARDLESS OF WHETHER SUCH PARTY WAS AWARE OF THE POSSIBILITY OF SUCH DAMAGES. LICENSEE AGREES THAT IT WILL NOT MAKE ANY WARRANTY ON BEHALF OF CARNEGIE MELLON UNIVERSITY, EXPRESS OR IMPLIED, TO ANY PERSON CONCERNING THE APPLICATION OF OR THE RESULTS TO BE OBTAINED WITH THE DELIVERABLES UNDER THIS LICENSE.
Licensee hereby agrees to defend, indemnify, and hold harmless Carnegie Mellon University, its trustees, officers, employees, and agents from all claims or demands made against them (and any related losses, expenses, or attorney’s fees) arising out of, or relating to Licensee’s and/or its sub licensees’ negligent use or willful misuse of or negligent conduct or willful misconduct regarding the Software, facilities, or other rights or assistance granted by Carnegie Mellon University under this License, including, but not limited to, any claims of product liability, personal injury, death, damage to property, or violation of any laws or regulations.
Carnegie Mellon University Software Engineering Institute authored documents are sponsored by the U.S. Department of Defense under Contract F19628-00-C-0003. Carnegie Mellon University retains copyrights in all material produced under this contract. The U.S. Government retains a non-exclusive, royalty-free license to publish or reproduce these documents, or allow others to do so, for U.S. Government purposes only pursuant to the copyright license under the contract clause at 252.227.7013.
The SiLK Reference Guide contains the manual page for each analysis tool, utility, plug-in, file format, and collection facility in the SiLK Collection and Analysis Suite.
This document is meant for reference only. The SiLK Analysis Handbook provides both a tutorial for learning about the tools and examples of how they can be used in analyzing flow data. See the SiLK Installation Handbook for instructions on installing SiLK at your site.
This reference guide is divided into sections like the traditional UNIX manual: end-user analysis tools and utilities are described in Section 1; the plug-ins that augment the behavior of some tools are presented in Section 3; Section 5 contains information about file formats; miscellaneous information is in Section 7; and commands for the installer and administrator of SiLK appear in Section 8.
This section provides the manual page for each analysis tool and utility that the users of SiLK may employ in their day-to-day work.
mapsid - Map sensor name to sensor number or vice versa
mapsid [--site-config-file=FILENAME] [--print-classes]
      [{ <sensor-name> | <sensor-number> } ...]
mapsid --help
mapsid --version
mapsid is a utility that maps sensor names to sensor numbers or vice versa depending on the input arguments. When no arguments are given, the mapping of all sensor numbers to names is printed. When a numeric argument is given, the number to name mapping is printed for the specified argument. When a name is given, its numeric id is printed. For convenience when typing in sensor names, the case is irrelevant.
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
--print-classes
For each sensor, print the classes for which the sensor collects data.
--site-config-file=FILENAME
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK (use the --version switch to view this value); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application’s directory.
--help
Print the available options and exit.
--version
Print the version number and information about how SiLK was configured, then exit the application.
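The search order described for --site-config-file can be sketched in Python. This is an illustration only, not SiLK's implementation: the function name and its parameters are hypothetical, and reading "parallel to the application's directory" as "siblings of the directory that holds the executable" is an assumption.

```python
from pathlib import PurePosixPath


def candidate_config_paths(switch_value, env, compiled_root, app_path):
    """List candidate site configuration files, most preferred first.

    All four parameters are hypothetical: the --site-config-file value
    (or None), a dict of environment variables, the data root directory
    compiled into SiLK, and the path of the running application.
    """
    if switch_value:
        return [PurePosixPath(switch_value)]
    if env.get("SILK_CONFIG_FILE"):
        # The variable names the file itself, not a directory.
        return [PurePosixPath(env["SILK_CONFIG_FILE"])]

    directories = []
    if env.get("SILK_DATA_ROOTDIR"):
        directories.append(PurePosixPath(env["SILK_DATA_ROOTDIR"]))
    directories.append(PurePosixPath(compiled_root))
    if env.get("SILK_PATH"):
        silk_path = PurePosixPath(env["SILK_PATH"])
        directories += [silk_path / "share" / "silk", silk_path / "share"]
    # Assumption: "parallel to the application's directory" means
    # siblings of the directory that holds the executable.
    prefix = PurePosixPath(app_path).parent.parent
    directories += [prefix / "share" / "silk", prefix / "share"]
    return [directory / "silk.conf" for directory in directories]
```

A caller would then use the first candidate that exists on disk.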
Name to number mapping:
$ mapsid beta
BETA -> 1
Number to name mapping:
$ mapsid 3
3 -> DELTA
Print all mappings:
$ mapsid
0 -> ALPHA
1 -> BETA
2 -> GAMMA
3 -> DELTA
4 -> EPSLN
5 -> ZETA
....
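The lookup behavior shown in these examples can be sketched as follows. The sensor table is copied from the example output above; the function name and the dictionary layout are hypothetical and do not reflect how mapsid reads the site configuration.

```python
# Hypothetical sensor table, taken from the example output above.
SENSORS = {0: "ALPHA", 1: "BETA", 2: "GAMMA", 3: "DELTA", 4: "EPSLN", 5: "ZETA"}


def map_sensor(argument, sensors=SENSORS):
    """Return one output line for a sensor name or sensor number."""
    if argument.isdigit():
        number = int(argument)
        return f"{number} -> {sensors[number]}"
    # Names are compared without regard to case.
    numbers_by_name = {name.upper(): number for number, name in sensors.items()}
    name = argument.upper()
    return f"{name} -> {numbers_by_name[name]}"
```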
SILK_CONFIG_FILE
This environment variable is used as the value for the --site-config-file switch when that switch is not provided.
SILK_DATA_ROOTDIR
When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, mapsid looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
SILK_PATH
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, mapsid checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share.
rwfilter(1), rwcut(1)
num2dot - Convert an integer IP to dotted-decimal notation
num2dot [--ip-fields=FIELDS] [--delimiter=C]
num2dot --help
num2dot --version
num2dot is a filter that speeds up the sorting of IP numbers while still producing both a natural order (for example, 29.23.1.1 appears before 192.168.1.1) and readable output (dotted decimal rather than an integer representation of the IP number).
It is designed specifically to deal with the output of rwcut(1). Its job is to read stdin and convert specified fields (default field 1) separated by a delimiter (default '|') from an integer number into a dotted decimal IP address. Up to three IP fields can be specified via the --ip-fields=FIELDS option. The --delimiter option can be used to specify an alternate delimiter.
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
--ip-fields=FIELDS
Column numbers of the input that should be treated as IP numbers. Column numbers start from 1. If not specified, the default is 1.
--delimiter=C
The character that separates the columns of the input. Default is '|'.
--help
Print the available options and exit.
--version
Print the version number and information about how SiLK was configured, then exit the application.
Suppose that, in addition to the default fields 1-12 produced by rwcut, you want to prefix each row with an integer form of the destination IP and the start time to make processing by another tool (e.g., a spreadsheet) easier. Within the default rwcut output fields 1-12, however, you want to see dotted-decimal IP addresses.
rwfilter ... --pass=stdout | \
      rwcut --integer-ip --fields=2,9,1-12 --epoch-time | \
      num2dot --ip-field=3,4
The first six columns produced by rwcut will be dIP, sTime, sIP, dIP, sPort, dPort. The --integer-ip switch prints the first, third, and fourth columns as integers, but you only want the first column to be an integer representation. The pipe through num2dot converts the third and fourth columns to dotted-decimal IP numbers.
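The conversion that num2dot applies to each line can be sketched in Python. This is a minimal illustration, not the tool's implementation: the function name is hypothetical, and unlike the real filter this sketch does not preserve the column padding of the converted fields.

```python
import ipaddress


def num2dot_line(line, ip_fields=(1,), delimiter="|"):
    """Convert the 1-based columns in ip_fields from integers to dotted decimal."""
    columns = line.rstrip("\n").split(delimiter)
    for field in ip_fields:
        index = field - 1
        if index < len(columns) and columns[index].strip().isdigit():
            value = int(columns[index])
            columns[index] = str(ipaddress.IPv4Address(value))
    return delimiter.join(columns)


# Sorting on the integer form gives the natural order; sorting the
# dotted-decimal strings directly would put 192.168.1.1 before 29.23.1.1.
rows = ["3232235777|80|", "488046849|25|"]
for row in sorted(rows, key=lambda r: int(r.split("|")[0])):
    print(num2dot_line(row))
```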
rwcut(1)
num2dot has no support for IPv6 addresses.
rwaddrcount - Count activity by IP address
rwaddrcount {--print-recs | --print-ips | --print-stat}
      [--use-dest] [--min-bytes=BYTEMIN] [--max-bytes=BYTEMAX]
      [--min-records=RECMIN] [--max-records=RECMAX]
      [--min-packets=PACKMIN] [--max-packets=PACKMAX]
      [--set-file=PATHNAME] [--sort-ips]
      [{--integer-ips | --zero-pad-ips}]
      [--no-titles] [--no-columns] [--column-separator=CHAR]
      [--no-final-delimiter] [{--delimited | --delimited=CHAR}]
      [--print-filenames] [--copy-input=PATH] [--output-path=PATH]
      [--pager=PAGER_PROG] [--site-config-file=FILENAME]
      [{--legacy-timestamps | --legacy-timestamps=NUM}] [FILES...]
rwaddrcount --help
rwaddrcount --version
rwaddrcount reads SiLK Flow records from files named on the command line or from the standard input, sums the byte, packet, and record counts by individual source or destination IP address, and maintains the time window during which that IP address was active. At the end of the count operation, the results per IP address are displayed when the --print-recs switch is given. rwaddrcount includes facilities for displaying only those IP addresses whose byte, packet, or flow counts are between specified minima and maxima.
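The counting described above can be sketched in Python. This is an illustration under stated assumptions, not rwaddrcount's implementation: flow records are modeled as dictionaries with hypothetical keys (sip, dip, bytes, packets, stime, etime), the names are invented for the sketch, and the record-count bounds are treated as inclusive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AddressCount:
    bytes: int = 0
    packets: int = 0
    records: int = 0
    first_start: Optional[int] = None  # earliest start time seen
    last_end: Optional[int] = None     # latest end time seen


def count_by_address(flows, use_dest=False):
    """Sum bytes, packets, and records per source (or destination) address."""
    counts = {}
    for flow in flows:
        address = flow["dip"] if use_dest else flow["sip"]
        entry = counts.setdefault(address, AddressCount())
        entry.bytes += flow["bytes"]
        entry.packets += flow["packets"]
        entry.records += 1
        if entry.first_start is None or flow["stime"] < entry.first_start:
            entry.first_start = flow["stime"]
        if entry.last_end is None or flow["etime"] > entry.last_end:
            entry.last_end = flow["etime"]
    return counts


def select_by_records(counts, min_records=None, max_records=None):
    """Keep addresses whose record count lies within the (assumed inclusive) bounds."""
    selected = {}
    for address, entry in counts.items():
        if min_records is not None and entry.records < min_records:
            continue
        if max_records is not None and entry.records > max_records:
            continue
        selected[address] = entry
    return selected
```

The same pattern would apply to the byte and packet bounds; only the field being compared changes.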
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
For the application to operate, one of the three --print switches (--print-recs, --print-ips, or --print-stat) must be chosen.
--print-recs
Print out count records: IP address, number of bytes, number of packets, number of flow records, earliest start time, and latest end time.
--print-ips
Print only the IP addresses.
--print-stat
Print the following statistics for all SiLK flows that were read and for those meeting the minima and maxima criteria: byte, packet, and flow record counts and the number of unique IP addresses.
--use-dest
Count by destination IP address in the flow record rather than source IP.
--min-bytes=BYTEMIN
Filtering criterion; for the final output (stats or printing), only include count records where the total number of bytes exceeds BYTEMIN.
--min-packets=PACKMIN
Filtering criterion; for the final output (stats or printing), only include count records where the total number of packets exceeds PACKMIN.
--min-records=RECMIN
Filtering criterion; for the final output (stats or printing), only include count records where the total number of flow records contributing to that count record exceeds RECMIN.
--max-bytes=BYTEMAX
Filtering criterion; for the final output (stats or printing), only include count records where the total number of bytes is less than BYTEMAX.
--max-packets=PACKMAX
Filtering criterion; for the final output (stats or printing), only include count records where the total number of packets is less than PACKMAX.
--max-records=RECMAX
Filtering criterion; for the final output (stats or printing), only include count records to which at most RECMAX flow records contributed.
--set-file=PATHNAME
Write the IPs into the rwset(1)-style binary IP-set file named PATHNAME. Use rwsetcat(1) to see the contents of this file.
--integer-ips
For the --print-recs and --print-ips output formats, print the IPs as integers. By default, IP addresses are printed as dotted decimal.
--zero-pad-ips
For the --print-recs and --print-ips output formats, print IP addresses as dotted decimal, but use three digits per octet by adding zero-padding, e.g., 000.000.000.000.
--sort-ips
For the --print-recs and --print-ips output formats, present the results sorted by IP address.
--no-titles
Turn off column titles. By default, titles are printed.
--no-columns
Disable fixed-width columnar output.
--column-separator=CHAR
Use the specified character between columns and after the final column. When this switch is not specified, the default of '|' is used.
--no-final-delimiter
Do not print the column separator after the final column. Normally a delimiter is printed.
--delimited, --delimited=C
Run as if --no-columns --no-final-delimiter --column-sep=C had been specified. That is, disable fixed-width columnar output; if character C is provided, it is used as the delimiter between columns instead of the default '|'.
--print-filenames
Print to the standard error the names of input files as they are opened.
--copy-input=PATH
Copy all binary input to the specified file or named pipe. PATH can be stdout to print flows to the standard output as long as the --output-path switch has been used to redirect rwaddrcount’s ASCII output.
--output-path=PATH
Determine where the output of rwaddrcount (ASCII text) is written. If this option is not given, output is written to the standard output.
--pager=PAGER_PROG
When output is to a terminal, invoke the program PAGER_PROG to view the output one screenful at a time. This switch overrides the SILK_PAGER environment variable, which in turn overrides the PAGER variable. If the value of the pager is determined to be the empty string, no paging will be performed and all output will be printed to the terminal.
--site-config-file=FILENAME
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK (use the --version switch to view this value); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application’s directory.
--legacy-timestamps, --legacy-timestamps=NUM
Specify the format for human readable timestamps, either the default (new) style, YYYY/MM/DDThh:mm:ss, or the legacy style, MM/DD/YYYY hh:mm:ss. When this switch is not present, the timestamps will be in the default format. When this switch is present and no argument is given, timestamps are in the legacy format. When an argument is supplied, timestamps will be in the new format if the argument begins with 0, and in the old format if the argument begins with 1. Any other argument to the switch is an error.
--help
Print the available options and exit.
--version
Print the version number and information about how SiLK was configured, then exit the application.
The following switches are deprecated.
Deprecated alias for --min-bytes.
Deprecated alias for --min-packets.
Deprecated alias for --min-records.
Deprecated alias for --max-bytes.
Deprecated alias for --max-packets.
Deprecated alias for --max-records.
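The rules for --legacy-timestamps given in the option list can be sketched as follows. The function names are hypothetical, and formatting in UTC is an assumption; the time zone SiLK actually uses depends on how it was configured.

```python
from datetime import datetime, timezone


def use_legacy_format(switch_present, argument=None):
    """Decide between the new and legacy timestamp styles."""
    if not switch_present:
        return False          # default (new) style
    if argument is None:
        return True           # switch given without an argument
    if argument.startswith("0"):
        return False
    if argument.startswith("1"):
        return True
    raise ValueError(f"invalid --legacy-timestamps argument: {argument!r}")


def format_timestamp(epoch_seconds, legacy):
    """Format a time as YYYY/MM/DDThh:mm:ss or MM/DD/YYYY hh:mm:ss (UTC assumed)."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    pattern = "%m/%d/%Y %H:%M:%S" if legacy else "%Y/%m/%dT%H:%M:%S"
    return moment.strftime(pattern)
```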
To print out the set of IPs with exactly one TCP record during the time period, use:
rwfilter --start-date=2003/09/01:00 --end-date=2003/09/01:12 \
      --proto=6 --pass=stdout \
  | rwaddrcount --max-records=1 --print-ips
In general, to print out record information, use rwaddrcount with --print-recs:
rwfilter --start-date=2003/01/17:00 --end-date=2003/01/17:23 \
      --proto=6 --pass=stdout \
  | rwaddrcount --print-rec | head -3
10.10.10.1| 65792| 147| 21| 2003/01/17T00:19:01| 2003/01/17T02:00:13|
10.10.10.2| 110744| 89| 7| 2003/01/17T01:21:42| 2003/01/17T01:39:21|
10.10.10.3| 864| 18| 6| 2003/01/17T00:20:33| 2003/01/17T01:25:38|
SILK_PAGER
When set to a non-empty string, rwaddrcount automatically invokes this program to display its output a screen at a time. If set to an empty string, rwaddrcount does not automatically page its output.
PAGER
When set and SILK_PAGER is not set, rwaddrcount automatically invokes this program to display its output a screen at a time.
SILK_CONFIG_FILE
This environment variable is used as the value for the --site-config-file switch when that switch is not provided.
SILK_DATA_ROOTDIR
When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwaddrcount looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
SILK_PATH
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwaddrcount checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share.
rwfilter(1), rwset(1), rwsetcat(1), rwstats(1), rwtotal(1), rwuniq(1)
When used in an IPv6 environment, rwaddrcount will attempt to convert any IPv6 addresses to IPv4. Records that can be converted will be processed; all other records will be silently ignored.
rwaddrcount uses a fairly large hash table to store data, but as the amount of data grows, the application is likely to take more time to process it.
Similar binning of records is produced by rwstats(1), rwtotal(1), and rwuniq(1).
To generate a list of IP addresses without the volume information, use rwset(1).
rwappend - Append SiLK Flow file(s) to an existing SiLK Flow file
rwappend [--create=[TEMPLATE_FILE]] [--site-config-file=FILENAME]
      [--print-statistics] TARGET_FILE SOURCE_FILE [SOURCE_FILE...]
rwappend --help
rwappend --version
rwappend reads SiLK Flow records from the specified SOURCE_FILEs and appends them to the TARGET_FILE. If stdin is used as the name of one of the SOURCE_FILEs, SiLK flow records will be read from the standard input.
When the TARGET_FILE does not exist and the --create switch is not provided, rwappend will exit with an error. When --create is specified and TARGET_FILE does not exist, rwappend will create the TARGET_FILE using the same format, version, and byte-order as the specified TEMPLATE_FILE. If no TEMPLATE_FILE is given, the TARGET_FILE is created in the default format and version (the same format that rwcat(1) would produce).
The TARGET_FILE must be an actual file; it cannot be a named pipe or the standard output. In addition, the header of TARGET_FILE must not be compressed; that is, you cannot append to a file whose entire contents have been compressed with gzip (those files normally end in the .gz extension).
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
--create, --create=TEMPLATE_FILE
Create the TARGET_FILE if it does not exist. The file will have the same format, version, and byte-order as the TEMPLATE_FILE if it is provided; otherwise the defaults are used. The TEMPLATE_FILE will NOT be appended to TARGET_FILE unless it also appears as the name of a SOURCE_FILE.
--print-statistics
Print to the standard error the number of records read from each SOURCE_FILE and the total number of records appended to the TARGET_FILE.
--site-config-file=FILENAME
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK (use the --version switch to view this value); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application’s directory.
--help
Print the available options and exit.
--version
Print the version number and information about how SiLK was configured, then exit the application.
Standard usage where results.dat exists:
rwappend results.dat sample5.dat sample6.dat
To append files sample*.dat to results.dat, or to create results.dat using the same format as the first file argument (note that sample1.dat must be repeated):
rwappend results.dat --create=sample1.dat \
      sample1.dat sample2.dat
If results.dat does not exist, the following two commands are equivalent:
rwappend --create results.dat sample1.dat sample2.dat
rwcat sample1.dat sample2.dat > results.dat
SILK_CONFIG_FILE
This environment variable is used as the value for the --site-config-file switch when that switch is not provided.
SILK_DATA_ROOTDIR
When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwappend looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
SILK_PATH
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwappend checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share.
rwcat(1)
When used in an IPv6 environment, rwappend will convert IP addresses into the form used by the TARGET_FILE. Any records containing IP addresses that cannot be converted will be silently ignored.
rwappend makes some attempts to avoid appending a file to itself (which would eventually exhaust the disk space) by comparing the names of files it is given; it should be smarter about this.
rwbag - Build a binary Bag from SiLK Flow records.
rwbag [--sip-flows=OUTPUTFILE] [--dip-flows=OUTPUTFILE]
      [--sport-flows=OUTPUTFILE] [--dport-flows=OUTPUTFILE]
      [--proto-flows=OUTPUTFILE] [--sensor-flows=OUTPUTFILE]
      [--input-flows=OUTPUTFILE] [--output-flows=OUTPUTFILE]
      [--nhip-flows=OUTPUTFILE]
      [--sip-packets=OUTPUTFILE] [--dip-packets=OUTPUTFILE]
      [--sport-packets=OUTPUTFILE] [--dport-packets=OUTPUTFILE]
      [--proto-packets=OUTPUTFILE] [--sensor-packets=OUTPUTFILE]
      [--input-packets=OUTPUTFILE] [--output-packets=OUTPUTFILE]
      [--nhip-packets=OUTPUTFILE]
      [--sip-bytes=OUTPUTFILE] [--dip-bytes=OUTPUTFILE]
      [--sport-bytes=OUTPUTFILE] [--dport-bytes=OUTPUTFILE]
      [--proto-bytes=OUTPUTFILE] [--sensor-bytes=OUTPUTFILE]
      [--input-bytes=OUTPUTFILE] [--output-bytes=OUTPUTFILE]
      [--nhip-bytes=OUTPUTFILE]
      [--note-add=TEXT] [--note-file-add=FILE]
      [--compression-method=COMP_METHOD]
      [--print-filenames] [--copy-input=PATH]
      [--site-config-file=FILENAME]
      [INPUTFILE[ INPUTFILE...]]
rwbag --help
rwbag --legacy-help
rwbag --version
rwbag reads SiLK Flow records and builds a Bag. Source IP address, destination IP address, next hop IP address, source port, destination port, protocol, input interface index, output interface index, or sensor ID may be used as the unique key by which to count volumes. Flows, packets, or bytes may be used as the counter. rwbag attempts to read raw flow records from the standard input or from any INPUTFILE arguments. INPUTFILE may also explicitly be the keyword stdin. If the raw flow records do not contain the proper key and counter fields, rwbag prints an error to stderr and exits abnormally.
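The key and counter model described above can be sketched with a Python Counter. This illustrates the idea of a Bag only; it says nothing about the binary Bag file format, and the function name and the dictionary-based flow records are hypothetical.

```python
from collections import Counter


def build_bag(flows, key_field, counter):
    """Accumulate a counter per unique key.

    key_field names the flow field used as the key (for example "sip"
    or "dport"); counter is "flows", "packets", or "bytes".
    """
    bag = Counter()
    for flow in flows:
        # Counting flows adds one per record; otherwise add the field's value.
        amount = 1 if counter == "flows" else flow[counter]
        bag[flow[key_field]] += amount
    return bag
```

In these terms, a switch such as --sip-bytes corresponds to the key "sip" with the counter "bytes".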
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
At least one of the following output flags must be defined. For each, OUTPUTFILE is the name of a non-existent file, a named pipe, or the keyword stdout to write the binary Bag to the standard output. Only one switch may use the standard output as its output stream.
--sip-flows=OUTPUTFILE
Count number of flows by unique source IP.
--sip-packets=OUTPUTFILE
Count number of packets by unique source IP.
--sip-bytes=OUTPUTFILE
Count number of bytes by unique source IP.
--dip-flows=OUTPUTFILE
Count number of flows by unique destination IP.
--dip-packets=OUTPUTFILE
Count number of packets by unique destination IP.
--dip-bytes=OUTPUTFILE
Count number of bytes by unique destination IP.
--sport-flows=OUTPUTFILE
Count number of flows by unique source port.
--sport-packets=OUTPUTFILE
Count number of packets by unique source port.
--sport-bytes=OUTPUTFILE
Count number of bytes by unique source port.
--dport-flows=OUTPUTFILE
Count number of flows by unique destination port.
--dport-packets=OUTPUTFILE
Count number of packets by unique destination port.
--dport-bytes=OUTPUTFILE
Count number of bytes by unique destination port.
--proto-flows=OUTPUTFILE
Count number of flows by unique protocol.
--proto-packets=OUTPUTFILE
Count number of packets by unique protocol.
--proto-bytes=OUTPUTFILE
Count number of bytes by unique protocol.
--sensor-flows=OUTPUTFILE
Count number of flows by unique sensor ID.
--sensor-packets=OUTPUTFILE
Count number of packets by unique sensor ID.
--sensor-bytes=OUTPUTFILE
Count number of bytes by unique sensor ID.
--input-flows=OUTPUTFILE
Count number of flows by unique input interface index.
--input-packets=OUTPUTFILE
Count number of packets by unique input interface index.
--input-bytes=OUTPUTFILE
Count number of bytes by unique input interface index.
--output-flows=OUTPUTFILE
Count number of flows by unique output interface index.
--output-packets=OUTPUTFILE
Count number of packets by unique output interface index.
--output-bytes=OUTPUTFILE
Count number of bytes by unique output interface index.
--nhip-flows=OUTPUTFILE
Count number of flows by unique next hop IP.
--nhip-packets=OUTPUTFILE
Count number of packets by unique next hop IP.
--nhip-bytes=OUTPUTFILE
Count number of bytes by unique next hop IP.
--note-add=TEXT
Add the specified TEXT to the header of every output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.
--note-file-add=FILENAME
Open FILENAME and add the contents of that file to the header of every output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.
--compression-method=COMP_METHOD
Set the compression method of the output to COMP_METHOD. Some SiLK tools can use an external library to compress their binary output. The list of available compression methods and the default method are set when SiLK is compiled (the --help and --version switches print the available and default compression methods) and depend on which supported libraries are found. SiLK can support:
none
Do not compress the output using an external library.
zlib
Use the zlib(3) library for compressing the output.
lzo1x
Use the lzo1x algorithm from the LZO real time compression library for compression.
best
Use whichever available method gives the best compression in general, though not necessarily the best for this particular output.
--print-filenames
Print to the standard error the names of input files as they are opened.
--copy-input=PATH
Copy all binary input to the specified file or named pipe. PATH can be stdout to print flows to the standard output as long as the --output-path switch has been used to redirect rwbag’s ASCII output.
--site-config-file=FILENAME
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK (use the --version switch to view this value); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application’s directory.
--help
Print the available options and exit.
--legacy-help
Print the usage information for rwbag and include the names of the deprecated options in the output, then exit.
--version
Print the version number and information about how SiLK was configured, then exit the application.
The following options are deprecated.
Deprecated alias for --sip-flows.
Deprecated alias for --sip-packets.
Deprecated alias for --sip-bytes.
Deprecated alias for --dip-flows.
Deprecated alias for --dip-packets.
Deprecated alias for --dip-bytes.
Deprecated alias for --sport-flows.
Deprecated alias for --sport-packets.
Deprecated alias for --sport-bytes.
Deprecated alias for --dport-flows.
Deprecated alias for --dport-packets.
Deprecated alias for --dport-bytes.
Deprecated alias for --proto-flows.
Deprecated alias for --proto-packets.
Deprecated alias for --proto-bytes.
To build both source IP and destination IP Bags of flows:
rwfilter ... | rwbag --sip-flow=sf.bag --dip-flow=df.bag
SILK_CONFIG_FILE
This environment variable is used as the value for the --site-config-file switch when that switch is not provided.
SILK_DATA_ROOTDIR
When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwbag looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
SILK_PATH
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwbag checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share.
rwbagbuild(1), rwbagcat(1), rwbagtool(1), rwfileinfo(1), rwfilter(1)
Currently there is no support for Bag files keyed by an IPv6 address.
When used in an IPv6 environment, rwbag will process every record when creating Bags that are not keyed by the IP address. For Bags keyed by the IP address, rwbag will attempt to convert any IPv6 addresses to IPv4. Records that can be converted will be processed; all other records will be silently ignored for the IP-keyed Bags but will still be used for any non-IP-keyed Bags.
rwbagbuild - Create a binary Bag from non-flow data.
rwbagbuild { --set-input=SETFILE | --bag-input=TEXTFILE }
      [--delimiter=C] [--default-count=DEFAULTCOUNT]
      [--note-add=TEXT] [--note-file-add=FILE]
      [--compression-method=COMP_METHOD]
      [--output-path=OUTPUTFILE]
rwbagbuild --help
rwbagbuild --version
rwbagbuild builds a binary Bag file from an IPset file or from textual input.
When creating a Bag from an IPset, the value associated with each IP address is the value given by the --default-count switch, or 1 if the switch is not provided.
The textual input read from the argument to the --bag-input switch is processed a line at a time. Comments begin with a '#' character and continue to the end of the line; they are stripped from each line. Any line that is blank or contains only whitespace is ignored. All other lines must contain a valid key or key-count pair; whitespace around the key and count is ignored.
If the delimiter character (specified by the --delimiter switch and having pipe ('|') as its default) is not present, the line must contain only an IP address or an integer key. If the delimiter is present, the line must contain an IP address or integer key before the delimiter and an integer count after the delimiter. These lines may have a second delimiter after the integer count; the second delimiter and any text to the right of it are ignored.
When the --default-count switch is specified, its value will be used as the count for each key, and the count value parsed from each line, if any, is ignored. Otherwise, the parsed count is used, or 1 is used as the count if no delimiter was present.
For each key-count pair, the key will be inserted into the Bag with its count or, if the key is already present in the Bag, its total count will be incremented by the count from this line.
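The line-handling rules above can be sketched in Python. This is an illustration under stated assumptions, not rwbagbuild's parser: the function names are hypothetical, keys are kept as plain strings rather than being converted to addresses, and a line with a delimiter but no count is rejected even when a default count is supplied.

```python
from collections import Counter


def parse_bag_line(line, delimiter="|", default_count=None):
    """Return (key, count) for one input line, or None for blank/comment lines."""
    text = line.split("#", 1)[0].strip()   # strip comments and whitespace
    if not text:
        return None
    if delimiter not in text:
        key, count = text, 1               # key only: count defaults to 1
    else:
        fields = text.split(delimiter)
        key = fields[0].strip()
        count_text = fields[1].strip()     # anything after a 2nd delimiter is ignored
        if not count_text:
            raise ValueError(f"delimiter present but no count: {line!r}")
        count = int(count_text)
    if default_count is not None:
        count = default_count              # overrides any parsed count
    return key, count


def build_bag_from_text(lines, delimiter="|", default_count=None):
    """Accumulate counts per key; repeated keys have their counts summed."""
    bag = Counter()
    for line in lines:
        parsed = parse_bag_line(line, delimiter, default_count)
        if parsed is not None:
            key, count = parsed
            bag[key] += count
    return bag
```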
The IP address or integer key must be expressed in one of these formats:
Dotted decimal (all four octets are required):
10.1.2.4
An unsigned 32-bit integer:
167838212
Either of the above with a CIDR designation (for dotted decimal, all four octets are still required):
10.1.2.4/31
167838212/31
SiLK wildcard notation: four octets separated by periods, where each octet may be a single number, a range of numbers (e.g., 1-10), a comma-separated list of numbers and ranges, or the character 'x' to represent all values in an octet, that is, 0-255:
10.x.1-2.4,5
If an IP address or count cannot be parsed, or if a line contains a delimiter character but no count, rwbagbuild prints an error and exits.
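The key formats listed above can be expanded into integer keys with a short Python sketch. The function names are hypothetical, and the assumption that a CIDR block or wildcard contributes one key per address it covers is an interpretation of the text, not a statement about rwbagbuild's internals. Note that a short prefix such as /8 expands to millions of keys.

```python
import ipaddress
from itertools import product


def expand_octet(text):
    """Expand one wildcard octet: a number, a range, a comma list, or 'x'."""
    if text == "x":
        return list(range(256))
    values = []
    for part in text.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            values.extend(range(int(low), int(high) + 1))
        else:
            values.append(int(part))
    if any(value > 255 for value in values):
        raise ValueError(f"octet out of range: {text!r}")
    return values


def expand_key(text):
    """Return the list of integer keys covered by one key expression."""
    text = text.strip()
    if "/" in text:
        base, prefix = text.split("/", 1)
        base_value = int(base) if base.isdigit() else int(ipaddress.IPv4Address(base))
        block = f"{ipaddress.IPv4Address(base_value)}/{int(prefix)}"
        # strict=False masks off any host bits in the base address.
        network = ipaddress.ip_network(block, strict=False)
        return [int(address) for address in network]
    if text.isdigit():
        return [int(text)]
    octets = text.split(".")
    if len(octets) != 4:
        raise ValueError(f"all four octets are required: {text!r}")
    return [
        (a << 24) | (b << 16) | (c << 8) | d
        for a, b, c, d in product(*(expand_octet(octet) for octet in octets))
    ]
```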
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
The following two switches control the type of input; one and only one must be provided:
--set-input=SETFILE
Create a Bag from an IPset. SETFILE is a filename, a named pipe, or the keyword stdin. Counts have a volume of 1 unless overridden with --default-count.
--bag-input=TEXTFILE
Create a Bag from a delimited text file. TEXTFILE is a filename, a named pipe, or the keyword stdin. See the DESCRIPTION section for the syntax of the TEXTFILE.
--delimiter=C
The delimiter to expect between each key-count pair of the TEXTFILE read by the --bag-input switch. The delimiter is ignored if the --set-input switch is specified. Since '#' is used to denote comments and newline is used to denote records, neither is a valid delimiter character.
--default-count=DEFAULTCOUNT
Override the counts of all values in the input bag or set with the value of DEFAULTCOUNT. DEFAULTCOUNT must be a positive integer.
--note-add=TEXT
Add the specified TEXT to the header of the output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.
--note-file-add=FILENAME
Open FILENAME and add the contents of that file to the header of the output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.
--compression-method=COMP_METHOD
Set the compression method of the output to COMP_METHOD. Some SiLK tools can use an external library to compress their binary output. The list of available compression methods and the default method are set when SiLK is compiled (the --help and --version switches print the available and default compression methods) and depend on which supported libraries are found. SiLK can support:
none
Do not compress the output using an external library.
zlib
Use the zlib(3) library for compressing the output.
lzo1x
Use the lzo1x algorithm from the LZO real time compression library for compression.
best
Use whichever available method gives the best compression in general, though not necessarily the best for this particular output.
--output-path=OUTPUTFILE
Redirect output to OUTPUTFILE. OUTPUTFILE is a filename, named pipe, or the keyword stdout.
--help
Print the available options and exit.
--version
Print the version number and information about how SiLK was configured, then exit the application.
Assume the file mybag.txt contains the following (ignore leading whitespace and every line ends with a newline):
192.168.0.1|5
192.168.0.2|500
192.168.0.3|3
192.168.0.4|14
192.168.0.5|5
To build a bag with it:
rwbagbuild --bag-input=mybag.txt > mybag.bag
To create a Bag of protocol data from the text file myproto.txt:
1| 4|
6| 138|
17| 131|
use
rwbagbuild --bag-input=myproto.txt > myproto.bag
Given the IP set myset.set, create a bag where every entry in the set has a count of 3:
rwbagbuild --set-input=myset.set --default-count=3 \
      --out=mybag2.bag
rwbag(1), rwbagcat(1), rwbagtool(1), rwfileinfo(1), rwset(1)
Output a binary Bag as text.
rwbagcat [--stats[=OUTFILE]] [--tree-stats[=OUTFILE]]
      [ --network-structure[=STRUCTURE] | --bin-ips[=SCALE] ]
      [--minkey=VALUE] [--maxkey=VALUE] [--mask-set=PATH]
      [--mincounter=VALUE] [--maxcounter=VALUE] [--zero-counts]
      [--integer-keys | --zero-pad-ips] [--output-path=OUTPUTFILE]
      [--no-columns] [--column-separator=C] [--no-final-delimiter]
      [{--delimited | --delimited=C}] [--pager=PAGER_PROG]
      [BAGFILE...]
rwbagcat --help
rwbagcat --version
rwbagcat reads a binary Bag as created by rwbag(1) or rwbagbuild(1), converts it to text, and outputs it to the standard output or the specified file. It can also print various statistics and summary information about the Bag.
rwbagcat reads the BAGFILEs specified on the command line; if no BAGFILE arguments are given, rwbagcat attempts to read the Bag from the standard input. BAGFILE may also explicitly be the keyword stdin or a hyphen (-) to allow rwbagcat to combine files and piped input. If any input does not contain a Bag, rwbagcat prints an error to the standard error and exits abnormally.
When multiple BAGFILEs are specified, each is handled individually; to process the combination of the BAGFILEs, invoke rwbagcat on the output from rwbagtool(1).
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
Print the sum of the counters for each CIDR block of the specified size listed in STRUCTURE. The switch can also, for each CIDR block, print the number of hosts and smaller CIDR blocks that are occupied. STRUCTURE has one of three forms: CIDR_LIST, CIDR_LIST/, or CIDR_LIST/SUMMARY_EXTRAS. CIDR_LIST and SUMMARY_EXTRAS are each a comma separated list of integers from 1 to 32 as well as the following letters:
T: /0 network (the total network; ignored in SUMMARY_EXTRAS)
A: /8 network (legacy class A)
B: /16 network (legacy class B)
C: /24 network (legacy class C)
X: /27 network
H: /32 network (individual host IP addresses)
A comma is not required between adjacent letters. Any combination of integers and the symbols T,A,B,C,X,H may be specified in CIDR_LIST. In addition, if the argument contains the letter S or a slash (/), the output line for a CIDR block will also show the number of hosts and smaller CIDR blocks that are occupied. This list of smaller CIDR blocks to summarize is generated by forming the union of CIDR_LIST and SUMMARY_EXTRAS. By default, SUMMARY_EXTRAS is 8,16,24,27, and this default is used when the argument contains S but no slash. If the argument includes a slash and SUMMARY_EXTRAS is empty, the list of smaller subnets is set exactly to CIDR_LIST. If an argument is provided, the CIDR_LIST must contain at least one element. If no argument is specified to the switch, the default is TS/ABCX. An argument that contains nothing but S and/or slash is illegal. This option disables printing of the individual IPs; specify the H argument to the switch to print the IP addresses and their counters.
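The per-block sums that this switch prints can be illustrated with a minimal Python sketch. It assumes a Bag can be modeled as a dictionary mapping IPv4 address strings to counters; the function name sum_by_prefix and the sample data are hypothetical and are not part of SiLK. It covers only the counter sums, not the host and subnet summary produced by S.

```python
import ipaddress
from collections import defaultdict


def sum_by_prefix(bag, prefix_len):
    """Sum the counters of an {ip_string: counter} dict per CIDR block.

    prefix_len is the block size in bits, e.g. 24 for the C letter
    or 27 for the X letter in STRUCTURE.
    """
    totals = defaultdict(int)
    for ip, counter in bag.items():
        # strict=False masks off the host bits to get the enclosing block
        block = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        totals[str(block)] += counter
    return dict(totals)


# Hypothetical sample data modeled on the rwbagcat examples in this page
bag = {
    "172.23.1.1": 5, "172.23.1.2": 231, "172.23.1.3": 9, "172.23.1.4": 19,
    "192.168.0.100": 1, "192.168.0.101": 1, "192.168.0.160": 15,
    "192.168.20.161": 1, "192.168.20.162": 5, "192.168.20.163": 5,
}

for length in (27, 24, 16, 8, 0):
    print(length, sum_by_prefix(bag, length))
```

Prefix length 0 corresponds to the T (total) entry, so the whole bag collapses into a single block.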
Invert the bag and count the total number of unique IP addresses for a given value of the volume bin. For example, turn a Bag {sip:flow} into {flow:count(sip)}. SCALE is a string containing the value linear, binary, or decimal.
The default behavior is linear: Each distinct counter gets its own bin. Any counter in the input Bag file that is larger than the maximum possible key will be attributed to the maximum key; to prevent this, specify --maxcounter=4294967295.
binary creates a bag of {log2(flow):count(sip)}. Bin n contains counts in the range [ 2^n, 2^(n+1) ).
decimal creates one hundred bins for each counter in the range [1,100), and one hundred bins for each counter in the range [100,1000), each counter in the range [1000,10000), etc. Counters are logarithmically distributed among the bins.
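The linear and binary scales can be sketched in a few lines of Python, assuming a Bag is modeled as a dictionary of key to counter. The function names and sample keys are hypothetical and not part of SiLK. The decimal scale is left out because its exact bin boundaries are not spelled out here.

```python
from collections import Counter

MAX_KEY = 4294967295  # largest key value, per the key range given in this page


def bin_linear(bag):
    """Each distinct counter becomes a key; the value is how many keys had it."""
    # Counters larger than the largest key are attributed to the largest key
    return dict(Counter(min(c, MAX_KEY) for c in bag.values()))


def bin_binary(bag):
    """Bin n counts the keys whose counter lies in [2^n, 2^(n+1))."""
    # For c >= 1, c.bit_length() - 1 equals floor(log2(c))
    return dict(Counter(c.bit_length() - 1 for c in bag.values() if c > 0))


# Hypothetical bag: ten keys with the counters used in the rwbagcat examples
sample = dict(enumerate([5, 231, 9, 19, 1, 1, 15, 1, 5, 5]))

print(bin_linear(sample))
print(bin_binary(sample))
```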
Print out breakdown of the network hosts seen, and print out general statistics about the keys and counters.
count of unique keys
sum of all the counters
minimum key
maximum key
minimum counter
maximum counter
mean of counters
variance of counters
standard deviation of counters
skew of counters
kurtosis of counters
OUTFILE is a filename, named pipe, or one of the keywords stdout or stderr. Defaults to printing on stderr unless output is being paged, in which case output is to stdout.
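As a rough illustration of the first several statistics, the following Python sketch computes them for a list of counters. It assumes the variance uses the sample (n-1) denominator, which is an inference from the figures in the example output later in this page and not a documented formula. Skew and kurtosis are omitted because their definitions are not given here.

```python
import statistics

# Counters from the sample bag used in the rwbagcat examples
counters = [5, 231, 9, 19, 1, 1, 15, 1, 5, 5]

summary = {
    "keys": len(counters),
    "sum of counters": sum(counters),
    "minimum count": min(counters),
    "maximum count": max(counters),
    "mean": statistics.mean(counters),
    # Assumption: sample variance (divide by n - 1)
    "variance": statistics.variance(counters),
    "standard deviation": statistics.stdev(counters),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```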
Print out metadata about how the bag is performing:
count of nodes allocated
total bytes allocated for nodes
count of leaves allocated
total bytes allocated for leaves
count of keys entered
density of data
OUTFILE is a filename, named pipe, or one of the keywords stdout or stderr. Defaults to printing on stdout.
Only output records whose minimum key value is VALUE or higher. The valid range of VALUE is 0 to 4294967295, or 0.0.0.0 to 255.255.255.255. Default is 0 (for port or protocol) or 0.0.0.0 (for IP address). Accepts dotted decimal or integer notation.
Only output records whose maximum key value is VALUE or lower. The valid range of VALUE is 0 to 4294967295, or 0.0.0.0 to 255.255.255.255. Default is all ports or protocols, or the maximum IP address 255.255.255.255. Accepts dotted decimal or integer notation.
Only output records whose key appears in the IPset read from the file PATH. When used with --minkey and/or --maxkey, the key must be in the IPset and within the specified range.
Only output records whose minimum counter value is VALUE or higher. The valid range of VALUE is 1 to 18446744073709551615. The default is to print all records with non-zero counter; use --zero-counts to show records whose counter is 0.
Only output records whose maximum counter value is VALUE or lower. The valid range of VALUE is 1 to 18446744073709551615, with the default being the maximum counter value.
Print keys whose counter is zero. Normally, keys with a counter of zero are suppressed since all keys have a default counter of zero. In order to use this flag, either --mask-set or both --minkey and --maxkey must be specified. When this switch is specified, any counter limit explicitly set by the --maxcounter switch will still be applied.
Redirect output of the --network-structure or --bin-ips options to OUTPUTFILE. OUTPUTFILE is a filename, named pipe, or the keyword stdout.
Pad IP address octets with zeros so that every octet is three characters wide.
Print the keys as integers. This flag should be used if the bag is a port or protocol bag.
Disable fixed-width columnar output.
Use specified character between columns and after the final column. When this switch is not specified, the default of ’|’ is used.
Do not print the column separator after the final column. Normally a delimiter is printed. When the network summary is requested (--network-structure=S), the separator is always printed before the summary column and never after that column.
Run as if --no-columns --no-final-delimiter --column-sep=C had been specified. That is, disable fixed-width columnar output; if character C is provided, it is used as the delimiter between columns instead of the default '|'.
When output is to a terminal, invoke the program PAGER_PROG to view the output one screen full at a time. This switch overrides the SILK_PAGER environment variable, which in turn overrides the PAGER variable. If the value of the pager is determined to be the empty string, no paging will be performed and all output will be printed to the terminal.
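The precedence among the switch and the two environment variables can be sketched as follows. This is an illustration only: resolve_pager is a hypothetical helper, not a SiLK function, and the check for whether output goes to a terminal is outside the sketch. What rwbagcat does when none of the three is set is not stated here; the sketch simply returns None in that case.

```python
def resolve_pager(switch_value=None, environ=None):
    """Return the pager program to run, or None when output is not paged.

    switch_value is the argument to --pager (None if the switch is absent);
    environ is a dict standing in for the process environment.
    """
    environ = {} if environ is None else environ
    if switch_value is not None:
        chosen = switch_value            # --pager overrides everything
    elif "SILK_PAGER" in environ:
        chosen = environ["SILK_PAGER"]   # SILK_PAGER overrides PAGER
    else:
        chosen = environ.get("PAGER", "")
    # An empty string means: do not page
    return chosen or None


print(resolve_pager("less", {"SILK_PAGER": "more"}))
print(resolve_pager(None, {"SILK_PAGER": "", "PAGER": "less"}))
```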
Print the available options and exit.
Print the version number and information about how SiLK was configured, then exit the application.
To print the bag:
$ rwbagcat mybag.bag
    172.23.1.1|   5|
    172.23.1.2| 231|
    172.23.1.3|   9|
    172.23.1.4|  19|
 192.168.0.100|   1|
 192.168.0.101|   1|
 192.168.0.160|  15|
192.168.20.161|   1|
192.168.20.162|   5|
192.168.20.163|   5|
To print it with full network:
$ rwbagcat --network-structure=TABCHX mybag.bag
           172.23.1.1      | 5|
           172.23.1.2      | 231|
           172.23.1.3      | 9|
           172.23.1.4      | 19|
         172.23.1.0/27     | 264|
       172.23.1.0/24       | 264|
     172.23.0.0/16         | 264|
   172.0.0.0/8             | 264|
           192.168.0.100   | 1|
           192.168.0.101   | 1|
         192.168.0.96/27   | 2|
           192.168.0.160   | 15|
         192.168.0.160/27  | 15|
       192.168.0.0/24      | 17|
           192.168.20.161  | 1|
           192.168.20.162  | 5|
           192.168.20.163  | 5|
         192.168.20.160/27 | 11|
       192.168.20.0/24     | 11|
     192.168.0.0/16        | 28|
   192.0.0.0/8             | 28|
 TOTAL                     | 292|
Or an abbreviated network structure by class A and C only, including summary information:
$ rwbagcat --network-structure=ACS mybag.bag
   172.23.1.0/24   | 264| 4 hosts in 1 /27
 172.0.0.0/8       | 264| 4 hosts in 1 /16, 1 /24, and 1 /27
   192.168.0.0/24  | 17| 3 hosts in 2 /27s
   192.168.20.0/24 | 11| 3 hosts in 1 /27
 192.0.0.0/8       | 28| 6 hosts in 1 /16, 2 /24s, and 3 /27s
To bin by number of unique IP addresses by volume:
$ rwbagcat --bin-ips mybag.bag
  1| 3|
  5| 3|
  9| 1|
 15| 1|
 19| 1|
231| 1|
This means there were 3 source hosts in the bag that had a single flow; 3 hosts that had 5 flows; and one host each that had 9, 15, 19, and 231 flows.
For a log2 breakdown of the counts:
$ rwbagcat --bin-ips=binary mybag.bag
2^0 to 2^1-1| 3|
2^2 to 2^3-1| 3|
2^3 to 2^4-1| 2|
2^4 to 2^5-1| 1|
2^7 to 2^8-1| 1|
Statistics:
$ rwbagcat --stats mybag.bag
Statistics
              keys: 10
   sum of counters: 292
       minimum key: 172.23.1.1
       maximum key: 192.168.20.163
     minimum count: 1
     maximum count: 231
              mean: 29.2
          variance: 5064
standard deviation: 71.16
              skew: 2.246
          kurtosis: 8.1
$ rwbagcat --tree-stats mybag.bag
 nodes allocated: 5 (10240 bytes)
leaves allocated: 4 (1024 bytes)
   keys inserted: 10 (10 unique)
 counter density: 7.81%
When set to a non-empty string, rwbagcat automatically invokes this program to display its output a screen at a time. If set to an empty string, rwbagcat does not automatically page its output.
When set and SILK_PAGER is not set, rwbagcat automatically invokes this program to display its output a screen at a time.
rwbag(1), rwbagbuild(1), rwbagtool(1)
Perform high-level operations on binary Bag files
rwbagtool [BAGFILE[,BAGFILE...]]
      { --add | --subtract | --minimize | --maximize | --divide
        | --scalar-multiply=VALUE | --compare={lt | le | eq | ge | gt} }
      [--intersect=SETFILE | --complement-intersect=SETFILE]
      [--mincounter=VALUE] [--maxcounter=VALUE]
      [--minkey=VALUE] [--maxkey=VALUE]
      [--invert] [--coverset] [--output-path=OUTPUTFILE]
      [--note-strip] [--note-add=TEXT] [--note-file-add=FILE]
      [--compression-method=COMP_METHOD]
rwbagtool --help
rwbagtool --version
rwbagtool performs various operations on Bags. It can add Bags together, subtract a subset of data from a Bag, perform key intersection of a Bag with an IP set, extract the key list of a Bag as an IP set, or filter Bag records based on their counter value.
BAGFILE is the name of a file or a named pipe, or the names stdin or - to have rwbagtool read from the standard input. If no Bag file names are given on the command line, rwbagtool attempts to read a Bag from the standard input. If BAGFILE does not contain a Bag, rwbagtool prints an error to stderr and exits abnormally.
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
The first set of options are mutually exclusive; only one may be specified. If none are specified, the counters in the Bag files are summed.
Sum the counters for each key for all Bag files given on the command line. If a key does not exist, it has a counter of zero. If no other operation is specified, the add operation is the default.
Subtract from the first Bag file all subsequent Bag files. If a key does not appear in the first Bag file, rwbagtool assumes it has a value of 0. If any counter subtraction results in a negative number, the key will not appear in the resulting Bag file.
Cause the output to contain the minimum counter seen for each key. Keys that do not appear in all input Bags will not appear in the output.
Cause the output to contain the maximum counter seen for each key. The output will contain each key that appears in any input Bag.
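The four operations above can be sketched in Python with a Bag modeled as a dictionary of key to counter. The function names are hypothetical and this is not how rwbagtool is implemented. One assumption goes beyond the text: a subtraction result of exactly zero is dropped along with negative results, since keys with a zero counter are normally not shown.

```python
def bag_add(*bags):
    """Sum the counters for each key; a missing key counts as zero."""
    out = {}
    for bag in bags:
        for key, counter in bag.items():
            out[key] = out.get(key, 0) + counter
    return out


def bag_subtract(first, *rest):
    """Subtract all subsequent bags from the first."""
    out = dict(first)
    for bag in rest:
        for key, counter in bag.items():
            out[key] = out.get(key, 0) - counter
    # Negative results are dropped; dropping zero as well is an assumption
    return {key: c for key, c in out.items() if c > 0}


def bag_minimize(*bags):
    """Minimum counter per key, only for keys present in every bag."""
    common = set(bags[0]).intersection(*bags[1:])
    return {key: min(bag[key] for bag in bags) for key in common}


def bag_maximize(*bags):
    """Maximum counter per key, for keys present in any bag."""
    keys = set().union(*bags)
    return {key: max(bag.get(key, 0) for bag in bags) for key in keys}


# Sample data matching the example files used later in this page
Bag1 = {3: 10, 4: 7, 6: 14, 7: 23, 8: 2}
Bag2 = {1: 1, 4: 2, 7: 32, 8: 2}
Bag3 = {2: 8, 4: 10, 6: 14, 7: 12, 9: 8}

print(sorted(bag_add(Bag1, Bag2).items()))
print(sorted(bag_subtract(Bag1, Bag2).items()))
```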
Divide the first Bag file by the second Bag file. It is an error if more than two Bag files are specified. Every key in the first Bag file must appear in the second file; the second Bag may have keys that do not appear in the first, and those keys will not appear in the output. Since Bags do not support floating point numbers, the result of the division is rounded to the nearest integer (values ending in .5 are rounded up). If the result of the division is less than 0.5, the key will not appear in the output.
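The rounding rule can be illustrated with integer arithmetic. The sketch below is a hypothetical model and not rwbagtool's code: bag_divide is an invented name, Bags are plain dictionaries, and divisor counters are assumed to be at least 1.

```python
def bag_divide(dividend, divisor):
    """Divide counters key by key, rounding to the nearest integer (halves up)."""
    out = {}
    for key, numerator in dividend.items():
        if key not in divisor:
            # Every key in the first bag must appear in the second
            raise KeyError(f"key {key} not in divisor bag")
        denominator = divisor[key]
        # floor(n/d + 1/2) computed without floating point
        quotient = (2 * numerator + denominator) // (2 * denominator)
        if quotient > 0:   # results below 0.5 round to zero and are dropped
            out[key] = quotient
    return out


# Sample data matching Bag2.bag and Bag4.bag from the examples in this page
Bag2 = {1: 1, 4: 2, 7: 32, 8: 2}
Bag4 = {1: 1, 4: 3, 6: 4, 7: 4, 8: 6}

print(bag_divide(Bag2, Bag4))
```

Here 2/3 rounds up to 1, while 2/6 is below 0.5 and key 8 disappears from the result.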
Multiply each counter in the Bag file by the scalar VALUE, where VALUE is an integer in the range 1 to 18446744073709551615. This switch accepts a single Bag as input.
Compare the key/counter pairs in exactly two Bag files. It is an error if more than two Bag files are specified. The keys in the output Bag will only be those whose counter in the first Bag is OPERATION the counter in the second Bag. The counters for all keys in the output will be 1. Any key that does not appear in both input Bag files will not appear in the result. The possible OPERATION values are the strings:
GetCounter(Bag1, key) < GetCounter(Bag2, key)
GetCounter(Bag1, key) <= GetCounter(Bag2, key)
GetCounter(Bag1, key) == GetCounter(Bag2, key)
GetCounter(Bag1, key) >= GetCounter(Bag2, key)
GetCounter(Bag1, key) > GetCounter(Bag2, key)
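A minimal Python sketch of the comparison, again treating Bags as dictionaries. The name bag_compare is hypothetical; the operation strings are the ones listed above.

```python
import operator

OPERATIONS = {
    "lt": operator.lt,
    "le": operator.le,
    "eq": operator.eq,
    "ge": operator.ge,
    "gt": operator.gt,
}


def bag_compare(op_name, bag1, bag2):
    """Keep keys present in both bags whose counters satisfy the comparison.

    Every counter in the result is 1.
    """
    test = OPERATIONS[op_name]
    return {key: 1 for key in bag1
            if key in bag2 and test(bag1[key], bag2[key])}


# Sample data matching Bag1.bag and Bag2.bag from the examples in this page
Bag1 = {3: 10, 4: 7, 6: 14, 7: 23, 8: 2}
Bag2 = {1: 1, 4: 2, 7: 32, 8: 2}

for name in OPERATIONS:
    print(name, bag_compare(name, Bag1, Bag2))
```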
The result of the above operation is an intermediate Bag file. The following switches are applied next to remove entries from the intermediate Bag:
Mask the keys in the intermediate Bag using the set in SETFILE. SETFILE is the name of a file or a named pipe containing an IPset, or the name stdin or - to have rwbagtool read the IPset from the standard input. If SETFILE does not contain an IPset, rwbagtool prints an error to stderr and exits abnormally. Only key/counter pairs where the key matches an entry in SETFILE are written to the output.
As --intersect, but only writes key/counter pairs for keys which do not match an entry in SETFILE.
Cause the output to contain only those records whose counter value is VALUE or higher. The allowable range is 1 to the maximum counter value; the default is 1.
Cause the output to contain only those records whose counter value is VALUE or lower. The allowable range is 1 to the maximum counter value; the default is the maximum counter value.
Cause the output to contain only those records whose key value is VALUE or higher. Default is 0 (or 0.0.0.0). Accepts input as an integer or as an IP address in dotted decimal notation.
Cause the output to contain only those records whose key value is VALUE or lower. Default is 4294967295 (or 255.255.255.255). Accepts input as an integer or as an IP address in dotted decimal notation.
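The masking and limit switches can be pictured as one filtering pass over the intermediate Bag. The sketch below is an illustration under stated assumptions: filter_bag and its parameter names are hypothetical, the Bag is a dictionary with integer keys, the IPset is a Python set, and the key and counter limits are treated as inclusive ranges.

```python
MAX_KEY = 4294967295
MAX_COUNTER = 18446744073709551615


def filter_bag(bag, intersect=None, complement_intersect=None,
               mincounter=1, maxcounter=MAX_COUNTER,
               minkey=0, maxkey=MAX_KEY):
    """Drop entries that fall outside the mask or the key/counter limits."""
    out = {}
    for key, counter in bag.items():
        if intersect is not None and key not in intersect:
            continue
        if complement_intersect is not None and key in complement_intersect:
            continue
        if not minkey <= key <= maxkey:
            continue
        if not mincounter <= counter <= maxcounter:
            continue
        out[key] = counter
    return out


# Sample data matching Bag1.bag and Mask.set from the examples in this page
Bag1 = {3: 10, 4: 7, 6: 14, 7: 23, 8: 2}
Mask = {2, 4, 6, 8}

print(filter_bag(Bag1, intersect=Mask))
print(filter_bag(Bag1, complement_intersect=Mask))
```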
The following switches control the output.
Generate a new Bag whose keys are the counters in the intermediate Bag and whose counter is the number of times the counter was seen. For example, this turns the Bag {sip:flow} into the Bag {flow:count(sip)}. Any counter in the intermediate Bag that is larger than the maximum possible key will be attributed to the maximum key; to prevent this, specify --maxcounter=4294967295.
Instead of creating a Bag file as the output, write an IPset which contains the keys contained in the intermediate Bag.
Redirect output to OUTPUTFILE. OUTPUTFILE is the name of a file or a named pipe, or the name stdout or - to write the result to the standard output.
Do not copy the notes (annotations) from the input files to the output file. Normally notes from the input files are copied to the output.
Add the specified TEXT to the header of the output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.
Open FILENAME and add the contents of that file to the header of the output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.
Set the compression method of the output to COMP_METHOD. Some SiLK tools can use an external library to compress their binary output. The list of available compression methods and the default method are set when SiLK is compiled (the --help and --version switches print the available and default compression methods) and depend on which supported libraries are found. SiLK can support:
Do not compress the output using an external library
Use the zlib(3) library for compressing the output
Use the lzo1x algorithm from the LZO real time compression library for compression
Use whichever available method gives the best compression in general, though not necessarily the best for this particular output.
Print the available options and exit.
Print the version number and information about how SiLK was configured, then exit the application.
The examples assume the following contents for the files:
Bag1.bag    Bag2.bag    Bag3.bag    Bag4.bag    Mask.set
 3| 10|      1|  1|      2|  8|      1|  1|      2
 4|  7|      4|  2|      4| 10|      4|  3|      4
 6| 14|      7| 32|      6| 14|      6|  4|      6
 7| 23|      8|  2|      7| 12|      7|  4|      8
 8|  2|                  9|  8|      8|  6|
$ rwbagtool --add Bag1.bag Bag2.bag > Bag-sum.bag
$ rwbagcat --integer-keys Bag-sum.bag
 1|   1|
 3|  10|
 4|   9|
 6|  14|
 7|  55|
 8|   4|
$ rwbagtool --add Bag1.bag Bag2.bag Bag3.bag > Bag-sum2.bag
$ rwbagcat --integer-keys Bag-sum2.bag
 1|   1|
 2|   8|
 3|  10|
 4|  19|
 6|  28|
 7|  67|
 8|   4|
 9|   8|
$ rwbagtool --sub Bag1.bag Bag2.bag > Bag-diff.bag
$ rwbagcat --integer-keys Bag-diff.bag
 3|  10|
 4|   5|
 6|  14|
$ rwbagtool --sub Bag2.bag Bag1.bag > Bag-diff2.bag
$ rwbagcat --integer-keys Bag-diff2.bag
 1|   1|
 7|   9|
$ rwbagtool --minimize Bag1.bag Bag2.bag Bag3.bag > Bag-min.bag
$ rwbagcat --integer-keys Bag-min.bag
 4|   2|
 7|  12|
$ rwbagtool --maximize Bag1.bag Bag2.bag Bag3.bag > Bag-max.bag
$ rwbagcat --integer-keys Bag-max.bag
 1|   1|
 2|   8|
 3|  10|
 4|  10|
 6|  14|
 7|  32|
 8|   2|
 9|   8|
$ rwbagtool --divide Bag2.bag Bag4.bag > Big-div1.bag
$ rwbagcat --integer-keys Big-div1.bag
 1|   1|
 4|   1|
 7|   8|
$ rwbagtool --divide Bag4.bag Bag2.bag > Big-div2.bag
rwbagtool: Error dividing bags; key 6 not in divisor bag
$ rwbagtool --scalar-multiply=7 Bag1.bag > Bag-multiply.bag
$ rwbagcat --integer-keys Bag-multiply.bag
 3|  70|
 4|  49|
 6|  98|
 7| 161|
 8|  14|
$ rwbagtool --compare=lt Bag1.bag Bag2.bag > Bag-lt.bag
$ rwbagcat --integer-keys Bag-lt.bag
 7|   1|
$ rwbagtool --compare=le Bag1.bag Bag2.bag > Bag-le.bag
$ rwbagcat --integer-keys Bag-le.bag
 7|   1|
 8|   1|
$ rwbagtool --compare=eq Bag1.bag Bag2.bag > Bag-eq.bag
$ rwbagcat --integer-keys Bag-eq.bag
 8|   1|
$ rwbagtool --compare=ge Bag1.bag Bag2.bag > Bag-ge.bag
$ rwbagcat --integer-keys Bag-ge.bag
 4|   1|
 8|   1|
$ rwbagtool --compare=gt Bag1.bag Bag2.bag > Bag-gt.bag
$ rwbagcat --integer-keys Bag-gt.bag
 4|   1|
$ rwbagtool --coverset Bag1.bag Bag2.bag Bag3.bag > Cover.set
$ rwsetcat --integer-ips Cover.set
1
2
3
4
6
7
8
9
$ rwbagtool --invert Bag1.bag > Bag-inv1.bag
$ rwbagcat --integer-keys Bag-inv1.bag
 2|   1|
 7|   1|
10|   1|
14|   1|
23|   1|
$ rwbagtool --invert Bag2.bag > Bag-inv2.bag
$ rwbagcat --integer-keys Bag-inv2.bag
 1|   1|
 2|   2|
32|   1|
$ rwbagtool --invert Bag3.bag > Bag-inv3.bag
$ rwbagcat --integer-keys Bag-inv3.bag
 8|   2|
10|   1|
12|   1|
14|   1|
$ rwbagtool --intersect=Mask.set Bag1.bag > Bag-mask.bag
$ rwbagcat --integer-keys Bag-mask.bag
 4|   7|
 6|  14|
 8|   2|
$ rwbagtool --complement-intersect=Mask.set Bag1.bag > Bag-mask2.bag
$ rwbagcat --integer-keys Bag-mask2.bag
 3|  10|
 7|  23|
$ rwbagtool --add --maxkey=5 Bag1.bag Bag2.bag > Bag-res1.bag
$ rwbagcat --integer-keys Bag-res1.bag
 1|   1|
 3|  10|
 4|   9|
$ rwbagtool --minkey=3 --maxkey=6 Bag1.bag Bag2.bag > Bag-res2.bag
$ rwbagcat --integer-keys Bag-res2.bag
 3|  10|
 4|   9|
 6|  14|
$ rwbagtool --mincounter=20 Bag1.bag Bag2.bag > Bag-res3.bag
$ rwbagcat --integer-keys Bag-res3.bag
 7|  55|
$ rwbagtool --sub --maxcounter=9 Bag1.bag Bag2.bag > Bag-res4.bag
$ rwbagcat --integer-keys Bag-res4.bag
 4|   5|
rwbag(1), rwbagbuild(1), rwbagcat(1), rwfileinfo(1), rwset(1), rwsetcat(1)
Concatenate SiLK Flow files into single stream
rwcat [--output-path=FILE] [--note-add=TEXT] [--note-file-add=FILE]
      [--print-filenames] [--byte-order={big | little | native}]
      [--ipv4-output] [--compression-method=COMP_METHOD]
      [--site-config-file=FILENAME]
      {[--xargs] | [--xargs=FILENAME] | [ input-files ... ]}
rwcat --help
rwcat --version
rwcat reads SiLK Flow records from the specified input files and writes the records in the standard binary SiLK format to the specified output-path; rwcat will write the records to the standard output when stdout is not the terminal and --output-path is not provided.
When the --xargs switch is provided, rwcat will read the names of the files to process from the named text file, or from the standard input if no file name argument is provided to the switch. The input should contain one filename per line.
If the input file names end in .gz, they will be uncompressed as they are read. When stdin is provided as an input file name, rwcat will read records from the standard input.
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
Write the SiLK Flow records to FILE, which must not exist. If the switch is not provided or if FILE is stdout, flows are written to the standard output. If the name ends in .gz, the output will be compressed using gzip(1).
Add the specified TEXT to the header of the output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.
Open FILENAME and add the contents of that file to the header of the output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.
Set the byte order for the output SiLK Flow records. The argument is one of the following:
Use the byte order of the machine where rwcat is running. This is the default.
Use network byte order (big endian) for the output.
Write the output in little endian format.
Force the output to contain only IPv4 addresses. When this switch is specified, IPv6 addresses are ignored unless the IPv6 address is an encapsulation of an IPv4 address, in which case the IPv4 address will be written to the output. By default, rwcat writes IP addresses in the same format as the input file. When SiLK has not been compiled with IPv6 support, this switch has no effect.
Set the compression method of the output to COMP_METHOD. Some SiLK tools can use an external library to compress their binary output. The list of available compression methods and the default method are set when SiLK is compiled (the --help and --version switches print the available and default compression methods) and depend on which supported libraries are found. SiLK can support:
Do not compress the output using an external library
Use the zlib(3) library for compressing the output
Use the lzo1x algorithm from the LZO real time compression library for compression
Use whichever available method gives the best compression in general, though not necessarily the best for this particular output.
Print the names of input files and the number of records each file contains as the files are read.
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK (use the --version switch to view this value); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application's directory.
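The search order described above can be written out as a short Python sketch that lists the candidate paths in order. This is an interpretation and not SiLK's code: silk_conf_candidates is a hypothetical helper, the default values for compiled_data_root and app_dir are placeholders, and "parallel to the application's directory" is read here as sibling directories of the directory holding the executable.

```python
import posixpath


def silk_conf_candidates(switch_file=None, environ=None,
                         compiled_data_root="/data",
                         app_dir="/usr/local/bin"):
    """Return the site configuration paths to try, in order."""
    environ = {} if environ is None else environ
    if switch_file:
        return [switch_file]                   # --site-config-file wins
    if environ.get("SILK_CONFIG_FILE"):
        return [environ["SILK_CONFIG_FILE"]]   # value includes the file name
    dirs = []
    if environ.get("SILK_DATA_ROOTDIR"):
        dirs.append(environ["SILK_DATA_ROOTDIR"])
    dirs.append(compiled_data_root)            # placeholder for the built-in root
    if environ.get("SILK_PATH"):
        dirs.append(posixpath.join(environ["SILK_PATH"], "share", "silk"))
        dirs.append(posixpath.join(environ["SILK_PATH"], "share"))
    parent = posixpath.dirname(app_dir.rstrip("/"))
    dirs.append(posixpath.join(parent, "share", "silk"))
    dirs.append(posixpath.join(parent, "share"))
    return [posixpath.join(d, "silk.conf") for d in dirs]


env = {"SILK_DATA_ROOTDIR": "/flows", "SILK_PATH": "/opt/silk"}
for path in silk_conf_candidates(environ=env, app_dir="/opt/silk/bin"):
    print(path)
```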
Causes rwcat to read file names from FILENAME or from the standard input if FILENAME is not provided. The input should have one file name per line. rwcat will open each file in turn and read records from it, as if the files had been listed on the command line.
Print the available options and exit.
Print the version number and information about how SiLK was configured, then exit the application.
To combine the results of several rwfilter runs (stored in the files run1.rwf, run2.rwf, ... runN.rwf), you can use:
rwcat --output=combined.dat *.rwf
If the shell complains about too many arguments, you can use the UNIX find(1) function and pipe its output to rwcat:
find . -name '*.rwf' -print | \
      rwcat --xargs --output=combined.dat
This environment variable is used as the value for the --site-config-file when that switch is not provided.
When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwcat looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwcat checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share.
rwfilter(1), gzip(1), find(1)
Although rwcat will read from the standard input, this feature should be used with caution. rwcat will treat the standard input as a single file, as it has no way to know when one file ends and the next begins. The following will not work:
cat run1.rwf run2.rwf | rwcat --output=combined.dat # WRONG!
The header of run2.rwf will be treated as data of run1.rwf, resulting in corrupt output.
Compare the records in two SiLK Flow files
rwcompare [--quiet] FILE1 FILE2
rwcompare --help
rwcompare --version
rwcompare opens the two files named on the command line and compares the SiLK Flow records they contain. If the records are identical, rwcompare exits with status 0. If any of the records differ, rwcompare prints a message and exits with status 1. If there is an issue reading either file, an error is printed and the exit status is 2. Use the --quiet switch to suppress all output (error messages included). You may use - or stdin for one of the file names, in which case rwcompare reads from the standard input.
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
Do not print a message if the files differ, and do not print an error message if a file cannot be opened or read.
Print the available options and exit.
Print the version number and information about how SiLK was configured, then exit the application.
rwfileinfo(1)
Print traffic summary across time
rwcount [--bin-size=SIZE] [--load-scheme=LOADSTYLE]
      [--start-epoch=START_TIME] [--end-epoch=END_TIME]
      [--epoch-slots] [--bin-slots] [--skip-zeroes]
      [--no-titles] [--no-columns] [--column-separator=CHAR]
      [--no-final-delimiter] [{--delimited | --delimited=CHAR}]
      [--print-filenames] [--copy-input=PATH] [--output-path=PATH]
      [--pager=PAGER_PROG] [--site-config-file=FILENAME]
      [{--legacy-timestamps | --legacy-timestamps=NUM}]
      [FILES...]
rwcount --help
rwcount --version
rwcount summarizes SiLK flow records across time. It counts the records in the input stream, and groups their byte and packet totals into time bins. rwcount produces textual output with a row for each bin.
When input files are not specified on the command line, rwcount will read records from the standard input.
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
Denote the size of each time bin, in seconds; defaults to 30 seconds. rwcount supports millisecond size bins; SIZE may be a floating point value equal to or greater than 0.001.
Determine how the duration of each flow is mapped onto the time bins. LOADSTYLE can be one of the following:
Assume the traffic is evenly distributed across the bins that contain any part of the flow’s duration. For a flow whose duration spans five bins, each bin’s packet- and byte-counts will be incremented with 1/5 of the values for the entire flow.
The traffic is NOT evenly distributed across the flow’s duration, since, when using a bin-size of 30 seconds, a particularly placed 32 second flow will span three bins, and each bin will receive 1/3 of the flow. Compare with option 4.
Assume all of the traffic occurs in the initial millisecond of the flow’s duration. For a flow whose duration spans five bins, the first bin’s packet- and byte-counts will be incremented with the values for the entire flow.
Assume all of the traffic occurs in the last millisecond of the flow’s duration. For a flow whose duration spans five bins, the fifth bin’s packet- and byte-counts will be incremented with the values for the entire flow.
Assume all of the traffic occurs in the middle millisecond of the flow’s duration. For a flow whose duration spans five bins, the third bin’s packet- and byte-counts will be incremented with the values for the entire flow.
Assume the traffic is evenly distributed during each millisecond that the flow is active. For a flow whose duration spans five bins, each bin will receive a portion of the flow-, packet-, and byte-counts weighted by the amount of time the flow spent in each bin.
When using 30 second bins, a particularly placed 32 second flow will add 1/32 of its value to the first and last bins, and 30/32 to the middle bin.
The default LOADSTYLE is 4.
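The five ways of mapping a flow onto bins can be compared with a small Python sketch. It is an illustration of the descriptions above and not rwcount's implementation: bin_weights and the scheme names ("even", "start", "end", "middle", "time-proportional") are hypothetical labels, times are integer milliseconds, and a zero-length flow is assumed to occupy one millisecond.

```python
from fractions import Fraction


def bin_weights(start_ms, duration_ms, bin_size_ms, scheme):
    """Return {bin_index: fraction of the flow's volume} for one flow."""
    duration_ms = max(duration_ms, 1)        # assumption for zero-length flows
    end_ms = start_ms + duration_ms          # exclusive end of the flow
    first_bin = start_ms // bin_size_ms
    last_bin = (end_ms - 1) // bin_size_ms   # bin holding the last millisecond
    if scheme == "start":
        return {first_bin: Fraction(1)}
    if scheme == "end":
        return {last_bin: Fraction(1)}
    if scheme == "middle":
        return {(start_ms + duration_ms // 2) // bin_size_ms: Fraction(1)}
    bins = range(first_bin, last_bin + 1)
    if scheme == "even":
        # Same share for every bin the flow touches
        return {b: Fraction(1, len(bins)) for b in bins}
    if scheme == "time-proportional":
        weights = {}
        for b in bins:
            lo = max(start_ms, b * bin_size_ms)
            hi = min(end_ms, (b + 1) * bin_size_ms)
            weights[b] = Fraction(hi - lo, duration_ms)
        return weights
    raise ValueError(f"unknown scheme: {scheme}")


# A 32 second flow starting 29 seconds into the first 30 second bin
for scheme in ("even", "start", "end", "middle", "time-proportional"):
    print(scheme, bin_weights(29_000, 32_000, 30_000, scheme))
```

With this placement the flow touches three bins: the even split gives each bin 1/3, while the time-proportional split gives 1/32, 30/32, and 1/32.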
Denote the time to use for the first bin. START_TIME may be in UNIX epoch seconds or in yyyy/mm/dd:HH[:MM[:SS[.sss]]] format.
Denote the time to use for the final bin. END_TIME may be in UNIX epoch seconds or in yyyy/mm/dd:HH[:MM[:SS[.sss]]] format. When neither START_TIME nor END_TIME is specified to the millisecond, the ceiling of END_TIME is used. END_TIME will be adjusted so that the number of bins is an integer value. When both START_TIME and END_TIME are used, rwcount will allocate bins for the entire time span before it begins processing data, or exit abnormally if it cannot allocate the required memory.
Use the UNIX epoch time as the label for each bin in the output; the default is to label each bin with the time in a human-readable format.
Use the internal bin index as the label for each bin in the output; the default is to label each bin with the time in a human-readable format.
Disable printing of bins with no traffic. By default, all bins are printed.
Turn off column titles. By default, titles are printed.
Disable fixed-width columnar output.
Use specified character between columns and after the final column. When this switch is not specified, the default of ’|’ is used.
Do not print the column separator after the final column. Normally a delimiter is printed.
Run as if –no-columns –no-final-delimiter –column-sep=C had been specified. That is, disable fixed-width columnar output; if character C is provided, it is used as the delimiter between columns instead of the default ’|’.
Print to the standard error the names of input files as they are opened.
Copy all binary input to the specified file or named pipe. PATH can be stdout to print flows to the standard output as long as the –output-path switch has been used to redirect rwcount’s ASCII output.
Determine where the output of rwcount (ASCII text) is written. If this option is not given, output is written to the standard output.
When output is to a terminal, invoke the program PAGER_PROG to view the output one screen full at a time. This switch overrides the SILK_PAGER environment variable, which in turn overrides the PAGER variable. If the value of the pager is determined to be the empty string, no paging will be performed and all output will be printed to the terminal.
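The pager precedence described above might be sketched as follows. The function resolve_pager is hypothetical, and the sketch assumes that no paging occurs when none of the three sources is set, which the text does not state.

```python
import os


def resolve_pager(switch_value=None, environ=None, is_terminal=True):
    """Return the pager program to run, or None for no paging.

    Precedence: --pager switch, then SILK_PAGER, then PAGER.
    """
    environ = os.environ if environ is None else environ
    if not is_terminal:
        # Paging applies only when output goes to a terminal.
        return None
    for candidate in (switch_value,
                      environ.get("SILK_PAGER"),
                      environ.get("PAGER")):
        if candidate is not None:
            # The first source that is set wins; an empty string
            # disables paging.
            return candidate or None
    # Assumption: nothing set means no pager.
    return None
```

Note that an empty SILK_PAGER disables paging even when PAGER names a program, because the higher-precedence value is used as soon as it is set.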
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK (use the –version switch to view this value); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application’s directory.
Specify the format for human readable timestamps, either the default (new) style, YYYY/MM/DDThh:mm:ss , or the legacy style, MM/DD/YYYY hh:mm:ss . When this switch is not present, the timestamps will be in the default format. When this switch is present and no argument is given, timestamps are in the legacy format. When an argument is supplied, timestamps will be in the new format if the argument begins with 0, and in the old format if the argument begins with 1. Any other argument to the switch is an error.
Print the available options and exit.
Print the version number and information about how SiLK was configured, then exit the application.
To count all web traffic on Jan 1, 2003, into 1 hour bins:
rwfilter --pass=stdout --start-date=2003/01/01:00 \
      --end-date=2003/01/01:24 --proto=6 --aport=80 \
  | rwcount --bin-size=3600
               Date|    Records|       Bytes|    Packets|
2003/01/01T00:00:00|   12947.00|  1968190.00|   34312.00|
2003/01/01T01:00:00|   65318.00|  5783959.00|  100143.00|
2003/01/01T02:00:00|   13765.00|  1895933.00|   36121.00|
2003/01/01T03:00:00|   69599.00|  7062388.00|  144130.00|
2003/01/01T04:00:00|  204717.00| 18491693.00|  385293.00|
2003/01/01T05:00:00|   18664.00|  2352966.00|   45296.00|
....
To force the hourly bins in the previous example to run from 30 minutes past the hour, use the --start-epoch switch:
rwfilter ... | \
      rwcount --bin-size=3600 --start-epoch=2002/12/31:23:30
SILK_PAGER
When set to a non-empty string, rwcount automatically invokes this program to display its output a screen at a time. If set to an empty string, rwcount does not automatically page its output.
PAGER
When set and SILK_PAGER is not set, rwcount automatically invokes this program to display its output a screen at a time.
SILK_CONFIG_FILE
This environment variable is used as the value for the --site-config-file switch when that switch is not provided.
SILK_DATA_ROOTDIR
When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwcount looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
SILK_PATH
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwcount checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share.
rwfilter(1), rwuniq(1)
rwuniq(1)’s –bin-time switch can do time-binning similar to what rwcount supports, but rwuniq cannot divide a SiLK record among multiple bins, i.e., there is no support for a –load-factor type switch. Such a feature could greatly increase rwuniq’s already large memory requirements.
Print selected fields of binary SiLK Flow records
rwcut [--fields=FIELDS] [--all-fields] [--plugin=PLUGIN]
      [--start-rec-num=NUM] [--end-rec-num=NUM] [--num-recs=NUM]
      [--dry-run] [--icmp-type-and-code] [--epoch-time]
      [{--integer-ips | --zero-pad-ips}] [--integer-sensors]
      [--no-titles] [--no-columns] [--column-separator=CHAR]
      [--no-final-delimiter] [{--delimited | --delimited=CHAR}]
      [--print-filenames] [--copy-input=PATH] [--output-path=PATH]
      [--pager=PAGER_PROG] [--site-config-file=FILENAME]
      [--ipv6-policy={ignore,asv4,mix,force,only}]
      [{--legacy-timestamps | --legacy-timestamps=NUM}]
      [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
      [--pmap-column-width=NUM] [--python-file=PATH ...] [FILES...]
rwcut [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
      [--plugin=PLUGIN ...] [--python-file=PATH ...] --help
rwcut --version
rwcut reads binary SiLK Flow records from files listed on the command line or from the standard input and prints the records to the screen in a textual, bar (|) delimited format. See the EXAMPLES section below for sample output.
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
FIELDS contains the list of flow attributes (a.k.a. fields or columns) to print. The columns will be displayed in the order the fields are specified. Fields may be repeated. FIELDS is a comma separated list of field-names, field-integers, and ranges of field-integers; a range is specified by separating the start and end of the range with a hyphen (-). Field-names are case-insensitive. Example:
--fields=stime,10,1-5
If the –fields switch is not given, FIELDS defaults to:
sIP,dIP,sPort,dPort,protocol,packets,bytes,flags,sTime,dur,eTime,sensor
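As an illustration of the FIELDS syntax, the following Python sketch expands a FIELDS string into an ordered list of field numbers. It is not rwcut's parser: parse_fields is a hypothetical helper, it recognizes only the twelve default field names, and it assumes those names correspond to field numbers 1 through 12 in the default order shown above.

```python
# Assumption: the twelve default fields are numbered 1-12 in this order.
FIELD_NAMES = ["sip", "dip", "sport", "dport", "protocol", "packets",
               "bytes", "flags", "stime", "dur", "etime", "sensor"]


def parse_fields(spec):
    """Expand a FIELDS string into a list of field numbers, in order."""
    result = []
    for item in spec.split(","):
        item = item.strip()
        if item.lower() in FIELD_NAMES:
            # Field names are case-insensitive.
            result.append(FIELD_NAMES.index(item.lower()) + 1)
        elif "-" in item:
            # A range of field integers, for example 1-5.
            low, high = (int(part) for part in item.split("-", 1))
            if low > high:
                raise ValueError(f"bad range: {item}")
            result.extend(range(low, high + 1))
        else:
            result.append(int(item))
    return result


fields = parse_fields("stime,10,1-5")
```

The result preserves the order given on the command line, and repeated fields are kept.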
The complete list of built-in fields that the SiLK tool suite supports follows, though note that not all fields are present in all SiLK file formats; when a field is not present, its value is 0.
sIP,1
source IP address
dIP,2
destination IP address
sPort,3
source port for TCP and UDP, or equivalent
dPort,4
destination port for TCP and UDP, or equivalent
protocol,5
IP protocol
packets,6
packet count
bytes,7
byte count
flags,8
bit-wise OR of TCP flags over all packets
sTime,9
starting time of flow (millisecond resolution unless the --legacy-timestamps switch is specified)
dur,10
duration of flow (millisecond resolution unless the --legacy-timestamps switch is specified)
eTime,11
end time of flow (millisecond resolution unless the --legacy-timestamps switch is specified)
sensor,12
name or ID of sensor at the collection point
class of sensor at the collection point
type of sensor at the collection point
starting time of flow including milliseconds (milliseconds are always displayed)
end time of flow including milliseconds (milliseconds are always displayed)
duration of flow including milliseconds (milliseconds are always displayed)
include two columns, iType and iCode that contain the ICMP type and code for ICMP flows; for non-ICMP flows, these columns are empty
Many SiLK file formats do not store the following fields and their values will always be 0; they are listed here for completeness:
router SNMP input interface
router SNMP output interface
router next hop IP
SiLK can store flows generated by enhanced collection software that provides more information than NetFlow v5. These flows may support some or all of these additional fields; for flows without this additional information, the field’s value is always 0.
TCP flags on first packet in the flow
bit-wise OR of TCP flags over all packets except the first in the flow
flow attributes set by the flow generator:
F
flow generator saw additional packets in this flow following a packet with a FIN flag (excluding ACK packets)
T
flow generator prematurely created a record for a long-running connection due to a timeout. (When the flow generator yaf(1) is run with the --silk switch, it will prematurely create a flow and mark it with T if the byte count of the flow cannot be stored in a 32-bit value.)
C
flow generator created this flow as a continuation of long-running connection, where the previous flow for this connection met a timeout (or a byte threshold in the case of yaf).
Consider a long-running ssh session that exceeds the flow generator’s active timeout. (This is the active timeout since the flow generator creates a flow for a connection that still has activity). The flow generator will create multiple flow records for this ssh session, each spanning some portion of the total session. The first flow record will be marked with a T indicating that it hit the timeout. The second through next-to-last records will be marked with TC indicating that this flow both timed out and is a continuation of a flow that timed out. The final flow will be marked with a C, indicating that it was created as a continuation of an active flow.
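The marking pattern in the ssh example can be written as a short Python sketch. The function attribute_marks is hypothetical and models only the timeout and continuation marks described above, not any other attribute.

```python
def attribute_marks(num_records):
    """Return the timeout/continuation marks for a session that the
    flow generator split into `num_records` flow records."""
    if num_records < 1:
        raise ValueError("a session produces at least one record")
    if num_records == 1:
        # The session never reached the active timeout.
        return [""]
    # First record timed out, middle records both continue and time out,
    # and the final record is only a continuation.
    return ["T"] + ["TC"] * (num_records - 2) + ["C"]


marks = attribute_marks(4)
```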
guess as to the content of the flow. Some software that generates flow records from packet data, such as yaf, will inspect the contents of the packets that make up a flow and use traffic signatures to label the content of the flow. SiLK calls this label the application; yaf refers to it as the appLabel. The application is the port number that is traditionally used for that type of traffic (see the /etc/services file on most UNIX systems). For example, traffic that the flow generator recognizes as FTP will have a value of 21, even if that traffic is being routed through the standard HTTP/web port (80).
The list of built-in fields may be augmented by run-time loading of plug-ins (shared object files or dynamic libraries) when the plug-in is available. rwcut automatically looks for the following plug-ins:
ADDRESS TYPE (addrtype.so)
for the source IP address, the value 0 if the address is non-routable, 1 if it is internal, or 2 if it is routable and external. See addrtype(3).
as stype for the destination IP address
COUNTRY CODE (ccfilter.so)
for the source IP, a two-letter country code abbreviation denoting the country who owns that IP address. See ccfilter(3).
as scc for the destination IP
PREFIX MAP (pmapfilter.so)
value determined by passing the source IP or the protocol/source-port to the user-defined mapping defined in the prefix map associated with MAPNAME. See the description of the –pmap-file switch and the pmapfilter(3) manual page.
as src-MAPNAME for the destination IP or protocol/destination-port.
These are deprecated field names created by pmapfilter that correspond to src-MAPNAME and dst-MAPNAME, respectively. These fields are available when a prefix map is used that is not associated with a MAPNAME.
Instruct rwcut to print all known fields. This switch cannot be combined with the –fields switch. This switch suppresses error messages from the plug-ins.
Augment the list of fields by using run-time loading of the plug-in (shared object) whose path is PLUGIN. The creation of these plug-ins is beyond the scope of this manual page. When PLUGIN contains a slash (/), rwcut assumes the path to PLUGIN is correct. Otherwise, rwcut will attempt to find the file in $SILK_PATH/lib/silk, $SILK_PATH/share/lib, $SILK_PATH/lib, and in these directories parallel to the application’s directory: lib/silk, share/lib, and lib. If rwcut does not find the file, it assumes the plug-in is in the current directory. To force rwcut to look in the current directory first, specify –plugin=./PLUGIN. When the SILK_PLUGIN_DEBUG environment variable is non-empty, rwcut prints status messages to the standard error as it tries to open each of its plug-ins.
Begin printing with the START_NUM’th record by skipping the first START_NUM-1 records. The default is 1; that is, to start printing at the first record; START_NUM must be a positive integer. If START_NUM is greater than the number of input records, the only output will be the title. This parameter does not affect the records written to the stream specified by –copy-input.
Stop printing after the END_NUM’th record. When END_NUM is 0, the default, printing stops once all input records have been printed; that is, END_NUM is effectively infinity. If this value is non-zero, it must not be less than START_NUM. This parameter does not affect the records written to the stream specified by –copy-input.
Print no more than REC_COUNT records; however, if both –start-rec-num and –end-rec-num are specified or if END_NUM is less than REC_COUNT, this switch is ignored. Specifying a REC_COUNT of 0 will print all records, which is the default.
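The interaction of the three record-number switches can be sketched in Python. The function select_window is hypothetical. The case where only END_NUM and REC_COUNT are given, with END_NUM at least REC_COUNT, is not spelled out in the text; the sketch assumes the window then counts back from END_NUM.

```python
def select_window(total, start_num=None, end_num=None, num_recs=0):
    """Return (first, last) 1-based record numbers to print, or None
    when no record would be printed. `total` is the input record count."""
    start_given = start_num is not None
    end_given = end_num not in (None, 0)   # END_NUM of 0 means "no limit"
    first = start_num if start_given else 1
    last = end_num if end_given else total
    # REC_COUNT is ignored when both START_NUM and END_NUM are given.
    if num_recs > 0 and not (start_given and end_given):
        if end_given:
            # REC_COUNT is also ignored when END_NUM < REC_COUNT.
            if end_num >= num_recs:
                # Assumption: the window ends at END_NUM.
                first = end_num - num_recs + 1
        else:
            last = first + num_recs - 1
    last = min(last, total)
    return (first, last) if first <= last else None
```

For example, select_window(100, start_num=10, num_recs=5) selects records 10 through 14, while adding end_num=20 selects records 10 through 20 because REC_COUNT is then ignored.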
Causes rwcut to print the column headers and exit. Useful for testing.
Unlike TCP or UDP, ICMP messages do not use ports, but instead have types and codes. Specifying this switch will cause rwcut to print, for ICMP records, the message’s type and code in the sPort and dPort columns, respectively. The use of this switch is discouraged; use the icmpTypeCode field instead.
Print timestamps as epoch time (number of seconds since midnight GMT on 1970-01-01).
Print IPs as integers. By default, IP addresses are printed in their canonical form.
Print IP addresses in their canonical form, but add zeros to the IP address so it fully fills the width of the column. For IPv4, use three digits per octet, e.g., 127.000.000.001. For IPv6, use four digits per hexadectet and expand empty hexadectets, e.g., 0000:0000:0000:0000:0000:FFFF:FF00:0001.
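The zero-padded forms can be reproduced with Python's standard ipaddress module. This is an illustrative sketch, with zero_pad_ip as a hypothetical helper name, and not the code rwcut uses.

```python
import ipaddress


def zero_pad_ip(text):
    """Format an address with three digits per IPv4 octet or four
    uppercase hex digits per IPv6 hexadectet."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        return ".".join(f"{octet:03d}" for octet in addr.packed)
    # Render all 128 bits as hex, then split into eight groups of four.
    hex_digits = f"{int(addr):032X}"
    return ":".join(hex_digits[i:i + 4] for i in range(0, 32, 4))


padded_v4 = zero_pad_ip("127.0.0.1")
padded_v6 = zero_pad_ip("::ffff:ff00:1")
```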
Print the integer ID of the sensor rather than its name.
Turn off column titles. By default, titles are printed.
Disable fixed-width columnar output.
Use specified character between columns and after the final column. When this switch is not specified, the default of ’|’ is used.
Do not print the column separator after the final column. Normally a delimiter is printed.
Run as if –no-columns –no-final-delimiter –column-sep=C had been specified. That is, disable fixed-width columnar output; if character C is provided, it is used as the delimiter between columns instead of the default ’|’.
Print to the standard error the names of input files as they are opened.
Copy all binary input to the specified file or named pipe. PATH can be stdout to print flows to the standard output as long as the –output-path switch has been used to redirect rwcut’s ASCII output.
Determines where the output of rwcut (ASCII text) is written. If this option is not given, output is written to the standard output.
When output is to a terminal, invoke the program PAGER_PROG to view the output one screen full at a time. This switch overrides the SILK_PAGER environment variable, which in turn overrides the PAGER variable. If the value of the pager is determined to be the empty string, no paging will be performed and all output will be printed to the terminal.
Determine how IPv4 and IPv6 flows are handled when SiLK has been compiled with IPv6 support. When the switch is not provided, the SILK_IPV6_POLICY environment variable is checked for a policy. If it is also unset or contains an invalid policy, the POLICY is mix. When SiLK has not been compiled with IPv6 support, IPv6 flows are always ignored, regardless of the value passed to this switch or in the SILK_IPV6_POLICY variable. The supported values for POLICY are:
ignore
Completely ignore IPv6 flows. Only IPv4 flows will be printed.
asv4
Convert IPv6 addresses to IPv4 if possible, otherwise ignore the IPv6 flows.
mix
Process the input as a mixture of IPv4 and IPv6 flows.
force
Force IPv4 flows to be converted to IPv6.
only
Only process flows that were marked as IPv6 and completely ignore IPv4 flows.
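The five policies can be illustrated on a single address with Python's ipaddress module. This is a sketch under stated assumptions: apply_ipv6_policy is a hypothetical function, the real tool applies the policy to whole flow records, and "convert if possible" is assumed to mean the address lies in the IPv4-mapped range ::ffff:0:0/96.

```python
import ipaddress


def apply_ipv6_policy(address, policy="mix"):
    """Return the address as it would be handled, or None if ignored."""
    addr = ipaddress.ip_address(address)
    is_v6 = addr.version == 6
    if policy == "ignore":
        return None if is_v6 else addr
    if policy == "asv4":
        if not is_v6:
            return addr
        # ipv4_mapped is None unless the address is in ::ffff:0:0/96.
        return addr.ipv4_mapped
    if policy == "mix":
        return addr
    if policy == "force":
        if is_v6:
            return addr
        # Build the IPv4-mapped IPv6 form from the raw bytes.
        return ipaddress.IPv6Address(b"\x00" * 10 + b"\xff\xff" + addr.packed)
    if policy == "only":
        return addr if is_v6 else None
    raise ValueError(f"unknown policy: {policy}")
```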
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK (use the –version switch to view this value); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application’s directory.
Specify the format for human readable timestamps, either the default (new) style, YYYY/MM/DDThh:mm:ss.sss , or the legacy style, MM/DD/YYYY hh:mm:ss . When this switch is not present, the timestamps will be in the default format. When this switch is present and no argument is given, timestamps are in the legacy format. When an argument is supplied, timestamps will be in the new format if the argument begins with 0, and in the old format if the argument begins with 1. Any other argument to the switch is an error.
This switch also controls whether fractional seconds are displayed in the sTime and eTime fields when –epoch-time is requested. If the –legacy-timestamps switch is present with no value or with a value of 1, milliseconds will not be displayed; when not present or specified with a value of 0, milliseconds will be displayed.
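The timestamp styles described above can be compared with a small Python sketch. The function format_timestamp and its keyword arguments are hypothetical stand-ins for the --legacy-timestamps and --epoch-time switches; it is not rwcut's formatting code.

```python
from datetime import datetime, timezone


def format_timestamp(when, legacy=False, epoch=False):
    """Render a timezone-aware datetime in the new, legacy, or epoch style."""
    if epoch:
        seconds = when.timestamp()
        # Legacy mode drops the fractional seconds.
        return str(int(seconds)) if legacy else f"{seconds:.3f}"
    if legacy:
        return when.strftime("%m/%d/%Y %H:%M:%S")
    millis = when.microsecond // 1000
    return when.strftime("%Y/%m/%dT%H:%M:%S") + f".{millis:03d}"


when = datetime(2003, 1, 1, 0, 0, 14, 625000, tzinfo=timezone.utc)
new_style = format_timestamp(when)
old_style = format_timestamp(when, legacy=True)
```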
Print the available options and exit. Options that add fields can be specified before –help so that the new options appear in the output.
Print the version number and information about how SiLK was configured, then exit the application.
This switch is deprecated. It is an alias for –plugin.
When the prefix map plug-in is used, rwcut reads the mapping file located at PATH. When MAPNAME is provided, it will be used to refer to the fields specific to that prefix map. If MAPNAME is not provided, rwcut will check the prefix map file to see if a map-name was specified when the file was created. Using multiple --pmap-file switches allows additional prefix map files to be read as long as each uses a unique map-name. For more information, see pmapfilter(3).
When the pmapfilter plug-in is used, this switch gives the maximum number of characters to use when displaying the textual value of any field.
When the SiLK Python plug-in is used, rwcut reads the Python code from the file PATH to define additional fields for possible output. This file should call register_plugin_field() for each field it wishes to define. For details and examples, see the silkpython(3) and pysilk(3) manual pages.
A standard rwcut output will look like this (with the text wrapped for readability):
            sIP|            dIP|sPort|dPort|pro|\
    10.30.30.31|    10.70.70.71|   80|36761|  6|\
   packets|     bytes|   flags|\
         7|      3227| FS PA  |\
                  sTime|      dur|                  eTime|senso|
2003/01/01T00:00:14.625|    3.959|2003/01/01T00:00:18.584|EDGE1|
The first line of the output is the title line, which shows the selected fields; the --no-titles switch disables the printing of that line. The second and subsequent lines contain the data.
The most basic use of rwcut is to read its input directly from rwfilter(1). For example, to see representative TCP traffic:
rwfilter --start-date=2002/01/19:00 --end-date=2002/01/19:01 \
      --proto=6 --pass=stdout | rwcut
To see only a limited set of fields, use the --fields switch. For example, to see only the protocol, use:
rwcut --fields=5
The silkpython(3) manual page provides examples that use PySiLK to create and print arbitrary fields for rwcut.
The order of the FIELDS is significant, and fields can be repeated. For example, here is a case where, in addition to the default fields of 1-12, you also want to prefix each row with an integer form of the destination IP and the start time to make processing by another tool easier. However, within the default fields of 1-12, you want to see dotted-decimal IP addresses.
rwfilter ... --pass=stdout | \
      rwcut --integer-ip --fields=2,9,1-12 --epoch-time | \
      num2dot --ip-field=3,4
SILK_IPV6_POLICY
This environment variable is used as the value for the --ipv6-policy switch when that switch is not provided.
SILK_PAGER
When set to a non-empty string, rwcut automatically invokes this program to display its output a screen at a time. If set to an empty string, rwcut does not automatically page its output.
PAGER
When set and SILK_PAGER is not set, rwcut automatically invokes this program to display its output a screen at a time.
PYTHONPATH
This environment variable is used by Python to locate modules. When --python-file is specified, rwcut loads Python, which in turn loads the PySiLK module, which consists of several files (silk/pysilk_nl.so, silk/__init__.py, etc). If this silk/ directory is located outside Python's normal search path (for example, in the SiLK installation tree), it may be necessary to set or modify the PYTHONPATH environment variable to include the parent directory of silk/ so that Python can find the PySiLK module.
SILK_PYTHON_TRACEBACK
When set, Python plug-ins will output traceback information on Python errors to the standard error.
SILK_COUNTRY_CODES
This environment variable allows the user to specify the country code mapping file that the ccfilter(3) plug-in will use. The value may be a complete path or a file relative to the SILK_PATH. If the variable is not specified, the code looks for a file named country_codes.pmap in the location specified by SILK_PATH.
SILK_CONFIG_FILE
This environment variable is used as the value for the --site-config-file switch when that switch is not provided.
SILK_DATA_ROOTDIR
When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwcut looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
SILK_PATH
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwcut checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share. These directories are also searched when any other configuration file is required (e.g., the country code map). In addition, rwcut looks for plug-ins in $SILK_PATH/lib/silk, $SILK_PATH/share/lib and $SILK_PATH/lib.
SILK_PLUGIN_DEBUG
When set to 1, rwcut prints status messages to the standard error as it tries to open each of its plug-ins.
The ordering of the field numbers in --fields is significant: specifying --fields=2,1 will print destination IP, then source IP.
If you are interested in only a few fields, use the –fields option to reduce the volume of data to be processed. For example, if you are checking to see which internal host got hit with the slammer worm (signature: UDP, destPort 1434, pkt size 404), then the following rwfilter, rwcut combination will be much faster than simply using default values:
rwfilter --proto=17 --dport=1434 --bytes-per-packet=404-404 \
      --pass=stdout | rwcut --fields=2
To get a mapping from the integer representing a sensor to its name, use the mapsid(1) command.
rwfilter(1), mapsid(1), num2dot(1), addrtype(3), ccfilter(3), pmapfilter(3), silkpython(3), pysilk(3), yaf(1)
Eliminate duplicate SiLK Flow records
rwdedupe [--ignore-fields=FIELDS] [--packets-delta=NUM]
      [--bytes-delta=NUM] [--stime-delta=NUM] [--duration-delta=NUM]
      [--temp-directory=DIR_PATH] [--buffer-size=SIZE]
      [--compression-method=COMP_METHOD] [--output-path=PATH]
      [--site-config-file=FILENAME] [FILES ...]
rwdedupe --help
rwdedupe --version
rwdedupe reads SiLK Flow records from the files named on the command line or from the standard input. Records that appear in the input file(s) multiple times will only appear in the output stream once; that is, duplicate records are not written to the output. The SiLK Flows are written to the file specified by the –output-path switch or to the standard output when the –output-path switch is not provided and the standard output is not connected to a terminal.
As part of its processing, rwdedupe will re-order the records before writing them.
By default, rwdedupe will consider one record to be a duplicate of another when all the fields in the records match exactly. From another point of view, any difference in two records results in both records appearing in the output. Note that all means every field that exists on a SiLK Flow record. The complete list of fields is specified in the description of --ignore-fields in the OPTIONS section below.
To have rwdedupe ignore fields in the comparison, specify those fields in the –ignore-fields switch. When –ignore-fields=FIELDS is specified, a record is considered a duplicate of another if all fields except those in FIELDS match exactly. rwdedupe will treat FIELDS as being identical across all records. Put another way, if the only difference between two records is in the FIELDS fields, only one of those records will be written to the output.
The --packets-delta, --bytes-delta, --stime-delta and --duration-delta switches allow for "fuzziness" in the input. For example, if --stime-delta=NUM is specified and the only difference between two records is in the sTime fields, and the fields are within NUM milliseconds of each other, only one record will be written to the output.
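The duplicate test, including ignored fields and delta tolerances, can be sketched in Python. The function is_duplicate and the dictionary representation of a record are assumptions for illustration; the sketch does not model how rwdedupe chooses which of two near-identical records to keep.

```python
def is_duplicate(rec_a, rec_b, ignore=(), deltas=None):
    """Return True when two records match on every field that is not
    ignored, allowing numeric fields to differ by up to their delta."""
    deltas = deltas or {}
    for field in rec_a.keys() | rec_b.keys():
        if field in ignore:
            continue
        a, b = rec_a.get(field), rec_b.get(field)
        if field in deltas:
            # Fuzzy comparison, for example sTime within NUM milliseconds.
            if abs(a - b) > deltas[field]:
                return False
        elif a != b:
            return False
    return True


first = {"sIP": "10.0.0.1", "dIP": "10.0.0.2", "sTime": 1000, "bytes": 500}
second = {"sIP": "10.0.0.1", "dIP": "10.0.0.2", "sTime": 1040, "bytes": 500}
```

Here the two records differ only by 40 milliseconds in sTime, so they count as duplicates with a start-time delta of 50 but not with the default delta of 0.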
During its processing, rwdedupe will try to allocate a large (near 2GB) in-memory array to hold the records. (You may use the –buffer-size switch to change this maximum buffer size.) If more records are read than will fit into memory, the in-core records are temporarily stored on disk as described by the –temp-directory switch. When all records have been read, the on-disk files are merged to produce the output.
By default, the temporary files are stored in the /tmp directory. Because of the sizes of the temporary files, it is strongly recommended that /tmp not be used as the temporary directory, and rwdedupe will print a warning when /tmp is used. To modify the temporary directory used by rwdedupe, provide the –temp-directory switch, set the SILK_TMPDIR environment variable, or set the TMPDIR environment variable.
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
Ignore the fields listed in FIELDS when determining if two flow records are identical; that is, treat FIELDS as being identical across all flows. By default, all fields are treated as significant.
FIELDS is a comma separated list of field-names, field-integers, and ranges of field-integers; a range is specified by separating the start and end of the range with a hyphen (-). Field-names are case-insensitive. Example:
--ignore-fields=stime,12-15
The supported fields are:
source IP address
destination IP address
source port for TCP and UDP, or equivalent
destination port for TCP and UDP, or equivalent
IP protocol
packet count
byte count
bit-wise OR of TCP flags over all packets
starting time of flow (milliseconds resolution)
duration of flow (milliseconds resolution)
name or ID of sensor at the collection point
router SNMP input interface
router SNMP output interface
router next hop IP
class of sensor at the collection point
type of sensor at the collection point
TCP flags on first packet in the flow
bit-wise OR of TCP flags over all packets except the first in the flow
flow attributes set by flow generator
guess as to the content of the flow. Some software that generates flow records from packet data, such as yaf(1), will inspect the contents of the packets that make up a flow and use traffic signatures to label the content of the flow. SiLK calls this label the application; yaf refers to it as the appLabel. The application is the port number that is traditionally used for that type of traffic (see the /etc/services file on most UNIX systems). For example, traffic that the flow generator recognizes as FTP will have a value of 21, even if that traffic is being routed through the standard HTTP/web port (80).
Treat the packets field on two records as being the same if the values differ by NUM packets or less. If not specified, the default is 0.
Treat the bytes field on two records as being the same if the values differ by NUM bytes or less. If not specified, the default is 0.
Treat the start-time field on two records as being the same if the values differ by NUM milliseconds or less. If not specified, the default is 0.
Treat the duration field on two records as being the same if the values differ by NUM milliseconds or less. If not specified, the default is 0.
Specify the name of the directory in which to store data files temporarily when more records have been read than will fit into RAM. This switch overrides the directory specified in the SILK_TMPDIR environment variable, which overrides the directory specified in the TMPDIR variable, which overrides the default, /tmp.
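The override order for the temporary directory can be expressed as a short Python sketch. The function resolve_temp_dir is hypothetical, and treating an empty value as unset is an assumption not stated in the text.

```python
import os


def resolve_temp_dir(switch_value=None, environ=None):
    """Pick the temporary directory: the --temp-directory switch, then
    SILK_TMPDIR, then TMPDIR, then the default /tmp."""
    environ = os.environ if environ is None else environ
    for candidate in (switch_value,
                      environ.get("SILK_TMPDIR"),
                      environ.get("TMPDIR")):
        # Assumption: an empty string counts as "not set".
        if candidate:
            return candidate
    return "/tmp"
```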
Set the maximum size of the buffer to use for holding the records, in bytes. A larger buffer means fewer temporary files need to be created, reducing the I/O wait times. The default maximum for this buffer is near 2GB. The SIZE may be given as an ordinary integer, or as a real number followed by a suffix K, M or G, which represents the numerical value multiplied by 1,024 (kilo), 1,048,576 (mega), and 1,073,741,824 (giga), respectively. For example, 1.5K represents 1,536 bytes, or one and one-half kilobytes. (This value does not represent the absolute maximum amount of RAM that rwdedupe will allocate, since additional buffers will be allocated for reading the input and writing the output.)
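The SIZE syntax can be checked with a small Python sketch. The function parse_size is a hypothetical helper that accepts only the uppercase suffixes named above; it is not the parser rwdedupe uses.

```python
def parse_size(text):
    """Convert a SIZE string such as 4096, 1.5K, 512M or 2G to bytes."""
    multipliers = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    text = text.strip()
    suffix = text[-1:]
    if suffix in multipliers:
        # A real number followed by a suffix.
        return int(float(text[:-1]) * multipliers[suffix])
    # An ordinary integer number of bytes.
    return int(text)


size = parse_size("1.5K")
```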
Set the compression method of the output to COMP_METHOD. Some SiLK tools can use an external library to compress their binary output. The list of available compression methods and the default method are set when SiLK is compiled (the –help and –version switches print the available and default compression methods) and depend on which supported libraries are found. SiLK can support:
Do not compress the output using an external library
Use the zlib(3) library for compressing the output
Use the lzo1x algorithm from the LZO real time compression library for compression
Use whichever available method gives the best compression in general, though not necessarily the best for this particular output.
Write the SiLK Flow records to the specified file or named pipe. This switch must not name an existing regular file. When the standard output is not a terminal and this switch is not provided or its argument is stdout, the records are written to the standard output.
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK (use the –version switch to view this value); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application’s directory.
Print the available options and exit.
Print the version number and information about how SiLK was configured, then exit the application.
When the temporary files and the final output are stored on the same file volume, rwdedupe will require approximately twice as much free disk space as the size of input data.
When the temporary files and the final output are on different volumes, rwdedupe will require between 1 and 1.5 times as much free space on the temporary volume as the size of the input data.
Suppose you have made several rwfilter(1) runs to find interesting traffic:
rwfilter --start-date=2008/02/04 ... --pass=data1.rwf
rwfilter --start-date=2008/02/04 ... --pass=data2.rwf
rwfilter --start-date=2008/02/04 ... --pass=data3.rwf
rwfilter --start-date=2008/02/04 ... --pass=data4.rwf
You now want to merge that traffic into a single output file, but you want to ensure that any records appearing in multiple output files are only counted once. You can use rwdedupe to merge the output:
rwdedupe data1.rwf data2.rwf data3.rwf data4.rwf --output=data.rwf
When set and –temp-directory is not specified, rwdedupe writes the temporary files it creates to this directory. SILK_TMPDIR overrides the value of TMPDIR.
When set and SILK_TMPDIR is not set, rwdedupe writes the temporary files it creates to this directory.
This environment variable is used as the value for the –site-config-file when that switch is not provided.
When the –site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwdedupe looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwdedupe checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share.
rwfilter(1), yaf(1), zlib(3)
Print files that rwfilter’s File Selection switches will access
rwfglob [--start-date=YYYY/MM/DD[:HH] [--end-date=YYYY/MM/DD[:HH]]]
{ [--class=CLASS] [--type={all | TYPE[,TYPE ...]}] | [--flowtype=CLASS/TYPE[,CLASS/TYPE ...]] } [--sensors=SENSOR[,SENSOR ...]] [--data-rootdir=PATH] [--site-config-file=FILENAME] [--print-missing-files] [--no-file-names] [--no-summary]
rwfglob [--data-rootdir=PATH] [--site-config-file=FILENAME] --help
rwfglob --version
rwfglob accepts the normal File Selection options of rwfilter(1) and prints, to the standard output, the names of the files that would normally be accessed. At the end, a summary is printed of the number of files that exist and the number of those files that are on tape. (The on tape number is determined by seeing how many files had 0 blocks allocated to them.) By default, rwfglob only prints the names of files that exist; to see the names of files that it did not find, supply the –print-missing-files switch.
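The zero-block heuristic behind the "on tape" count can be sketched in a few lines. This is an illustration only, assuming a POSIX system where `os.stat` reports `st_blocks`; the function name and the injectable `stat` parameter are hypothetical and are not part of SiLK.

```python
import os


def summarize_files(paths, stat=os.stat):
    """Count existing files and those with zero allocated blocks.

    Mirrors the heuristic described above: a file that exists but has
    0 blocks allocated is assumed to have been migrated to tape.
    `stat` is injectable so the logic can be exercised without real files.
    """
    found = 0
    on_tape = 0
    for path in paths:
        try:
            info = stat(path)
        except FileNotFoundError:
            continue  # missing files are not part of the summary
        found += 1
        if info.st_blocks == 0:
            on_tape += 1
    return found, on_tape
```

An empty file on ordinary disk typically also reports zero blocks, so a count produced this way is an approximation.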
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as –arg=param or –arg param, though the first form is required for options that take optional parameters.
The date predicates indicate which days and hours to consider when creating the list of files. The dates are expressed in YYYY/MM/DD:HH format. For example, 2003/01/18:00 represents the first hour of January 18th, 2003, while 2002/10/01:22 corresponds to 22:00 on October 1st, 2002.
Whether the date strings represent times in GMT or the local timezone depends on how SiLK was compiled. See the output from --help or check the Timezone support setting in the --version output to determine how your version of SiLK was compiled.
When both –start-date and –end-date are specified to hour precision, all hours within that time range are processed.
When –start-date is specified to day precision, the hour specified in –end-date (if any) is ignored, and files for all dates between midnight on start-date and 23:59 on end-date are processed.
When –end-date is not specified and –start-date is specified to day precision, files for that complete day are processed.
When –end-date is not specified and –start-date is specified to hour precision, files for that single hour are processed.
It is an error to specify –end-date without specifying –start-date.
When neither –start-date nor –end-date is given, rwfglob prints all files for the current day.
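The date rules above can be summarized as a small function that returns the hourly bins a given pair of switches selects. This is a sketch of the rules as written, not rwfglob's implementation; the function names are hypothetical. The text does not say what happens when the start date has an hour and the end date does not, so the sketch raises an error for that case rather than guess.

```python
from datetime import datetime, timedelta


def _parse(text):
    """Return (datetime, has_hour) for YYYY/MM/DD or YYYY/MM/DD:HH."""
    if ":" in text:
        return datetime.strptime(text, "%Y/%m/%d:%H"), True
    return datetime.strptime(text, "%Y/%m/%d"), False


def hours_to_process(start=None, end=None, now=None):
    """List the hourly bins selected by --start-date / --end-date."""
    if start is None:
        if end is not None:
            raise ValueError("--end-date requires --start-date")
        # Neither switch given: every hour of the current day.
        today = (now or datetime.now()).replace(
            hour=0, minute=0, second=0, microsecond=0)
        first, last = today, today.replace(hour=23)
    else:
        first, start_has_hour = _parse(start)
        if end is None:
            # A single hour, or the complete day at day precision.
            last = first if start_has_hour else first.replace(hour=23)
        else:
            last, end_has_hour = _parse(end)
            if not start_has_hour:
                # Day precision on the start: any hour on the end is ignored.
                last = last.replace(hour=23)
            elif not end_has_hour:
                raise ValueError("combination not covered by the rules above")
    hours = []
    current = first
    while current <= last:
        hours.append(current)
        current += timedelta(hours=1)
    return hours
```

For example, `hours_to_process("2003/10/11")` yields the 24 hourly bins of that day, which corresponds to the 24 files in the single-sensor example later in this page.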
The –class switch is used to specify a group of data to process. Only a single class may be selected. Classes are defined in the silk.conf(5) site configuration file. If the –class option is not given, the default-class as specified in silk.conf is used. Use the –help option to see the list of available classes and the default class.
The --type predicate further specifies data within the selected CLASS by listing the TYPEs of traffic to process. The switch takes a comma-separated list of types or the keyword all, which specifies all types for the specified CLASS. Types are defined in silk.conf; they typically refer to the direction of the flow, and they may vary by class. Classes typically define default-types to use when the --type switch is not specified. Use the --help option to get the list of available types for each class.
The –flowtype predicate provides an alternate way to specify class/type pairs. The –flowtype switch allows a single rwfglob invocation to print data from multiple classes. The keyword all may be used for the CLASS and/or TYPE to select all classes and/or types.
The –sensor switch is used to select data from specific sensors. The parameter is a comma separated list of sensor names, sensor IDs (integers), and/or ranges of sensor IDs. Sensors are defined in the silk.conf(5) site configuration file, and the mapsid(1) command can be used to print a mapping of sensor names to IDs and classes. When the –sensor switch is not specified, the default is to use all sensors which are valid for the specified class(es).
This option causes rwfglob to use PATH as the root of the data store directory, which overrides the location given in the SILK_DATA_ROOTDIR environment variable, which overrides the location that was compiled into rwfglob. The default data store directory will be shown when the –version option is given.
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the root of the data directory (see –data-rootdir); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application’s directory.
This option prints to the standard error file names that rwfglob expected to find but did not. This switch is useful for debugging, but the list of files it produces can be misleading. For example, suppose there is a decommissioned sensor that still appears in the silk.conf file to permit retrieval of historical data; these data files will be missing even though their absence is expected. Use the output from this switch judiciously.
This option instructs rwfglob not to print the names of the files that it successfully finds. By default, rwfglob prints the names of the files it finds and a summary line showing the number of files it found.
This option instructs rwfglob not to print the summary line (that is, the line that shows the number of files found). By default, rwfglob prints the names of the files it finds and a summary line showing the number of files it found.
Print the available options and exit. The available classes and types will be included in output; you may specify a different root directory or site configuration file before –help to see the classes and types available for that site.
Print the version number and information about how SiLK was configured, then exit the application.
Looking at a day on a single sensor:
$ rwfglob --start=2003/10/11 --sensor=2
/data/in/2003/10/11/in-GAMMA_20031011.23
/data/in/2003/10/11/in-GAMMA_20031011.22
/data/in/2003/10/11/in-GAMMA_20031011.21
/data/in/2003/10/11/in-GAMMA_20031011.20
/data/in/2003/10/11/in-GAMMA_20031011.19
/data/in/2003/10/11/in-GAMMA_20031011.18
/data/in/2003/10/11/in-GAMMA_20031011.17
/data/in/2003/10/11/in-GAMMA_20031011.16
/data/in/2003/10/11/in-GAMMA_20031011.15
/data/in/2003/10/11/in-GAMMA_20031011.14
/data/in/2003/10/11/in-GAMMA_20031011.13
/data/in/2003/10/11/in-GAMMA_20031011.12
/data/in/2003/10/11/in-GAMMA_20031011.11
/data/in/2003/10/11/in-GAMMA_20031011.10
/data/in/2003/10/11/in-GAMMA_20031011.09
/data/in/2003/10/11/in-GAMMA_20031011.08
/data/in/2003/10/11/in-GAMMA_20031011.07
/data/in/2003/10/11/in-GAMMA_20031011.06
/data/in/2003/10/11/in-GAMMA_20031011.05
/data/in/2003/10/11/in-GAMMA_20031011.04
/data/in/2003/10/11/in-GAMMA_20031011.03
/data/in/2003/10/11/in-GAMMA_20031011.02
/data/in/2003/10/11/in-GAMMA_20031011.01
/data/in/2003/10/11/in-GAMMA_20031011.00
globbed 24 files; 0 on tape
If you only want the summary, specify --no-file-names:
$ rwfglob --start-date=2003/10/11 --sensor=2 --no-file-names
globbed 24 files; 0 on tape
This environment variable is used as the value for the –site-config-file when that switch is not provided.
When set, overrides the compiled-in value for the location of the directory tree containing the files of SiLK Flow records collected and stored by the packing system (rwflowpack(8)). In addition, when the –site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwfglob looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwfglob checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share.
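The search order for the site configuration file, as described for --site-config-file and the environment variables above, can be sketched as a function that lists candidate paths in order. All names here are hypothetical. Two assumptions are made: the compiled-in data root is unknown to this sketch and must be passed in as `data_rootdir` if wanted, and "parallel to the application's directory" is read as the `share` directories that sit beside the directory holding the executable.

```python
import os


def site_config_candidates(switch_value=None, data_rootdir=None,
                           app_dir=None, env=None):
    """Return candidate silk.conf locations in the order described above."""
    env = os.environ if env is None else env
    if switch_value:
        return [switch_value]             # --site-config-file wins outright
    if env.get("SILK_CONFIG_FILE"):
        return [env["SILK_CONFIG_FILE"]]  # value must name the file itself
    candidates = []
    # --data-rootdir overrides SILK_DATA_ROOTDIR.
    root = data_rootdir or env.get("SILK_DATA_ROOTDIR")
    if root:
        candidates.append(os.path.join(root, "silk.conf"))
    silk_path = env.get("SILK_PATH")
    if silk_path:
        candidates.append(os.path.join(silk_path, "share", "silk", "silk.conf"))
        candidates.append(os.path.join(silk_path, "share", "silk.conf"))
    if app_dir:
        # Assumption: /usr/local/bin maps to /usr/local/share/silk and
        # /usr/local/share.
        parent = os.path.dirname(os.path.abspath(app_dir))
        candidates.append(os.path.join(parent, "share", "silk", "silk.conf"))
        candidates.append(os.path.join(parent, "share", "silk.conf"))
    return candidates
```

A caller would take the first candidate that exists on disk.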
rwfilter(1), mapsid(1), silk.conf(5)
The –print-missing-files option needs to be smarter about what files are really missing.
The block size check is of unknown portability across different tape-farm systems.
Print information about a SiLK file
rwfileinfo [--fields=FIELDS] [--summary] [--no-titles] FILE [ FILE ... ]
rwfileinfo --help
rwfileinfo --version
rwfileinfo prints information about a SiLK file. The information that may be printed is:
format. The output file format, a string and its hexadecimal equivalent: FT_RWSPLIT(0x12), FT_RWFILTER(0x13), etc.
version. The version of the file, an integer. As of SiLK 1.0, the version of the file is distinct from the version of the records in the file.
byte-order. The byte-order (endian-ness) of the file, a string
compression. The compression library used to compress the data-section of the file, a string and its decimal equivalent (none(0), lzo1x(2)). Does not include any external compression, such as if the entire file has been compressed with gzip(1).
header-length. The length of the header in bytes
record-length. The length of a single record in bytes. This will be 1 if the records do not have a fixed size.
count-records. The number of records in the file. If the record-length is 1, this value is the uncompressed size of the data section of the file.
file-size. The size of the file as it is on disk
command-lines. The command(s) used to generate this file, for tools that support writing that information to the header and for formats that store that information.
record-version. The version of the records contained in the file
silk-version. The release of SiLK that wrote this file, e.g., 1.0.0. This value is 0 for files written by releases of SiLK prior to 1.0.
packed-file-info. The timestamp, flowtype, and sensor for a file in the SiLK data repository.
probe-name. The probe information for files created by flowcap(8)
annotations. The notes (annotations) that have been added to the file with the –note-add and –note-file-add switches
prefix-map. The mapname value for a prefix map file.
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as –arg=param or –arg param, though the first form is required for options that take optional parameters.
Determines which information about the file is printed. FIELDS is a list of integers representing fields to print. The FIELDS may be a comma separated list of integers; a range may be specified by separating the start and end of the range with a hyphen (-). The available fields are listed above. Fields are always printed in the order given above. If the –fields option is not given, all fields are printed.
Prints a summary that lists the number of files processed, the sizes of those files, and the number of records contained in those files.
Suppresses printing of the file name and field names; only the values are printed, left justified and one per line.
Print the available options and exit.
Print the version number and information about how SiLK was configured, then exit the application.
$ rwfileinfo tcp-data.rwf
tcp-data.rwf:
  format(id)          FT_RWGENERIC(0x16)
  version             16
  byte-order          littleEndian
  compression(id)     none(0)
  header-length       208
  record-length       52
  record-version      5
  silk-version        1.0.1
  count-records       7
  file-size           572
  command-lines
                   1  rwfilter --proto=6 --pass=tcp-data.rwf ...
  annotations
                   1  This is some interesting TCP data
$ rwfileinfo --no-titles --field=count-records tcp-data.rwf
7
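The numeric fields in the first example are consistent with one another: for this uncompressed file, the file size equals the header length plus the record length times the record count. The helper below is a sketch under that assumption (no compression, fixed-length records, nothing after the last record). It is an inference from the sample output, not a documented guarantee, and the function name is hypothetical.

```python
def expected_file_size(header_length, record_length, count_records):
    """Size in bytes of an uncompressed file with fixed-length records."""
    return header_length + record_length * count_records


# Values from the tcp-data.rwf example: 208 + 52 * 7 = 572
print(expected_file_size(208, 52, 7))
```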
rwfilter(1)
Choose which SiLK Flow records to process
rwfilter [--threads=N] [--plugin=PLUGIN [--plugin=PLUGIN ...]]
[--pass-destination=PASS_PATH] [--fail-destination=FAIL_PATH] [--all-destination=ALL_PATH] [--input-pipe=INPUT_PATH] [--xargs=INPUT_STREAM] [{ --print-statistics | --print-volume-statistics }] [--print-filenames] [--print-missing-filenames] [--dry-run] [--max-pass-records=N] [--max-fail-records=N] [--note-add=TEXT] [--note-file-add=FILE] [--compression-method=COMP_METHOD] [--start-date=YYYY/MM/DD[:HH] [--end-date=YYYY/MM/DD[:HH]]] { [--class=CLASS] [--type={all | TYPE[,TYPE ...]}] | [--flowtype=CLASS/TYPE[,CLASS/TYPE ...]] } [--sensors=SENSOR[,SENSOR ...]] [--data-rootdir=PATH] [--site-config-file=FILENAME] [--stime=DATE_RANGE] [--etime=DATE_RANGE] [--active-time=DATE_RANGE] [--duration=DECIMAL_RANGE] [--sport=INTEGER_LIST] [--dport=INTEGER_LIST] [--aport=INTEGER_LIST] [--protocol=INTEGER_LIST] [--icmp-type=INTEGER_LIST] [--icmp-code=INTEGER_LIST] [--bytes=INTEGER_RANGE] [--packets=INTEGER_RANGE] [--bytes-per-packet=DECIMAL_RANGE] [{--saddress=IP_ADDR_MASK | --not-saddress=IP_ADDR_MASK}] [{--daddress=IP_ADDR_MASK | --not-daddress=IP_ADDR_MASK}] [{--any-address=IP_ADDR_MASK | --not-any-address=IP_ADDR_MASK}] [{--next-hop-id=IP_ADDR_MASK | --not-next-hop-id=IP_ADDR_MASK}] [{--sipset=IP_SET_FILENAME | --not-sipset=IP_SET_FILENAME}] [{--dipset=IP_SET_FILENAME | --not-dipset=IP_SET_FILENAME}] [{--anyset=IP_SET_FILENAME | --not-anyset=IP_SET_FILENAME}] [{--nhipset=IP_SET_FILENAME | --not-nhipset=IP_SET_FILENAME}] [--input-index=INTEGER_LIST] [--output-index=INTEGER_LIST] [--tcp-flags=TCP_FLAGS] [--flags-all=HIGH_MASK_FLAGS_LIST] [--fin-flag=SCALAR] [--syn-flag=SCALAR] [--rst-flag=SCALAR] [--psh-flag=SCALAR] [--ack-flag=SCALAR] [--urg-flag=SCALAR] [--ece-flag=SCALAR] [--cwr-flag=SCALAR] [--flags-initial=HIGH_MASK_FLAGS_LIST] [--flags-session=HIGH_MASK_FLAGS_LIST] [--attributes=ATTRIBUTES_LIST] [--application=INTEGER_LIST] [--ip-version=INTEGER_LIST] [--scc=COUNTRY_CODE_LIST] [--dcc=COUNTRY_CODE_LIST] [--stype=SCALAR] [--dtype=SCALAR] [--ippair-any=FILENAME] 
[--ipport-any=FILENAME] [--tuple-file=TUPLE_FILENAME { [--tuple-fields=FIELDS] [--tuple-direction=DIRECTION] [--tuple-delimiter=CHAR] } ] [--python-expr=PYTHON_EXPR] [--python-file=FILENAME [--python-file=FILENAME ...]] [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...] { [--pmap-src-MAPNAME=LABELS] [--pmap-dst-MAPNAME=LABELS] [--pmap-any-MAPNAME=LABELS] } ]
rwfilter [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
[--plugin=PLUGIN ...] [--python-file=PATH] [--data-rootdir=PATH] [--site-config-file=FILENAME] --help
rwfilter --version
rwfilter serves two purposes: (1) It acts as an interface to the data store to select which SiLK Flow records to process, and (2) it partitions those records into one or more pass and/or fail streams.
The selection switches let one choose records by where the flow was collected (its sensor), the date of collection, and the flow’s direction.
The partitioning switches describe various types of traffic behavior (e.g., TCP traffic, or all traffic going to port 80). rwfilter identifies records matching or violating the behavior(s), and partitions them into appropriate output streams (i.e., files) as specified.
These output streams from rwfilter are always binary. The output must be passed through another tool in the SiLK Tool Suite for further processing to get human-readable output.
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as –arg=param or –arg param, though the first form is required for options that take optional parameters.
At least one of the following output switches must be provided:
PASS_PATH refers to a non-existent file, a named pipe, or stdout. The pass-destination will output records which have passed ALL of the partitioning predicates.
FAIL_PATH refers to a non-existent file, a named pipe, or stdout. The fail-destination will output records which failed ANY of the partitioning predicates.
ALL_PATH refers to a file, a named pipe, or stdout. This output will output all records read by rwfilter.
Prints out the statistics on files read - the number of records which passed, the number which failed and the total read. If a PATH is provided, the statistics will be printed there; otherwise they are printed to the standard error.
An enhanced version of –print-statistics, in that the statistics include the number of records, packets, and bytes that passed and failed the filter.
Print the available options and exit. Options that add fields can be specified before –help so that the new options appear in the output. The available classes and types will be included in output; you may specify a different root directory or site configuration file before –help to see the classes and types available for that site.
Print the version number and information about how SiLK was configured, then exit the application.
Invoke rwfilter with N threads reading the input files. When this switch is not provided, the value in the SILK_RWFILTER_THREADS environment variable is used. If that variable is not set, rwfilter runs with a single thread. Using multiple threads greatly improves the performance of rwfilter for queries that look at many files but return few records. Preliminary testing has found that performance peaks around four threads per CPU, but performance will vary depending on the type of query and the number of records returned.
INPUT_PATH is a named pipe or the string stdin. This refers to another source of rwfilter records. Note that rwfilter will not read from the standard input by default; to get this behavior, you must use --input-pipe=stdin.
Causes rwfilter to read file names from INPUT_PATH; the input should have one file name per line. rwfilter will open each file in turn and read records from it.
Print the names of input files as they are read. This can be useful feedback for a long-running rwfilter process.
Perform a sanity check on the input arguments to check that the arguments are acceptable. In addition, prints to the standard output the names of the files that would be accessed (and the names of missing files if –print-missing is specified). rwfglob(1) can also be used to generate the lists of files that rwfilter will access.
Write N records to each –pass-destination. rwfilter will stop reading input once it has written these N records unless the –fail-destination or –all-destination switches were specified.
Write N records to each –fail-destination. rwfilter will stop reading input once it has written these N records unless the –pass-destination or –all-destination switches were specified.
Add the specified TEXT to the header of the output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.
Open FILENAME and add the contents of that file to the header of the output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.
Set the compression method of the output to COMP_METHOD. Some SiLK tools can use an external library to compress their binary output. The list of available compression methods and the default method are set when SiLK is compiled (the –help and –version switches print the available and default compression methods) and depend on which supported libraries are found. SiLK can support:
Do not compress the output using an external library
Use the zlib(3) library for compressing the output
Use the lzo1x algorithm from the LZO real time compression library for compression
Use whichever available method gives the best compression in general, though not necessarily the best for this particular output.
The following options determine which files are read from the data store to provide the records.
The date predicates indicate which days and hours to consider when creating the list of files. The dates are expressed in YYYY/MM/DD:HH format. For example, 2003/01/18:00 represents the first hour of January 18th, 2003, while 2002/10/01:22 corresponds to 22:00 on October 1st, 2002.
Whether the date strings represent times in GMT or the local timezone depends on how SiLK was compiled. See the output from --help or check the Timezone support setting in the --version output to determine how your version of SiLK was compiled.
When both –start-date and –end-date are specified to hour precision, all hours within that time range are processed.
When –start-date is specified to day precision, the hour specified in –end-date (if any) is ignored, and files for all dates between midnight on start-date and 23:59 on end-date are processed.
When –end-date is not specified and –start-date is specified to day precision, files for that complete day are processed.
When –end-date is not specified and –start-date is specified to hour precision, files for that single hour are processed.
It is an error to specify –end-date without specifying –start-date.
When neither –start-date nor –end-date is given, rwfilter processes all files for the current day.
The –class switch is used to specify a group of data to process. Only a single class may be selected. Classes are defined in the silk.conf(5) site configuration file. If the –class option is not given, the default-class as specified in silk.conf is used. Use the –help option to see the list of available classes and the default class.
The --type predicate further specifies data within the selected CLASS by listing the TYPEs of traffic to process. The switch takes a comma-separated list of types or the keyword all, which specifies all types for the specified CLASS. Types are defined in silk.conf; they typically refer to the direction of the flow, and they may vary by class. Classes typically define default-types to use when the --type switch is not specified. Use the --help option to get the list of available types for each class.
The –flowtype predicate provides an alternate way to specify class/type pairs. The –flowtype switch allows a single rwfilter invocation to process data from multiple classes. The keyword all may be used for the CLASS and/or TYPE to select all classes and/or types.
The –sensor switch is used to select data from specific sensors. The parameter is a comma separated list of sensor names, sensor IDs (integers), and/or ranges of sensor IDs. Sensors are defined in the silk.conf(5) site configuration file, and the mapsid(1) command can be used to print a mapping of sensor names to IDs and classes. When the –sensor switch is not specified, the default is to use all sensors which are valid for the specified class(es).
This option causes rwfilter to use PATH as the root of the data store directory, which overrides the location given in the SILK_DATA_ROOTDIR environment variable, which overrides the location that was compiled into rwfilter. The default data store directory will be shown when the –version option is given.
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the root of the data directory (see –data-rootdir); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application’s directory.
This option prints to the standard error file names that rwfilter’s file selection switches expected to find but did not. This switch is useful for debugging, but the list of files it produces can be misleading. For example, suppose there is a decommissioned sensor that still appears in the silk.conf file to permit retrieval of historical data; these data files will be missing even though their absence is expected. Use the output from this switch judiciously.
rwfilter supports the following partitioning switches, at least one of which must be specified. The switches are AND’ed together; i.e., to pass the filter, the record must pass the test implied by each switch. Any record that does not pass will be sent to the fail-destination(s), if specified.
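The AND semantics of the partitioning switches can be illustrated with a short sketch: a record goes to the pass stream only when every test succeeds, and to the fail stream as soon as any one test fails. The record layout and predicate names below are hypothetical and exist only for illustration.

```python
def partition(records, predicates):
    """Split records into (passed, failed) lists.

    A record passes only if every predicate returns True; failing any
    single predicate sends it to the fail list.
    """
    passed, failed = [], []
    for record in records:
        if all(pred(record) for pred in predicates):
            passed.append(record)
        else:
            failed.append(record)
    return passed, failed


# Hypothetical records; the dictionary keys are illustrative only.
flows = [
    {"protocol": 6, "dport": 80},
    {"protocol": 6, "dport": 22},
    {"protocol": 17, "dport": 80},
]


def is_tcp(record):
    return record["protocol"] == 6


def to_port_80(record):
    return record["dport"] == 80


passed, failed = partition(flows, [is_tcp, to_port_80])
```

Only the first record satisfies both tests; the other two each fail one test and land in the fail list.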
SWITCH PARAMETERS
The forms of the parameters to these partitioning switches are:
DATE_RANGE is a range of two dates, start-range and end-range, each in the form YYYY/MM/DD[:HH[:MM[:SS[.ssssss]]]]. For example, 2003/01/31:23:45:00.000-2003/01/31:23:59:59.999 represents the last fifteen minutes of Jan 31, 2003. The start-range and end-range must be set to at least day precision. For the start-range, unspecified hour, minute, second, and millisecond values are set to 0; for the end-range, those values are set to 23, 59, 59, and 999 respectively. Thus 2003/01/31:23-2003/01/31:23 will become 2003/01/31:23:00:00.000-2003/01/31:23:59:59.999. If an end-range is not given, it is set to the start-range, giving a range of a single millisecond.
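The default-filling rules for DATE_RANGE can be sketched as follows. The function names are hypothetical, the sketch keeps only millisecond precision, and it performs no validation beyond what `datetime` itself enforces.

```python
from datetime import datetime


def _parse_point(text, is_end):
    """Parse YYYY/MM/DD[:HH[:MM[:SS[.sss]]]], filling unspecified fields."""
    date_part, *time_parts = text.split(":")
    year, month, day = (int(p) for p in date_part.split("/"))
    # Defaults: earliest instant for a start, latest instant for an end.
    hour, minute, second, msec = (23, 59, 59, 999) if is_end else (0, 0, 0, 0)
    if len(time_parts) > 0:
        hour = int(time_parts[0])
    if len(time_parts) > 1:
        minute = int(time_parts[1])
    if len(time_parts) > 2:
        whole, _, frac = time_parts[2].partition(".")
        second = int(whole)
        if frac:
            # Keep three fractional digits (milliseconds).
            msec = int(frac[:3].ljust(3, "0"))
    return datetime(year, month, day, hour, minute, second, msec * 1000)


def parse_date_range(text):
    """Return (start, end) datetimes for a DATE_RANGE string."""
    start_text, sep, end_text = text.partition("-")
    start = _parse_point(start_text, is_end=False)
    # With no end-range, the range collapses to the start instant.
    end = _parse_point(end_text, is_end=True) if sep else start
    return start, end
```

With this sketch, `parse_date_range("2003/01/31:23-2003/01/31:23")` spans 23:00:00.000 through 23:59:59.999, while `parse_date_range("2003/01/31:23")` returns the same instant for both ends.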
SCALAR is a single integer; for example 4.
INTEGER_RANGE is a range of two positive integers: MIN-MAX; for example 1-500. If a single value is given, the range consists of that single value. For many options, the upper limit of the range may be omitted, such as 1-, in which case the limit is set to the maximum value.
INTEGER_LIST is a comma separated list of SCALARs and INTEGER_RANGEs; for example, 1,2,3,5-10,99-103.
DECIMAL_RANGE is a range of decimal values with accuracy up to 10^-4 expressed as MIN-MAX; for example, 5.0-10.031. If a single value is given, the range consists of that single value. The upper limit of the range may be omitted, such as 1.5-, in which case the limit is set to the maximum value.
IP_ADDR_MASK is expressed in one of two forms: as a CIDR block (192.168.0.0/16) or as four INTEGER_LISTs joined by dots (.). The character x can be used as an abbreviation for 0-255. For example, 10.10,16-31.x.x represents the following CIDR blocks:
10.10.0.0/16
10.16.0.0/12
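To check how a dotted wildcard specification maps onto CIDR blocks, the expansion can be reproduced with Python's standard `ipaddress` module. This is a sketch for IPv4 only; the function names are hypothetical, and x is accepted only as a whole field. Trailing fields that cover 0-255 are treated as host bits; a wildcard in a leading or middle field is enumerated value by value, which can be slow for very broad patterns.

```python
import ipaddress
from itertools import product


def _octet_values(spec):
    """Expand one dotted field: 'x', a number, a range, or a comma list."""
    if spec == "x":
        return list(range(256))
    values = set()
    for part in spec.split(","):
        low, sep, high = part.partition("-")
        values.update(range(int(low), int(high if sep else low) + 1))
    return sorted(values)


def wildcard_to_cidr(mask):
    """Convert a dotted wildcard IPv4 specification into CIDR blocks."""
    fields = [_octet_values(f) for f in mask.split(".")]
    if len(fields) != 4:
        raise ValueError("expected four dot-separated fields")
    # Trailing fields covering 0-255 become host bits, not enumerated values.
    wild = 0
    while wild < 4 and len(fields[3 - wild]) == 256:
        wild += 1
    prefix_len = 8 * (4 - wild)
    networks = []
    for octets in product(*fields[:4 - wild]):
        padded = list(octets) + [0] * wild
        address = ".".join(str(o) for o in padded)
        networks.append(ipaddress.ip_network(f"{address}/{prefix_len}"))
    # collapse_addresses merges adjacent networks into the fewest blocks.
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


print(wildcard_to_cidr("10.10,16-31.x.x"))
# ['10.10.0.0/16', '10.16.0.0/12']
```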
TCP_FLAGS is any combination of the letters F,S,R,P,A,U,E,C, where F=FIN flag; S=SYN; R=RST; P=PSH; A=ACK; U=URG; E=ECE; C=CWR
HIGH_MASK_FLAGS is a pair of TCP_FLAGS strings separated by a slash (/). Flags to the right of the slash are the mask; any flag not listed in the mask may have any value. Flags to the left of the slash are the expected high flags; they must be set in the flow. Thus, flags listed in mask but not in high must be off for all packets in the flow. It is an error if a flag is listed in high but not in mask. Some examples:
AS/ASFR means ACK,SYN must be high, FIN,RST must be low, and the other flags (PSH, URG, ECE, CWR) may have any value.
A/A means the ACK flag must be set. All other flags may have any value.
/F means the FIN flag must be off. All other flags may have any value.
F/S is an error; use F/FS instead, which means FIN must be high, SYN must be low, and other flags can have any value.
HIGH_MASK_FLAGS_LIST is a comma separated list of HIGH_MASK_FLAGS.
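The high/mask test amounts to a bitwise comparison: mask the flow's flags, then require the result to equal the high set. The sketch below uses the standard TCP header bit positions; the function names are hypothetical, and a list is treated as matching when any one of its pairs matches.

```python
TCP_FLAG_BITS = {"F": 0x01, "S": 0x02, "R": 0x04, "P": 0x08,
                 "A": 0x10, "U": 0x20, "E": 0x40, "C": 0x80}


def flags_to_int(letters):
    """Convert a TCP_FLAGS string such as 'SA' into a bit field."""
    value = 0
    for letter in letters.upper():
        value |= TCP_FLAG_BITS[letter]
    return value


def parse_high_mask(spec):
    """Parse 'HIGH/MASK' and reject a high flag that is missing from the mask."""
    high_text, sep, mask_text = spec.partition("/")
    if not sep:
        raise ValueError("expected HIGH/MASK")
    high, mask = flags_to_int(high_text), flags_to_int(mask_text)
    if high & ~mask:
        raise ValueError(f"{spec}: flag listed in high but not in mask")
    return high, mask


def matches(flow_flags, spec_list):
    """True if the flow's flags satisfy any HIGH/MASK pair in the list."""
    flags = flags_to_int(flow_flags)
    for spec in spec_list.split(","):
        high, mask = parse_high_mask(spec)
        # Flags outside the mask are ignored; flags inside must equal high.
        if (flags & mask) == high:
            return True
    return False
```

For instance, `matches("SAP", "AS/ASFR")` is true because PSH lies outside the mask, while `matches("SAF", "AS/ASFR")` is false because FIN is in the mask but not in the high set.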
IP_SET_FILENAME is the name of a file containing a binary IPset. Binary IPsets are created from rwfilter output with the rwset tool, or from text input with the rwsetbuild(1) tool.
COUNTRY_CODE_LIST is a comma separated list of lowercase two-letter country codes, as well as the following special codes:
--  N/A (e.g. private and experimental reserved addresses)
a1  anonymous proxy
a2  satellite provider
o1  other
An example: cx,uk,kr,jp,--
ATTRIBUTES is any combination of the letters F,T,C, where
F  flow generator saw additional packets in this flow following a packet with a FIN flag (excluding ACK packets)
T  flow generator prematurely created a record for a long-running connection due to a timeout. (When the flow generator yaf(1) is run with the --silk switch, it will prematurely create a flow and mark it with T if the byte count of the flow cannot be stored in a 32-bit value.)
C  flow generator created this flow as a continuation of a long-running connection, where the previous flow for this connection met a timeout
Consider a long-running ssh session that exceeds the flow generator’s active timeout. (This is the active timeout since the flow generator creates a flow for a connection that still has activity). The flow generator will create multiple flow records for this ssh session, each spanning some portion of the total session. The first flow record will be marked with a T indicating that it hit the timeout. The second through next-to-last records will be marked with TC indicating that this flow timed out and that this flow is a continuation of a connection that timed out. The final flow will be marked with a C, indicating that it was created as a continuation of an active flow.
HIGH_MASK_ATTRIBUTES is similar to HIGH_MASK_FLAGS: It is a pair of ATTRIBUTES strings separated by a slash (/). Attributes to the right of the slash are the mask; an attribute not listed in the mask may have any value in the flow. Attributes to the left of the slash are the expected high attributes; they must be set in the flow. Thus, attributes listed in mask but not in high must be off for all packets in the flow. It is an error if an attribute is listed in high but not in mask.
ATTRIBUTES_LIST is a comma separated list of HIGH_MASK_ATTRIBUTES.
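The marking pattern in the ssh example above can be written out as a small function. This is only an illustration of that example; the function name is hypothetical, and real records may carry other attributes (such as F) as well.

```python
def session_attributes(record_count):
    """Timeout attributes for the records of one connection, in time order.

    The first record hit the active timeout (T), the last one continues a
    timed-out flow (C), and every record in between is both (TC).
    """
    if record_count < 1:
        raise ValueError("a connection produces at least one record")
    if record_count == 1:
        return [""]  # the connection never reached the active timeout
    middle = ["TC"] * (record_count - 2)
    return ["T"] + middle + ["C"]


print(session_attributes(4))  # ['T', 'TC', 'TC', 'C']
```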
SWITCHES
The switches are:
Pass the record if its starting time is in this DATE_RANGE.
As –stime for the ending time.
Pass the record if the record was active at ANY time during this DATE_RANGE. If a single time is specified, pass the record if it was active at that instant.
Pass the record if its duration (eTime-sTime) is in this DECIMAL_RANGE. The DECIMAL_RANGE represents the time in seconds; use floating point numbers to specify millisecond ranges.
Pass the record if its source port is in this INTEGER_LIST; possible values are 0-65535.
Pass the record if its destination port is in this INTEGER_LIST; possible values are 0-65535.
Pass the record if its source port and/or its destination port is in this INTEGER_LIST; possible values are 0-65535. For example, use --aport=25 to see all SMTP conversations regardless of where they originated.
Pass the record if its IP Suite Protocol is in this INTEGER_LIST, possible values are 0-255.
Pass the record if its ICMP (or ICMPv6) type is in this INTEGER_LIST; possible values 0-255. This switch will also verify that the flow’s protocol is 1 (or 58 if the flow is IPv6). It is an error to specify a –protocol that does not include 1 and/or 58.
Pass the record if its ICMP (or ICMPv6) code is in this INTEGER_LIST; possible values 0-255. This switch will also verify that the flow’s protocol is 1 (or 58 if the flow is IPv6). It is an error to specify a –protocol that does not include 1 and/or 58.
Pass the record if its byte count is in this INTEGER_RANGE.
Pass the record if its packet count is in this INTEGER_RANGE.
Pass the record if its average bytes per packet count (bytes/packet) is in this DECIMAL_RANGE.
Pass the record if its source IP address is matched by this IP_ADDR_MASK. To match on multiple IPs, use an IPset (see –sipset).
Pass the record if its destination IP address is matched by this IP_ADDR_MASK (see also –dipset).
Pass the record if either its source or its destination IP address is matched by this IP_ADDR_MASK (see also –anyset). Does not consider the next-hop IP address.
Pass the record if its source IP address is not matched by this IP_ADDR_MASK (see also –not-sipset).
Pass the record if its destination IP address is not matched by this IP_ADDR_MASK (see also –not-dipset).
Pass the record if neither its source nor its destination IP address is matched by this IP_ADDR_MASK (see also –not-anyset). Does not consider the next-hop IP address.
Pass the record if its source IP address is in the list of IPs contained in the binary set file IP_SET_FILENAME.
As –sipset for the destination IP address.
Pass the record if either its source IP address or its destination IP address is in the list of IPs contained in the binary set file IP_SET_FILENAME. Does not consider the next-hop IP.
As –sipset for the next-hop IP address.
Pass the record if its source IP address is not in the list of IPs contained in the binary set file IP_SET_FILENAME.
As –not-sipset for the destination IP address.
Pass the record if neither its source IP address nor its destination IP address is in the list of IPs contained in the binary set file IP_SET_FILENAME. Does not consider the next-hop IP.
As –not-sipset for the next-hop IP address.
Pass the record if, for any one of its packets, any of the specified TCP_FLAGS was on.
HIGH_MASK_FLAGS_LIST is a comma separated list of up to 16 HIGH_FLAGS/MASK_FLAGS pairs, where HIGH_FLAGS and MASK_FLAGS are lists of TCP_FLAGS. HIGH_FLAGS must be a subset of MASK_FLAGS. Pass the record if the flags listed in HIGH_FLAGS are set and the flags listed in MASK_FLAGS but not listed in HIGH_FLAGS are not-set. This switch accepts a list of values, so that --flags-all=S/S,A/A will pass flows that have either only-SYN high or only-ACK high.
When set to 0, pass only records where the FIN flag is low; when set to 1, pass only records where the FIN flag is high.
As –fin-flag except for the SYN Flag
As –fin-flag except for the RST Flag
As –fin-flag except for the PSH Flag
As –fin-flag except for the ACK Flag
As –fin-flag except for the URG Flag
As –fin-flag except for the ECE Flag
As –fin-flag except for the CWR Flag
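The HIGH_FLAGS/MASK_FLAGS test described for --flags-all reduces to a bit-mask comparison. The sketch below is an illustration of that rule and not rwfilter's implementation; it assumes the conventional one-letter flag abbreviations (F, S, R, P, A, U, E, C) mapped to the standard TCP header bits, and all function names are hypothetical.

```python
# Standard TCP header bit for each one-letter flag abbreviation.
TCP_FLAG_BITS = {"F": 0x01, "S": 0x02, "R": 0x04, "P": 0x08,
                 "A": 0x10, "U": 0x20, "E": 0x40, "C": 0x80}


def flags_to_bits(letters):
    """Convert a string such as 'SA' into the OR of its flag bits."""
    bits = 0
    for ch in letters.upper():
        bits |= TCP_FLAG_BITS[ch]
    return bits


def parse_high_mask_list(text):
    """Parse 'HIGH/MASK,HIGH/MASK,...' into a list of (high, mask) bit pairs."""
    pairs = []
    for item in text.split(","):
        high, mask = item.split("/")
        high_bits, mask_bits = flags_to_bits(high), flags_to_bits(mask)
        if high_bits & ~mask_bits:
            raise ValueError("HIGH_FLAGS must be a subset of MASK_FLAGS: " + item)
        pairs.append((high_bits, mask_bits))
    return pairs


def flags_all_pass(record_flags, pairs):
    """Pass when, for any pair, the masked flags equal exactly the high flags."""
    return any((record_flags & mask) == high for high, mask in pairs)


def flags_any_pass(record_flags, wanted):
    """Pass when at least one of the wanted flags was on."""
    return (record_flags & wanted) != 0


syn_no_ack = parse_high_mask_list("S/SA")
print(flags_all_pass(flags_to_bits("SF"), syn_no_ack))  # True: SYN set, ACK clear
print(flags_all_pass(flags_to_bits("SA"), syn_no_ack))  # False: ACK is set
```

Flags outside MASK_FLAGS are ignored by the comparison, which is why the FIN bit in the first example does not affect the result.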
This switch provides support for partitioning by arbitrary subsets of the basic five-tuple:
{source-ip,destination-ip,source-port,destination-port,protocol}
A SiLK Flow record will pass the test when the record’s fields match one of the tuples; if the SiLK record does not match any tuple, the record fails. The tuples are read from the text file TUPLE_FILENAME which must contain lines of delimited fields. The default delimiter is |, but may be specified with the –tuple-delimiter switch. Each field contains one member of the tuple; the fields may appear in any order. The fields may represent any subset of the five-tuple, but each line in the file must define the same subset. A field that is present but has no value will generate an error. If you want the field to match any value, it is best that you not include that field in your input.
In addition to the tuple-lines, TUPLE_FILENAME may contain blank lines and comments (which begin with # and continue to the end of the line). The first line of TUPLE_FILENAME may contain a title labeling the fields in the file. This title line will be ignored when the –tuple-fields switch is given.
The IP fields may contain an IPv4 address, an integer, or an IP address in CIDR block notation. Comma-separated lists (80,443) and ranges (0-1023,8080) are supported for the ports and protocol fields. NOTE: Currently the code is not clever in its support for CIDR notation and ranges in that each occurrence is fully expanded. When this occurs, the memory required to hold the search tree will quickly grow.
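To get a feel for how quickly full expansion grows, the sketch below estimates how many individual tuples a single line could turn into. It assumes that a line expands to the cross product of its expanded fields, which is an interpretation of the note above rather than something the manual states; the function names and the sample line are invented for illustration.

```python
import ipaddress


def expanded_count(field):
    """Number of individual values that one tuple field expands to."""
    if "/" in field:                       # CIDR block, e.g. 10.0.0.0/24
        return ipaddress.ip_network(field, strict=False).num_addresses
    total = 0
    for part in field.split(","):          # list such as 0-1023,8080
        if "-" in part:
            lo, hi = (int(x) for x in part.split("-"))
            total += hi - lo + 1
        else:
            total += 1
    return total


def tuples_after_expansion(line, delimiter="|"):
    """Estimated number of tuples after every field on the line is expanded."""
    count = 1
    for field in line.split(delimiter):
        count *= expanded_count(field.strip())
    return count


# A /24 combined with 1025 ports and one protocol.
print(tuples_after_expansion("10.0.0.0/24|0-1023,8080|6"))  # 262400
```

Under this assumption, narrowing the CIDR block or the port range before writing the tuple file is the simplest way to keep the search tree small.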
FIELDS contains the list of fields (columns) to parse from the TUPLE_FILENAME in the order in which they appear in the file. When this switch is not provided, rwfilter will treat the first line in TUPLE_FILENAME as a title line and attempt to determine the fields (a la rwtuc(1)); rwfilter will exit if it cannot determine the fields.
FIELDS is a comma separated list of field-names, field-integers, and ranges of field-integers; a range is specified by separating the start and end of the range with a hyphen (-). Names can be abbreviated to their shortest unique prefix. The field names and their descriptions are:
source IP address
destination IP address
source port
destination port
IP protocol
Allows you to change the comparison between the tuple and the SiLK Flow record. This switch allows one to look for traffic in the reverse direction (or both directions) without having to write all of the rules twice. The available directions are:
The tuple’s fields are compared against the corresponding fields on the flow; that is, sIP is compared with sIP, dIP with dIP, sPort with sPort, dPort with dPort, and protocol with protocol. This is the default.
The tuple’s fields are compared against the opposite fields on the flow; that is, sIP is compared with dIP, dIP with sIP, sPort with dPort, dPort with sPort, and protocol with protocol.
Both of the above comparisons are performed.
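The three comparison modes can be sketched in a few lines. This is an illustration only: the direction labels "forward", "reverse", and "both" are names assumed for this sketch (the names are not shown in the text above), and the Flow type and field names are hypothetical stand-ins for a SiLK Flow record.

```python
from collections import namedtuple

Flow = namedtuple("Flow", "sip dip sport dport proto")

# The flow field that each tuple field is compared against when reversed.
REVERSE = {"sip": "dip", "dip": "sip",
           "sport": "dport", "dport": "sport", "proto": "proto"}


def matches(flow, rule, direction="forward"):
    """rule is a dict holding any subset of the five-tuple fields."""
    forward = all(getattr(flow, name) == value for name, value in rule.items())
    reverse = all(getattr(flow, REVERSE[name]) == value
                  for name, value in rule.items())
    if direction == "forward":
        return forward
    if direction == "reverse":
        return reverse
    if direction == "both":
        return forward or reverse
    raise ValueError("unknown direction: " + direction)


rule = {"dip": "10.0.0.2", "dport": 80}
request = Flow("10.0.0.1", "10.0.0.2", 51234, 80, 6)
reply = Flow("10.0.0.2", "10.0.0.1", 80, 51234, 6)
print(matches(request, rule))             # True
print(matches(reply, rule))               # False
print(matches(reply, rule, "reverse"))    # True
print(matches(reply, rule, "both"))       # True
```

One rule therefore covers both halves of a conversation when both comparisons are enabled.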
Specifies the character separating the input fields. When the switch is not provided, the default of | is used.
Pass the record if the source IP and destination IP (in either order) match one of the IP-pairs listed in the text file FILENAME. Each line of FILENAME should contain two IP addresses separated by whitespace. This switch is equivalent to –tuple-file=FILENAME –tuple-fields=sIP,dIP –tuple-direction=both –tuple-delimiter=’ ’. You cannot use this switch in conjunction with –tuple-file or –ipport-any. This switch is deprecated and it exists for backward compatibility only; it may be removed in a future release.
Pass the record if either the source IP and port pair or the destination IP and port pair are listed in the text file FILENAME. Each line in FILENAME should contain an IP address and port list of interest for that IP separated by whitespace. This switch is equivalent to –tuple-file=FILENAME –tuple-fields=sIP,sPort –tuple-direction=both –tuple-delimiter=’ ’. You cannot use this switch in conjunction with –tuple-file or –ippair-any. This switch is deprecated and it exists for backward compatibility only; it may be removed in a future release.
Augment the partitioning switches by using run-time loading of the plug-in (shared object) whose path is PLUGIN. The switch may be repeated to load multiple plug-ins. The creation of plug-ins is beyond the scope of this manual page; the process is described in Analysts’ Handbook: Using SiLK for Network Traffic Analysis. When multiple Partitioning Switches are given, the code specified by the –plugin switch(es) will be last to be invoked. When PLUGIN contains a slash (/), rwfilter assumes the path to PLUGIN is correct. Otherwise, rwfilter will attempt to find the file in $SILK_PATH/lib/silk, $SILK_PATH/share/lib, $SILK_PATH/lib, and in these directories parallel to the application’s directory: lib/silk, share/lib, and lib. If rwfilter does not find the file, it assumes the plug-in is in the current directory. To force rwfilter to look in the current directory first, specify –plugin=./PLUGIN. When the SILK_PLUGIN_DEBUG environment variable is non-empty, rwfilter prints status messages to the standard error as it tries to open each of its plug-ins.
This switch is deprecated. It is an alias for –plugin.
SiLK can store flows generated by enhanced collection software that provides more information than NetFlow v5. These flows may support some or all of these additional switches; for flows without this additional information, the field’s value is always 0.
As –flags-all, except this switch considers only the initial packet in the flow.
As –flags-all, except this switch ignores the initial packet in the flow.
ATTRIBUTES_LIST is a comma separated list of up to 8 HIGH_ATTRIBUTES/MASK_ATTRIBUTES pairs, where HIGH_ATTRIBUTES and MASK_ATTRIBUTES are strings of the ATTRIBUTE characters F,T,C; see above for a description of these values. HIGH_ATTRIBUTES must be a subset of MASK_ATTRIBUTES. Pass the record if the attributes listed in HIGH_ATTRIBUTES are set and the attributes listed in MASK_ATTRIBUTES but not listed in HIGH_ATTRIBUTES are not set.
Some software that generates flow records from packet data, such as yaf(1), will inspect the contents of the packets that make up a flow and use traffic signatures to label the content of the flow. SiLK calls this label the application; yaf refers to it as the appLabel. The application is the port number that is traditionally used for that type of traffic (see the /etc/services file on most UNIX systems). For example, traffic that the flow generator recognizes as FTP will have a value of 21, even if that traffic is being routed through the standard HTTP/web port (80). The flow generator uses a value of 0 if the application cannot be determined. The –application switch passes the flow if the flow’s application value is in the specified INTEGER_LIST. For example, passing a value of 21 to this switch will find traffic that the flow generation software labeled as FTP regardless of which port the traffic actually used.
Passes the flow if the IP Version is in the specified INTEGER_LIST. INTEGER_LIST can be 4, 6, or 4,6 when SiLK has been compiled with IPv6 support. If SiLK does not have IPv6 support, the only legal value for this switch is 4.
Pass the record if the country code of its source IP address is in the specified COUNTRY_CODE_LIST. This switch requires that the country code mapping file is installed. See ccfilter(3).
As –scc for the destination IP address.
For the following three filter tests, some file formats do not store these values, in which case the value is always 0:
Pass the record if its next hop IP address is matched by this IP_ADDR_MASK.
Pass the record if its next hop IP address is not matched by this IP_ADDR_MASK.
Pass the record if its incoming SNMP interface is in this INTEGER_LIST.
Pass the record if its outgoing SNMP interface is in this INTEGER_LIST.
Additional filtering switches are provided by run-time loading of plug-ins (shared object files or dynamic libraries) when the plug-in is available. rwfilter automatically looks for the following plug-ins:
ADDRESS TYPE (addrtype.so)
When SCALAR is 0, pass the record if its source IP address is non-routable. When 1, pass if internal. When 2, pass if external (i.e., routable but not internal). When 3, pass if not internal (non-routable or external). See addrtype(3).
As –stype for the destination IP address.
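The four SCALAR values form a small decision table. The sketch below encodes it; the numeric category constants are an assumption made for this illustration, and how an address is actually categorized is left to addrtype(3).

```python
# Hypothetical encoding of the three address categories.
NON_ROUTABLE, INTERNAL, EXTERNAL = 0, 1, 2


def address_type_passes(scalar, category):
    """Return True when an address of the given category passes SCALAR."""
    if scalar == 0:
        return category == NON_ROUTABLE
    if scalar == 1:
        return category == INTERNAL
    if scalar == 2:
        return category == EXTERNAL
    if scalar == 3:
        return category != INTERNAL      # non-routable or external
    raise ValueError("SCALAR must be 0, 1, 2, or 3")


print(address_type_passes(3, EXTERNAL))      # True
print(address_type_passes(3, NON_ROUTABLE))  # True
print(address_type_passes(3, INTERNAL))      # False
```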
PREFIX MAP (pmapfilter.so)
When the prefix map plug-in is used, rwfilter reads the mapping file located at PATH. When MAPNAME is provided, it will be used to refer to the switches specific to that prefix map. If MAPNAME is not provided, rwfilter will check the prefix map file to see if a map-name was specified when the file was created. Using multiple –prefix-map switches allows additional prefix map files to be read as long as each uses a unique map-name. The –pmap-file switch(es) must precede all other –pmap-* switches. For more information, see pmapfilter(3).
If the prefix map associated with MAPNAME is an IP prefix map, this matches records with a source IPv4 address that maps to a label contained in the list of labels in LABELS.
If the prefix map associated with MAPNAME is a proto-port prefix map, this matches records with a protocol and source port combination that maps to a label contained in the list of labels in LABELS.
Similar to –pmap-src-MAPNAME, but uses the destination IP or the protocol and destination port.
If the prefix map associated with MAPNAME is an IP prefix map, this matches records with a source IP address or a destination IP address that maps to a label contained in the list of labels in LABELS.
If the prefix map associated with MAPNAME is a port/protocol prefix map, this matches records with a protocol and source port or destination port combination that maps to a label contained in the list of labels in LABELS.
These are deprecated switches created by pmapfilter that correspond to –pmap-src-MAPNAME, –pmap-dst-MAPNAME, and –pmap-any-MAPNAME, respectively. These switches are available when an IP prefix map is used that is not associated with a MAPNAME.
These are deprecated switches created by pmapfilter that correspond to –pmap-src-MAPNAME, –pmap-dst-MAPNAME, and –pmap-any-MAPNAME, respectively. These switches are available when a proto-port prefix map is used that is not associated with a MAPNAME.
PYTHON (silkpython.so)
The SiLK Python plug-in provides support for filtering by expressions or complex functions written in the Python programming language. See the silkpython(3) and pysilk(3) manual pages for information and examples for how to use Python to manipulate SiLK data structures. When multiple Partitioning Switches are given, the Python plug-in will be the next-to-last to be invoked. Only the code specified by the –plugin switch is called after the Python code.
Pass the record if the result of processing the flow with the function named rwfilter() in FILENAME is true. The function should take a single silk.RWRec object as an argument. See silkpython(3) for details.
Pass the record if the result of processing the flow with the specified PYTHON_EXPRESSION is true. The expression is evaluated as if it appeared in the following context:
from silk import *
def rwfilter(rec): return (PYTHON_EXPRESSION)
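The wrapping described above can be imitated in plain Python to test an expression before handing it to rwfilter. This is a sketch under stated assumptions: it omits the `from silk import *` line so that it runs without PySiLK, it uses a stand-in object instead of a silk.RWRec, and the attribute names sport and dport are assumptions chosen for the example.

```python
from types import SimpleNamespace


def make_rwfilter(python_expression):
    """Build a one-argument function equivalent to the context shown above."""
    source = "def rwfilter(rec): return (" + python_expression + ")"
    namespace = {}          # the real context also runs `from silk import *`
    exec(source, namespace)
    return namespace["rwfilter"]


# Stand-in record; a real run would receive a silk.RWRec object instead.
rec = SimpleNamespace(sport=51234, dport=443)
web_filter = make_rwfilter("rec.dport in (80, 443)")
print(web_filter(rec))   # True
```

Because the expression is placed inside `return (...)`, it must be a single Python expression; statements such as assignments belong in a file loaded with --python-file.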
The most basic filtering involves looking at specific traffic over a specific time. For example:
rwfilter --start-date=2003/02/19:00 --end-date=2003/02/19:23 \
--pass=alltcp.rwf --proto=6
will create a file, alltcp.rwf, containing all TCP traffic. This file contains SiLK Flow data in a binary format. To examine the contents, use the command rwcut(1).
Please note that the output file described above could be extremely large.
Once a file is written, rwfilter can filter the file again, for example:
rwfilter --aport=80 alltcp.rwf --pass=allweb.rwf
will generate allweb.rwf. This progressive filtering can also be done at the command line, but the interim files can be examined with rwcut, rwuniq(1) and other tools.
Multiple filters can be chained at the command line using pipes:
rwfilter --start-date=2003/02/19:00 --end-date=2003/02/19:23 \
--proto=6 --pass=stdout | \
rwfilter --input-pipe=stdin --aport=80 --packets=1-5 \
--pass=smallweb.rwf
The number of threads to use while reading input files or files selected from the data store.
This environment variable is used by Python to locate modules. When –python-file or –python-expr is specified, rwfilter loads Python which in turn loads the PySiLK module which is comprised of several files (silk/pysilk_nl.so, silk/__init__.py, etc). If this silk/ directory is located outside Python’s normal search path (for example, in the SiLK installation tree), it may be necessary to set or modify the PYTHONPATH environment variable to include the parent directory of silk/ so that Python can find the PySiLK module. For information on using Python from within rwfilter, see pysilk(3).
When set, Python plug-ins will output traceback information on Python errors to stderr.
This environment variable allows the user to specify the country code mapping file that the –scc and –dcc switches use. The value may be a complete path or a file relative to the SILK_PATH. If the variable is not specified, the code looks for a file named country_codes.pmap in the location specified by SILK_PATH.
This environment variable is used as the value for the –site-config-file when that switch is not provided.
When set, overrides the compiled-in value for the location of the directory tree containing the files of SiLK Flow records collected and stored by the packing system (rwflowpack(8)). In addition, when the –site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwfilter looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwfilter checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share. These directories are also searched when any other configuration file is required (e.g., the country code map). In addition, rwfilter looks for plug-ins in $SILK_PATH/lib/silk, $SILK_PATH/share/lib and $SILK_PATH/lib.
When set to 1, rwfilter prints status messages to the standard error as it tries to open each of its plug-ins.
When set to a non-empty value, rwfilter will treat the value as the path to an external program to execute with information about this rwfilter invocation. If the value in SILK_LOGSTATS does not contain a slash or if it references a file that does not exist, is not a regular file, or is not executable, the SILK_LOGSTATS value is silently ignored. The arguments to the external program are:
The application name, i.e., rwfilter. Note that rwfilter is always used as this argument, regardless of the name of the executable.
The version number of this command line, currently v0001.
The start time of this invocation, as seconds since the UNIX epoch.
The end time of this invocation, as seconds since the UNIX epoch.
The number of data files opened for reading.
The number of records read.
The number of records written.
A variable number of arguments that are the complete command line used to invoke rwfilter, including the name of the executable.
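An external program named by SILK_LOGSTATS has to unpack the arguments listed above in order. The sketch below shows one way such a program might do so in Python. It assumes the times and counts arrive as decimal integers and that the list passed in excludes the external program's own name; the LogStats type and its field names are invented for illustration.

```python
from collections import namedtuple

LogStats = namedtuple(
    "LogStats",
    "app version start end files_read recs_read recs_written command_line")


def parse_logstats_args(args):
    """Unpack the argument list in the order documented above."""
    if len(args) < 8:
        raise ValueError("expected seven fixed arguments plus a command line")
    return LogStats(
        app=args[0],                    # always "rwfilter"
        version=args[1],                # e.g. "v0001"
        start=int(args[2]),             # seconds since the UNIX epoch
        end=int(args[3]),
        files_read=int(args[4]),
        recs_read=int(args[5]),
        recs_written=int(args[6]),
        command_line=list(args[7:]),    # includes the executable name
    )


# Sample values made up for the example.
sample = ["rwfilter", "v0001", "1045612800", "1045612805", "24", "1000", "250",
          "rwfilter", "--proto=6", "--pass=alltcp.rwf"]
stats = parse_logstats_args(sample)
print(stats.end - stats.start)   # 5
print(stats.command_line[1:])    # ['--proto=6', '--pass=alltcp.rwf']
```

In a standalone script the list would typically come from `sys.argv[1:]`.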
If set, this environment variable overrides the value specified in SILK_LOGSTATS.
If the environment variable is set to a non-empty value, rwfilter will print messages to the standard error about the SILK_LOGSTATS value being used and either the reason why the value cannot be used or the arguments to the external program being executed.
rwfilter is the most commonly used application in the suite. It provides access to the data files and performs all the basic queries.
rwfilter supports a variety of I/O options - in addition to reading from the data store, rwfilter results can be chained together with named pipes to output results to multiple files simultaneously. An introduction to named pipes is outside the scope of this document, however.
Two often underused options are –dry-run and –print-statistics.
–dry-run does a sanity check on the input arguments and should be used, especially for complicated arguments, to check that the arguments are acceptable.
–print-statistics used without –pass-destination or –fail-destination simply dumps aggregate statistics to stderr (not stdout) in the following format:
File <#input files> Read <# of recs read> \
Pass <# of recs passing the filter> \
Fail <# of recs failing the filter>
and can be used to do a quick pass through the data to get aggregate counts before going in deeper into the phenomenon being investigated.
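When the statistics line is captured from stderr, the four counts can be pulled out with a regular expression. The sketch below assumes the line has the label/number shape shown above; the exact spacing and punctuation that rwfilter prints may differ, so the pattern is deliberately loose, and the sample line is made up.

```python
import re


def parse_statistics(line):
    """Extract the File/Read/Pass/Fail counts from a statistics line."""
    counts = {}
    for label, number in re.findall(r"\b(Files?|Read|Pass|Fail)\s+(\d+)", line):
        key = "file" if label.startswith("File") else label.lower()
        counts[key] = int(number)
    return counts


sample = "File 3 Read 1000 Pass 250 Fail 750"
stats = parse_statistics(sample)
print(stats)  # {'file': 3, 'read': 1000, 'pass': 250, 'fail': 750}
# Every record read either passes or fails the filter.
print(stats["pass"] + stats["fail"] == stats["read"])  # True
```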
–print-filename can be used as a progress meter; during long jobs, it shows which file is currently being read by the application. –print-filename will not provide meaningful results with piped input.
Filters are applied in the order given on the command line. It is best to apply the biggest filters first.
The switches used to create a filter output file are stored in the file itself. Use the rwfileinfo(1) command to see this information.
rwcount(1), rwcut(1), rwfglob(1), rwfileinfo(1), rwset(1), rwsort(1), rwstats(1), rwtotal(1), rwuniq(1), rwtuc(1), rwsetbuild(1), mapsid(1), addrtype(3), ccfilter(3), pmapfilter(3), pysilk(3), silkpython(3), silk.conf(5), silk(7), rwflowpack(8), yaf(1), zlib(3), Analysts’ Handbook: Using SiLK for Network Traffic Analysis
Create a country code prefix map from a GeoIP data file
unzip -p GeoIPCountryCSV.zip | \
rwgeoip2ccmap --csv-input > country_codes.pmap
gzip -d -c GeoIP.dat.gz | \
rwgeoip2ccmap --encoded-input > country_codes.pmap
Prefix maps provide a way to map field values to string labels based on a user-defined map file. The country code prefix map, typically named country_codes.pmap, is a special prefix map that maps an IP address to a two-letter country code. It uses the country codes defined by the Internet Assigned Numbers Authority (http://www.iana.org/root-whois/index.html).
The country code prefix map is used by the ccfilter(3) plug-in to partition by, count by, sort by, and display the country code in SiLK Flow files. The rwip2cc(1) command can use the map file to display the country code for textual IP addresses.
The country code prefix map is based on the GeoIP Country(R) or free GeoLite database created by MaxMind(R) and available from http://www.maxmind.com/. The GeoLite database is a free evaluation copy that is 98% accurate and is updated monthly. MaxMind sells the GeoIP Country database, which has over 99% accuracy and is updated weekly.
The database comes in two forms:
as a compressed (zip) textual file containing the IP range, country name, and country code in a comma separated value (CSV) form
as a compressed (gzip) binary file containing an encoded form of the IP address range and country code
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as –arg=param or –arg param, though the first form is required for options that take optional parameters.
One of the following switches is required:
Treat the standard input as a textual stream containing the CSV (comma separated value) GeoIP country code data.
Treat the standard input as a binary stream containing the encoded GeoIP country code data.
Obtain your copy of the MaxMind GeoIP Country database, either the comma separated value version or the binary version (GeoIP.dat.gz). To create the country_codes.pmap data file, run
For the CSV version:
$ unzip -p GeoIPCountryCSV.zip | \
rwgeoip2ccmap --csv-input > country_codes.pmap
For the binary data format:
$ gzip -d -c GeoIP.dat.gz | \
rwgeoip2ccmap --encoded-input > country_codes.pmap
Once you have created the country_codes.pmap file, you will need to copy it to $SILK_PATH/share/silk/country_codes.pmap so that the ccfilter plug-in will use it.
ccfilter(3), rwip2cc(1)
Tag similar SiLK records with a common next hop IP value
rwgroup
{--id-fields=KEY | --delta-field=FIELD --delta-value=DELTA} [--objective] [--summarize] [--plugin=PLUGIN] [--rec-threshold=THRESHOLD] [--group-offset=IP] [--note-add=TEXT] [--note-file-add=FILE] [--output-path=PATH] [--copy-input=PATH] [--compression-method=COMP_METHOD] [--site-config-file=FILENAME] [--python-file=PATH ...] [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]] [FILE]
rwgroup [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
[--plugin=PLUGIN ...] [--python-file=PATH ...] --help
rwgroup --version
rwgroup reads sorted SiLK Flow records (c.f. rwsort(1)) from the standard input or from a single file name listed on the command line, marks records that form a group with an identifier in the Next Hop IP field, and prints the binary SiLK Flow records to the standard output. In some ways rwgroup is similar to rwuniq(1), but rwgroup writes SiLK flow records instead of textual output.
Two SiLK records are defined as being in the same group when the fields specified in the –id-fields switch match exactly and when the field listed in the –delta-field switch matches within the value given by the –delta-value switch. Either –id-fields or –delta-field is required; both may be specified. A –delta-value must be given when –delta-field is present.
The records that make up the first group will have the value 0 written into their Next Hop IP field. Each subsequent group will have its Next Hop IP value incremented by 1. The –group-offset switch will change the initial group’s Next Hop IP value.
The –rec-threshold switch may be used to only print groups that contain a certain number of records. The –summarize switch attempts to merge records in the same group to a single output record.
rwgroup requires that the records are sorted on the fields listed in the –id-fields and –delta-field switches. For example, a call using
rwgroup --id-field=2 --delta-field=9 --delta-value=3
should read the output of
rwsort --field=2,9
otherwise the results are unpredictable.
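The grouping rule and the reason for the sort requirement can be seen in a short sketch. This is an illustration of the rule as described above, not rwgroup's implementation: records are plain dictionaries, the keys dip and stime are hypothetical stand-ins for an id-field and a delta-field, and the returned numbers play the role of the value written to the Next Hop IP field.

```python
def assign_groups(records, id_fields, delta_field=None, delta_value=0,
                  group_offset=0):
    """Return a group number for each record; input must already be sorted."""
    groups = []
    group_id = group_offset
    previous = None
    for rec in records:
        if previous is not None:
            same_ids = all(rec[f] == previous[f] for f in id_fields)
            within = (delta_field is None or
                      abs(rec[delta_field] - previous[delta_field]) <= delta_value)
            if not (same_ids and within):
                group_id += 1           # start a new group
        groups.append(group_id)
        previous = rec
    return groups


records = [
    {"dip": "10.0.0.1", "stime": 0},
    {"dip": "10.0.0.1", "stime": 2},
    {"dip": "10.0.0.1", "stime": 10},
    {"dip": "10.0.0.2", "stime": 11},
]
print(assign_groups(records, ["dip"], "stime", 3))   # [0, 0, 1, 2]

# Unsorted input splits records that belong together.
shuffled = [records[0], records[3], records[1]]
print(assign_groups(shuffled, ["dip"], "stime", 3))  # [0, 1, 2]
```

Because only neighbouring records are compared, the first two records end up in different groups once another record is placed between them.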
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as –arg=param or –arg param, though the first form is required for options that take optional parameters.
At least one value for –id-fields or –delta-field must be provided; rwgroup will terminate with an error if no fields are specified.
KEY contains the list of flow attributes (a.k.a. fields or columns) that must match exactly for flows to be considered part of the same group. Each field may be specified once only. KEY is a comma separated list of field-names, field-integers, and ranges of field-integers; a range is specified by separating the start and end of the range with a hyphen (-). Field-names are case insensitive. Example:
--id-fields=stime,10,1-5
There is no default value for the –id-fields switch.
The complete list of built-in fields that the SiLK tool suite supports follows, though note that not all fields are present in all SiLK file formats; when a field is not present, its value is 0.
source IP address
destination IP address
source port for TCP and UDP, or equivalent
destination port for TCP and UDP, or equivalent
IP protocol
packet count
byte count
bit-wise OR of TCP flags over all packets
starting time of flow (seconds resolution)
duration of flow (seconds resolution)
end time of flow (seconds resolution)
name or ID of sensor at the collection point
class of sensor at the collection point
type of sensor at the collection point
the ICMP type and code
Many SiLK file formats do not store the following fields and their values will always be 0; they are listed here for completeness:
router SNMP input interface
router SNMP output interface
SiLK can store flows generated by enhanced collection software that provides more information than NetFlow v5. These flows may support some or all of these additional fields; for flows without this additional information, the field’s value is always 0.
TCP flags on first packet in the flow
bit-wise OR of TCP flags over all packets except the first in the flow
flow attributes set by the flow generator:
flow generator saw additional packets in this flow following a packet with a FIN flag (excluding ACK packets)
flow generator prematurely created a record for a long-running connection due to a timeout. (When the flow generator yaf(1) is run with the –silk switch, it will prematurely create a flow and mark it with T if the byte count of the flow cannot be stored in a 32-bit value.)
flow generator created this flow as a continuation of long-running connection, where the previous flow for this connection met a timeout (or a byte threshold in the case of yaf).
Consider a long-running ssh session that exceeds the flow generator’s active timeout. (This is the active timeout since the flow generator creates a flow for a connection that still has activity). The flow generator will create multiple flow records for this ssh session, each spanning some portion of the total session. The first flow record will be marked with a T indicating that it hit the timeout. The second through next-to-last records will be marked with TC indicating that this flow both timed out and is a continuation of a flow that timed out. The final flow will be marked with a C, indicating that it was created as a continuation of an active flow.
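The marking pattern in the ssh example generalizes to any number of records. The helper below is a hypothetical illustration of that pattern only; it does not model the F attribute or the byte-count threshold.

```python
def session_attributes(record_count):
    """Timeout/continuation marks for a session split into record_count flows."""
    if record_count < 1:
        raise ValueError("a session produces at least one flow record")
    if record_count == 1:
        return [""]                      # never hit the active timeout
    return ["T"] + ["TC"] * (record_count - 2) + ["C"]


print(session_attributes(4))  # ['T', 'TC', 'TC', 'C']
print(session_attributes(2))  # ['T', 'C']
```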
guess as to the content of the flow. Some software that generates flow records from packet data, such as yaf, will inspect the contents of the packets that make up a flow and use traffic signatures to label the content of the flow. SiLK calls this label the application; yaf refers to it as the appLabel. The application is the port number that is traditionally used for that type of traffic (see the /etc/services file on most UNIX systems). For example, traffic that the flow generator recognizes as FTP will have a value of 21, even if that traffic is being routed through the standard HTTP/web port (80).
The list of built-in fields may be augmented by run-time loading of plug-ins (shared object files or dynamic libraries) when the plug-in is available. rwgroup automatically looks for the following plug-ins:
ADDRESS TYPE (addrtype.so)
categorize the source IP address as non-routable, internal, or external and group based on the category. See addrtype(3).
as stype for the destination IP address
COUNTRY CODE (ccfilter.so)
the country code of the source IP address. See ccfilter(3).
as scc for the destination IP
PREFIX MAP (pmapfilter.so)
value determined by passing the source IP or the protocol/source-port to the user-defined mapping defined in the prefix map associated with MAPNAME. See the description of the –pmap-file switch and the pmapfilter(3) manual page.
as src-MAPNAME for the destination IP or protocol/destination-port.
These are deprecated field names created by pmapfilter that correspond to src-MAPNAME and dst-MAPNAME, respectively. These fields are available when a prefix map is used that is not associated with a MAPNAME.
Specify a single field that can differ by a specified delta-value among the SiLK records that make up a group. The FIELD identifiers include most of those specified for –id-fields. The exceptions are that plug-in fields are not supported, nor are fields that do not have numeric values (e.g., class, type, flags). The most common value for this switch is stime, which allows records that are identical in the id-fields but temporally far apart to be in different groups. The switch takes a single argument; multiple delta fields cannot be specified. When this switch is specified, the –delta-value switch is required.
Specify the acceptable difference between the values of the –delta-field. The –delta-value switch is required when the –delta-field switch is provided. For fields other than those holding IPs, when the values on two consecutive records differ by DELTA_VALUE or less, the records are considered members of the same group. When the delta-field refers to an IP field, DELTA_VALUE is the number of least significant bits of the IPs to remove before comparing them. For example, when –delta-field=sIP –delta-value=8 is specified, two records are in the same group if their source IPv4 addresses belong to the same /24 or if their source IPv6 addresses belong to the same /120. The –objective switch affects the meaning of this switch.
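For IP fields, removing the least significant bits amounts to a right shift before comparison. The sketch below shows this with Python's standard ipaddress module; the function name is hypothetical, and treating addresses of different families as never matching is an assumption of the sketch.

```python
import ipaddress


def same_ip_group(ip_a, ip_b, delta_value):
    """Compare two IPs after dropping delta_value least significant bits."""
    a = ipaddress.ip_address(ip_a)
    b = ipaddress.ip_address(ip_b)
    if a.version != b.version:
        return False   # assumption: mixed address families never group
    return (int(a) >> delta_value) == (int(b) >> delta_value)


print(same_ip_group("192.168.1.10", "192.168.1.200", 8))  # True: same /24
print(same_ip_group("192.168.1.10", "192.168.2.10", 8))   # False
print(same_ip_group("2001:db8::1", "2001:db8::ff", 8))    # True: same /120
```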
Change the behavior of the –delta-value switch so that a record is considered part of a group if the value of its –delta-field is within the DELTA_VALUE of the first record in the group. (When this switch is not specified, consecutive records are compared.)
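The effect of --objective is easiest to see on a run of evenly spaced values. The sketch below compares the two modes for a single numeric delta field; it is an illustration of the rule as described, with a made-up function name, and it ignores the id-fields entirely.

```python
def group_numbers(values, delta, objective=False):
    """Group sorted values by delta, against the previous or the first value."""
    groups = []
    group_id = 0
    reference = None
    for value in values:
        if reference is not None and abs(value - reference) > delta:
            group_id += 1
            reference = value            # first record of the new group
        elif reference is None or not objective:
            reference = value            # default: compare consecutive records
        groups.append(group_id)
    return groups


start_times = [0, 3, 6, 9]
print(group_numbers(start_times, 3))                  # [0, 0, 0, 0]
print(group_numbers(start_times, 3, objective=True))  # [0, 0, 1, 1]
```

In the default mode a chain of records that each lie within the delta of their predecessor can stretch a group indefinitely; with --objective the group is limited to the delta measured from its first record.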
Cause rwgroup to print (typically) a single record for each group. By default, all records in each group having at least –rec-threshold members are printed. When –summarize is active, the record that is written for the group is the first record in the group with the following modifications:
The packets and bytes values are the sum of the packets and bytes values, respectively, for all records in the group.
The start-time value is the earliest start time for the records in the group.
The end-time value is the latest end time for the records in the group.
The flags and session-flags values are the bitwise-OR of all flags and session-flags values, respectively, for the records in the group.
Note that multiple records for a group may be printed if the bytes, packets, or elapsed time values are too large to be stored in a SiLK flow record.
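The modifications listed above can be sketched as a merge over the records of one group. This is an illustration only: records are plain dictionaries, the key names are assumptions, session-flags would be combined the same way as flags, and the splitting into multiple records on overflow is not modelled.

```python
def summarize(group):
    """Merge the records of one group into a single summary record."""
    summary = dict(group[0])             # start from the first record
    summary["packets"] = sum(r["packets"] for r in group)
    summary["bytes"] = sum(r["bytes"] for r in group)
    summary["stime"] = min(r["stime"] for r in group)
    summary["etime"] = max(r["etime"] for r in group)
    flags = 0
    for r in group:
        flags |= r["flags"]              # bitwise OR across the group
    summary["flags"] = flags
    return summary


group = [
    {"sip": "10.0.0.1", "packets": 3, "bytes": 180, "stime": 100, "etime": 105, "flags": 0x02},
    {"sip": "10.0.0.1", "packets": 5, "bytes": 900, "stime": 104, "etime": 120, "flags": 0x18},
]
print(summarize(group))
# {'sip': '10.0.0.1', 'packets': 8, 'bytes': 1080, 'stime': 100, 'etime': 120, 'flags': 26}
```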
Augment the list of fields by using run-time loading of the plug-in (shared object) whose path is PLUGIN. The creation of these plug-ins is beyond the scope of this manual page. When PLUGIN contains a slash (/), rwgroup assumes the path to PLUGIN is correct. Otherwise, rwgroup will attempt to find the file in $SILK_PATH/lib/silk, $SILK_PATH/share/lib, $SILK_PATH/lib, and in these directories parallel to the application’s directory: lib/silk, share/lib, and lib. If rwgroup does not find the file, it assumes the plug-in is in the current directory. To force rwgroup to look in the current directory first, specify –plugin=./PLUGIN. When the SILK_PLUGIN_DEBUG environment variable is non-empty, rwgroup prints status messages to the standard error as it tries to open each of its plug-ins.
Specify the minimum number of SiLK records a group must contain before the records in the group are written to the output stream. The default is 1; i.e., write all records. The maximum threshold is 65535.
Specify the value to write into the Next Hop IP for the records that comprise the first group. The value IP may be an integer, or an IPv4 or IPv6 address in the canonical presentation form. If not specified, counting begins at 0. The value for each subsequent group is incremented by 1.
Add the specified TEXT to the header of the output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.
Open FILENAME and add the contents of that file to the header of the output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.
Copy all binary input to the specified file or named pipe. PATH can be stdout to print flows to the standard output as long as the --output-path switch has been used to redirect rwgroup’s output.
Determines where the output of rwgroup is written. If this option is not given, output is written to the standard output.
Set the compression method of the output to COMP_METHOD. Some SiLK tools can use an external library to compress their binary output. The list of available compression methods and the default method are set when SiLK is compiled (the --help and --version switches print the available and default compression methods) and depend on which supported libraries are found. SiLK can support:
Do not compress the output using an external library
Use the zlib(3) library for compressing the output
Use the lzo1x algorithm from the LZO real time compression library for compression
Use whichever available method gives the best compression in general, though not necessarily the best for this particular output.
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK (use the --version switch to view this value); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application’s directory.
Print the available options and exit. Options that add fields can be specified before --help so that the new options appear in the output.
Print the version number and information about how SiLK was configured, then exit the application.
When the prefix map plug-in is used, rwgroup reads the mapping file located at PATH. When MAPNAME is provided, it will be used to refer to the fields specific to that prefix map. If MAPNAME is not provided, rwgroup will check the prefix map file to see if a map-name was specified when the file was created. Using multiple --prefix-map switches allows additional prefix map files to be read as long as each uses a unique map-name. For more information, see pmapfilter(3).
When the SiLK Python plug-in is used, rwgroup reads the Python code from the file PATH to define additional fields that can be used as part of the group key. This file should call register_plugin_field() for each field it wishes to define. For details and examples, see the silkpython(3) and pysilk(3) manual pages.
rwgroup requires sorted data. The application works by comparing records in the order in which they are received (similar to the UNIX uniq(1) command), so unsorted input will produce odd groupings.
As a rule of thumb, the --id-fields and --delta-field parameters should match rwsort(1)’s call, with --delta-field being the last parameter. A call to group all web traffic by queries from the same addresses (field=2) within 10 seconds (field=9) of the first query from that address will be:
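The effect of feeding unsorted data to a tool that only compares neighboring records can be illustrated with Python's itertools.groupby, which, like uniq(1), merges only adjacent equal keys. This is an analogy rather than rwgroup's code; adjacent_groups is a hypothetical helper.

```python
from itertools import groupby


def adjacent_groups(keys):
    """Return (key, count) pairs for each run of adjacent equal keys."""
    return [(key, len(list(run))) for key, run in groupby(keys)]


unsorted_keys = ["10.0.0.1", "10.0.0.2", "10.0.0.1"]

# Unsorted: the two 10.0.0.1 entries are not adjacent, so they land in separate groups.
split_result = adjacent_groups(unsorted_keys)

# Sorted: equal keys become adjacent and form a single group.
merged_result = adjacent_groups(sorted(unsorted_keys))
```

Here split_result holds three groups of one key each, while merged_result holds one group of two for 10.0.0.1 and one group of one for 10.0.0.2.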
rwfilter --proto=6 --dport=80 --pass=stdout | \
    rwsort --field=2,9 | \
    rwgroup --id-field=2 --delta-field=9 --delta-value=10 --objective
This environment variable is used by Python to locate modules. When --python-file is specified, rwgroup loads Python, which in turn loads the PySiLK module, which consists of several files (silk/pysilk_nl.so, silk/__init__.py, etc). If this silk/ directory is located outside Python’s normal search path (for example, in the SiLK installation tree), it may be necessary to set or modify the PYTHONPATH environment variable to include the parent directory of silk/ so that Python can find the PySiLK module.
When set, Python plug-ins will output traceback information on Python errors to the standard error.
This environment variable allows the user to specify the country code mapping file that the ccfilter(3) plug-in will use. The value may be a complete path or a file relative to the SILK_PATH. If the variable is not specified, the code looks for a file named country_codes.pmap in the location specified by SILK_PATH.
This environment variable is used as the value for the --site-config-file when that switch is not provided.
When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwgroup looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwgroup checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share. These directories are also searched when any other configuration file is required (e.g., the country code map). In addition, rwgroup looks for plug-ins in $SILK_PATH/lib/silk, $SILK_PATH/share/lib and $SILK_PATH/lib.
When set to 1, rwgroup prints status messages to the standard error as it tries to open each of its plug-ins.
rwfilter(1), rwfileinfo(1), rwsort(1), rwuniq(1), addrtype(3), ccfilter(3), pmapfilter(3), silkpython(3), pysilk(3), uniq(1), yaf(1), zlib(3)
Invoke rwfilter to find flows matching Snort signatures
rwidsquery --intype=INPUT_TYPE
[--output-file=OUTPUT_FILE] [--start-date=YYYY/MM/DD[:HH] [--end-date=YYYY/MM/DD[:HH]]] [--year=YEAR] [--tolerance=SECONDS] [--config-file=CONFIG_FILE] [--mask=PREDICATE_LIST] [--verbose] [--dry-run] [INPUT_FILE | -] [-- EXTRA_RWFILTER_ARGS...]
rwidsquery --help
rwidsquery --version
rwidsquery facilitates selection of SiLK flow records that correspond to Snort IDS alerts and signatures. rwidsquery takes as input either a Snort alert log or rule file, analyzes the alert or rule contents, and invokes rwfilter(1) with the appropriate arguments to retrieve flow records that match attributes of the input file. rwidsquery will process the Snort rules or alerts from a single file named on the command line; if no file name is given, rwidsquery will attempt to read the Snort rules or alerts from the standard input, unless the standard input is connected to a terminal. An input file name of - or stdin will force rwidsquery to read from the standard input, even when the standard input is a terminal.
In addition to the options listed below, you can pass extra options through to rwfilter(1) on the rwidsquery command line. To do so, place a double-hyphen (--) sequence after all valid rwidsquery options and before all of the options you wish to pass through to rwfilter.
Specify the type of input contained in the input file. This switch is required. Two alert formats and one rule format are currently supported. Valid values for this option are:
Input is a Snort "fast" log file entry. Alerts are written in this format when Snort is configured with the snort_fast output module enabled. snort_fast alerts resemble the following:
Jan 1 01:23:45 hostname snort[1976]: [1:1416:11] ...
Input is a Snort "full" log file entry. Alerts are written in this format when Snort is configured with the snort_full output module enabled. snort_full alerts look like the following example:
[**] [116:151:1] (snort decoder) Bad Traffic ...
Input is a Snort rule (signature). For example:
alert tcp $EXTERNAL_NET any -> $HOME_NET any ...
Specify the output file that flows will be written to. If not specified, the default is to write to stdout. The argument to this option becomes the argument to rwfilter’s --pass switch.
Used in conjunction with rule file input only. The date predicates indicate which time to start and end the search. See the rwfilter(1) manual page for details of the date format.
Used in conjunction with alert file input only. Timestamps in Snort alert files do not contain year information. By default, the current calendar year is used, but this option can be used to override this default behavior.
Used in conjunction with alert file input only. This option is provided to compensate for timing differences between the timestamps in Snort alerts and the start/end time of the corresponding flows. The default --tolerance value is 3600 seconds, which means that flow records +/- one hour from the alert timestamp will be searched.
Used in conjunction with rule file input only. Snort requires a configuration file which, among other things, contains variables that can be used in Snort rule definitions. This option allows you to specify the location of this configuration file so that IP addresses, port numbers, and other information from the snort configuration file can be used to find matching flows.
Exclude the rwfilter predicates named in PREDICATE_LIST from the selection criteria. This option is provided to widen the scope of queries by making them more general than the Snort rule or alert provided. For instance, --mask=dport will return flows with any destination port, not just those which match the input Snort alert or rule.
Print the resulting rwfilter(1) command on stderr prior to invoking it.
Print the resulting rwfilter(1) command on stderr but do not actually run it.
Print the available options and exit.
Print the version number and information about how SiLK was configured, then exit the application.
To find SiLK flows matching a Snort alert in snort_fast format:
$ rwidsquery --intype fast --year 2007 --tolerance 300 alert.fast.txt
For the following Snort alert:
Nov 15 00:00:58 hostname snort[5214]: [1:1416:11]
SNMP broadcast trap [Classification: Attempted Information Leak] [Priority: 2]: {TCP} 192.168.0.1:4161 -> 127.0.0.1:139
The resulting rwfilter(1) command would look similar to:
rwfilter --start-date=2007/11/14:23 --end-date=2007/11/15:00 \
    --stime=2007/11/14:23:55:58-2007/11/15:00:05:58 \
    --saddress=192.168.0.1 --sport=4161 --daddress=127.0.0.1 \
    --dport=139 --protocol=6 --pass=stdout
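The time predicates in the command above can be reproduced with a short Python sketch that adds the year to the alert timestamp and widens it by the tolerance on both sides. The function name alert_time_window is hypothetical, and deriving --start-date and --end-date from the hours of the window edges is an assumption that happens to agree with this example; it is not taken from the rwidsquery source.

```python
from datetime import datetime, timedelta


def alert_time_window(alert_stamp, year, tolerance):
    """Return rwfilter-style time predicates for a Snort alert timestamp.

    alert_stamp is the "Mon DD HH:MM:SS" prefix of the alert, which has no
    year; tolerance is a number of seconds.
    """
    # Prepend the year, because the alert timestamp itself has none.
    when = datetime.strptime(f"{year} {alert_stamp}", "%Y %b %d %H:%M:%S")
    span = timedelta(seconds=tolerance)
    first, last = when - span, when + span
    stamp = "%Y/%m/%d:%H:%M:%S"
    return {
        "start-date": first.strftime("%Y/%m/%d:%H"),
        "end-date": last.strftime("%Y/%m/%d:%H"),
        "stime": first.strftime(stamp) + "-" + last.strftime(stamp),
    }


window = alert_time_window("Nov 15 00:00:58", 2007, 300)
```

For the alert shown earlier with --year 2007 and --tolerance 300, the window runs from 23:55:58 on November 14 to 00:05:58 on November 15, which is why the search has to start in the previous day's 23:00 hour.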
If you want to find flows matching the same criteria, except you want UDP flows instead of TCP flows, use the following syntax:
$ rwidsquery --intype fast --year 2007 --tolerance 300 \
    --mask protocol alert.fast.txt -- --protocol=17
which would yield the following rwfilter command line:
$ rwfilter --start-date=2007/11/14:23 --end-date=2007/11/15:00 \
    --stime=2007/11/14:23:55:58-2007/11/15:00:05:58 \
    --saddress=192.168.0.1 --sport=4161 --daddress=127.0.0.1 \
    --dport=139 --protocol=17 --pass=stdout
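The way a mask and pass-through arguments combine can be sketched as follows. This is a simplified model, not the rwidsquery implementation: build_rwfilter_args is a hypothetical name, the predicates are given as a plain dictionary, and the time predicates are left out for brevity.

```python
def build_rwfilter_args(predicates, mask=(), extra_args=()):
    """Assemble an rwfilter argument list from alert-derived predicates.

    Predicates named in mask are dropped; extra_args are appended unchanged.
    """
    masked = set(mask)
    args = ["rwfilter"]
    for name, value in predicates.items():
        if name not in masked:
            args.append(f"--{name}={value}")
    args.extend(extra_args)  # arguments given after the "--" separator
    args.append("--pass=stdout")
    return args


alert_predicates = {
    "saddress": "192.168.0.1",
    "sport": "4161",
    "daddress": "127.0.0.1",
    "dport": "139",
    "protocol": "6",
}
udp_args = build_rwfilter_args(
    alert_predicates, mask=["protocol"], extra_args=["--protocol=17"]
)
```

Masking protocol removes the --protocol=6 predicate derived from the alert, so the pass-through --protocol=17 is the only protocol restriction left in the resulting command.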
To find SiLK flows matching a Snort rule:
$ rwidsquery --intype rule --start 2008/02/20:00 --end 2008/02/20:02 \
    -c /opt/local/etc/snort/snort.conf -v rule.txt
For the following Snort rule:
alert icmp $EXTERNAL_NET any -> $HOME_NET any
(msg:"ICMP Parameter Problem Bad Length"; icode:2; itype:12; classtype:misc-activity; sid:425; rev:6;)
The resulting rwfilter(1) command would look similar to:
rwfilter --start-date=2008/02/20:00 --end-date=2008/02/20:02 \
    --stime=2008/02/20:00-2008/02/20:02 \
    --sipset=/tmp/tmpeKIPn2.set --icmp-code=2 --icmp-type=12 \
    --pass=stdout
snort(8), rwfilter(1)
Maps IP addresses to country codes
rwip2cc { --address=IP_ADDRESS | --input-file=FILE }
[--map-file=PMAP_FILE] [--print-ips={0,1}] [{--integer-ips | --zero-pad-ips}] [--no-columns] [--column-separator=CHAR] [--no-final-delimiter] [{--delimited | --delimited=CHAR}] [--output-path=PATH] [--pager=PAGER_PROG]