CERT NetSA Security Suite 
Open Source Tools for Network Monitoring 
SiLK 2.1.0 | YAF 1.0.0.2 | IPA 0.4.0 | fixbuf 0.8.0 | Portal 0.9.0 | RAVE 1.9.16 | iSiLK 0.1.6


NAME

rwfilter - Choose which SiLK Flow records to process


SYNOPSIS

  rwfilter [--threads=N] [--plugin=PLUGIN [--plugin=PLUGIN ...]]
        [--pass-destination=PASS_PATH]
        [--fail-destination=FAIL_PATH] [--all-destination=ALL_PATH]
        [--input-pipe=INPUT_PATH] [--xargs=INPUT_STREAM]
        [{ --print-statistics | --print-volume-statistics }]
        [--print-filenames] [--print-missing-files]
        [--dry-run] [--max-pass-records=N] [--max-fail-records=N]
        [--note-add=TEXT] [--note-file-add=FILE]
        [--compression-method=COMP_METHOD]
        [--start-date=YYYY/MM/DD[:HH] [--end-date=YYYY/MM/DD[:HH]]]
        { [--class=CLASS] [--type={all | TYPE[,TYPE ...]}]
         | [--flowtypes=CLASS/TYPE[,CLASS/TYPE ...]] }
        [--sensors=SENSOR[,SENSOR ...]]
        [--data-rootdir=PATH] [--site-config-file=FILENAME]
        [--stime=DATE_RANGE] [--etime=DATE_RANGE]
        [--active-time=DATE_RANGE] [--duration=DECIMAL_RANGE]
        [--sport=INTEGER_LIST] [--dport=INTEGER_LIST]
        [--aport=INTEGER_LIST] [--protocol=INTEGER_LIST]
        [--icmp-type=INTEGER_LIST] [--icmp-code=INTEGER_LIST]
        [--bytes=INTEGER_RANGE] [--packets=INTEGER_RANGE]
        [--bytes-per-packet=DECIMAL_RANGE]
        [{--saddress=IP_ADDR_MASK | --not-saddress=IP_ADDR_MASK}]
        [{--daddress=IP_ADDR_MASK | --not-daddress=IP_ADDR_MASK}]
        [{--any-address=IP_ADDR_MASK | --not-any-address=IP_ADDR_MASK}]
        [{--next-hop-id=IP_ADDR_MASK | --not-next-hop-id=IP_ADDR_MASK}]
        [{--sipset=IP_SET_FILENAME | --not-sipset=IP_SET_FILENAME}]
        [{--dipset=IP_SET_FILENAME | --not-dipset=IP_SET_FILENAME}]
        [{--anyset=IP_SET_FILENAME | --not-anyset=IP_SET_FILENAME}]
        [{--nhipset=IP_SET_FILENAME | --not-nhipset=IP_SET_FILENAME}]
        [--input-index=INTEGER_LIST] [--output-index=INTEGER_LIST]
        [--tcp-flags=TCP_FLAGS] [--flags-all=HIGH_MASK_FLAGS_LIST]
        [--fin-flag=SCALAR] [--syn-flag=SCALAR] [--rst-flag=SCALAR]
        [--psh-flag=SCALAR] [--ack-flag=SCALAR] [--urg-flag=SCALAR]
        [--ece-flag=SCALAR] [--cwr-flag=SCALAR]
        [--flags-initial=HIGH_MASK_FLAGS_LIST]
        [--flags-session=HIGH_MASK_FLAGS_LIST]
        [--attributes=ATTRIBUTES_LIST] [--application=INTEGER_LIST]
        [--ip-version=INTEGER_LIST]
        [--scc=COUNTRY_CODE_LIST] [--dcc=COUNTRY_CODE_LIST]
        [--stype=SCALAR] [--dtype=SCALAR]
        [--ippair-any=FILENAME] [--ipport-any=FILENAME]
        [--tuple-file=TUPLE_FILENAME { [--tuple-fields=FIELDS]
                                       [--tuple-direction=DIRECTION]
                                       [--tuple-delimiter=CHAR] } ]
        [--python-expr=PYTHON_EXPR]
        [--python-file=FILENAME [--python-file=FILENAME ...]]
        [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]
         { [--pmap-src-MAPNAME=LABELS] [--pmap-dst-MAPNAME=LABELS]
           [--pmap-any-MAPNAME=LABELS] } ]
  rwfilter [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
        [--plugin=PLUGIN ...] [--python-file=PATH]
        [--data-rootdir=PATH] [--site-config-file=FILENAME]
        --help
  rwfilter --version


DESCRIPTION

rwfilter serves two purposes: (1) It acts as an interface to the data store to select which SiLK Flow records to process, and (2) it partitions those records into one or more pass and/or fail streams.

The selection switches let one choose records by where the flow was collected (its sensor), the date of collection, and the flow's direction.

The partitioning switches describe various types of traffic behavior (e.g., TCP traffic, or all traffic going to port 80). rwfilter identifies records matching or violating the behavior(s), and partitions them into appropriate output streams (i.e., files) as specified.

These output streams from rwfilter are always binary. The output must be passed through another tool in the SiLK Tool Suite for further processing to get human-readable output.


OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
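
The abbreviation rule can be modeled with a short Python sketch. The function name resolve_option and the sample option list are illustrative assumptions; this is a model of the rule as stated, not a description of how rwfilter parses its command line.

```python
def resolve_option(name, options):
    """Return the full option name matching `name`, or raise ValueError.

    An exact match always wins; otherwise `name` must be a prefix of
    exactly one known option.
    """
    if name in options:
        return name
    matches = [opt for opt in options if opt.startswith(name)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown option: {name}")
    raise ValueError(f"ambiguous option {name}: {', '.join(sorted(matches))}")


# A small subset of the switches listed in the SYNOPSIS (hypothetical sample).
OPTIONS = ["pass-destination", "packets", "protocol", "print-statistics",
           "print-filenames", "bytes", "bytes-per-packet"]

print(resolve_option("pass", OPTIONS))   # unique prefix of pass-destination
print(resolve_option("bytes", OPTIONS))  # exact match, although also a prefix
```

In this model, "print" is rejected as ambiguous because it is a prefix of both print-statistics and print-filenames.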

Output Switches

At least one of the following output switches must be provided:

--pass-destination=PASS_PATH

PASS_PATH refers to a non-existent file, a named pipe, or stdout. The pass-destination will output records which have passed ALL of the partitioning predicates.

--fail-destination=FAIL_PATH

FAIL_PATH refers to a non-existent file, a named pipe, or stdout. The fail-destination will output records which failed ANY of the partitioning predicates.

--all-destination=ALL_PATH

ALL_PATH refers to a file, a named pipe, or stdout. This destination receives every record read by rwfilter.

--print-statistics
--print-statistics=PATH

Print statistics on the files read: the number of records that passed, the number that failed, and the total read. If a PATH is provided, the statistics are printed there; otherwise they are printed to the standard error.

--print-volume-statistics
--print-volume-statistics=PATH

An enhanced version of --print-statistics, in that the statistics include the number of records, packets, and bytes that passed and failed the filter.

--help

Print the available options and exit. Options that add fields can be specified before --help so that the new options appear in the output. The available classes and types will be included in output; you may specify a different root directory or site configuration file before --help to see the classes and types available for that site.

--version

Print the version number and information about how SiLK was configured, then exit the application.

Additional Switches

--threads=N

Invoke rwfilter with N threads reading the input files. When this switch is not provided, the value in the SILK_RWFILTER_THREADS environment variable is used. If that variable is not set, rwfilter runs with a single thread. With multiple threads, rwfilter performs much better on queries that look at many files but return few records. Preliminary testing has found that performance peaks around four threads per CPU, but performance will vary depending on the type of query and the number of records returned.

--input-pipe=INPUT_PATH

INPUT_PATH is a named pipe or the string stdin; it names another source of SiLK Flow records. Note that rwfilter will not read from the standard input by default; to get this behavior, you must use --input-pipe=stdin.

--xargs=INPUT_PATH

Causes rwfilter to read file names from INPUT_PATH; the input should have one file name per line. rwfilter will open each file in turn and read records from it.

--print-filenames

Print the names of input files as they are read. This can be useful feedback for a long-running rwfilter process.

--dry-run

Perform a sanity check on the input arguments. In addition, print to the standard output the names of the files that would be accessed (and the names of missing files if --print-missing-files is specified). rwfglob(1) can also be used to generate the lists of files that rwfilter will access.

--max-pass-records=N

Write N records to each --pass-destination. rwfilter will stop reading input once it has written these N records unless the --fail-destination or --all-destination switches were specified.

--max-fail-records=N

Write N records to each --fail-destination. rwfilter will stop reading input once it has written these N records unless the --pass-destination or --all-destination switches were specified.

--note-add=TEXT

Add the specified TEXT to the header of the output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.

--note-file-add=FILENAME

Open FILENAME and add the contents of that file to the header of the output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.

--compression-method=COMP_METHOD

Set the compression method of the output to COMP_METHOD. Some SiLK tools can use an external library to compress their binary output. The list of available compression methods and the default method are set when SiLK is compiled (the --help and --version switches print the available and default compression methods) and depend on which supported libraries are found. SiLK can support:

none

Do not compress the output using an external library.

zlib

Use the zlib(3) library for compressing the output.

lzo1x

Use the lzo1x algorithm from the LZO real time compression library for compression.

best

Use whichever available method gives the best compression in general, though not necessarily the best for this particular output.

File Selection Options

The following options determine which files are read from the data store to provide the records.

--start-date=YYYY/MM/DD[:HH]
--end-date=YYYY/MM/DD[:HH]

The date predicates indicate which days and hours to consider when creating the list of files. The dates are expressed in YYYY/MM/DD:HH format. For example, 2003/01/18:00 represents the first hour of January 18th, 2003, while 2002/10/01:22 corresponds to 22:00 on October 1st, 2002.

Whether the date strings represent times in GMT or the local timezone depends on how SiLK was compiled. See the output from --help or check the Timezone support setting in the --version output to determine how your version of SiLK was compiled.

When both --start-date and --end-date are specified to hour precision, all hours within that time range are processed.

When --start-date is specified to day precision, the hour specified in --end-date (if any) is ignored, and files for all dates between midnight on start-date and 23:59 on end-date are processed.

When --end-date is not specified and --start-date is specified to day precision, files for that complete day are processed.

When --end-date is not specified and --start-date is specified to hour precision, files for that single hour are processed.

It is an error to specify --end-date without specifying --start-date.

When neither --start-date nor --end-date is given, rwfilter processes all files for the current day.
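
The date rules above can be summarized in a small Python sketch that computes the first and last hour selected. The function names parse_date and selected_hours are hypothetical, the today parameter exists only to make the "current day" case reproducible, and the sketch rejects the one combination the rules do not describe (an hour-precision start with a day-precision end). It models the text and is not taken from the rwfilter source.

```python
from datetime import datetime, timedelta


def parse_date(text):
    """Parse YYYY/MM/DD[:HH]; return (datetime, has_hour)."""
    if ":" in text:
        return datetime.strptime(text, "%Y/%m/%d:%H"), True
    return datetime.strptime(text, "%Y/%m/%d"), False


def selected_hours(start_date=None, end_date=None, today=None):
    """Return (first_hour, last_hour), inclusive, following the rules above."""
    if start_date is None:
        if end_date is not None:
            raise ValueError("--end-date given without --start-date")
        # Neither switch given: all hours of the current day.
        now = today or datetime.now()
        first = now.replace(hour=0, minute=0, second=0, microsecond=0)
        return first, first + timedelta(hours=23)
    start, start_has_hour = parse_date(start_date)
    if end_date is None:
        # A single hour, or the complete day, depending on precision.
        last = start if start_has_hour else start + timedelta(hours=23)
        return start, last
    end, end_has_hour = parse_date(end_date)
    if not start_has_hour:
        # Day-precision start: any hour in --end-date is ignored.
        end = end.replace(hour=23)
    elif not end_has_hour:
        # Assumption of this sketch: the text does not cover this case.
        raise ValueError("hour-precision start with day-precision end")
    return start, end


print(selected_hours("2003/01/18", "2003/01/19:05"))
```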

--class=CLASS

The --class switch is used to specify a group of data to process. Only a single class may be selected. Classes are defined in the silk.conf(5) site configuration file. If the --class option is not given, the default-class as specified in silk.conf is used. Use the --help option to see the list of available classes and the default class.

--type={all | TYPE[,TYPE]}

The --type predicate further specifies data within the selected CLASS by listing the TYPEs of traffic to process. The switch takes a comma-separated list of types or the keyword all, which specifies all types for the specified CLASS. Types are defined in silk.conf; they typically refer to the direction of the flow, and they may vary by class. Classes typically define default-types to use when the --type switch is not specified. Use the --help option to get the list of available types for each class.

--flowtypes=CLASS/TYPE[,CLASS/TYPE ...]

The --flowtypes predicate provides an alternate way to specify class/type pairs. The --flowtypes switch allows a single rwfilter invocation to process data from multiple classes. The keyword all may be used for the CLASS and/or TYPE to select all classes and/or types.

--sensors=SENSOR[,SENSOR ...]

The --sensors switch is used to select data from specific sensors. The parameter is a comma-separated list of sensor names, sensor IDs (integers), and/or ranges of sensor IDs. Sensors are defined in the silk.conf(5) site configuration file, and the mapsid(1) command can be used to print a mapping of sensor names to IDs and classes. When the --sensors switch is not specified, the default is to use all sensors which are valid for the specified class(es).

--data-rootdir=PATH

Use PATH as the root of the data store directory. This switch overrides the location given in the SILK_DATA_ROOTDIR environment variable, which in turn overrides the location that was compiled into rwfilter. The default data store directory will be shown when the --version option is given.

--site-config-file=FILENAME

Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the root of the data directory (see --data-rootdir); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application's directory.

--print-missing-files

This option prints to the standard error file names that rwfilter's file selection switches expected to find but did not. This switch is useful for debugging, but the list of files it produces can be misleading. For example, suppose there is a decommissioned sensor that still appears in the silk.conf file to permit retrieval of historical data; these data files will be missing even though their absence is expected. Use the output from this switch judiciously.

Partitioning Switches

rwfilter supports the following partitioning switches, at least one of which must be specified. The switches are AND'ed together; i.e., to pass the filter, the record must pass the test implied by each switch. Any record that does not pass will be sent to the fail-destination(s), if specified.
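
As an illustration of how AND'ed tests divide records among the pass, fail, and all destinations, here is a minimal Python model. The partition function and the dictionary-based flow records are hypothetical stand-ins; real SiLK Flow records are binary, and rwfilter's implementation is not shown here.

```python
def partition(records, predicates):
    """Split records into (passed, failed, everything) lists.

    A record passes only if every predicate is true for it; it fails as
    soon as any predicate is false. Every record read goes to `everything`.
    """
    passed, failed, everything = [], [], []
    for rec in records:
        everything.append(rec)
        if all(pred(rec) for pred in predicates):
            passed.append(rec)
        else:
            failed.append(rec)
    return passed, failed, everything


# Hypothetical flow records represented as dictionaries.
flows = [
    {"protocol": 6, "dport": 80},
    {"protocol": 6, "dport": 22},
    {"protocol": 17, "dport": 53},
]
# Similar in spirit to combining --protocol=6 with --dport=80.
tests = [lambda r: r["protocol"] == 6, lambda r: r["dport"] == 80]

passed, failed, everything = partition(flows, tests)
print(len(passed), len(failed), len(everything))
```

Only the first record satisfies both tests; the TCP flow to port 22 fails on the port test alone, which is enough to send it to the fail list.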

SWITCH PARAMETERS

The forms of the parameters to these partitioning switches are:

SWITCHES

The switches are:

--stime=DATE_RANGE

Pass the record if its starting time is in this DATE_RANGE.

--etime=DATE_RANGE

As --stime for the ending time.

--active-time=DATE_RANGE

Pass the record if the record was active at ANY time during this DATE_RANGE. If a single time is specified, pass the record if it was active at that instant.
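
The --active-time test amounts to an interval-overlap check, which the following Python sketch models. The function name active_during is hypothetical, and treating both ends of the range as inclusive is an assumption of this sketch rather than something the text states.

```python
from datetime import datetime


def active_during(stime, etime, range_start, range_end=None):
    """Return True if the flow [stime, etime] overlaps the given range.

    When only one time is given, the range collapses to that instant.
    Endpoints are treated as inclusive (an assumption of this sketch).
    """
    if range_end is None:
        range_end = range_start
    return stime <= range_end and etime >= range_start


# A flow that started before the range and ended inside it is still active
# during the range, although its start time alone is outside the range.
flow_start = datetime(2003, 2, 19, 9, 58)
flow_end = datetime(2003, 2, 19, 10, 2)
print(active_during(flow_start, flow_end,
                    datetime(2003, 2, 19, 10, 0), datetime(2003, 2, 19, 11, 0)))
```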

--duration=DECIMAL_RANGE

Pass the record if its duration (eTime-sTime) is in this DECIMAL_RANGE. The DECIMAL_RANGE represents the time in seconds; use floating point numbers to specify millisecond ranges.

--sport=INTEGER_LIST

Pass the record if its source port is in this INTEGER_LIST; possible values are 0-65535.

--dport=INTEGER_LIST

Pass the record if its destination port is in this INTEGER_LIST; possible values are 0-65535.

--aport=INTEGER_LIST

Pass the record if its source port and/or its destination port is in this INTEGER_LIST; possible values are 0-65535. For example, use --aport=25 to see all SMTP conversations regardless of where they originated.

--protocol=INTEGER_LIST

Pass the record if its IP Suite Protocol is in this INTEGER_LIST; possible values are 0-255.

--icmp-type=INTEGER_LIST

Pass the record if its ICMP (or ICMPv6) type is in this INTEGER_LIST; possible values are 0-255. This switch will also verify that the flow's protocol is 1 (or 58 if the flow is IPv6). It is an error to specify a --protocol that does not include 1 and/or 58.

--icmp-code=INTEGER_LIST

Pass the record if its ICMP (or ICMPv6) code is in this INTEGER_LIST; possible values are 0-255. This switch will also verify that the flow's protocol is 1 (or 58 if the flow is IPv6). It is an error to specify a --protocol that does not include 1 and/or 58.

--bytes=INTEGER_RANGE

Pass the record if its byte count is in this INTEGER_RANGE.

--packets=INTEGER_RANGE

Pass the record if its packet count is in this INTEGER_RANGE.

--bytes-per-packet=DECIMAL_RANGE

Pass the record if its average bytes per packet count (bytes/packet) is in this DECIMAL_RANGE.

--saddress=IP_ADDR_MASK

Pass the record if its source IP address is matched by this IP_ADDR_MASK. To match on multiple IPs, use an IPset (see --sipset).

--daddress=IP_ADDR_MASK

Pass the record if its destination IP address is matched by this IP_ADDR_MASK (see also --dipset).

--any-address=IP_ADDR_MASK

Pass the record if either its source or its destination IP address is matched by this IP_ADDR_MASK (see also --anyset). Does not consider the next-hop IP address.

--not-saddress=IP_ADDR_MASK

Pass the record if its source IP address is not matched by this IP_ADDR_MASK (see also --not-sipset).

--not-daddress=IP_ADDR_MASK

Pass the record if its destination IP address is not matched by this IP_ADDR_MASK (see also --not-dipset).

--not-any-address=IP_ADDR_MASK

Pass the record if neither its source nor its destination IP address is matched by this IP_ADDR_MASK (see also --not-anyset). Does not consider the next-hop IP address.

--sipset=IP_SET_FILENAME

Pass the record if its source IP address is in the list of IPs contained in the binary set file IP_SET_FILENAME.

--dipset=IP_SET_FILENAME

As --sipset for the destination IP address.

--anyset=IP_SET_FILENAME

Pass the record if either its source IP address or its destination IP address is in the list of IPs contained in the binary set file IP_SET_FILENAME. Does not consider the next-hop IP.

--nhipset=IP_SET_FILENAME

As --sipset for the next-hop IP address.

--not-sipset=IP_SET_FILENAME

Pass the record if its source IP address is not in the list of IPs contained in the binary set file IP_SET_FILENAME.

--not-dipset=IP_SET_FILENAME

As --not-sipset for the destination IP address.

--not-anyset=IP_SET_FILENAME

Pass the record if neither its source IP address nor its destination IP address is in the list of IPs contained in the binary set file IP_SET_FILENAME. Does not consider the next-hop IP.

--not-nhipset=IP_SET_FILENAME

As --not-sipset for the next-hop IP address.

--tcp-flags=TCP_FLAGS

Pass the record if, for any one of its packets, any of the specified TCP_FLAGS was on.

--flags-all=HIGH_MASK_FLAGS_LIST

HIGH_MASK_FLAGS_LIST is a comma-separated list of up to 16 HIGH_FLAGS/MASK_FLAGS pairs, where HIGH_FLAGS and MASK_FLAGS are lists of TCP_FLAGS. HIGH_FLAGS must be a subset of MASK_FLAGS. Pass the record if the flags listed in HIGH_FLAGS are set and the flags listed in MASK_FLAGS but not listed in HIGH_FLAGS are not set. This switch accepts a list of pairs, and a record passes if it satisfies any one of them, so that --flags-all=S/S,A/A will pass flows that have SYN high or ACK high; flags outside a pair's MASK_FLAGS are not examined.
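
The HIGH_FLAGS/MASK_FLAGS test can be modeled as a bit-mask comparison. In the Python sketch below, the function names are hypothetical, the flag letters F, S, R, P, A, U, E, C are assumed to denote FIN, SYN, RST, PSH, ACK, URG, ECE, and CWR, and the bit values are the standard positions of those flags in the TCP header. The sketch illustrates the rule as described and is not rwfilter's own code.

```python
# Standard TCP header bit values; the single-letter names are an assumption.
TCP_FLAG_BITS = {"F": 0x01, "S": 0x02, "R": 0x04, "P": 0x08,
                 "A": 0x10, "U": 0x20, "E": 0x40, "C": 0x80}


def flags_to_bits(letters):
    """Convert a string such as 'SA' to its combined bit value."""
    bits = 0
    for ch in letters.upper():
        bits |= TCP_FLAG_BITS[ch]
    return bits


def parse_high_mask_list(text):
    """Parse 'HIGH/MASK[,HIGH/MASK ...]' into a list of (high, mask) pairs."""
    pairs = []
    for item in text.split(","):
        high_text, mask_text = item.split("/")
        high, mask = flags_to_bits(high_text), flags_to_bits(mask_text)
        if high & ~mask:
            raise ValueError(f"HIGH_FLAGS is not a subset of MASK_FLAGS: {item}")
        pairs.append((high, mask))
    if len(pairs) > 16:
        raise ValueError("at most 16 HIGH/MASK pairs are allowed")
    return pairs


def flags_all_passes(record_flags, text):
    """True if the record's flags satisfy at least one HIGH/MASK pair."""
    return any((record_flags & mask) == high
               for high, mask in parse_high_mask_list(text))


syn_ack = flags_to_bits("SA")
print(flags_all_passes(syn_ack, "S/S"))   # SYN is high; ACK is not examined
print(flags_all_passes(syn_ack, "S/SA"))  # requires SYN high and ACK low
```

Under this model, S/SA is the pair to use when ACK must be low as well as SYN high, since only flags named in MASK_FLAGS are examined.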

--fin-flag=SCALAR

When set to 0, pass only records where the FIN flag is low. When set to 1, pass only records where the FIN flag is high.

--syn-flag=SCALAR

As --fin-flag, except for the SYN flag.

--rst-flag=SCALAR

As --fin-flag, except for the RST flag.

--psh-flag=SCALAR

As --fin-flag, except for the PSH flag.

--ack-flag=SCALAR

As --fin-flag, except for the ACK flag.

--urg-flag=SCALAR

As --fin-flag, except for the URG flag.

--ece-flag=SCALAR

As --fin-flag, except for the ECE flag.

--cwr-flag=SCALAR

As --fin-flag, except for the CWR flag.

--tuple-file=TUPLE_FILENAME

This switch provides support for partitioning by arbitrary subsets of the basic five-tuple:

 {source-ip,destination-ip,source-port,destination-port,protocol}

A SiLK Flow record will pass the test when the record's fields match one of the tuples; if the SiLK record does not match any tuple, the record fails. The tuples are read from the text file TUPLE_FILENAME which must contain lines of delimited fields. The default delimiter is |, but may be specified with the --tuple-delimiter switch. Each field contains one member of the tuple; the fields may appear in any order. The fields may represent any subset of the five-tuple, but each line in the file must define the same subset. A field that is present but has no value will generate an error. If you want the field to match any value, it is best that you not include that field in your input.

In addition to the tuple-lines, TUPLE_FILENAME may contain blank lines and comments (which begin with # and continue to the end of the line). The first line of TUPLE_FILENAME may contain a title labeling the fields in the file. This title line will be ignored when the --tuple-fields switch is given.

The IP fields may contain an IPv4 address, an integer, or an IP in CIDR block notation. Comma-separated lists (80,443) and ranges (0-1023,8080) are supported for the ports and protocol fields. NOTE: Currently the code is not clever in its support for CIDR notation and ranges in that each occurrence is fully expanded. When this occurs, the memory required to hold the search tree will quickly grow.
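
To make the file format concrete, the following Python sketch reads tuple lines and expands a port or protocol list. The function names and the sample data are hypothetical, and the sketch assumes the file has no title line and uses the default | delimiter; it shows the format described above and does not reproduce rwfilter's parser.

```python
def expand_integer_list(text):
    """Expand '80,443' or '0-3,8080' into a sorted list of integers."""
    values = set()
    for part in text.split(","):
        if "-" in part:
            low, high = part.split("-")
            values.update(range(int(low), int(high) + 1))
        else:
            values.add(int(part))
    return sorted(values)


def read_tuple_lines(lines, delimiter="|"):
    """Yield lists of field strings, skipping comments and blank lines."""
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comment and whitespace
        if not line:
            continue
        fields = [field.strip() for field in line.split(delimiter)]
        if any(field == "" for field in fields):
            # A field that is present but empty is an error.
            raise ValueError(f"empty field in line: {line!r}")
        yield fields


# Hypothetical contents for use with --tuple-fields=sIP,dPort,protocol
sample = [
    "# web servers",
    "10.1.2.3|80,443|6",
    "",
    "10.1.2.4|8000-8002|6  # test box",
]
rows = list(read_tuple_lines(sample))
print(rows)
print(expand_integer_list(rows[1][1]))
```

Expanding every range into individual values, as expand_integer_list does, also shows why large ranges and CIDR blocks make the memory requirement grow quickly.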

--tuple-fields=FIELDS

FIELDS contains the list of fields (columns) to parse from the TUPLE_FILENAME in the order in which they appear in the file. When this switch is not provided, rwfilter will treat the first line in TUPLE_FILENAME as a title line and attempt to determine the fields (a la rwtuc(1)); rwfilter will exit if it cannot determine the fields.

FIELDS is a comma separated list of field-names, field-integers, and ranges of field-integers; a range is specified by separating the start and end of the range with a hyphen (-). Names can be abbreviated to their shortest unique prefix. The field names and their descriptions are:

sIP,sip,1

source IP address

dIP,dip,2

destination IP address

sPort,sport,3

source port

dPort,dport,4

destination port

protocol,5

IP protocol

--tuple-direction=DIRECTION

Allows you to change the comparison between the tuple and the SiLK Flow record. This switch allows one to look for traffic in the reverse direction (or both directions) without having to write all of the rules twice. The available directions are:

forward

The tuple's fields are compared against the corresponding fields on the flow; that is, sIP is compared with sIP, dIP with dIP, sPort with sPort, dPort with dPort, and protocol with protocol. This is the default.

reverse

The tuple's fields are compared against the opposite fields on the flow; that is, sIP is compared with dIP, dIP with sIP, sPort with dPort, dPort with sPort, and protocol with protocol.

both

Both of the above comparisons are performed.
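
The three directions can be sketched in Python as follows. The tuple_matches function and the dictionary representation of flows are hypothetical, and the sketch assumes that with direction both a record passes when either comparison succeeds.

```python
# Field on the flow to compare against when the direction is reversed.
REVERSED = {"sIP": "dIP", "dIP": "sIP",
            "sPort": "dPort", "dPort": "sPort",
            "protocol": "protocol"}


def tuple_matches(tup, flow, direction="forward"):
    """Compare a tuple (dict of field -> value) with a flow record (dict)."""
    forward = all(flow[field] == value for field, value in tup.items())
    reverse = all(flow[REVERSED[field]] == value for field, value in tup.items())
    if direction == "forward":
        return forward
    if direction == "reverse":
        return reverse
    if direction == "both":
        return forward or reverse  # assumption: either comparison suffices
    raise ValueError(f"unknown direction: {direction}")


# Hypothetical request and reply flows between a client and a web server.
rule = {"sIP": "10.0.0.1", "dPort": 80}
request = {"sIP": "10.0.0.1", "dIP": "192.0.2.7",
           "sPort": 40000, "dPort": 80, "protocol": 6}
reply = {"sIP": "192.0.2.7", "dIP": "10.0.0.1",
         "sPort": 80, "dPort": 40000, "protocol": 6}

print(tuple_matches(rule, request, "forward"), tuple_matches(rule, reply, "forward"))
print(tuple_matches(rule, request, "both"), tuple_matches(rule, reply, "both"))
```

A single rule written for the request direction thus also selects the reply traffic when the direction is both.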

--tuple-delimiter=CHAR

Specifies the character separating the input fields. When the switch is not provided, the default of | is used.

--ippair-any=FILENAME

Pass the record if the source IP and destination IP (in either order) match one of the IP-pairs listed in the text file FILENAME. Each line of FILENAME should contain two IP addresses separated by whitespace. This switch is equivalent to --tuple-file=FILENAME --tuple-fields=sIP,dIP --tuple-direction=both --tuple-delimiter=' '. You cannot use this switch in conjunction with --tuple-file or --ipport-any. This switch is deprecated and it exists for backward compatibility only; it may be removed in a future release.

--ipport-any=FILENAME

Pass the record if either the source IP and port pair or the destination IP and port pair are listed in the text file FILENAME. Each line in FILENAME should contain an IP address and port list of interest for that IP separated by whitespace. This switch is equivalent to --tuple-file=FILENAME --tuple-fields=sIP,sPort --tuple-direction=both --tuple-delimiter=' '. You cannot use this switch in conjunction with --tuple-file or --ippair-any. This switch is deprecated and it exists for backward compatibility only; it may be removed in a future release.

--plugin=PLUGIN

Augment the partitioning switches by using run-time loading of the plug-in (shared object) whose path is PLUGIN. The switch may be repeated to load multiple plug-ins. The creation of plug-ins is beyond the scope of this manual page; the process is described in Analysts' Handbook: Using SiLK for Network Traffic Analysis. When multiple Partitioning Switches are given, the code specified by the --plugin switch(es) will be last to be invoked. When PLUGIN contains a slash (/), rwfilter assumes the path to PLUGIN is correct. Otherwise, rwfilter will attempt to find the file in $SILK_PATH/lib/silk, $SILK_PATH/share/lib, $SILK_PATH/lib, and in these directories parallel to the application's directory: lib/silk, share/lib, and lib. If rwfilter does not find the file, it assumes the plug-in is in the current directory. To force rwfilter to look in the current directory first, specify --plugin=./PLUGIN. When the SILK_PLUGIN_DEBUG environment variable is non-empty, rwfilter prints status messages to the standard error as it tries to open each of its plug-ins.

--dynamic-library=PLUGIN

This switch is deprecated. It is an alias for --plugin.

SiLK can store flows generated by enhanced collection software that provides more information than NetFlow v5. These flows may support some or all of these additional switches; for flows without this additional information, the field's value is always 0.

--flags-initial=HIGH_MASK_FLAGS_LIST

As --flags-all, except this switch considers only the initial packet in the flow.

--flags-session=HIGH_MASK_FLAGS_LIST

As --flags-all, except this switch ignores the initial packet in the flow.

--attributes=ATTRIBUTES_LIST

ATTRIBUTES_LIST is a comma-separated list of up to 8 HIGH_ATTRIBUTES/MASK_ATTRIBUTES pairs, where HIGH_ATTRIBUTES and MASK_ATTRIBUTES are strings of the ATTRIBUTE characters F,T,C; see above for a description of these values. HIGH_ATTRIBUTES must be a subset of MASK_ATTRIBUTES. Pass the record if the attributes listed in HIGH_ATTRIBUTES are set and the attributes listed in MASK_ATTRIBUTES but not listed in HIGH_ATTRIBUTES are not set.

--application=INTEGER_LIST

Some software that generates flow records from packet data, such as yaf(1), will inspect the contents of the packets that make up a flow and use traffic signatures to label the content of the flow. SiLK calls this label the application; yaf refers to it as the appLabel. The application is the port number that is traditionally used for that type of traffic (see the /etc/services file on most UNIX systems). For example, traffic that the flow generator recognizes as FTP will have a value of 21, even if that traffic is being routed through the standard HTTP/web port (80). The flow generator uses a value of 0 if the application cannot be determined. The --application switch passes the flow if the flow's application value is in the specified INTEGER_LIST. For example, passing a value of 21 to this switch will find traffic that the flow generation software labeled as FTP regardless of which port the traffic actually used.

--ip-version=INTEGER_LIST

Passes the flow if the IP Version is in the specified INTEGER_LIST. INTEGER_LIST can be 4, 6, or 4,6 when SiLK has been compiled with IPv6 support. If SiLK does not have IPv6 support, the only legal value for this switch is 4.

--scc=COUNTRY_CODE_LIST

Pass the record if the country code of its source IP address is in the specified COUNTRY_CODE_LIST. This switch requires that the country code mapping file is installed. See ccfilter(3).

--dcc=COUNTRY_CODE_LIST

As --scc for the destination IP address.

For the following three filter tests, some file formats do not store these values, in which case the value is always 0:

--next-hop-id=IP_ADDR_MASK

Pass the record if its next hop IP address is matched by this IP_ADDR_MASK.

--not-next-hop-id=IP_ADDR_MASK

Pass the record if its next hop IP address is not matched by this IP_ADDR_MASK.

--input-index=INTEGER_LIST

Pass the record if its incoming SNMP interface is in this INTEGER_LIST.

--output-index=INTEGER_LIST

Pass the record if its outgoing SNMP interface is in this INTEGER_LIST.

Additional filtering switches are provided by run-time loading of plug-ins (shared object files or dynamic libraries) when the plug-in is available. rwfilter automatically looks for the following plug-ins:

ADDRESS TYPE (addrtype.so)

--stype=SCALAR

When SCALAR is 0, pass the record if its source IP address is non-routable. When 1, pass if internal. When 2, pass if external (i.e., routable but not internal). When 3, pass if not internal (non-routable or external). See addrtype(3).

--dtype=SCALAR

As --stype for the destination IP address.
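
The four SCALAR values map onto three address categories, which the following Python sketch spells out. The function name type_matches and the category strings are hypothetical labels chosen for illustration; how an address is assigned to a category is described in addrtype(3) and is not modeled here.

```python
# Categories accepted by each SCALAR value of --stype / --dtype.
ACCEPTED = {
    0: {"non-routable"},
    1: {"internal"},
    2: {"external"},
    3: {"non-routable", "external"},  # that is, anything not internal
}
CATEGORIES = {"non-routable", "internal", "external"}


def type_matches(scalar, category):
    """Return True if an address of the given category passes for `scalar`."""
    if scalar not in ACCEPTED:
        raise ValueError(f"SCALAR must be 0, 1, 2, or 3, not {scalar}")
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return category in ACCEPTED[scalar]


for scalar in sorted(ACCEPTED):
    print(scalar, sorted(c for c in CATEGORIES if type_matches(scalar, c)))
```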

PREFIX MAP (pmapfilter.so)

--pmap-file=MAPNAME:PATH
--pmap-file=PATH

When the prefix map plug-in is used, rwfilter reads the mapping file located at PATH. When MAPNAME is provided, it will be used to refer to the switches specific to that prefix map. If MAPNAME is not provided, rwfilter will check the prefix map file to see if a map-name was specified when the file was created. Using multiple --pmap-file switches allows additional prefix map files to be read as long as each uses a unique map-name. The --pmap-file switch(es) must precede all other --pmap-* switches. For more information, see pmapfilter(3).

--pmap-src-MAPNAME=LABELS

If the prefix map associated with MAPNAME is an IP prefix map, this matches records with a source IPv4 address that maps to a label contained in the list of labels in LABELS.

If the prefix map associated with MAPNAME is a proto-port prefix map, this matches records with a protocol and source port combination that maps to a label contained in the list of labels in LABELS.

--pmap-dst-MAPNAME=LABELS

Similar to --pmap-src-MAPNAME, but uses the destination IP or the protocol and destination port.

--pmap-any-MAPNAME=LABELS

If the prefix map associated with MAPNAME is an IP prefix map, this matches records with a source IP address or a destination IP address that maps to a label contained in the list of labels in LABELS.

If the prefix map associated with MAPNAME is a proto-port prefix map, this matches records with a protocol and source port or destination port combination that maps to a label contained in the list of labels in LABELS.

--pmap-saddress=LABELS
--pmap-daddress=LABELS
--pmap-any-address=LABELS

These are deprecated switches created by pmapfilter that correspond to --pmap-src-MAPNAME, --pmap-dst-MAPNAME, and --pmap-any-MAPNAME, respectively. These switches are available when an IP prefix map is used that is not associated with a MAPNAME.

--pmap-sport-proto=LABELS
--pmap-dport-proto=LABELS
--pmap-any-port-proto=LABELS

These are deprecated switches created by pmapfilter that correspond to --pmap-src-MAPNAME, --pmap-dst-MAPNAME, and --pmap-any-MAPNAME, respectively. These switches are available when a proto-port prefix map is used that is not associated with a MAPNAME.

PYTHON (silkpython.so)

The SiLK Python plug-in provides support for filtering by expressions or complex functions written in the Python programming language. See the silkpython(3) and pysilk(3) manual pages for information and examples for how to use Python to manipulate SiLK data structures. When multiple Partitioning Switches are given, the Python plug-in will be the next-to-last to be invoked. Only the code specified by the --plugin switch is called after the Python code.

--python-file=FILENAME

Pass the record if the result of processing the flow with the function named rwfilter() in FILENAME is true. The function should take a single silk.RWRec object as an argument. See silkpython(3) for details.

--python-expr=PYTHON_EXPRESSION

Pass the record if the result of processing the flow with the specified PYTHON_EXPRESSION is true. The expression is evaluated as if it appeared in the following context:

 from silk import *
 def rwfilter(rec):
     return (PYTHON_EXPRESSION)
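
A file suitable for --python-file might contain a function like the one below. This is a minimal sketch: it assumes the record object exposes protocol, sport, dport, and packets attributes (see pysilk(3) for the attributes of silk.RWRec), and the SimpleNamespace object is only a stand-in so that the function can be tried without SiLK installed.

```python
from types import SimpleNamespace


def rwfilter(rec):
    """Pass TCP flows to or from port 25 that carry at least 3 packets."""
    return (rec.protocol == 6
            and 25 in (rec.sport, rec.dport)
            and rec.packets >= 3)


# Stand-in for a silk.RWRec object; rwfilter itself would supply real records.
fake_record = SimpleNamespace(protocol=6, sport=51000, dport=25, packets=7)
print(rwfilter(fake_record))
```

Saved without the stand-in lines under a hypothetical name such as smtp.py, the function would be selected with --python-file=smtp.py. The same test could be given directly as the PYTHON_EXPRESSION, since the expression is evaluated with the record bound to rec.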


EXAMPLES

The most basic filtering involves looking at specific traffic over a specific time. For example:

  rwfilter --start-date=2003/02/19:00 --end-date=2003/02/19:23 \
        --pass=alltcp.rwf --proto=6

will create a file, alltcp.rwf, containing all TCP traffic. This file contains SiLK Flow data in a binary format. To examine the contents, use the command rwcut(1).

Please note that the output file described above could be extremely large.

Once a file is written, rwfilter can filter it again. For example:

  rwfilter --aport=80 alltcp.rwf --pass=allweb.rwf

will generate allweb.rwf. This progressive filtering can also be done in a single command line, but writing interim files allows them to be examined with rwcut, rwuniq(1), and other tools.

Multiple filters can be chained at the command line using pipes:

  rwfilter --start-date=2003/02/19:00 --end-date=2003/02/19:23 \
        --proto=6 --pass=stdout | \
        rwfilter --input-pipe=stdin --aport=80 --packets=1-5 \
        --pass=smallweb.rwf


ENVIRONMENT

SILK_RWFILTER_THREADS

The number of threads to use while reading input files or files selected from the data store.

PYTHONPATH

This environment variable is used by Python to locate modules. When --python-file or --python-expr is specified, rwfilter loads Python, which in turn loads the PySiLK module, which consists of several files (silk/pysilk_nl.so, silk/__init__.py, etc). If this silk/ directory is located outside Python's normal search path (for example, in the SiLK installation tree), it may be necessary to set or modify the PYTHONPATH environment variable to include the parent directory of silk/ so that Python can find the PySiLK module. For information on using Python from within rwfilter, see pysilk(3).
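As an illustration of the adjustment described above, the following sketch builds an environment whose PYTHONPATH lists the parent directory of silk/ first. The helper prepend_to_pythonpath and the directory /usr/local/lib/silk-python are hypothetical; substitute the directory that actually contains silk/ on your system.

```python
import os


def prepend_to_pythonpath(silk_parent_dir, current=""):
    """Return a PYTHONPATH value that lists silk_parent_dir first."""
    entries = [silk_parent_dir]
    for entry in current.split(os.pathsep):
        # Skip empty entries and avoid listing the directory twice.
        if entry and entry != silk_parent_dir:
            entries.append(entry)
    return os.pathsep.join(entries)


# Hypothetical location: the directory that contains silk/.
child_env = dict(os.environ)
child_env["PYTHONPATH"] = prepend_to_pythonpath(
    "/usr/local/lib/silk-python", child_env.get("PYTHONPATH", ""))
```

The resulting child_env could then be supplied as the environment of a process that launches rwfilter.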

SILK_PYTHON_TRACEBACK

When set, Python plug-ins will output traceback information on Python errors to stderr.

SILK_COUNTRY_CODES

This environment variable allows the user to specify the country code mapping file that the --scc and --dcc switches use. The value may be a complete path or a file relative to the SILK_PATH. If the variable is not specified, the code looks for a file named country_codes.pmap in the location specified by SILK_PATH.

SILK_CONFIG_FILE

This environment variable is used as the value for the --site-config-file when that switch is not provided.

SILK_DATA_ROOTDIR

When set, overrides the compiled-in value for the location of the directory tree containing the files of SiLK Flow records collected and stored by the packing system (rwflowpack(8)). In addition, when the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwfilter looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.

SILK_PATH

This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwfilter checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share. These directories are also searched when any other configuration file is required (e.g., the country code map). In addition, rwfilter looks for plug-ins in $SILK_PATH/lib/silk, $SILK_PATH/share/lib and $SILK_PATH/lib.
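The search for the site configuration file described under SILK_CONFIG_FILE, SILK_DATA_ROOTDIR, and SILK_PATH can be summarized in a short sketch. The function site_config_candidates is hypothetical. The relative order of the SILK_DATA_ROOTDIR and SILK_PATH locations is an assumption, and compiled-in defaults are not modeled.

```python
import os
import posixpath


def site_config_candidates(switch_value=None, env=None):
    """List the paths where silk.conf would be sought, in order."""
    env = os.environ if env is None else env
    # --site-config-file, when given, is used directly.
    if switch_value:
        return [switch_value]
    # Otherwise SILK_CONFIG_FILE supplies the value for the switch.
    if env.get("SILK_CONFIG_FILE"):
        return [env["SILK_CONFIG_FILE"]]
    candidates = []
    if env.get("SILK_DATA_ROOTDIR"):
        candidates.append(
            posixpath.join(env["SILK_DATA_ROOTDIR"], "silk.conf"))
    if env.get("SILK_PATH"):
        # Assumed order: share/silk is tried before share.
        for subdir in ("share/silk", "share"):
            candidates.append(
                posixpath.join(env["SILK_PATH"], subdir, "silk.conf"))
    return candidates
```

For example, with SILK_DATA_ROOTDIR=/data and SILK_PATH=/usr/local and neither the switch nor SILK_CONFIG_FILE given, the sketch lists /data/silk.conf, /usr/local/share/silk/silk.conf, and /usr/local/share/silk.conf.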

SILK_PLUGIN_DEBUG

When set to 1, rwfilter prints status messages to the standard error as it tries to open each of its plug-ins.

SILK_LOGSTATS

When set to a non-empty value, rwfilter will treat the value as the path to an external program to execute with information about this rwfilter invocation. If the value in SILK_LOGSTATS does not contain a slash or if it references a file that does not exist, is not a regular file, or is not executable, the SILK_LOGSTATS value is silently ignored. The arguments to the external program are:

SILK_LOGSTATS_RWFILTER

If set, this environment variable overrides the value specified in SILK_LOGSTATS.

SILK_LOGSTATS_DEBUG

If the environment variable is set to a non-empty value, rwfilter will print messages to the standard error about the SILK_LOGSTATS value being used and either the reason why the value cannot be used or the arguments to the external program being executed.
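The rules for when a SILK_LOGSTATS value is honored can be expressed as a small check, which may help when diagnosing why a logging program is not being run. This is a sketch only: usable_logstats_program is a hypothetical helper, and it assumes that SILK_LOGSTATS_RWFILTER overrides SILK_LOGSTATS whenever it is present in the environment, even when its value is empty.

```python
import os


def usable_logstats_program(env=None):
    """Return the program path rwfilter would accept, or None."""
    env = os.environ if env is None else env
    # Assumption: the rwfilter-specific variable wins whenever it is set.
    if "SILK_LOGSTATS_RWFILTER" in env:
        value = env["SILK_LOGSTATS_RWFILTER"]
    else:
        value = env.get("SILK_LOGSTATS", "")
    if not value or "/" not in value:
        return None  # empty, or lacks the required slash
    if not os.path.isfile(value):
        return None  # missing, or not a regular file
    if not os.access(value, os.X_OK):
        return None  # not executable
    return value
```

A None result corresponds to the cases in which the value is silently ignored; setting SILK_LOGSTATS_DEBUG makes rwfilter itself report the reason.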


NOTES

rwfilter is the most commonly used application in the suite. It provides access to the data files and performs all the basic queries.

rwfilter supports a variety of I/O options. In addition to reading from the data store, rwfilter can be chained with other rwfilter invocations through named pipes to output results to multiple files simultaneously. An introduction to named pipes, however, is outside the scope of this document.

Two often underused options are --dry-run and --print-statistics.

--dry-run does a sanity check on the input arguments and should be used, especially for complicated arguments, to check that the arguments are acceptable.

--print-statistics used without --pass-destination or --fail-destination simply dumps aggregate statistics to stderr (not stdout) in the following format:

  File <#input files> Read <# of recs read> \
  Pass <# of recs passing the filter> \
  Fail <# of recs failing the filter>

and can be used to make a quick pass through the data to get aggregate counts before digging deeper into the phenomenon being investigated.
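When the aggregate counts are needed by a script, the summary line can be parsed from the captured stderr text. The sketch below follows the format shown above. The function parse_statistics is hypothetical, and tolerating "Files" as well as "File", plus a period after each number, is an assumption made for robustness.

```python
import re

# Matches "File <n> Read <n> Pass <n> Fail <n>"; optional periods and
# variable spacing between fields are tolerated (an assumption).
_STATS_RE = re.compile(
    r"Files?\s+(\d+)\.?\s+Read\s+(\d+)\.?\s+"
    r"Pass\s+(\d+)\.?\s+Fail\s+(\d+)")


def parse_statistics(text):
    """Return the four counts as a dict, or None if no summary is found."""
    match = _STATS_RE.search(text)
    if match is None:
        return None
    files, read, passed, failed = (int(n) for n in match.groups())
    return {"files": files, "read": read, "pass": passed, "fail": failed}
```

For instance, parse_statistics("File 3 Read 1000 Pass 250 Fail 750") yields a dict with files=3, read=1000, pass=250, and fail=750.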

--print-filenames can be used as a progress meter; during long jobs, it shows which file is currently being read by the application. --print-filenames will not provide meaningful results with piped input.

Filters are applied in the order given on the command line. It is best to put the filters that reject the most records first.

The switches used to create a filter output file are stored in the file itself. Use the rwfileinfo(1) command to see this information.


SEE ALSO

rwcount(1), rwcut(1), rwfglob(1), rwfileinfo(1), rwset(1), rwsort(1), rwstats(1), rwtotal(1), rwuniq(1), rwtuc(1), rwsetbuild(1), mapsid(1), addrtype(3), ccfilter(3), pmapfilter(3), pysilk(3), silkpython(3), silk.conf(5), silk(7), rwflowpack(8), yaf(1), zlib(3), Analysts' Handbook: Using SiLK for Network Traffic Analysis