NAME
rwcut - Print selected fields of binary SiLK Flow records
SYNOPSIS
rwcut [--fields=FIELDS] [--all-fields] [--dynamic-library=DYNLIB]
[--num-recs=NUM] [--start-rec-num=NUM] [--end-rec-num=NUM]
[--dry-run] [--icmp-type-and-code] [--epoch-time]
[{--integer-ips | --zero-pad-ips}] [--integer-sensors]
[--no-titles] [--no-columns] [--column-separator=CHAR]
[--no-final-delimiter] [{--delimited | --delimited=CHAR}]
[--print-filenames] [--copy-input=PATH] [--output-path=PATH]
[--pager=PAGER_PROG] [--site-config-file=FILENAME]
[--ipv6-policy={ignore,asv4,mix,force,only}]
[{--legacy-timestamps | --legacy-timestamps=NUM}]
[--pmap-file=PATH] [--pmap-column-width=NUM]
[--python-file=PATH] [FILES...]
DESCRIPTION
rwcut reads binary SiLK Flow records from files listed on the
command line or from the standard input and prints the records to the
screen in a textual, bar (|) delimited format. See the EXAMPLES
section below for sample output.
OPTIONS
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
- --fields=FIELDS
- FIELDS contains the list of flow attributes (a.k.a. fields or columns) to print. The columns will be displayed in the order the fields are specified. Fields may be repeated.
-
FIELDS is a comma separated list of field-names, field-integers, and ranges of field-integers; a range is specified by separating the start and end of the range with a hyphen (-), e.g.,
-
--fields=stime,10,1-5
-
If the --fields switch is not given, FIELDS defaults to:
-
sIP,dIP,sPort,dPort,protocol,packets,bytes,flags,sTime,dur,eTime
-
The complete list of built-in fields that the SiLK tool suite supports follows, though note that not all fields are present in all SiLK file formats; when a field is not present, its value is 0.
- sIP,sip,1
- source IP address
- dIP,dip,2
- destination IP address
- sPort,sport,3
- source port for TCP and UDP, or equivalent
- dPort,dport,4
- destination port for TCP and UDP, or equivalent
- protocol,5
- IP protocol
- packets,pkts,6
- packet count
- bytes,7
- byte count
- flags,8
- bit-wise OR of TCP flags over all packets
- sTime,stime,9
- starting time of flow (millisecond resolution unless the --legacy-timestamps switch is specified)
- dur,10
- duration of flow (millisecond resolution unless the --legacy-timestamps switch is specified)
- eTime,etime,11
- end time of flow (millisecond resolution unless the --legacy-timestamps switch is specified)
- sensor,12
- name or ID of sensor at the collection point
- class,20
- class of sensor at the collection point
- type,21
- type of sensor at the collection point
- sTime+msec,stime+msec,22
- starting time of flow including milliseconds (milliseconds are always displayed)
- eTime+msec,etime+msec,23
- end time of flow including milliseconds (milliseconds are always displayed)
- dur+msec,24
- duration of flow including milliseconds (milliseconds are always displayed)
- icmpTypeCode,icmptypecode,25
- include two columns, iType and iCode, that contain the ICMP type and code for ICMP flows; for non-ICMP flows, these columns are empty
- initialFlags,initialflags,26
- TCP flags on first packet in the flow
- sessionFlags,sessionflags,27
- bit-wise OR of TCP flags over all packets except the first in the flow
- attributes,28
- flow attributes set by flow collector:
T
- flow collector generated a flow record for a long-running connection due to timeout.
C
- this flow is a continuation of a long-running connection that the collector terminated.
F
- additional non-ACK packets seen after a packet with the FIN flag set.
- application,29
- guess as to the application generating the flow; value will be standard port for the application, such as 80 for web traffic
- stype,16
- for the source IP address, the value 0 if the address is non-routable, 1 if it is internal, or 2 if it is routable and external. See addrtype(3).
- dtype,17
- as stype for the destination IP address
- scc,18
- for the source IP, a two-letter country code abbreviation denoting the country that owns that IP address. See ccfilter(3).
- dcc,19
- as scc for the destination IP
- sval
- value from the user-defined mapping (see the --pmap-file switch) for the source. For an IP-based map, this corresponds to sip. For a proto-port-based map, it is protocol/sport. See pmapfilter(3).
- dval
- as sval for the destination IP or proto/dport.
- --all-fields
- Instruct rwcut to print all known fields. This switch cannot be combined with the --fields switch. This switch suppresses error messages from the plug-ins.
- --dynamic-library=DYNLIB
- Augment the list of fields by using run-time loading of the plug-in (shared object) whose path is DYNLIB. The creation of these plug-ins is beyond the scope of this manual page. When DYNLIB contains a slash (/), rwcut assumes the path to DYNLIB is correct. Otherwise, rwcut will attempt to find the file in $SILK_PATH/lib/silk, $SILK_PATH/share/lib, $SILK_PATH/lib, and in these directories parallel to the application's directory: lib/silk, share/lib, and lib. If rwcut does not find the file, it assumes the plug-in is in the current directory. To force rwcut to look in the current directory first, specify --dynamic-library=./DYNLIB. When the SILK_DYNLIB_DEBUG environment variable is non-empty, rwcut prints status messages to the standard error as it tries to open each of its plug-ins.
- --num-recs=NUM
- The number of records to print. Setting this value to 0 will print all records, which is the default.
- --start-rec-num=NUM
- Skip the first NUM-1 records, then begin printing.
- --end-rec-num=NUM
- Stop printing after the NUM'th record.
- --dry-run
- Causes rwcut to print the column headers and exit. Useful for testing.
- --icmp-type-and-code
- Unlike TCP or UDP, ICMP messages do not use ports, but instead have types and codes. Specifying this switch will cause rwcut to print, for ICMP records, the message's type and code in the sPort and dPort columns, respectively. The use of this switch is discouraged; use the icmpTypeCode field instead.
- --epoch-time
- Print timestamps as epoch time (number of seconds since midnight GMT on 1970-01-01).
- --integer-ips
- Print IPs as integers. By default, IP addresses are printed as dotted decimal.
- --zero-pad-ips
- Print IP addresses in dotted decimal, but use three digits per octet by adding zero-padding, e.g., 000.000.000.000.
- --integer-sensors
- Print the integer ID of the sensor rather than its name.
- --no-titles
- Turn off column titles. By default, titles are printed.
- --no-columns
- Disable fixed-width columnar output.
- --column-separator=C
- Use specified character between columns and after the final column. When this switch is not specified, the default of '|' is used.
- --no-final-delimiter
- Do not print the column separator after the final column. Normally a delimiter is printed.
- --delimited
- --delimited=C
- Run as if --no-columns --no-final-delimiter --column-sep=C had been specified. That is, disable fixed-width columnar output; if character C is provided, it is used as the delimiter between columns instead of the default '|'.
- --print-filenames
- Prints to the standard error the names of input files as they are opened.
- --copy-input=PATH
- Copy all binary input to the specified file or named pipe. PATH can be stdout to print flows to the standard output as long as the --output-path switch has been used to redirect rwcut's ASCII output.
- --output-path=PATH
- Determines where the output of rwcut (ASCII text) is written. If this option is not given, output is written to the standard output.
- --pager=PAGER_PROG
- When output is to a terminal, invoke the program PAGER_PROG to view the output one screen full at a time. This switch overrides the SILK_PAGER environment variable, which in turn overrides the PAGER variable. If the value of the pager is determined to be the empty string, no paging will be performed and all output will be printed to the terminal.
- --ipv6-policy=POLICY
- Determine how IPv4 and IPv6 flows are handled when SiLK has been compiled with IPv6 support. When the switch is not provided, the SILK_IPV6_POLICY environment variable is checked for a policy. If it is also unset or contains an invalid policy, the POLICY is mix. When SiLK has not been compiled with IPv6 support, IPv6 flows are always ignored, regardless of the value passed to this switch or in the SILK_IPV6_POLICY variable. The supported values for POLICY are:
- ignore
- Completely ignore IPv6 flows. Only IPv4 flows will be printed.
- asv4
- Convert IPv6 addresses to IPv4 if possible, otherwise ignore the IPv6 flows.
- mix
- Process the input as a mixture of IPv4 and IPv6 flows.
- force
- Force IPv4 flows to be converted to IPv6.
- only
- Only process flows that were marked as IPv6 and completely ignore IPv4 flows.
- --site-config-file=FILENAME
- Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK (use the --version switch to view this value); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application's directory.
- --legacy-timestamps
- --legacy-timestamps=NUM
- Specify the format for human readable timestamps, either the default (new) style, YYYY/MM/DDThh:mm:ss.sss, or the legacy style, MM/DD/YYYY hh:mm:ss. When this switch is not present, the timestamps will be in the default format. When this switch is present and no argument is given, timestamps are in the legacy format. When an argument is supplied, timestamps will be in the new format if the argument begins with 0, and in the old format if the argument begins with 1. Any other argument to the switch is an error.
-
This switch also controls whether fractional seconds are displayed in the sTime and eTime fields when --epoch-time is requested. If the --legacy-timestamps switch is present with no value or with a value of 1, milliseconds will not be displayed; when not present or specified with a value of 0, milliseconds will be displayed.
- --pmap-file=PATH
- When the pmapfilter(3) plug-in is used, this switch gives the path to the mapping file.
- --pmap-column-width=NUM
- When the pmapfilter plug-in is used, this switch gives the maximum number of characters to use when displaying the textual value of any field.
- --python-file=PATH
- When the python plug-in is used, rwcut reads the Python code from the file PATH to define additional fields for possible output. This file must define the function rwcut. The rwcut function should take zero arguments and return a sequence of (TITLE, FIELDLEN, FUNCTION) tuples, where TITLE is the name of a field, FIELDLEN is the length of the field, and FUNCTION is a function that takes a single RWRec as an argument and returns the value to display for that field. (To be pedantic, FUNCTION returns an object, and the str() value of that object is used as the field value for that record.) See the example below.
Many SiLK file formats do not store the following fields and their values will always be 0; they are listed here for completeness:
SiLK can store flows generated by enhanced collection software that provides more information than NetFlow v5. These flows may support some or all of these additional fields; for flows without this additional information, the field's value is always 0.
The list of built-in fields may be augmented by run-time loading of plug-ins (shared object files or dynamic libraries) when the plug-in is available. rwcut automatically looks for the following plug-ins:
ADDRESS TYPE (addrtype.so)
COUNTRY CODE (ccfilter.so)
PREFIX MAP (pmapfilter.so)
EXAMPLES
A standard rwcut output will look like this (with the text wrapped for readability):
sIP| dIP|sPort|dPort|pro|\
10.30.30.31| 10.70.70.71| 80|36761| 6|\
packets| bytes| flags|\
7| 3227| FS PA |\
sTime| dur| eTime|senso|
2003/01/01T00:00:14.625| 3.959|2003/01/01T00:00:18.584|EDGE1|
The first line of the output is the title line--this line shows what the selected fields are; the --no-titles switch will disable the printing of that line. The second line onwards will contain data.
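When another program consumes this output, the title line and the bar delimiter make it straightforward to split each row back into named values. The following is a minimal sketch in Python, not part of SiLK; the helper name parse_rwcut is hypothetical, and it assumes titles are enabled, the default '|' separator is in use, and no field value contains the separator. Note that titles may be truncated to the column width (for example, "senso" above).

```python
def parse_rwcut(text, sep="|"):
    """Parse rwcut's titled, delimited text into a list of dicts.

    Hypothetical helper for illustration; assumes the first non-blank
    line is the title line and that values never contain `sep`.
    """
    rows = []
    titles = None
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the separator and remove the column padding.
        cells = [cell.strip() for cell in line.split(sep)]
        # rwcut prints a delimiter after the final column by default,
        # which leaves one empty trailing cell.
        if cells and cells[-1] == "":
            cells.pop()
        if titles is None:
            titles = cells
        else:
            rows.append(dict(zip(titles, cells)))
    return rows


sample = (
    "                  sTime|      dur|                  eTime|senso|\n"
    "2003/01/01T00:00:14.625|    3.959|2003/01/01T00:00:18.584|EDGE1|\n"
)
records = parse_rwcut(sample)
```

All values are returned as strings; converting them to numbers or timestamps is left to the caller.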
The most basic use of rwcut is to connect it directly to the output of rwfilter. For example, to see representative TCP traffic:
rwfilter --start-date=2002/01/19:00 --end-date=2002/01/19:01 \
--proto=6 --pass=stdout | rwcut
To see only a limited set of fields, use the --fields switch. For example, to see only the protocols, use:
rwcut --fields=5
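To make the FIELDS syntax concrete, the sketch below expands a FIELDS string (names, integers, and integer ranges) into the ordered list of field numbers it denotes. It is an illustration in Python, not rwcut's actual parser: the function name expand_fields is hypothetical, only the names for fields 1 through 12 are included, and treating names as case-insensitive is an assumption.

```python
# Names for the built-in fields 1-12, taken from the list in OPTIONS.
FIELD_NAMES = {
    "sip": 1, "dip": 2, "sport": 3, "dport": 4, "protocol": 5,
    "packets": 6, "pkts": 6, "bytes": 7, "flags": 8,
    "stime": 9, "dur": 10, "etime": 11, "sensor": 12,
}


def expand_fields(spec):
    """Expand a FIELDS string into an ordered list of field numbers.

    Hypothetical helper; order is preserved and repeats are kept,
    matching the documented behaviour of --fields.
    """
    numbers = []
    for token in spec.split(","):
        token = token.strip()
        name = token.lower()  # assumption: names are case-insensitive
        if name in FIELD_NAMES:
            numbers.append(FIELD_NAMES[name])
        elif "-" in token:
            # A range such as 1-5 includes both endpoints.
            start, end = token.split("-", 1)
            numbers.extend(range(int(start), int(end) + 1))
        else:
            numbers.append(int(token))
    return numbers


example = expand_fields("stime,10,1-5")
```

Here the example from the --fields description, stime,10,1-5, expands to fields 9, 10, 1, 2, 3, 4, 5 in that order.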
To use PySiLK to create output fields for rwcut, create a Python
file that contains the rwcut function. That function returns a
sequence of tuples, where each tuple lists the name of each field you
wish to add, its width, and the function that generates that field's
value. The following example specifies two fields, bracketed_sip
and running_pktcount:
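import silk

# Called by rwcut; returns (TITLE, FIELDLEN, FUNCTION) tuples.
def rwcut():
    return [("bracketed_sip", 20, output_bsip),
            ("running_pktcount", 14, output_running_pcount)]

# Source IP address wrapped in angle brackets.
def output_bsip(rec):
    return "<%s>" % rec.sip

# Running total of the packet counts seen so far.
running_pcount = 0
def output_running_pcount(rec):
    global running_pcount
    running_pcount = running_pcount + rec.packets
    return running_pcount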
To use this code, specify the name of the Python file in the --python-file switch, and list the fields in the --fields switch:
rwcut --python-file=cut.py --fields=bracketed_sip,pkt,running_pktcount
bracketed_sip| packets|running_pktcou|
<10.10.10.93>| 23| 23|
<192.168.72.1>| 99| 122|
<192.168.69.1>| 114| 236|
<192.168.69.1>| 126| 362|
<192.168.72.1>| 126| 488|
<192.168.69.1>| 133| 621|
<192.168.69.1>| 260| 881|
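The field functions can be exercised without SiLK installed by feeding them stand-in records. The sketch below is a test harness in plain Python, not part of PySiLK: FakeRec is a hypothetical substitute for RWRec that carries only the two attributes used here, and render only approximates rwcut's layout. Right-justifying values and truncating titles to FIELDLEN are assumptions inferred from the sample output above.

```python
from collections import namedtuple

# Hypothetical stand-in for silk.RWRec with only the attributes used here.
FakeRec = namedtuple("FakeRec", ["sip", "packets"])


def output_bsip(rec):
    return "<%s>" % rec.sip


running_pcount = 0


def output_running_pcount(rec):
    global running_pcount
    running_pcount = running_pcount + rec.packets
    return running_pcount


def rwcut():
    return [("bracketed_sip", 20, output_bsip),
            ("running_pktcount", 14, output_running_pcount)]


def render(records, fields):
    """Approximate rwcut's columnar output for plug-in fields (assumed layout)."""
    # Title line: each title is truncated and right-justified to FIELDLEN.
    lines = ["".join("%*s|" % (width, title[:width])
                     for title, width, _ in fields)]
    for rec in records:
        # The str() of each FUNCTION result becomes the field value.
        lines.append("".join("%*s|" % (width, str(func(rec))[:width])
                             for _, width, func in fields))
    return lines


lines = render([FakeRec("10.10.10.93", 23), FakeRec("192.168.72.1", 99)],
               rwcut())
print("\n".join(lines))
```

Because running_pcount is module-level state, the totals depend on the order in which records are processed, just as they do when rwcut calls the function once per record.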
The order of the FIELDS is significant, and fields can be repeated. For example, suppose that in addition to the default fields of 1-12, you also want to prefix each row with the integer form of the destination IP and the start time to make processing by another tool easier. However, within the default fields of 1-12, you want to see dotted-decimal IP addresses.
rwfilter ... --pass=stdout | \
rwcut --integer-ip --fields=2,9,1-12 --epoch-time | \
num2dot --ip-field=3,4
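The conversion between the integer and dotted-decimal forms of an IPv4 address is simple arithmetic on the four octets. As an illustration of what --integer-ips output means, here is a small Python sketch using the standard ipaddress module; the function name int_to_dotted is hypothetical, it handles IPv4 only, and it is not how num2dot is implemented.

```python
import ipaddress


def int_to_dotted(value, zero_pad=False):
    """Convert an integer IPv4 address to dotted-decimal text.

    Hypothetical helper; zero_pad=True mimics the three-digits-per-octet
    style described for --zero-pad-ips.
    """
    dotted = str(ipaddress.IPv4Address(int(value)))
    if zero_pad:
        dotted = ".".join("%03d" % int(octet) for octet in dotted.split("."))
    return dotted


plain = int_to_dotted(168430173)
padded = int_to_dotted(168430173, zero_pad=True)
```

For instance, 168430173 equals 10*2^24 + 10*2^16 + 10*2^8 + 93, which is the address 10.10.10.93.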
ENVIRONMENT
- SILK_IPV6_POLICY
- This environment variable is used as the value for the --ipv6-policy switch when that switch is not provided.
- SILK_PAGER
- When set to a non-empty string, rwcut automatically invokes this program to display its output a screen at a time. If set to an empty string, rwcut does not automatically page its output.
- PAGER
- When set and SILK_PAGER is not set, rwcut automatically invokes this program to display its output a screen at a time.
- SILK_CONFIG_FILE
- This environment variable is used as the value for the --site-config-file switch when that switch is not provided.
- PYTHONPATH
- The Python module for rwcut (python.so) is installed under SiLK's installation tree. It may be necessary to set or modify the PYTHONPATH environment variable so Python can find this module.
- SILK_DATA_ROOTDIR
- When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwcut looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
- SILK_PATH
- This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwcut checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share. These directories are also searched when any other configuration file is required (e.g., the country code map). In addition, rwcut looks for plug-ins in $SILK_PATH/lib/silk, $SILK_PATH/share/lib and $SILK_PATH/lib.
- SILK_DYNLIB_DEBUG
- When set to 1, rwcut prints status messages to the standard error as it tries to open each of its plug-ins.
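The precedence among the --pager switch and the SILK_PAGER and PAGER variables can be summarized in a few lines. The Python sketch below is one reading of the descriptions above, not rwcut's source code: the function name choose_pager is hypothetical, and it assumes that a variable that is set but empty still takes precedence over lower-priority settings.

```python
import os


def choose_pager(cli_pager=None, environ=None):
    """Return the pager program to use, or None for no paging.

    Hypothetical helper. cli_pager is the value of --pager, or None
    when the switch was not given.
    """
    if environ is None:
        environ = os.environ
    if cli_pager is not None:
        pager = cli_pager              # --pager overrides everything
    elif "SILK_PAGER" in environ:
        pager = environ["SILK_PAGER"]  # overrides PAGER, even when empty
    else:
        pager = environ.get("PAGER", "")
    # An empty string means that no paging is performed.
    return pager or None


chosen = choose_pager(environ={"SILK_PAGER": "", "PAGER": "more"})
```

In this reading, setting SILK_PAGER to the empty string disables paging even when PAGER names a program.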
NOTES
The ordering of the field numbers in --fields is significant; for example,
specifying --fields=2,1 will print the destination IP, then the source
IP.
If you are interested in only a few fields, use the --fields option to reduce the volume of data to be processed. For example, if you are checking to see which internal host got hit with the slammer worm (signature: UDP, destPort 1434, pkt size 404), then the following rwfilter, rwcut combination will be much faster than simply using default values:
rwfilter --proto=17 --dport=1434 --bytes-per-packet=404-404 \
--pass=stdout | rwcut --fields=2
To get a mapping from the integer representing a sensor to its name, use the mapsid(1) command.
SEE ALSO
rwfilter(1), mapsid(1), num2dot(1), addrtype(3), ccfilter(3), pmapfilter(3)


