CERT NetSA Security Suite 
Open Source Tools for Network Monitoring 


NAME

rwcut - Print selected fields of binary SiLK Flow records


SYNOPSIS

  rwcut [--fields=FIELDS] [--all-fields] [--dynamic-library=DYNLIB]
        [--num-recs=NUM] [--start-rec-num=NUM] [--end-rec-num=NUM]
        [--dry-run] [--icmp-type-and-code] [--epoch-time]
        [{--integer-ips | --zero-pad-ips}] [--integer-sensors]
        [--no-titles] [--no-columns] [--column-separator=CHAR]
        [--no-final-delimiter] [{--delimited | --delimited=CHAR}]
        [--print-filenames] [--copy-input=PATH] [--output-path=PATH]
        [--pager=PAGER_PROG] [--site-config-file=FILENAME]
        [--ipv6-policy={ignore,asv4,mix,force,only}]
        [{--legacy-timestamps | --legacy-timestamps=NUM}]
        [--pmap-file=PATH] [--pmap-column-width=NUM]
        [--python-file=PATH] [FILES...]


DESCRIPTION

rwcut reads binary SiLK Flow records from files listed on the command line or from the standard input and prints the records to the screen in a textual, bar (|) delimited format. See the EXAMPLES section below for sample output.


OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.

--fields=FIELDS
FIELDS contains the list of flow attributes (a.k.a. fields or columns) to print. The columns will be displayed in the order the fields are specified. Fields may be repeated.

FIELDS is a comma-separated list of field-names, field-integers, and ranges of field-integers; a range is specified by separating the start and end of the range with a hyphen (-), e.g.,

  --fields=stime,10,1-5

If the --fields switch is not given, FIELDS defaults to:

  sIP,dIP,sPort,dPort,protocol,packets,bytes,flags,sTime,dur,eTime
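
As a rough illustration of the syntax described above, the following Python sketch expands a FIELDS string into an ordered list of names and integers. The helper name expand_fields is hypothetical and is not part of SiLK; it performs no validation of field names or numbers.

```python
def expand_fields(spec):
    """Expand a FIELDS string into an ordered list of names and integers.

    Hypothetical helper, not part of SiLK: it mirrors the syntax described
    above (names, integers, and integer ranges) but validates nothing else.
    """
    result = []
    for token in spec.split(","):
        token = token.strip()
        start, sep, end = token.partition("-")
        if sep and start.isdigit() and end.isdigit():
            # A range of field-integers, inclusive on both ends.
            result.extend(range(int(start), int(end) + 1))
        elif token.isdigit():
            result.append(int(token))
        else:
            # A field-name; kept exactly as written.
            result.append(token)
    return result


print(expand_fields("stime,10,1-5"))
```

Order and repetition are preserved, so a specification such as 2,9,1-12 yields fields 2 and 9 twice.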

The complete list of built-in fields that the SiLK tool suite supports follows, though note that not all fields are present in all SiLK file formats; when a field is not present, its value is 0.

sIP,sip,1
source IP address

dIP,dip,2
destination IP address

sPort,sport,3
source port for TCP and UDP, or equivalent

dPort,dport,4
destination port for TCP and UDP, or equivalent

protocol,5
IP protocol

packets,pkts,6
packet count

bytes,7
byte count

flags,8
bit-wise OR of TCP flags over all packets

sTime,stime,9
starting time of flow (millisecond resolution unless the --legacy-timestamps switch is specified)

dur,10
duration of flow (millisecond resolution unless the --legacy-timestamps switch is specified)

eTime,etime,11
end time of flow (millisecond resolution unless the --legacy-timestamps switch is specified)

sensor,12
name or ID of sensor at the collection point

class,20
class of sensor at the collection point

type,21
type of sensor at the collection point

sTime+msec,stime+msec,22
starting time of flow including milliseconds (milliseconds are always displayed)

eTime+msec,etime+msec,23
end time of flow including milliseconds (milliseconds are always displayed)

dur+msec,24
duration of flow including milliseconds (milliseconds are always displayed)

icmpTypeCode,icmptypecode,25
include two columns, iType and iCode, that contain the ICMP type and code for ICMP flows; for non-ICMP flows, these columns are empty

Many SiLK file formats do not store the following fields and their values will always be 0; they are listed here for completeness:

in,13
router SNMP input interface

out,14
router SNMP output interface

nhIP,15
router next hop IP

SiLK can store flows generated by enhanced collection software that provides more information than NetFlow v5. These flows may support some or all of these additional fields; for flows without this additional information, the field's value is always 0.

initialFlags,initialflags,26
TCP flags on first packet in the flow

sessionFlags,sessionflags,27
bit-wise OR of TCP flags over all packets except the first in the flow

attributes,28
flow attributes set by flow collector:
T
flow collector generated a flow record for a long-running connection due to timeout.

C
this flow is a continuation of a long-running connection that the collector terminated.

F
additional non-ACK packets seen after a packet with the FIN flag set.

application,29
guess as to the application generating the flow; value will be standard port for the application, such as 80 for web traffic
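
The relationship between the flags, initialFlags, and sessionFlags fields can be sketched in Python as follows. The bit values are the standard TCP header flag bits. The one-letter-per-flag rendering and its letter order are assumptions inferred from the sample output in EXAMPLES, and the helper names are hypothetical.

```python
# Standard TCP header flag bits (low byte of the flags field).
TCP_FLAGS = [("F", 0x01), ("S", 0x02), ("R", 0x04), ("P", 0x08),
             ("A", 0x10), ("U", 0x20), ("E", 0x40), ("C", 0x80)]


def combine_flags(per_packet_flags):
    """Bit-wise OR of the flags seen on each packet of a flow."""
    combined = 0
    for flags in per_packet_flags:
        combined |= flags
    return combined


def render_flags(value):
    """Render one letter per set bit and a space per clear bit.

    The fixed letter order is an assumption inferred from the sample
    output in EXAMPLES ("FS PA"), not a statement of rwcut's internals.
    """
    return "".join(letter if value & bit else " " for letter, bit in TCP_FLAGS)


# A short exchange: SYN, ACK, PSH+ACK, and FIN+ACK packets.
packets = [0x02, 0x10, 0x18, 0x11]
initial = packets[0]                  # corresponds to initialFlags
session = combine_flags(packets[1:])  # corresponds to sessionFlags
overall = combine_flags(packets)      # corresponds to flags
print(repr(render_flags(overall)))
```

In this sketch the overall value always equals initial | session, which is why the two additional fields together carry at least as much information as flags alone.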

The list of built-in fields may be augmented by run-time loading of plug-ins (shared object files or dynamic libraries) when the plug-in is available. rwcut automatically looks for the following plug-ins:

ADDRESS TYPE (addrtype.so)

stype,16
for the source IP address, the value 0 if the address is non-routable, 1 if it is internal, or 2 if it is routable and external. See addrtype(3).

dtype,17
as stype for the destination IP address

COUNTRY CODE (ccfilter.so)

scc,18
for the source IP, a two-letter country code abbreviation denoting the country that owns that IP address. See ccfilter(3).

dcc,19
as scc for the destination IP

PREFIX MAP (pmapfilter.so)

sval
value from the user-defined mapping (see the --pmap-file switch) for the source. For an IP-based map, this corresponds to sip. For a proto-port-based map, it is protocol/sport. See pmapfilter(3).

dval
as sval for the destination IP or proto/dport.

--all-fields
Instruct rwcut to print all known fields. This switch cannot be combined with the --fields switch. This switch suppresses error messages from the plug-ins.

--dynamic-library=DYNLIB
Augment the list of fields by using run-time loading of the plug-in (shared object) whose path is DYNLIB. The creation of these plug-ins is beyond the scope of this manual page. When DYNLIB contains a slash (/), rwcut assumes the path to DYNLIB is correct. Otherwise, rwcut will attempt to find the file in $SILK_PATH/lib/silk, $SILK_PATH/share/lib, $SILK_PATH/lib, and in these directories parallel to the application's directory: lib/silk, share/lib, and lib. If rwcut does not find the file, it assumes the plug-in is in the current directory. To force rwcut to look in the current directory first, specify --dynamic-library=./DYNLIB. When the SILK_DYNLIB_DEBUG environment variable is non-empty, rwcut prints status messages to the standard error as it tries to open each of its plug-ins.
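
The search order described above can be sketched in Python as shown below. The function plugin_candidates is a hypothetical helper, not part of SiLK, and reading "parallel to the application's directory" as siblings of the directory that holds the rwcut binary is an assumption.

```python
import posixpath


def plugin_candidates(dynlib, silk_path=None, app_dir=None):
    """Return the paths that would be tried for DYNLIB, in order.

    Hypothetical helper that mirrors the search order described above.
    """
    if "/" in dynlib:
        # A slash means the path is used exactly as given.
        return [dynlib]
    subdirs = ["lib/silk", "share/lib", "lib"]
    candidates = []
    if silk_path:
        candidates += [posixpath.join(silk_path, d, dynlib) for d in subdirs]
    if app_dir:
        # Assumption: "parallel" means siblings of the binary's directory.
        parent = posixpath.dirname(app_dir.rstrip("/"))
        candidates += [posixpath.join(parent, d, dynlib) for d in subdirs]
    # Last resort: the bare name, taken to be in the current directory.
    candidates.append(dynlib)
    return candidates


for path in plugin_candidates("ccfilter.so", "/opt/silk", "/usr/local/bin"):
    print(path)
```

The paths /opt/silk and /usr/local/bin are placeholders chosen for the example.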

--num-recs=NUM
The number of records to print. Setting this value to 0 will print all records, which is the default.

--start-rec-num=NUM
Skip the first NUM-1 records, then begin printing.

--end-rec-num=NUM
Stop printing after the NUM'th record.
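
The record-numbering rules of --num-recs, --start-rec-num, and --end-rec-num can be sketched in Python with itertools.islice. How rwcut combines the three switches when more than one is given is not stated here; applying the start/end window first and the --num-recs cap second is an assumption of this sketch, and select_records is a hypothetical name.

```python
from itertools import islice


def select_records(records, num_recs=0, start_rec_num=1, end_rec_num=None):
    """Return an iterator over the records the three switches would select.

    Record numbers are 1-based. Applying the start/end window first and
    then the --num-recs cap is an assumption made for this sketch.
    """
    # islice is 0-based with an exclusive stop, so the NUM'th record sits
    # at index NUM-1 and "stop after the NUM'th record" is stop=NUM.
    window = islice(records, start_rec_num - 1, end_rec_num)
    if num_recs:
        # A value of 0 means "all records", so only cap when non-zero.
        window = islice(window, num_recs)
    return window


records = ["rec%d" % n for n in range(1, 11)]
print(list(select_records(records, start_rec_num=3, end_rec_num=5)))
print(list(select_records(records, num_recs=2)))
```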

--dry-run
Causes rwcut to print the column headers and exit. Useful for testing.

--icmp-type-and-code
Unlike TCP or UDP, ICMP messages do not use ports, but instead have types and codes. Specifying this switch will cause rwcut to print, for ICMP records, the message's type and code in the sPort and dPort columns, respectively. The use of this switch is discouraged; use the icmpTypeCode field instead.

--epoch-time
Print timestamps as epoch time (number of seconds since midnight GMT on 1970-01-01).

--integer-ips
Print IPs as integers. By default, IP addresses are printed as dotted decimal.

--zero-pad-ips
Print IP addresses in dotted decimal, but use three digits per octet by adding zero-padding, e.g., 000.000.000.000.
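
For readers post-processing rwcut output, the three IPv4 representations (dotted decimal, integer, and zero-padded) relate to one another as in the following Python sketch. It uses only the standard ipaddress module; the function names are illustrative and not part of SiLK.

```python
import ipaddress


def as_integer(ip):
    """IPv4 address as a single integer, as printed with --integer-ips."""
    return int(ipaddress.IPv4Address(ip))


def zero_padded(ip):
    """IPv4 address with three digits per octet, as with --zero-pad-ips."""
    return ".".join("%03d" % int(octet) for octet in ip.split("."))


def from_integer(value):
    """Inverse of as_integer: back to dotted decimal."""
    return str(ipaddress.IPv4Address(value))


print(as_integer("10.30.30.31"))   # 169745951
print(zero_padded("10.30.30.31"))  # 010.030.030.031
print(from_integer(169745951))     # 10.30.30.31
```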

--integer-sensors
Print the integer ID of the sensor rather than its name.

--no-titles
Turn off column titles. By default, titles are printed.

--no-columns
Disable fixed-width columnar output.

--column-separator=C
Use specified character between columns and after the final column. When this switch is not specified, the default of '|' is used.

--no-final-delimiter
Do not print the column separator after the final column. Normally a delimiter is printed.

--delimited
--delimited=C
Run as if --no-columns --no-final-delimiter --column-sep=C had been specified. That is, disable fixed-width columnar output; if character C is provided, it is used as the delimiter between columns instead of the default '|'.
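
Delimited output is convenient for other programs to read. As a minimal sketch, the Python standard csv module can parse it when given the same delimiter character. The sample text below is made up for illustration, and read_delimited is a hypothetical helper.

```python
import csv
import io

# Sample text in the shape of `rwcut --delimited --fields=1,2,3` output.
# The records are invented for this example.
sample = """sIP|dIP|sPort
10.30.30.31|10.70.70.71|80
10.70.70.71|10.30.30.31|36761
"""


def read_delimited(stream, delimiter="|"):
    """Parse delimited rwcut text into a list of dicts keyed by column title."""
    return list(csv.DictReader(stream, delimiter=delimiter))


rows = read_delimited(io.StringIO(sample))
print(rows[0]["dIP"])
```

Because --delimited also implies --no-final-delimiter, there is no empty trailing column to discard.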

--print-filenames
Prints to the standard error the names of input files as they are opened.

--copy-input=PATH
Copy all binary input to the specified file or named pipe. PATH can be stdout to print flows to the standard output as long as the --output-path switch has been used to redirect rwcut's ASCII output.

--output-path=PATH
Determines where the output of rwcut (ASCII text) is written. If this option is not given, output is written to the standard output.

--pager=PAGER_PROG
When output is to a terminal, invoke the program PAGER_PROG to view the output one screenful at a time. This switch overrides the SILK_PAGER environment variable, which in turn overrides the PAGER variable. If the value of the pager is determined to be the empty string, no paging will be performed and all output will be printed to the terminal.
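
The precedence among --pager, SILK_PAGER, and PAGER can be sketched in Python as follows. The helper choose_pager is hypothetical. Treating an empty string from the first source that is set as "no paging", rather than falling through to the next source, is this sketch's reading of the text above and of the ENVIRONMENT section.

```python
def choose_pager(switch_value=None, environ=None):
    """Return the pager program to run, or None for no paging.

    Hypothetical helper mirroring the precedence described above:
    --pager, then SILK_PAGER, then PAGER. None means "not given".
    """
    environ = environ if environ is not None else {}
    sources = (switch_value, environ.get("SILK_PAGER"), environ.get("PAGER"))
    for candidate in sources:
        if candidate is not None:
            # The first source that is set wins; an empty string there
            # disables paging instead of falling through (an assumption).
            return candidate or None
    return None


print(choose_pager("less", {"SILK_PAGER": "more"}))
print(choose_pager(None, {"SILK_PAGER": "", "PAGER": "less"}))
```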

--ipv6-policy=POLICY
Determine how IPv4 and IPv6 flows are handled when SiLK has been compiled with IPv6 support. When the switch is not provided, the SILK_IPV6_POLICY environment variable is checked for a policy. If it is also unset or contains an invalid policy, the POLICY is mix. When SiLK has not been compiled with IPv6 support, IPv6 flows are always ignored, regardless of the value passed to this switch or in the SILK_IPV6_POLICY variable. The supported values for POLICY are:
ignore
Completely ignore IPv6 flows. Only IPv4 flows will be printed.

asv4
Convert IPv6 addresses to IPv4 if possible, otherwise ignore the IPv6 flows.

mix
Process the input as a mixture of IPv4 and IPv6 flows.

force
Force IPv4 flows to be converted to IPv6.

only
Only process flows that were marked as IPv6 and completely ignore IPv4 flows.
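
The effect of each POLICY value can be sketched in Python on a single address standing in for a flow. This is a sketch under assumptions: the text above does not say how addresses are converted, so the IPv4-mapped form ::ffff:a.b.c.d is assumed here, and apply_policy is a hypothetical name.

```python
import ipaddress


def apply_policy(address, policy="mix"):
    """Return the address as the policy would present it, or None if ignored.

    Sketch only. Treating "convert" as the IPv4-mapped form ::ffff:a.b.c.d
    is an assumption; the text above does not name the mapping.
    """
    ip = ipaddress.ip_address(address)
    is_v6 = ip.version == 6
    if policy == "ignore":
        return None if is_v6 else ip
    if policy == "asv4":
        if not is_v6:
            return ip
        return ip.ipv4_mapped  # None when the address is not IPv4-mapped
    if policy == "mix":
        return ip
    if policy == "force":
        return ip if is_v6 else ipaddress.IPv6Address("::ffff:%s" % ip)
    if policy == "only":
        return ip if is_v6 else None
    raise ValueError("unknown policy: %r" % policy)


print(apply_policy("::ffff:10.1.2.3", "asv4"))
print(apply_policy("2001:db8::1", "asv4"))
```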

--site-config-file=FILENAME
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK (use the --version switch to view this value); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application's directory.

--legacy-timestamps
--legacy-timestamps=NUM
Specify the format for human readable timestamps, either the default (new) style, YYYY/MM/DDThh:mm:ss.sss, or the legacy style, MM/DD/YYYY hh:mm:ss. When this switch is not present, the timestamps will be in the default format. When this switch is present and no argument is given, timestamps are in the legacy format. When an argument is supplied, timestamps will be in the new format if the argument begins with 0, and in the old format if the argument begins with 1. Any other argument to the switch is an error.

This switch also controls whether fractional seconds are displayed in the sTime and eTime fields when --epoch-time is requested. If the --legacy-timestamps switch is present with no value or with a value of 1, milliseconds will not be displayed; when not present or specified with a value of 0, milliseconds will be displayed.
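
The display rules above can be sketched in Python as follows. The function format_stime is hypothetical, the "seconds.milliseconds" layout for epoch output is an assumption, and the time is rendered in UTC to keep the example deterministic.

```python
from datetime import datetime, timezone


def format_stime(epoch_ms, legacy=False, epoch_time=False):
    """Format a start time given in milliseconds since the epoch.

    Sketch of the display rules described above; rendering in UTC and
    the epoch layout with a decimal point are assumptions.
    """
    seconds, millis = divmod(epoch_ms, 1000)
    if epoch_time:
        # Fractional seconds appear only in the non-legacy style.
        return str(seconds) if legacy else "%d.%03d" % (seconds, millis)
    when = datetime.fromtimestamp(seconds, tz=timezone.utc)
    if legacy:
        return when.strftime("%m/%d/%Y %H:%M:%S")
    return when.strftime("%Y/%m/%dT%H:%M:%S") + ".%03d" % millis


stime = 1041379214625  # 2003/01/01T00:00:14.625 UTC
print(format_stime(stime))
print(format_stime(stime, legacy=True))
print(format_stime(stime, epoch_time=True))
```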

--pmap-file=PATH
When the pmapfilter(3) plug-in is used, this switch gives the path to the mapping file.

--pmap-column-width=NUM
When the pmapfilter plug-in is used, this switch gives the maximum number of characters to use when displaying the textual value of any field.

--python-file=PATH
When the python plug-in is used, rwcut reads the Python code from the file PATH to define additional fields for possible output. This file must define the function rwcut. The rwcut function should take zero arguments and return a sequence of (TITLE, FIELDLEN, FUNCTION) tuples, where TITLE is the name of a field, FIELDLEN is the length of the field, and FUNCTION is a function that takes a single RWRec as an argument and returns the value to display for that field. (To be pedantic, FUNCTION returns an object, and the str() value of that object is used as the field value for that record.) See the example below.


EXAMPLES

A standard rwcut output will look like this (with the text wrapped for readability):

            sIP|            dIP|sPort|dPort|pro|\
    10.30.30.31|    10.70.70.71|   80|36761|  6|\
        packets|     bytes|          flags|\
              7|      3227|      FS PA    |\
                    sTime|      dur|                  eTime|senso|
  2003/01/01T00:00:14.625|    3.959|2003/01/01T00:00:18.584|EDGE1|

The first line of the output is the title line--this line shows what the selected fields are; the --no-titles switch will disable the printing of that line. The second line onwards will contain data.

The most basic use of rwcut is to pipe the output of rwfilter directly into it. For example, to see representative TCP traffic:

 rwfilter --start-date=2002/01/19:00 --end-date=2002/01/19:01 \
      --proto=6 --pass=stdout | rwcut

To see only selected fields, use the --fields switch. For example, to see only the protocols, use:

 rwcut --fields=5

To use PySiLK to create output fields for rwcut, create a Python file that contains the rwcut function. That function returns a sequence of tuples, where each tuple lists the name of each field you wish to add, its width, and the function that generates that field's value. The following example specifies two fields, bracketed_sip and running_pktcount:

 import silk
 def rwcut():
     return [("bracketed_sip", 20, output_bsip),
             ("running_pktcount", 14, output_running_pcount)]
 def output_bsip(rec):
     return "<%s>" % rec.sip
 running_pcount = 0
 def output_running_pcount(rec):
     global running_pcount
     running_pcount = running_pcount + rec.packets
     return running_pcount

To use this code, specify the name of the Python file in the --python-file switch, and list the fields in the --fields switch:

 rwcut --python-file=cut.py --fields=bracketed_sip,pkts,running_pktcount
       bracketed_sip|   packets|running_pktcou|
       <10.10.10.93>|        23|            23|
      <192.168.72.1>|        99|           122|
      <192.168.69.1>|       114|           236|
      <192.168.69.1>|       126|           362|
      <192.168.72.1>|       126|           488|
      <192.168.69.1>|       133|           621|
      <192.168.69.1>|       260|           881|

The order of the FIELDS is significant, and fields can be repeated. For example, suppose that in addition to fields 1-12 you also want to prefix each row with an integer form of the destination IP and the start time, to make processing by another tool easier, while within fields 1-12 you want to see dotted-decimal IP addresses:

 rwfilter ... --pass=stdout | \
       rwcut --integer-ip --fields=2,9,1-12 --epoch-time | \
       num2dot --ip-field=3,4
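
Returning to the PySiLK example above, the way the (TITLE, FIELDLEN, FUNCTION) tuples turn into columns can be tried without a SiLK installation. In the following sketch, FakeRec is a hypothetical stand-in for an RWRec and render is an illustrative helper; neither is part of PySiLK, and the right-aligned layout is inferred from the sample output.

```python
from collections import namedtuple

# Stand-in for an RWRec with only the attributes used here (hypothetical).
FakeRec = namedtuple("FakeRec", ["sip", "packets"])

running_pcount = 0


def output_bsip(rec):
    return "<%s>" % rec.sip


def output_running_pcount(rec):
    global running_pcount
    running_pcount += rec.packets
    return running_pcount


def rwcut():
    return [("bracketed_sip", 20, output_bsip),
            ("running_pktcount", 14, output_running_pcount)]


def render(fields, records):
    """Format titles and values right-aligned in FIELDLEN-wide columns."""
    # Titles longer than FIELDLEN are truncated to fit the column.
    lines = ["".join("%*s|" % (width, title[:width])
                     for title, width, _ in fields)]
    for rec in records:
        # str() of whatever FUNCTION returns becomes the cell text.
        lines.append("".join("%*s|" % (width, str(func(rec)))
                             for _, width, func in fields))
    return lines


recs = [FakeRec("10.10.10.93", 23), FakeRec("192.168.72.1", 99)]
for line in render(rwcut(), recs):
    print(line)
```

Because output_running_pcount keeps its total in a module-level variable, the values it returns depend on the order in which records are processed.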


ENVIRONMENT

SILK_IPV6_POLICY
This environment variable is used as the value for the --ipv6-policy when that switch is not provided.

SILK_PAGER
When set to a non-empty string, rwcut automatically invokes this program to display its output a screen at a time. If set to an empty string, rwcut does not automatically page its output.

PAGER
When set and SILK_PAGER is not set, rwcut automatically invokes this program to display its output a screen at a time.

SILK_CONFIG_FILE
This environment variable is used as the value for the --site-config-file when that switch is not provided.

PYTHONPATH
The Python module for rwcut (python.so) is installed under SiLK's installation tree. It may be necessary to set or modify the PYTHONPATH environment variable so Python can find this module.

SILK_DATA_ROOTDIR
When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwcut looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.

SILK_PATH
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwcut checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share. These directories are also searched when any other configuration file is required (e.g., the country code map). In addition, rwcut looks for plug-ins in $SILK_PATH/lib/silk, $SILK_PATH/share/lib and $SILK_PATH/lib.

SILK_DYNLIB_DEBUG
When set to 1, rwcut prints status messages to the standard error as it tries to open each of its plug-ins.


NOTES

The ordering of the field numbers in --fields is significant: specifying --fields=2,1 will print the destination IP, then the source IP.

If you are interested in only a few fields, use the --fields option to reduce the volume of data to be processed. For example, if you are checking which internal hosts were hit by the Slammer worm (signature: UDP, destination port 1434, packet size 404), then the following rwfilter, rwcut combination will be much faster than simply using default values:

  rwfilter --proto=17 --dport=1434 --bytes-per-packet=404-404 \
        --pass=stdout | rwcut --fields=2

To get a mapping from the integer representing a sensor to its name, use the mapsid(1) command.


SEE ALSO

rwfilter(1), mapsid(1), num2dot(1), addrtype(3), ccfilter(3), pmapfilter(3)