NAME
rwstats - Print interval counts or top-N or bottom-N lists
SYNOPSIS
rwstats --fields=KEY [--values=VALUES] [--plugin=PLUGIN]
{--count=N | --threshold=N | --percentage=N}
[{--top | --bottom}] [--presorted-input] [--no-percents]
[--ipv6-policy={ignore,asv4,mix,force,only}]
[{--bin-time | --bin-time=SECONDS}] [--epoch-time]
[{--integer-ips | --zero-pad-ips}] [--integer-sensors]
[--no-titles] [--no-columns] [--column-separator=CHAR]
[--no-final-delimiter] [{--delimited | --delimited=CHAR}]
[--print-filenames] [--copy-input=PATH] [--output-path=PATH]
[--pager=PAGER_PROG] [--temp-directory=DIR_PATH]
[{--legacy-timestamps | --legacy-timestamps=NUM}]
[--site-config-file=FILENAME]
[--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
[--pmap-column-width=NUM] [--python-file=PATH ...] [FILES...]
rwstats {--overall-stats | --detail-proto-stats=PROTO[,PROTO]}
[--no-titles] [--no-columns] [--column-separator=CHAR]
[--no-final-delimiter] [{--delimited | --delimited=CHAR}]
[--print-filenames] [--copy-input=PATH] [--output-path=PATH]
[--pager=PAGER_PROG] [FILES...]
rwstats [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
[--plugin=PLUGIN ...] [--python-file=PATH ...] --help
rwstats --legacy-help
rwstats --version
DESCRIPTION
rwstats has two modes of operation: it can compute a Top-N or Bottom-N list, or it can summarize data for each protocol.
TOP-N INVOCATION
rwstats reads SiLK Flow records and groups them by a key composed of user-specified attributes of the flows. For each group (or bin), a collection of aggregate values is computed; these values are typically related to the volume of the bin, such as the sum of the bytes fields for all records that match the key. Once all the SiLK Flow records are read, the bins are sorted by the primary aggregate value, and rwstats prints the bins that had the largest or smallest values. The number of bins printed can be specified as a fixed value (e.g., print 10 bins), as a threshold (print bins whose byte count is less than 400), or as a percentage of the total volume across all bins (print bins that contain at least 10% of all the packets).
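The binning and sorting described above can be illustrated with a minimal Python sketch. It is not how rwstats is implemented; the function name `top_n`, the dictionary-based records, and the sample flows are assumptions made only for illustration.

```python
from collections import defaultdict

def top_n(records, key_fields, value_field, count, bottom=False):
    """Bin records by key, sum one aggregate, and return the N extreme bins."""
    bins = defaultdict(int)
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        # value_field=None stands for "count records", the default aggregate
        bins[key] += 1 if value_field is None else rec[value_field]
    # Sort by the primary aggregate: descending for Top-N, ascending for Bottom-N
    ordered = sorted(bins.items(), key=lambda kv: kv[1], reverse=not bottom)
    return ordered[:count]

# Hypothetical flow records; real input would be binary SiLK Flow data
flows = [
    {"sip": "10.1.1.1", "dip": "10.1.1.2", "bytes": 300},
    {"sip": "10.1.1.1", "dip": "10.1.1.3", "bytes": 500},
    {"sip": "10.1.1.2", "dip": "10.1.1.1", "bytes": 100},
]
print(top_n(flows, ["sip"], "bytes", count=1))  # [(('10.1.1.1',), 800)]
```

Here the key is the single field `sip` and the primary aggregate is the byte sum, roughly what `--fields=sip --values=bytes --count=1` requests.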
The SiLK Flow records are read from the files named on the command
line or from the standard input when no file names are given and the
standard input is not a terminal. To read from both standard input
and files, use stdin or - as the name of an input file.
The flow attribute(s) (or field(s)) that make up the key for each bin
are selected by the user, with the available fields being similar to
those supported by rwcut(1). See the description of the
--fields switch in the OPTIONS section below for the names of
the keys. The list of fields can be extended by loading PySiLK files
(see silkpython(3)) or plug-ins. The user must specify the
--fields switch (or use legacy switches that map to the --fields
switch). The size of the key is basically unlimited, but a larger key
uses the available memory more quickly, leading to slower
performance.
The aggregate value(s) to compute for each bin are also chosen by the
user. As with the key fields, the user can extend the list of
aggregate fields by using PySiLK or plug-ins. The preferred way to
specify the aggregate fields is to use the --values switch; the
aggregate fields will be printed in the order they occur in the
--values switch. If the user does not select any aggregate
value(s), rwstats defaults to computing the number of flow records
for each bin. As with the key fields, requesting more aggregate
values slows performance.
The --presorted-input switch allows rwstats to process data more efficiently by assuming the data has been previously sorted with the rwsort(1) command. With this switch, rwstats does not need large amounts of memory during the binning stage because it does not bin each flow; instead, it keeps a running summation for the bin. When the key changes, the bin's primary aggregate value is compared with those of the current Top-N (or Bottom-N) to see if the new bin is closer to the top (or bottom). For the output to be meaningful, rwsort and rwstats must be invoked with the same --fields value. When multiple input files are specified and --presorted-input is given, rwstats will merge-sort the flow records from the input files.
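The running-summation idea can be sketched in Python as follows. This is an illustration only, not the rwstats implementation: the name `top_n_presorted`, the use of a heap to hold the current Top-N, and the sample records are assumptions, and `count` is assumed to be at least 1.

```python
import heapq
from itertools import groupby

def top_n_presorted(sorted_records, key_fields, value_field, count):
    """Top-N over records already sorted by key_fields, holding at most N bins."""
    def key_of(rec):
        return tuple(rec[f] for f in key_fields)

    heap = []  # min-heap of (value, key); heap[0] is the weakest current entry
    for key, group in groupby(sorted_records, key=key_of):
        # Running sum for one bin; the bin is complete once the key changes
        total = sum(rec[value_field] for rec in group)
        if len(heap) < count:
            heapq.heappush(heap, (total, key))
        elif total > heap[0][0]:
            # The finished bin beats the weakest bin of the current Top-N
            heapq.heapreplace(heap, (total, key))
    return sorted(heap, reverse=True)

# Hypothetical records, already sorted by sport
records = [
    {"sport": 53, "packets": 2},
    {"sport": 53, "packets": 3},
    {"sport": 80, "packets": 10},
    {"sport": 443, "packets": 1},
]
print(top_n_presorted(records, ["sport"], "packets", count=2))
# [(10, (80,)), (5, (53,))]
```

Only `count` bins are held at any time, which is why the sort order of the input must match the key: an unsorted input would split one bin into several partial sums.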
rwstats attempts to keep all key and aggregate value data in the computer's memory. If rwstats runs out of memory, the current key and aggregate value data is written to a temporary file. Once all input has been processed, the data from the temporary files is merged to produce the final output. By default, these temporary files are stored in the /tmp directory. Because these files can be large, it is strongly recommended that /tmp not be used as the temporary directory. To modify the temporary directory used by rwstats, provide the --temp-directory switch, set the SILK_TMPDIR environment variable, or set the TMPDIR environment variable.
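The order in which the temporary directory is chosen can be written out as a short Python sketch. The function `resolve_temp_dir` is hypothetical, and treating an empty value the same as an unset one is an assumption that the text above does not confirm.

```python
import os

def resolve_temp_dir(switch_value=None, environ=None):
    """Pick the temporary directory: switch, then SILK_TMPDIR, then TMPDIR, then /tmp."""
    environ = os.environ if environ is None else environ
    candidates = (switch_value, environ.get("SILK_TMPDIR"), environ.get("TMPDIR"))
    for candidate in candidates:
        if candidate:  # assumption: an empty string counts as "not set"
            return candidate
    return "/tmp"

print(resolve_temp_dir(None, {"SILK_TMPDIR": "/var/silk/tmp", "TMPDIR": "/scratch"}))
# /var/silk/tmp
```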
When SiLK is compiled with IPv6 support, using the sip-distinct and
dip-distinct value fields is limited. Specifically, only one
distinct IP count is supported for unsorted input, and no distinct IP
counts are supported when --presorted-input is given. Setting
the --ipv6-policy switch to ignore or asv4 will get around
this limitation at the expense of ignoring IPv6 addresses.
rwstats may run out of memory when computing distinct IP counts, causing the counts for some bins to be smaller than the actual number of distinct IPs. When this occurs, a single warning is printed to the standard error noting that rwstats has run out of memory, processing continues, and rwstats exits with status 16.
rwstats may also run out of memory if the requested Top-N is too large.
PROTOCOL STATISTICS INVOCATION
Alternatively, rwstats can provide statistics for each of bytes, packets, and bytes-per-packet, giving minima, maxima, quartiles, and interval flow-counts across all flows or across a list of protocols specified by the user.
OPTIONS
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
TOP-N INVOCATION
To compute a Top-N or Bottom-N list, the key field(s) must be
specified. Normally the --fields switch is used to specify the key
field(s), but for backward compatibility the --fields switch is not
required.
- --fields=KEY
-
KEY contains the list of flow attributes (a.k.a. fields or columns) that make up the key into which flows are binned. The columns will be displayed in the order the fields are specified. Each field may be specified once only. KEY is a comma separated list of field-names, field-integers, and ranges of field-integers; a range is specified by separating the start and end of the range with a hyphen (-). Field-names are case insensitive. Example:
-
--fields=stime,10,1-5
-
There is no default value for the --fields switch.
-
The complete list of built-in fields that the SiLK tool suite supports follows, though note that not all fields are present in all SiLK file formats; when a field is not present, its value is 0.
- sIP,1
-
source IP address
- dIP,2
-
destination IP address
- sPort,3
-
source port for TCP and UDP, or equivalent
- dPort,4
-
destination port for TCP and UDP, or equivalent
- protocol,5
-
IP protocol
- packets,pkts,6
-
packet count
- bytes,7
-
byte count
- flags,8
-
bit-wise OR of TCP flags over all packets
- sTime,9
-
starting time of flow (seconds resolution)
- dur,10
-
duration of flow (seconds resolution)
- eTime,11
-
end time of flow (seconds resolution)
- sensor,12
-
name or ID of sensor at the collection point
- class,20
-
class of sensor at the collection point
- type,21
-
type of sensor at the collection point
- icmpTypeCode,25
-
include two columns, iType and iCode, that contain the ICMP type and code for ICMP flows; for non-ICMP flows, these columns are empty
- initialFlags,26
-
TCP flags on first packet in the flow
- sessionFlags,27
-
bit-wise OR of TCP flags over all packets except the first in the flow
- attributes,28
-
flow attributes set by the flow generator:
- F
-
flow generator saw additional packets in this flow following a packet with a FIN flag (excluding ACK packets)
- T
-
flow generator prematurely created a record for a long-running connection due to a timeout. (When the flow generator yaf(1) is run with the --silk switch, it will prematurely create a flow and mark it with T if the byte count of the flow cannot be stored in a 32-bit value.)
- C
-
flow generator created this flow as a continuation of a long-running connection, where the previous flow for this connection met a timeout (or a byte threshold in the case of yaf).
- application,29
-
guess as to the content of the flow. Some software that generates flow records from packet data, such as yaf, will inspect the contents of the packets that make up a flow and use traffic signatures to label the content of the flow. SiLK calls this label the application; yaf refers to it as the appLabel. The application is the port number that is traditionally used for that type of traffic (see the /etc/services file on most UNIX systems). For example, traffic that the flow generator recognizes as FTP will have a value of 21, even if that traffic is being routed through the standard HTTP/web port (80).
- stype,16
-
for the source IP address, the value 0 if the address is non-routable, 1 if it is internal, or 2 if it is routable and external. See addrtype(3).
- dtype,17
-
as stype for the destination IP address
- scc,18
-
for the source IP, a two-letter country code abbreviation denoting the country that owns that IP address. See ccfilter(3).
- dcc,19
-
as scc for the destination IP
- src-MAPNAME
-
value determined by passing the source IP or the protocol/source-port to the user-defined mapping defined in the prefix map associated with MAPNAME. See the description of the --pmap-file switch and the pmapfilter(3) manual page.
- dst-MAPNAME
-
as src-MAPNAME for the destination IP or protocol/destination-port.
- sval
- dval
-
These are deprecated field names created by pmapfilter that correspond to src-MAPNAME and dst-MAPNAME, respectively. These fields are available when a prefix map is used that is not associated with a MAPNAME.
- --values=VALUES
-
When computing a Top-N or Bottom-N, all flows that have the same key field(s) will be binned together. For each bin, one or more aggregate values are computed as specified by VALUES, a comma separated list of names. Names are case insensitive. The first entry in VALUES is the primary value, and it is used as the basis to compute the Top-N or Bottom-N. If the --values switch is not specified (and no legacy switch that sets values is specified), rwstats counts the number of flow records for each bin. The aggregate fields are printed in the order they occur in VALUES. The names of the built-in value fields follow. This list can be augmented through the use of PySiLK and plug-ins.
- Records
-
Count the number of flow records that mapped to each bin.
- Packets
-
Sum the number of packets across all records that mapped to each bin.
- Bytes
-
Sum the number of bytes across all records that mapped to each bin.
- sIP-Distinct
-
Count the number of distinct source IP addresses that were seen for each bin.
- dIP-Distinct
-
Count the number of distinct destination IP addresses that were seen for each bin.
- --plugin=PLUGIN
-
Augment the list of key fields and/or aggregate value fields by using run-time loading of the plug-in (shared object) whose path is PLUGIN. The switch may be repeated to load multiple plug-ins. The creation of these plug-ins is beyond the scope of this manual page. When PLUGIN contains a slash (/), rwstats assumes the path to PLUGIN is correct. Otherwise, rwstats will attempt to find the file in $SILK_PATH/lib/silk, $SILK_PATH/share/lib, $SILK_PATH/lib, and in these directories parallel to the application's directory: lib/silk, share/lib, and lib. If rwstats does not find the file, it assumes the plug-in is in the current directory. To force rwstats to look in the current directory first, specify --plugin=./PLUGIN. When the SILK_PLUGIN_DEBUG environment variable is non-empty, rwstats prints status messages to the standard error as it tries to open each of its plug-ins.
Many SiLK file formats do not store the following fields and their values will always be 0; they are listed here for completeness:
SiLK can store flows generated by enhanced collection software that provides more information than NetFlow v5. These flows may support some or all of the additional fields initialFlags, sessionFlags, attributes, and application; for flows without this additional information, the field's value is always 0.
As an example of the T and C attributes, consider a long-running ssh
session that exceeds the flow generator's active timeout. (This is the
active timeout since the flow generator creates a flow for a
connection that still has activity.)
The flow generator will create multiple flow records for this ssh
session, each spanning some portion of the total session. The first
flow record will be marked with a T indicating that it hit the
timeout. The second through next-to-last records will be marked with
TC indicating that this flow both timed out and is a continuation
of a flow that timed out. The final flow will be marked with a C,
indicating that it was created as a continuation of an active flow.
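The marking pattern in this ssh example can be written as a small Python sketch. The function `timeout_attributes` is hypothetical; it models only the T and C letters for one connection split by the active timeout, ignores the F attribute, and assumes that a connection that was never split carries neither letter.

```python
def timeout_attributes(num_records):
    """Attribute letters for each flow record of one timed-out connection, in order."""
    if num_records < 2:
        # Assumption: a connection that fits in one record has no T or C mark
        return [""] * num_records
    middle = ["TC"] * (num_records - 2)  # timed out and continues an earlier flow
    return ["T"] + middle + ["C"]        # first hit the timeout; last is a continuation

print(timeout_attributes(4))  # ['T', 'TC', 'TC', 'C']
```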
The list of built-in fields may be augmented by run-time loading of plug-ins (shared object files or dynamic libraries) when the plug-in is available. rwstats automatically looks for the following plug-ins:
ADDRESS TYPE (addrtype.so)
COUNTRY CODE (ccfilter.so)
PREFIX MAP (pmapfilter.so)
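The KEY syntax accepted by --fields (field names, field integers, and hyphenated ranges of integers) can be sketched in Python. The parser below is an illustration, not the SiLK parser: `parse_fields` and `FIELD_IDS` are hypothetical names, and the table covers only fields 1 through 12 from the list above.

```python
# Subset of the built-in fields listed above, keyed by lower-cased name
FIELD_IDS = {
    "sip": 1, "dip": 2, "sport": 3, "dport": 4, "protocol": 5,
    "packets": 6, "pkts": 6, "bytes": 7, "flags": 8,
    "stime": 9, "dur": 10, "etime": 11, "sensor": 12,
}

def parse_fields(key):
    """Expand a KEY such as 'stime,10,1-5' into an ordered list of field integers."""
    ids = []
    for token in key.split(","):
        token = token.strip().lower()  # field names are case insensitive
        if "-" in token and token.replace("-", "").isdigit():
            start, end = (int(part) for part in token.split("-"))
            expanded = list(range(start, end + 1))
        elif token.isdigit():
            expanded = [int(token)]
        else:
            expanded = [FIELD_IDS[token]]
        for field_id in expanded:
            if field_id in ids:
                # Each field may be specified once only
                raise ValueError(f"field {field_id} specified more than once")
            ids.append(field_id)
    return ids

print(parse_fields("stime,10,1-5"))  # [9, 10, 1, 2, 3, 4, 5]
```

The result keeps the order in which the fields were given, which is also the order in which the columns are displayed.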
To determine the value of N for a Top-N (or Bottom-N) list, one of the following switches must be specified. The primary value may limit which switch may be specified. When --presorted-input is active, only the --count switch is supported.
- --count=N
-
Print the N bins with the largest (or smallest) values. This is always allowed.
- --threshold=N
-
Print the bins where the primary value is greater-than (or less-than) the value N. This switch is not allowed when the primary value comes from a plug-in.
- --percentage=N
-
Print the bins where the primary value is greater-than (or less-than) N percent of the sum of the primary values across all bins. To use this switch, the primary value must be Bytes, Packets, or Records.
To determine whether to compute the Top-N or the Bottom-N, specify one of the following switches. If neither switch is given, --top is assumed:
- --top
-
Print the top N keys and their values. This is the default.
- --bottom
-
Print the bottom N keys and their values.
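How --threshold and --percentage select bins, and how --bottom reverses the comparison, can be sketched in Python. The function `select_bins` and the sample bins are assumptions for illustration; the strict comparisons follow the "greater-than (or less-than)" wording above, and whether rwstats treats a value equal to the cutoff as a match is not stated here.

```python
def select_bins(bins, threshold=None, percentage=None, bottom=False):
    """Return (key, value) pairs that pass a threshold or percentage cutoff."""
    if (threshold is None) == (percentage is None):
        raise ValueError("give exactly one of threshold or percentage")
    if percentage is not None:
        # A percentage becomes an absolute cutoff using the total over all bins
        threshold = sum(bins.values()) * percentage / 100.0
    if bottom:
        keep = [(k, v) for k, v in bins.items() if v < threshold]
    else:
        keep = [(k, v) for k, v in bins.items() if v > threshold]
    return sorted(keep, key=lambda kv: kv[1], reverse=not bottom)

# Hypothetical record counts per source port; the total is 1000
ports = {80: 600, 53: 300, 0: 90, 25: 10}
print(select_bins(ports, percentage=5))               # [(80, 600), (53, 300), (0, 90)]
print(select_bins(ports, threshold=100, bottom=True)) # [(25, 10), (0, 90)]
```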
PROTOCOL STATISTICS INVOCATION
The following switches will compute and print, for each of bytes, packets, and bytes per packet, the minimum value, the maximum value, quartiles, and a count of the number of flows that fall into each of ten intervals. These switches cannot be combined with the switches that produce Top-N or Bottom-N lists.
- --overall-stats
-
Print intervals and quartiles across all flows that were read by rwstats.
- --detail-proto-stats=PROTO[,PROTO...]
-
Print intervals and quartiles for each individual protocol listed as an argument. The argument should be a comma separated list of protocols or ranges of protocols, for example 1-6,17. Specifying this option implies --overall-stats.
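The interval counts printed in this mode can be illustrated with a Python sketch that places each value in the first interval whose maximum is not smaller than the value, then reports the percentage and cumulative percentage per interval. The function `interval_counts` and the sample values are assumptions; the sketch does not cover how rwstats computes quartiles, and it expects the last interval maximum to be at least as large as every value.

```python
from bisect import bisect_left

def interval_counts(values, interval_maxes):
    """Return (interval_max, count, percent, cumulative_percent) per interval."""
    counts = [0] * len(interval_maxes)
    for v in values:
        # bisect_left finds the first interval whose maximum is >= v
        counts[bisect_left(interval_maxes, v)] += 1
    total = len(values)
    rows, cumulative = [], 0.0
    for upper, n in zip(interval_maxes, counts):
        pct = 100.0 * n / total if total else 0.0
        cumulative += pct
        rows.append((upper, n, pct, cumulative))
    return rows

# Hypothetical per-flow byte counts and a shortened list of interval maxima
byte_counts = [28, 40, 41, 60, 5000]
for row in interval_counts(byte_counts, [40, 60, 100, 4294967295]):
    print("%10d|%10d|%10.6f|%10.6f|" % row)
```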
MISCELLANEOUS SWITCHES
The following switches are available when rwstats is running in either mode, though many are only applicable to the Top-N mode.
- --presorted-input
-
Cause rwstats to assume that it is reading sorted input; i.e., that rwstats's input file(s) were generated by rwsort(1) using the exact same value for the --fields switch. This option allows rwstats to process an endless stream of records. When multiple input files are specified, rwstats will merge-sort the flow records from the input files.
- --no-percents
-
For the Top-N invocation, do not print the percent-of-total and cumulative-percentage columns. These columns will contain a question mark when the primary value is not one of Bytes, Packets, or Records, and this switch allows you to suppress them.
- --ipv6-policy=POLICY
-
Determine how IPv4 and IPv6 flows are handled when SiLK has been compiled with IPv6 support. When the switch is not provided, the SILK_IPV6_POLICY environment variable is checked for a policy. If it is also unset or contains an invalid policy, the POLICY is mix. When SiLK has not been compiled with IPv6 support, IPv6 flows are always ignored, regardless of the value passed to this switch or in the SILK_IPV6_POLICY variable. The supported values for POLICY are:
- ignore
-
Completely ignore IPv6 flows. Only IPv4 flows will be printed.
- asv4
-
Convert IPv6 addresses to IPv4 if possible, otherwise ignore the IPv6 flows.
- mix
-
Process the input as a mixture of IPv4 and IPv6 flows.
- force
-
Force IPv4 flows to be converted to IPv6.
- only
-
Only process flows that were marked as IPv6 and completely ignore IPv4 flows.
- --bin-time
- --bin-time=SECONDS
-
Adjust the key fields 'sTime' and 'eTime' to appear on SECONDS-second boundaries (the floor of the time is used). When no value is provided to the switch, 60-second time bins are used.
- --epoch-time
-
Print timestamps as epoch time (number of seconds since midnight GMT on 1970-01-01).
- --integer-ips
-
Print IP addresses as integers. By default, IPs are printed in their canonical form.
- --zero-pad-ips
-
Print IP addresses in their canonical form, but add zeros to the IP address so it fully fills the width of the column. For IPv4, use three digits per octet, e.g., 127.000.000.001. For IPv6, use four digits per hexadectet and expand empty hexadectets, e.g., 0000:0000:0000:0000:0000:FFFF:FF00:0001.
- --integer-sensors
-
Print the integer ID of the sensor rather than its name.
- --no-titles
-
Disable section and column titles. By default, titles are printed.
- --no-columns
-
Disable fixed-width columnar output.
- --column-separator=C
-
Use specified character between columns and after the final column. When this switch is not specified, the default of '|' is used.
- --no-final-delimiter
-
Do not print the column separator after the final column. Normally a delimiter is printed.
- --delimited
- --delimited=C
-
Run as if --no-columns --no-final-delimiter --column-sep=C had been specified. That is, disable fixed-width columnar output; if character C is provided, it is used as the delimiter between columns instead of the default '|'.
- --print-filenames
-
Print to the standard error the names of input files as they are opened.
- --copy-input=PATH
-
Copy all binary input to the specified file or named pipe. PATH can be stdout to print flows to the standard output as long as the --output-path switch has been used to redirect rwstats's ASCII output.
- --output-path=PATH
-
Determine where the output of rwstats (ASCII text) is written. If this option is not given, output is written to the standard output.
- --pager=PAGER_PROG
-
When output is to a terminal, invoke the program PAGER_PROG to view the output one screen full at a time. This switch overrides the SILK_PAGER environment variable, which in turn overrides the PAGER variable. If the value of the pager is determined to be the empty string, no paging will be performed and all output will be printed to the terminal.
- --temp-directory=DIR_PATH
-
Specify the name of the directory in which to store data files temporarily when the memory is not large enough to store all the bins and their aggregate values. This switch overrides the directory specified in the SILK_TMPDIR environment variable, which overrides the directory specified in the TMPDIR variable, which overrides the default, /tmp.
- --site-config-file=FILENAME
-
Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK (use the --version switch to view this value); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application's directory.
- --legacy-timestamps
- --legacy-timestamps=NUM
-
Specify the format for human readable timestamps, either the default (new) style, YYYY/MM/DDThh:mm:ss, or the legacy style, MM/DD/YYYY hh:mm:ss. When this switch is not present, the timestamps will be in the default format. When this switch is present and no argument is given, timestamps are in the legacy format. When an argument is supplied, timestamps will be in the new format if the argument begins with 0, and in the old format if the argument begins with 1. Any other argument to the switch is an error.
- --help
-
Print the available options and exit. Options that add fields can be specified before --help so that the new options appear in the output.
- --legacy-help
-
Print help, including legacy switches. See the LEGACY SWITCHES section below for these switches.
- --version
-
Print the version number and information about how SiLK was configured, then exit the application.
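The adjustment made by the --bin-time switch described earlier in this list, which floors sTime and eTime to SECONDS-second boundaries, can be sketched in Python. The function `bin_time` is hypothetical, and aligning the bins to the epoch is an assumption; the text above only says that the floor of the time is used.

```python
def bin_time(epoch_seconds, bin_size=60):
    """Floor a timestamp, in epoch seconds, to the start of its time bin."""
    # The default of 60 mirrors the 60-second bins used when no value is given
    return epoch_seconds - (epoch_seconds % bin_size)

print(bin_time(125))        # 120
print(bin_time(7325, 3600)) # 7200
```

Flows that start at 120, 125, and 179 seconds all receive the key value 120 with the default bin size, so they fall into the same bin when sTime is part of the key.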
LEGACY SWITCHES
Use of the following switches is discouraged; instead, use the replacement switches as indicated.
- --sip
-
Use: --fields=sip
- --sip=CIDR
-
Use the most significant CIDR bits of the source address as the key. Using this switch with IPv6 data will cause an error. The user should use rwnetmask(1) to mask the data prior to processing it with rwstats.
- --dip
-
Use: --fields=dip
- --dip=CIDR
-
Use the most significant CIDR bits of the destination address as the key. Using this switch with IPv6 data will cause an error. The user should use rwnetmask to mask the data prior to processing it with rwstats.
- --sport
-
Use: --fields=sport
- --dport
-
Use: --fields=dport
- --protocol
-
Use: --fields=protocol
- --icmp
-
Use: --fields=icmpTypeCode
- --flows
-
Use: --values=records
- --packets
-
Use: --values=packets
- --bytes
-
Use: --values=bytes
The following switches are highly deprecated. We plan to remove them in 2010.
- --sip-topn=N
-
Use: --fields=sip [--top] [--values=flows] --count=N
- --sip-top-threshold=N
-
Use: --fields=sip [--top] [--values=flows] --threshold=N
- --sip-top-pct=N
-
Use: --fields=sip [--top] [--values=flows] --percentage=N
- --sip-btmn=N
-
Use: --fields=sip --bottom [--values=flows] --count=N
- --sip-btm-threshold=N
-
Use: --fields=sip --bottom [--values=flows] --threshold=N
- --sip-btm-pct=N
-
Use: --fields=sip --bottom [--values=flows] --percentage=N
- --dip-topn=N
-
Use: --fields=dip [--top] [--values=flows] --count=N
- --dip-top-threshold=N
-
Use: --fields=dip [--top] [--values=flows] --threshold=N
- --dip-top-pct=N
-
Use: --fields=dip [--top] [--values=flows] --percentage=N
- --dip-btmn=N
-
Use: --fields=dip --bottom [--values=flows] --count=N
- --dip-btm-threshold=N
-
Use: --fields=dip --bottom [--values=flows] --threshold=N
- --dip-btm-pct=N
-
Use: --fields=dip --bottom [--values=flows] --percentage=N
- --pair-topn=N
-
Use: --fields=sip,dip [--top] [--values=flows] --count=N
- --pair-top-threshold=N
-
Use: --fields=sip,dip [--top] [--values=flows] --threshold=N
- --pair-top-pct=N
-
Use: --fields=sip,dip [--top] [--values=flows] --percentage=N
- --pair-btmn=N
-
Use: --fields=sip,dip --bottom [--values=flows] --count=N
- --pair-btm-threshold=N
-
Use: --fields=sip,dip --bottom [--values=flows] --threshold=N
- --pair-btm-pct=N
-
Use: --fields=sip,dip --bottom [--values=flows] --percentage=N
- --sport-topn=N
-
Use: --fields=sport [--top] [--values=flows] --count=N
- --sport-top-threshold=N
-
Use: --fields=sport [--top] [--values=flows] --threshold=N
- --sport-top-pct=N
-
Use: --fields=sport [--top] [--values=flows] --percentage=N
- --sport-btmn=N
-
Use: --fields=sport --bottom [--values=flows] --count=N
- --sport-btm-threshold=N
-
Use: --fields=sport --bottom [--values=flows] --threshold=N
- --sport-btm-pct=N
-
Use: --fields=sport --bottom [--values=flows] --percentage=N
- --dport-topn=N
-
Use: --fields=dport [--top] [--values=flows] --count=N
- --dport-top-threshold=N
-
Use: --fields=dport [--top] [--values=flows] --threshold=N
- --dport-top-pct=N
-
Use: --fields=dport [--top] [--values=flows] --percentage=N
- --dport-btmn=N
-
Use: --fields=dport --bottom [--values=flows] --count=N
- --dport-btm-threshold=N
-
Use: --fields=dport --bottom [--values=flows] --threshold=N
- --dport-btm-pct=N
-
Use: --fields=dport --bottom [--values=flows] --percentage=N
- --portpair-topn=N
-
Use: --fields=sport,dport [--top] [--values=flows] --count=N
- --portpair-top-threshold=N
-
Use: --fields=sport,dport [--top] [--values=flows] --threshold=N
- --portpair-top-pct=N
-
Use: --fields=sport,dport [--top] [--values=flows] --percentage=N
- --portpair-btmn=N
-
Use: --fields=sport,dport --bottom [--values=flows] --count=N
- --portpair-btm-threshold=N
-
Use: --fields=sport,dport --bottom [--values=flows] --threshold=N
- --portpair-btm-pct=N
-
Use: --fields=sport,dport --bottom [--values=flows] --percentage=N
- --proto-topn=N
-
Use: --fields=protocol [--top] [--values=flows] --count=N
- --proto-top-threshold=N
-
Use: --fields=protocol [--top] [--values=flows] --threshold=N
- --proto-top-pct=N
-
Use: --fields=protocol [--top] [--values=flows] --percentage=N
- --proto-btmn=N
-
Use: --fields=protocol --bottom [--values=flows] --count=N
- --proto-btm-threshold=N
-
Use: --fields=protocol --bottom [--values=flows] --threshold=N
- --proto-btm-pct=N
-
Use: --fields=protocol --bottom [--values=flows] --percentage=N
- --cidr-src=N
-
Using --sip=N is currently supported but deprecated. The user should use rwnetmask(1) to do the masking.
- --cidr-dest=N
-
Using --dip=N is currently supported but deprecated. The user should use rwnetmask to do the masking.
EXAMPLES
$ rwstats --fields=sip --count=4 data.rwf
INPUT: 549092 Records for 12990 Bins and 549092 Total Records
OUTPUT: Top 4 Bins by Records
sIP| Records| %Records| cumul_%|
10.1.1.1| 36604| 6.666278| 6.666278|
10.1.1.2| 13897| 2.530906| 9.197184|
10.1.1.3| 12739| 2.320012| 11.517196|
10.1.1.4| 11807| 2.150277| 13.667473|
$ rwstats --fields=dip --values=packets --count=7 data.rwf
INPUT: 549092 Records for 44654 Bins and 6620587 Total Packets
OUTPUT: Top 7 Bins by Packets
dIP| Packets| %Packets| cumul_%|
10.1.1.1| 217574| 3.286325| 3.286325|
10.1.1.2| 138177| 2.087081| 5.373407|
10.1.1.3| 121892| 1.841106| 7.214512|
10.1.1.4| 97073| 1.466230| 8.680742|
10.1.1.5| 82284| 1.242851| 9.923593|
10.1.1.6| 80051| 1.209123| 11.132715|
10.1.1.7| 73602| 1.111714| 12.244430|
$ rwstats --fields=sip,dip --values=bytes --threshold=100000000 data.rwf
INPUT: 549092 Records for 107136 Bins and 3410300252 Total Bytes
OUTPUT: Top 5 Bins by Bytes (threshold 100000000)
sIP| dIP| Bytes| %Bytes| cumul_%|
10.1.1.1| 10.1.1.2| 307478707| 9.016177| 9.016177|
10.1.1.3| 10.1.1.4| 172164463| 5.048367| 14.064544|
10.1.1.5| 10.1.1.6| 142059589| 4.165604| 18.230147|
10.1.1.7| 10.1.1.8| 119388394| 3.500818| 21.730965|
10.1.1.9| 10.1.1.10| 108268824| 3.174759| 24.905725|
$ rwstats --fields=sport --percentage=5 data.rwf
INPUT: 549092 Records for 56799 Bins and 549092 Total Records
OUTPUT: Top 3 Bins by Records (5% == 27454)
sPort| Records| %Records| cumul_%|
80| 86677| 15.785515| 15.785515|
53| 64681| 11.779629| 27.565144|
0| 47760| 8.697996| 36.263140|
$ rwstats --fields=dport --bottom --count=8 data.rwf
INPUT: 549092 Records for 44772 Bins and 549092 Total Records
OUTPUT: Bottom 8 Bins by Records
dPort| Records| %Records| cumul_%|
19417| 1| 0.000182| 0.000182|
12110| 1| 0.000182| 0.000364|
34777| 1| 0.000182| 0.000546|
8999| 1| 0.000182| 0.000728|
36404| 1| 0.000182| 0.000911|
16682| 1| 0.000182| 0.001093|
27420| 1| 0.000182| 0.001275|
14162| 1| 0.000182| 0.001457|
$ rwstats --fields=sport,dport --values=packets \
--top --threshold=500000 data.rwf
INPUT: 366309 Records for 130307 Bins and 5597540 Total Packets
OUTPUT: No bins above threshold of 500000
$ rwstats --fields=sport,dport --values=packets \
--top --threshold=50000 data.rwf
INPUT: 366309 Records for 130307 Bins and 5597540 Total Packets
OUTPUT: Top 3 Bins by Packets (threshold 50000)
sPort|dPort| Packets| %Packets| cumul_%|
6699| 3607| 138177| 2.468531| 2.468531|
80| 1179| 59774| 1.067862| 3.536393|
80| 9659| 50319| 0.898949| 4.435342|
$ rwstats --fields=protocol --bottom --count=10 data.rwf
INPUT: 545262 Records for 3 Bins and 545262 Total Records
OUTPUT: Bottom 10 Bins by Records
protocol| Records| %Records| cumul_%|
1| 46319| 8.494815| 8.494815|
17| 132634| 24.324820| 32.819635|
6| 366309| 67.180365|100.000000|
$ rwstats --detail-proto-stats=6,17 data.rwf
FLOW STATISTICS--ALL PROTOCOLS: 549092 records
*BYTES min 28; max 88906238
quartiles LQ 122.06478 Med 420.30930 UQ 876.21920 UQ-LQ 754.15442
interval_max|count<=max|%_of_input| cumul_%|
40| 35107| 6.393646| 6.393646|
60| 35008| 6.375616| 12.769263|
100| 49500| 9.014883| 21.784145|
150| 40014| 7.287303| 29.071449|
256| 65444| 11.918586| 40.990034|
1000| 224016| 40.797535| 81.787569|
10000| 75708| 13.787853| 95.575423|
100000| 21981| 4.003154| 99.578577|
1000000| 1901| 0.346208| 99.924785|
4294967295| 413| 0.075215|100.000000|
*PACKETS min 1; max 70023
quartiles LQ 1.76962 Med 3.68119 UQ 7.61567 UQ-LQ 5.84605
interval_max|count<=max|%_of_input| cumul_%|
3| 232716| 42.381969| 42.381969|
4| 61407| 11.183372| 53.565341|
10| 195310| 35.569631| 89.134972|
20| 33310| 6.066379| 95.201351|
50| 17686| 3.220954| 98.422304|
100| 4854| 0.884005| 99.306309|
500| 2760| 0.502648| 99.808957|
1000| 373| 0.067930| 99.876888|
10000| 637| 0.116010| 99.992897|
4294967295| 39| 0.007103|100.000000|
*BYTES/PACKET min 28; max 1500
quartiles LQ 57.98319 Med 90.71150 UQ 164.77250 UQ-LQ 106.78932
interval_max|count<=max|%_of_input| cumul_%|
40| 42568| 7.752435| 7.752435|
44| 15173| 2.763289| 10.515724|
60| 91003| 16.573361| 27.089085|
100| 163850| 29.840173| 56.929258|
200| 153190| 27.898786| 84.828043|
400| 39761| 7.241227| 92.069271|
600| 12810| 2.332942| 94.402213|
800| 7954| 1.448573| 95.850786|
1500| 22783| 4.149214|100.000000|
4294967295| 0| 0.000000|100.000000|
FLOW STATISTICS--PROTOCOL 6: 366309/549092 records
*BYTES min 40; max 88906238
quartiles LQ 310.47331 Med 656.53661 UQ 1089.75344 UQ-LQ 779.28013
interval_max|count<=max|%_of_proto| cumul_%|
40| 29774| 8.128110| 8.128110|
60| 11453| 3.126595| 11.254706|
100| 6915| 1.887751| 13.142456|
150| 16369| 4.468632| 17.611088|
256| 12651| 3.453642| 21.064730|
1000| 196881| 53.747246| 74.811976|
10000| 68989| 18.833553| 93.645529|
100000| 21099| 5.759891| 99.405420|
1000000| 1784| 0.487021| 99.892441|
4294967295| 394| 0.107559|100.000000|
*PACKETS min 1; max 70023
quartiles LQ 3.39682 Med 5.85903 UQ 8.80427 UQ-LQ 5.40745
interval_max|count<=max|%_of_proto| cumul_%|
3| 69358| 18.934288| 18.934288|
4| 55993| 15.285729| 34.220016|
10| 186559| 50.929407| 85.149423|
20| 30947| 8.448332| 93.597755|
50| 16186| 4.418674| 98.016429|
100| 4204| 1.147665| 99.164094|
500| 2178| 0.594580| 99.758674|
1000| 315| 0.085993| 99.844667|
10000| 537| 0.146598| 99.991264|
4294967295| 32| 0.008736|100.000000|
*BYTES/PACKET min 40; max 1500
quartiles LQ 60.19817 Med 96.78616 UQ 175.08044 UQ-LQ 114.88228
interval_max|count<=max|%_of_proto| cumul_%|
40| 36559| 9.980372| 9.980372|
44| 14929| 4.075521| 14.055893|
60| 39593| 10.808634| 24.864527|
100| 100117| 27.331297| 52.195824|
200| 111258| 30.372718| 82.568542|
400| 26020| 7.103293| 89.671834|
600| 8600| 2.347745| 92.019579|
800| 7726| 2.109148| 94.128727|
1500| 21507| 5.871273|100.000000|
4294967295| 0| 0.000000|100.000000|
FLOW STATISTICS--PROTOCOL 17: 132634/549092 records
*BYTES min 32; max 2115559
quartiles LQ 66.53665 Med 150.61551 UQ 242.44095 UQ-LQ 175.90430
interval_max|count<=max|%_of_proto| cumul_%|
20| 0| 0.000000| 0.000000|
40| 5195| 3.916794| 3.916794|
80| 42150| 31.779182| 35.695975|
130| 11528| 8.691587| 44.387563|
256| 45497| 34.302667| 78.690230|
1000| 23401| 17.643289| 96.333519|
10000| 4447| 3.352836| 99.686355|
100000| 389| 0.293288| 99.979643|
1000000| 23| 0.017341| 99.996984|
4294967295| 4| 0.003016|100.000000|
*PACKETS min 1; max 8839
quartiles LQ 0.84383 Med 1.68768 UQ 2.53149 UQ-LQ 1.68766
interval_max|count<=max|%_of_proto| cumul_%|
3| 117884| 88.879171| 88.879171|
4| 4452| 3.356605| 92.235777|
10| 6678| 5.034908| 97.270685|
20| 1766| 1.331484| 98.602168|
50| 1055| 0.795422| 99.397590|
100| 368| 0.277455| 99.675046|
500| 353| 0.266146| 99.941192|
1000| 33| 0.024880| 99.966072|
10000| 45| 0.033928|100.000000|
4294967295| 0| 0.000000|100.000000|
*BYTES/PACKET min 32; max 1415
quartiles LQ 63.23827 Med 91.27180 UQ 158.10219 UQ-LQ 94.86392
interval_max|count<=max|%_of_proto| cumul_%|
20| 0| 0.000000| 0.000000|
24| 0| 0.000000| 0.000000|
40| 5671| 4.275676| 4.275676|
100| 70970| 53.508150| 57.783826|
200| 39298| 29.628904| 87.412730|
400| 12175| 9.179396| 96.592126|
600| 4130| 3.113832| 99.705958|
800| 160| 0.120633| 99.826590|
1500| 230| 0.173410|100.000000|
4294967295| 0| 0.000000|100.000000|
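The %_of_proto and cumul_% columns in the output above appear to follow directly from the count<=max column and the per-protocol record total. The following is a minimal sketch of that arithmetic, assuming each percentage is 100 * count / total and the cumulative column is a running sum of counts divided by the same total; interval_percents is a hypothetical helper, not part of SiLK.

```shell
# Sketch: recompute %_of_proto and cumul_% from interval counts.
# interval_percents is a hypothetical helper, not part of SiLK.
# Usage: interval_percents TOTAL < counts   (one count per line)
interval_percents() {
    awk -v total="$1" '
        {
            cumul += $1    # running count of records seen so far
            printf "%d|%.6f|%.6f|\n", $1, 100 * $1 / total, 100 * cumul / total
        }'
}

# First three BYTES interval counts for protocol 6 in the output above,
# with that protocol's record total (366309) as the denominator.
printf '%s\n' 29774 11453 6915 | interval_percents 366309
```

The printed percentages can be compared against the corresponding rows of the protocol 6 BYTES table.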
The silkpython(3) manual page provides examples that use PySiLK to create arbitrary fields to use as part of the key for rwstats.
ENVIRONMENT
- SILK_IPV6_POLICY
-
This environment variable is used as the value for the --ipv6-policy switch when that switch is not provided.
- SILK_PAGER
-
When set to a non-empty string, rwstats automatically invokes this program to display its output a screen at a time. If set to an empty string, rwstats does not automatically page its output.
- PAGER
-
When set and SILK_PAGER is not set, rwstats automatically invokes this program to display its output a screen at a time.
- SILK_TMPDIR
-
When set and --temp-directory is not specified, rwstats writes the temporary files it creates to this directory. SILK_TMPDIR overrides the value of TMPDIR.
- TMPDIR
-
When set and SILK_TMPDIR is not set, rwstats writes the temporary files it creates to this directory.
- PYTHONPATH
-
This environment variable is used by Python to locate modules. When --python-file is specified, rwstats loads Python, which in turn loads the PySiLK module; that module consists of several files (silk/pysilk_nl.so, silk/__init__.py, etc). If this silk/ directory is located outside Python's normal search path (for example, in the SiLK installation tree), it may be necessary to set or modify the PYTHONPATH environment variable to include the parent directory of silk/ so that Python can find the PySiLK module.
- SILK_PYTHON_TRACEBACK
-
When set, Python plug-ins will output traceback information on Python errors to the standard error.
- SILK_COUNTRY_CODES
-
This environment variable allows the user to specify the country code mapping file that the ccfilter(3) plug-in will use. The value may be a complete path or a path relative to SILK_PATH. If the variable is not specified, the code looks for a file named country_codes.pmap in the location specified by SILK_PATH.
- SILK_CONFIG_FILE
-
This environment variable is used as the value for the --site-config-file switch when that switch is not provided.
- SILK_DATA_ROOTDIR
-
When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwstats looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
- SILK_PATH
-
This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwstats checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share. These directories are also searched when any other configuration file is required (e.g., the country code map). In addition, rwstats looks for plug-ins in $SILK_PATH/lib/silk, $SILK_PATH/share/lib and $SILK_PATH/lib.
- SILK_PLUGIN_DEBUG
-
When set to 1, rwstats prints status messages to the standard error as it tries to open each of its plug-ins.
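The precedence rules for the temporary directory and the pager described above can be sketched as two shell functions. This is an illustration of the documented order only, not how rwstats implements it: resolve_tmpdir and resolve_pager are hypothetical helpers, and treating an empty SILK_TMPDIR the same as an unset one is an assumption.

```shell
# Hypothetical helpers mirroring the documented precedence; not part of SiLK.

# Temporary directory: the --temp-directory value (passed here as $1),
# then SILK_TMPDIR, then TMPDIR.
# Assumption: an empty SILK_TMPDIR is treated the same as an unset one.
resolve_tmpdir() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    elif [ -n "${SILK_TMPDIR:-}" ]; then
        printf '%s\n' "$SILK_TMPDIR"
    else
        printf '%s\n' "${TMPDIR:-}"
    fi
}

# Pager: SILK_PAGER whenever it is set (an empty value disables paging),
# otherwise PAGER.
resolve_pager() {
    if [ -n "${SILK_PAGER+set}" ]; then
        printf '%s\n' "$SILK_PAGER"
    else
        printf '%s\n' "${PAGER:-}"
    fi
}
```

The `${SILK_PAGER+set}` expansion distinguishes a variable that is set but empty from one that is unset, which matters because an empty SILK_PAGER turns paging off rather than falling back to PAGER.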
NOTES
When used in an IPv6 environment, rwstats will process every record as long as the IP address is not part of the key. When aggregating by an IP address or an IP-pair, rwstats will attempt to convert any IPv6 addresses to IPv4. Records that can be converted will be processed; all other records will be silently ignored.
The output of rwstats is similar to that of rwaddrcount(1), rwtotal(1), and rwuniq(1).
To compute Top-N lists for other key combinations or to see values for Records, Packets, and Bytes in a single view, consider using another SiLK tool and passing the output through sort and head. For example, to see the Top-10 lists for sip,sport combinations, counting by Bytes:
$ rwfilter ...| rwuniq --fields=sip,sport --all --no-titles \
| sort -t '|' -k 3,3nr | head -10
rwstats uses a hash table internally when computing Top-N and Bottom-N lists. rwstats may run out of memory when processing IP addresses, especially IP-pairs. If the hash table does run out of memory, rwstats will stop processing input, print a warning to the standard error, output the entries it has computed to that point, and exit with code 16.
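Because the output is incomplete when rwstats exits with code 16, a script that consumes the results may want to check for that status explicitly. The following is a minimal sketch; run_rwstats is a hypothetical wrapper, not part of SiLK, and the warning text it prints is its own.

```shell
# Hypothetical wrapper: run the given command and flag the documented
# out-of-memory exit status (16), in which case the output is partial.
run_rwstats() {
    "$@"
    status=$?
    if [ "$status" -eq 16 ]; then
        echo "warning: rwstats ran out of memory; results are partial" >&2
    fi
    return "$status"
}

# Example call (the input file name is illustrative):
#   run_rwstats rwstats --fields=sip,dip --count=10 flows.rw > top10.txt
```

The wrapper passes the exit status through unchanged, so callers can still distinguish code 16 from other failures.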
SEE ALSO
rwfilter(1), rwcut(1), rwnetmask(1), rwsort(1), rwuniq(1), rwaddrcount(1), rwtotal(1), addrtype(3), ccfilter(3), pmapfilter(3), silkpython(3), pysilk(3), yaf(1)


