NAME
rwcount - Print traffic summary across time
SYNOPSIS
rwcount [--bin-size=SIZE] [--load-scheme=LOADSTYLE]
[--start-epoch=START_TIME] [--end-epoch=END_TIME]
[--epoch-slots] [--bin-slots] [--skip-zeroes] [--no-titles]
[--no-columns] [--column-separator=CHAR]
[--no-final-delimiter] [{--delimited | --delimited=CHAR}]
[--print-filenames] [--copy-input=PATH] [--output-path=PATH]
[--pager=PAGER_PROG] [--site-config-file=FILENAME]
[{--legacy-timestamps | --legacy-timestamps=NUM}]
[FILES...]
DESCRIPTION
rwcount summarizes SiLK flow records across time. It counts the records in the input stream, and groups their byte and packet totals into time bins. rwcount produces textual output with a row for each bin.
When input files are not specified on the command line, rwcount will read records from the standard input.
OPTIONS
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
- --bin-size=SIZE
- Denote the size of each time bin, in seconds; defaults to 30 seconds. rwcount supports millisecond-sized bins; SIZE may be a floating point value equal to or greater than 0.001.
- --load-scheme=LOADSTYLE
- Determine how the duration of each flow is mapped onto the time bins. LOADSTYLE can be one of the following:
- 0
- Assume the traffic is evenly distributed across the bins that contain any part of the flow's duration. For a flow whose duration spans five bins, each bin's packet- and byte-counts will be incremented with 1/5 of the values for the entire flow.
-
The traffic is NOT evenly distributed across the flow's duration, since, when using a bin-size of 30 seconds, a particularly placed 32 second flow will span three bins, and each bin will receive 1/3 of the flow. Compare with option 4.
- 1
- Assume all of the traffic occurs in the initial millisecond of the flow's duration. For a flow whose duration spans five bins, the first bin's packet- and byte-counts will be incremented with the values for the entire flow.
- 2
- Assume all of the traffic occurs in the last millisecond of the flow's duration. For a flow whose duration spans five bins, the fifth bin's packet- and byte-counts will be incremented with the values for the entire flow.
- 3
- Assume all of the traffic occurs in the middle millisecond of the flow's duration. For a flow whose duration spans five bins, the third bin's packet- and byte-counts will be incremented with the values for the entire flow.
- 4
- Assume the traffic is evenly distributed during each millisecond that the flow is active. For a flow whose duration spans five bins, each bin will receive a portion of the flow-, packet-, and byte-counts weighted by the amount of time the flow spent in each bin.
-
When using 30 second bins, a particularly placed 32 second flow will add 1/32 of its value to the first and last bins, and 30/32 to the middle bin.
- --start-epoch=START_TIME
- Denote the time to use for the first bin. START_TIME may be in UNIX epoch seconds or in yyyy/mm/dd:HH[:MM[:SS[.sss]]] format.
- --end-epoch=END_TIME
- Denote the time to use for the final bin. END_TIME may be in UNIX epoch seconds or in yyyy/mm/dd:HH[:MM[:SS[.sss]]] format. When neither START_TIME nor END_TIME is specified to the millisecond, the ceiling of END_TIME is used. END_TIME will be adjusted so that the number of bins is an integer value. When both START_TIME and END_TIME are used, rwcount will allocate bins for the entire time span before it begins processing data, or exit abnormally if it cannot allocate the required memory.
- --epoch-slots
- Use the UNIX epoch time as the label for each bin in the output; the default is to label each bin with the time in a human-readable format.
- --bin-slots
- Use the internal bin index as the label for each bin in the output; the default is to label each bin with the time in a human-readable format.
- --skip-zeroes
- Disable printing of bins with no traffic. By default, all bins are printed.
- --no-titles
- Turn off column titles. By default, titles are printed.
- --no-columns
- Disable fixed-width columnar output.
- --column-separator=C
- Use specified character between columns and after the final column. When this switch is not specified, the default of '|' is used.
- --no-final-delimiter
- Do not print the column separator after the final column. Normally a delimiter is printed.
- --delimited
- --delimited=C
- Run as if --no-columns --no-final-delimiter --column-sep=C had been specified. That is, disable fixed-width columnar output; if character C is provided, it is used as the delimiter between columns instead of the default '|'.
- --print-filenames
- Print to the standard error the names of input files as they are opened.
- --copy-input=PATH
- Copy all binary input to the specified file or named pipe. PATH can be stdout to print flows to the standard output as long as the --output-path switch has been used to redirect rwcount's ASCII output.
- --output-path=PATH
- Determine where the output of rwcount (ASCII text) is written. If this option is not given, output is written to the standard output.
- --pager=PAGER_PROG
- When output is to a terminal, invoke the program PAGER_PROG to view the output one screen full at a time. This switch overrides the SILK_PAGER environment variable, which in turn overrides the PAGER variable. If the value of the pager is determined to be the empty string, no paging will be performed and all output will be printed to the terminal.
- --site-config-file=FILENAME
- Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, the location specified by the SILK_CONFIG_FILE environment variable is used if that variable is not empty. The value of SILK_CONFIG_FILE should include the name of the file. Otherwise, the application looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK (use the --version switch to view this value); the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/; and the share/silk/ and share/ directories parallel to the application's directory.
- --legacy-timestamps
- --legacy-timestamps=NUM
- Specify the format for human-readable timestamps, either the default (new) style, YYYY/MM/DDThh:mm:ss, or the legacy style, MM/DD/YYYY hh:mm:ss. When this switch is not present, the timestamps will be in the default format. When this switch is present and no argument is given, timestamps are in the legacy format. When an argument is supplied, timestamps will be in the new format if the argument begins with 0, and in the old format if the argument begins with 1. Any other argument to the switch is an error.
The default LOADSTYLE is 4.
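The five LOADSTYLE values described under --load-scheme can be illustrated with a small Python model. This is only a sketch of the documented behavior, not rwcount's implementation: the function name distribute, its parameters, and the choice to treat the flow's end time as exclusive are assumptions made for illustration.

```python
def distribute(start_ms, duration_ms, value, bin_ms, scheme):
    """Return {bin_index: share of value} for one flow (hypothetical model)."""
    end_ms = start_ms + duration_ms            # assumed exclusive end of the flow
    first = start_ms // bin_ms                 # bin holding the first millisecond
    # A zero-length flow still occupies the bin that holds its start time.
    last = max(end_ms - 1, start_ms) // bin_ms
    if scheme == 0:                            # equal share for every touched bin
        count = last - first + 1
        return {b: value / count for b in range(first, last + 1)}
    if scheme == 1:                            # everything in the first millisecond
        return {first: value}
    if scheme == 2:                            # everything in the last millisecond
        return {last: value}
    if scheme == 3:                            # everything in the middle millisecond
        return {(start_ms + duration_ms // 2) // bin_ms: value}
    if scheme == 4:                            # weighted by time spent in each bin
        if duration_ms == 0:
            return {first: value}
        shares = {}
        for b in range(first, last + 1):
            lo = max(start_ms, b * bin_ms)
            hi = min(end_ms, (b + 1) * bin_ms)
            shares[b] = value * (hi - lo) / duration_ms
        return shares
    raise ValueError("scheme must be 0, 1, 2, 3, or 4")


# A 32 second flow of 3200 bytes that starts 1 second before a 30 second
# bin boundary, so it touches three bins.
even = distribute(29000, 32000, 3200, 30000, 0)
weighted = distribute(29000, 32000, 3200, 30000, 4)
print(even)
print(weighted)
```

With these inputs, scheme 0 gives each of the three bins one third of the bytes, while scheme 4 gives 1/32 to the first and last bins and 30/32 to the middle bin, matching the 32 second example in the option descriptions.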
EXAMPLES
To count all web traffic on Jan 1, 2003, into 1 hour bins:
rwfilter --pass=stdout --start-date=2003/01/01:00 \
--end-date=2003/01/01:24 --proto=6 --aport=80 \
| rwcount --bin-size=3600
               Date|   Records|       Bytes|   Packets|
2003/01/01T00:00:00|  12947.00|  1968190.00|  34312.00|
2003/01/01T01:00:00|  65318.00|  5783959.00| 100143.00|
2003/01/01T02:00:00|  13765.00|  1895933.00|  36121.00|
2003/01/01T03:00:00|  69599.00|  7062388.00| 144130.00|
2003/01/01T04:00:00| 204717.00| 18491693.00| 385293.00|
2003/01/01T05:00:00|  18664.00|  2352966.00|  45296.00|
....
To force the hourly bins in the previous example to run from 30 minutes past the hour, use the --start-epoch switch:
rwfilter ...| \
rwcount --bin-size=3600 --start-epoch=2002/12/31:23:30
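To see how --start-epoch shifts the bin boundaries, the following Python sketch computes which bin a given timestamp would fall into. It is an illustration only: the helper name bin_start is hypothetical, times are assumed to be UTC, and the handling of records that precede the first bin is not modeled.

```python
from datetime import datetime, timedelta, timezone


def bin_start(record_time, first_bin, bin_size_s):
    """Return the start time of the bin holding record_time (hypothetical helper)."""
    offset = (record_time - first_bin).total_seconds()
    if offset < 0:
        # What rwcount does with earlier records is outside this sketch.
        raise ValueError("record precedes the first bin")
    index = int(offset // bin_size_s)          # whole bins since the first bin
    return first_bin + timedelta(seconds=index * bin_size_s)


# Hourly bins anchored at 23:30 on 2002/12/31, as in the example above.
first_bin = datetime(2002, 12, 31, 23, 30, tzinfo=timezone.utc)
record = datetime(2003, 1, 1, 0, 45, 10, tzinfo=timezone.utc)
label = bin_start(record, first_bin, 3600)
print(label.strftime("%Y/%m/%dT%H:%M:%S"))     # prints 2003/01/01T00:30:00
```

A record at 00:45:10 is therefore counted in the bin labeled 00:30:00 instead of the 00:00:00 bin it would occupy with the default alignment.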
ENVIRONMENT
- SILK_PAGER
- When set to a non-empty string, rwcount automatically invokes this program to display its output a screen at a time. If set to an empty string, rwcount does not automatically page its output.
- PAGER
- When set and SILK_PAGER is not set, rwcount automatically invokes this program to display its output a screen at a time.
- SILK_CONFIG_FILE
- This environment variable is used as the value for the --site-config-file when that switch is not provided.
- SILK_DATA_ROOTDIR
- When the --site-config-file switch is not provided and the SILK_CONFIG_FILE environment variable is not set, rwcount looks for the site configuration file in $SILK_DATA_ROOTDIR/silk.conf.
- SILK_PATH
- This environment variable gives the root of the install tree. As part of its search for the SiLK site configuration file, rwcount checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share.
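The precedence among the --pager switch, SILK_PAGER, and PAGER can be summarized in a short Python sketch. The function resolve_pager and its parameters are hypothetical, and returning no pager when nothing is configured is an assumption; this page does not say what rwcount does in that case.

```python
def resolve_pager(cli_pager, env, output_is_tty):
    """Return the pager program to invoke, or None for no paging (hypothetical)."""
    if not output_is_tty:
        return None                    # paging applies only to terminal output
    if cli_pager is not None:
        chosen = cli_pager             # --pager overrides both variables
    elif "SILK_PAGER" in env:
        chosen = env["SILK_PAGER"]     # SILK_PAGER overrides PAGER, even when empty
    else:
        chosen = env.get("PAGER", "")  # assumption: an unset PAGER means no paging
    return chosen or None              # an empty string disables paging


print(resolve_pager("less", {"SILK_PAGER": "more"}, True))
print(resolve_pager(None, {"SILK_PAGER": "", "PAGER": "less"}, True))
```

The second call shows the case described for SILK_PAGER: setting it to an empty string turns paging off even though PAGER names a program.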
SEE ALSO
BUGS
rwuniq(1)'s --bin-time switch can do time-binning similar to what rwcount supports, but rwuniq cannot divide a SiLK record among multiple bins; that is, it has no equivalent of rwcount's --load-scheme switch. Such a feature could greatly increase rwuniq's already large memory requirements.


