NAME

rwaddrcount - count activity by IP address

SYNOPSIS

  rwaddrcount {--print-recs | --print-ips | --print-stat}
        [--use-dest] [--min-bytes=BYTEMIN] [--max-bytes=BYTEMAX]
        [--min-records=RECMIN] [--max-records=RECMAX]
        [--min-packets=PACKMIN] [--max-packets=PACKMAX]
        [--set-file=PATHNAME] [--sort-ips] [--timestamp-format=FORMAT]
        [--ip-format=FORMAT] [--integer-ips] [--zero-pad-ips]
        [--no-titles] [--no-columns] [--column-separator=CHAR]
        [--no-final-delimiter] [{--delimited | --delimited=CHAR}]
        [--print-filenames] [--copy-input=PATH] [--output-path=PATH]
        [--pager=PAGER_PROG] [--site-config-file=FILENAME]
        [{--legacy-timestamps | --legacy-timestamps=NUM}]
        {[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}

  rwaddrcount --help

  rwaddrcount --version

DESCRIPTION

rwaddrcount reads SiLK Flow records, sums the byte, packet, and record counts on those records by individual source or destination IP address, and maintains the time window during which that IP address was active. At the end of the count operation, the results per IP address are displayed when the --print-recs switch is given. rwaddrcount includes facilities for displaying only those IP addresses whose byte, packet, or flow counts are between specified minima and maxima.
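The binning described above can be modeled on plain text with standard tools. The following is only an illustrative sketch under stated assumptions, not how rwaddrcount is implemented: the sample records are made up, the textual input layout is hypothetical (rwaddrcount itself reads binary SiLK Flow records), and awk's associative arrays stand in for the tool's internal table.

```shell
# Made-up sample, one flow per line:
# source IP, bytes, packets, start time, end time (epoch seconds).
sample='10.0.0.1 100 2 1000 1010
10.0.0.2 40 1 1005 1006
10.0.0.1 60 3 990 1020'

# One bin per address: sum bytes and packets, count records, and keep the
# earliest start time and latest end time seen for that address.
bins=$(printf '%s\n' "$sample" | awk '
    {
        bytes[$1] += $2; packets[$1] += $3; records[$1]++
        if (!($1 in first) || $4 + 0 < first[$1]) first[$1] = $4 + 0
        if (!($1 in last)  || $5 + 0 > last[$1])  last[$1]  = $5 + 0
    }
    END {
        for (ip in bytes)
            printf "%s|%d|%d|%d|%d|%d|\n", ip, bytes[ip], packets[ip],
                   records[ip], first[ip], last[ip]
    }' | sort)

printf '%s\n' "$bins"
```

Each output line carries the same fields, in the same order, as a --print-recs row: address, bytes, packets, records, earliest start time, latest end time.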

rwaddrcount reads SiLK Flow records from the files named on the command line or from the standard input when no file names are specified and --xargs is not present. To read the standard input in addition to the named files, use - or stdin as a file name. If an input file name ends in .gz, the file is uncompressed as it is read. When the --xargs switch is provided, rwaddrcount reads the names of the files to process from the named text file or from the standard input if no file name argument is provided to the switch. The input to --xargs must contain one file name per line.

OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.

For the application to operate, one of the three --print options must be chosen.

--print-recs

Print one row for each bin that meets the minima/maxima criteria. Each bin contains the IP address, number of bytes, number of packets, number of flow records, earliest start time, and latest end time.

--print-ips

Print a single column containing the IP addresses for each bin that meets the minima/maxima criteria.

--print-stat

Print a one- or two-line summary (plus a title line) that summarizes the bins. The first line is a summary across all bins, and it contains the number of unique IP addresses and the sums of the bytes, packets, and flow records. The second line is printed only when one or more minima or maxima are specified. This second line contains the same columns as the first, and its values are the sums across those bins that meet the criteria.

--use-dest

Count by destination IP address in the filter record rather than source IP.

--min-bytes=BYTEMIN

Filtering criterion; for the final output (stats or printing), only include count records where the total number of bytes exceeds BYTEMIN.

--min-packets=PACKMIN

Filtering criterion; for the final output (stats or printing), only include count records where the total number of packets exceeds PACKMIN.

--min-records=RECMIN

Filtering criterion; for the final output (stats or printing), only include count records where the total number of filter records contributing to that count record exceeds RECMIN.

--max-bytes=BYTEMAX

Filtering criterion; for the final output (stats or printing), only include count records where the total number of bytes is less than BYTEMAX.

--max-packets=PACKMAX

Filtering criterion; for the final output (stats or printing), only include count records where the total number of packets is less than PACKMAX.

--max-records=RECMAX

Filtering criterion; for the final output (stats or printing), only include count records to which at most RECMAX filter records contributed.
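As a rough sketch of how the minimum and maximum switches combine, the helper below wraps one invocation that applies a byte minimum and a record maximum together. The function name addrs_in_range and the file name flows.rw are hypothetical; only switches documented on this page are used.

```shell
# Hypothetical helper: print, sorted by address, the bins whose byte and
# record totals satisfy the given limits.
# Usage: addrs_in_range MIN_BYTES MAX_RECORDS [FILE ...]
addrs_in_range() {
    min_bytes=$1
    max_records=$2
    shift 2
    rwaddrcount --print-recs --sort-ips \
        --min-bytes="$min_bytes" --max-records="$max_records" "$@"
}

# Example invocation (flows.rw is a placeholder file name):
#   addrs_in_range 1000 50 flows.rw
```

Replacing --print-recs with --print-stat in such a command would print the two-line summary, whose second line covers only the bins that pass both limits.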

--set-file=PATHNAME

Write the IPs into the rwset(1)-style binary IP-set file named PATHNAME. Use rwsetcat(1) to see the contents of this file.

--timestamp-format=FORMAT

Specify the format and/or timezone to use when printing timestamps. When this switch is not specified, the SILK_TIMESTAMP_FORMAT environment variable is checked for a default format and/or timezone. If it is empty or contains invalid values, timestamps are printed in the default format, and the timezone is UTC unless SiLK was compiled with local timezone support. FORMAT is a comma-separated list of a format and/or a timezone. The format is one of:

default

Print the timestamps as YYYY/MM/DDThh:mm:ss.

iso

Print the timestamps as YYYY-MM-DD hh:mm:ss.

m/d/y

Print the timestamps as MM/DD/YYYY hh:mm:ss.

epoch

Print the timestamps as the number of seconds since 00:00:00 UTC on 1970-01-01.

When a timezone is specified, it is used regardless of the default timezone support compiled into SiLK. The timezone is one of:

utc

Use Coordinated Universal Time to print timestamps.

local

Use the TZ environment variable or the local timezone.
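Because FORMAT is a comma-separated list, a format and a timezone can be given together. The sketch below assumes a hypothetical helper name, recs_iso_utc, and a placeholder input file.

```shell
# Hypothetical helper: print the bins with ISO-style timestamps in UTC,
# regardless of the timezone support compiled into SiLK.
# Usage: recs_iso_utc [FILE ...]
recs_iso_utc() {
    rwaddrcount --print-recs --timestamp-format=iso,utc "$@"
}

# The environment variable supplies a default when the switch is absent
# (flows.rw is a placeholder file name):
#   SILK_TIMESTAMP_FORMAT=epoch rwaddrcount --print-recs flows.rw
```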

--ip-format=FORMAT

For the --print-recs and --print-ips output formats, specify how IP addresses are printed. When this switch is not specified, the SILK_IP_FORMAT environment variable is checked for a format. If it is empty or contains an invalid format, IPs are printed in the canonical format. The FORMAT is one of:

canonical

Print IP addresses in their canonical form, 127.0.0.1.

zero-padded

Print IP addresses in their canonical form, but add zeros to the output so it fully fills the width of the column. The address 127.0.0.1 is printed as 127.000.000.001.

decimal

Print IP addresses as integers in decimal format. The address 127.0.0.1 is printed as 2130706433.

hexadecimal

Print IP addresses as integers in hexadecimal format. The address 127.0.0.1 is printed as 7f000001.

force-ipv6

Print all IP addresses in the canonical form for IPv6 without using any IPv4 notation. Any IPv4 address is mapped into the ::ffff:0:0/96 netblock. The address 127.0.0.1 is printed as ::ffff:7f00:1.
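To make the relationship between these formats concrete, the following sketch derives each representation of 127.0.0.1 using only shell arithmetic and printf. It is an illustration of the formats, not SiLK's code; all variable names are made up, and the arithmetic assumes a shell with integers wider than 32 bits for addresses at or above 128.0.0.0.

```shell
ip=127.0.0.1

# Split the dotted quad into its four octets with parameter expansion.
o1=${ip%%.*}; rest=${ip#*.}
o2=${rest%%.*}; rest=${rest#*.}
o3=${rest%%.*}; o4=${rest#*.}

# Pack the octets into a single integer, most significant octet first.
as_int=$(( (o1 << 24) | (o2 << 16) | (o3 << 8) | o4 ))

zero_padded=$(printf '%03d.%03d.%03d.%03d' "$o1" "$o2" "$o3" "$o4")
decimal=$(printf '%d' "$as_int")
hexadecimal=$(printf '%x' "$as_int")
# IPv4-mapped form: the address occupies the low 32 bits under ::ffff:0:0/96.
force_ipv6=$(printf '::ffff:%x:%x' $(( (o1 << 8) | o2 )) $(( (o3 << 8) | o4 )))

printf '%-12s %s\n' canonical "$ip" zero-padded "$zero_padded" \
    decimal "$decimal" hexadecimal "$hexadecimal" force-ipv6 "$force_ipv6"
```

The printed values agree with the examples quoted for each format above.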

--integer-ips

Print IP addresses as integers. This switch is equivalent to --ip-format=decimal. It is deprecated as of SiLK 3.7.0, and it will be removed in the SiLK 4.0 release.

--zero-pad-ips

Print IP addresses as fully-expanded, zero-padded values in their canonical form. This switch is equivalent to --ip-format=zero-padded. It is deprecated as of SiLK 3.7.0, and it will be removed in the SiLK 4.0 release.

--sort-ips

For the --print-recs and --print-ips output formats, the results are presented sorted by IP address.

--no-titles

Turn off column titles. By default, titles are printed.

--no-columns

Disable fixed-width columnar output.

--column-separator=C

Use the specified character between columns and after the final column. When this switch is not specified, the default of '|' is used.

--no-final-delimiter

Do not print the column separator after the final column. Normally a delimiter is printed.

--delimited
--delimited=C

Run as if --no-columns --no-final-delimiter --column-sep=C had been specified. That is, disable fixed-width columnar output; if character C is provided, it is used as the delimiter between columns instead of the default '|'.
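For instance, comma-separated output suitable for a spreadsheet or another script might be produced as sketched below. The helper name addrcount_csv is hypothetical.

```shell
# Hypothetical helper: print the bins as comma-separated values, with no
# column padding and no trailing delimiter.
# Usage: addrcount_csv [FILE ...]
addrcount_csv() {
    rwaddrcount --print-recs --delimited=, "$@"
}

# Per the description above, this is shorthand for:
#   rwaddrcount --print-recs --no-columns --no-final-delimiter \
#       --column-separator=, [FILE ...]
```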

--print-filenames

Print to the standard error the names of input files as they are opened.

--copy-input=PATH

Copy all binary SiLK Flow records read as input to the specified file or named pipe. PATH may be stdout or - to write flows to the standard output as long as the --output-path switch is specified to redirect rwaddrcount's textual output to a different location.

--output-path=PATH

Write the textual output to PATH, where PATH is a filename, a named pipe, the keyword stderr to write the output to the standard error, or the keyword stdout or - to write the output to the standard output (and bypass the paging program). If PATH names an existing file, rwaddrcount exits with an error unless the SILK_CLOBBER environment variable is set, in which case PATH is overwritten. If this switch is not given, the output is either sent to the pager or written to the standard output.
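The --copy-input and --output-path switches can be combined to chain two counts over the same records. The following is a sketch only; the helper name stats_then_dest_ips and its arguments are hypothetical.

```shell
# Hypothetical helper: write per-source summary statistics to STATS_FILE
# while passing the flow records on, unchanged, to a second rwaddrcount
# that lists the destination addresses.
# Usage: stats_then_dest_ips STATS_FILE [FILE ...]
stats_then_dest_ips() {
    stats_file=$1
    shift
    # --output-path moves the text away from the standard output so that
    # --copy-input=stdout may carry the binary records down the pipe.
    rwaddrcount --print-stat --copy-input=stdout \
            --output-path="$stats_file" "$@" \
        | rwaddrcount --print-ips --use-dest
}
```

As noted above, the first command exits with an error if STATS_FILE already exists, unless SILK_CLOBBER is set.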

--pager=PAGER_PROG

When output is to a terminal, invoke the program PAGER_PROG to view the output one screen full at a time. This switch overrides the SILK_PAGER environment variable, which in turn overrides the PAGER variable. If the --output-path switch is given or if the value of the pager is determined to be the empty string, no paging is performed and all output is written to the terminal.

--site-config-file=FILENAME

Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, rwaddrcount searches for the site configuration file in the locations specified in the "FILES" section.

--legacy-timestamps
--legacy-timestamps=NUM

When NUM is not specified or is 1, this switch is equivalent to --timestamp-format=m/d/y. Otherwise, the switch has no effect. This switch is deprecated as of SiLK 3.0.0, and it will be removed in the SiLK 4.0 release.

--xargs
--xargs=FILENAME

Read the names of the input files from FILENAME or from the standard input if FILENAME is not provided. The input is expected to have one filename per line. rwaddrcount opens each named file in turn and reads records from it as if the filenames had been listed on the command line.
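One way to feed file names to --xargs is sketched below. The helper name count_named_files is hypothetical, and the sketch assumes that no file name contains a newline, since the input must have one name per line.

```shell
# Hypothetical helper: summarize the named files by sending their names,
# one per line, to rwaddrcount on the standard input.
# Usage: count_named_files FILE [FILE ...]
count_named_files() {
    printf '%s\n' "$@" | rwaddrcount --print-stat --xargs
}

# Reading the names from a text file instead (files.txt is a placeholder):
#   rwaddrcount --print-stat --xargs=files.txt
```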

--help

Print the available options and exit.

--version

Print the version number and information about how SiLK was configured, then exit the application.

Deprecated Switches

The following switches are deprecated. They will be removed in SiLK 4.0.

--byte-min=BYTEMIN

Deprecated alias for --min-bytes.

--packet-min=PACKMIN

Deprecated alias for --min-packets.

--rec-min=RECMIN

Deprecated alias for --min-records.

--byte-max=BYTEMAX

Deprecated alias for --max-bytes.

--packet-max=PACKMAX

Deprecated alias for --max-packets.

--rec-max=RECMAX

Deprecated alias for --max-records.

EXAMPLES

In the following examples, the dollar sign ($) represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the backslash (\) is used to indicate a wrapped line.

To print the set of IPs with exactly one TCP record during the time period, use:

 $ rwfilter --start-date=2003/09/01:00 --end-date=2003/09/01:12     \
        --proto=6 --pass=stdout                                     \
   | rwaddrcount --max-records=1 --print-ips

In general, to print out record information, use rwaddrcount with --print-recs:

 $ rwfilter --start-date=2003/01/17:00 --end-date=2003/01/17:23     \
        --proto=6 --pass=stdout                                     \
   | rwaddrcount --print-recs | head -3

  10.10.10.1|  65792| 147|  21| 2003/01/17T00:19:01| 2003/01/17T02:00:13|
  10.10.10.2| 110744|  89|   7| 2003/01/17T01:21:42| 2003/01/17T01:39:21|
  10.10.10.3|    864|  18|   6| 2003/01/17T00:20:33| 2003/01/17T01:25:38|

ENVIRONMENT

SILK_IP_FORMAT

This environment variable is used as the value for --ip-format when that switch is not provided. Since SiLK 3.11.0.

SILK_TIMESTAMP_FORMAT

This environment variable is used as the value for --timestamp-format when that switch is not provided. Since SiLK 3.11.0.

SILK_PAGER

When set to a non-empty string, rwaddrcount automatically invokes this program to display its output a screen at a time. If set to an empty string, rwaddrcount does not automatically page its output.

PAGER

When set and SILK_PAGER is not set, rwaddrcount automatically invokes this program to display its output a screen at a time.

SILK_CLOBBER

The SiLK tools normally refuse to overwrite existing files. Setting SILK_CLOBBER to a non-empty value removes this restriction.

SILK_CONFIG_FILE

This environment variable is used as the value for the --site-config-file switch when that switch is not provided.

SILK_DATA_ROOTDIR

This environment variable specifies the root directory of the data repository. As described in the "FILES" section, rwaddrcount may use this environment variable when searching for the SiLK site configuration file.

SILK_PATH

This environment variable gives the root of the install tree. When searching for configuration files, rwaddrcount may use this environment variable. See the "FILES" section for details.

TZ

When the argument to the --timestamp-format switch includes local or when a SiLK installation is built to use the local timezone, the value of the TZ environment variable determines the timezone in which rwaddrcount displays timestamps. (If both of those are false, the TZ environment variable is ignored.) If the TZ environment variable is not set, the machine's default timezone is used. Setting TZ to the empty string or 0 causes timestamps to be displayed in UTC. For system information on the TZ variable, see tzset(3) or environ(7). (To determine if SiLK was built with support for the local timezone, check the Timezone support value in the output of rwaddrcount --version.)

FILES

${SILK_CONFIG_FILE}
${SILK_DATA_ROOTDIR}/silk.conf
/data/silk.conf
${SILK_PATH}/share/silk/silk.conf
${SILK_PATH}/share/silk.conf
/usr/share/silk/silk.conf
/usr/share/silk.conf

Possible locations for the SiLK site configuration file which are checked when the --site-config-file switch is not provided.
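To check which of these locations exists on a given machine, a search along the following lines may help. This is a sketch that assumes the locations are tried in the order listed above. It is not SiLK's own lookup code, which may differ in detail, and find_silk_conf is a hypothetical name.

```shell
# Hypothetical helper: print the first candidate location that exists as a
# regular file, trying the locations in the order listed in this section.
find_silk_conf() {
    for candidate in \
        "${SILK_CONFIG_FILE:-}" \
        "${SILK_DATA_ROOTDIR:+$SILK_DATA_ROOTDIR/silk.conf}" \
        /data/silk.conf \
        "${SILK_PATH:+$SILK_PATH/share/silk/silk.conf}" \
        "${SILK_PATH:+$SILK_PATH/share/silk.conf}" \
        /usr/share/silk/silk.conf \
        /usr/share/silk.conf
    do
        # Candidates built from unset or empty variables expand to nothing.
        if [ -n "$candidate" ] && [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example: find_silk_conf || echo "no silk.conf found" >&2
```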

SEE ALSO

rwset(1), rwsetcat(1), rwstats(1), rwtotal(1), rwuniq(1), silk(7), tzset(3), environ(7)

NOTES

rwaddrcount only supports IPv4 addresses, and it will not be modified to support IPv6 addresses. To produce output similar to rwaddrcount for IPv6 addresses, use rwuniq(1):

 rwuniq --fields=sip --values=bytes,packets,records,stime,etime

When used in an IPv6 environment, rwaddrcount converts IPv6 flow records that contain addresses in the ::ffff:0:0/96 prefix to IPv4 and processes them. IPv6 records having addresses outside of that prefix are ignored.

rwaddrcount uses a fairly large hash table to store data, but processing time is still likely to increase as the amount of data grows.

Similar binning of records is produced by rwstats(1), rwtotal(1), and rwuniq(1).

To generate a list of IP addresses without the volume information, use rwset(1).