NAME

rwbagcat - Output a binary Bag file as text

SYNOPSIS

  rwbagcat [ --network-structure[=STRUCTURE] | --bin-ips[=SCALE]
             | --sort-counters[=ORDER]]
        [--print-statistics[=OUTFILE]]
        [--minkey=VALUE] [--maxkey=VALUE] [--mask-set=PATH]
        [--mincounter=VALUE] [--maxcounter=VALUE] [--zero-counts]
        [{ --pmap-file=PATH | --pmap-file=MAPNAME:PATH }]
        [--key-format=FORMAT] [--integer-keys] [--zero-pad-ips]
        [--no-columns] [--column-separator=C]
        [--no-final-delimiter] [{--delimited | --delimited=C}]
        [--output-path=PATH] [--pager=PAGER_PROG]
        [--site-config-file=FILENAME]
        [BAGFILE [BAGFILE...]]

  rwbagcat --help

  rwbagcat --version

DESCRIPTION

rwbagcat reads a binary Bag as created by rwbag(1) or rwbagbuild(1), converts it to text, and writes it to the standard output, to the pager, or to the specified output file. It can also print various statistics and summary information about the Bag.

As of SiLK 3.12.0, rwbagcat uses information in the Bag file's header to determine how to display the key column.

In addition, rwbagcat exits with an error when asked to use an IP format to display keys that are not IP addresses.

rwbagcat reads the BAGFILEs specified on the command line; if no BAGFILE arguments are given, rwbagcat attempts to read the Bag from the standard input. BAGFILE may be the keyword stdin or a hyphen (-) to allow rwbagcat to print data from both files and piped input. If any input does not contain a Bag, rwbagcat prints an error to the standard error and exits abnormally.

When multiple BAGFILEs are specified on the command line, each is handled individually. To process the files as a single Bag, use rwbagtool(1) to combine the bags and pipe the output of rwbagtool into rwbagcat.

OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
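
The abbreviation rule can be illustrated with a small sketch. This is not rwbagcat's option parser; resolve_switch is a hypothetical helper, and the switch list is copied from the SYNOPSIS. It shows why shortened forms such as --sort-counter and --delim are accepted while --min is ambiguous.

```python
# Switch names copied from the SYNOPSIS; resolve_switch is a hypothetical
# helper written for illustration, not part of SiLK.
SWITCHES = [
    "network-structure", "bin-ips", "sort-counters", "print-statistics",
    "minkey", "maxkey", "mask-set", "mincounter", "maxcounter",
    "zero-counts", "pmap-file", "key-format", "integer-keys",
    "zero-pad-ips", "no-columns", "column-separator",
    "no-final-delimiter", "delimited", "output-path", "pager",
    "site-config-file", "help", "version",
]


def resolve_switch(name, switches=SWITCHES):
    """Return the full switch name for an exact or unique-prefix match."""
    if name in switches:
        return name  # an exact match always wins
    matches = [s for s in switches if s.startswith(name)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown switch --{name}")
    raise ValueError(f"ambiguous switch --{name}: {', '.join(matches)}")
```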

--network-structure
--network-structure=STRUCTURE

For each numeric value in STRUCTURE, group the IPs in the Bag into a netblock of that size and print the number of hosts, the sum of the counters, and, optionally, the number of smaller, occupied netblocks that each larger netblock contains. When STRUCTURE begins with v6:, the IPs in the Bag are treated as IPv6 addresses, and any IPv4 addresses are mapped into the ::ffff:0:0/96 netblock. Otherwise, the IPs are treated as IPv4 addresses, and any IPv6 address outside the ::ffff:0:0/96 netblock is ignored. Aside from the initial v6: (or v4:, for consistency), STRUCTURE has one of the following forms:

  1. NETBLOCK_LIST/SUMMARY_LIST. Group IPs into the sizes specified in either NETBLOCK_LIST or SUMMARY_LIST. rwbagcat prints a row for each occupied netblock specified in NETBLOCK_LIST, where the row lists the base IP of the netblock, the sum of the counters for that netblock, the number of hosts, and the number of smaller, occupied netblocks having a size that appears in either NETBLOCK_LIST or SUMMARY_LIST. (The values in SUMMARY_LIST are only summarized; they are not printed.)

  2. NETBLOCK_LIST/. Similar to the first form, except all occupied netblocks are printed, and there are no netblocks that are only summarized.

  3. NETBLOCK_LISTS. When the character S appears anywhere in the NETBLOCK_LIST, rwbagcat provides a default value for the SUMMARY_LIST. That default is 8,16,24,27 for IPv4, and 48,64 for IPv6.

  4. NETBLOCK_LIST. When neither S nor / appear in STRUCTURE, the output does not include the number of smaller, occupied netblocks.

  5. Empty. When STRUCTURE is empty or only contains v6: or v4:, rwbagcat prints a single row for the total network (the /0 netblock) giving the number of hosts, the sum of the counters, and the number of smaller, occupied netblocks using the same default list specified in form 3.

NETBLOCK_LIST and SUMMARY_LIST contain a comma-separated list of numbers between 0 (the total network) and the size for an individual host (32 for IPv4 or 128 for IPv6). The characters T and H may be used as aliases for 0 and the host netblock, respectively. In addition, when parsing the lists as IPv4 netblocks, the characters A, B, C, and X are supported as aliases for 8, 16, 24, and 27, respectively. A comma is not required between adjacent letters. The --network-structure switch disables printing of the IPs in the Bag file; specify the H argument to the switch to print each individual IP address and its counter.

The --network-structure switch may not be combined with the --bin-ips or --sort-counters switches. As of SiLK 3.12.0, rwbagcat exits with an error if the --network-structure switch is used on a Bag file whose key-type is neither custom nor an IP address type.
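
The alias rules for NETBLOCK_LIST and SUMMARY_LIST can be sketched as follows. This is an illustration only, not the parser rwbagcat uses: expand_netblock_list is a hypothetical helper that handles a single list, and it does not cover the "/" forms or the leading v4: or v6: of a full STRUCTURE argument.

```python
import re

# Aliases named in the text; "S" is handled separately as the summary marker.
IPV4_ALIASES = {"T": 0, "A": 8, "B": 16, "C": 24, "X": 27, "H": 32}
IPV6_ALIASES = {"T": 0, "H": 128}


def expand_netblock_list(text, ipv6=False):
    """Return (sorted prefix lengths, summary_requested) for one list.

    Hypothetical helper: a single list only, no "/" or "v4:"/"v6:" handling.
    """
    aliases = IPV6_ALIASES if ipv6 else IPV4_ALIASES
    host_bits = 128 if ipv6 else 32
    sizes = set()
    summary = False
    # Tokens are runs of digits or single characters; commas only separate.
    for token in re.findall(r"[0-9]+|[^0-9,\s]", text):
        if token == "S":
            summary = True
        elif token in aliases:
            sizes.add(aliases[token])
        elif token.isdigit():
            size = int(token)
            if size > host_bits:
                raise ValueError(f"netblock size {size} exceeds {host_bits}")
            sizes.add(size)
        else:
            raise ValueError(f"unrecognized netblock alias {token!r}")
    return sorted(sizes), summary
```

With this sketch, the argument TABCHX used in the EXAMPLES section expands to the sizes 0, 8, 16, 24, 27, and 32, and ACS expands to 8 and 24 with the summary requested.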

--bin-ips
--bin-ips=SCALE

Invert the bag and count the total number of unique keys for a given value of the volume bin. For example, turn a Bag {sip:flow} into {flow:count(sip)}. SCALE is a string containing the value linear, binary, or decimal.

  • The default behavior is linear: Each distinct counter gets its own bin. Any counter in the input Bag file that is larger than the maximum possible key will be attributed to the maximum key; to prevent this, specify --maxcounter=4294967295, which discards bins whose counter value does not fit into a key.

  • binary creates a bag of {log2(flow):count(sip)}. Bin n contains counts in the range [ 2^n, 2^(n+1) ).

  • decimal creates one hundred bins for the counters in the range [1,100), one hundred bins for the counters in the range [100,1000), one hundred bins for the counters in the range [1000,10000), and so on. Counters are logarithmically distributed among the bins.

The --bin-ips switch may not be combined with the --network-structure or --sort-counters switches. See also the --invert switch on rwbagtool(1) which inverts a bag using a linear scale and creates a new binary bag file.
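
The linear and binary scales can be sketched as follows. This is an illustration of the binning described above, not rwbagcat's code; the counters are copied from the mybag.bag listing in the EXAMPLES section, and the function names are hypothetical. The decimal scale is omitted because the text does not fully specify its bin boundaries.

```python
from collections import Counter

# Counters copied from the mybag.bag listing in the EXAMPLES section.
COUNTERS = [5, 231, 9, 19, 1, 1, 15, 1, 5, 5]


def bin_linear(counters):
    """One bin per distinct counter value: {counter: number of keys}."""
    return dict(sorted(Counter(counters).items()))


def bin_binary(counters):
    """Bin n holds the counters in [2**n, 2**(n+1)), so n = floor(log2(c))."""
    # bit_length() - 1 is floor(log2(c)) for a positive integer c.
    bins = Counter(c.bit_length() - 1 for c in counters if c > 0)
    return dict(sorted(bins.items()))
```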

--sort-counters
--sort-counters=ORDER

Sort the output so the counters are presented in either decreasing or increasing order. Typically the output is sorted by the keys. If the ORDER argument is not given to the switch, the counters are printed in decreasing order. Valid values for ORDER are

decreasing

Print the maximum counter first. This is the default.

increasing

Print the minimum counter first.

When two counters have the same value, the smaller key is displayed first. The --sort-counters switch may not be combined with the --network-structure or --bin-ips switches. Since SiLK 3.12.2.

--print-statistics
--print-statistics=OUTFILE

Print a breakdown of the network hosts seen, and print general statistics about the keys and counters. When --print-statistics is specified, no other output is produced unless one of --sort-counters, --network-structure, or --bin-ips is also specified. When the OUTFILE argument is not given, the statistics are written to the standard output or to the pager if output is to a terminal. OUTFILE is a filename, named pipe, the keyword stderr to write to the standard error, or the keyword stdout or - to write to the standard output. If OUTFILE names an existing file, rwbagcat exits with an error unless the SILK_CLOBBER environment variable is set, in which case OUTFILE is overwritten. The output statistics produced by this switch are:

  • count of unique keys

  • sum of all the counters

  • minimum key

  • maximum key

  • minimum counter

  • maximum counter

  • mean of counters

  • variance of counters

  • standard deviation of counters

  • skew of counters

  • kurtosis of counters

  • count of nodes allocated

  • total bytes allocated for nodes

  • count of leaves allocated

  • total bytes allocated for leaves

  • density of the data

--minkey=VALUE

Output records whose key value is at least VALUE. VALUE may be an IP address or an integer in the range 0 to 4294967295 inclusive. The default is to print all records with a non-zero counter.

--maxkey=VALUE

Output records whose key value is not more than VALUE. VALUE may be an IP address or an integer in the range 0 to 4294967295 inclusive. The default is to print all records with a non-zero counter.

--mask-set=PATH

Output records whose key appears in the binary IPset read from the file PATH. (To build an IPset, use rwset(1) or rwsetbuild(1).) When used with --minkey and/or --maxkey, output records whose key is in the IPset and is also within the specified range. As of SiLK 3.12.0, rwbagcat exits with an error if the --mask-set switch is used on a Bag file whose key-type is neither custom nor an IP address type.

--mincounter=VALUE

Output records whose counter value is at least VALUE. VALUE is an integer in the range 1 to 18446744073709551615. The default is to print all records with a non-zero counter; use --zero-counts to show records whose counter is 0.

--maxcounter=VALUE

Output records whose counter value is not more than VALUE. VALUE is an integer in the range 1 to 18446744073709551615, with the default being the maximum counter value.

--zero-counts

Print keys whose counter is zero. Normally, keys with a counter of zero are suppressed since all keys have a default counter of zero. In order to use this flag, either --mask-set or both --minkey and --maxkey must be specified. When this switch is specified, any counter limit explicitly set by the --maxcounter switch is also applied.
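
How the key and counter limits combine can be sketched as follows. This is a simplified model with integer keys, not rwbagcat's implementation; both function names are hypothetical, the mask is modeled as a plain Python set, and the zero-count walk ignores --mincounter, which is an assumption the text does not confirm.

```python
def select_records(bag, minkey=None, maxkey=None,
                   mincounter=1, maxcounter=2**64 - 1, mask=None):
    """Yield (key, counter) pairs that pass every limit that was given."""
    for key in sorted(bag):
        counter = bag[key]
        if minkey is not None and key < minkey:
            continue
        if maxkey is not None and key > maxkey:
            continue
        if mask is not None and key not in mask:
            continue
        if mincounter <= counter <= maxcounter:
            yield key, counter


def select_with_zero_counts(bag, minkey, maxkey, maxcounter=2**64 - 1):
    """Walk every key in [minkey, maxkey], reporting absent keys as 0.

    Assumption: only the upper counter limit is applied here.
    """
    for key in range(minkey, maxkey + 1):
        counter = bag.get(key, 0)
        if counter <= maxcounter:
            yield key, counter
```

The second function shows why --zero-counts needs a bounded key range: without --minkey and --maxkey (or a mask), there is no finite set of absent keys to enumerate.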

--pmap-file=PATH
--pmap-file=MAPNAME:PATH

Use the prefix map file located at PATH to map the key to a string when the type of the Bag's key is one of sip-pmap, dip-pmap, any-ip-pmap, sport-pmap, dport-pmap, or any-port-pmap. This switch is required for Bag files whose key was derived from a prefix map file. The type of the prefix map file must match the key's type, but a different prefix map file may be used. Specify PATH as - or stdin to read from the standard input. A map-name may be included in the argument to the switch, but rwbagcat currently does not use the map-name. To create a prefix map file, use rwpmapbuild(1). Since SiLK 3.12.0.

--key-format=FORMAT

Specify the format to use when printing the keys. When this switch is not specified, the keys of a Bag whose keys are known not to be IP addresses are printed as decimal numbers, and the keys for all other Bags are printed as IP addresses in the canonical format. The FORMAT is one of:

canonical

Print keys as IP addresses in the canonical format: dotted quad for IPv4 (127.0.0.1) and hexadectet for IPv6 (2001:db8::1). Note that IPv6 addresses in ::ffff:0:0/96 and some IPv6 addresses in ::/96 will be printed as a mixture of IPv6 and IPv4. As of SiLK 3.12.0, rwbagcat exits with an error when this format is used on a Bag file whose key-type is neither custom nor an IP address type.

zero-padded

Print keys as IP addresses in their canonical form, but add zeros to the output so it fully fills the width of the column. The addresses 127.0.0.1 and 2001:db8::1 are printed as 127.000.000.001 and 2001:0db8:0000:0000:0000:0000:0000:0001, respectively. As of SiLK 3.12.0, rwbagcat exits with an error when this format is used on a Bag file whose key-type is neither custom nor an IP address type.

decimal

Print keys as integers in decimal format. The addresses 127.0.0.1 and 2001:db8::1 are printed as 2130706433 and 42540766411282592856903984951653826561, respectively.

hexadecimal

Print keys as integers in hexadecimal format. The addresses 127.0.0.1 and 2001:db8::1 are printed as 7f000001 and 20010db8000000000000000000000001, respectively.

force-ipv6

Print all keys as IP addresses in the canonical form for IPv6 without using any IPv4 notation. Any integer key or IPv4 address is mapped into the ::ffff:0:0/96 netblock. The addresses 127.0.0.1 and 2001:db8::1 are printed as ::ffff:7f00:1 and 2001:db8::1, respectively. As of SiLK 3.12.0, rwbagcat exits with an error when this format is used on a Bag file whose key-type is neither custom nor an IP address type.

timestamp

Print keys as time in standard SiLK format: yyyy/mm/ddThh:mm:ss. May be combined with utc or localtime. May only be used on keys whose type is custom or a time value. Since SiLK 3.12.0.

iso-time

Print keys as time in the ISO time format yyyy-mm-dd hh:mm:ss. May be combined with utc or localtime. May only be used on keys whose type is custom or a time value. Since SiLK 3.12.0.

m/d/y

Print keys as time in the format mm/dd/yyyy hh:mm:ss. May be combined with utc or localtime. May only be used on keys whose type is custom or a time value. Since SiLK 3.12.0.

utc

Print the keys as time in UTC. If no other time-related key-format is provided, formats the time using the timestamp format. May only be used on keys whose type is custom or a time value. Since SiLK 3.12.0.

localtime

Print the keys as time and get the timezone from either the TZ environment variable or the local machine. If no other time-related key-format is provided, formats the time using the timestamp format. May only be used on keys whose type is custom or a time value. Since SiLK 3.12.0.

epoch

Print keys as seconds since the UNIX epoch. May only be used on keys whose type is custom or a time value. Since SiLK 3.12.0.
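
The IP-related formats in the list above can be reproduced with Python's standard ipaddress module. This is a sketch for checking the sample values quoted in the text, not the formatting code rwbagcat uses; format_key is a hypothetical name, and force-ipv6 and the time formats are left out.

```python
import ipaddress


def format_key(address, key_format):
    """Render an IP address string in one of the IP-related key formats."""
    ip = ipaddress.ip_address(address)
    if key_format == "canonical":
        return str(ip)
    if key_format == "decimal":
        return str(int(ip))
    if key_format == "hexadecimal":
        return format(int(ip), "x")
    if key_format == "zero-padded":
        if ip.version == 4:
            # Pad each octet to three digits.
            return ".".join(f"{octet:03d}" for octet in ip.packed)
        # The exploded form pads each 16-bit group to four hex digits.
        return ip.exploded
    raise ValueError(f"format not covered by this sketch: {key_format}")
```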

--integer-keys

This switch is equivalent to --key-format=decimal. It is deprecated as of SiLK 3.7.0 and will be removed in the SiLK 4.0 release.

--zero-pad-ips

This switch is equivalent to --key-format=zero-padded. It is deprecated as of SiLK 3.7.0 and will be removed in the SiLK 4.0 release.

--no-columns

Disable fixed-width columnar output.

--column-separator=C

Use the specified character between columns and after the final column. When this switch is not specified, the default of '|' is used.

--no-final-delimiter

Do not print the column separator after the final column. Normally a delimiter is printed. When the network summary is requested (--network-structure=S), the separator is always printed before the summary column and never after that column.

--delimited
--delimited=C

Run as if --no-columns --no-final-delimiter --column-sep=C had been specified. That is, disable fixed-width columnar output; if character C is provided, it is used as the delimiter between columns instead of the default '|'.
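
The interaction of the column switches can be sketched as follows. This is a simplified model of one output row, not rwbagcat's output code; the function names are hypothetical, and the default column widths are assumptions chosen for illustration.

```python
def format_row(key, counter, key_width=15, counter_width=15,
               separator="|", columns=True, final_delimiter=True):
    """Format one key/counter row.

    The default widths are assumptions, not values taken from SiLK.
    """
    if columns:
        # Fixed-width output right-justifies each field in its column.
        fields = [str(key).rjust(key_width), str(counter).rjust(counter_width)]
    else:
        fields = [str(key), str(counter)]
    row = separator.join(fields)
    return row + separator if final_delimiter else row


def format_delimited(key, counter, separator="|"):
    """Model of --delimited: no columns and no final delimiter."""
    return format_row(key, counter, separator=separator,
                      columns=False, final_delimiter=False)
```

For example, format_delimited("172.23.1.1", 5, ",") yields the 172.23.1.1,5 form that appears in the --delim=, listings in the EXAMPLES section.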

--output-path=PATH

Write the textual output of the --network-structure, --bin-ips, or --sort-counters switch to PATH, where PATH is a filename, a named pipe, the keyword stderr to write the output to the standard error, or the keyword stdout or - to write the output to the standard output (and bypass the paging program). If PATH names an existing file, rwbagcat exits with an error unless the SILK_CLOBBER environment variable is set, in which case PATH is overwritten. If this option is not given, the output is either sent to the pager or written to the standard output.

--pager=PAGER_PROG

When output is to a terminal, invoke the program PAGER_PROG to view the output one screen full at a time. This switch overrides the SILK_PAGER environment variable, which in turn overrides the PAGER variable. If the --output-path switch is given or if the value of the pager is determined to be the empty string, no paging is performed and all output is written to the terminal.
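
The precedence among --pager, SILK_PAGER, and PAGER can be sketched as follows. This models only the rules stated above; choose_pager is a hypothetical helper, and returning no pager when nothing is set is an assumption, since the text does not say what rwbagcat does in that case.

```python
def choose_pager(pager_switch=None, environ=None, output_path=None,
                 is_terminal=True):
    """Return the pager program to run, or None when output is not paged."""
    environ = {} if environ is None else environ
    # No paging when --output-path is given or output is not a terminal.
    if output_path is not None or not is_terminal:
        return None
    if pager_switch is not None:
        program = pager_switch            # --pager overrides everything
    elif "SILK_PAGER" in environ:
        program = environ["SILK_PAGER"]   # overrides PAGER, even when empty
    else:
        program = environ.get("PAGER", "")
    # An empty string means "do not page".
    return program or None
```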

--site-config-file=FILENAME

Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, rwbagcat searches for the site configuration file in the locations specified in the "FILES" section. Since SiLK 3.15.0.

--help

Print the available options and exit.

--version

Print the version number and information about how SiLK was configured, then exit the application.

EXAMPLES

In the following examples, the dollar sign ($) represents the shell prompt. The text after the dollar sign represents the command line.

Printing a bag

To print the contents of the bag file mybag.bag:

 $ rwbagcat mybag.bag
      172.23.1.1|              5|
      172.23.1.2|            231|
      172.23.1.3|              9|
      172.23.1.4|             19|
   192.168.0.100|              1|
   192.168.0.101|              1|
   192.168.0.160|             15|
  192.168.20.161|              1|
  192.168.20.162|              5|
  192.168.20.163|              5|

Displaying number of hosts by network

To print the bag with a full network breakdown:

 $ rwbagcat --network-structure=TABCHX mybag.bag
           172.23.1.1      |              5|
           172.23.1.2      |            231|
           172.23.1.3      |              9|
           172.23.1.4      |             19|
         172.23.1.0/27     |            264|
       172.23.1.0/24       |            264|
     172.23.0.0/16         |            264|
   172.0.0.0/8             |            264|
           192.168.0.100   |              1|
           192.168.0.101   |              1|
         192.168.0.96/27   |              2|
           192.168.0.160   |             15|
         192.168.0.160/27  |             15|
       192.168.0.0/24      |             17|
           192.168.20.161  |              1|
           192.168.20.162  |              5|
           192.168.20.163  |              5|
         192.168.20.160/27 |             11|
       192.168.20.0/24     |             11|
     192.168.0.0/16        |             28|
   192.0.0.0/8             |             28|
 TOTAL                     |            292|

In the above, each line that includes a CIDR prefix displays the sum of the counters of the preceding hosts in that netblock. For example, the counters of the 4 hosts in the 172.23.1.0/27 netblock sum to 264.
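
The per-netblock totals in the listing above can be recomputed with a short sketch. This is an illustration using Python's ipaddress module, not rwbagcat's algorithm; netblock_summary is a hypothetical helper, and the data is copied from the mybag.bag listing.

```python
import ipaddress
from collections import defaultdict

# Keys and counters copied from the mybag.bag listing.
MYBAG = {
    "172.23.1.1": 5, "172.23.1.2": 231, "172.23.1.3": 9, "172.23.1.4": 19,
    "192.168.0.100": 1, "192.168.0.101": 1, "192.168.0.160": 15,
    "192.168.20.161": 1, "192.168.20.162": 5, "192.168.20.163": 5,
}


def netblock_summary(bag, prefix_length):
    """Map each occupied netblock to (host count, sum of counters)."""
    hosts = defaultdict(int)
    sums = defaultdict(int)
    for address, counter in bag.items():
        # strict=False masks off the host bits to get the enclosing block.
        block = ipaddress.ip_network(f"{address}/{prefix_length}", strict=False)
        hosts[block] += 1
        sums[block] += counter
    return {str(block): (hosts[block], sums[block]) for block in sorted(sums)}
```

Calling netblock_summary(MYBAG, 27) gives four occupied /27 netblocks, and netblock_summary(MYBAG, 0) gives the TOTAL row.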

To show an abbreviated network structure by class A and C only, including summary information:

 $ rwbagcat --network-structure=ACS mybag.bag
     172.23.1.0/24     |            264| 4 hosts in 1 /27
 172.0.0.0/8           |            264| 4 hosts in 1 /16, 1 /24, and 1 /27
     192.168.0.0/24    |             17| 3 hosts in 2 /27s
     192.168.20.0/24   |             11| 3 hosts in 1 /27
 192.0.0.0/8           |             28| 6 hosts in 1 /16, 2 /24s, and 3 /27s

Overriding the key type

Suppose a key-type of a bag file is duration:

 $ rwfileinfo --field=bag Bag2.bag
 Bag2.bag:
   bag          key: duration @ 4 octets; counter: custom @ 8 octets

rwbagcat complains when the --key-format switch lists a format that it thinks is "nonsensical" for that type of key.

 $ rwbagcat --key-format=utc Bag2.bag
 rwbagcat: Invalid key-format 'utc':
        Nonsensical for Bag containing duration keys

 $ rwbagcat --key-format=canonical Bag2.bag
 rwbagcat: Invalid key-format 'canonical':
        Nonsensical for Bag containing duration keys

To use the --key-format one time and leave the key-type in the Bag file unchanged, you may merge the bag with an empty bag file: Use rwbagbuild(1) to create an empty bag that uses the custom key type, add the empty bag to Bag2.bag using rwbagtool(1), then display the result:

 $ rwbagbuild --bag-input=/dev/null   \
   | rwbagtool --add Bag2.bag stdin   \
   | rwbagcat --key-format=utc
 1970/01/01T00:00:01|                   1|
 1970/01/01T00:00:04|                   2|
 1970/01/01T00:00:07|                  32|
 1970/01/01T00:00:08|                   2|

 $ rwbagbuild --bag-input=/dev/null   \
   | rwbagtool --add Bag2.bag -       \
   | rwbagcat --key-format=canonical
         0.0.0.1|                   1|
         0.0.0.4|                   2|
         0.0.0.7|                  32|
         0.0.0.8|                   2|
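
The two listings above show the same integer keys rendered two ways. A minimal sketch of those interpretations follows; it is not rwbagcat's code, the function names are hypothetical, and the strftime patterns are this sketch's reading of the timestamp, iso-time, and m/d/y formats described under --key-format.

```python
import ipaddress
from datetime import datetime, timezone

# strftime patterns assumed to correspond to the documented time formats.
TIME_FORMATS = {
    "timestamp": "%Y/%m/%dT%H:%M:%S",
    "iso-time": "%Y-%m-%d %H:%M:%S",
    "m/d/y": "%m/%d/%Y %H:%M:%S",
}


def key_as_utc(key, style="timestamp"):
    """Treat an integer key as seconds since the UNIX epoch, in UTC."""
    moment = datetime.fromtimestamp(key, tz=timezone.utc)
    return moment.strftime(TIME_FORMATS[style])


def key_as_ipv4(key):
    """Treat an integer key as a 32-bit IPv4 address."""
    return str(ipaddress.IPv4Address(key))
```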

To rewrite the bag file with a different key type, print the bag file as text and use rwbagbuild to build a new bag file:

 $ rwbagcat Bag2.bag    \
   | rwbagbuild --bag-input=- --key-type=sipv4

Inverting a bag

Inverting a bag means counting the number of times each counter appears in the bag.

To bin the number of IP addresses that had each flow count:

 $ rwbagcat --bin-ips mybag.bag
               1|              3|
               5|              3|
               9|              1|
              15|              1|
              19|              1|
             231|              1|

The output shows that the bag contains 3 hosts that had a single flow, 3 hosts that had 5 flows, and 4 hosts that each had a unique flow count (9, 15, 19, and 231).

For a log2 breakdown of the counts:

 $ rwbagcat --bin-ips=binary mybag.bag
    2^0 to 2^1-1|              3|
    2^2 to 2^3-1|              3|
    2^3 to 2^4-1|              2|
    2^4 to 2^5-1|              1|
    2^7 to 2^8-1|              1|

Sorting the bag by counter value

rwbagcat normally presents the data in order of increasing key value. To sort based on the counter value, specify the --sort-counter switch. When sorting by the counter value, the default order is from maximum counter to minimum counter.

 $ rwbagcat --sort-counter mybag.bag
      172.23.1.2|                 231|
      172.23.1.4|                  19|
   192.168.0.160|                  15|
      172.23.1.3|                   9|
      172.23.1.1|                   5|
  192.168.20.162|                   5|
  192.168.20.163|                   5|
   192.168.0.100|                   1|
   192.168.0.101|                   1|
  192.168.20.161|                   1|

To change the sort order, specify the increasing argument to the --sort-counter switch:

 $ rwbagcat --sort-counter=increasing mybag.bag
   192.168.0.100|                   1|
   192.168.0.101|                   1|
  192.168.20.161|                   1|
      172.23.1.1|                   5|
  192.168.20.162|                   5|
  192.168.20.163|                   5|
      172.23.1.3|                   9|
   192.168.0.160|                  15|
      172.23.1.4|                  19|
      172.23.1.2|                 231|

For keys that have the same counter value, the order of the keys is consistent (always from low to high) regardless of how the counters are sorted. The following output is limited to those keys whose counter is 5. The output is first shown without the --sort-counter switch, then with the data sorted by increasing and decreasing counter value.

 $ rwbagcat --delim=, mybag.bag | grep ,5
 172.23.1.1,5
 192.168.20.162,5
 192.168.20.163,5

 $ rwbagcat --delim=, --sort-counter=increasing mybag.bag | grep ,5
 172.23.1.1,5
 192.168.20.162,5
 192.168.20.163,5

 $ rwbagcat --delim=, --sort-counter=decreasing mybag.bag | grep ,5
 172.23.1.1,5
 192.168.20.162,5
 192.168.20.163,5
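
The ordering rule can be expressed as a two-part sort key: the counter in the requested direction, then the key in ascending order. The sketch below illustrates that rule and is not rwbagcat's sorting code; sort_by_counter is a hypothetical name, and the data is copied from the mybag.bag listing.

```python
import ipaddress

# Keys and counters copied from the mybag.bag listing.
MYBAG = {
    "172.23.1.1": 5, "172.23.1.2": 231, "172.23.1.3": 9, "172.23.1.4": 19,
    "192.168.0.100": 1, "192.168.0.101": 1, "192.168.0.160": 15,
    "192.168.20.161": 1, "192.168.20.162": 5, "192.168.20.163": 5,
}


def sort_by_counter(bag, order="decreasing"):
    """Sort by counter; equal counters always list the smaller key first."""
    if order not in ("decreasing", "increasing"):
        raise ValueError(f"invalid order: {order}")
    sign = -1 if order == "decreasing" else 1
    return sorted(
        bag.items(),
        # Compare keys numerically as addresses, not as strings.
        key=lambda item: (sign * item[1], ipaddress.ip_address(item[0])),
    )
```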

Displaying bags that use prefix map values as the key

rwbag(1) and rwbagbuild(1) can use a prefix map file as the key in a bag file as of SiLK 3.12.0. When attempting to display these Bag files, you must specify the --pmap-file switch on the rwbagcat command line for it to map each prefix map value to its label. If the --pmap-file switch is not given, rwbagcat displays an error.

 $ rwbagcat service.bag
 rwbagcat: The --pmap-file switch is required for \
         Bags containing sport-pmap keys

In addition, the type of the prefix map file must match the key-type in the bag file: a prefix map type of IPv4-address or IPv6-address when the key was mapped from an IP address, and a prefix map type of proto-port when the key was mapped from a protocol-port pair. The type of key in a bag may be determined by rwfileinfo(1).

 $ rwfileinfo --fields=bag service.bag
 service.bag:
   bag          key: sport-pmap @ 4 octets; counter: custom @ 8 octets

 $ rwbagcat --pmap-file=ip-map.pmap service.bag
 rwbagcat: Cannot use IPv4-address prefix map for \
        Bag containing sport-pmap keys

 $ rwbagcat --pmap-file=port-map.pmap service.bag
   TCP/SSH|                   1|
  TCP/SMTP|                 800|
  TCP/HTTP|                5642|

The only check rwbagcat makes is whether the prefix map file is the correct type. A different prefix map file may be used. If a value in the bag file has no corresponding label in the prefix map file, the numeric index is displayed instead, as shown in the following example, which creates a prefix map with a single label.

 $ echo 'label 1 none'                                      \
   | rwpmapbuild --mode=proto-port --input-file=-           \
        --output-file=tmp.pmap
 $ rwbagcat --pmap-file=tmp.pmap service.bag
   7|                   1|
   8|                 800|
   9|                5642|
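
The fallback shown above amounts to a lookup with a default. The sketch below is an illustration, not how rwbagcat reads prefix maps; display_key is a hypothetical helper, and the index-to-label pairs are inferred by matching the counters in the two listings above, which is an assumption.

```python
# Index-to-label pairs inferred from the two listings (an assumption).
PORT_MAP_LABELS = {7: "TCP/SSH", 8: "TCP/SMTP", 9: "TCP/HTTP"}

# The single-label map built by the rwpmapbuild command above.
TMP_MAP_LABELS = {1: "none"}


def display_key(value, labels):
    """Return the label for a value, or the numeric index if none exists."""
    return labels.get(value, str(value))
```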

Displaying statistics

 $ rwbagcat --print-statistics mybag.bag

 Statistics
     number of keys:  10
    sum of counters:  292
        minimum key:  172.23.1.1
        maximum key:  192.168.20.163
    minimum counter:  1
    maximum counter:  231
               mean:  29.2
           variance:  5064
 standard deviation:  71.16
               skew:  2.246
           kurtosis:  8.1
    nodes allocated:  0 (0 bytes)
    counter density:  inf%
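
Several of the figures above can be checked by hand. The sketch below uses Python's statistics module on the counters from the mybag.bag listing; it is not rwbagcat's code, and summarize is a hypothetical name. For this data the sample (n-1) variance agrees with the listed value. Skew and kurtosis are omitted because the text does not give their formulas.

```python
import statistics

# Counters copied from the mybag.bag listing.
COUNTERS = [5, 231, 9, 19, 1, 1, 15, 1, 5, 5]


def summarize(counters):
    """Compute the basic counter statistics for a list of counters."""
    return {
        "number of keys": len(counters),
        "sum of counters": sum(counters),
        "minimum counter": min(counters),
        "maximum counter": max(counters),
        "mean": statistics.mean(counters),
        # statistics.variance and stdev use the sample (n-1) definition.
        "variance": statistics.variance(counters),
        "standard deviation": statistics.stdev(counters),
    }
```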

ENVIRONMENT

SILK_CLOBBER

The SiLK tools normally refuse to overwrite existing files. Setting SILK_CLOBBER to a non-empty value removes this restriction.

SILK_PAGER

When set to a non-empty string, rwbagcat automatically invokes this program to display its output a screen at a time. If set to an empty string, rwbagcat does not automatically page its output.

PAGER

When set and SILK_PAGER is not set, rwbagcat automatically invokes this program to display its output a screen at a time.

SILK_CONFIG_FILE

This environment variable is used as the value for the --site-config-file switch when that switch is not provided.

SILK_DATA_ROOTDIR

This environment variable specifies the root directory of the data repository. As described in the "FILES" section, rwbagcat may use this environment variable when searching for the SiLK site configuration file.

SILK_PATH

This environment variable gives the root of the install tree. When searching for configuration files, rwbagcat may use this environment variable. See the "FILES" section for details.

TZ

When the argument to the --key-format switch includes localtime or when a SiLK installation is built to use the local timezone, the value of the TZ environment variable determines the timezone in which rwbagcat displays timestamps. (If both of those are false, the TZ environment variable is ignored.) If the TZ environment variable is not set, the machine's default timezone is used. Setting TZ to the empty string or 0 causes timestamps to be displayed in UTC. For system information on the TZ variable, see tzset(3) or environ(7). (To determine if SiLK was built with support for the local timezone, check the Timezone support value in the output of rwbagcat --version.)

FILES

${SILK_CONFIG_FILE}
${SILK_DATA_ROOTDIR}/silk.conf
/data/silk.conf
${SILK_PATH}/share/silk/silk.conf
${SILK_PATH}/share/silk.conf
/usr/share/silk/silk.conf
/usr/share/silk.conf

Possible locations for the SiLK site configuration file which are checked when the --site-config-file switch is not provided.
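
The list above can be turned into a candidate list for a given environment. The sketch below is an illustration only: site_config_candidates is a hypothetical helper, it assumes the locations are tried in the order listed, and it skips entries whose environment variable is unset. Neither assumption is confirmed by the text.

```python
import posixpath


def site_config_candidates(environ, site_config_file=None):
    """List the silk.conf locations for an environment.

    Assumption: locations are tried in the order the FILES section lists.
    """
    if site_config_file:
        # --site-config-file names the file directly; nothing is searched.
        return [site_config_file]
    candidates = []
    config_file = environ.get("SILK_CONFIG_FILE")
    if config_file:
        candidates.append(config_file)
    data_root = environ.get("SILK_DATA_ROOTDIR")
    if data_root:
        candidates.append(posixpath.join(data_root, "silk.conf"))
    candidates.append("/data/silk.conf")
    silk_path = environ.get("SILK_PATH")
    if silk_path:
        candidates.append(posixpath.join(silk_path, "share/silk/silk.conf"))
        candidates.append(posixpath.join(silk_path, "share/silk.conf"))
    candidates.append("/usr/share/silk/silk.conf")
    candidates.append("/usr/share/silk.conf")
    return candidates
```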

SEE ALSO

rwbag(1), rwbagbuild(1), rwbagtool(1), rwpmapbuild(1), rwfileinfo(1), rwset(1), rwsetbuild(1), silk(7), tzset(3), environ(7)