Use of the SiLK system and related source code is subject to the terms of the following licenses:
SiLK, the System for Internet-Level Knowledge, is a collection of traffic analysis tools developed by the CERT Network Situational Awareness Team (CERT NetSA) to facilitate security analysis of large networks. The SiLK tool suite supports the efficient collection, storage, and analysis of network flow data, enabling network security analysts to rapidly query large historical traffic data sets. SiLK is ideally suited for analyzing traffic on the backbone or border of a large, distributed enterprise or mid-sized ISP.
SiLK supports the collection of the following types of flow data:
This handbook provides instructions to configure and install the SiLK Collection and Analysis Suite. It is intended for individuals comfortable with the following tasks:
Additionally, if SiLK will be accepting NetFlow data from a router, the installer should be comfortable with router configuration.
In order to build SiLK, you will need to have:
To get the full functionality of SiLK, these additional libraries and their header files are recommended:
Note that many Linux systems have one package for the run-time shared libraries and another for the header files, and both must be installed when building SiLK from source. For example, to build SiLK with zlib support on a Red Hat Enterprise Linux AS release 4 system, you will need to install both the zlib-1.2.1.2-1.2 and the zlib-devel-1.2.1.2-1.2 RPMs (your version numbers may be different).
When building on a Linux system, the following packages are recommended:
New releases of SiLK are always capable of reading SiLK Flow data files created by previous releases of SiLK, and support for nearly all other SiLK file formats is maintained in newer releases. When upgrading to a new release of SiLK in an enterprise that uses separate collection, packing, and analysis machines, you should upgrade the analysis host(s) first, then the packing host(s), and finally the collectors. You may also choose to only upgrade the analysis hosts, and leave the packing and collection hosts at previous releases.
In addition, note that any change to the SiLK file formats will only occur when a change is made to the major or minor version numbers of SiLK (the SiLK version number follows the pattern major.minor.revision). Practically, this means that you can upgrade a collection machine to a newer release, say SiLK-0.13.9, and yet maintain the packing machines at an older release, SiLK-0.13.2. (These version numbers are for illustrative purposes only.) However, a bump in the minor version number does not always signal a change to the SiLK file formats. An analysis host at SiLK-0.13.2 may be able to read files created by SiLK-0.14.1 on the packing host; it depends on whether the SiLK file formats changed at SiLK-0.14.0. Changes to the SiLK file formats are always documented in the release notes, which are included in the source distribution and are available on the web site (http://tools.netsa.cert.org/silk/).
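The version rule above can be turned into a quick check. The following is a minimal sketch, not part of SiLK: the helper name same_format_series is hypothetical, and it compares only the major.minor part of two version strings. When the series match, the file formats cannot differ; when they do not match, the release notes still have to be consulted, because a minor-version bump does not always change the formats.

```shell
# Hypothetical helper, not part of SiLK: succeed when two SiLK versions
# share the same major.minor series (formats cannot differ within a series).
same_format_series() {
    a=${1%.*}    # drop the revision: 0.13.9 -> 0.13
    b=${2%.*}
    [ "$a" = "$b" ]
}

# The illustrative version numbers used in the text above.
if same_format_series 0.13.9 0.13.2; then
    echo "same series: file formats are identical"
fi
if ! same_format_series 0.13.2 0.14.1; then
    echo "different series: check the release notes for format changes"
fi
```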
There are two categories of applications that comprise a SiLK installation:
Installation of the analysis tools is relatively straightforward since they are installed on systems that have direct access to the SiLK data files and require little configuration.
Installing the packing tools is more complex: the tools run as background processes (with every operating system having a unique way to start these processes) that must cooperate with each other and with additional software and/or network devices. The packing tools are designed to provide a great amount of flexibility in their installation, and with this flexibility comes additional complexity. The tools that make up the SiLK packing system are:
There are several possible configurations of the SiLK system which are introduced in this chapter. The detailed installation instructions are presented in subsequent chapters. In the subsections that follow, the term “remote” is with respect to the machine where rwflowpack is running.
In the single machine (all-in-one) configuration, all processing occurs on a single machine: You configure the rwflowpack program to collect flows, convert them to the SiLK Flow format, categorize them, and store the SiLK Flow records to the local disk. The analysis tools are installed on this same machine and read the files from local disk. Figure 1.1 shows how this configuration would look when flows are collected from a NetFlow router, and Figure 1.2 shows this configuration when the YAF flow collector is used.
This is the simplest complete installation. To use it, follow the instructions in Section 2 to configure and build the source code, Section 3 to customize the analysis tools, and Section 4 to configure rwflowpack.
It is not uncommon to have a situation in which the sensor(s) generating the flow records are not close to the data storage location. You could configure the flow generators to send the data to the data storage location; however, due to network reliability and bandwidth issues, it is desirable to collect flow data as close to where it is produced as possible. (This is especially true if the flow generator uses an unreliable transport protocol, such as UDP-based NetFlow generated by a router.) In these situations, the flowcap daemon can be installed on a machine close to the sensor where it will collect, compress, and forward the data to rwflowpack for packing.
Also, suppose the machine where rwflowpack is running is not the same machine on which you are storing the SiLK Flow files, or perhaps you want the SiLK files to be available on multiple machines for use by groups of analysts. In such cases, you configure rwflowpack to write the SiLK Flows into small files called incremental files, and these incremental files are distributed over the network to machine(s) where the rwflowappend daemon writes the SiLK Flow records to their final location. The analysis tools read the records from this final location.
This configuration is the most complex; Figure 1.3 illustrates it collecting NetFlow. When the YAF flow collector is used, the top third of the drawing would resemble Figure 1.4.
In this configuration, the rwsender and rwreceiver daemons transfer files between the machines. rwsender monitors a directory and transfers the files it finds there to one or more rwreceivers on the downstream side. rwreceiver accepts files from one or more rwsenders and places the files into a directory where the next tool in the packing chain can process them.
rwsender and rwreceiver only transfer files; they do not consider the contents of the files. Instead of using rwsender and rwreceiver, you could (with some stipulations) use other software, such as rsync or scp, to transfer the files between the machines.
If this describes your installation, follow the instructions in Section 2 to install SiLK on each machine, in Section 3 to customize the analysis tools on each machine where analysis occurs, and in Section 5 to configure the daemons on all the machines where the packing tools run.
This configuration is a subset of the previous one: flowcap is used to capture the flows near the point where they are generated, and the rwsender and rwreceiver daemons transfer the flows to the machine where rwflowpack packs them and the analysis tools process them. Figure 1.5 depicts this configuration with a NetFlow router. When a YAF sensor is used, the top half of the figure would be replaced with Figure 1.4.
This installation will largely follow the same instructions as those described previously; however, the configuration of rwflowpack is slightly different as described in Section 6. That section will refer you to the parts of Section 5 you must follow to configure flowcap. You will use Section 3 to configure the analysis tools on the machine where rwflowpack is installed.
This configuration, shown in Figure 1.6, is also a subset of that described in Section 1.3.2, except that rwflowpack is used to collect the flows instead of flowcap.
For this configuration, you will install the source code on the packing machine and the analysis machine (Section 2), customize the analysis tools on the machine where rwflowappend is to run (Section 3), and configure rwflowpack and rwflowappend (Section 7).
Finally, if you only plan to use the software to analyze existing SiLK Flow files and/or packet capture (pcap) data such as that created by tcpdump, you would use this configuration (Figure 1.7). For this configuration, you need to build the source code (Section 2) and customize the analysis tools (Section 3).
The instructions in the next two sections of this handbook will allow you to use SiLK to analyze existing SiLK files and analyze packet capture (pcap) data such as that created by tcpdump: Section 2 describes how to configure and install the SiLK software from source, and Section 3 describes how to customize the analysis tools to get the most use from the system.
The other sections of the handbook describe how to use SiLK to capture flow data, categorize the flows as incoming or outgoing, convert the data to the SiLK format, and store the SiLK Flows in binary flat files indexed by hour, sensor, and direction: The simplest configuration is the Single machine configuration (Section 4), where one machine collects the flow records, packs them, and stores them locally for use by the analysis tools. Having collection, categorization, and storage on separate machines is the most complex configuration (Section 5), and other configurations are possible (Sections 6 and 7).
Section 8 describes how to configure the flow generator to send its data to the SiLK collector(s).
To assist you in the configuration process, Appendix A describes how SiLK categorizes flows as incoming or outgoing (including a description of the data storage hierarchy), and Appendix B provides instructions on how to collect NetFlow data from the router and use that data as part of the configuration.
This handbook describes the installation of SiLK. For a discussion of the analysis tools, see their individual manual pages, the complete set of manual pages in The SiLK Reference Guide, and the tutorial information in Using SiLK for Network Traffic Analysis: Analysts’ Handbook. These documents are available at http://tools.netsa.cert.org/silk/docs.html.
In this section you will
(You may need to become the root user to install the software.)
You may continue to Section 3.
Download and unpack the source code distribution:
For the remainder of these instructions, the full path to the top of the source tree (i.e., the silk-2.5.0 directory, which contains the configure file) will be referred to as $SUITEROOT; it may be set in your (Bourne-compatible) shell by entering the command:
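As a minimal sketch, assuming the distribution file is named silk-2.5.0.tar.gz and sits in the current directory, unpacking the source and setting the variable might look like the following. Adjust the file and directory names to the release you actually downloaded.

```shell
# Unpack the distribution if it is present in the current directory
# (assumed file name; match it to the release you downloaded).
if [ -f silk-2.5.0.tar.gz ]; then
    gzip -d -c silk-2.5.0.tar.gz | tar xf -
fi

# Record the top of the source tree for the rest of the instructions.
SUITEROOT=`pwd`/silk-2.5.0
export SUITEROOT
```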
You should decide where to install the tools and where your SiLK Flow data files will reside, and specify this information to the configure script. Some of these locations are compiled into the code, and others are used to initialize the start-up scripts and configuration files for rwflowpack and the other packing tool daemons.
This value will be compiled into the analysis tools, and it will be the default location that rwfilter uses when looking for the hourly data files. This directory must be accessible by the final program in the packing chain (typically rwflowpack) which writes the packed SiLK flow files and by the analysis machine(s) which reads them. The path to the directory tree can be different on the analysis and packing machines, as long as the actual physical location is the same.
When running the tools, the value of the SILK_DATA_ROOTDIR environment variable will override this compiled-in value. In addition, rwfilter allows you to override this value with the --data-rootdir switch.
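The two override mechanisms can be sketched as follows. The path /mnt/silk-archive, the date, and the protocol filter are placeholders chosen for illustration, and the rwfilter calls are skipped when the tool is not installed.

```shell
# Placeholder path: a repository root other than the compiled-in default.
SILK_DATA_ROOTDIR=/mnt/silk-archive
export SILK_DATA_ROOTDIR

if command -v rwfilter >/dev/null 2>&1; then
    # With the variable exported, the tools use its value instead of the
    # compiled-in location.
    rwfilter --start-date=2012/06/01 --protocol=6 --print-statistics || true
    # rwfilter can also be told the location for a single invocation.
    rwfilter --data-rootdir=/mnt/silk-archive \
        --start-date=2012/06/01 --protocol=6 --print-statistics || true
fi
```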
For historical reasons, the default value for this location is /data. We use a separate disk for the SiLK flow data since the space it requires can be large and depends on the size of the monitored network, the amount of traffic the network sees, and the aging policy for historical data.
The following table shows the subdirectories of $SILK_PATH where files are normally installed, but you can change these by specifying switches to configure. Use configure’s --help switch to see the full list of directory choices.
| bin | analysis tools, such as rwfilter |
| sbin | system administrator tools, for example rwflowpack |
| share/man | manual pages |
| lib/silk | optional plug-in support, such as PySiLK support |
| share/silk | support files, such as the country-code mapping file |
| share/silk/etc | sample configuration files and scripts to assist the system administrator in running the packing system daemons |
| etc | configuration files used by the packing system daemons (see SCRIPT_CONFIG_LOCATION below) |
| var | directory root used by packing tools (see DAEMON_STATE_DIRECTORY below) |
| var/log | log files generated by the packing system daemons |
| var/lib | incomplete data files generated by the packing tools and files awaiting processing |
| lib | libraries required to run the tools and used to build end-user plug-ins |
| include/silk | header files used to build end-user plug-ins |
Note: The applications work best when they have access to configuration files and plug-ins, and the code that searches for these files depends on the directory tree as it will be upon installation. If you do not plan to use the tools outside of your own tree, you may want to specify --prefix=`pwd` (note the back quotes) to the configure script. When you run make install, the tools will be installed into the top of the source tree.
To adapt the source code to your operating system and environment, the configure shell script will run several tests to check for various features. By giving command line switches to configure, you can include additional features or instruct configure to use libraries from particular locations. You can also control where SiLK will be installed. You can display the full list of switches that configure accepts by running configure --help. The remainder of this section describes many of these switches.
SiLK provides support for accessing SiLK flow records from within Python and for using Python code as part of an rwfilter invocation. You may also use Python code to create arbitrary fields to use in rwcut, rwgroup, rwsort, rwstats, and rwuniq. This support is called PySiLK and it requires Python 2.x, where x is 4 or greater. Currently Python 3.x is not supported. For information on using PySiLK, see SiLK in Python, available from http://tools.netsa.cert.org/silk/docs.html. You may also consult the manual pages for pysilk, silkpython, and the various applications.
To include PySiLK support, you must provide the --with-python switch to configure. To use a particular Python interpreter, you may use --with-python=path .
By default, the PySiLK modules will be installed into Python’s standard location for third-party modules. (Writing to this location usually requires that you are a system administrator.) To install the modules in the SiLK installation tree ($SILK_PATH), specify --with-python-prefix when running configure. You may also use --with-python-prefix=path to specify a different install prefix, or --with-python-site-dir=path to specify an explicit directory.
If the PySiLK module is installed outside of Python’s standard search locations, you will need to set or modify the PYTHONPATH environment variable to allow Python to find the PySiLK module.
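A sketch of the PySiLK choices described above, assuming the modules are to be installed inside the SiLK tree. The module directory shown is hypothetical; the real one depends on your install prefix and Python version, so check where make install actually placed the silk module.

```shell
# Ask configure (when run from the source tree) to build PySiLK and to
# install its modules under the SiLK prefix rather than Python's own tree.
if [ -x ./configure ]; then
    ./configure --with-python --with-python-prefix
fi

# Hypothetical module directory; substitute the directory that holds the
# installed silk module. Prepend it so Python can find PySiLK.
PYSILK_DIR=/usr/local/lib/python2.6/site-packages
PYTHONPATH=$PYSILK_DIR${PYTHONPATH:+:$PYTHONPATH}
export PYTHONPATH
```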
Some SiLK applications have been modified to support handling IPv6 addresses. To enable this behavior, specify the --enable-ipv6 switch on the configure command line. Currently, SiLK supports collecting IPv6 data from IPFIX and NetFlow v9 flow generators, which requires that you build and install libfixbuf (see 2.3.5) before installing SiLK.
To reduce the size of the data files, the rwflowpack daemon and many analysis tools have the ability to use an external library to automatically compress their binary output when writing and uncompress their input when reading. (This compression occurs on the ‘data’ section of the file; the file’s header remains uncompressed.) You can specify whether a particular tool uses this external compression via a switch on the tool’s command line. The default setting for this behavior is determined by the --enable-output-compression=type switch to configure. SiLK supports the following parameters to the switch:
| none | use no compression; this is the default |
| zlib | use the widely available zlib general compression library |
| lzo1x | use the LZO real-time data compression library |
The latter two options require the support of external libraries as described in the next section.
If you specify --enable-output-compression with no type, the compression will default to the first available method of lzo1x, zlib, or none.
The configure script will attempt to find the zlib general compression library and its header file. Specifying the --with-zlib=dir switch tells configure that the header and library are located in dir/include/zlib.h and dir/lib/libz.a, respectively.
Note: Several operating system vendors distribute the libraries and header files in separate packages. To take zlib on RedHat as an example, the zlib package contains the zlib library, and the header file (and manual page) is in the separate zlib-devel package. In order to build SiLK from source, you need to have both packages installed.
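Putting the compression default and the zlib location together, a configure invocation might be sketched as below. The directory /opt/zlib is a placeholder for a non-standard zlib install; drop the --with-zlib switch when zlib and its header are in a standard location.

```shell
# Make zlib the default output compression for tools built from this tree.
# /opt/zlib is a placeholder; configure would then expect
# /opt/zlib/include/zlib.h and /opt/zlib/lib/libz.a.
CONFIGURE_ARGS="--enable-output-compression=zlib --with-zlib=/opt/zlib"

if [ -x ./configure ]; then
    ./configure $CONFIGURE_ARGS
fi
```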
The configure script will also attempt to find the LZO (http://www.oberhumer.com/opensource/lzo/) real-time data compression library and headers. SiLK will work with LZO 1.08, LZO 2.02, or LZO 2.03. You may use the --with-lzo=dir switch to specify the location of LZO.
When SiLK is compiled with libfixbuf support, the SiLK packer can read NetFlow v9 flow records or flow data generated by an IPFIX (Internet Protocol Flow Information eXport) compliant flow generator such as the YAF flow sensor technology (http://tools.netsa.cert.org/yaf/).
libfixbuf is a separate library; it does not come as part of SiLK. You must download it from http://tools.netsa.cert.org/fixbuf/ and install it prior to installing SiLK. For IPFIX support, SiLK requires libfixbuf-0.7.3 or later. For NetFlow v9 support, SiLK requires libfixbuf-1.1.0 or later.
When the SiLK packer reads IPFIX records, the SiLK data files contain additional information: the TCP flags are broken into two fields, one containing the flags on the first packet of a flow and the other containing the flags on all other packets in the flow.
If configure finds libfixbuf, the rwipfix2silk and rwsilk2ipfix command line tools will also be built. These tools support converting between the SiLK Flow record format and IPFIX.
The configure script will look for the pkg-config(1) specification file for libfixbuf (libfixbuf.pc) in the standard pkg-config directories, and if libfixbuf is installed in a standard location, configure should be able to locate it. If you have installed libfixbuf but configure does not find it, you can run configure with the --with-libfixbuf=dir switch to add the directory dir to pkg-config’s search path (configure will add dir to the PKG_CONFIG_PATH environment variable). The libfixbuf.pc file is normally installed in the lib/pkgconfig subdirectory of the location where libfixbuf was installed.
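Before running configure, the same lookup can be tried by hand. This sketch assumes libfixbuf was installed under the hypothetical prefix /opt/fixbuf; it checks the installed version against the two minimums given earlier and then passes the directory holding libfixbuf.pc to configure.

```shell
# Hypothetical location of libfixbuf.pc (lib/pkgconfig under the prefix
# where libfixbuf was installed).
FIXBUF_PC_DIR=/opt/fixbuf/lib/pkgconfig
PKG_CONFIG_PATH=$FIXBUF_PC_DIR${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}
export PKG_CONFIG_PATH

if command -v pkg-config >/dev/null 2>&1; then
    if pkg-config --atleast-version=1.1.0 libfixbuf; then
        echo "libfixbuf is new enough for IPFIX and NetFlow v9"
    elif pkg-config --atleast-version=0.7.3 libfixbuf; then
        echo "libfixbuf is new enough for IPFIX only"
    else
        echo "libfixbuf not found or too old"
    fi
fi

# Hand the same directory to configure when it exists.
if [ -x ./configure ] && [ -d "$FIXBUF_PC_DIR" ]; then
    ./configure --with-libfixbuf="$FIXBUF_PC_DIR"
fi
```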
You may ignore this section if you are not collecting NetFlow v9 flow records from a Cisco ASA router.
As noted in the previous section, SiLK can store NetFlow v9 flow records when it is configured with support for libfixbuf 1.1.0 or later. NetFlow v9 is template based, and the default NetFlow v9 template used by the Cisco ASA router lacks a packetTotalCount information element. (Cisco considers this missing element a low-priority bug.)
When rwflowpack or flowcap processes a NetFlow record lacking the packetTotalCount field, the application treats the record as having a packet count of zero.
The default file formats that rwflowpack uses to store IPv4 flow records in the data repository require the packet count to be non-zero (the formats store packets and a bytes-per-packet ratio; the formats do not store bytes). An attempt to store an IPv4 record with a packet count of zero in the repository causes rwflowpack to drop the flow record and to print the message “Record’s packet count is zero while writing to file.”
Other SiLK file formats allow a packet count of zero, including the file format that rwflowpack uses to store IPv6 flow records. However, the analysis tools in SiLK expect the packet count to be non-zero, and it is unknown how they will act when encountering SiLK Flow records that report zero packets.
To work around the bug in the template used by the Cisco ASA router, the configure script, as of SiLK-2.5.0, provides the --enable-asa-zero-packet-hack switch. This switch affects SiLK in the following ways:
Using the --enable-asa-zero-packet-hack switch has no effect if configure is unable to find libfixbuf 1.1.0 or newer.
The packing logic used by rwflowpack to categorize flow records as incoming or outgoing, web or non-web, et cetera, is determined by a plug-in that is loaded when rwflowpack is invoked. The name of this plug-in must be passed to rwflowpack via the --packing-logic switch.
Using a plug-in for flow categorization makes it easier to change the packing logic or to test new categorization schemes. However, it requires that the plug-in be available and that you not have disabled plug-in support by building statically-linked applications (Section 2.3.9).
If you wish to compile the packing-logic into rwflowpack, you must specify the --enable-packing-logic switch when you run configure. The argument to this switch is the C source file containing the packing logic to use for this SiLK installation. For example, if you wish to use the twoway packing logic described in Appendix A, run
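A sketch of such an invocation follows. The path site/twoway/packlogic-twoway.c is an assumption about where the twoway source file lives relative to the top of the source tree; confirm that it exists in your copy before relying on it.

```shell
# Assumed location of the twoway packing-logic source, relative to the
# top of the source tree ($SUITEROOT). Verify the path in your release.
PACKLOGIC=site/twoway/packlogic-twoway.c

if [ -x ./configure ] && [ -f "$PACKLOGIC" ]; then
    ./configure --enable-packing-logic="$PACKLOGIC"
fi
```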
All of the SiLK applications (i.e., both the analysis tools and the packing [flow collection and storage] daemons) and their associated manual pages will be built and installed unless the --disable-packing-tools or --disable-analysis-tools switches are passed to configure. You can speed the building of the software if you disable the parts of the system you do not require. For example, a remote collection machine does not need the analysis tools (though they can be useful to have for debugging).
The configure script will build SiLK with support for dynamic-linking, where the common library functions of SiLK are maintained in separate files that the operating system automatically loads when you invoke an application. (The alternative is called static-linking.) While dynamic-linking allows the kernel to maintain one image of the library for simultaneous invocations of SiLK tools, it makes moving the binaries almost impossible since the libraries must move as well, and often the binaries are configured to look in a particular location for the libraries.
If you wish to build without dynamic-linking support, give configure the --enable-static-applications switch, which forces the applications to be statically linked. However, this may result in some plug-ins not working correctly.
An alternative is to specify the --disable-shared switch to configure, but note that this results in the plug-ins not being compiled at all.
If you specify --enable-static-applications or --disable-shared to configure, you also need to specify the --enable-packing-logic switch since rwflowpack will not be able to load the packing logic as a plug-in. See Section 2.3.7 for a description of the --enable-packing-logic switch and the argument the switch requires.
If SiLK is compiled with GnuTLS support, the communication between rwsender and rwreceiver can be encrypted and authenticated once the appropriate certificates have been created and distributed. GnuTLS is the GNU Project’s Transport Layer Security Library, and it is available from http://www.gnu.org/software/gnutls/. Note that SiLK requires GnuTLS v1.4.1 or greater, and SiLK-2.x does not yet have support for GnuTLS v3.x.
The configure script will look for the pkg-config(1) specification file for GnuTLS (gnutls.pc) in the standard pkg-config directories, and if GnuTLS is installed in a standard location, configure should be able to locate it. If you have installed GnuTLS but configure does not find it, you can run configure with the --with-gnutls=dir switch to add the directory dir to pkg-config’s search path (configure will add dir to the PKG_CONFIG_PATH environment variable). The gnutls.pc file is normally installed in the lib/pkgconfig subdirectory of the location where GnuTLS was installed.
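The version constraint on GnuTLS can be checked with pkg-config before configuring. This is a sketch only: the upper bound 2.99 is an arbitrary stand-in for "anything below 3.x", and the variable name gnutls_ok is made up for the example.

```shell
# Report whether the installed GnuTLS falls in the range SiLK-2.x accepts:
# at least 1.4.1 and below 3.x (2.99 is a stand-in upper bound).
gnutls_ok=unknown
if command -v pkg-config >/dev/null 2>&1; then
    if pkg-config --atleast-version=1.4.1 gnutls &&
       pkg-config --max-version=2.99 gnutls; then
        gnutls_ok=yes
    else
        gnutls_ok=no
    fi
fi
echo "GnuTLS usable by SiLK-2.x: $gnutls_ok"
```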
By default, SiLK uses UTC when printing timestamps to the user, and it expects timestamps from the user to be in UTC. Giving configure the --enable-localtime switch will modify SiLK to print and expect times in the local timezone. (Data files are always indexed by UTC.)
The configure script will attempt to locate the pcap library and header files. If they are not found or if they do not have the required functions, SiLK will be built without support for the packet-flow conversion tools rwptoflow and rwpmatch.
If you wish to specify that SiLK use a particular version of the pcap library, pass the --with-pcap=dir switch to configure, where dir contains include/pcap.h and lib/libpcap.a (or a shared version of the library).
The rwresolve tool reads textual input and converts IPv4 addresses to host names. It can take advantage of the Asynchronous DNS (ADNS) library if that library exists on your system. ADNS can be downloaded from http://www.chiark.greenend.org.uk/~ian/adns/.
The configure script will attempt to locate the adns library and header file. If they are not found or if they do not have the required functions, rwresolve will be built without support for ADNS.
If you wish to specify that SiLK use a particular version of the adns library, pass the --with-adns=dir switch to configure, where dir contains include/adns.h and lib/libadns.a (or a shared version of the library).
If SiLK is compiled with libipa support, the rwipaimport and rwipaexport programs will be compiled. These tools interact with an IPA (IP Association) database, which stores information about IP addresses. rwipaimport takes an existing SiLK IPset, Bag, or Prefix Map and stores it in the database; rwipaexport reads data from the IPA database to create a SiLK IPset, Bag, or Prefix Map. libipa is a separate library available from http://tools.netsa.cert.org/ipa/. SiLK requires libipa-0.5.0 or greater.
The configure script will look for the pkg-config(1) specification file for libipa (libipa.pc) in the standard pkg-config directories, and if libipa is installed in a standard location, configure should be able to locate it. If you have installed libipa but configure does not find it, you can run configure with the --with-libipa=dir switch to add the directory dir to pkg-config’s search path (configure will add dir to the PKG_CONFIG_PATH environment variable). The libipa.pc file is normally installed in the lib/pkgconfig subdirectory of the location where libipa was installed.
By default, SiLK is built with full optimization (assuming the compiler accepts -O3 for optimization), with no debugging, and with assert()s disabled. Pass the --disable-optimization, --enable-debugging, and --enable-assert switches to configure to modify these settings. If your compiler uses a different switch to enable optimization (such as -xO4 for Solaris’ cc), you may specify it with --enable-optimization=-xO4.
You will need to configure the source code for each machine that runs any part of the SiLK Collection and Analysis Suite.
Run the configure script to configure the SiLK source code. The following command would configure the software to use /data as the location of the data repository and to expect to be installed into the /usr/local directory:
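The sketch below covers only the part of that command whose switch is named in this handbook text, the install prefix. The name of the data-repository switch is deliberately not guessed here; the grep step lists the matching switches from this release's own configure --help so that the correct one can be appended with the value /data.

```shell
# Install prefix named in the text above.
PREFIX=/usr/local

if [ -x ./configure ]; then
    # Show the switches that mention the data root for this release.
    ./configure --help | grep -i rootdir || true
    # Configure with the prefix; append the data-root switch found above.
    ./configure --prefix="$PREFIX"
fi
```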
Consult the previous section for additional switches that you may need or wish to pass to configure to help it find a library or to enable an optional feature.
configure will run several tests on your platform and use the results of these tests to create several files. When configure has finished, it will print a summary of how it has configured the SiLK source code:
The above message is also written to the silk-summary.txt file in the directory where you ran configure.
Verify that the configuration matches your expectations.
To build SiLK, simply type make from the top of the source tree:
You can then install the software. Depending on where you chose to install, you may need to become the root user first. This command will install the applications, the support libraries, the plug-ins, and the manual pages:
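The build and install steps can be chained so that installation happens only after a successful build. The wrapper function below is a hypothetical convenience, not something the distribution provides; it refuses to run until configure has produced a Makefile.

```shell
# Hypothetical wrapper: build, then install, stopping at the first failure.
build_and_install() {
    if [ ! -f Makefile ]; then
        echo "no Makefile here; run configure first" >&2
        return 1
    fi
    make && make install
}

# Run from the top of the source tree (as root if the prefix requires it).
build_and_install || echo "build or install did not complete"
```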
As this chapter demonstrates, there are many configuration choices an administrator can make when creating a SiLK installation. Because of this, it is difficult for the SiLK authors to provide a single RPM that will work for every installation.
SiLK works around this by providing an RPM spec file template in the distribution (silk.spec.in). When you run the configure script, one of its output files is silk-2.5.0.spec, which is an RPM spec file that matches the configuration options you passed to configure.
To create the RPMs, you will largely follow the instructions provided in Sections 2.1 through 2.4 of this chapter. In Section 2.2, the only installation directory you need to choose is the SILK_DATA_ROOTDIR; that is, the root of the directory tree where the SiLK Flow files will be stored.
Once you have configured SiLK, you can use the RPM spec file (silk-2.5.0.spec), the SiLK distribution file (silk-2.5.0.tar.gz), and the rpmbuild utility to create RPMs that you can install.
The RPM spec file generates the following RPMs:
This section describes the customization of the analysis tools. The manual page for each tool will be installed under $SILK_PATH/share/man/man1/ when you install SiLK. (In addition, http://tools.netsa.cert.org/silk/docs.html provides the manual pages as individual web pages and as a single volume in The SiLK Reference Guide. The web site also contains a tutorial on using the analysis suite: Using SiLK for Network Traffic Analysis: Analysts’ Handbook.)
While nothing in this section is required to use SiLK, these steps will enhance the utility of the software.
In addition to the information contained in the NetFlow or IPFIX flow record (e.g., source and destination addresses and ports, IP protocol, time stamps, data volume), every SiLK flow record has two additional pieces of information:
The purpose of the SiLK site configuration file, silk.conf, is to define the sensors, classes, and types to use when packing and accessing the SiLK flow data. The first time you install SiLK, and any time you add new sensors (IPFIX or NetFlow generators) to a deployment, you will need to update silk.conf.
Once you have made the changes, rename the file silk.conf and save it in the root of your data repository, normally /data.
You may continue to Section 3.2.
When you install SiLK, sample site configuration files are installed in $SILK_PATH/share/silk/SITE-silk.conf. The various files provide different sets of classes and types, and the site file must coordinate with the packing rules that you will use at your site. For information on the twoway and generic site files, see Appendix A. We recommend use of the twoway-silk.conf file.
Copy twoway-silk.conf to a temporary location, renaming the file as silk.conf when you copy it, and open silk.conf in a text editor. If you are using the twoway-silk.conf file, you will see the following near the beginning of the file:
Each line of form
defines a sensor, where
As distributed, the twoway-silk.conf is configured with 15 sensors having names S0, S1, through S14. (If you have 15 or fewer sensors and these names are satisfactory, you may save the silk.conf file to the root of your data repository, typically /data, and skip ahead to Section 3.2.)
You may add, remove, or rename the sensors. Often the sensor names reflect the location of a router or the ISP the router connects to. There are some important things to keep in mind when modifying the list of sensors:
Once you have edited the sensor definitions, you must update the sensors command in the same file (line 10) to contain the list of sensor names.
For example, if you had three routers Alpha, Bravo, and Charlie you would edit the site configuration file to read:
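As a sketch of that edit, the shell fragment below writes the sensor portion of a site file for the three routers to a scratch file and checks that the two lists agree. The numeric sensor IDs 0 to 2 and the scratch file name are assumptions; the class and type statements from the template are omitted here and must be kept in the real silk.conf.

```shell
# Scratch copy of the edited sensor definitions (file name is illustrative).
cat > silk-sensors.example <<'EOF'
sensor 0 Alpha
sensor 1 Bravo
sensor 2 Charlie

class all
    sensors Alpha Bravo Charlie
end class
EOF

# Every name on the "sensors" line should have a matching "sensor" definition.
for name in $(sed -n 's/^ *sensors //p' silk-sensors.example); do
    grep -q "^sensor [0-9][0-9]* $name\$" silk-sensors.example \
        || echo "undefined sensor: $name"
done
```

The loop prints nothing when every listed sensor is defined.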
You should not need to change the class and type statements in the file, and doing so may break the packing rules in use at your site.
Once you have modified the silk.conf file, you should copy it to the root of your data repository, typically /data (cf. Section 2.2).
A single installation of SiLK may be used to query multiple data storage locations (though each invocation of a command can only query one storage location). Install a silk.conf into the root of each data storage tree, and set the SILK_DATA_ROOTDIR environment variable to the root of the tree you wish to query.
The address type utility in SiLK provides a quick way to categorize an IPv4 address as internal to your network, external, or non-routable. The --stype and --dtype switches to rwfilter allow one to partition by this category, and the stype and dtype fields in rwcut, rwgroup, rwsort, rwstats, and rwuniq will display, group, sort, or count by this category. To use this functionality, you must create and install a mapping file that describes your IP space. If you do not wish to use this functionality (or if you wish to install it at a later time), you may skip to Section 3.3.
Save the text file, convert it into a binary prefix map, and copy it into the installation tree:
You may continue to Section 3.3.
The mapping file is named address_types.pmap, and you must build this file by creating a text file and processing it with the rwpmapbuild tool. A template for the text file is provided in $SILK_PATH/share/silk/addrtype-templ.txt. The beginning of the file contains some setup information for rwpmapbuild:
Note: Do not change the numerical values for the mappings (lines 2–4); the address type utility requires those particular values.
As distributed, the addrtype-templ.txt file contains CIDR blocks that should not be seen (are non-routable) on the public Internet. Each CIDR block is labeled as non-routable and is preceded with an explanatory comment:
You may wish to make adjustments to this list depending on what you plan to instrument and where your sensors are located.
Copy the addrtype-templ.txt file to a new file, for example addresses.txt. Open addresses.txt in a text editor, add lines to the file describing your IP space (one CIDR block per line), and label each line internal; for example:
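For instance, the following appends two internal blocks to the working copy. The CIDR blocks are documentation placeholders rather than real site space, so substitute your own address ranges; the one-block-per-line layout with a trailing label is the form described above.

```shell
# Append site-specific blocks to the working copy of the template.
# Both CIDR blocks are placeholders; replace them with your own space.
cat >> addresses.txt <<'EOF'
# Site address space
198.51.100.0/24    internal
203.0.113.0/24     internal
EOF
```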
Any CIDR block that is not listed in the file will be treated as an external address (due to the default rule on line 7).
Once you’ve created and saved the text file, convert it into a binary prefix map and copy it into the installation tree:
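The conversion and install steps can be sketched as follows. The --input-file and --output-file switch names and the destination under $SILK_PATH/share/silk/ are assumptions drawn from the paths used in this section; confirm both against rwpmapbuild(1) and addrtype(3) for your release before relying on them.

```shell
# Build the prefix map only when the input file and the tool are present.
input=addresses.txt
if [ -r "$input" ] && command -v rwpmapbuild >/dev/null 2>&1; then
    # Switch names assumed; see rwpmapbuild(1) for your release.
    rwpmapbuild --input-file="$input" --output-file=address_types.pmap &&
        cp address_types.pmap "$SILK_PATH/share/silk/address_types.pmap"
else
    echo "skipped: need a readable $input and rwpmapbuild on PATH" >&2
fi
```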
For additional information, see the addrtype(3) and rwpmapbuild(1) manual pages.
Some SiLK tools can use a data file to map IPv4 addresses to the country where that IP is located. With the data file, named country_codes.pmap, in place, an analyst can use the scc and dcc switches (on rwfilter) and fields (on rwcut, rwgroup, rwsort, rwstats, and rwuniq) to partition, display, group, sort, and count by country. This section describes how to build and install the data file. If you do not wish to use this functionality (or if you wish to install it later), you may skip this section.
The data file is based on the GeoIP Country database distributed by MaxMind (http://www.maxmind.com/). MaxMind distributes multiple versions of its GeoIP Country database; the GeoLite Country is a free evaluation copy that is “98% accurate” and is updated monthly. In addition, MaxMind sells versions with higher accuracy which are updated weekly, and it offers various subscription services.
Obtain your copy of the MaxMind GeoIP Country database.
For additional information, see the rwgeoip2ccmap(1) and ccfilter(3) manual pages.
This section describes how to configure your site to use a single machine to collect, pack, and analyze flow data as shown in Figures 1.1 and 1.2.
For this configuration, rwflowpack is used to collect, categorize, convert, and store the flow records on a single machine, and the analysis tools are installed on this same machine.
If this does not describe your packing configuration, refer to the list of possible configurations in Section 1.3.
This section provides instructions on creating the Sensor Configuration file used when collecting and categorizing the flow data. The Sensor Configuration file serves two purposes:
You will find full documentation for the Sensor Configuration Language in the sensor.conf(5) manual page. This section serves as a starter guide.
This handbook will use sensor.conf as the name of the Sensor Configuration file, but it may have any reasonable name.
To meet the two purposes of the Sensor Configuration file, three types of objects are defined:
The SiLK collection tools support the following types of probes:
The syntax of the Sensor Configuration file allows simple key-value pairs on each line, where the key and value are separated by white space. Multiple values are separated by white space and/or comma. Blank lines and comments—which begin with ‘#’ and continue to the end of the line—are ignored.
The probe block assigns a name to the probe and specifies the type of probe. Each probe must have a unique name; since there is often a one-to-one mapping between probes and sensors, each probe usually has the same name as its sensor. Some sample probe blocks follow.
The following block defines the “Alpha” probe and it instructs rwflowpack or flowcap to listen on UDP port 18001 for NetFlow v5 PDUs:
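A block matching this description might look like the following sketch. It is written to a scratch file (the file name is an assumption) so that it can be reviewed before being merged into sensor.conf; the statement names follow the sensor.conf(5) syntax and should be checked against the manual page for your release.

```shell
# Scratch file for the "Alpha" probe; merge into sensor.conf after review.
cat > probe-alpha.example <<'EOF'
# Listen on 18001/udp for NetFlow v5 PDUs
probe Alpha netflow-v5
    listen-on-port 18001
    protocol udp
end probe
EOF
```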
The “Bravo-ipfix” probe tells rwflowpack or flowcap to listen on 18002/tcp for IPFIX flows:
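One way to express this probe is sketched below; only the probe type, port, and transport come from the text, and the scratch file name is illustrative.

```shell
# Scratch file for the "Bravo-ipfix" probe.
cat > probe-bravo.example <<'EOF'
# Listen on 18002/tcp for IPFIX flows
probe Bravo-ipfix ipfix
    listen-on-port 18002
    protocol tcp
end probe
EOF
```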
In the next block, rwflowpack or flowcap will listen on UDP port 18003 for NetFlow v5 data. Connections from hosts other than 10.1.1.101 will be ignored.
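A sketch of such a block follows. The probe name "Charlie" is an assumption, since the text does not name this probe; the accept-from-host statement is what restricts the probe to the one sending host.

```shell
# Scratch file for a probe that accepts data from a single host.
# The probe name "Charlie" is assumed for illustration.
cat > probe-charlie.example <<'EOF'
probe Charlie netflow-v5
    listen-on-port 18003
    protocol udp
    accept-from-host 10.1.1.101
end probe
EOF
```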
The “Delta-in” and “Delta-out” probes shown next can be used when the monitoring point sees unidirectional traffic, for example, when all incoming traffic enters the monitor on one network interface card (NIC) and all outgoing traffic enters the monitor on a different NIC. A separate collection process is used for each NIC, each sending to a different port (9902/tcp and 9907/tcp). The rwflowpack or flowcap program will bind to a particular host address (192.168.200.1).
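The pair of probes could be written along these lines. The ipfix probe type and the assignment of 9902 to "Delta-in" and 9907 to "Delta-out" are assumptions; the text gives only the two ports and the bind address.

```shell
# Scratch file for the two unidirectional probes.
# Probe type and the port-to-probe pairing are assumed.
cat > probe-delta.example <<'EOF'
probe Delta-in ipfix
    listen-as-host 192.168.200.1
    listen-on-port 9902
    protocol tcp
end probe

probe Delta-out ipfix
    listen-as-host 192.168.200.1
    listen-on-port 9907
    protocol tcp
end probe
EOF
```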
The “Echo” and “Foxtrot” probes can be used by rwflowpack. These probes instruct rwflowpack to periodically poll the named directories for files containing NetFlow v5 PDUs. These directories are where the NetFlow Collector writes its data files.
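Directory-polling probes can be sketched as follows. The two directory paths are placeholders, not values from this handbook; use the directories where your NetFlow Collector actually writes its files.

```shell
# Scratch file for the polling probes; both paths are placeholders.
cat > probe-poll.example <<'EOF'
probe Echo netflow-v5
    poll-directory /var/netflow/echo
end probe

probe Foxtrot netflow-v5
    poll-directory /var/netflow/foxtrot
end probe
EOF
```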
When creating probes to collect IPFIX data that includes 802.1Q VLAN identifiers, SiLK can store these values (IPFIX’s vlanId and postVlanId fields) in the SiLK Flow record’s fields that typically hold the SNMP interfaces (input and output). In the sensor block, rwflowpack can use the values to discard certain flow records. The “Golf” and “Hotel” probes will extract and store the VLAN identifiers.
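A minimal sketch of these probes is shown below. The interface-values statement selects the VLAN identifiers in place of the SNMP interfaces; the ports 18004/tcp and 18005/tcp are placeholders, since the text does not say where these probes listen.

```shell
# Scratch file for probes that store VLAN ids in the interface fields.
# Ports are placeholders chosen for illustration.
cat > probe-vlan.example <<'EOF'
probe Golf ipfix
    listen-on-port 18004
    protocol tcp
    interface-values vlan
end probe

probe Hotel ipfix
    listen-on-port 18005
    protocol tcp
    interface-values vlan
end probe
EOF
```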
A group block gives a name to a list of either CIDR blocks or interface values. To reference an existing group, type an “at” character (@) followed by the name of the group. A group reference can be used in group blocks or in several statements in the sensor block as described in the next section. When using a group reference, the group must contain values consistent with the statement where the group is being used.
The sensor block configures a sensor. The name of the sensor block must be the name of a sensor defined in the silk.conf site configuration file (cf. Section 3.1). The sensor block specifies which probes are associated with that sensor. Whenever flow data arrives on a probe, the sensor associated with the probe notices the data and processes it. The sensor’s processing of the flow data uses the other attributes defined in the sensor block to categorize the flows. Some examples are given here; for the details on how the packlogic-twoway.so plug-in uses this information, see Appendix A.
The following sensor block instructs rwflowpack to categorize a flow from the “Alpha” probe as “incoming” when the incoming SNMP interface on the flow is 3 or 8. All other flows are considered outgoing. Flows processed by this rule are labeled as being from the “Alpha” sensor.
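Under the twoway packing logic, such a sensor might be sketched as below. The "remainder" keyword assigns every interface not listed as external to the internal network; confirm the statement spellings against sensor.conf(5) for your release.

```shell
# Scratch file for the "Alpha" sensor.
cat > sensor-alpha.example <<'EOF'
sensor Alpha
    netflow-v5-probes Alpha
    external-interfaces 3, 8
    internal-interfaces remainder
end sensor
EOF
```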
The following example is the same as the previous, but it uses the group “Alpha-external” to specify the external interfaces.
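The group-based form could look like this sketch, where the group is defined before the sensor that references it with the "@" prefix.

```shell
# Scratch file: the same sensor, with the interfaces held in a group.
cat > sensor-alpha-group.example <<'EOF'
group Alpha-external
    interfaces 3, 8
end group

sensor Alpha
    netflow-v5-probes Alpha
    external-interfaces @Alpha-external
    internal-interfaces remainder
end sensor
EOF
```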
The next block processes IPFIX flows collected by the “Bravo-ipfix” probe. If the source address is not in 192.168.12.0/24, the flow is considered incoming; otherwise, it is considered outgoing. These flows have “Bravo” as their sensor.
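An address-based sensor of this kind might be sketched as follows; here the internal network is the listed CIDR block and every other address is treated as external.

```shell
# Scratch file for the "Bravo" sensor, categorized by IP block.
cat > sensor-bravo.example <<'EOF'
sensor Bravo
    ipfix-probes Bravo-ipfix
    internal-ipblocks 192.168.12.0/24
    external-ipblocks remainder
end sensor
EOF
```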
The following example uses a group when creating the “Bravo” sensor.
For the following sensor, rwflowpack categorizes a flow as incoming if its incoming SNMP interface is 7; an outgoing SNMP interface of 2 means the flow did not leave the router.
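A sketch of a sensor with a null interface follows. The sensor and probe name "Charlie" is an assumption, as the text does not name them; the interface numbers are the ones given above.

```shell
# Scratch file for a sensor with a null interface.
# The name "Charlie" is assumed for illustration.
cat > sensor-charlie.example <<'EOF'
sensor Charlie
    netflow-v5-probes Charlie
    external-interfaces 7
    null-interfaces 2
    internal-interfaces remainder
end sensor
EOF
```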
The data from the “Delta-in” and “Delta-out” probes above are merged into a single “Delta” sensor by creating two sensor blocks that each pack to the same sensor. All flows collected by “Delta-in” will be labeled as incoming; those collected by “Delta-out” as outgoing.
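The two blocks could be sketched as shown here. Each block fixes the direction of every flow from its probe with the source-network and destination-network statements; the ipfix-probes statement assumes the Delta probes are IPFIX probes.

```shell
# Scratch file: two sensor blocks that pack to the same "Delta" sensor.
cat > sensor-delta.example <<'EOF'
sensor Delta
    ipfix-probes Delta-in
    source-network external
    destination-network internal
end sensor

sensor Delta
    ipfix-probes Delta-out
    source-network internal
    destination-network external
end sensor
EOF
```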
The following sensor packs flows collected by the “Echo” probe above, but it discards data that was blocked by the router—that is, traffic that went to the null interface will not be packed. The sensor definition assumes the null interface is 0 and the group “internet-nics” specifies the network cards on the router that face the Internet.
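Such a sensor might be sketched as below. The discard-when statement is evaluated before categorization, so flows sent to interface 0 never reach the repository. The "internet-nics" group is referenced but not defined here, because the text does not list its members; it must be defined earlier in the real sensor.conf.

```shell
# Scratch file for the "Echo" sensor.
# The group "internet-nics" must be defined elsewhere in sensor.conf.
cat > sensor-echo.example <<'EOF'
sensor Echo
    netflow-v5-probes Echo
    discard-when destination-interfaces 0
    external-interfaces @internet-nics
    internal-interfaces remainder
end sensor
EOF
```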
When the same probe is specified in multiple sensors, each sensor has a chance to process the flows. Suppose “Fox” and “Trot” are two sensors whose address space is defined in the groups “fox-net” and “trot-net”, and suppose each sensor processes the data collected by the “Foxtrot” probe. Note that the “Fox” sensor will see data between “trot-net” and the Internet, and rwflowpack would normally pack that data at “Fox” as external-to-external (“ext2ext”) traffic since it does not involve “fox-net”; however, that may not be desirable. The following causes rwflowpack to discard data that is not associated with the appropriate address space.
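One way to write the two sensors is sketched below. The discard-unless statement keeps only flows where either address falls in the sensor's own group; the groups "fox-net" and "trot-net" are assumed to be defined earlier in sensor.conf, and the netflow-v5-probes statement assumes "Foxtrot" is the NetFlow v5 probe described earlier. Check the discard-unless spelling against sensor.conf(5).

```shell
# Scratch file: two sensors sharing the "Foxtrot" probe.
# Groups fox-net and trot-net must be defined elsewhere in sensor.conf.
cat > sensor-foxtrot.example <<'EOF'
sensor Fox
    netflow-v5-probes Foxtrot
    discard-unless any-ipblocks @fox-net
    internal-ipblocks @fox-net
    external-ipblocks remainder
end sensor

sensor Trot
    netflow-v5-probes Foxtrot
    discard-unless any-ipblocks @trot-net
    internal-ipblocks @trot-net
    external-ipblocks remainder
end sensor
EOF
```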
The following example is similar to the previous in that multiple sensors get data from a single probe, except it discards traffic based on the VLAN identifiers that the “Golf” probe stored in the flow records. The first three sensors only pack traffic that matches their specific VLAN identifier, while the “Golf-Extra” sensor will pack any traffic that was not stored in the other three sensors.
The following summarizes the most commonly used statements in the sensor.conf file. For the full syntax, see the sensor.conf(5) manual page.
Choose locations and create the following directories if they do not exist:
Build and install the SiLK software as described in Sections 2 and 3. Be certain to customize silk.conf and install it in the SILK_DATA_ROOTDIR directory.
Follow the instructions in Section 4.1 to create the Sensor Configuration file, and copy the file into the CONFIG_FILE_DIR directory.
To provide easier control of the SiLK daemons in UNIX-like environments, example sh-scripts are provided. The names of these scripts are the same as the daemon they control. The scripts are installed in the $SILK_PATH/share/silk/etc/init.d/ directory, but you should copy them to the standard location for start-up scripts on your system (e.g., /etc/init.d/ on Linux and other SysV-type systems).
To generate the command line for the daemon named daemon, the control script checks settings in the text file SCRIPT_CONFIG_LOCATION/daemon.conf. Before using a control script, you must create a daemon.conf file and customize it for your environment.
For each daemon, an example configuration file is installed in the $SILK_PATH/share/silk/etc/ directory. You will need to copy the file to the SCRIPT_CONFIG_LOCATION directory and modify it as described in this section. (The format of these configuration files may change between releases of SiLK. When upgrading from a previous release, you should merge your previous settings into the new version of the configuration file.)
You should not need to edit any of the control scripts; however, be aware the value of SCRIPT_CONFIG_LOCATION they use was set when you ran configure.
Many of the variable names in rwflowpack.conf correspond to a command line switch on rwflowpack. By referencing the rwflowpack manual page and the documentation for each variable in that file, you should be able to determine how to set each variable. This section highlights some of the settings. The switch that the variable controls follows each name.
Save the rwflowpack.conf file into the SCRIPT_CONFIG_LOCATION directory that you created above.
To test whether everything is correct, try starting rwflowpack using the control script:
If rwflowpack fails to start, it prints an error message to the standard error. If everything is correct, rwflowpack writes a file named rwflowpack.pid into the PID_DIR directory, and log messages are written either to files in LOG_DIR or to your machine’s system log.
You can use the control script to stop rwflowpack:
The log messages that rwflowpack generates (assuming no data was collected) will resemble:
If you wish, you can make rwflowpack start automatically when the machine boots by adding the rwflowpack control script to your machine’s boot sequence. The details vary among operating systems.
For RedHat Linux, issue the following commands:
At this point, you should be able to start the packer using the following command:
Follow the instructions in Section 8 to start the flow generator.
If rwflowpack is listening for NetFlow traffic on UDP port(s), follow the instructions in Section 8.3 to increase the maximum socket buffer size allowed by your kernel.
This section describes how to configure your site to use the packing configuration that supports remote data collection and remote SiLK Flow storage (see Figures 1.3 and 1.4).
For this configuration, there are three sets of machines:
One or more machines act as collection machines. Each collection machine runs the flowcap daemon to collect the flows and store them in “flowcap files”. The rwsender daemon also runs on each collection machine, and it transfers the files from the collection machine to the packing machine.
There is typically one machine, called the packing machine, that runs rwflowpack to read the files generated by flowcap, to convert the flow records they contain, to categorize flows, and to write “incremental files” containing small numbers of SiLK flow records. The packing machine runs an rwreceiver process to accept the files from the collection machines, and it runs the rwsender daemon to transfer the incremental files from the packing machine to each storage machine.
One or more storage machines run the rwreceiver daemon to receive the incremental files, and the rwflowappend daemon appends the incremental files to their final location in hourly files. In addition, each storage machine has the SiLK analysis tools installed to read and analyze the data in the hourly files.
If this does not describe your packing configuration, refer to the list of possible configurations in Section 1.3.
These instructions assume the rwreceiver and rwsender daemons on the packing machine always act as clients. That is, the rwsender on each collection machine runs in server mode as does the rwreceiver on each storage machine.
For an installation that uses remote data collection and SiLK Flow storage, you must build, install, and configure the software on the packing machine as well as on every collection machine and storage machine.
The packing machine runs three daemons:
In this section you configure and build the software, and configure rwflowpack. The configuration of rwreceiver and rwsender occurs in later sections (5.3 and 5.5, respectively).
Choose locations and create the following directories if they do not exist:
Build and install the SiLK software as described in Section 2. Since you will not be storing the SiLK flows on the packing machine, you may ignore the --enable-data-rootdir switch. For faster compilation and to save disk space, you can avoid building the analysis tools by passing the --disable-analysis-tools switch to configure.
Follow the instructions in Section 3.1 to customize the silk.conf file, and save it to $SILK_PATH/share/silk/silk.conf so rwflowpack will locate it. You can ignore the remainder of Section 3 on the packing machine.
Follow the instructions in Section 4.1 to create the Sensor Configuration file, and copy the file into the CONFIG_FILE_DIR directory.
rwflowpack runs on the packing machine to process files generated by flowcap and create incremental files for rwflowappend.
To provide easier control of the SiLK daemons in UNIX-like environments, example sh-scripts are provided. The names of these scripts are the same as the daemon they control. The scripts are installed in the $SILK_PATH/share/silk/etc/init.d/ directory, but you should copy them to the standard location for start-up scripts on your system (e.g., /etc/init.d/ on Linux and other SysV-type systems).
To generate the command line for the daemon named daemon, the control script checks settings in the text file SCRIPT_CONFIG_LOCATION/daemon.conf. Before using a control script, you must create a daemon.conf file and customize it for your environment.
For each daemon, an example configuration file is installed in the $SILK_PATH/share/silk/etc/ directory. You will need to copy the file to the SCRIPT_CONFIG_LOCATION directory and modify it as described in this section. (The format of these configuration files may change between releases of SiLK. When upgrading from a previous release, you should merge your previous settings into the new version of the configuration file.)
You should not need to edit any of the control scripts; however, be aware the value of SCRIPT_CONFIG_LOCATION they use was set when you ran configure.
Many of the variable names in rwflowpack.conf correspond to a command line switch on rwflowpack. By referencing the rwflowpack manual page and the documentation for each variable in that file, you should be able to determine how to set each variable. This section highlights some of the settings. The switch that the variable controls follows each name.
The following settings are common across all daemon.conf files:
Save the rwflowpack.conf file into the SCRIPT_CONFIG_LOCATION directory that you created above.
To test whether the settings in rwflowpack.conf are correct, use the control script to start rwflowpack:
If rwflowpack fails to start, it prints an error message to the standard error. If everything is correct, rwflowpack writes a file named rwflowpack.pid into the PID_DIR directory, and log messages are written either to files in LOG_DIR or to your machine’s system log.
You can stop rwflowpack while you configure the other parts of the system:
The log messages that rwflowpack generates will resemble:
If you wish, you can make rwflowpack start automatically when the packing machine boots by adding the rwflowpack control script to its boot sequence. The details vary among operating systems.
For RedHat Linux, issue the following commands:
At this point, you should be able to start and stop the packer using the following commands:
rwreceiver runs on the packing machine to accept, from the collection machine(s), the files generated by flowcap and sent by rwsender.
Each rwsender and rwreceiver is configured with an identifier of its own and the identifier(s) of the rwreceiver(s) or rwsender(s) that may connect to it. The connection will not be established if the identifier provided by the other process is not recognized. In addition, every rwsender that communicates with the same rwreceiver must have a unique identifier; likewise, every rwreceiver that communicates with the same rwsender must have a unique identifier.
Create the identifier that the rwreceiver client on the packing machine sends when it contacts the rwsender daemon running on each collection machine. The identifier should contain only printable, non-whitespace characters; the following characters are illegal: colon (:), slash (/ and \), period (.), and comma (,).
The identifier should reflect that this is the rwreceiver process associated with the packer. These instructions use rcv-packer1.
You will use this identifier when you set up the rwsender daemon on each collection machine in Section 5.2.3, and when you configure rwreceiver on the packing machine (Section 5.3).
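The character rules for identifiers can be checked with a small helper before you commit to a name. The function below is a hypothetical convenience, not part of SiLK, and it assumes "printable" means printable ASCII (0x21 through 0x7E).

```shell
# valid_ident NAME: succeed when NAME is a usable identifier.
# Hypothetical helper; "printable" is assumed to mean ASCII 0x21-0x7E.
valid_ident() {
    [ -n "$1" ] || return 1
    # Count bytes that are whitespace, control, or non-ASCII characters.
    bad=$(printf '%s' "$1" | LC_ALL=C tr -d '!-~' | wc -c)
    [ "$bad" -eq 0 ] || return 1
    # Reject colon, slash, backslash, period, and comma.
    case $1 in
        *:*|*/*|*\\*|*.*|*,*) return 1 ;;
    esac
    return 0
}

for id in rcv-packer1 'rcv.packer1' 'rcv packer1'; do
    if valid_ident "$id"; then echo "ok: $id"; else echo "rejected: $id"; fi
done
```

The same check applies to the rwsender identifier chosen in the next section.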
rwsender runs on the packing machine to transfer the incremental files generated by rwflowpack to the rwreceiver and rwflowappend processes on the storage machines.
Create the identifier that the rwsender client on the packing machine sends when it contacts the rwreceiver daemon running on each storage machine. The identifier should contain only printable, non-whitespace characters; the following characters are illegal: colon (:), slash (/ and \), period (.), and comma (,).
The identifier should reflect that this is the rwsender process associated with the packer. These instructions suggest you use send-packer1.
You will use this identifier when you set up the rwreceiver daemon on each storage machine in Section 5.4.3, and when you configure rwsender on the packing machine (Section 5.5).
If SiLK is compiled with GnuTLS support (see Section 2.3.10), rwsender and rwreceiver can communicate using a secure (encrypted and authenticated) layer over TCP. If SiLK was not compiled with GnuTLS support or you do not wish to use this feature, you may skip this section.
When the GnuTLS-specific options are specified, rwsender and rwreceiver use GnuTLS for all communications with other rwreceivers and rwsenders. The applications will not allow communication to an application that is not using GnuTLS. If you wish to use GnuTLS for some communication but not others, you will need to run multiple instances of rwsender and rwreceiver.
To use this feature, the rwsender and rwreceiver each need access to the PEM (Privacy Enhanced Mail) encoded root Certificate Authority (CA) file and either to a DER (Distinguished Encoding Rules) encoded PKCS#12 file or to a PEM encoded key and a PEM encoded certificate file. See Appendix C for instructions on creating these files using the GnuTLS certtool program.
The communication between rwsender and rwreceiver will be established as long as the PKCS#12 file or the key and certificate files both have the same CA. You can create a single key and certificate and use that one for all instances of rwsender and rwreceiver, or create a separate certificate/key pair for each instance of these programs.
For simplicity, these instructions assume you will use a single PKCS#12 file, named pkcs12.der, for all communication between any rwsender and rwreceiver. In addition, the instructions use rootcert.pem for the name of the CA root certificate file. These files should be installed in the CONFIG_FILE_DIR directory on the packing machine.
In Section 4.1, you created the Sensor Configuration file listing all the sensors and probes in your network. The instructions that follow assume that every collection machine is associated with a unique sensor named SENSOR.
Each collection machine runs two daemons:
You will perform these steps on every machine where remote collection occurs.
Choose locations and create the following directories if they do not exist:
Build and install the SiLK software as described in Section 2. Since you will not be storing the SiLK flows on the collection machine, you may ignore the --enable-data-rootdir switch on this machine. For faster compilation and to save disk space, you can avoid building the analysis tools by passing the --disable-analysis-tools switch to configure.
Copy the silk.conf file from the packing machine to this machine. If you save it in $SILK_PATH/share/silk/silk.conf, flowcap will automatically find it.
Copy the Sensor Configuration file from the packing machine to this machine and save it in the CONFIG_FILE_DIR directory.
If you are using GnuTLS, copy the rootcert.pem and pkcs12.der files that you created on the packing machine in Section 5.1.5 into the CONFIG_FILE_DIR directory on this machine.
flowcap runs on the collection machine to capture flow records and store them in files for transfer to the packing machine.
The SCRIPT_CONFIG_LOCATION/flowcap.conf file is used by the control script to generate the command line for flowcap. An example flowcap.conf file is available in the $SILK_PATH/share/silk/etc/ directory. These are the variables in the flowcap.conf file you will need to change:
Check that the values for the maximum percentage of the disk to use (FULLSPACE_MAX) and the minimum amount of free space to leave (FREESPACE_MIN) make sense at your site. The values specified in the file as shipped assume a single disk partition is dedicated to storing the files generated by flowcap.
You will also need to change some of the following; they are the same as those described for rwflowpack.conf on page 70:
| ENABLED. | Whether this file has been configured. |
| CREATE_DIRECTORIES. | Whether to create directories. |
| LOG_TYPE. | The type of logging. |
| LOG_DIR. | The directory for log files. |
| PID_DIR. | The directory for the PID file. |
| USER. | The user to run as. |
Save the flowcap.conf file into the SCRIPT_CONFIG_LOCATION directory that you created above.
Check the settings in flowcap.conf by using the control script to start and stop flowcap (cf. Section 5.1.2.2):
The log messages that flowcap generates will resemble:
Add the flowcap control script to the collection machine’s boot sequence if you want flowcap to start when the machine boots. This process is similar to the one you followed for rwflowpack (see Section 5.1.2.3).
rwsender runs on the collection machine to transfer the files generated by flowcap to the rwreceiver daemon running on the packing machine.
The SCRIPT_CONFIG_LOCATION/rwsender.conf file is used by the control script to generate the command line for rwsender. An example rwsender.conf file is available in the $SILK_PATH/share/silk/etc/ directory. These are the variables in the rwsender.conf file you will need to change:
You will also need to change some of the following; they are the same as those described for rwflowpack.conf on page 70:
| ENABLED. | Whether this file has been configured. |
| CREATE_DIRECTORIES. | Whether to create directories. |
| LOG_TYPE. | The type of logging. |
| LOG_DIR. | The directory for log files. |
| PID_DIR. | The directory for the PID file. |
| USER. | The user to run as. |
Save the rwsender.conf file into the SCRIPT_CONFIG_LOCATION directory that you created above.
Check the settings in rwsender.conf by using the control script to start and stop rwsender:
The log messages that rwsender generates will resemble:
Add the rwsender control script to the collection machine’s boot sequence if you want rwsender to start when the machine boots. This process is similar to the one you followed for rwflowpack (see Section 5.1.2.3).
Now that you have created the rwsender.conf file on the collection machines, you can configure rwreceiver on the packing machine. To recap, rwreceiver runs on the packing machine to accept, from the collection machine(s), the files generated by flowcap and sent by rwsender.
The SCRIPT_CONFIG_LOCATION/rwreceiver.conf file is used by the control script to generate the command line for rwreceiver. An example rwreceiver.conf file is available in the $SILK_PATH/share/silk/etc/ directory. These are the variables in the rwreceiver.conf file you need to change:
You will also need to change some of the following; they are the same as those described for rwflowpack.conf on page 70:
| ENABLED. | Whether this file has been configured. |
| CREATE_DIRECTORIES. | Whether to create directories. |
| LOG_TYPE. | The type of logging. |
| LOG_DIR. | The directory for log files. |
| PID_DIR. | The directory for the PID file. |
| USER. | The user to run as. |
Save the rwreceiver.conf file into the SCRIPT_CONFIG_LOCATION directory on the packing machine.
Check the settings in rwreceiver.conf by using the control script to start and stop rwreceiver:
The log messages that rwreceiver generates will resemble:
Add the rwreceiver control script to the packing machine’s boot sequence if you want rwreceiver to start when the machine boots. This process is similar to the one you followed for rwflowpack (see Section 5.1.2.3).
Each storage machine runs two daemons:
Perform these steps on every storage machine where remote SiLK data storage occurs.
Choose locations and create the following directories if they do not exist:
Build and install the SiLK software as described in Section 2.
Copy the silk.conf file from the packing machine to this machine and save it in the SILK_DATA_ROOTDIR directory.
If desired, follow the instructions from Section 3 to create your site’s address map and country code files. You only need to do this on the first storage machine you configure. For additional storage machines, simply copy the files from the first storage machine.
If you are using GnuTLS, copy the rootcert.pem and pkcs12.der files that you created on the packing machine in Section 5.1.5 into the CONFIG_FILE_DIR directory on this machine.
rwflowappend runs on the storage machine to append the incremental files generated by rwflowpack to the hourly data files for use by the analysis tools.
The SCRIPT_CONFIG_LOCATION/rwflowappend.conf file is used by the control script to generate the command line for rwflowappend. An example rwflowappend.conf file is available in the $SILK_PATH/share/silk/etc/ directory. These are the variables in the rwflowappend.conf file you will need to change:
You will also need to change some of the following; they are the same as those described for rwflowpack.conf on page 70:
| ENABLED. | Whether this file has been configured. |
| CREATE_DIRECTORIES. | Whether to create directories. |
| LOG_TYPE. | The type of logging. |
| LOG_DIR. | The directory for log files. |
| PID_DIR. | The directory for the PID file. |
| USER. | The user to run as. |
Save the rwflowappend.conf file into the SCRIPT_CONFIG_LOCATION directory that you created above.
Check the settings in rwflowappend.conf by using the control script to start and stop rwflowappend:
The log messages that rwflowappend generates will resemble:
Add the rwflowappend control script to the storage machine’s boot sequence if you want rwflowappend to start when the machine boots. This process is similar to the one you followed for rwflowpack (see Section 5.1.2.3).
rwreceiver runs on the storage machine to accept the incremental files from the rwsender daemon running on the packing machine.
The SCRIPT_CONFIG_LOCATION/rwreceiver.conf file is used by the control script to generate the command line for rwreceiver. An example rwreceiver.conf file is available in the $SILK_PATH/share/silk/etc/ directory. These are the variables in the rwreceiver.conf file you need to change:
You will also need to change some of the following; they are the same as those described for rwflowpack.conf on page 70:
| ENABLED. | Whether this file has been configured. |
| CREATE_DIRECTORIES. | Whether to create directories. |
| LOG_TYPE. | The type of logging. |
| LOG_DIR. | The directory for log files. |
| PID_DIR. | The directory for the PID file. |
| USER. | The user to run as. |
Save the rwreceiver.conf file into the SCRIPT_CONFIG_LOCATION directory that you created above.
Check the settings in rwreceiver.conf by using the control script to start and stop rwreceiver:
The log messages that rwreceiver generates will resemble:
Add the rwreceiver control script to the storage machine’s boot sequence if you want rwreceiver to start when the machine boots. This process is similar to the one you followed for rwflowpack (see Section 5.1.2.3).
Now that you have created the rwreceiver.conf file on the storage machines, you can configure rwsender on the packing machine. To recap, rwsender runs on the packing machine to transfer the incremental files generated by rwflowpack to the rwreceiver and rwflowappend processes on the storage machines.
The SCRIPT_CONFIG_LOCATION/rwsender.conf file is used by the control script to generate the command line for rwsender. An example rwsender.conf file is available in the $SILK_PATH/share/silk/etc/ directory. These are the variables in the rwsender.conf file you need to change:
You will also need to change some of the following; they are the same as those described for rwflowpack.conf on page 70:
| ENABLED. | Whether this file has been configured. |
| CREATE_DIRECTORIES. | Whether to create directories. |
| LOG_TYPE. | The type of logging. |
| LOG_DIR. | The directory for log files. |
| PID_DIR. | The directory for the PID file. |
| USER. | The user to run as. |
Save the rwsender.conf file into the SCRIPT_CONFIG_LOCATION directory on the packing machine.
Check the settings in rwsender.conf by using the control script to start and stop rwsender:
The log messages that rwsender generates will resemble:
Add the rwsender control script to the packing machine’s boot sequence if you want rwsender to start when the machine boots. This process is similar to the one you followed for rwflowpack (see Section 5.1.2.3).
Once you have compiled and installed the software and configured the files used by the daemons, you are almost ready to begin collecting data.
First, you should ensure that the connections between the rwsender and rwreceiver processes are correctly configured. To test for a duplicate identifier entry, you should run all the rwsender and rwreceiver daemons that communicate with one another at the same time.
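A duplicate identifier can also be caught before any daemon is started, by checking the list of identifiers you plan to use. The following is a minimal sketch in Python, not part of SiLK; the identifier names are hypothetical.

```python
from collections import Counter


def duplicate_identifiers(identifiers):
    """Return the identifiers that appear more than once, sorted."""
    counts = Counter(identifiers)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical identifiers of the rwsender processes that contact one rwreceiver.
planned_senders = ["send-coll1", "send-coll2", "send-coll1"]
print(duplicate_identifiers(planned_senders))
```

An empty result means the planned identifiers are distinct. Running the daemons together, as described above, remains the real test.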
Once you know that the file transfer is correctly configured, you can start the daemons to collect, convert, and store the data.
As an initial test, you may want to enable a single path through the various daemons, and only start the other daemons once you are confident that your settings are correct. The single path makes clean-up easier if something needs to be changed. To use a single path, stop all but one of the rwreceiver processes on the storage machines, and start a single flowcap process.
To start the connection between the packer and the collection machines, on each collection machine start the rwsender daemon just as you did when testing its configuration in Section 5.2.3.2:
Now start the rwreceiver on the packing machine (cf. Section 5.3.2):
If all goes well, on each collection machine you will see log messages in the form:
The log messages on the packing machine will be similar to:
Make certain that you see connections for each of the remote collection machines.
If the connections fail, make certain that the ports and machine names or IP addresses are all correct. To produce more verbose logging messages to help you debug the problem, you can set the LOG_LEVEL variable in rwsender.conf and/or rwreceiver.conf to debug and restart the daemons.
If you want to test the transfer of files between the machines, you can place any file into the FLOWCAP_DEST directory on a collection machine and it should be transferred to the PACKER_INCOMING directory on the packing machine. (Remember to remove this file before you start rwflowpack.)
To start the connection between the packer and the storage machines, on each storage machine start the rwreceiver daemon just as you did when testing its configuration in Section 5.4.3.2:
Now start the rwsender on the packing machine (cf. Section 5.5.2):
If everything is successful, the log messages on the packing machine will be similar to:
Make certain that you see connections for each of the remote storage machines.
On each storage machine you will see log messages in the form:
If the connections fail, make certain that the ports and machine names or IP addresses are all correct. To produce more verbose logging messages to help you debug the problem, you can set the LOG_LEVEL variable in rwsender.conf and/or rwreceiver.conf to debug and restart the daemons.
If you want to test the transfer of files between the machines, you can place any file into the PACKER_DEST directory on the packing machine and it should be transferred to the APPEND_INCOMING directory on all the storage machines. (Remember to remove this file before you start rwflowappend.)
Just as you did during testing (Section 5.4.2), start the rwflowappend daemon on each storage machine:
Start the rwflowpack process on the packing machine:
Finally, start the flowcap daemon on each collection machine:
Follow the instructions in Section 8 to start the flow generator.
If flowcap will be listening for NetFlow traffic on UDP port(s), follow the instructions in Section 8.3 to increase the maximum socket buffer size allowed by your kernel.
This section describes how to configure your site to use the packing configuration that supports remote data collection. This configuration is depicted in Figure 1.5.
For this configuration, there is one machine called the packing machine and one or more additional machines referred to as collection machines. Each collection machine runs the flowcap daemon to collect the flows and store them in “flowcap files”. The rwsender daemon also runs on each collection machine, and it transfers the files from the collection machine to an rwreceiver daemon running on the packing machine. The packing machine runs the rwflowpack daemon to read these flowcap files and categorize and pack the flow records they contain. The analysis tools are installed on the packing machine to read and analyze the flow records.
If this does not describe your packing configuration, refer to the list of possible configurations in Section 1.3.
The configuration in this section is similar to that in Section 5. This section will describe the configuration of rwflowpack on the packing machine, and then refer you back to Section 5 to complete the installation.
The packing machine runs two daemons:
Perform these steps on the packing machine to install the software and to configure the rwflowpack daemon.
Choose locations and create the following directories if they do not exist:
Build and install the SiLK software as described in Sections 2 and 3. Be certain to customize silk.conf and install it in the SILK_DATA_ROOTDIR directory.
Follow the instructions in Section 4.1 to create the Sensor Configuration file, and copy the file into the CONFIG_FILE_DIR directory.
If you wish to use GnuTLS to secure the connection between the collection machine and the packing machine, create the Certificate Authority file and PKCS#12 file as described in Section 5.1.5. Copy these files into the CONFIG_FILE_DIR directory.
To provide easier control of the SiLK daemons in UNIX-like environments, example sh-scripts are provided. The names of these scripts are the same as the daemon they control. The scripts are installed in the $SILK_PATH/share/silk/etc/init.d/ directory, but you should copy them to the standard location for start-up scripts on your system (e.g., /etc/init.d/ on Linux and other SysV-type systems).
To generate the command line for the daemon named daemon, the control script checks settings in the text file SCRIPT_CONFIG_LOCATION/daemon.conf. Before using a control script, you must create a daemon.conf file and customize it for your environment.
For each daemon, an example configuration file is installed in the $SILK_PATH/share/silk/etc/ directory. You will need to copy the file to the SCRIPT_CONFIG_LOCATION directory and modify it as described in this section. (The format of these configuration files may change between releases of SiLK. When upgrading from a previous release, you should merge your previous settings into the new version of the configuration file.)
You should not need to edit any of the control scripts; however, be aware the value of SCRIPT_CONFIG_LOCATION they use was set when you ran configure.
Many of the variable names in rwflowpack.conf correspond to a command line switch on rwflowpack. By referencing the rwflowpack manual page and the documentation for each variable in that file, you should be able to determine how to set each variable. This section highlights some of the settings. The switch that the variable controls follows each name.
Save the rwflowpack.conf file into the SCRIPT_CONFIG_LOCATION directory that you created above.
Follow the instructions in Section 5.1.2.2 to test whether the settings in rwflowpack.conf are correct.
If you wish, you can make rwflowpack start automatically when the packing machine boots by adding the rwflowpack control script to its boot sequence. This process is described in Section 5.1.2.3.
rwreceiver runs on the packing machine to accept, from the collection machine(s), the files generated by flowcap and sent by rwsender.
Each rwsender and rwreceiver is configured with an identifier of its own and the identifier(s) of the rwreceiver(s) or rwsender(s) that may connect to it. The connection will not be established if the identifier provided by the other process is not recognized. In addition, every rwsender that communicates with the same rwreceiver must have a unique identifier; likewise, every rwreceiver that communicates with the same rwsender must have a unique identifier.
Create the identifier that the rwreceiver client on the packing machine sends when it contacts the rwsender daemon running on each collection machine. The identifier should contain only printable, non-whitespace characters; the following characters are illegal: colon (:), slash (/ and \), period (.), and comma (,).
The identifier should reflect that this is the rwreceiver process associated with the packer. These instructions use rcv-packer1.
You will use this identifier when you set up the rwsender daemon on each collection machine in Section 5.2.3, and when you configure rwreceiver on the packing machine (Section 5.3).
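The character rules for identifiers given above can be checked mechanically. The sketch below is an illustration in Python and is not SiLK's own validation code; in particular, how SiLK treats non-ASCII characters is not stated here, so the sketch simply relies on Python's notion of "printable".

```python
# Characters the handbook lists as illegal: colon, both slashes, period, comma.
ILLEGAL_CHARS = set(":/\\.,")


def is_valid_identifier(ident):
    """True if ident is non-empty, printable, has no whitespace or illegal characters."""
    if not ident:
        return False
    return all(
        ch.isprintable() and not ch.isspace() and ch not in ILLEGAL_CHARS
        for ch in ident
    )


print(is_valid_identifier("rcv-packer1"))   # the identifier used in these instructions
print(is_valid_identifier("rcv.packer1"))   # contains a period
```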
Setting up each remote collection machine follows the procedure described in Section 5.2.
After you configure the remote collection machines, follow the instructions in Section 5.3 to configure rwreceiver on the packing machine.
Follow the instructions in Section 5.6—ignoring Sections 5.6.2 and 5.6.3—to start the complete collection system.
This section describes how to configure your site to use the packing configuration that supports remote SiLK Flow storage. Figure 1.6 shows this configuration.
For this configuration, there is one machine called the packing machine and one or more additional machines called storage machines. The packing machine runs the rwflowpack daemon to collect the flow records, categorize them, and store them in small “incremental files”. The rwsender daemon also runs on the packing machine, and it transfers the incremental files from the packing machine to an rwreceiver daemon running on each storage machine. Each storage machine also runs the rwflowappend daemon to append the incremental files to their final location in hourly files. In addition, each storage machine has the SiLK analysis tools installed to read and analyze the data in the hourly files.
If this does not describe your packing configuration, refer to the list of possible configurations in Section 1.3.
The configuration in this section is similar to that in Section 5. This section will describe the configuration of rwflowpack on the packing machine, and then refer you back to Section 5 to complete the installation.
The packing machine runs two daemons:
Perform these steps on the packing machine to install the software and to configure the rwflowpack daemon.
Choose locations and create the following directories if they do not exist:
Build and install the SiLK software as described in Section 2. Since you will not be storing the SiLK flows on the packing machine, you may ignore the --enable-data-rootdir switch on this machine. For faster compilation and to save disk space, you can avoid building the analysis tools by passing the --disable-analysis-tools switch to configure.
Follow the instructions in Section 3.1 to customize the silk.conf file, and save it to $SILK_PATH/share/silk/silk.conf so rwflowpack will locate it. You can ignore the remainder of Section 3 on the packing machine.
Follow the instructions in Section 4.1 to create the Sensor Configuration file, and copy the file into the CONFIG_FILE_DIR directory.
If you wish to use GnuTLS to secure the connection between the packing machine and the storage machines, create the Certificate Authority file and PKCS#12 file as described in Section 5.1.5. Copy these files into the CONFIG_FILE_DIR directory.
To provide easier control of the SiLK daemons in UNIX-like environments, example sh-scripts are provided. The names of these scripts are the same as the daemon they control. The scripts are installed in the $SILK_PATH/share/silk/etc/init.d/ directory, but you should copy them to the standard location for start-up scripts on your system (e.g., /etc/init.d/ on Linux and other SysV-type systems).
To generate the command line for the daemon named daemon, the control script checks settings in the text file SCRIPT_CONFIG_LOCATION/daemon.conf. Before using a control script, you must create a daemon.conf file and customize it for your environment.
For each daemon, an example configuration file is installed in the $SILK_PATH/share/silk/etc/ directory. You will need to copy the file to the SCRIPT_CONFIG_LOCATION directory and modify it as described in this section. (The format of these configuration files may change between releases of SiLK. When upgrading from a previous release, you should merge your previous settings into the new version of the configuration file.)
You should not need to edit any of the control scripts; however, be aware the value of SCRIPT_CONFIG_LOCATION they use was set when you ran configure.
Many of the variable names in rwflowpack.conf correspond to a command line switch on rwflowpack. By referencing the rwflowpack manual page and the documentation for each variable in that file, you should be able to determine how to set each variable. This section highlights some of the settings. The switch that the variable controls follows each name.
Save the rwflowpack.conf file into the SCRIPT_CONFIG_LOCATION directory that you created above.
Follow the instructions in Section 5.1.2.2 to test whether the settings in rwflowpack.conf are correct.
If you wish, you can make rwflowpack start automatically when the packing machine boots by adding the rwflowpack control script to its boot sequence. This process is described in Section 5.1.2.3.
rwsender runs on the packing machine to transfer the incremental files generated by rwflowpack to the rwreceiver and rwflowappend processes on the storage machines.
Each rwsender and rwreceiver is configured with an identifier of its own and the identifier(s) of the rwreceiver(s) or rwsender(s) that may connect to it. The connection will not be established if the identifier provided by the other process is not recognized. In addition, every rwsender that communicates with the same rwreceiver must have a unique identifier; likewise, every rwreceiver that communicates with the same rwsender must have a unique identifier.
Create the identifier that the rwsender client on the packing machine sends when it contacts the rwreceiver daemon running on each storage machine. The identifier should contain only printable, non-whitespace characters; the following characters are illegal: colon (:), slash (/ and \), period (.), and comma (,).
The identifier should reflect that this is the rwsender process associated with the packer. These instructions suggest you use send-packer1.
You will use this identifier when you set up the rwreceiver daemon on each storage machine in Section 5.4.3, and when you configure rwsender on the packing machine (Section 5.5).
Setting up each remote storage machine follows the procedure described in Section 5.4.
After you configure the remote storage machines, follow the instructions in Section 5.5 to configure rwsender on the packing machine.
Follow the instructions in Section 5.6—ignoring Sections 5.6.1 and 5.6.5—to start the complete collection system.
Now that the daemons are installed and listening for data, it is time to provide them with data. This section describes
For SiLK to use the YAF Flow Collection software, you must install libfixbuf v0.7.3 (http://tools.netsa.cert.org/fixbuf/) before you install SiLK, and SiLK’s configure script must notice that libfixbuf is installed. See the libfixbuf documentation for instructions on installing it. If SiLK’s configure script does not find your libfixbuf installation, refer to Section 2.3.5 in this handbook for assistance.
Once both YAF and SiLK are installed, getting them to communicate is straightforward.
You need to create an IPFIX probe in your Sensor Configuration file so that rwflowpack or flowcap knows to listen for IPFIX flows. Section 4.1 and the sensor.conf(5) manual page describe the Sensor Configuration syntax. The examples in this section assume the collection daemon (rwflowpack or flowcap) is running on the machine whose IP is 10.1.18.2 and the daemon and YAF are communicating on port 18002.
In the sensor.conf file, the required probe block, where the probe is named Bravo, is:
This section gives instructions on invoking YAF. In all cases, note the use of the --silk option to YAF; this switch causes YAF to break the IPFIX specification but provides additional analysis capabilities in SiLK. See the yaf(1) manual page for details.
To perform live capture on an interface (e.g., eth0), use the following command. Note the use of sudo; the YAF software will drop its privileges and become user after binding to the interface.
When YAF is configured with the --with-dag option, it can accept packets from an Endace DAG card (e.g., dag0). This invocation is similar to the previous one:
To have YAF process a packet capture (pcap) dump file (such as that produced by tcpdump(1)), run:
If you have several packet capture files to process, you can pass a list of files to YAF and specify --caplist. Since YAF will treat the files as a single stream, you need to make certain the file names occur in ascending time order. Note that the files cannot be compressed.
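Because the order of the files matters, it can help to generate the list rather than type it by hand. The sketch below, in Python, writes one path per line, sorted by file name, and skips files whose suffix suggests compression. It assumes that the file names sort in ascending time order and that the list file holds one path per line; confirm the expected list format in the yaf(1) manual page. The function name and the suffix set are hypothetical.

```python
from pathlib import Path

# Suffixes treated as "compressed" in this sketch (an assumption).
COMPRESSED_SUFFIXES = {".gz", ".bz2", ".xz"}


def write_caplist(capture_dir, list_path):
    """Write the uncompressed files in capture_dir to list_path, sorted by name.

    Keep list_path outside capture_dir so the list never names itself.
    Returns the file names in the order they were written.
    """
    files = sorted(
        (p for p in Path(capture_dir).iterdir()
         if p.is_file() and p.suffix not in COMPRESSED_SUFFIXES),
        key=lambda p: p.name,
    )
    Path(list_path).write_text("".join(f"{p}\n" for p in files))
    return [p.name for p in files]
```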
To have YAF process a directory of packet capture files, where the files are named such that they are naturally listed in ascending time order, use:
See the YAF documentation for additional switches, such as those that control logging.
You will need to perform these steps for each router you wish to instrument. The examples in this section assume the collection daemon (rwflowpack or flowcap) is running on the machine whose IP is 10.1.18.1, and that the daemon and router are communicating on port 18001.
In the sensor.conf file, the required probe block, where the probe is named Alpha, is:
The timestamps on the NetFlow records will be based on the timestamps received from the router, and we suggest using NTP to minimize drift in the router’s clock. To synchronize the router’s time with that from the time server running at ip-address, use the Cisco IOS command
The router needs to know where to send the NetFlow PDUs: the host and port on which rwflowpack or flowcap is listening. (If you are configuring multiple routers, you’ll need to use a unique ip-address:port pair for each router.) To set this information on the router, give the command
To make certain the router exports NetFlow version 5 records, which the SiLK tools require, issue
SiLK assumes no flow records are longer than 60 minutes—this means a long TCP session (such as an interactive ssh session) will be broken across multiple flow records. To set the active timeout on your Cisco router (30 minutes is the default for Cisco), use the IOS command:
When the router is rebooted, it can reassign the SNMP interface numbers. This can create a problem, as the SNMP interface that was facing the Internet could now be facing your organization, resulting in the incoming and outgoing flows being reversed. To prevent this problem, tell the router to use persistent settings for the interface numbers. The easiest solution is to enable global persistence with the IOS command
and then save the configuration with the EXEC mode command
See the following link for more information on IfIndex persistence, including instructions on setting persistence on an interface-by-interface basis: http://www.cisco.com/en/US/docs/ios/12_1t/12_1t5/feature/guide/dt5ifidx.html.
Finally, to enable NetFlow, issue the IOS command
Network traffic tends to be “bursty”: when you make an HTTP request, several servers may respond feeding you pages, images, and ads. To avoid losing records, it is important for each program receiving flow data to have a large socket buffer. The SiLK software will attempt to set the socket buffer size to the largest size the kernel will allow, up to a maximum of 8MB. To ensure that the programs can use the full 8MB buffer, we recommend increasing the maximum socket buffer size on each machine that has incoming flows. When using NetFlow, the machine running rwflowpack or flowcap and receiving the NetFlow PDUs should have its socket buffer adjusted.
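To see how large a receive buffer the kernel currently grants, you can request 8 MB on a throwaway UDP socket and read back the result. The sketch below uses Python's standard socket module and only illustrates the idea; it is not how SiLK itself is implemented, and the function name is hypothetical. Note that some kernels (Linux among them) report a value larger than the one granted because of internal bookkeeping, so treat the number as a rough indication.

```python
import socket

REQUESTED_BYTES = 8 * 1024 * 1024  # the 8 MB ceiling mentioned above


def granted_receive_buffer(requested=REQUESTED_BYTES):
    """Request a UDP receive buffer and return the size the kernel reports."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        size = requested
        while size >= 4096:
            try:
                sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
                break
            except OSError:
                # Some kernels reject an oversized request outright;
                # halve the request and try again.
                size //= 2
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    finally:
        sock.close()


if __name__ == "__main__":
    print(granted_receive_buffer())
```

If the reported value is well below 8 MB, the kernel limit is the likely constraint.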
To increase the maximum allowable socket buffer size on a running Linux system:
On a running Solaris box, issue:
Those lines may be added to the system’s start-up sequence (e.g., /etc/rc2.d/S99ndd) to make the change persistent across reboots.
This section describes how a flow record read from an external source is processed to become a SiLK Flow record. For information on installing SiLK’s flow collection and storage tools, refer to Section 4 or 5.
The simplest SiLK configuration is the single-machine configuration, described in Section 1.3.1. In this configuration, the rwflowpack daemon collects NetFlow v5 flow records, NetFlow v9 flow records, or flow records that follow the Internet Protocol Flow Information eXport (IPFIX) standard. rwflowpack converts the information from these external flow formats to the SiLK format, and some information from the source record is dropped to keep each individual SiLK record small. Next, rwflowpack categorizes the SiLK flow records to determine where on disk they will be stored. Finally, rwflowpack writes the SiLK records into binary flat files where each file represents a specific category, sensor, and hour. This entire process is referred to as packing. The term packing logic refers to the decision process that rwflowpack uses to categorize a flow.
(In the more complex configurations, it may be the flowcap daemon that collects the flow records in the external formats and converts them to a SiLK format. The rwflowappend daemon may be responsible for writing the records into their final location in the data repository of hourly files. These differences are largely immaterial for this section, which describes the categorization process.)
A router will create a NetFlow record for IP packets that traverse the router within a certain time window and have identical IP protocols, identical source and destination IP addresses, and identical source and destination ports. The NetFlow record contains
NetFlow data is strictly unidirectional: a TCP conversation passing through a router causes the router to generate two sets of flow records—one for each side of the conversation.
Over time, NetFlow became a de facto standard for flow data. The Internet Protocol Flow Information eXport (IPFIX) working group (http://www.ietf.org/dyn/wg/charter/ipfix-charter.html) grew out of a desire to standardize the NetFlow format. SiLK handles IPFIX records in much the same way that it handles NetFlow records. (Support for IPFIX requires that SiLK is built with libfixbuf support.)
rwflowpack categorizes each flow to determine where to store it, and the category also determines the format of the file that contains the flow record. This categorization is handled by a plug-in that rwflowpack loads at run-time. The best place to specify the name of this plug-in is in the packing-logic statement in the silk.conf site configuration file. You may also specify the location of the plug-in with the --packing-logic switch to rwflowpack.
This section describes the categorization that the twoway site provides. The packing logic for other sites will be different.
Since NetFlow data is unidirectional, the first part of categorization determines whether the flow entered or left the monitored network. There are three ways to do this:
The IP-based approach works well for a small, well-contained IP space, but it can be unwieldy if the IP space of the monitored network is large and discontinuous or if there is no well-defined IP space solely contained in the monitored network. However, the IP-based approach is appropriate when you have data that does not have SNMP information, such as IPFIX flows or when you are generating flow records from a packet capture (pcap) file (e.g., from the output of tcpdump). Although the initial configuration of the SNMP approach is more difficult, this method has the advantage of being computationally faster than the IP-based method, and it ensures that the flows reflect the way the router is actually moving the traffic into and out of your network, not the way you think it should be routing the traffic.
Directions for using flow data and lists of IPs to determine the external SNMP interfaces are presented in Appendix B.
The source-network and destination-network commands in the Sensor Configuration file (Section 4.1) are used when you wish to explicitly set the direction of the flows. The value to these commands is either external or internal.
To have rwflowpack use the IP-based approach, specify the monitored network’s IP space in the internal-ipblock command of each sensor, and set the external-ipblock to the keyword remainder.
To use the SNMP approach, set the external-interfaces to the list of SNMP interfaces that face outside the monitored network, and either do not specify an internal-interfaces command or set it to the list of SNMP interfaces that connect into the monitored network. (If the internal-interfaces is not provided, it is treated as if it had the value remainder.) Alternatively, you can specify the internal-interfaces as a list and explicitly set the external-interfaces to the keyword remainder.
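The effect of the remainder keyword on the two interface lists can be illustrated with a small lookup. This is a sketch in Python of the behavior described above, not rwflowpack's code; the function name is hypothetical.

```python
REMAINDER = "remainder"


def interface_side(index, external, internal=REMAINDER):
    """Return "external", "internal", or None for an SNMP interface index.

    Each of external and internal is either a set of interface numbers or
    the keyword REMAINDER. An omitted internal list is treated as REMAINDER.
    """
    if external == REMAINDER and internal == REMAINDER:
        raise ValueError("only one side may be 'remainder'")
    if external != REMAINDER and index in external:
        return "external"
    if internal != REMAINDER and index in internal:
        return "internal"
    if internal == REMAINDER:
        return "internal"
    if external == REMAINDER:
        return "external"
    return None  # both lists are explicit and the index is in neither


print(interface_side(3, external={1, 2}))                       # internal by remainder
print(interface_side(9, external=REMAINDER, internal={4, 5}))   # external by remainder
```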
An additional part of categorizing a flow is to determine what the router did with the packets that the flow represents. A flow record does not have to represent packets that entered or left the monitored network (which we call routed packets); instead, a record may represent packets that did not leave the router; these packets are considered not-routed or null. This behavior occurs when
To determine if a flow was not-routed, the output SNMP index of the flow is compared to the null-interface value. If there is a match, the flow is categorized as not-routed.
Note: Since SiLK-1.0.0, the null-interface is no longer set by default. You must explicitly set it to categorize flows as not-routed. Cisco routers use 0 as the output SNMP index for a not-routed flow.
Since web traffic (or traffic that masquerades as web traffic) makes up such a large percentage of flows, additional packing is performed on these flows. The fixed protocol (TCP) and the limited number of web-server-side ports (80 (http), 443 (https), or 8080 (http-alt)) allow routed-web traffic to be packed in a smaller record. Although the savings are only a couple of bytes per record, they can add up to substantial savings over the course of a day. There is certainly no guarantee that routed-web traffic is entirely HTTP-based or that there is no HTTP traffic in the remaining routed categories; the web/non-web split is a simple heuristic that gets it right most of the time.
Similar to the separation for web traffic, an additional split that can occur is to store the routed ICMP traffic separately from other traffic in a routed-icmp category. Currently, there are no file formats that take advantage of the possible space savings.
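The tests described above can be pulled together in a simplified sketch of the IP-based approach. It is written in Python for illustration only and is not the twoway plug-in's code: the order of the tests is an assumption, the ICMP split is omitted, and the function and parameter names are hypothetical.

```python
import ipaddress

TCP = 6
WEB_PORTS = {80, 443, 8080}


def classify(src_ip, dst_ip, proto, sport, dport, out_iface,
             internal_blocks, null_interface=None):
    """Return a category name for one flow, using IP blocks for direction."""
    nets = [ipaddress.ip_network(block) for block in internal_blocks]

    def internal(ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in nets)

    src_in, dst_in = internal(src_ip), internal(dst_ip)

    # Not-routed: the output SNMP index matches the null-interface (if set).
    if null_interface is not None and out_iface == null_interface:
        return "outnull" if src_in else "innull"
    if src_in and dst_in:
        return "int2int"
    if not src_in and not dst_in:
        return "ext2ext"

    direction = "out" if src_in else "in"
    # Web heuristic: TCP with a web-server-side port on either end.
    if proto == TCP and (sport in WEB_PORTS or dport in WEB_PORTS):
        return direction + "web"
    return direction


blocks = ["192.168.0.0/16"]
print(classify("10.0.0.1", "192.168.1.1", TCP, 40000, 80, 1, blocks))
```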
The categories (which may also be called flowtypes or types) for the twoway site are:
The packing logic in use prior to SiLK-0.11.0 is called generic, and it is available in the $SILK_PATH/lib/silk/packlogic-generic.so plug-in. It is similar to the twoway site, but it does not provide the ext2ext, int2int, or other categories, and the incoming versus outgoing test is slightly different. In the generic packing logic, rwflowpack tests to see if a flow is incoming; that is, whether the source IP is outside the monitored network or the incoming SNMP interface faces outside the network. Any flow that does not match the rules for an incoming flow is considered an outgoing flow.
Once each SiLK Flow record is categorized, it is stored on disk in a directory tree rooted at a directory called the SILK_DATA_ROOTDIR. When you run configure, you specify a default value of the SILK_DATA_ROOTDIR (see Section 2.2) which is compiled into the rwfilter program. The default can be modified at run-time with the --data-rootdir switch or by setting the SILK_DATA_ROOTDIR environment variable.
The layout of the tree under SILK_DATA_ROOTDIR can be customized by editing the path-format value in the silk.conf file. In the default layout, the directories directly under SILK_DATA_ROOTDIR correspond to the SiLK Flow record categories. An example subdirectory would be $SILK_DATA_ROOTDIR/inweb, which would contain the SiLK Flow records for incoming routed-web traffic. Within each of these directories are date directories, in the form YYYY/MM/DD. For example, output web files for October 4th, 2003 are recorded in:
Each date directory contains the binary SiLK Flow files, one per hour per sensor per category (flowtype). The file names include the date and type information, and are written in the form: flowType-sensorName_YYYYMMDD.HH. Note that the date and hour are based on UTC time, not local time.
The flowType corresponds to how the flow records were categorized (e.g., iw denotes a file containing incoming routed-web records). The sensorName identifies the sensor where the flow was collected.
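The default layout and naming scheme can be expressed as a short function. The sketch below, in Python, assumes the default path-format and takes the category directory and the file-name flowType as separate arguments; the root directory /data and the function name are hypothetical. It converts the timestamp to UTC before formatting, as the naming scheme requires.

```python
import posixpath
from datetime import datetime, timedelta, timezone


def hourly_file_path(root, category, flowtype, sensor, when):
    """Build the path of the hourly file that would hold a flow seen at `when`.

    `when` must be a timezone-aware datetime; it is converted to UTC.
    """
    utc = when.astimezone(timezone.utc)
    name = f"{flowtype}-{sensor}_{utc:%Y%m%d}.{utc:%H}"
    return posixpath.join(root, category, f"{utc:%Y}", f"{utc:%m}", f"{utc:%d}", name)


# 2:14 pm EDT (UTC-4) on October 4, 2003 falls in UTC hour 18.
edt = timezone(timedelta(hours=-4))
print(hourly_file_path("/data", "innull", "innull", "Alpha",
                       datetime(2003, 10, 4, 14, 14, tzinfo=edt)))
```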
As explained in Appendix A, rwflowpack can determine whether flows represent inbound or outbound traffic by examining either the IP addresses in the flows or their SNMP interface values. This appendix explains how to discover the interface values by collecting flow data and examining it with SiLK. You may ignore this appendix if you are using the IP address approach, or if you have another method to determine the SNMP interface numbers your router is using.
To pack flows by the SNMP interface values, rwflowpack needs to know which of the router’s interfaces are the external interfaces (i.e., facing outside the monitored network). (For a border router, these are the interfaces that connect to the ISP.) One way to determine these interfaces is to use the SiLK tools to collect data and compare the source and destination IP addresses with the IP addresses of the monitored network. The approach outlined below will not work if the monitored network does not have a well-defined set of exclusive IP addresses.
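The idea behind this comparison can be shown on a handful of records. The sketch below is plain Python and does not use the SiLK tools; the record layout (source IP, destination IP, input interface, output interface) and the function name are hypothetical. It counts, per input interface, the flows that arrive from outside the monitored address space and are destined for an address inside it; interfaces with high counts are candidates for external interfaces.

```python
import ipaddress
from collections import Counter


def external_interface_candidates(flows, internal_blocks):
    """Count inbound-looking flows per input SNMP interface.

    flows: iterable of (src_ip, dst_ip, in_iface, out_iface) tuples.
    internal_blocks: CIDR strings for the monitored network.
    """
    nets = [ipaddress.ip_network(block) for block in internal_blocks]

    def internal(ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in nets)

    counts = Counter()
    for src, dst, in_iface, _out_iface in flows:
        if not internal(src) and internal(dst):
            counts[in_iface] += 1
    return counts


sample = [
    ("10.0.0.1", "192.168.1.5", 1, 3),
    ("10.0.0.2", "192.168.7.9", 1, 4),
    ("192.168.1.5", "10.0.0.1", 3, 1),
]
print(external_interface_candidates(sample, ["192.168.0.0/16"]))
```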
Begin by creating an IPset of the monitored network’s address space. To do this, list the network’s CIDR blocks in a text file, one CIDR address per line, and save this file as myips.txt. If your address space is 192.168.0.0/16, you could run
To convert the text listing to a binary IPset file, issue the command
The file myips.set is a binary representation of your address space. You can use the rwsetcat command to list the contents of the file, though beware that the default output is one address (/32) per line, so there can be a lot of output. You can use the --cidr switch to print the output in CIDR notation. Supplying the --print-statistics or --network-structure switch should also produce some useful output for sanity checking the IPset file. For example, if your network is 192.168.0.0/16, you will see:
In order to identify which interfaces are external, configure rwflowpack to categorize all data as incoming null, then determine what subset of records actually represent incoming traffic by looking at the source and destination IP addresses. Do this by configuring the Sensor Configuration file (see Section 4.1) so that the flows come from the external network and go to the null network.
For example, if your site has an Alpha sensor, you would create the following sensor.conf file to collect NetFlow v5 traffic on port 8092:
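A sketch of such a file, written from the shell with a here-document. It assumes the probe/sensor block layout of sensor.conf and the network names external and null used by the default packing logic. The choice of UDP and the absence of any restriction on the sending host are assumptions made for illustration.

```shell
# Write a temporary sensor.conf that treats every flow as coming from
# the external network and going to the null network.
cat > sensor.conf <<'EOF'
probe Alpha netflow-v5
    listen-on-port 8092
    protocol udp
end probe

sensor Alpha
    netflow-v5-probes Alpha
    source-network external
    destination-network null
end sensor
EOF
```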
You need to tell rwflowpack to include the SNMP interface numbers in the files it creates. Add the option --pack-interfaces to the invocation of rwflowpack by modifying the rwflowpack.conf configuration file. You may also want to decrease the --flush-timeout, which affects the amount of data rwflowpack stores in RAM before writing the records to disk. The default is two minutes, but since you are waiting for the data, a value of 30 seconds is reasonable.
Near the bottom of the rwflowpack.conf file is the line:
Modify it to read:
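A plausible form of the modified line, assuming the line in question is the EXTRA_OPTIONS assignment and that rwflowpack.conf is read as a list of shell variable assignments. The value 30 is the flush timeout in seconds suggested above.

```shell
# In rwflowpack.conf: record SNMP interfaces and flush to disk every 30 seconds.
EXTRA_OPTIONS="--pack-interfaces --flush-timeout=30"
```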
If you are running multiple instances of rwflowpack, repeat the above steps for every router (sensor) on your network, since each router will have its own SNMP interface values.
Start the rwflowpack control script and allow rwflowpack to collect data.
You should see data appearing in the files $SILK_DATA_ROOTDIR/innull/*/*/*/*. For example, traffic captured at 2:14 pm EDT on October 4, 2003, from sensor Alpha will be in $SILK_DATA_ROOTDIR/innull/2003/10/04/innull-Alpha_20031004.18. If data does not appear, do something to generate traffic, such as browsing the web. If you still do not see data, make certain you have correctly configured your router(s) to generate NetFlow v5 records and that the host and port to which the router is sending NetFlow match the host and port where rwflowpack is listening.
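To work out which hourly file to look in, the naming scheme from the example above can be sketched as a small shell function. This assumes GNU date (for the -d switch) and that files are named by UTC hour, as the 2:14 pm EDT example implies. The function name innull_path and the fallback data root /data are hypothetical.

```shell
# Assumption: /data stands in for your data root when the variable is unset.
: "${SILK_DATA_ROOTDIR:=/data}"

# innull_path SENSOR TIMESTAMP
# Print the hourly innull file that would hold flows seen at TIMESTAMP.
innull_path() {
    date -u -d "$2" "+${SILK_DATA_ROOTDIR}/innull/%Y/%m/%d/innull-${1}_%Y%m%d.%H"
}

# 2:14 pm EDT on October 4, 2003 is 18:14 UTC.
innull_path Alpha '2003-10-04 14:14:00 -0400'
```

For the timestamp shown, the file name printed ends in innull-Alpha_20031004.18, matching the example in the text.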
If you see the data files but they are empty, be patient. rwflowpack uses buffered input/output, which may hold records in memory. The data are flushed once the flush-timeout is reached, and at shutdown.
All the collected flows are in the innull data files. To find incoming traffic, you want to select all records for which the source IP is outside the monitored network’s address space and the destination IP is inside the address space. To select records, use the rwfilter command:
The --not-sipset and --dipset switches do the IP address filtering. Use the --type switch to select all the data files (by default, rwfilter looks only at the files for incoming routed data). The --pass-output switch directs the records that pass these IP filters to the standard output, which you pipe into another tool. For the records that pass the filter, you want to know which SNMP interfaces the records passed through in the border router(s). To get this information, run the rwuniq command, and select the fields containing the sensor and the input and output SNMP indexes as the key.
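Put together, the pipeline described above might look like the following sketch. It assumes myips.set is in the current directory and that rwfilter can find your data repository. With no date switches, rwfilter examines only the current day's files. The guard is an addition for illustration.

```shell
# Keep flows whose source is outside and whose destination is inside the
# monitored address space, then count them per sensor and interface pair.
if command -v rwfilter >/dev/null 2>&1 && [ -f myips.set ]; then
    rwfilter --type=all \
             --not-sipset=myips.set --dipset=myips.set \
             --pass-output=stdout \
        | rwuniq --fields=sensor,in,out
fi
```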
Running the above command will produce something similar to:
where Alpha and Bravo are the names you assigned to the sensors (routers). From this output, you can see that SNMP interface 1 on the router named Alpha is the incoming interface, and interface 8 on Bravo is incoming. Note that a router connected to multiple ISPs will have multiple input interfaces.
Use the control script to stop the current rwflowpack.
You probably want to remove the data files you just created. This is the brute force method which will remove all the data files:
In the Sensor Configuration file, set the external-interface and internal-interface attributes to the appropriate value(s). You can remove the --pack-interfaces and --flush-timeout switches from the EXTRA_OPTIONS line in the rwflowpack.conf file.
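For a sensor such as Alpha whose only external interface is SNMP interface 1, the updated sensor block might look like the sketch below. The probe details are illustrative assumptions. So is the use of the remainder keyword, which treats every other interface as internal.

```shell
# Replace the temporary sensor.conf with one that categorizes flows
# by SNMP interface.
cat > sensor.conf <<'EOF'
probe Alpha netflow-v5
    listen-on-port 8092
    protocol udp
end probe

sensor Alpha
    netflow-v5-probes Alpha
    external-interface 1
    internal-interface remainder
end sensor
EOF
```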
When available, rwsender and rwreceiver can use GnuTLS (the GNU Transport Layer Security Library) to encrypt and authenticate the communication between them. To use this feature, rwsender and rwreceiver each need access to the PEM (Privacy Enhanced Mail) encoded root Certificate Authority (CA) file and to a program-specific certificate and key, which can be either a DER (Distinguished Encoding Rules) encoded PKCS#12 file or a PEM encoded key file and a PEM encoded certificate file.
The communication between rwsender and rwreceiver will be established as long as the credentials on both sides, whether a PKCS#12 file or a key file and certificate file, were issued by the same CA. You can create a single program-specific key and certificate and use that for all instances of rwsender and rwreceiver, or create a separate certificate/key pair for each instance of these programs.
We recommend creating a local certificate authority (CA) file, and creating program-specific certificates signed by that local CA. The local CA and program-specific certificates are copied onto the machines where rwsender and rwreceiver are running. The local CA acts as a shared secret: it is on both machines and it is used to verify the asymmetric keys between the rwsender and rwreceiver certificates.
If someone gains access to the local CA, that person would not be able to decipher the conversation between rwsender and rwreceiver, since the conversation is encrypted with a session key that was negotiated during the initialization of the TLS session.
However, anyone with access to the CA would be able to set up a new session with an rwsender (to download files) or an rwreceiver (to inject spoofed files). The GnuTLS certificates should be one part of your security; additional measures (such as firewall rules) should be enabled to mitigate these issues.
GnuTLS provides a tool called certtool to create the files, as described below. rwsender and rwreceiver also support using PKCS#12 files created with openssl.
To create a self-signed CA certificate, rootcert.pem, and its private key, rootkey.pem, fill in the following template with the appropriate information and save it to roottemp.cfg. You may also forgo the template, in which case certtool will prompt you for the information interactively.
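A sketch of a small CA template, written from the shell. The option names follow the GnuTLS certtool template format. The organization, common name, and lifetime are placeholder values to replace with your own.

```shell
# Write a minimal certtool template for the local CA.
cat > roottemp.cfg <<'EOF'
# Placeholder identity for the local certificate authority
organization = "Example Organization"
cn = "Example SiLK Local CA"

# Certificate lifetime in days (placeholder)
expiration_days = 365

# This certificate is a CA and will sign other certificates
ca
cert_signing_key
EOF
```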
Once you have filled in the above template, the following commands use it to create the CA key and certificate. (Remove the --template switch and its parameter if you are not using the template.)
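A sketch of those commands, assuming the GnuTLS certtool is on your PATH and roottemp.cfg is in the current directory. The guard is an addition for illustration; it also prevents certtool from falling back to interactive prompts when the template is missing.

```shell
if command -v certtool >/dev/null 2>&1 && [ -f roottemp.cfg ]; then
    # Generate the CA's private key.
    certtool --generate-privkey --outfile rootkey.pem
    # Create the self-signed CA certificate from that key and the template.
    certtool --generate-self-signed --load-privkey rootkey.pem \
             --template roottemp.cfg --outfile rootcert.pem
fi
```

Keep rootkey.pem readable only by the account that signs certificates.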
To create a program-specific certificate, cert.pem, and key, key.pem, you may fill in the following template and save it as progtemp.cfg, or have certtool prompt you for the information interactively.
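A sketch of a program-specific template. The option names follow the GnuTLS certtool template format and the values are placeholders. Marking the certificate for both TLS client and TLS server use is an assumption, made because rwsender and rwreceiver can each either initiate or accept the connection.

```shell
# Write a minimal certtool template for an rwsender/rwreceiver certificate.
cat > progtemp.cfg <<'EOF'
# Placeholder identity for the program certificate
organization = "Example Organization"
cn = "Example SiLK Transfer"

# Certificate lifetime in days (placeholder)
expiration_days = 365

# Allow use as either end of a TLS connection
tls_www_client
tls_www_server

# Permit signing and encryption with this key
signing_key
encryption_key
EOF
```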
Use the following commands to create a certificate from the template and the root CA you created above:
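A sketch of those commands, assuming the GnuTLS certtool is on your PATH and that rootcert.pem, rootkey.pem, and progtemp.cfg are in the current directory. The guard is an addition for illustration.

```shell
if command -v certtool >/dev/null 2>&1 &&
   [ -f progtemp.cfg ] && [ -f rootcert.pem ] && [ -f rootkey.pem ]; then
    # Generate the program's private key.
    certtool --generate-privkey --outfile key.pem
    # Issue a certificate for that key, signed by the local CA.
    certtool --generate-certificate --load-privkey key.pem \
             --load-ca-certificate rootcert.pem --load-ca-privkey rootkey.pem \
             --template progtemp.cfg --outfile cert.pem
fi
```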
You may use the cert.pem and key.pem files you created above, or you may convert these to a single PKCS#12 file. The advantages of PKCS#12 are that it is a single file, it may be created with openssl, and it may be password protected.
The following certtool command converts the cert.pem and key.pem files from the previous section to a PKCS#12 file named pkcs12.der:
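A sketch of the conversion, assuming the GnuTLS certtool is on your PATH and cert.pem and key.pem are in the current directory. When run this way, certtool is expected to prompt for a key name and an optional password. The guard is an addition for illustration.

```shell
if command -v certtool >/dev/null 2>&1 && [ -f cert.pem ] && [ -f key.pem ]; then
    # Bundle the certificate and key into one DER-encoded PKCS#12 file.
    certtool --to-p12 --load-certificate cert.pem --load-privkey key.pem \
             --outder --outfile pkcs12.der
fi
```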
If you choose to password protect the file, you must specify the password in the RWSENDER_TLS_PASSWORD environment variable prior to starting rwsender, and similarly RWRECEIVER_TLS_PASSWORD for rwreceiver.