SiLK Installation Handbook
SiLK-1.0.1

CERT Network Situational Awareness
© 2003-2008 Carnegie Mellon University
 
The canonical location for this handbook is
http://tools.netsa.cert.org/silk/install-handbook.pdf

April 30, 2008

Use of the SiLK system and related source code is subject to the terms of the following licenses:

GNU Public License (GPL) Rights pursuant to Version 2, June 1991  
Government Purpose License Rights (GPLR) pursuant to DFARS 252.225-7013  
 
NO WARRANTY  
 
ANY INFORMATION, MATERIALS, SERVICES, INTELLECTUAL PROPERTY OR OTHER  
PROPERTY OR RIGHTS GRANTED OR PROVIDED BY CARNEGIE MELLON UNIVERSITY  
PURSUANT TO THIS LICENSE (HEREINAFTER THE "DELIVERABLES") ARE ON AN  
"AS-IS" BASIS. CARNEGIE MELLON UNIVERSITY MAKES NO WARRANTIES OF ANY  
KIND, EITHER EXPRESS OR IMPLIED AS TO ANY MATTER INCLUDING, BUT NOT  
LIMITED TO, WARRANTY OF FITNESS FOR A PARTICULAR PURPOSE,  
MERCHANTABILITY, INFORMATIONAL CONTENT, NONINFRINGEMENT, OR ERROR-FREE  
OPERATION. CARNEGIE MELLON UNIVERSITY SHALL NOT BE LIABLE FOR INDIRECT,  
SPECIAL OR CONSEQUENTIAL DAMAGES, SUCH AS LOSS OF PROFITS OR INABILITY  
TO USE SAID INTELLECTUAL PROPERTY, UNDER THIS LICENSE, REGARDLESS OF  
WHETHER SUCH PARTY WAS AWARE OF THE POSSIBILITY OF SUCH DAMAGES.  
LICENSEE AGREES THAT IT WILL NOT MAKE ANY WARRANTY ON BEHALF OF  
CARNEGIE MELLON UNIVERSITY, EXPRESS OR IMPLIED, TO ANY PERSON  
CONCERNING THE APPLICATION OF OR THE RESULTS TO BE OBTAINED WITH THE  
DELIVERABLES UNDER THIS LICENSE.  
 
Licensee hereby agrees to defend, indemnify, and hold harmless Carnegie  
Mellon University, its trustees, officers, employees, and agents from  
all claims or demands made against them (and any related losses,  
expenses, or attorney’s fees) arising out of, or relating to Licensee’s  
and/or its sub licensees’ negligent use or willful misuse of or  
negligent conduct or willful misconduct regarding the Software,  
facilities, or other rights or assistance granted by Carnegie Mellon  
University under this License, including, but not limited to, any  
claims of product liability, personal injury, death, damage to  
property, or violation of any laws or regulations.  
 
Carnegie Mellon University Software Engineering Institute authored  
documents are sponsored by the U.S. Department of Defense under  
Contract F19628-00-C-0003. Carnegie Mellon University retains  
copyrights in all material produced under this contract. The U.S.  
Government retains a non-exclusive, royalty-free license to publish or  
reproduce these documents, or allow others to do so, for U.S.  
Government purposes only pursuant to the copyright license under the  
contract clause at 252.227.7013.

Contents

1 Introduction
 1.1 Prerequisites
 1.2 Upgrading SiLK
 1.3 SiLK system configurations
  1.3.1 Single machine configuration
  1.3.2 Remote data collection and remote flow storage
  1.3.3 Remote data collection with local storage
  1.3.4 Local collection and remote SiLK flow storage
  1.3.5 Analysis only
 1.4 Handbook summary
 1.5 Additional resources
2 Building SiLK from Source Code
 2.1 Unpack the source code
 2.2 Choose installation directories
 2.3 Optional features
  2.3.1 Supporting PySiLK: SiLK in Python
  2.3.2 Supporting IPv6
  2.3.3 Using automatic file compression
  2.3.4 Specifying the location of compression libraries
  2.3.5 Collecting IPFIX flows
  2.3.6 Disabling run-time packing logic
  2.3.7 Controlling what applications are built and installed
  2.3.8 Statically-linked applications
  2.3.9 Supporting encrypted communication using GnuTLS
  2.3.10 Using your local timezone
  2.3.11 Supporting conversion of packet capture tcpdump data
  2.3.12 Supporting the IP Association library (libipa)
  2.3.13 Supporting development and debugging
 2.4 Configure SiLK
 2.5 Build and install
3 Analysis Tool Customization
 3.1 Create the site configuration file, silk.conf
 3.2 Finalize the PySiLK installation
 3.3 Specify local address space
 3.4 Country Code Library installation
4 Single Machine Configuration
 4.1 Create the sensor configuration file, sensor.conf
  4.1.1 Upgrading from SiLK-0.11.x
  4.1.2 Compatibility between rwflowpack and flowcap
 4.2 Install the software
 4.3 Customize the rwflowpack.conf configuration file
 4.4 Test the settings
 4.5 Enable automatic invocation
 4.6 Start the flow generator
5 Remote Collection and Flow Storage
 5.1 Packing machine, part 1
  5.1.1 Install the software
  5.1.2 Customize and install rwflowpack
  5.1.3 Create an identifier for rwreceiver
  5.1.4 Create an identifier for rwsender
  5.1.5 Create keys and certificates for GnuTLS security
 5.2 Remote collection machine
  5.2.1 Install the software
  5.2.2 Customize and install flowcap
  5.2.3 Customize and install rwsender
 5.3 Packing machine, part 2
  5.3.1 Customize the rwreceiver.conf configuration file
  5.3.2 Test the rwreceiver.conf settings
  5.3.3 Enable automatic invocation of rwreceiver
 5.4 Remote storage machine
  5.4.1 Install the software
  5.4.2 Customize and install rwflowappend
  5.4.3 Customize and install rwreceiver
 5.5 Packing machine, part 3
  5.5.1 Customize the rwsender.conf configuration file
  5.5.2 Test the rwsender.conf settings
  5.5.3 Enable automatic invocation of rwsender
 5.6 Start the complete system
  5.6.1 Start transfer between collection and packing machines
  5.6.2 Start transfer from packing to storage machines
  5.6.3 Start rwflowappend on each storage machine
  5.6.4 Start rwflowpack on the packing machine
  5.6.5 Start flowcap on each collection machine
  5.6.6 Start flow generator
6 Remote Data Collection
 6.1 Packing machine, part 1
  6.1.1 Install the software
  6.1.2 Customize the rwflowpack.conf configuration file
  6.1.3 Create an identifier for rwreceiver
 6.2 Remote collection machine
 6.3 Packing machine, part 2
 6.4 Start the complete system
7 Remote SiLK Flow Storage
 7.1 Packing machine, part 1
  7.1.1 Install the software
  7.1.2 Customize the rwflowpack.conf configuration file
  7.1.3 Create an identifier for rwsender
 7.2 Remote storage machine
 7.3 Packing machine, part 2
 7.4 Start the complete system
8 Flow Generator Configuration
 8.1 Using the YAF Flow Sensor
 8.2 Configuring a router
 8.3 Configure the machine(s) receiving flows
A Packing Logic Overview
 A.1 NetFlow primer
 A.2 IPFIX introduction
 A.3 Categorizing the flow
  A.3.1 Incoming vs. outgoing traffic
  A.3.2 Routed vs. non-routed traffic
  A.3.3 Routed-web traffic
  A.3.4 Routed-ICMP traffic
  A.3.5 Categorization summary
 A.4 Data Storage Hierarchy
B Determining External Interfaces
C Creating GnuTLS Certificates
 C.1 Creating the Certificate Authority
 C.2 Creating a program-specific certificate/key pair
 C.3 Creating a PKCS#12 file

1 Introduction

SiLK, the System for Internet-Level Knowledge, is a collection of traffic analysis tools developed by the CERT Network Situational Awareness Team (CERT NetSA) to facilitate security analysis of large networks. The SiLK tool suite supports the efficient collection, storage, and analysis of network flow data, enabling network security analysts to rapidly query large historical traffic data sets. SiLK is ideally suited for analyzing traffic on the backbone or border of a large, distributed enterprise or mid-sized ISP.

SiLK supports the collection of the following types of flow data:

NetFlow.
Flows generated by a router producing NetFlow v5, or software that can generate data with that format. The format of NetFlow v5 PDUs (Protocol Data Units) is described in “NetFlow Export Datagram Format,” http://www.cisco.com/en/US/docs/net_mgmt/netflow_collection_engine/3.0/user/guide/nfcform.html.
IPFIX.
Internet Protocol Flow Information eXport flow records that were generated by an IPFIX-compliant flow generator such as YAF. To use this functionality, you must install libfixbuf prior to building and installing SiLK. Both YAF and libfixbuf are available from http://tools.netsa.cert.org/.

This handbook provides instructions to configure and install the SiLK Collection and Analysis Suite. It is intended for individuals comfortable with the following tasks:

Additionally, if SiLK will be accepting NetFlow data from a router, the installer should be comfortable with router configuration.

1.1 Prerequisites

In order to build SiLK, you will need to have:

To get the full functionality of SiLK, these additional libraries and their header files are recommended:

Note that many Linux systems have one package for the run-time shared libraries and another for the header files, and both must be installed when building SiLK from source. For example, to build SiLK with zlib support on a Red Hat Enterprise Linux AS release 4 system, you will need to install both the zlib-1.2.1.2-1.2 and the zlib-devel-1.2.1.2-1.2 RPMs (your version numbers may be different).

1.2 Upgrading SiLK

New releases of SiLK are always capable of reading SiLK Flow data files created by previous releases of SiLK, and support for nearly all other SiLK file formats is maintained in newer releases. When upgrading to a new release of SiLK in an enterprise that uses separate collection, packing, and analysis machines, you should upgrade the analysis host(s) first, then the packing host(s), and finally the collectors. You may also choose to only upgrade the analysis hosts, and leave the packing and collection hosts at previous releases.

In addition, note that any change to the SiLK file formats will always require a change in the minor version number of SiLK (the SiLK version number follows the pattern major.minor.revision). Practically, this means that you can upgrade a collection machine to a newer release, say SiLK-0.13.9, and yet maintain the packing machines at an older release, SiLK-0.13.2. (These version numbers are for illustrative purposes only.) However, a bump in the minor version number does not always signal a change to the SiLK file formats. An analysis host at SiLK-0.13.2 may be able to read files created by SiLK-0.14.1 on the packing host; it depends on whether the SiLK file formats changed at SiLK-0.14.0. Changes to the SiLK file formats are always documented in the release notes, which are included in the source distribution and are available on the web site (http://tools.netsa.cert.org/silk/).
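
The version rule above can be turned into a quick check. The following is a minimal sketch, not part of SiLK: the function name same_file_format_series is hypothetical, and it only tests whether two releases share the same major.minor prefix, in which case the rule above says the file formats cannot differ. When the prefixes differ, the formats may or may not have changed, and the release notes remain the authority.

```shell
#!/bin/sh
# Hypothetical helper (not part of SiLK): succeed when two SiLK version
# strings of the form major.minor.revision share the same major.minor
# prefix, so no file format change can separate them.
same_file_format_series() {
    # Strip the trailing ".revision" component from each version.
    a_series=${1%.*}
    b_series=${2%.*}
    [ "$a_series" = "$b_series" ]
}

# Illustrative version numbers only, as in the text above.
if same_file_format_series 0.13.9 0.13.2; then
    echo "0.13.9 and 0.13.2: same minor release, file formats unchanged"
fi
if ! same_file_format_series 0.13.2 0.14.1; then
    echo "0.13.2 and 0.14.1: minor release differs, check the release notes"
fi
```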

1.3 SiLK system configurations

There are two categories of applications that comprise a SiLK installation:

Analysis tools
read binary files containing SiLK Flow records and partition, sort, and count these records. Additional analysis tools can take packet capture (pcap) data, such as that created by tcpdump, and create SiLK Flow records from this data.
Packing tools
run as daemons to collect flow records from a flow generator (e.g., a router producing NetFlow), convert the records to the SiLK Flow format, categorize the flows as incoming or outgoing, and write the records to their final destination in binary flat files for use by the analysis tools.

Installation of the analysis tools is relatively straightforward since they are installed on systems that have direct access to the SiLK data files and require little configuration.

Installing the packing tools is more complex: the tools run as background processes (with every operating system having a unique way to start these processes) that must cooperate with each other and with additional software and/or network devices. The packing tools are designed to provide a great amount of flexibility in their installation, and with this flexibility comes additional complexity. The tools that make up the SiLK packing system are:

rwflowpack
is the heart of the packing system. It reads flow data either directly from network devices producing flow data (flow generators) or from a file generated by flowcap, converts the data to the SiLK flow format, categorizes the flow records, and writes records either to hourly flat-files organized in a time-based directory structure or to small files for transfer to a remote machine for processing by rwflowappend. All installations of the packing system will run rwflowpack.
flowcap
allows for remote data collection. It listens to flow generators and stores the data in small files (called incremental files) in a single directory. These files are then transferred to rwflowpack for categorization and storage.
rwflowappend
allows for remote data storage. It watches a directory for files containing small numbers of SiLK Flow records and appends those records to hourly files organized in a time-based directory tree.
rwsender
watches an incoming directory for files, moves the files to a processing directory, and transfers the files to one or more rwreceiver processes. rwsender’s incoming directory is usually the output directory of flowcap or rwflowpack.
rwreceiver
accepts files transferred from one or more rwsender processes and stores them in a destination directory. It is this destination directory that rwflowpack or rwflowappend monitors for new files. Note that either rwsender or rwreceiver may act as the server process, with the other acting as the client.

There are several possible configurations of the SiLK system which are introduced in this chapter. The detailed installation instructions are presented in subsequent chapters. In the subsections that follow, the term “remote” is with respect to the machine where rwflowpack is running.

1.3.1 Single machine configuration


Figure 1.1: Single machine operation with NetFlow sensor

Figure 1.2: Single machine operation with YAF sensor


In the single machine (all-in-one) configuration, all processing occurs on a single machine: you configure the rwflowpack program to collect flows, convert them to the SiLK Flow format, categorize them, and store the SiLK Flow records on the local disk. The analysis tools are installed on this same machine and read the files from local disk. Figure 1.1 shows how this configuration would look when flows are collected from a NetFlow router, and Figure 1.2 shows this configuration when the YAF flow sensor is used.

This is the simplest complete installation. To use it, follow the instructions in Section 2 to configure and build the source code, Section 3 to customize the analysis tools, and Section 4 to configure rwflowpack.

1.3.2 Remote data collection and remote flow storage


Figure 1.3: Remote collection and remote storage


It is not uncommon to have a situation in which the sensor(s) generating the flow records are not close to the data storage location. You could configure the flow generators to send the data to the data storage location; however, due to network reliability and bandwidth issues, it is desirable to collect flow data as close to where it is produced as possible. (This is especially true if the flow generator uses an unreliable transport protocol, such as UDP-based NetFlow generated by a router.) In these situations, the flowcap daemon can be installed on a machine close to the sensor where it will collect, compress, and forward the data to rwflowpack for packing.

Also, suppose the machine where rwflowpack is running is not the same machine on which you are storing the SiLK Flow files, or perhaps you want the SiLK files to be available on multiple machines for use by groups of analysts. In such cases, you configure rwflowpack to write the SiLK Flows into small files called incremental files, and these incremental files are distributed over the network to machine(s) where the rwflowappend daemon writes the SiLK Flow records to their final location. The analysis tools read the records from this final location.

This configuration is the most complex; Figure 1.3 illustrates it for NetFlow collection. When the YAF flow sensor is used, the top third of the drawing would resemble Figure 1.4.


Figure 1.4: Using YAF for remote collection


In this configuration, the rwsender and rwreceiver daemons transfer files between the machines. rwsender monitors a directory and transfers the files it finds there to one or more rwreceivers on the downstream side. rwreceiver accepts files from one or more rwsenders and places the files into a directory where the next tool in the packing chain can process them.

rwsender and rwreceiver only transfer files; they do not consider the contents of the files. Instead of using rwsender and rwreceiver, you could (with some stipulations) use other software, such as rsync or scp, to transfer the files between the machines.

If this describes your installation, follow the instructions in Section 2 to install SiLK on each machine, in Section 3 to customize the analysis tools on each machine where analysis occurs, and in Section 5 to configure the daemons on all the machines where the packing tools run.

1.3.3 Remote data collection with local storage


Figure 1.5: Remote collection and local storage


This configuration is a subset of the previous one: flowcap is used to capture the flows near the point where they are generated, and the rwsender and rwreceiver daemons transfer the flows to the machine where rwflowpack packs them and the analysis tools process them. Figure 1.5 depicts this configuration with a NetFlow router. When a YAF sensor is used, the top half of the figure would be replaced with Figure 1.4.

This installation will largely follow the same instructions as those described previously; however, the configuration of rwflowpack is slightly different, as described in Section 6. That section will refer you to the parts of Section 5 you must follow to configure flowcap. You will use Section 3 to configure the analysis tools on the machine where rwflowpack is installed.

1.3.4 Local collection and remote SiLK flow storage


Figure 1.6: Local collection and remote storage


This configuration, shown in Figure 1.6, is also a subset of that described in Section 1.3.2, except that rwflowpack is used to collect the flows instead of flowcap.

For this configuration, you will install the source code on the packing machine and the analysis machine (Section 2), customize the analysis tools on the machine where rwflowappend is to run (Section 3), and configure rwflowpack and rwflowappend (Section 7).

1.3.5 Analysis only


Figure 1.7: Configuration where only analysis occurs


Finally, if you only plan to use the software to analyze existing SiLK Flow files and/or packet capture (pcap) data such as that created by tcpdump, you would use this configuration (Figure 1.7). For this configuration, you need to build the source code (Section 2) and customize the analysis tools (Section 3).

1.4 Handbook summary

The instructions in the next two sections of this handbook will allow you to use SiLK to analyze existing SiLK files and analyze packet capture (pcap) data such as that created by tcpdump: Section 2 describes how to configure and install the SiLK software from source, and Section 3 describes how to customize the analysis tools to get the most use from the system.

The other sections of the handbook describe how to use SiLK to capture flow data, categorize the flows as incoming or outgoing, convert the data to the SiLK format, and store the SiLK Flows in binary flat files indexed by hour, sensor, and direction: The simplest configuration is the Single machine configuration (Section 4), where one machine collects the flow records, packs them, and stores them locally for use by the analysis tools. Having collection, categorization, and storage on separate machines is the most complex configuration (Section 5), and other configurations are possible (Sections 6 and 7).

Section 8 describes how to configure the flow generator to send its data to the SiLK collector(s).

To assist you in the configuration process, Appendix A describes how SiLK categorizes flows as incoming or outgoing (including a description of the data storage hierarchy), and Appendix B provides instructions on how to collect NetFlow data from the router and use that data as part of the configuration.

1.5 Additional resources

This handbook describes the installation of SiLK. For a discussion of the analysis tools, see their individual manual pages, the complete set of manual pages in The SiLK Reference Guide, and the tutorial information in Using SiLK for Network Traffic Analysis: Analysts’ Handbook. These documents are available at http://tools.netsa.cert.org/silk/silk_docs.html.

2 Building SiLK from Source Code

In this section you will

Quick Start Tip: To unpack the software, install the entire suite into /usr/local, and have it use /data as the location of the data repository, issue the commands:
  $ gzip -d -c silk-1.0.1.tar.gz | tar xf -
  $ cd silk-1.0.1
  $ ./configure \
          --enable-data-rootdir=/data  \
          --prefix=/usr/local
  $ make
  # make install

(You may need to become the root user to install the software.)

You may continue to Section 3.

Note: As of SiLK-1.0.0, you no longer specify the site when you configure the software since the packing logic is (normally) determined by a run-time plug-in loaded by rwflowpack.

2.1 Unpack the source code

Download and unpack the source code distribution:

  $ gzip -d -c silk-1.0.1.tar.gz | tar xf -

For the remainder of these instructions, the full path to the top of the source tree (i.e., the silk-1.0.1 directory, which contains the configure file) will be referred to as $SUITEROOT; it may be set in your (Bourne-compatible) shell by entering the command:

  $ export SUITEROOT=/home/silk/silk-1.0.1

2.2 Choose installation directories

You should decide where to install the tools and where your SiLK Flow data files will reside, and specify this information to the configure script. Some of these locations are compiled into the code, and others are used to initialize the start-up scripts and configuration files for rwflowpack and the other packing tool daemons.

SILK_DATA_ROOTDIR.
The root of the directory tree where the SiLK Flow files are permanently stored. Use the --enable-data-rootdir=dir switch to give the value to configure. If you do not specify a location, the default is /data.

This value will be compiled into the analysis tools, and it will be the default location that rwfilter uses when looking for the hourly data files. This directory must be accessible by the final program in the packing chain (typically rwflowpack) which writes the packed SiLK flow files and by the analysis machine(s) which reads them. The path to the directory tree can be different on the analysis and packing machines, as long as the actual physical location is the same.

When running the tools, the value of the SILK_DATA_ROOTDIR environment variable will override this compiled-in value. In addition, rwfilter allows you to override this value with the --data-rootdir switch.

For historical reasons, the default value for this location is /data. We use a separate disk for the SiLK flow data since the space it requires can be large and depends on the size of the monitored network, the amount of traffic the network sees, and the aging policy for historical data.
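
The lookup order described above can be sketched in a few lines of shell. This is an illustration only, not SiLK code: the function effective_data_rootdir and the variable COMPILED_IN_ROOTDIR are hypothetical names, the sketch assumes the --data-rootdir switch takes precedence over the environment variable, and it treats an empty SILK_DATA_ROOTDIR the same as an unset one.

```shell
#!/bin/sh
# Hypothetical helper (not part of SiLK) that mirrors the lookup order
# described above for the root of the data repository:
#   1. the value given to rwfilter's --data-rootdir switch, if any
#   2. the SILK_DATA_ROOTDIR environment variable, if set and non-empty
#   3. the value compiled in via --enable-data-rootdir (default /data)
COMPILED_IN_ROOTDIR=/data   # assumption: configure was run with defaults

effective_data_rootdir() {
    switch_value=$1
    if [ -n "$switch_value" ]; then
        echo "$switch_value"
    elif [ -n "${SILK_DATA_ROOTDIR:-}" ]; then
        echo "$SILK_DATA_ROOTDIR"
    else
        echo "$COMPILED_IN_ROOTDIR"
    fi
}

# No switch and no environment variable: the compiled-in value applies.
unset SILK_DATA_ROOTDIR
effective_data_rootdir ""
```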

SILK_PATH.
The root of the directory tree where SiLK will be installed. Pass this value to configure in the --prefix switch. If not specified, the default is /usr/local. If you decide to move the tools after they have been installed, you may need to specify the LD_LIBRARY_PATH environment variable (or something equivalent for your platform) so that the applications can find the shared libraries.

The following table shows the subdirectories of $SILK_PATH where files are normally installed, but you can change these by specifying switches to configure. Use configure’s --help switch to see the full list of directory choices.

Subdirectory     Contents
bin              analysis tools, such as rwfilter
sbin             system administrator tools, for example rwflowpack
share/man        manual pages
lib/silk         optional plug-in support, such as the country code support
share/silk       support files, such as the country-code mapping file
share/silk/etc   sample configuration files and scripts to assist the system administrator in running the packing system daemons
etc              configuration files used by the packing system daemons (see SCRIPT_CONFIG_LOCATION below)
var              directory root used by packing tools (see DAEMON_STATE_DIRECTORY below)
var/log          log files generated by the packing system daemons
var/lib          incomplete data files generated by the packing tools and files awaiting processing
lib              libraries required to run the tools and used to build end-user plug-ins
include/silk     header files used to build end-user plug-ins

Note: The applications work best when they have access to configuration files and plug-ins, and the code that searches for these files depends on the directory tree as it will exist after installation. If you do not plan to use the tools outside of your own tree, you may want to specify --prefix=`pwd` (note the backquotes) to the configure script. When you run make install, the tools will then be installed into the top of the source tree.

SCRIPT_CONFIG_LOCATION.
The directory containing configuration files used by the daemons that make up the SiLK packing system. Often this is the /etc directory for system daemons; Red Hat Linux uses /etc/sysconfig for this value. The value SiLK uses is determined by the --sysconfdir switch to configure, and it defaults to $SILK_PATH/etc if the --sysconfdir switch was not given. This value will be written into the sample daemon control sh-scripts that get installed in $SILK_PATH/share/silk/etc/init.d/daemon. If you need to change this value after you run configure, you may simply change the value in the sh-scripts.
DAEMON_STATE_DIRECTORY.
The directory used by the packing system daemons to store log files, incomplete data files, files received from remote machines, and files awaiting transfer. This is usually the /var directory, with subdirectories for the various types of files and applications that own them. You may set this value by running configure with --localstatedir=dir; the default value for this directory is $SILK_PATH/var. This value is used in the configuration files for the packing tools that get installed in $SILK_PATH/share/silk/etc/daemon.conf. You will need to edit these files when you set up the packing system, and you do not have to use these initial values.
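
To see how the four locations in this section fit together on one command line, the following sketch assembles the corresponding configure switches and prints the result instead of running it. The directory values (/opt/silk, /etc/sysconfig, /var/silk) are placeholders chosen for illustration, and the function name configure_command is hypothetical.

```shell
#!/bin/sh
# Sketch only: print (do not run) a configure command line that sets the
# four locations discussed in this section. All directory values below
# are placeholders chosen for illustration.
SILK_PATH=/opt/silk                     # --prefix
DATA_ROOTDIR=/data                      # --enable-data-rootdir
SCRIPT_CONFIG_LOCATION=/etc/sysconfig   # --sysconfdir
DAEMON_STATE_DIRECTORY=/var/silk        # --localstatedir

configure_command() {
    printf '%s' "./configure"
    # printf repeats the format once for each remaining argument.
    printf ' %s' \
        "--prefix=$SILK_PATH" \
        "--enable-data-rootdir=$DATA_ROOTDIR" \
        "--sysconfdir=$SCRIPT_CONFIG_LOCATION" \
        "--localstatedir=$DAEMON_STATE_DIRECTORY"
    printf '\n'
}

configure_command
```

Printing the command first makes it easy to review the locations before committing to them, since some of these values are compiled into the tools.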

2.3 Optional features

To adapt the source code to your operating system and environment, the configure shell script will run several tests to check for various features. By giving command line switches to configure, you can include additional features or instruct configure to use libraries from particular locations. You can also control where SiLK will be installed. You can display the full list of switches that configure accepts by running configure --help. The remainder of this section describes many of these switches.

2.3.1 Supporting PySiLK: SiLK in Python

SiLK-1.0 provides support for accessing SiLK from within Python and for using Python code as part of an rwfilter invocation. This support is called PySiLK and it requires Python 2.4 or later.

To include PySiLK support, you must provide the --with-python switch to configure. To use a particular Python interpreter, you may use --with-python=path. For information on using PySiLK, see the rwfilter man page and SiLK in Python, available from http://tools.netsa.cert.org/silk/silk_docs.html.

(Unless Python and SiLK are installed in the same directory, you will need to follow the instructions in Section 3.2 to allow Python to find the PySiLK modules.)

2.3.2 Supporting IPv6

Some SiLK applications have been modified to support handling IPv6 addresses. To enable this behavior, specify the --enable-ipv6 switch on the configure command line. Currently, SiLK supports collecting IPv6 flow records from IPFIX sources, which requires that you build and install libfixbuf v0.7.3 or later (see Section 2.3.5) before installing SiLK.

2.3.3 Using automatic file compression

To reduce the size of the data files, the rwflowpack daemon and many analysis tools have the ability to use an external library to automatically compress their binary output when writing and uncompress their input when reading. (This compression occurs on the ‘data’ section of the file; the file’s header remains uncompressed.) You can specify whether a particular tool uses this external compression via a switch on the tool’s command line. The default setting for this behavior is determined by the --enable-output-compression=type switch to configure. SiLK supports the following parameters to the switch:

none     use no compression; this is the default
zlib     use the widely available zlib general compression library
lzo1x    use the LZO (LZO 1.08 or LZO 2.02) real-time data compression library

The latter two options require the support of external libraries as described next.

2.3.4 Specifying the location of compression libraries

The configure script will attempt to find the zlib general compression library and its header file. Specifying the --with-zlib=dir switch tells configure that the header and library are located in dir/include/zlib.h and dir/lib/libz.a, respectively.

Note: Several operating system vendors distribute the libraries and header files in separate packages. To take zlib on Red Hat as an example, the zlib package contains the zlib library, and the header file (and manual page) is in the separate zlib-devel package. In order to build SiLK from source, you need to have both packages installed.

The configure script will also attempt to find the LZO (http://www.oberhumer.com/opensource/lzo/) real-time data compression library and headers. SiLK will work with either LZO 1.08 or LZO 2.02. You may use the --with-lzo=dir switch to specify the location of LZO.
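
Before passing --with-zlib=dir to configure, it can help to confirm that the directory has the layout described above. The sketch below is a hypothetical pre-flight check, not part of SiLK; it follows the paths in the text literally (dir/include/zlib.h and dir/lib/libz.a), so a system that ships only a shared zlib library would fail this check even if configure itself might succeed. The directory /usr/local is just an example.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of SiLK): report whether a
# directory has the layout that --with-zlib=dir expects, namely
# dir/include/zlib.h and dir/lib/libz.a.
has_zlib_layout() {
    dir=$1
    [ -f "$dir/include/zlib.h" ] && [ -f "$dir/lib/libz.a" ]
}

# /usr/local is only an example location.
ZLIB_DIR=/usr/local
if has_zlib_layout "$ZLIB_DIR"; then
    echo "try: ./configure --enable-output-compression=zlib --with-zlib=$ZLIB_DIR"
else
    echo "zlib.h or libz.a not found under $ZLIB_DIR; is the header package installed?"
fi
```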

2.3.5 Collecting IPFIX flows

If SiLK is compiled with libfixbuf support, the SiLK packer can read flow data generated by an IPFIX (Internet Protocol Flow Information eXport) compliant flow generator such as the YAF v1.0 flow sensor technology (http://tools.netsa.cert.org/yaf/). libfixbuf is not part of SiLK; you must download it from http://tools.netsa.cert.org/fixbuf/ and install it prior to installing SiLK. To use this feature, SiLK requires libfixbuf-0.7.3 or later.

In addition, if configure finds libfixbuf, the rwipfix2silk and rwsilk2ipfix command line tools will be built. These tools support converting between the SiLK Flow record format and IPFIX.

When libfixbuf support is included, the SiLK data files contain additional information: the TCP flags are broken into two fields, one containing the flags on the first packet of a flow and the other containing the flags on all other packets in the flow. This feature is automatically enabled when libfixbuf is found. Specifying --enable-initial-tcpflags also enables this feature, but note that the separate TCP flag fields will only contain valid values when used with the enhanced flow collection software.

The configure script will look for the pkg-config(1) specification file for libfixbuf (libfixbuf.pc) in the standard pkg-config directories, and if libfixbuf is installed in a standard location, configure should be able to locate it. If you have installed libfixbuf but configure does not find it, you can run configure with the --with-libfixbuf=dir switch to add the directory dir to pkg-config’s search path (configure will add dir to the PKG_CONFIG_PATH environment variable). The libfixbuf.pc file is normally installed in the lib/pkgconfig subdirectory of the location where libfixbuf was installed.
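
If configure does not find libfixbuf, a first step is to locate libfixbuf.pc by hand. The following sketch is hypothetical and not part of SiLK: given an install prefix (here the placeholder /opt/fixbuf), it checks the lib/pkgconfig subdirectory mentioned above and prints the directory. It assumes the argument to --with-libfixbuf should be the directory that actually contains libfixbuf.pc, which is one reading of the description above.

```shell
#!/bin/sh
# Hypothetical helper (not part of SiLK): given the prefix where libfixbuf
# was installed, print the directory holding libfixbuf.pc, which the text
# above says is normally the lib/pkgconfig subdirectory.
find_libfixbuf_pc_dir() {
    prefix=$1
    pc_dir=$prefix/lib/pkgconfig
    if [ -f "$pc_dir/libfixbuf.pc" ]; then
        echo "$pc_dir"
    else
        return 1
    fi
}

# /opt/fixbuf is a placeholder install prefix.
if found_dir=$(find_libfixbuf_pc_dir /opt/fixbuf); then
    echo "try: ./configure --with-libfixbuf=$found_dir"
else
    echo "libfixbuf.pc not found under /opt/fixbuf/lib/pkgconfig"
fi
```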

2.3.6 Disabling run-time packing logic

As of SiLK-1.0.0, the packing logic used by rwflowpack to categorize flow records as incoming or outgoing, web or non-web, et cetera, is determined by a plug-in that is loaded when rwflowpack is invoked. The name of this plug-in must be passed to rwflowpack via the --packing-logic switch.

Using a plug-in for flow categorization makes it easier to change the packing logic or to test new categorization schemes. However, it requires that the plug-in be available and that you have not disabled plug-in support by building statically-linked applications (Section 2.3.8).

If you wish to compile the packing-logic into rwflowpack, you must specify the --enable-packing-logic switch when you run configure. The argument to this switch is the C source file containing the packing logic to use for this SiLK installation. For example, if you wish to use the twoway packing logic described in Appendix A, run

  $ configure ... \
      --enable-packing-logic=site/twoway/packlogic-twoway.c

2.3.7 Controlling what applications are built and installed

All of the SiLK applications (i.e., both the analysis tools and the packing [flow collection and storage] daemons) and their associated manual pages will be built and installed unless the --disable-packing-tools or --disable-analysis-tools switch is passed to configure. You can speed up the build by disabling the parts of the system you do not require. For example, a remote collection machine does not need the analysis tools (though they can be useful to have for debugging).

2.3.8 Statically-linked applications

The configure script will build SiLK with support for dynamic-linking, where the common library functions of SiLK are maintained in separate files that the operating system automatically loads when you invoke an application. (The alternative is called static-linking.) While dynamic-linking allows the kernel to maintain one image of the library for simultaneous invocations of SiLK tools, it makes moving the binaries almost impossible since the libraries must move as well, and often the binaries are configured to look in a particular location for the libraries.

If you wish to build without dynamic-linking support, give configure the --enable-static-applications switch, which forces the applications to be statically linked. However, this may result in some plug-ins not working correctly.

An alternative is to specify the --disable-shared switch to configure, but note that this results in the plug-ins not being compiled at all.

If you specify --enable-static-applications or --disable-shared to configure, you also need to specify the --enable-packing-logic switch since rwflowpack will not be able to load the packing logic as a plug-in. See Section 2.3.6 for a description of the --enable-packing-logic switch and the argument the switch requires.
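Because configure does not warn about this dependency, you may want to check your switch list before running it. The sketch below is not part of SiLK; the function name check_static_args is hypothetical, and it only inspects the arguments you give it without invoking configure.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of SiLK): given the switches you
# plan to pass to configure, fail if a static build is requested without
# compiled-in packing logic.
check_static_args() {
    static=no
    logic=no
    for arg in "$@"; do
        case "$arg" in
            --enable-static-applications|--disable-shared) static=yes ;;
            --enable-packing-logic=?*) logic=yes ;;
        esac
    done
    if [ "$static" = yes ] && [ "$logic" = no ]; then
        echo "static build requires --enable-packing-logic=FILE" >&2
        return 1
    fi
    return 0
}

# Example: this switch list passes because the packing logic is compiled in.
check_static_args --enable-static-applications \
    --enable-packing-logic=site/twoway/packlogic-twoway.c && echo "ok"
```

The same switch list can then be handed to configure unchanged.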

2.3.9 Supporting encrypted communication using GnuTLS

If SiLK is compiled with GnuTLS support, the communication between rwsender and rwreceiver can be encrypted and authenticated once the appropriate certificates have been created and distributed. GnuTLS is the GNU Project’s Transport Layer Security Library, and it is available from http://www.gnu.org/software/gnutls/. Note that SiLK requires GnuTLS v1.4.1 or greater.

The configure script will look for the pkg-config(1) specification file for GnuTLS (gnutls.pc) in the standard pkg-config directories, and if GnuTLS is installed in a standard location, configure should be able to locate it. If you have installed GnuTLS but configure does not find it, you can run configure with the --with-gnutls=dir switch to add the directory dir to pkg-config’s search path (configure will add dir to the PKG_CONFIG_PATH environment variable). The gnutls.pc file is normally installed in the lib/pkgconfig subdirectory of the location where GnuTLS was installed.

2.3.10 Using your local timezone

By default, SiLK uses UTC when printing timestamps to the user, and it expects timestamps from the user to be in UTC. Giving configure the --enable-localtime switch will modify SiLK to print and expect times in the local timezone. (Data files are always indexed by UTC.)

2.3.11 Supporting conversion of packet capture tcpdump data

The configure script will attempt to locate the pcap library and header files. If they are not found or if they do not have the required functions, SiLK will be built without support for the packet-flow conversion tools rwptoflow and rwpmatch.

If you wish to specify that SiLK use a particular version of the pcap library, pass the --with-pcap=dir switch to configure, where dir contains include/pcap.h and lib/libpcap.a (or a shared version of the library).
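A quick way to confirm that a directory has the layout described above is sketched below. The function pcap_dir_ok and the path /opt/pcap are hypothetical, and the shared-library file names it accepts (.so and .dylib variants) are an assumption about common platforms rather than something configure is known to test.

```shell
#!/bin/sh
# Hypothetical helper (not part of SiLK): succeed if DIR has the layout
# the handbook describes for --with-pcap=DIR.
pcap_dir_ok() {
    dir=$1
    [ -f "$dir/include/pcap.h" ] || return 1
    # Accept the static archive or an assumed shared variant.
    for lib in "$dir"/lib/libpcap.a "$dir"/lib/libpcap.so* "$dir"/lib/libpcap*.dylib; do
        [ -f "$lib" ] && return 0
    done
    return 1
}

# /opt/pcap is an assumed location used only for illustration.
if pcap_dir_ok /opt/pcap; then
    echo "pass --with-pcap=/opt/pcap to configure"
else
    echo "/opt/pcap lacks include/pcap.h or lib/libpcap.*"
fi
```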

2.3.12 Supporting the IP Association library (libipa)

If SiLK is compiled with libipa support, the rwipaimport and rwipaexport programs will be compiled. These tools interact with an IPA (IP Association) database, which stores information about IP addresses. rwipaimport takes an existing SiLK IPset, Bag, or Prefix Map and stores it in the database; rwipaexport reads data from the IPA database to create a SiLK IPset, Bag, or Prefix Map. libipa is a separate library available from http://tools.netsa.cert.org/ipa/. SiLK-1.0 requires libipa-0.3.0 or greater.

The configure script will look for the pkg-config(1) specification file for libipa (libipa.pc) in the standard pkg-config directories, and if libipa is installed in a standard location, configure should be able to locate it. If you have installed libipa but configure does not find it, you can run configure with the --with-libipa=dir switch to add the directory dir to pkg-config’s search path (configure will add dir to the PKG_CONFIG_PATH environment variable). The libipa.pc file is normally installed in the lib/pkgconfig subdirectory of the location where libipa was installed.

2.3.13 Supporting development and debugging

By default, SiLK is built with full optimization (assuming the compiler accepts -O3 for optimization), with no debugging, and with assert()s disabled. Pass the --disable-optimization, --enable-debugging, and --enable-assert switches to configure to modify these settings. If your compiler uses a different switch to enable optimization (such as -xO4 for Solaris’ cc), you may specify it with --enable-optimization=-xO4.

2.4 Configure SiLK

You will need to configure the source code for each machine that runs any part of the SiLK Collection and Analysis Suite.

Run the configure script to configure the SiLK source code. The following command would configure the software to use /data as the location of the data repository and to expect to be installed into the /usr/local directory:

  $ cd $SUITEROOT
  $ ./configure \
          --prefix=/usr/local \
          --enable-data-rootdir=/data

Consult the previous section for additional switches that you may need or wish to pass to configure to help it find a library or to enable an optional feature.

configure will run several tests on your platform and use the results of these tests to create several files. When configure has finished, it will print a summary of how it has configured the SiLK source code:

  * Configured package:           SiLK 1.0.0
  * Root of packed data tree:     /data
  * Install directory:            /usr/local
  * Source files ($top_srcdir):   .
  * Packing logic:                via run-time plugin
  * Timezone support:             UTC
  * Default compression method:   SK_COMPMETHOD_NONE
  * IPv6 support:                 NO
  * IPFIX collection support:     YES (-lfixbuf -lgthread-2.0 -lglib-2.0)
  * Initial TCP flag support:     YES (via libfixbuf)
  * Transport encryption support: NO (gnutls not found)
  * IPA support:                  YES (-lipa -lairdbc -lglib-2.0)
  * LIBPCAP support:              YES (-lpcap)
  * Python support:               NO
  * Build analysis tools:         YES
  * Build rwflowpack:             YES
  * Build flowcap:                YES
  * Compiler (CC):                gcc
  * Compiler flags (CFLAGS):      -I$(top_srcdir)/src/include
      -DNDEBUG -D_GNU_SOURCE=1 -D_FILE_OFFSET_BITS=64 -O3
      -fno-strict-aliasing -Wall -W -Wmissing-prototypes
      -Wformat=2 -Wdeclaration-after-statement
  * Linker flags (LDFLAGS):
  * Libraries (LIBS):             -llzo -lz -ldl -lm

The above message is also written to the silk-summary.txt file in the directory where you ran configure.

Verify that the configuration matches your expectations. The configure script does not complain when it is given a switch it does not recognize, which makes it easy for a simple “typo” to go unnoticed.
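Since a mistyped switch is silently ignored, one option is to check silk-summary.txt for the settings you care about. The sketch below is not part of SiLK; the function name summary_says is hypothetical, and it assumes the summary keeps the "* Label: value" layout shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of SiLK): succeed if the summary file has
# a line with exactly this label whose value starts with the given text.
#   $1 = summary file, $2 = label, $3 = expected start of the value
summary_says() {
    awk -v label="$2" -v want="$3" '
        {
            line = $0
            sub(/^[ \t]*\*[ \t]*/, "", line)   # drop the leading "* "
            i = index(line, ":")
            if (i == 0) next                   # continuation line, no label
            key = substr(line, 1, i - 1)
            val = substr(line, i + 1)
            sub(/^[ \t]+/, "", val)            # drop padding before the value
            if (key == label && index(val, want) == 1) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example: warn if libfixbuf support did not make it into the build.
if [ -f silk-summary.txt ]; then
    summary_says silk-summary.txt "IPFIX collection support" "YES" \
        || echo "warning: IPFIX collection support is not enabled" >&2
fi
```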

2.5 Build and install

To build SiLK, simply type make from the top of the source tree:

  $ cd $SUITEROOT
  $ make

You can then install the software. Depending on where you chose to install, you may need to become the root user first. This command will install the applications, the support libraries, the plug-ins, and the manual pages:

  # cd $SUITEROOT
  # make install

3 Analysis Tool Customization

This section describes the customization of the analysis tools. The manual page for each tool will be installed under $SILK_PATH/share/man/man1/ when you install SiLK. In addition, http://tools.netsa.cert.org/silk/silk_docs.html provides the manual pages as individual web pages and as a single volume in The SiLK Reference Guide. The web site also contains a tutorial on using the analysis suite: Using SiLK for Network Traffic Analysis: Analysts’ Handbook.

While nothing in this section is required to use SiLK, these steps will enhance the utility of the software.

3.1 Create the site configuration file, silk.conf

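In addition to the information contained in the NetFlow or IPFIX flow record (e.g., source and destination addresses and ports, IP protocol, time stamps, data volume), every SiLK flow record has two additional pieces of information: the sensor at which the flow was collected, and a class/type pair that categorizes the flow (for example, as incoming or outgoing, web or non-web).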

The purpose of the SiLK site configuration file, silk.conf, is to define the sensors, classes, and types to use when packing and accessing the SiLK flow data. The first time you install SiLK, and any time you add new sensors (IPFIX or NetFlow generators) to a deployment, you will need to update silk.conf.

Note: If you are upgrading from SiLK-0.11.x to SiLK-1.0, no changes to the silk.conf file are required, though you may want to read about the new packing-logic statement that silk.conf supports.

Quick Start Tip: Open $SILK_PATH/share/silk/twoway-silk.conf in a text editor and change the sensor names S0, S1, et cetera to reflect the sensors at your site. Add or remove sensors as required, and be certain to change the name in both the sensor and the class sections of the file.

  sensor 0 Alpha
  sensor 1 Bravo
  ...
  
  class all
      sensors Alpha Bravo ...
  end class

Once you have made the changes, rename the file silk.conf and save it in the root of your data repository, normally /data.

You may continue to Section 3.2.

When you install SiLK, sample site configuration files are installed in $SILK_PATH/share/silk/SITE-silk.conf. The various files provide different sets of classes and types, and must coordinate with the packing rules that you will use at your site. For information on the twoway and generic site files, see Appendix A. We recommend use of the twoway-silk.conf file.

Copy the twoway-silk.conf file to a temporary location, renaming the file silk.conf, and open silk.conf in a text editor. If you are using the twoway-silk.conf file, you will see the following near the beginning of the file:

1  sensor 0 S0
2  sensor 1 S1
3  sensor 2 S2
4  sensor 3 S3
5  ...
6  sensor 13 S13
7  sensor 14 S14
8  
9  class all
10      sensors S0 S1 S2 S3 S4 S5 S6 S7 S8 S9 S10 S11 S12 S13 S14
11  end class

Each line of the form

  sensor NUM NAME

defines a sensor, where

NUM
is an increasing integer that serves as the ID of the sensor. It is good practice to number the first entry 0, the second 1, etc.
NAME
is the name of the sensor. For example, the name of the sensor on line 2 is S1. Each NAME can be up to 24 characters long, should begin with a capital letter, and may not contain an underscore, a slash, or white space.
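The naming rules above can be checked mechanically before you edit silk.conf. The following sketch is not part of SiLK, and the function name valid_sensor_name is hypothetical. Note that it treats the "should begin with a capital letter" recommendation as a hard requirement, which is stricter than the text.

```shell
#!/bin/sh
# Hypothetical helper (not part of SiLK): succeed if NAME satisfies the
# sensor-name rules listed above.
valid_sensor_name() {
    name=$1
    # Between 1 and 24 characters long.
    [ "${#name}" -ge 1 ] && [ "${#name}" -le 24 ] || return 1
    case "$name" in
        [[:upper:]]*) ;;               # begins with a capital letter
        *) return 1 ;;
    esac
    case "$name" in
        *[_/[:space:]]*) return 1 ;;   # underscore, slash, or white space
    esac
    return 0
}

# Example: try a few candidate names.
for n in Alpha bravo North_Gate; do
    if valid_sensor_name "$n"; then
        echo "$n: ok"
    else
        echo "$n: rejected"
    fi
done
```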

As distributed, the twoway-silk.conf is configured with 15 sensors having names S0, S1, through S14. (If you have 15 or fewer sensors and these names are satisfactory, you may save the silk.conf file to the root of your data repository, typically /data, and skip ahead to Section 3.2.)

You may add, remove, or rename the sensors. Often the sensor names reflect the location of a router or the ISP the router connects to. There are some important things to keep in mind when modifying the list of sensors:

  1. Once a sensor has been assigned an ID number, future revisions should never remove or renumber the sensor. SiLK Flow files store the sensor’s integer ID and use it to look up the sensor’s name; removing or renumbering a sensor breaks this mapping. In order to keep the mapping consistent between new and old data, old sensor definitions should remain indefinitely.
  2. If an existing sensor is ever renamed, it will be necessary to rename all the previously packed data files that have the former sensor name as part of their file names.
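When revising an existing silk.conf, you can compare the old and new sensor definitions to catch both situations. The sketch below is not part of SiLK; the function name check_sensor_ids is hypothetical, and it assumes that each sensor is defined on a single line beginning with the word sensor, as in the listing above. A renumbered sensor shows up as a removed ID, a renamed ID, or both.

```shell
#!/bin/sh
# Hypothetical helper (not part of SiLK): compare the "sensor NUM NAME"
# lines of an old and a new silk.conf. Prints one line per finding and
# returns non-zero if a sensor ID present in the old file is missing.
#   $1 = old silk.conf, $2 = new silk.conf
check_sensor_ids() {
    awk '
        $1 == "sensor" && NF >= 3 {
            if (FILENAME == ARGV[1]) was[$2] = $3
            else                     now[$2] = $3
        }
        END {
            for (id in was) {
                if (!(id in now)) {
                    print "sensor " id " (" was[id] ") was removed"
                    bad = 1
                } else if (now[id] != was[id]) {
                    print "sensor " id " renamed " was[id] " -> " now[id] "; rename its packed files"
                }
            }
            exit (bad ? 1 : 0)
        }
    ' "$1" "$2"
}

# Usage (paths are illustrative):
#   check_sensor_ids /data/silk.conf ./silk.conf
```

Adding new sensors with new ID numbers produces no output, since that change leaves the existing mapping intact.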

Once you have edited the sensor definitions, you must update the sensors command in the same file (line 10) to contain the list of sensor names.
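It is easy to change a sensor definition and forget the matching sensors command. The sketch below cross-checks the two. It is not part of SiLK; the function name check_class_sensors is hypothetical, and it assumes that every defined sensor is meant to appear in at least one sensors command, which matches the single-class twoway-silk.conf but may not hold for every site file.

```shell
#!/bin/sh
# Hypothetical helper (not part of SiLK): report names that appear in a
# "sensors" line without a matching "sensor NUM NAME" definition, and
# defined sensors that no "sensors" line mentions.
#   $1 = silk.conf to check
check_class_sensors() {
    awk '
        $1 == "sensor"  && NF >= 3 { defined[$3] = 1 }
        $1 == "sensors" { for (i = 2; i <= NF; i++) listed[$i] = 1 }
        END {
            for (s in listed) {
                if (!(s in defined)) { print "undefined sensor: " s; bad = 1 }
            }
            for (s in defined) {
                if (!(s in listed)) { print "unlisted sensor: " s; bad = 1 }
            }
            exit (bad ? 1 : 0)
        }
    ' "$1"
}

# Usage (path is illustrative):
#   check_class_sensors ./silk.conf
```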

For example, if you had three routers, Alpha, Bravo, and Charlie, you would edit the site configuration file to read:

  sensor 0 Alpha
  sensor 1 Bravo
  sensor 2 Charlie
  
  class all
      sensors Alpha Bravo Charlie