package io
Package Members
- package ipfix
The ipfix package provides classes and objects for reading and writing IPFIX data.
Class / Object Overview
(For a quick overview of the IPFIX format, see the end of this description.)
The Message trait describes the attributes of an IPFIX message, and the CollectedMessage class and object are implementations of that trait when reading data. (Record export does not create a specific Message instance.)
The IpfixSet abstract class and object hold the attributes of a Set. The TemplateSet class may represent a Template Set or an Options Template Set.
The Template class and object are used to represent a Template Record or an Options Template Record.
The IEFieldSpecifier class and object represent a Field Specifier within an existing Template. To search for a field within a Template, the user of the ipfix package creates a FieldSpec (the companion object) and attempts to find it within a Template.
The Field Specifier uses the numeric Identifier to identify an Information Element, and an Element is represented by the InfoElement class and object. The InfoModel class and object represent the Information Model.
To describe the attributes of an InfoElement, several support classes are defined: DataTypes is an enumeration that describes the type of data that the element contains, and DataType is a class that extracts a Field Value with that DataType. IESemantics describes the data semantics of an Information Element (e.g., a counter, an identifier, a set of flags), and IEUnits describes its units.
The Data Set is represented by the RecordSet class and object.
A Data Record is represented by the Record abstract class. This class has three subclasses:
- The CollectedRecord class and object are its implementation when reading data. Its members are always referenced by numeric position.
- The ArrayRecord class and object may be used to build a Record from Scala objects; its fields are also referenced by numeric position.
- ExportRecord is an abstract class that also supports building a Record from Scala objects. The user extends the class and uses the IPFIXExtract annotation to mark the members of the subclass that are to be used when writing the Record.
A user-defined class that extends the Fillable trait may use the Record's fill() method to copy fields from a Record to the user's class. It also uses the IPFIXExtract annotation.
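For illustration, a pair of hypothetical user classes follows. All class and member names here are invented for the sketch, and the precise forms of ExportRecord, Fillable, and the IPFIXExtract annotation should be taken from the scaladoc rather than from this sketch:

import org.cert.netsa.io.ipfix.{ExportRecord, Fillable, IPFIXExtract}

// Hypothetical export class: the annotated members are the ones used
// when the Record is written. (Member names and any constructor
// arguments required by ExportRecord are assumptions.)
class FlowForExport extends ExportRecord {
  @IPFIXExtract var sourceTransportPort: Int = 0
  @IPFIXExtract var octetDeltaCount: Long = 0L
}

// Hypothetical fillable class: a Record's fill() method copies the
// matching fields of the Record into the annotated members.
class FlowFromRecord extends Fillable {
  @IPFIXExtract var sourceTransportPort: Int = 0
  @IPFIXExtract var octetDeltaCount: Long = 0L
}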
A Structured Data Field Value in a Data Record is represented by the ListElement abstract class. That abstract class has three abstract subclasses, and each of those has two concrete subclasses (one for reading and one for writing):
- The BasicList abstract class (object) has subclasses CollectedBasicList and ExportBasicList.
- The SubTemplateList abstract class (object) has subclasses CollectedSubTemplateList and ExportSubTemplateList.
- The SubTemplateMultiList abstract class (object) has subclasses CollectedSubTemplateMultiList and ExportSubTemplateMultiList.
Reading data
When reading data, a Record instance is returned by a RecordReader. The RecordReader uses a class that extends the MessageReader trait. The ipfix package includes two: ByteBufferMessageReader and StreamMessageReader.
A Session value represents an IPFIX session, which is part of a SessionGroup.
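A minimal reading sketch follows, assuming StreamMessageReader wraps an InputStream and RecordReader is built from a MessageReader and a Session; both constructions are assumptions about the API shape, so check the scaladoc for the actual factories:

import java.io.FileInputStream
import org.cert.netsa.io.ipfix.{RecordReader, StreamMessageReader}

// Hypothetical wiring: wrap an input stream in a MessageReader and
// iterate over the Records produced by a RecordReader.
val input = new FileInputStream("path/to/ipfix/file")   // illustrative path
val messages = new StreamMessageReader(input)           // assumed constructor
val session = ???   // a Session obtained from a SessionGroup (API not shown)
for ( rec <- new RecordReader(messages, session) ) {    // assumed constructor
  println(rec)
}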
Writing data
For writing data, an instance of an ExportStream must be created using a Session and the destination FileChannel. The user adds Records or Templates to the ExportStream and they are written to the FileChannel.
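A matching writing sketch, again with assumed constructor and method names rather than confirmed signatures:

import java.nio.channels.FileChannel
import java.nio.file.{Paths, StandardOpenOption}
import org.cert.netsa.io.ipfix.ExportStream

// Hypothetical wiring: create an ExportStream over a Session and a
// FileChannel, then add Templates and Records to it. The constructor
// and the add() calls are assumptions.
val channel = FileChannel.open(Paths.get("out.ipfix"),
  StandardOpenOption.CREATE, StandardOpenOption.WRITE)
val exporter = new ExportStream(session, channel)  // session as in the reading sketch
exporter.add(template)   // assumed method; template is a Template built elsewhere
exporter.add(record)     // assumed method; record is a Record built elsewhere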
Overview of IPFIX
An IPFIX stream is composed of Messages. Each Message has a 16-byte Message Header followed by one or more Sets. There are three types of Sets: a Data Set, a Template Set, and an Options Template Set.
Each Set has a 4-byte set header followed by one or more Records. A Data Set contains Data Records and a Template Set contains Template Records.
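As a standalone illustration (not the ipfix package's own API), the two fixed-size headers can be decoded with plain java.nio; the field layout below follows RFC 7011:

import java.nio.ByteBuffer

// 16-byte Message Header: version is always 10 for IPFIX; length is the
// whole message's size in octets; export time is in UNIX seconds.
case class MessageHeader(version: Int, length: Int, exportTime: Long,
                         sequenceNumber: Long, observationDomain: Long)

// 4-byte Set Header: Set ID 2 marks a Template Set, 3 an Options
// Template Set, and 256 or greater marks a Data Set, where the Set ID
// is the Template ID of the Template describing its Records.
case class SetHeader(setId: Int, length: Int)

def readMessageHeader(buf: ByteBuffer): MessageHeader =
  MessageHeader(
    version = buf.getShort() & 0xffff,
    length = buf.getShort() & 0xffff,
    exportTime = buf.getInt() & 0xffffffffL,
    sequenceNumber = buf.getInt() & 0xffffffffL,
    observationDomain = buf.getInt() & 0xffffffffL)

def readSetHeader(buf: ByteBuffer): SetHeader =
  SetHeader(buf.getShort() & 0xffff, buf.getShort() & 0xffff)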
A Template Record describes the shape of the data that appears in a Data Record. A Template Record contains a 4-byte header followed by zero or more Field Specifiers. Each Field Specifier is either a 4-byte or an 8-byte value that describes a field in the Data Record.
A Field Specifier has two parts. The first is the numeric Information Element Identifier that is defined in an Information Model. The second is the number of octets the field occupies in the Data Record.
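Continuing the standalone sketch, the 4-byte versus 8-byte distinction comes from the Identifier's high bit, which signals that a 4-byte Private Enterprise Number follows (per RFC 7011; this type is named RawFieldSpecifier to avoid confusion with the package's own classes):

import java.nio.ByteBuffer

case class RawFieldSpecifier(elementId: Int, fieldLength: Int,
                             enterpriseNumber: Option[Long])

// 2-byte IE Identifier plus 2-byte field length; a length of 0xffff
// denotes a variable-length field. If the Identifier's high bit is
// set, a 4-byte enterprise number follows, for 8 bytes in total.
def readFieldSpecifier(buf: ByteBuffer): RawFieldSpecifier = {
  val rawId  = buf.getShort() & 0xffff
  val length = buf.getShort() & 0xffff
  val pen =
    if ((rawId & 0x8000) != 0) Some(buf.getInt() & 0xffffffffL) else None
  RawFieldSpecifier(rawId & 0x7fff, length, pen)
}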
A Data Set contains one or more Data Records of the same type, where the type is determined by the Template Record that the Data Set Header refers to. Each Data Record contains one or more Field Values, where the order and length of the Field Values is given by the Template.
A Field Value in a Data Record may contain Structured Data. There are three types of Structured Data:
- A Basic List contains one or more instances of a single Information Element.
- A SubTemplateList references a single Template ID, and it contains one or more Records that match that Template.
- A SubTemplateMultiList contains a series of Template IDs, each followed by Records that match that Template ID.
An IPFIX stream exists in a Transport Session, where a Transport Session is part of a Session Group. All Sessions in a Session Group use the same Transport Protocol, and only differ in the numeric Observation Domain that is part of the Message Header.
- package silk
SiLK file formats, data types, and methods to read them, including support for reading them from Spark.
RWRec is the type of SiLK flow records.
You can use RWRecReader to read SiLK files from Scala, including compressed files if Hadoop native libraries are available. For example:
import org.cert.netsa.io.silk.RWRecReader
import java.io.FileInputStream

val inputFile = new FileInputStream("path/to/silk/rw/file")
for ( rec <- RWRecReader.ofInputStream(inputFile) ) {
  println(rec.sIP)
}
- See also
org.cert.netsa.mothra.datasources.silk.flow for working with SiLK data in Spark using the Mothra SiLK datasource.
This is documentation for Mothra, a collection of Scala and Spark library functions for working with Internet-related data. Some modules contain APIs of general use to Scala programmers. Some modules make those tools more useful on Spark data-processing systems.
Please see the documentation for the individual packages for more details on their use.
Scala Packages
These packages are useful in Scala code without involving Spark:
org.cert.netsa.data
This package, which is collected as the netsa-data library, provides types for working with various kinds of information:
- org.cert.netsa.data.net - types for working with network data
- org.cert.netsa.data.time - types for working with time data
- org.cert.netsa.data.unsigned - types for working with unsigned integral values
org.cert.netsa.io.ipfix
The netsa-io-ipfix library provides tools for reading and writing IETF IPFIX data from various connections and files.
org.cert.netsa.io.silk
To read and write CERT NetSA SiLK file formats and configuration files, use the netsa-io-silk library.
org.cert.netsa.util
The "junk drawer" of netsa-util so far provides only two features: first, a method for equipping Scala scala.collection.Iterators with exception handling; and second, a way to query the versions of NetSA libraries present in a JVM at runtime.
Spark Packages
These packages require the use of Apache Spark:
org.cert.netsa.mothra.datasources
Spark datasources for CERT file types. This package contains utility features which add methods to Apache Spark DataFrameReader objects, allowing IPFIX and SiLK flows to be opened using simple spark.read... calls. (See the sketch at the end of this page.) The mothra-datasources library contains both IPFIX and SiLK functionality, while mothra-datasources-ipfix and mothra-datasources-silk contain only what's needed for the named datasource.
org.cert.netsa.mothra.analysis
A grab-bag of analysis helper functions and example analyses.
org.cert.netsa.mothra.functions
This single Scala object provides Spark SQL functions for working with network data. It is the entirety of the mothra-functions library.
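As referenced above under org.cert.netsa.mothra.datasources, here is a minimal sketch of reading flow data in a Spark session. The ipfix and silkFlow reader methods are assumptions about the names the datasources package adds to DataFrameReader; consult that package's documentation for the supported forms:

import org.apache.spark.sql.SparkSession
import org.cert.netsa.mothra.datasources._   // assumed: adds reader methods

val spark = SparkSession.builder().appName("mothra-example").getOrCreate()

// Hypothetical calls: load IPFIX and SiLK flow data as DataFrames.
val ipfixDF = spark.read.ipfix("hdfs:///data/ipfix")     // assumed method
val silkDF  = spark.read.silkFlow("hdfs:///data/silk")   // assumed method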