NAME
PySiLK: SiLK in Python
DESCRIPTION
This document describes the features of PySiLK, the SiLK Python extension. It documents the objects and methods that allow one to read, manipulate, and write SiLK Flow records, IPsets, Bags, and Prefix Maps (pmaps) from within Python. PySiLK may be used in a stand-alone Python script or as a plug-in from within the SiLK tools rwfilter(1), rwcut(1), rwgroup(1), rwsort(1), rwstats(1), and rwuniq(1). This document describes the objects and methods that PySiLK provides; the details of using those from within a plug-in are documented in the silkpython(3) manual page.
The SiLK Python extension provides the following functions:
- init_site([filename])
-
Use the given filename as the name of the SiLK site configuration file (see silk.conf(5)). If filename is omitted, the value specified in the environment variable SILK_CONFIG_FILE will be used as the name of the configuration file. If SILK_CONFIG_FILE is not set, the module looks for a file named silk.conf in the following directories: the directory specified in the SILK_DATA_ROOTDIR environment variable; the data root directory that is compiled into SiLK; the directories $SILK_PATH/share/silk/ and $SILK_PATH/share/.
-
This function should not generally be called explicitly unless one wishes to use a non-default site configuration file.
-
The init_site() function can only be called once. Subsequent invocations will raise a RuntimeError exception. Some methods and RWRec members require information from the silk.conf file, and when these methods are called or members accessed, the silk.have_site_config() function is invoked. That function will call init_site() with no argument if it has not yet been called. The affected functions, methods, and attributes are: silk.sensors(), silk.classtypes(), silk.classes(), rwrec.as_dict(), rwrec.classname, rwrec.typename, rwrec.classtype, and rwrec.sensor.
- have_site_config()
-
Return True if the module was able to locate the SiLK configuration file, False otherwise. Implicitly calls init_site() with no argument if it has not yet been called.
- sensors()
-
Return a tuple of valid sensor names. Calls silk.have_site_config().
- classtypes()
-
Return a tuple of valid (class name, type name) tuples. Calls silk.have_site_config().
- classes()
-
Return a tuple of valid class names. Calls silk.have_site_config().
- ipv6_enabled()
-
Return True if SiLK was compiled with IPv6 support, False otherwise.
- initial_tcpflags_enabled()
-
Return True if SiLK was compiled with support for initial TCP flags, False otherwise.
The SiLK Python extension defines the following objects:
- IPAddr
-
A representation of an IP Address.
- IPWildcard
-
A representation of CIDR blocks or SiLK IP wildcard addresses.
- IPSet
-
A representation of a SiLK IPset.
- PrefixMap
-
A representation of a SiLK Prefix Map.
- Bag
-
A representation of a SiLK Bag.
- TCPFlags
-
A representation of TCP flags.
- RWRec
-
A representation of a SiLK Flow record.
- SilkFile
-
A representation of a channel for writing to or reading from SiLK Flow files.
- FGlob
-
An iterable object that allows retrieval of filenames in a SiLK data store.
IPAddr Object
An IPAddr object represents an IPv4 or IPv6 address.
- class IPAddr(address)
-
The constructor takes an address, which must be a string representation of an IPv4 or IPv6 address, an integer representation of the address, or an existing IPAddr object. IPv6 addresses are only accepted if ipv6_enabled() returns True.
-
Examples:
-
>>> addr1 = IPAddr('192.160.1.1')
>>> addr2 = IPAddr('2001:db8::1428:57ab')
>>> addr3 = IPAddr('::ffff:12.34.56.78')
>>> addr4 = IPAddr(0xffffffff)
>>> addr5 = IPAddr(0xffffffffffffffffffffffffffffffff)
>>> addr6 = IPAddr(addr5)
Supported operations and methods:
- addr1 == addr2
-
Return True if addr1 is equal to addr2; False otherwise.
- addr1 != addr2
-
Return False if addr1 is equal to addr2; True otherwise.
- addr1 < addr2
-
Return True if addr1 is less than addr2; False otherwise.
- addr1 <= addr2
-
Return True if addr1 is less than or equal to addr2; False otherwise.
- addr1 >= addr2
-
Return True if addr1 is greater than or equal to addr2; False otherwise.
- addr1 > addr2
-
Return True if addr1 is greater than addr2; False otherwise.
- addr.isipv6()
-
Return True if addr is an IPv6 address, False otherwise.
- addr.ipv6()
-
(DEPRECATED) An alias for isipv6().
- addr.to_ipv6()
-
Convert an IPAddr to an IPv6 address. This is a no-op if the address is already an IPv6 address.
- addr.to_ipv4()
-
Convert an IPAddr to an IPv4 address. This is a no-op if the address is already an IPv4 address. If the address cannot be converted to an IPv4 address, this method will raise a ValueError exception.
- int(addr)
-
Return the integer representation of addr.
- str(addr)
-
Return a human-readable representation of addr in its canonical form.
- addr.padded()
-
Return a human-readable representation of addr which is fully padded with zeroes. With IPv4, it will return a string of the form "xxx.xxx.xxx.xxx". With IPv6, it will return a string of the form "xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx".
Note: All comparison operations use the 128-bit representations of the IP addresses to do the comparison.
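The padded form and the 128-bit comparison rule can be illustrated without PySiLK. The following is a minimal sketch using only Python's standard ipaddress module; the helper names as_128bit and padded are hypothetical, and the sketch assumes that an IPv4 address a.b.c.d is compared as the IPv4-mapped IPv6 address ::ffff:a.b.c.d.

```python
import ipaddress


def as_128bit(text):
    """Return a 128-bit integer suitable for comparing mixed IPv4/IPv6 addresses."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        # Assumption: IPv4 a.b.c.d is treated as ::ffff:a.b.c.d
        return (0xFFFF << 32) | int(addr)
    return int(addr)


def padded(text):
    """Return the address with every octet or hexadectet zero-padded."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        return ".".join("%03d" % int(octet) for octet in str(addr).split("."))
    return addr.exploded


print(padded("192.160.1.1"))
print(padded("2001:db8::1428:57ab"))
print(as_128bit("12.34.56.78") == as_128bit("::ffff:12.34.56.78"))
```

Under that assumption an IPv4 address and its IPv4-mapped IPv6 form compare as equal, which is what comparing 128-bit representations implies.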
IPWildcard Object
An IPWildcard object represents a range or block of IP addresses. The IPWildcard object handles iteration over IP addresses with for x in wildcard.
- class IPWildcard(wildcard)
-
The constructor takes a string representation wildcard of the wildcard address. The string wildcard can be an IP address, an IP with a CIDR notation, an integer, an integer with a CIDR designation, or an entry in SiLK wildcard notation. In SiLK wildcard notation, a wildcard is represented as an IP address in canonical form with each octet (IPv4) or hexadectet (IPv6) represented by one of the following: a value, a range of values, a comma-separated list of values and ranges, or the character 'x' used to represent the entire octet or hexadectet. IPv6 wildcard addresses are only accepted if silk.ipv6_enabled() returns True.
-
Examples:
-
>>> a = IPWildcard('1.2.3.0/24')
>>> b = IPWildcard('ff80::/16')
>>> c = IPWildcard('1.2.3.4')
>>> d = IPWildcard('::FFFF:0102:0304')
>>> e = IPWildcard('16909056')
>>> f = IPWildcard('16909056/24')
>>> g = IPWildcard('1.2.3.x')
>>> h = IPWildcard('1:2:3:4:5:6:7.x')
>>> i = IPWildcard('1.2,3.4,5.6,7')
>>> j = IPWildcard('1.2.3.0-255')
>>> k = IPWildcard('::2-4')
>>> l = IPWildcard('1-2:3-4:5-6:7-8:9-a:b-c:d-e:0-ffff')
Supported operations and methods:
- addr in wildcard
-
Return True if addr is in wildcard, False otherwise.
- addr not in wildcard
-
Return False if addr is in wildcard, True otherwise.
- string in wildcard
-
Return the result of IPAddr(string) in wildcard.
- string not in wildcard
-
Return the result of IPAddr(string) not in wildcard.
- wildcard.isipv6()
-
Return True if wildcard contains IPv6 addresses, False otherwise.
- str(wildcard)
-
Return the string that was used to construct wildcard.
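To make the SiLK wildcard notation concrete, here is a minimal pure-Python sketch that expands an IPv4 wildcard string into the addresses it covers. It does not use PySiLK; expand_field and expand_ipv4_wildcard are hypothetical helpers, they handle only the dotted IPv4 form (values, ranges, comma lists, and 'x'), and they do not validate that values fit in an octet.

```python
from itertools import product


def expand_field(field, max_value=255):
    """Expand one octet spec: a value, a range 'a-b', a comma list, or 'x'."""
    if field.lower() == "x":
        return list(range(max_value + 1))
    values = set()
    for part in field.split(","):
        if "-" in part:
            low, high = part.split("-")
            values.update(range(int(low), int(high) + 1))
        else:
            values.add(int(part))
    return sorted(values)


def expand_ipv4_wildcard(wildcard):
    """Return every dotted-quad address matched by an IPv4 wildcard string."""
    fields = wildcard.split(".")
    if len(fields) != 4:
        raise ValueError("expected four octets: %r" % wildcard)
    octets = [expand_field(field) for field in fields]
    return [".".join(str(o) for o in combo) for combo in product(*octets)]


# '1.2,3.4,5.6,7' reads as octets 1 . {2,3} . {4,5} . {6,7}
print(expand_ipv4_wildcard("1.2,3.4,5.6,7"))
```

Read this way, the example wildcard '1.2,3.4,5.6,7' covers eight addresses, and '1.2.3.x' covers the same 256 addresses as '1.2.3.0-255'.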
IPSet Object
An IPSet object represents a set of IPv4 addresses, as produced by rwset(1) and rwsetbuild(1). IPSets do not yet support IPv6. The IPSet object handles iteration over IP addresses with for x in set, and iteration over CIDR blocks using for x in set.cidr_iter().
- class IPSet([ip_iterable])
-
The constructor creates an empty IPset. If an ip_iterable is supplied as an argument, each member of ip_iterable will be added to the IPset. The ip_iterable may be:
-
an IPAddr object representing an IPv4 address
-
the string representation of a valid IPv4 address
-
the string representation of a 32-bit integer value
-
an IPWildcard object containing IPv4 address(es)
-
the string representation of an IPWildcard
-
an iterable of any combination of the above
-
another IPSet object
Other constructors, all class methods:
- load(path)
-
Create an IPSet by reading a SiLK IPset file. path must be a valid location of an IPset.
Supported operations and methods:
In the lists of operations and methods below,
-
set is an IPSet object
-
addr can be an IPAddr object or the string representation of an IPv4 address
-
set2 is an IPSet object. The operator versions of the methods require an IPSet object.
-
ip_iterable is an iterable over IPv4 addresses as accepted by the IPSet constructor. Consider ip_iterable as creating a temporary IPSet to perform the requested method.
The following operations and methods do not modify the IPSet:
- set.cardinality()
-
Return the cardinality of set.
- len(set)
-
Return the cardinality of set. This method will raise OverflowError if there are too many IPs in the set, that is, when the number of IPs in the set will not fit into Python's Plain Integer type. The cardinality() method will not raise this exception.
- addr in set
-
Return True if addr is a member of set; False otherwise.
- addr not in set
-
Return False if addr is a member of set; True otherwise.
- set.copy()
-
Return a new IPSet with a copy of set.
- set <= set2
- set.issubset(ip_iterable)
-
Return True if every IP address in set is also in set2. Return False otherwise.
- set >= set2
- set.issuperset(ip_iterable)
-
Return True if every IP address in set2 is also in set. Return False otherwise.
- set | set2
- set.union(ip_iterable)
-
Return a new IPset containing the IP addresses in set and set2.
- set & set2
- set.intersection(ip_iterable)
-
Return a new IPset containing the IP addresses common to set and set2.
- set - set2
- set.difference(ip_iterable)
-
Return a new IPset containing the IP addresses in set but not in set2.
- set ^ set2
- set.symmetric_difference(ip_iterable)
-
Return a new IPset containing the IP addresses in either set or in set2 but not in both.
- set.cidr_iter()
-
Return an iterator over the CIDR blocks in set. Each iteration returns a 2-tuple, the first element of which is the first IP address in the block, the second of which is the prefix length of the block. Can be used as for (addr, prefix) in set.cidr_iter().
- set.save(filename)
-
Save the contents of set in the file filename.
The following operations and methods will modify the IPSet:
- set.add(addr)
-
Add addr to set and return set. To add multiple IP addresses, use the update() method.
- set.discard(addr)
-
Remove addr from set if addr is present; do nothing if it is not. Return set. To discard multiple IP addresses, use the difference_update() method.
- set.remove(addr)
-
Similar to discard(), but raises KeyError if addr is not a member of set.
- set.clear()
-
Remove all IP addresses from set and return set.
- set |= set2
- set.update(ip_iterable)
-
Add the IP addresses specified in set2 to set; the result is the union of set and set2.
- set &= set2
- set.intersection_update(ip_iterable)
-
Remove from set any IP address that does not appear in set2; the result is the intersection of set and set2.
- set -= set2
- set.difference_update(ip_iterable)
-
Remove from set any IP address found in set2; the result is the difference of set and set2.
- set ^= set2
- set.symmetric_difference_update(ip_iterable)
-
Update set, keeping the IP addresses found in set or in set2 but not in both.
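The (address, prefix length) pairs described for cidr_iter() can be pictured with the standard library alone. The sketch below is not PySiLK code; cidr_blocks is a hypothetical helper that uses ipaddress.collapse_addresses to reduce a collection of IPv4 addresses to the smallest list of CIDR blocks and yields each block as a 2-tuple in the same shape.

```python
import ipaddress


def cidr_blocks(addresses):
    """Yield (first_address, prefix_length) for the minimal CIDR cover."""
    # Treat every address as a /32 network, then merge adjacent networks.
    networks = [ipaddress.ip_network(addr) for addr in addresses]
    for net in ipaddress.collapse_addresses(networks):
        yield (str(net.network_address), net.prefixlen)


ips = ["10.0.0.0", "10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.8"]
for first, prefix in cidr_blocks(ips):
    print("%s/%d" % (first, prefix))
```

Four consecutive addresses starting on a /30 boundary collapse into one block, while the isolated address stays a /32.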
RWRec Object
An RWRec object represents a SiLK Flow record.
- class RWRec([rec],[field=value],...)
-
This constructor creates an empty RWRec object. If an RWRec rec is supplied, the constructor will create a copy of it. The variable rec can also be a dictionary, such as that supplied by the as_dict() method. Initial values for record fields can be included. Note that setting or accessing certain attributes on an RWRec causes silk.have_site_config() to be invoked; that function will call silk.init_site() with no argument if it has not yet been called.
-
Example:
-
>>> recA = RWRec(input=10, output=20)
>>> recB = RWRec(recA, output=30)
>>> (recA.input, recA.output)
(10, 20)
>>> (recB.input, recB.output)
(10, 30)
Instance attributes:
- rec.application
-
The service port of the flow rec as set by the flow collector if the collector supports it, an integer. The default application value is 0.
- rec.bytes
-
The count of the number of bytes in the flow rec, an integer. The default bytes value is 0.
- rec.classname
-
(READ ONLY) The class name assigned to the flow rec, a string. Calls silk.have_site_config(). The default classname is ?. The classname cannot be modified by itself. In order to modify the classname, you also need to modify the typename. See the rec.classtype attribute.
- rec.classtype
-
A tuple of the classname and the typename of the flow rec. Calls silk.have_site_config().
- rec.dip
-
The destination IP of the flow rec, an IPAddr object. The default dip value is IPAddr('0.0.0.0'). May be set using a string containing a valid IP address.
- rec.dport
-
The destination port of the flow rec, an integer. The default dport value is 0.
- rec.duration
-
The duration of the flow rec, a datetime.timedelta object. The default duration value is 0. Changing the rec.duration attribute will modify the rec.etime attribute such that (rec.etime - rec.stime) == the new rec.duration. See also rec.duration_secs.
- rec.duration_secs
-
The duration of the flow rec in seconds, a float. The default duration_secs value is 0. Changing the rec.duration_secs attribute will modify the rec.etime attribute in the same way as changing rec.duration.
- rec.etime
-
The end time of the flow rec, a datetime.datetime object. The default etime value is the UNIX epoch time, datetime.datetime(1970,1,1,0,0). Changing the rec.etime attribute modifies the flow record's duration. If the new duration is larger than RWRec supports, an OverflowError will be raised. See also rec.etime_epoch_secs.
- rec.etime_epoch_secs
-
The end time of the flow rec as a number of seconds since the epoch time, a float. Epoch time is 1970-01-01 00:00:00. The default etime_epoch_secs value is 0. Changing the rec.etime_epoch_secs attribute modifies the flow record's duration. If the new duration is larger than RWRec supports, an OverflowError will be raised.
- rec.initflags
-
The TCP flags on the first packet of the flow rec, a TCPFlags object. The default initflags value is None. The rec.initflags attribute may be set to a new TCPFlags object, or a string or number which can be converted to a TCPFlags object by the TCPFlags() constructor.
- rec.icmpcode
-
The ICMP code of the flow rec (only valid if rec.protocol is 1), an integer. The default icmpcode value is 0.
- rec.icmptype
-
The ICMP type value of the flow rec (only valid if rec.protocol is 1), an integer. The default icmptype value is 0.
- rec.input
-
The SNMP interface where the flow rec entered the router, an integer. The default input value is 0.
- rec.nhip
-
The next-hop IP of the flow rec as set by the router, an IPAddr object. The default nhip value is IPAddr('0.0.0.0'). May be set using a string containing a valid IP address.
- rec.output
-
The SNMP interface where the flow rec exited the router, an integer. The default output value is 0.
- rec.packets
-
The packet count for the flow rec, an integer. The default packets value is 0.
- rec.protocol
-
The IP protocol of the flow rec, an integer. The default protocol value is 0.
- rec.restflags
-
The union of the flags of all but the first packet in the flow rec, a TCPFlags object. The default restflags value is None. The rec.restflags attribute may be set to a new TCPFlags object, or a string or number which can be converted to a TCPFlags object by the TCPFlags() constructor.
- rec.sensor
-
The name of the sensor where the flow rec was collected, a string. Calls silk.have_site_config(). The default sensor value is ?.
- rec.sip
-
The source IP of the flow rec, an IPAddr object. The default sip value is IPAddr('0.0.0.0'). May be set using a string containing a valid IP address.
- rec.sport
-
The source port of the flow rec, an integer. The default sport value is 0.
- rec.stime
-
The start time of the flow rec, a datetime.datetime object. The default stime value is the UNIX epoch time, datetime.datetime(1970,1,1,0,0). Modifying the rec.stime attribute will modify the flow's end time such that the rec.duration is constant. See also rec.stime_epoch_secs.
- rec.stime_epoch_secs
-
The start time of the flow rec as a number of seconds since the epoch time, a float. Epoch time is 1970-01-01 00:00:00. The default stime_epoch_secs value is 0. Changing the rec.stime_epoch_secs attribute will modify the flow's end time such that the rec.duration is constant.
- rec.tcpflags
-
The union of the TCP flags of all packets in the flow rec, a TCPFlags object. The default tcpflags value is TCPFlags(' '). The rec.tcpflags attribute may be set to a new TCPFlags object, or a string or number which can be converted to a TCPFlags object by the TCPFlags() constructor.
- rec.timeout_killed
-
Whether the flow rec was closed early due to timeout by the collector, a boolean. The default timeout_killed value is None.
- rec.timeout_started
-
Whether the flow rec is a continuation from a timed-out flow, a boolean. The default timeout_started value is None.
- rec.typename
-
(READ ONLY) The type name of the flow rec, a string. Calls silk.have_site_config(). The default typename is '255'. The typename cannot be modified by itself. In order to modify the typename, you also need to modify the classname. See the rec.classtype attribute.
Supported operations and methods:
- rec.is_web()
-
Return True if rec can be represented as a web record, False otherwise.
- rec.as_dict()
-
Return a dictionary representing the contents of rec. Calls silk.have_site_config().
- str(rec)
-
Return the string representation of rec.as_dict().
- rec1 == rec2
-
Return True if rec1 is structurally equivalent to rec2.
- rec1 != rec2
-
Return True if rec1 is not structurally equivalent to rec2.
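The coupling between stime, duration, and etime described in the attribute list can be modelled in a few lines. The following is a sketch only, not the RWRec implementation: FlowTimes is a hypothetical class that stores the start time and duration and derives the end time, which reproduces the documented behavior that changing stime keeps the duration constant while changing etime changes the duration. It does not model the OverflowError raised for durations that are too large.

```python
from datetime import datetime, timedelta


class FlowTimes:
    """Hypothetical model of the stime/duration/etime relationship."""

    def __init__(self):
        self.stime = datetime(1970, 1, 1)   # default: UNIX epoch
        self.duration = timedelta(0)        # default: zero duration

    @property
    def etime(self):
        # The end time is always derived from start time plus duration.
        return self.stime + self.duration

    @etime.setter
    def etime(self, value):
        # Setting the end time changes the duration, not the start time.
        self.duration = value - self.stime


times = FlowTimes()
times.stime = datetime(2020, 1, 1, 12, 0, 0)
times.duration = timedelta(seconds=30)
times.stime = datetime(2020, 1, 1, 13, 0, 0)   # etime moves with stime
times.etime = datetime(2020, 1, 1, 13, 1, 0)   # duration becomes 60 seconds
print(times.stime, times.duration, times.etime)
```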
SilkFile Object
A SilkFile object represents a channel for writing to or reading from SiLK Flow files. A SiLK file open for reading can be iterated over using for rec in file.
- class SilkFile(filename, mode, compression=DEFAULT, notes=[], invocations=[])
-
The constructor takes a filename, a mode, and a set of optional keyword parameters. The mode should be a constant value such as READ or WRITE.
-
The filename should be the path to the file to open. A few filenames are treated specially. The filename stdin maps to the standard input stream when the mode is READ. The filenames stdout and stderr map to the standard output and standard error streams respectively when the mode is WRITE. A filename consisting of a single hyphen (-) maps to the standard input if the mode is READ, and to the standard output if the mode is WRITE.
-
The compression parameter can be one of the following constants:
- DEFAULT
-
Use the default compression scheme compiled into SiLK.
- NO_COMPRESSION
-
Use no compression.
- ZLIB
-
Use zlib block compression (as used by gzip(1)).
- LZO1X
-
Use lzo1x block compression.
-
If notes or invocations are set, they should be lists of strings. These add annotation and invocation headers to the file. These values can be viewed with the rwfileinfo(1) program.
Examples:
>>> myinputfile = SilkFile('/path/to/file', READ)
>>> myoutputfile = SilkFile('/path/to/file', WRITE,
...                         compression=LZO1X,
...                         notes=['My output file',
...                                'another annotation'])
Instance methods:
- file.read()
-
Return an RWRec representing the next record in the SilkFile file. If there are no records left in the file, return None.
- file.write(rec)
-
Write the RWRec rec to the SilkFile file. Return None.
- file.next()
-
A SilkFile object is its own iterator. For example, iter(file) returns file. When the SilkFile is used as an iterator, the next() method is called repeatedly. This method returns the next record, or raises StopIteration once the end of file is reached.
- file.notes()
-
Return the list of annotation headers for the file as a list of strings.
- file.invocations()
-
Return the list of invocation headers for the file as a list of strings.
- file.close()
-
Close the file and return None.
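The difference between read() and iteration (None at end of file versus StopIteration) follows the ordinary Python iterator protocol. The sketch below shows that protocol on a stand-in object; RecordChannel is a hypothetical class backed by an in-memory list and has nothing to do with real SiLK files.

```python
class RecordChannel:
    """Hypothetical stand-in for a record stream opened for reading."""

    def __init__(self, records):
        self._records = iter(records)

    def read(self):
        # Return the next record, or None once the records are exhausted.
        return next(self._records, None)

    def __iter__(self):
        # The object is its own iterator.
        return self

    def __next__(self):
        rec = self.read()
        if rec is None:
            raise StopIteration
        return rec

    next = __next__  # Python 2 spelling of the same method


channel = RecordChannel(["rec1", "rec2"])
for rec in channel:
    print(rec)
print(channel.read())
```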
PrefixMap Object
A PrefixMap object represents an immutable mapping from IPv4 addresses or protocol/port pairs to labels. PrefixMap objects are created from SiLK prefix map files as created by rwpmapbuild(1).
- class PrefixMap(filename)
-
The constructor creates a prefix map initialized from the filename. The PrefixMap object will be of one of the two subtypes of PrefixMap: an AddressPrefixMap or a ProtoPortPrefixMap.
Supported operations and methods:
- pmap[key]
-
Return the string label associated with key in pmap. key must be of the correct type: either an IPv4 IPAddr if pmap is an AddressPrefixMap, or a 2-tuple of integers (protocol, port), if pmap is a ProtoPortPrefixMap. The method raises TypeError when the type of the key is incorrect, and it raises ValueError when an IPv6 IPAddr is used as a key for an AddressPrefixMap.
- pmap.get(key[,default])
-
Return the string label associated with key in pmap. Return the value default if key is not in pmap, or if key is of the wrong type or value to be a key for pmap. The default value of default is None.
- pmap.labels()
-
Return a tuple of the labels defined by the PrefixMap pmap.
- pmap.iterranges()
-
Return an iterator that will iterate over ranges of contiguous values with the same label. The return values of the iterator will be the 3-tuple (start, end, label), where start is the first element of the range, end is the last element of the range, and label is the label for that range.
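The (start, end, label) triples returned by iterranges() suggest a simple mental model for lookups: a sorted list of ranges searched by key. The sketch below is an illustration of that model only; RangeLabels is a hypothetical class, and raising KeyError for an uncovered key is a choice made for this sketch, not documented PrefixMap behavior.

```python
import bisect


class RangeLabels:
    """Hypothetical immutable mapping built from (start, end, label) ranges."""

    def __init__(self, ranges):
        # Ranges are assumed to be non-overlapping; end is inclusive.
        self._ranges = sorted(ranges)
        self._starts = [start for start, _, _ in self._ranges]

    def __getitem__(self, key):
        index = bisect.bisect_right(self._starts, key) - 1
        if index >= 0:
            start, end, label = self._ranges[index]
            if start <= key <= end:
                return label
        raise KeyError(key)

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def labels(self):
        return tuple(sorted({label for _, _, label in self._ranges}))

    def iterranges(self):
        return iter(self._ranges)


# Keys here are (protocol, port) pairs, as for a ProtoPortPrefixMap.
pmap = RangeLabels([
    ((6, 0), (6, 1023), "tcp-low"),
    ((6, 1024), (6, 65535), "tcp-high"),
    ((17, 53), (17, 53), "dns"),
])
print(pmap[(6, 80)], pmap.get((17, 54)), pmap.labels())
```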
Bag Object
A Bag object is a representation of a multiset. Each key represents a potential element in the set, and the key's value represents the number of times that key is in the set.
- class Bag([mapping][,key_type=IPAddr])
-
The constructor creates a bag of type key_type. The default key_type is IPAddr. Objects of class key_type must be constructable from an integer, and possess an __int__() method which retrieves that integer from the object.
-
If mapping is included, the bag is initialized from that mapping. Valid mappings are:
-
a Bag
-
a key/value dictionary
-
an iterable of key/value pairs
Other constructors, all class methods:
- Bag.ipaddr(mapping)
-
Creates a Bag using IPAddr as the key_type (IP address bag). Equivalent to Bag(mapping, key_type = IPAddr).
- Bag.integer(mapping)
-
Creates a Bag using long as the key_type (integer bag). Equivalent to Bag(mapping, key_type = long).
- Bag.load(path[, key_type=IPAddr])
-
Creates a Bag by reading a SiLK bag file. path must be a valid location of a bag. key_type is used as in the Bag constructor. key_type defaults to IPAddr.
- Bag.load_ipaddr(path)
-
Creates an IP address bag from a SiLK bag file. Equivalent to Bag.load(path, key_type = IPAddr).
- Bag.load_integer(path)
-
Creates an integer bag from a SiLK bag file. Equivalent to Bag.load(path, key_type = long).
Supported operations and methods:
In the lists of operations and methods below,
-
bag is a Bag object
-
key is an object which is a subtype of the working Bag's key_type
-
bag2 is a Bag object
-
key2 is an object which is a subtype of the working Bag's key_type
-
value is an integer which represents the number of items of a particular key that are in the bag
-
ipset is an IPSet object
-
ipwildcard is an IPWildcard object
Bags contain the following attribute:
- key_type
-
The class which represents the type of keys in this bag. Objects of this class must be constructable from an integer, and possess an __int__() method which retrieves that integer from the object.
The following operations and methods do not modify the Bag:
- bag.copy()
-
Return a new Bag which is a copy of bag.
- bag[key]
-
Return the number of elements key in bag.
- bag[key:key2]
-
Return a new Bag which contains only the elements in the key range [key, key2).
- bag[ipset]
-
Return a new Bag which contains only elements that are also contained in ipset. This is only valid for IP address bags.
- bag[ipwildcard]
-
Return a new Bag which contains only elements that are also contained in ipwildcard. This is only valid for IP address bags.
- key in bag
-
Return True if bag[key] is non-zero, False otherwise.
- bag.get(key[, default=None])
-
Return bag[key] if key is in bag, otherwise return default. default defaults to None.
- bag.items()
-
Return a list of (key, value) pairs for all keys in bag with non-zero values. This list is guaranteed to be sorted in int(key) order.
- bag.iteritems()
-
Return an iterator over (key, value) pairs for all keys in bag with non-zero values. This iterator is guaranteed to iterate over items in int(key) order.
- bag.keys()
-
Return a list of keys for all keys in bag with non-zero values. This list is guaranteed to be sorted in int(key) order.
- bag.iterkeys()
-
Return an iterator over keys for all keys in bag with non-zero values. This iterator is guaranteed to iterate over keys in int(key) order.
- bag.values()
-
Return a list of values for all keys in bag with non-zero values. This list is guaranteed to be sorted in int(key) order.
- bag.itervalues()
-
Return an iterator over values for all keys in bag with non-zero values. This iterator is guaranteed to iterate over values in int(key) order.
- bag.group_iterator(bag2)
-
Return an iterator over keys and values of a pair of Bags. For each key which is in either bag or bag2, this iterator will return a (key, value, value2) triple, where value is bag.get(key), and value2 is bag2.get(key). This iterator is guaranteed to iterate over triples in int(key) order.
- bag + bag2
-
Add two bags together. Return a new Bag for which newbag[key] = bag[key] + bag2[key] for all keys in bag and bag2. Will raise an OverflowError if the resulting value for a key is greater than 2^64 - 1.
- bag - bag2
-
Subtract two bags. Return a new Bag for which newbag[key] = bag[key] - bag2[key] for all keys in bag and bag2, as long as the resulting value for that key would be non-negative. If the resulting value for a key would be negative, the value of that key will be zero.
- bag.min(bag2)
-
Return a new Bag for which newbag[key] = min(bag[key], bag2[key]) for all keys in bag and bag2.
- bag.max(bag2)
-
Return a new Bag for which newbag[key] = max(bag[key], bag2[key]) for all keys in bag and bag2.
- bag.div(bag2)
-
Divide two bags. Return a new Bag for which newbag[key] = bag[key] / bag2[key], rounded to the nearest integer, for all keys in bag and bag2, as long as bag2[key] is non-zero. When bag2[key] is zero, newbag[key] = 0.
- bag * integer
- integer * bag
-
Multiply a bag by a scalar. Return a new Bag for which newbag[key] = bag[key] * integer for all keys in bag.
- bag.intersect(set_like)
-
Return a new Bag which contains bag[key] for each key where key in set_like is true.
- bag.complement_intersect(set_like)
-
Return a new Bag which contains bag[key] for each key where key in set_like is not true.
- bag.ipset()
-
Return an IPSet consisting of the set of IP address key values from bag with positive values. This only works if bag is an IP address bag.
- bag.inversion()
-
Return a new integer Bag for which all values from bag are inserted as key elements. Hence, if two keys in bag have a value of 5, newbag[5] will be equal to two.
- bag == bag2
-
Return True if the contents of bag are equivalent to the contents of bag2, False otherwise.
- bag != bag2
-
Return False if the contents of bag are equivalent to the contents of bag2, True otherwise.
The following operations and methods will modify the Bag:
- bag.clear()
-
Empty bag, such that bag[key] is zero for all keys.
- bag[key] = value
-
Set the number of key in bag to value.
- del bag[key]
-
Remove key from bag, such that bag[key] is zero.
- bag.update(mapping)
-
For each key in mapping, set the value for that key in bag to the mapping's value.
-
Valid mappings are:
-
a Bag
-
a key/value dictionary
-
an iterable of key/value pairs
- bag.add(key[, key2[, ...]])
-
Add each key to bag. This is the same as incrementing the value for each key by one.
- bag.add(iterable)
-
Add each key in iterable to bag. This is the same as incrementing the value for each key by one.
- bag.remove(key[, key2[, ...]])
-
Remove one of each key from bag. This is the same as decrementing the value for each key by one.
- bag.remove(iterable)
-
Remove one of each key in iterable from bag, essentially decrementing the value for each key by one.
- bag.incr(key, value = 1)
-
Increment the number of key in bag by value. value defaults to one.
- bag.decr(key, value = 1)
-
Decrement the number of key in bag by value. value defaults to one.
- bag += bag2
-
Equivalent to bag = bag + bag2, unless an OverflowError is raised, in which case bag is no longer necessarily valid. When an error is not raised, this operation takes less memory than bag = bag + bag2.
- bag -= bag2
-
Equivalent to bag = bag - bag2. This operation takes less memory than bag = bag - bag2.
- bag *= integer
-
Equivalent to bag = bag * integer, unless an OverflowError is raised, in which case bag is no longer necessarily valid. When an error is not raised, this operation takes less memory than bag = bag * integer.
- bag.constrain_values(min = None, max = None)
-
Remove key from bag if that key's value is less than min, or greater than max. At least one of min or max must be specified.
- bag.constrain_keys(min = None, max = None)
-
Remove key from bag if that key is less than min, or greater than max. At least one of min or max must be specified.
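Several of the Bag operations have close analogues in Python's collections.Counter, which can help when reasoning about their results. The sketch below uses Counter as a stand-in and is not PySiLK code: keys are plain strings rather than IPAddr objects, and the 2^64 - 1 overflow limit is not modelled. It shows addition, subtraction clamped at zero, per-key minimum and maximum, and the inversion described for bag.inversion().

```python
from collections import Counter

bag = Counter({"10.0.0.1": 5, "10.0.0.2": 1})
bag2 = Counter({"10.0.0.1": 2, "10.0.0.2": 4, "10.0.0.3": 7})

total = bag + bag2      # per-key sum, like bag + bag2
diff = bag - bag2       # results that would be negative read back as zero
lowest = bag & bag2     # per-key minimum, like bag.min(bag2)
highest = bag | bag2    # per-key maximum, like bag.max(bag2)

# Inversion: each value becomes a key counting how many keys had that value.
counts = Counter({"a": 5, "b": 5, "c": 1})
inversion = Counter(counts.values())

print(total, diff, lowest, highest, inversion)
```

As with a Bag, looking up a key that is absent from a Counter yields zero, so diff["10.0.0.2"] is 0 even though the raw subtraction would give a negative number.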
TCPFlags Object
A TCPFlags object represents the eight bits of flags from a TCP session.
- class TCPFlags(value)
-
The constructor takes either a TCPFlags value, a string, or an integer. If a TCPFlags value, it returns a copy of that value. If an integer, the integer should represent the 8-bit representation of the flags. If a string, the string should consist of a concatenation of zero or more of the characters F, S, R, P, A, U, E, and C (upper- or lower-case), representing the FIN, SYN, RST, PSH, ACK, URG, ECE, and CWR flags. Spaces in the string are ignored.
-
Examples:
-
>>> a = TCPFlags('SA')
>>> b = TCPFlags(5)
Instance attributes (read-only):
- flags.FIN
-
True if the FIN flag is set on flags, False otherwise.
- flags.SYN
-
True if the SYN flag is set on flags, False otherwise.
- flags.RST
-
True if the RST flag is set on flags, False otherwise.
- flags.PSH
-
True if the PSH flag is set on flags, False otherwise.
- flags.ACK
-
True if the ACK flag is set on flags, False otherwise.
- flags.URG
-
True if the URG flag is set on flags, False otherwise.
- flags.ECE
-
True if the ECE flag is set on flags, False otherwise.
- flags.CWR
-
True if the CWR flag is set on flags, False otherwise.
Supported operations and methods:
- ~flags
-
Return the bitwise inversion (not) of flags.
- flags1 & flags2
-
Return the bitwise intersection (and) of the flags from flags1 and flags2.
- flags1 | flags2
-
Return the bitwise union (or) of the flags from flags1 and flags2.
- flags1 ^ flags2
-
Return the bitwise exclusive disjunction (xor) of the flags from flags1 and flags2.
- int(flags)
-
Return the integer value of the flags set in flags.
- str(flags)
-
Return a string representation of the flags set in flags.
- flags.padded()
-
Return a string representation of the flags set in flags. This representation will be padded with spaces such that flags will line up if printed above each other.
- flags
-
When used in a setting that expects a boolean, return True if any flag value is set in flags. Return False otherwise.
- flags.matches(flagmask)
-
Given flagmask, a string of the form high_flags/mask_flags, return True if the flags of flags match high_flags after being masked with mask_flags; False otherwise. Given a flagmask without the slash (/), return True if flags matches high_flags, as if mask_flags contained all flags.
Constants:
The following constants are defined:
- FIN
-
A TCPFlags value with only the FIN flag set
- SYN
-
A TCPFlags value with only the SYN flag set
- RST
-
A TCPFlags value with only the RST flag set
- PSH
-
A TCPFlags value with only the PSH flag set
- ACK
-
A TCPFlags value with only the ACK flag set
- URG
-
A TCPFlags value with only the URG flag set
- ECE
-
A TCPFlags value with only the ECE flag set
- CWR
-
A TCPFlags value with only the CWR flag set
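The high_flags/mask_flags test performed by matches() is plain bit arithmetic, shown here as a minimal sketch on integers rather than TCPFlags objects. The helpers parse_flags and matches are hypothetical; the bit values are those of the TCP header flags byte (FIN is the lowest bit, CWR the highest), which is consistent with TCPFlags(5) meaning FIN plus RST.

```python
# Bit positions of the flags within the TCP header flags byte.
FLAG_BITS = {"F": 0x01, "S": 0x02, "R": 0x04, "P": 0x08,
             "A": 0x10, "U": 0x20, "E": 0x40, "C": 0x80}


def parse_flags(text):
    """Convert a string such as 'SA' to its 8-bit integer value."""
    value = 0
    for char in text.upper():
        if char == " ":
            continue  # spaces are ignored
        value |= FLAG_BITS[char]
    return value


def matches(flags, flagmask):
    """Return True if flags, masked by mask_flags, equals high_flags."""
    if "/" in flagmask:
        high_text, mask_text = flagmask.split("/", 1)
        mask = parse_flags(mask_text)
    else:
        # Without a slash, behave as if the mask contained all flags.
        high_text, mask = flagmask, 0xFF
    return (flags & mask) == parse_flags(high_text)


# 'S/SA' asks: of SYN and ACK, is exactly SYN set?
print(matches(parse_flags("S"), "S/SA"))
print(matches(parse_flags("SA"), "S/SA"))
```

Flags outside the mask are ignored, so a value with SYN and PSH set still matches 'S/SA'.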
FGlob Object
An FGlob object is an iterable object that iterates over filenames from a SiLK data store. It does this internally by calling the rwfglob(1) program. The FGlob object assumes that the rwfglob program is in the PATH; if rwfglob cannot be found, an exception is raised when the object is used.
- class FGlob(classname=None, type=None, sensors=None, start_date=None, end_date=None, data_rootdir=None, site_config_file=None)
-
Although all arguments have defaults, at least one of classname, type, sensors, start_date must be specified. The arguments are:
- classname
-
if given, should be a string representing the class name. If not given, defaults based on the site configuration file, silk.conf(5).
- type
-
if given, can be either a string representing a type name or comma-separated list of type names, or can be a list of strings representing type names. If not given, defaults based on the site configuration file, silk.conf.
- sensors
-
if given, should be either a string representing a comma-separated list of sensor names or IDs, an integer representing a sensor ID, or a list of strings or integers representing sensor names or IDs. If not given, defaults to all sensors.
- start_date
-
if given, should be either a string in the format YYYY/MM/DD[:HH], a date object, a datetime object (which will be used to the precision of one hour), or a time object (which is used for the given hour on the current date). If not given, defaults to the start of the current day.
- end_date
-
if given, should be either a string in the format YYYY/MM/DD[:HH], a date object, a datetime object (which will be used to the precision of one hour), or a time object (which is used for the given hour on the current date). If not given, defaults to start_date. end_date cannot be used without a start_date.
- data_rootdir
-
if given, should be a string representing the directory in which to find the packed SiLK data files. If not given, defaults to the value in the SILK_DATA_ROOTDIR environment variable or the compiled-in default.
- site_config_file
-
if given, should be a string representing the path of the site configuration file, silk.conf. If not given, defaults to the value in the SILK_CONFIG_FILE environment variable or $SILK_DATA_ROOTDIR/silk.conf.
An FGlob object can be used as a standard iterator. For example:
for filename in FGlob(classname="all", start_date="2005/09/22"):
    for rec in SilkFile(filename):
        ...
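The start_date and end_date arguments accept date and datetime objects directly, so no conversion is needed in practice. The sketch below only illustrates how such objects relate to the YYYY/MM/DD[:HH] string form, with a datetime reduced to the precision of one hour. The helper to_silk_date() is hypothetical and is not part of PySiLK.

```python
from datetime import date, datetime


def to_silk_date(value):
    """Hypothetical helper: format a date or datetime as YYYY/MM/DD[:HH]."""
    # datetime is a subclass of date, so it must be tested first.
    if isinstance(value, datetime):
        # Minutes and seconds are dropped; only the hour is kept.
        return value.strftime("%Y/%m/%d:%H")
    if isinstance(value, date):
        return value.strftime("%Y/%m/%d")
    raise TypeError("expected a date or datetime object")


print(to_silk_date(datetime(2005, 9, 22, 14, 37)))  # 2005/09/22:14
print(to_silk_date(date(2005, 9, 22)))              # 2005/09/22
```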
silk.plugin
silk.plugin is a module to support using PySiLK code as a plug-in to the rwfilter(1), rwcut(1), rwgroup(1), rwsort(1), rwstats(1), and rwuniq(1) applications. The module defines the following methods, which are described in the silkpython(3) manual page:
- register_switch(switch_name, handler=handler, [arg=needs_arg], [help=help_string])
-
Define the command line switch --switch_name that can be used by the PySiLK plug-in.
- register_filter(filter, [finalize=finalize], [initialize=initialize])
-
Register the callback function filter that can be used by rwfilter to specify whether the flow record passes or fails.
- register_field(field_name, [add_rec_to_bin=add_rec_to_bin,] [bin_compare=bin_compare,] [bin_bytes=bin_bytes,] [bin_merge=bin_merge,] [bin_to_text=bin_to_text,] [column_width=column_width,] [description=description,] [initial_value=initial_value,] [initialize=initialize,] [rec_to_bin=rec_to_bin,] [rec_to_text=rec_to_text])
-
Define the new key field or aggregate value field named field_name. Key fields can be used in rwcut, rwgroup, rwsort, rwstats, and rwuniq. Aggregate value fields can be used in rwstats and rwuniq. Creating a field requires specifying one or more callback functions; the functions required depend on the application(s) where the field will be used. To simplify field creation for common field types, the remaining functions can be used instead.
- register_int_field(field_name, int_function, min, max, [width])
-
Create the key field field_name whose value is an unsigned integer.
- register_ipv4_field(field_name, ipv4_function, [width])
-
Create the key field field_name whose value is an IPv4 address.
- register_ip_field(field_name, ipv4_function, [width])
-
Create the key field field_name whose value is an IPv4 or IPv6 address.
- register_enum_field(field_name, enum_function, width, [ordering])
-
Create the key field field_name whose value is a Python object (often a string).
- register_int_sum_aggregator(agg_value_name, int_function, [max_sum], [width])
-
Create the aggregate value field agg_value_name that maintains a running sum as an unsigned integer.
- register_int_max_aggregator(agg_value_name, int_function, [max_max], [width])
-
Create the aggregate value field agg_value_name that maintains the maximum unsigned integer value.
- register_int_min_aggregator(agg_value_name, int_function, [max_min], [width])
-
Create the aggregate value field agg_value_name that maintains the minimum unsigned integer value.
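The callbacks passed to these registration functions are ordinary Python functions that take a flow record. The sketch below shows what two such callbacks might look like and exercises them outside of any SiLK tool. FakeRec is a hypothetical stand-in for an RWRec, the field name "bytes_per_pkt" and the chosen limits are illustrative, and the assumption that a filter callback returns True to pass a record should be checked against silkpython(3). The registration calls appear only as comments because they are meaningful only inside a plug-in loaded by a SiLK tool.

```python
from collections import namedtuple

# Hypothetical stand-in for an RWRec, with only the attributes used below.
FakeRec = namedtuple("FakeRec", "protocol dport bytes packets")


def is_large_web_flow(rec):
    """Filter callback: assumed to return True when the record passes."""
    return rec.protocol == 6 and rec.dport in (80, 443) and rec.bytes > 10000


def bytes_per_packet(rec):
    """Integer callback for a key field: average bytes per packet."""
    return rec.bytes // rec.packets if rec.packets else 0


# Inside a plug-in file, the callbacks would then be registered using the
# signatures listed above, for example:
#   register_filter(is_large_web_flow)
#   register_int_field("bytes_per_pkt", bytes_per_packet, 0, 0xFFFFFFFF)

rec = FakeRec(protocol=6, dport=443, bytes=48000, packets=40)
print(is_large_web_flow(rec), bytes_per_packet(rec))  # True 1200
```

Keeping the callbacks free of registration side effects makes them easy to test against synthetic records before they are loaded into rwfilter or rwcut.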
EXAMPLE
The following is an example using the PySiLK bindings. The code is meant to show some standard PySiLK techniques, but is not otherwise meant to be useful. Explanations for the code can be found in-line in the comments.
#!/usr/bin/env python
# Import the PySiLK bindings
from silk import *
# Import sys for the command line arguments.
import sys

# Main function
def main():
    if len(sys.argv) != 3:
        print("Usage: %s infile outset" % sys.argv[0])
        sys.exit(1)
    # Open a SiLK file for reading
    infile = SilkFile(sys.argv[1], READ)
    # Create an empty IPset
    destset = IPSet()
    # Loop over the records in the file
    for rec in infile:
        # Do comparisons based on rwrec field value
        if (rec.protocol == 6 and rec.sport in [80, 8080] and
                rec.packets > 3 and rec.bytes > 120):
            # Add the dest IP of the record to the IPset
            destset.add(rec.dip)
    # Save the IPset for future use
    try:
        destset.save(sys.argv[2])
    except Exception:
        sys.exit("Unable to write to %s" % sys.argv[2])
    # Count the items in the set
    count = 0
    for addr in destset:
        count = count + 1
    print("%d addresses" % count)
    # Another way to do the same
    print("%d addresses" % len(destset))
    # Print the IP blocks in the set
    for base_prefix in destset.cidr_iter():
        print("%s/%d" % base_prefix)

# Call the main() function when this program is started
if __name__ == '__main__':
    main()
ENVIRONMENT
The following environment variables affect the tools in the SiLK tool suite.
- SILK_CONFIG_FILE
-
This environment variable contains the location of the site configuration file, silk.conf. This variable will be used by silk.init_site() if no argument is passed to that method.
- SILK_DATA_ROOTDIR
-
This variable gives the root of the directory tree where the data store of SiLK Flow files is maintained, overriding the location that is compiled into the tools. This variable will be used by the FGlob constructor unless an explicit data_rootdir value is specified. In addition, silk.init_site() may search for the site configuration file, silk.conf, in this directory.
- SILK_PATH
-
This environment variable gives the root of the directory tree where the tools are installed. As part of its search for the SiLK site configuration file, the silk.init_site() method checks for a file named silk.conf in the directories $SILK_PATH/share/silk and $SILK_PATH/share.
- PYTHONPATH
-
This is the search path that Python uses to find modules and extensions. The SiLK Python extension described in this document may be installed outside Python's installation tree; for example, in SiLK's installation tree. It may be necessary to set or modify the PYTHONPATH environment variable so Python can find the SiLK extension.
- PATH
-
This is the standard search path for executable programs. The FGlob constructor will invoke the rwfglob program; the directory containing rwfglob should be included in the PATH.
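The search order that silk.init_site() follows when it is called with no argument can be summarized in a short sketch. This is an illustration of the documented order only, not the code PySiLK runs: candidate_site_configs() is a hypothetical helper, and compiled_root is a placeholder for the data root directory compiled into SiLK, whose actual value depends on the installation.

```python
import os


def candidate_site_configs(environ, compiled_root="/data"):
    """List silk.conf locations in the documented search order.

    compiled_root is a placeholder for the compiled-in data root.
    """
    # An explicit SILK_CONFIG_FILE takes precedence over any search.
    explicit = environ.get("SILK_CONFIG_FILE")
    if explicit:
        return [explicit]
    candidates = []
    data_root = environ.get("SILK_DATA_ROOTDIR")
    if data_root:
        candidates.append(os.path.join(data_root, "silk.conf"))
    candidates.append(os.path.join(compiled_root, "silk.conf"))
    silk_path = environ.get("SILK_PATH")
    if silk_path:
        candidates.append(os.path.join(silk_path, "share", "silk", "silk.conf"))
        candidates.append(os.path.join(silk_path, "share", "silk.conf"))
    return candidates


# Show the locations that would be examined for the current environment.
for path in candidate_site_configs(os.environ):
    print(path)
```

Printing the candidates this way can help when diagnosing why a script picks up an unexpected site configuration file.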
SEE ALSO
silkpython(3), rwfglob(1), rwfileinfo(1), rwfilter(1), rwcut(1), rwpmapbuild(1), rwset(1), rwsetbuild(1), rwsort(1), rwstats(1), rwuniq(1), silk.conf(5), silk(7), python(1), http://docs.python.org/


