Packages

  • package root

    This is documentation for Mothra, a collection of Scala and Spark library functions for working with Internet-related data. Some modules contain APIs of general use to Scala programmers. Some modules make those tools more useful on Spark data-processing systems.

    Please see the documentation for the individual packages for more details on their use.

    Scala Packages

    These packages are useful in Scala code without involving Spark:

    org.cert.netsa.data

    This package, collected as the netsa-data library, provides types for working with various kinds of information.

    org.cert.netsa.io.ipfix

    The netsa-io-ipfix library provides tools for reading and writing IETF IPFIX data from various connections and files.

    org.cert.netsa.io.silk

    To read and write CERT NetSA SiLK file formats and configuration files, use the netsa-io-silk library.

    org.cert.netsa.util

    The "junk drawer" of netsa-util so far provides only two features: a method for equipping Scala scala.collection.Iterators with exception handling, and a way to query the versions of NetSA libraries present in a JVM at runtime.

    Spark Packages

    These packages require the use of Apache Spark:

    org.cert.netsa.mothra.datasources

    Spark datasources for CERT file types. This package contains utility features which add methods to Apache Spark DataFrameReader objects, allowing IPFIX and SiLK flows to be opened using simple spark.read... calls.

    The mothra-datasources library contains both IPFIX and SiLK functionality, while mothra-datasources-ipfix and mothra-datasources-silk contain only what's needed for the named datasource.

    org.cert.netsa.mothra.analysis

    A grab-bag of analysis helper functions and example analyses.

    org.cert.netsa.mothra.functions

    This single Scala object provides Spark SQL functions for working with network data. It is the entirety of the mothra-functions library.

    Definition Classes
    root
  • package datasources

    This package contains the Mothra datasources, along with mechanisms for working with those datasources. The primary novel feature of these datasources is the fields mechanism.

    To use the IPFIX or SiLK data sources, you can use the following methods added by the implicit CERTDataFrameReader on DataFrameReader after importing from this package:

    import org.cert.netsa.mothra.datasources._
    val silkDF = spark.read.silkFlow()                                    // to read from the default SiLK repository
    val silkRepoDF = spark.read.silkFlow(repository="...")                // to read from an alternate SiLK repository
    val silkFilesDF = spark.read.silkFlow("/path/to/silk/files")          // to read from loose SiLK files
    val ipfixDF = spark.read.ipfix(repository="/path/to/mothra/data/dir") // for packed Mothra IPFIX data
    val ipfixS3DF = spark.read.ipfix(s3Repository="bucket-name")          // for packed Mothra IPFIX data from an S3 bucket
    val ipfixFilesDF = spark.read.ipfix("/path/to/ipfix/files")           // for loose IPFIX files

    (The additional methods are defined on the implicit class CERTDataFrameReader.)

    The fields method lets you configure which SiLK or IPFIX fields to retrieve. (This is particularly important for IPFIX data, as IPFIX files may contain many possible fields organized in various ways.)

    import org.cert.netsa.mothra.datasources._
    val silkDF = spark.read.fields("sIP", "dIP").silkFlow(...)
    val ipfixDF = spark.read.fields("sourceIPAddress", "destinationIPAddress").ipfix(...)

    Both of these dataframes will contain only the source and destination IP addresses from the specified data sources. You may also provide column names different from the source field names:

    val silkDF = spark.read.fields("server" -> "sIP", "client" -> "dIP").silkFlow(...)
    val ipfixDF = spark.read.fields("server" -> "sourceIPAddress", "client" -> "destinationIPAddress").ipfix(...)

    You may also mix the mapped and the default names in one call:

    val df = spark.read.fields("sIP", "dIP", "s" -> "sensor").silkFlow(...)
    Definition Classes
    mothra
    See also

    IPFIX datasource

    SiLK flow datasource

  • functions

object functions

A collection of Spark SQL functions for use with network data.

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @native()
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable])
  9. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  10. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. def icmpcode_description(icmpCode: Any, icmpType: Any): Column

    Given an 8-bit numeric ICMP code and the 8-bit numeric ICMP type it belongs to, returns the description given for this code by IANA, if it exists.

  12. def icmptype_description(icmpType: Any): Column

    Given an 8-bit numeric ICMP type, returns the description given for this type by IANA, if it exists.

  13. def icmptype_is_deprecated(icmpType: Any): Column

    Given an 8-bit numeric ICMP type, returns true if IANA considers this ICMP type to be deprecated.

  14. def icmptype_is_reserved(icmpType: Any): Column

    Given an 8-bit numeric ICMP type, returns true if IANA considers this ICMP type to be reserved.

  15. def icmptype_is_unassigned(icmpType: Any): Column

    Given an 8-bit numeric ICMP type, returns true if IANA considers this ICMP type to be unassigned.

  16. def icmptypecode(icmpType: Any, icmpCode: Any): Column

    Given an 8-bit numeric ICMP type and ICMP code, returns a 16-bit numeric ICMP type + code.

    Annotations
    @silent(" shiftLeft .*deprecated")
  17. def icmptypecode_code(icmpTypeCode: Any): Column

    Given a 16-bit numeric ICMP type + code, returns the associated 8-bit numeric ICMP code.

  18. def icmptypecode_description(icmpTypeCode: Any): Column

    Given a 16-bit numeric ICMP type + code, returns a text description of the ICMP type and code, containing the descriptions of the type and code given by IANA.

  19. def icmptypecode_type(icmpTypeCode: Any): Column

    Given a 16-bit numeric ICMP type + code, returns the associated 8-bit numeric ICMP type.

    Annotations
    @silent(" shiftRightUnsigned .*deprecated")
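
    The packing behind icmptypecode, icmptypecode_type and icmptypecode_code can be pictured with plain integers. The sketch below is not the library's implementation; it assumes the type sits in the high byte and the code in the low byte, which is what the shiftLeft and shiftRightUnsigned annotations suggest. The object and method names are hypothetical.

    ```scala
    object IcmpTypeCodeSketch {
      // Pack an 8-bit type and an 8-bit code into one 16-bit value.
      // Assumption: type in the high byte, code in the low byte.
      def pack(icmpType: Int, icmpCode: Int): Int =
        ((icmpType & 0xFF) << 8) | (icmpCode & 0xFF)

      // Recover the 8-bit type from the high byte.
      def typeOf(typeCode: Int): Int = (typeCode >>> 8) & 0xFF

      // Recover the 8-bit code from the low byte.
      def codeOf(typeCode: Int): Int = typeCode & 0xFF
    }
    ```

    Under this layout, type 3 with code 1 packs to 769 (3 * 256 + 1), and unpacking 769 gives back 3 and 1.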
  20. def ipaddr(addr: Any): Column

    Given an IP address in string form, returns the canonical form of that IP address.

  21. def ipaddr_eq(addr1: Any, addr2: Any): Column

    Given two IP addresses in string form, returns true if they represent the same address.

  22. def ipaddr_gt(addr1: Any, addr2: Any): Column

    Given two IP addresses in string form, returns true if the first address is greater than the second.

  23. def ipaddr_gteq(addr1: Any, addr2: Any): Column

    Given two IP addresses in string form, returns true if the first address is greater than or equal to the second.

  24. def ipaddr_in(addr: Any, block: Any): Column

    Given an IP address and an IP block in string form, returns true if the address is contained in the block.
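
    As a rough model of what "contained in the block" means, the following sketch checks CIDR membership on plain strings using only the Java standard library. It is an illustration under assumptions, not Mothra's code: CidrSketch and contains are hypothetical names, the block must be written as address/prefix-length, and addresses of different families never match.

    ```scala
    import java.net.InetAddress

    object CidrSketch {
      // True if addr shares the first prefixLen bits with the block's base address.
      def contains(block: String, addr: String): Boolean =
        block.split("/", 2) match {
          case Array(base, len) =>
            // getByName parses IP literals without a DNS lookup.
            val baseBytes = InetAddress.getByName(base).getAddress
            val addrBytes = InetAddress.getByName(addr).getAddress
            val prefixLen = len.toInt
            val fullBytes = prefixLen / 8 // bytes that must match exactly
            val remBits   = prefixLen % 8 // leading bits of the next byte
            baseBytes.length == addrBytes.length &&
              prefixLen >= 0 && prefixLen <= baseBytes.length * 8 &&
              (0 until fullBytes).forall(i => baseBytes(i) == addrBytes(i)) &&
              (remBits == 0 || {
                val mask = (0xFF << (8 - remBits)) & 0xFF
                (baseBytes(fullBytes) & mask) == (addrBytes(fullBytes) & mask)
              })
          case _ =>
            throw new IllegalArgumentException(s"not a CIDR block: $block")
        }
    }
    ```

    For example, CidrSketch.contains("192.168.4.0/22", "192.168.7.255") is true, while the same block does not contain "192.168.8.0".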

  25. def ipaddr_in_collection(addr: Any, collection: Iterable[Any]): Column

    Given an IP address and a collection of IP addresses and blocks in string form, returns true if the address is contained in the collection or any block in the collection.

  26. def ipaddr_is_ipv6(addr: Any): Column

    Given an IP address in string form, returns true if it is an IPv6 address.

  27. def ipaddr_lt(addr1: Any, addr2: Any): Column

    Given two IP addresses in string form, returns true if the first address is less than the second.

  28. def ipaddr_lteq(addr1: Any, addr2: Any): Column

    Given two IP addresses in string form, returns true if the first address is less than or equal to the second.

  29. def ipaddr_ne(addr1: Any, addr2: Any): Column

    Given two IP addresses in string form, returns true if they do not represent the same address.

  30. def ipaddr_normalize(addr: Any): Column

    Given an IP address in string form, returns the canonical form of that IP address.

  31. def ipaddr_sort_key(addr: Any): Column

    Given an IP address in string form, returns a byte array suitable to sort by.
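
    A byte-array key matters because address strings do not sort numerically: as text, "10.0.0.2" comes before "9.0.0.1". The sketch below shows the idea with a hypothetical key (the raw address bytes) and an unsigned byte ordering. The actual key format produced by ipaddr_sort_key is not documented here, so treat the layout, the IPv4-before-IPv6 choice, and all names as assumptions.

    ```scala
    import java.net.InetAddress

    object SortKeySketch {
      // Hypothetical sort key: the raw network-order bytes of the address.
      def key(addr: String): Array[Byte] = InetAddress.getByName(addr).getAddress

      // Orders keys by length first (an arbitrary choice that puts IPv4 before
      // IPv6), then byte by byte, treating each byte as unsigned.
      val unsignedBytes: Ordering[Array[Byte]] = new Ordering[Array[Byte]] {
        def compare(a: Array[Byte], b: Array[Byte]): Int =
          if (a.length != b.length) Integer.compare(a.length, b.length)
          else a.indices.find(i => a(i) != b(i)) match {
            case Some(i) => Integer.compare(a(i) & 0xFF, b(i) & 0xFF)
            case None    => 0
          }
      }

      // Sort address strings by their byte keys rather than by their text.
      def sortAddresses(addrs: Seq[String]): Seq[String] =
        addrs.sortBy(key)(unsignedBytes)
    }
    ```

    With this ordering, Seq("10.0.0.2", "9.0.0.1", "192.168.1.1") sorts to 9.0.0.1, 10.0.0.2, 192.168.1.1, whereas a plain string sort yields 10.0.0.2, 192.168.1.1, 9.0.0.1.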

  32. def ipaddr_to_bytes(addr: Any): Column

    Given an IP address in string form, returns the byte array representation of the address.

  33. def ipblock(block: Any): Column

    Given an IP block in string form, returns the canonical form of that IP block.

  34. def ipblock_contains(block: Any, addr: Any): Column

    Given an IP block and an IP address in string form, returns true if the block contains the address.

  35. def ipblock_eq(block1: Any, block2: Any): Column

    Given two IP blocks in string form, returns true if they represent the same block.

  36. def ipblock_gt(block1: Any, block2: Any): Column

    Given two IP blocks in string form, returns true if the first block as a pair of IP addresses is greater than the second.

  37. def ipblock_gteq(block1: Any, block2: Any): Column

    Given two IP blocks in string form, returns true if the first block as a pair of IP addresses is greater than or equal to the second.

  38. def ipblock_lt(block1: Any, block2: Any): Column

    Given two IP blocks in string form, returns true if the first block as a pair of IP addresses is less than the second.

  39. def ipblock_lteq(block1: Any, block2: Any): Column

    Given two IP blocks in string form, returns true if the first block as a pair of IP addresses is less than or equal to the second.

  40. def ipblock_max(block: Any): Column

    Given an IP block in string form, returns the maximum IP address contained within the block.

  41. def ipblock_min(block: Any): Column

    Given an IP block in string form, returns the minimum IP address contained within the block.
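
    The minimum and maximum addresses of a block follow from its prefix length: clear the host bits for the minimum, set them for the maximum. The sketch below works this out with BigInt. It is an illustration only; BlockBoundsSketch and its methods are hypothetical names, the input is assumed to be in address/prefix-length form, and the text form it prints for IPv6 is Java's, which may differ from the canonical form Mothra returns.

    ```scala
    import java.net.InetAddress

    object BlockBoundsSketch {
      // Returns (minimum address, maximum address) of a CIDR block.
      def bounds(block: String): (String, String) =
        block.split("/", 2) match {
          case Array(base, len) =>
            val bytes     = InetAddress.getByName(base).getAddress
            val bits      = bytes.length * 8
            val prefixLen = len.toInt
            require(prefixLen >= 0 && prefixLen <= bits, "bad prefix length")
            val value    = BigInt(1, bytes)                         // unsigned address value
            val hostMask = (BigInt(1) << (bits - prefixLen)) - 1    // ones in the host bits
            (render(value &~ hostMask, bytes.length),               // host bits cleared
             render(value | hostMask, bytes.length))                // host bits set
          case _ =>
            throw new IllegalArgumentException(s"not a CIDR block: $block")
        }

      // Convert a non-negative BigInt back into a fixed-width address string.
      private def render(v: BigInt, size: Int): String = {
        val raw   = v.toByteArray // may carry a leading sign byte or be shorter than size
        val fixed = new Array[Byte](size)
        val n     = math.min(raw.length, size)
        System.arraycopy(raw, raw.length - n, fixed, size - n, n)
        InetAddress.getByAddress(fixed).getHostAddress
      }
    }
    ```

    For instance, BlockBoundsSketch.bounds("192.168.4.0/22") gives ("192.168.4.0", "192.168.7.255").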

  42. def ipblock_ne(block1: Any, block2: Any): Column

    Given two IP blocks in string form, returns true if they do not represent the same block.

  43. def ipblock_normalize(block: Any): Column

    Given an IP block in string form, returns the canonical form of that IP block.

  44. def ipblock_prefix_length(block: Any): Column

    Given an IP block in string form, returns the length of the common prefix contained in the block.

  45. def ipblock_sort_key(block: Any): Column

    Given an IP block in string form, returns a byte array suitable to sort by.

  46. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  47. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  48. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  49. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  50. def port_service_name(port: Any): Column

    Given the numeric representation of a TCP, UDP, SCTP, or similar port, returns the service name given to that port by IANA, if any.

  51. def proto_keyword(proto: Any): Column

    Given the numeric representation of an IP protocol, returns the keyword given to this protocol by IANA, if any.

  52. def silkattrs_and(attrs1: Any, attrs2: Any): Column

    Given the numeric representations of two sets of SiLK attributes, returns the bitwise and of the attributes.

  53. def silkattrs_continuation(attrs: Any): Column

    Given the numeric representation of a set of SiLK attributes, returns true if the "continuation" attribute is set.

  54. def silkattrs_expanded_flags(attrs: Any): Column

    Given the numeric representation of a set of SiLK attributes, returns true if the "expanded flags" attribute is set.

  55. def silkattrs_fin_followed(attrs: Any): Column

    Given the numeric representation of a set of SiLK attributes, returns true if the "FIN followed" attribute is set.

  56. def silkattrs_is_ipv6(attrs: Any): Column

    Given the numeric representation of a set of SiLK attributes, returns true if the "is IPv6" attribute is set.

  57. def silkattrs_match(attrs: Any, high: Any, mask: Any): Column

    Given the numeric representations of a set of SiLK attributes, a set of attributes that should be set high, and a mask of the attributes to be considered, returns true if the masked set is the same as the masked high bits.

  58. def silkattrs_match_str(attrs: Any, target: Any): Column

    Given the numeric representation of a set of SiLK attributes, and a symbolic string representation of a set of bits to be checked, returns true if the attributes match the specification. The specification may be like "TCF" to indicate that the specified attributes should be set and the other attributes don't matter, or "TC/TCF" to indicate that the first set of specified attributes should be set and any others in the mask should not be set.

  59. def silkattrs_not(attrs: Any): Column

    Given the numeric representation of a set of SiLK attributes, returns the bitwise inverse of the attributes.

  60. def silkattrs_of_string(attrs: Any): Column

    Given a symbolic string representation of a set of SiLK attributes, returns the numeric attribute value.

  61. def silkattrs_or(attrs1: Any, attrs2: Any): Column

    Given the numeric representations of two sets of SiLK attributes, returns the bitwise or of the attributes.

  62. def silkattrs_to_string(attrs: Any): Column

    Given the numeric representation of a set of SiLK attributes, returns a symbolic string representation.

  63. def silkattrs_truncated(attrs: Any): Column

    Given the numeric representation of a set of SiLK attributes, returns true if the "truncated" attribute is set.

  64. def silkattrs_uniform_packet_size(attrs: Any): Column

    Given the numeric representation of a set of SiLK attributes, returns true if the "uniform packet size" attribute is set.

  65. def silkattrs_xor(attrs1: Any, attrs2: Any): Column

    Given the numeric representations of two sets of SiLK attributes, returns the bitwise exclusive or of the attributes.

  66. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  67. def tcpflags_ack(flags: Any): Column

    Given the numeric representation of a set of TCP flags, returns true if the ACK (acknowledgement) flag is set.

  68. def tcpflags_and(flags1: Any, flags2: Any): Column

    Given the numeric representations of two sets of TCP flags, returns the bitwise and of the flags.

  69. def tcpflags_cwr(flags: Any): Column

    Given the numeric representation of a set of TCP flags, returns true if the CWR (congestion window reduced) flag is set.

  70. def tcpflags_ece(flags: Any): Column

    Given the numeric representation of a set of TCP flags, returns true if the ECE (ECN-echo) flag is set.

  71. def tcpflags_fin(flags: Any): Column

    Given the numeric representation of a set of TCP flags, returns true if the FIN (finished) flag is set.

  72. def tcpflags_match(flags: Any, high: Any, mask: Any): Column

    Given the numeric representations of a set of TCP flags, a set of flags that should be set high, and a mask of the flags to be considered, returns true if the masked set is the same as the masked high bits.

  73. def tcpflags_match_str(flags: Any, target: Any): Column

    Given the numeric representation of a set of TCP flags, and a symbolic string representation of a set of bits to be checked, returns true if the flags match the specification. The specification may be like "UASF" to indicate that the specified flags should be set and the other flags don't matter, or "UA/UASF" to indicate that the first set of specified flags should be set and any others in the mask should not be set.
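
    The two forms of the specification can be modeled on plain integers. The sketch below is not the library's parser; it assumes the standard TCP header bit values and one letter per flag (F, S, R, P, A, U, E, C), in the style of the "UASF" example above. TcpFlagsSketch and its methods are hypothetical names.

    ```scala
    object TcpFlagsSketch {
      // Standard TCP header flag bits; the letter for each flag is an assumption.
      private val bits: Map[Char, Int] = Map(
        'F' -> 0x01, 'S' -> 0x02, 'R' -> 0x04, 'P' -> 0x08,
        'A' -> 0x10, 'U' -> 0x20, 'E' -> 0x40, 'C' -> 0x80
      )

      // Turn a letter string such as "SA" into its numeric flag value.
      // Unknown letters throw NoSuchElementException in this sketch.
      def ofString(s: String): Int =
        s.foldLeft(0)((acc, c) => acc | bits(c.toUpper))

      // True if the bits selected by mask equal the wanted high bits.
      def matches(flags: Int, high: Int, mask: Int): Boolean =
        (flags & mask) == (high & mask)

      // "HIGH/MASK" checks HIGH within MASK; a bare "FLAGS" uses FLAGS as both,
      // so flags outside the list are ignored.
      def matchesStr(flags: Int, spec: String): Boolean =
        spec.split("/", 2) match {
          case Array(high, mask) => matches(flags, ofString(high), ofString(mask))
          case Array(only)       => matches(flags, ofString(only), ofString(only))
        }
    }
    ```

    Under these assumptions a SYN+ACK packet (0x12) matches "SA" but not "S/SA", because the second form requires ACK to be clear, while a bare SYN (0x02) matches "S/SA".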

  74. def tcpflags_not(flags: Any): Column

    Given the numeric representation of a set of TCP flags, returns the bitwise inverse of the flags.

  75. def tcpflags_of_string(flags: Any): Column

    Given a symbolic string representation of a set of TCP flags, returns the numeric flag value.

  76. def tcpflags_or(flags1: Any, flags2: Any): Column

    Given the numeric representations of two sets of TCP flags, returns the bitwise or of the flags.

  77. def tcpflags_psh(flags: Any): Column

    Given the numeric representation of a set of TCP flags, returns true if the PSH (push) flag is set.

  78. def tcpflags_rst(flags: Any): Column

    Given the numeric representation of a set of TCP flags, returns true if the RST (reset) flag is set.

  79. def tcpflags_syn(flags: Any): Column

    Given the numeric representation of a set of TCP flags, returns true if the SYN (synchronisation) flag is set.

  80. def tcpflags_to_string(flags: Any): Column

    Given the numeric representation of a set of TCP flags, returns a symbolic string representation.

  81. def tcpflags_urg(flags: Any): Column

    Given the numeric representation of a set of TCP flags, returns true if the URG (urgent) flag is set.

  82. def tcpflags_xor(flags1: Any, flags2: Any): Column

    Given the numeric representations of two sets of TCP flags, returns the bitwise exclusive or of the flags.

  83. def toString(): String
    Definition Classes
    AnyRef → Any
  84. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  85. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  86. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
