object util
A collection of analysis helper functions for working with network data in Spark.
Value Members
- val APPLICATION_LABEL_FIELD: String
- val DNS_RESOURCE_RECORD_TYPE_FIELD: String
- lazy val MOTHRA_COLLECTORS_DF: Option[DataFrame]
- val ORGANIZATION_LABEL_FIELD: String
- val TLS_CIPHER_SUITE_FIELD: String
- val app_label_name: Column
- def app_labels(labels: Any*): Column
- def append_collection_labels(df: DataFrame, attributes: String*): DataFrame
Append one or more collection attribute labels to a dataframe based on the observationDomainId and vlanId fields in the data.
- df: input dataframe
Example: append_collection_labels(df, "enclave", "department", "organization")
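A fuller sketch of plausible usage (a sketch, assuming this object's members are in scope and that df is a flow DataFrame carrying the observationDomainId and vlanId fields; the attribute names follow the example above):
  // Label each flow with its enclave and department, then count flows per enclave.
  val labeled = append_collection_labels(df, "enclave", "department")
  labeled.groupBy("enclave").count().show()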
- def collect_column(df: DataFrame, c: Column): Array[Any]
An array containing the values of a column expression of a dataframe.
- def collect_column(df: DataFrame): Array[Any]
An array containing the values from the first column of a dataframe.
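For example (a sketch; sourceIPAddress is a hypothetical stand-in for a real field in df):
  import org.apache.spark.sql.functions.col
  // Values of a column expression of df.
  val ips: Array[Any] = collect_column(df, col("sourceIPAddress"))
  // Values from the first column of a one-column dataframe.
  val same: Array[Any] = collect_column(df.select("sourceIPAddress"))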
- def collection_labels(attribute: String, labels: String*): Column
Given the name of a collection attribute and a set of matching labels, produce a filter expression that can be used to filter a DataFrame for flows matching the given labels.
Example: df.filter(collection_labels("enclave", "IT1A", "DEV2B")) will find flows in either the IT1A or DEV2B enclaves.
- def collector_labels(labels: String*): Column
Given a set of collector labels, returns a set of filter expressions for filtering records from those collectors.
Example: df.filter(collector_labels("IT1", "DEV2"))
- def compare_field(field: String, condition: String, value: Any): Column
Convenience method for building filter conditions using comparison operators.
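For example (a sketch; the field name and the exact set of supported condition strings are assumptions, since they are not documented here):
  // Hypothetical: keep only TCP flows (IANA protocol number 6).
  df.filter(compare_field("protocolIdentifier", "==", 6))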
- def count_by_collection_labels(df: DataFrame, attributes: String*): DataFrame
Group data by one or more collection attributes and count the number of flows within each group.
- df: input dataframe
Example: count_by_collection_labels(df, "enclave", "department")
- def dateRange(from: LocalDate, to: LocalDate, step: Int = 1): Iterator[LocalDate]
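For example (a sketch; whether the to endpoint is inclusive is not stated here):
  import java.time.LocalDate
  // Walk the days of January 2018 one day at a time.
  dateRange(LocalDate.of(2018, 1, 1), LocalDate.of(2018, 1, 31)).foreach(println)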
- def department_labels(labels: String*): Column
Given a set of department labels, returns a set of filter expressions for filtering records from those departments.
Example: df.filter(department_labels("IT", "DEV"))
- val dns_rr_types: Column
- val dns_rr_types_udf: UserDefinedFunction
- val email_addr: UserDefinedFunction
- val email_addrs: UserDefinedFunction
- val email_display: UserDefinedFunction
- val email_displays: UserDefinedFunction
- val email_domain: UserDefinedFunction
- val email_domains: UserDefinedFunction
- val email_header_addrs: UserDefinedFunction
- val email_header_displays: UserDefinedFunction
- val email_header_domains: UserDefinedFunction
- val email_header_mailboxes: UserDefinedFunction
- val email_mailbox: UserDefinedFunction
- val email_mailboxes: UserDefinedFunction
- def enclave_labels(labels: String*): Column
Given a set of enclave labels, returns a set of filter expressions for filtering records from those enclaves.
Example: df.filter(enclave_labels("IT1A", "DEV2B"))
- def filter_array_field_list(df: DataFrame, field: String, list: Seq[Any]): DataFrame
- def filter_by_field(df: DataFrame, field: String, condition: String, value: Any): DataFrame
Filter a DataFrame on the passed field, value and condition.
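For example (a sketch; the field name and condition string are assumptions):
  // Hypothetical: keep only UDP flows (IANA protocol number 17).
  val udpOnly = filter_by_field(df, "protocolIdentifier", "==", 17)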
- def filter_by_times(df: DataFrame, start: String, end: String): DataFrame
Return a DataFrame containing only flows starting at or after the passed start time and before or at the passed end time. Empty strings can be passed for the start or end times to avoid filtering on that field.
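For example (a sketch; the expected timestamp format is an assumption based on the stime_range example below):
  // Flows in a one-hour window.
  val window = filter_by_times(df, "2018-01-01 12:00:00", "2018-01-01 13:00:00")
  // Empty end string: no upper bound on start time.
  val openEnded = filter_by_times(df, "2018-01-01 12:00:00", "")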
- def filter_field_list(df: DataFrame, field: String, list: Seq[Any]): DataFrame
Filter a DataFrame, keeping only rows where the value or values in the given field are contained in the given list.
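For example (a sketch; the field name is hypothetical):
  // Keep only flows to common web ports.
  val webFlows = filter_field_list(df, "destinationTransportPort", Seq(80, 443, 8080))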
- def ip_is_external(a: Any): Column
True if the input is a string representing an IP address not in the internal set. False if it is an IP address that is in the internal set. NULL if the value cannot be parsed as an IP address.
- def ip_is_internal(a: Any): Column
True if the input is a string representing an IP address in the internal set. False if it is an IP address that is not in the internal set. NULL if the value cannot be parsed as an IP address.
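For example (a sketch; how the internal set is configured is not described here, and the column names are hypothetical):
  import org.apache.spark.sql.functions.col
  // Flows arriving at internal addresses vs. leaving for external ones.
  val toInternal = df.filter(ip_is_internal(col("destinationIPAddress")))
  val toExternal = df.filter(ip_is_external(col("destinationIPAddress")))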
- def load_csv_file(filepath: String): DataFrame
Import data from a CSV file and return a DataFrame containing the imported data.
- def load_hive_table(tablename: String): DataFrame
Import data from a Hive table and return a DataFrame containing the imported data.
- def load_ipset(infile: String): DataFrame
Load a SiLK IPset file into a DataFrame of individual IP addresses.
- def load_ipset_blocks(infile: String): DataFrame
Load a SiLK IPset file into a DataFrame of IP address blocks.
- def make_timestamp(t: Timestamp): Timestamp
- def organization_labels(labels: String*): Column
Given a set of organization labels, returns a set of filter expressions for filtering records from those organizations.
Example: df.filter(organization_labels("ORG1", "ORG2"))
- def save_csv_file(df: DataFrame, outfile: String): Unit
Write data in an existing DataFrame to a CSV file. The output CSV file will have headers corresponding to the column names of the DataFrame.
- def save_hive_table(df: DataFrame, tablename: String): Unit
Write data in an existing DataFrame to a Hive table.
- def save_ipset(outfile: String, df: DataFrame): Unit
Save a DataFrame of IP addresses or blocks as a SiLK IPset file.
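A sketch tying the load and save helpers together (all file paths and table names are hypothetical):
  // SiLK IPset -> DataFrame of individual addresses, saved as CSV with headers.
  val ips = load_ipset("internal.set")
  save_csv_file(ips, "internal_ips.csv")
  // Round-trip through Hive.
  val flows = load_hive_table("flows_2018")
  save_hive_table(flows.limit(1000), "flows_2018_sample")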
- def sip_dip_direction(sip: Any, dip: Any): Column
If both arguments are strings parsable as IP addresses, and if the first argument (sip) represents the initiator of a connection and the second argument (dip) represents the recipient of a connection, returns:
- "in" if the initiator is external and the recipient is internal
- "out" if the initiator is internal and the recipient is external
- "int2int" if both are internal
- "ext2ext" if both are external
- NULL if either value is not parsable as an IP address
- val ssl_rewrite_col: UserDefinedFunction
- def ssl_rewrite_df(df: DataFrame): DataFrame
- def stime_date(date: String): Column
Given a string in the form of yyyy-mm-dd representing a date, returns a filter expression that can be used to filter a DataFrame for records with a start time on that date.
Example: df.filter(stime_date("2018-01-01"))
- def stime_range(begin: String = null, end: String = null): Column
Given string arguments representing the begin and end of a time range, returns a filter expression that can be used to filter a DataFrame for records with a start time within that range.
Example: df.filter(stime_range("2018-01-01 12:00:00", "2018-01-01 13:00:00"))
- val tls_cipher_descs: Column
- val tls_cipher_descs_udf: UserDefinedFunction
- def withSensorIds[T](input: Dataset[T], sensorIds: Seq[Int]): Dataset[T]
- val yaf_ssl_object_type: UserDefinedFunction
This is documentation for Mothra, a collection of Scala and Spark library functions for working with Internet-related data. Some modules contain APIs of general use to Scala programmers. Some modules make those tools more useful on Spark data-processing systems.
Please see the documentation for the individual packages for more details on their use.
Scala Packages
These packages are useful in Scala code without involving Spark:
org.cert.netsa.data
This package, which is collected as the netsa-data library, provides types for working with various kinds of information:
- org.cert.netsa.data.net - types for working with network data
- org.cert.netsa.data.time - types for working with time data
- org.cert.netsa.data.unsigned - types for working with unsigned integral values
org.cert.netsa.io.ipfix
The netsa-io-ipfix library provides tools for reading and writing IETF IPFIX data from various connections and files.
org.cert.netsa.io.silk
To read and write CERT NetSA SiLK file formats and configuration files, use the netsa-io-silk library.
org.cert.netsa.util
The "junk drawer" of netsa-util so far provides only two features: first, a method for equipping Scala scala.collection.Iterators with exception handling; and second, a way to query the versions of NetSA libraries present in a JVM at runtime.
Spark Packages
These packages require the use of Apache Spark:
org.cert.netsa.mothra.datasources
Spark datasources for CERT file types. This package contains utility features which add methods to Apache Spark DataFrameReader objects, allowing IPFIX and SiLK flows to be opened using simple spark.read... calls. The mothra-datasources library contains both IPFIX and SiLK functionality, while mothra-datasources-ipfix and mothra-datasources-silk contain only what's needed for the named datasource.
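A minimal sketch of what such a call might look like (an assumption based on the description above rather than a confirmed API; the import, reader method name, and path are illustrative):
  import org.cert.netsa.mothra.datasources._
  // Hypothetical: read IPFIX records into a DataFrame via the added reader method.
  val flows = spark.read.ipfix("hdfs:///data/ipfix/2018/01/01")
  flows.printSchema()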
org.cert.netsa.mothra.analysis
A grab-bag of analysis helper functions and example analyses.
org.cert.netsa.mothra.functions
This single Scala object provides Spark SQL functions for working with network data. It is the entirety of the mothra-functions library.