S3 Directory Structure

The Main UNX-OBP Bucket:

scripts/regenerate_protocol_baseline.py

<protocol>/baseline_traffic/ contains baseline files named:

<protocol>__<iso-date>__<uuid4>, e.g., smb__2022-03-08__1a584418-ca79-4147-9fc2-8cd2abcb6454

where filename elements are separated by two underscores.

Each line of a baseline file is a pipe-delimited record of the form:

stime|src-org|sensor-id|sip|dip|sport|dport|protocol|applabel|packets|bytes|duration|asn_info

where asn_info → netblk|asn|cc|rir|org
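
For illustration, here is a minimal shell sketch that splits one such record into named fields (the field order follows the format above; the sample record values are hypothetical):

record='2022-03-08T14:02:11.512|ABC|9050|90.5.0.203|20.36.163.72|47547|445|6|139|111|67270|331.688|20.36.0.0/14|8075|us|arin|EXAMPLE-ORG'
# 12 top-level fields plus the 5 asn_info subfields = 17 pipe-delimited values
IFS='|' read -r stime src_org sensor_id sip dip sport dport protocol applabel \
  packets bytes duration netblk asn cc rir org <<< "$record"
echo "${stime} ${sip}:${sport} -> ${dip}:${dport} applabel=${applabel} org=${org}"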

Queue Message Structures

Incoming SQS Queue

The incoming queue receives messages from NiFi. Each message is a JSON object with certain expected fields. The field names generally follow the conventions of the rwcut --fields options, with a few variations:

An additional required field, unx_obp_proto, must be present; it is added to each record by NiFi. Its value is the lowercase name of the protocol that the flow should represent, e.g., "unx_obp_proto": "smb".

The minimum required fields in any incoming message are given in the following example JSON object:

{
  "unx_obp_proto": "smb",
  "sTime": "2022-04-11T01:19:56.621",
  "sensor": 9050,
  "sIP": "90.5.0.203",
  "dIP": "20.36.163.72",
  "sPort": 47547,
  "dPort": 445,
  "protocol": 6,
  "application": 139,
  "bytes": 67270,
  "packets": 111,
  "duration": 331.688
}

Finally, the format of "sTime" must be standard ISO format. This can be configured in the properties of the relevant JsonRecordSetWriter service in NiFi. Specifically, you'll need to set the "Timestamp Format" property to "yyyy-MM-dd'T'HH:mm:ss.SSS" (without the double quotes). Note the lowercase yyyy: in Java date patterns, uppercase YYYY denotes the week-based year and can produce incorrect dates around year boundaries.
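
To sanity-check the message format outside of NiFi, you can post a single test message to the incoming queue with the AWS CLI (a sketch; replace the queue URL placeholder with your deployment's incoming queue URL):

aws sqs send-message \
  --queue-url "https://sqs.<aws_region>.amazonaws.com/<account_id>/<incoming_queue_name>" \
  --message-body '{"unx_obp_proto": "smb", "sTime": "2022-04-11T01:19:56.621", "sensor": 9050, "sIP": "90.5.0.203", "dIP": "20.36.163.72", "sPort": 47547, "dPort": 445, "protocol": 6, "application": 139, "bytes": 67270, "packets": 111, "duration": 331.688}'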

Outgoing SQS Queue

The outgoing queue messages are read by NiFi. Each message is a JSON object with certain expected fields. Required fields include:

stime|src-org|sensor-id|sip|dip|sport|dport|protocol|applabel|packets|bytes|duration|asn_info

where asn_info → netblk|asn|cc|rir|org
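
For illustration only, an outgoing message might look like the following (the exact JSON key names, and whether asn_info is nested or flattened, are assumptions based on the field list above; the values are hypothetical):

{
  "stime": "2022-04-11T01:19:56.621",
  "src-org": "ABC",
  "sensor-id": 9050,
  "sip": "90.5.0.203",
  "dip": "20.36.163.72",
  "sport": 47547,
  "dport": 445,
  "protocol": 6,
  "applabel": 139,
  "packets": 111,
  "bytes": 67270,
  "duration": 331.688,
  "asn_info": {
    "netblk": "20.36.0.0/14",
    "asn": 8075,
    "cc": "us",
    "rir": "arin",
    "org": "EXAMPLE-ORG"
  }
}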

Infrastructure Dependencies

Depending on your specific environment and requirements, you will likely need to modify the provided Terraform to suit your needs and/or perform a manual deployment of some UNX-OBP resources.

Networking

All components were deployed and tested in AWS within a single basic VPC with various VPC endpoints.

As every cloud network environment is different, exactly how you enable access to and between the components listed below is up to you; it is your responsibility to align that access with your environment's networking practices.

IAM

The permissions (API Actions) required for each component to interact with other resources are listed below. Terraform will create the necessary IAM roles/policies for you, but it is ultimately your responsibility to review/adjust/align them to your environment's IAM practices (e.g., nomenclature, level of granularity, etc.).

OpenSearch / Elasticsearch

During testing, an internal VPC-attached AWS-managed Elasticsearch service was used, with no domain access policy. Thus, there are no Elasticsearch permissions listed below; only VPC access and security groups were used to manage access from other cloud resources.

You likely have an existing Elasticsearch cluster somewhere. Your Elasticsearch deployment and access policies will be different, and the relevant components of the analytic will need to be adjusted for that. This might consist of:

  - adjusting the IAM policies attached to the IAM role(s) of certain Lambda functions,
  - adjusting the authentication values for the REST API requests in the function code,
  - configuring a proxy or the use of a certain Certificate Authority in the function code, and/or
  - removing the Elasticsearch Terraform resources and explicitly setting the Terraform es_domain variable and the ES_DOMAIN environment variable to your existing Elasticsearch domain endpoint.

NiFi

You likely have an existing NiFi cluster. Review the Component Table below to establish the necessary IAM policy for your cluster and other UNX-OBP resources.

DynamoDB

All DynamoDB tables are set up and tested using PAY_PER_REQUEST billing mode (on-demand capacity mode), so there is no read or write throttling. Changing this is NOT recommended, for two reasons:

  1. If provisioned capacity is not set correctly, throttling will likely slow or completely stall the state machine executions, causing them to fail. It will also greatly slow down the nightly baseline regeneration process.
  2. For this use case, the overall number of read/write requests is generally predictable and is accounted for in the overall cost estimates. You generally do not need to worry about significant swings in read/write requests, and therefore sudden increases in DynamoDB costs, because the number of requests is largely driven by the amount of per-protocol outbound traffic, which tends not to experience significant swings.
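
If you want to confirm the billing mode of a deployed table, a quick check with the AWS CLI (using the unx-obp-Org-Info table from Step 3 as an example):

aws dynamodb describe-table \
  --table-name unx-obp-Org-Info \
  --query 'Table.BillingModeSummary.BillingMode' \
  --region <aws_region>

This should return "PAY_PER_REQUEST".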

Lambda

All Lambda functions are written in Python and use the Python 3.9 runtime.

In addition to any required permissions listed below, all Lambda function IAM roles should include standard logging permissions:

    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }

For VPC-attached Lambda functions, the following standard permissions should be included as well:

    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateNetworkInterface",
        "ec2:DeleteNetworkInterface",
        "ec2:DescribeNetworkInterfaces"
      ],
      "Resource": "*"
    }

Secrets Manager

This service holds the user IDs ("uid") and secret keys ("skey") used for the Censys.io and RiskIQ PassiveTotal API lookups. You will deploy these secrets manually in any case.
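
As a sketch, creating one of these secrets from the AWS CLI might look like the following (the secret name and JSON key layout are assumptions; use whatever names the relevant Lambda function code expects):

aws secretsmanager create-secret \
  --name unx-obp/censys \
  --secret-string '{"uid": "<censys-api-id>", "skey": "<censys-api-secret>"}'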

Initial Loads

1 - Deploy Infrastructure

Deploy the infrastructure/services using the provided Terraform.

2 - Prepare the Elasticsearch Cluster and Kibana

This step consists of uploading an index template, importing Kibana Saved Objects, and loading initial documents to certain indices.

Load the UNX-OBP Index Template

The index template essentially controls the field data type mappings and number of primary and replica shards created for each matching index prefix. This index template covers all indices used by the capability. Those include the following:

Due to the relatively low volume of documents that should be in any one index at any given time, there only needs to be a single primary shard for each index. Nonetheless, you may wish to adjust the number of replica shards according to the number of data nodes in your Elasticsearch cluster and/or to improve search performance. This index template currently uses the Elasticsearch defaults:

You can adjust these at the bottom of the file, if desired.

Use the Elasticsearch Domain Endpoint (ES_DOMAIN) from the Terraform output...

curl -XPUT ${ES_DOMAIN}/_template/template_unx-obp \
  --data @unx-obp-ecs-catchall-template.json \
  -H 'Content-Type: application/json'
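
To verify the template was registered (the template name matches the PUT above):

curl -s ${ES_DOMAIN}/_template/template_unx-obp?pretty | head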

Import Saved Kibana Objects

The provided file, unx-obp-kibana-saved-objects-export-<date>-<revision>.ndjson, contains exported Kibana Saved Objects including index patterns, saved searches, visualizations, and dashboards.

Import this file in the Kibana interface via Stack Management > Kibana Saved Objects > Import.

Perform Initial Load of ASN Info

The provided file, asn-info-initial-load.json, contains some long-lived ASN Info documents, the loading of which will also create/prime the unx-obp-asn-info index.

yesterday=$(date --date="1 day ago" +"%Y-%m-%d")
sed -i "s/YYYY-MM-DD/$yesterday/g" asn-info-initial-load.json
curl -XPOST ${ES_DOMAIN}/_bulk \
  --data-binary @asn-info-initial-load.json \
  -H 'Content-Type: application/json'

Confirm the documents loaded by using the Discover tab in Kibana and looking at the unx-obp-asn-info index over the last 24 hours. Change the time picker to the last 7 days if nothing shows up on the first attempt.
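
Alternatively, a quick document count from the command line (using the index name above):

curl -s ${ES_DOMAIN}/unx-obp-asn-info/_count?pretty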

Perform Initial Load of CSP Info

Assuming that the unx-obp-update-csp-info-lambda-function-scheduled function was set up correctly in Step 1, you can simply execute it manually via the AWS Management Console using an empty test event.

Any fatal errors should be evident from within the same interface. Once complete, confirm there are documents loaded by using the Discover tab in Kibana and looking at the unx-obp-csp-info index for the last 24 hours.
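
If you prefer the command line, the same function can be invoked with an empty event (a sketch; the function name is taken from this step and may differ in your deployment, and the --cli-binary-format flag assumes AWS CLI v2):

aws lambda invoke \
  --function-name unx-obp-update-csp-info-lambda-function-scheduled \
  --payload '{}' \
  --cli-binary-format raw-in-base64-out \
  response.json && cat response.json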

3 - Load Org Info to DynamoDB

This step should be performed any time there is an update to the silk.conf.

The major prerequisite for this step is to have available the latest silk.conf with specially annotated sensor-descriptions.

Each sensor-description must be annotated with org_name and org_parent fields, in the following format:

sensor 101 ABC1 "org_name:ABC,org_parent:DAFT"
sensor 102 ABC2 "org_name:ABC,org_parent:DAFT"
sensor 199 XYZ1 "org_name:XYZ,org_parent:none"
...
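
As a quick sanity check that every sensor definition carries the annotation (a sketch, assuming one sensor definition per line):

grep -c '^sensor ' latest_annotated_silk.conf
grep -c 'org_name:' latest_annotated_silk.conf
# The two counts should match.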

With the latest annotated silk.conf available, use the provided Python script, silk-site2star.py, to parse the file and generate a file of JSON objects suitable for upload to DynamoDB.

python3 silk-site2star.py \
  --silk-conf=latest_annotated_silk.conf dynamodb > ddb-org-info-items.json

Now use the provided Python script, load_org_info.py, to load the previously generated items into DynamoDB. You will need to run this from a machine/shell that has AWS credentials with the appropriate permissions and the AWS SDK for Python, boto3, installed (pip3 install boto3).

See Terraform outputs for table_prefix and aws_region values.

python3 load_org_info.py \
  -r ddb-org-info-items.json -t <table_prefix> --region <aws_region>

Confirm items loaded by exploring the items in the unx-obp-Org-Info table via the DynamoDB page of the AWS Management Console.
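
Alternatively, a quick item count from the command line:

aws dynamodb scan \
  --table-name unx-obp-Org-Info \
  --select COUNT \
  --region <aws_region>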

4 - Load Allow List and Explicit Deny List Entries to DynamoDB

This step consists of running the provided Python script, load_list.py, with the provided allowlist.json and xdenylist.json files to convert entries to unique DynamoDB items for fast lookups. You will need to run this from a machine/shell that has AWS credentials with the appropriate permissions and the AWS SDK for Python, boto3, installed (pip3 install boto3).

See Terraform outputs for table_prefix and aws_region values.

python3 load_list.py -a -r allowlist.json -t <table_prefix> --region <aws_region>
python3 load_list.py -x -r xdenylist.json -t <table_prefix> --region <aws_region>

Confirm items loaded by exploring the items in each table via the DynamoDB page of the AWS Management Console.

5 - Prepare NiFi Cluster

Download the latest NetSA NiFi NAR file and place it into the NiFi lib directory (e.g., /opt/nifi/lib/); this may require a restart of the NiFi service.

Follow the NiFi instructions in Get Started.