A Note on "Unexpected"

Our notion of unexpected is tied to our experience of what happens and what doesn't happen, or the way that certain things happen. It is tied to our notions of what is common.

Common is what has precedent (what happens) and some level of regularity and/or consistency: a phenomenon that occurs within somewhat predictable bounds of frequency, volume, time, etc. We generally expect a phenomenon to keep occurring within or around those bounds. It is generally unexpected when the phenomenon occurs too far outside those bounds, or when a sufficiently new phenomenon occurs. There is a difference between "commonly occurring" and "commonly occurring this way"; the latter can also be thought of as consistency. Think about your expectations, and how little you are surprised, when the sun rises at around the same time each day or the moon moves through its cycles.

We apply these concepts to our outbound views of certain network traffic. We track how frequently we observe certain flow characteristics (as a percentage of days seen in the last X days of baseline network traffic), the volume at which they tend to occur (average flow count per day seen), and when they tend to occur (days of the week, hours of the day). Also, we track application, packets, bytes, and duration tendencies per those same flow characteristics.

Percent days seen is the best reflection of "commonly occurring." All other measurements are more a reflection of "commonly occurring this way"/consistency.
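As a rough illustration of "percent days seen", the sketch below counts the distinct days on which a tuple was observed inside a baseline window and expresses that as a percentage. The function name, the way the window is represented, and the rounding to two decimals are assumptions made for the example; the window length X is left as a parameter.

```python
from __future__ import annotations

from datetime import date, timedelta


def perc_days_seen(days_seen: set[date], window_end: date, window_days: int) -> float:
    """Percentage of days in the baseline window on which the tuple was seen."""
    window_start = window_end - timedelta(days=window_days - 1)
    # Only days that fall inside the window count toward the percentage.
    in_window = {d for d in days_seen if window_start <= d <= window_end}
    return round(100.0 * len(in_window) / window_days, 2)


# Hypothetical usage: a tuple seen on 24 distinct days of a 90-day window.
end = date(2020, 11, 6)
seen = {end - timedelta(days=3 * i) for i in range(24)}
print(perc_days_seen(seen, end, 90))
```

Because the measure counts distinct days rather than flows, a tuple seen once a day and a tuple seen a thousand times a day score the same here; volume is captured by the other measurements.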

Unexpected outbound protocol use occurs when the observed flow characteristics have never been seen before, or when they have been seen before but have only rarely occurred or are sufficiently inconsistent with past occurrences.

So what are the flow characteristics?

The Anchor Flow Characteristics

A Quick Note on Levels of Specificity (from least to most specific)

Source: parent organization -> organization -> sensor -> src-net-block -> sip

Destination: ASN info (number, cc, registry, org) -> dst-net-block -> dip -> dport -> protocol/service

From left to right, each pair typically has a one-to-many relationship. For example:

Importantly, observed outbound dports ultimately depend on what is allowed to egress a network. The observed dports may or may not change often - this is more a reflection of the velocity, fluidity, or lack of egress controls of each source organization/enclave. Ideally, they will be tightly controlled, and you would expect certain outbound protocols to have more stringent rules regarding their use. Either way, we will know about it.

In order to balance efficacy and alert generation, we anchor/scope protocol baseline traffic measurements to different levels of specificity – the Full Anchor Tuple and the Partial Anchor Tuple, which essentially act as unique keys for the calculation and aggregation of other flow characteristics.

Full Anchor Tuple (FAT)

Consists of: src-org|sensor|sip|dip|dport|asn_info where asn_info → dst-netblk|asn|cc|rir|org

This is the most specific set of flow characteristics on which protocol baseline traffic measurements are calculated. It essentially accounts for unique source/destination IP pairs across any combination of src-org, sensor, dport, and asn_info. When a matching tuple exists for an incoming flow, these baseline entries are used for consistency checks and to add context to generated alerts.

Partial Anchor Tuple (PAT)

Consists of: sensor|dport|asn_info where asn_info → dst-netblk|asn|cc|rir|org

This is a slightly more abstract, broader set of flow characteristics on which protocol baseline traffic measurements are calculated. IP addresses are more variable for numerous reasons (e.g., unknown/complex source network architecture, CDNs, load balancers, routing policies, etc.), so we treat variations in sip and/or dip as less significant than variations in sensor|dport|asn_info combinations. These baseline entries ultimately determine whether an incoming flow for a given protocol was NEVER_SEEN_IN_BASELINE or SEEN_BUT_RARELY_OCCURRING (see below). By default, PAT matches that are above the set thresholds are allowed to pass, that is, they do not generate an alert.
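The two tuples can be thought of as pipe-delimited keys built from an enriched flow. The following is a minimal sketch of that idea; the EnrichedFlow class and its attribute names are hypothetical, and only the field order is taken from the "Consists of" definitions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnrichedFlow:
    """Hypothetical container for a flow after org/ASN enrichment."""
    src_org: str
    sensor: str
    sip: str
    dip: str
    dport: int
    dst_netblk: str
    asn: int
    cc: str
    rir: str
    org: str

    def asn_info(self) -> str:
        # asn_info -> dst-netblk|asn|cc|rir|org
        return "|".join([self.dst_netblk, str(self.asn), self.cc, self.rir, self.org])

    def fat(self) -> str:
        # FAT -> src-org|sensor|sip|dip|dport|asn_info
        return "|".join([self.src_org, self.sensor, self.sip, self.dip,
                         str(self.dport), self.asn_info()])

    def pat(self) -> str:
        # PAT -> sensor|dport|asn_info (no source org, no IP addresses)
        return "|".join([self.sensor, str(self.dport), self.asn_info()])


flow = EnrichedFlow("FLDC", "622", "77.128.165.28", "138.128.188.168", 445,
                    "138.128.160.0/19", 33182, "US", "arin", "DIMENOC")
print(flow.fat())
print(flow.pat())
```

Many distinct FATs collapse onto the same PAT, which is what lets the PAT absorb churn in sip and dip.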

The Measurements

Each flow of the baseline traffic is first lightly enriched with source org and destination ASN information.

The following measurements are calculated for each unique FAT and PAT:

The following additional measurements are calculated for each unique PAT:

Example FAT Baseline Entry

{
  "fat": "FLDC|622|77.128.165.28|138.128.188.168|445|138.128.160.0/19|33182|US|arin|DIMENOC",
  "total_days_seen": 24,
  "perc_days_seen": "26.67",
  "spread_days_seen": 72,
  "first_seen": "2020-08-24T19:25:34.904",
  "last_seen": "2020-11-03T16:13:44.259",
  "dow_seen": ["Monday", "Tuesday", "Wednesday", "Friday", "Thursday"],
  "hours_seen": [19, 20, 21, 22, 23, 18, 17, 12, 16, 14],
  "applications_seen": [0],
  "avg_flow_count": "2.33", "std_flow_count": "2.18",
  "avg_packets": "471.64", "std_packets": "428.78",
  "avg_bytes": "613787.875", "std_bytes": "594779.603",
  "avg_duration": "22.501", "std_duration": "15.102",
  "expires_at": 1651780714
}

Example PAT Baseline Entry

{
  "pat": "622|445|138.128.160.0/19|33182|US|arin|DIMENOC",
  "total_days_seen": 87,
  "perc_days_seen": "96.67",
  "spread_days_seen": 90,
  "first_seen": "2020-08-09T23:58:13.253",
  "last_seen": "2020-11-06T15:00:29.139",
  "dow_seen": ["all"],
  "hours_seen": [24],
  "applications_seen": [0, 139],
  "avg_flow_count": "23.47", "std_flow_count": "20.82",
  "avg_packets": "44.23", "std_packets": "140.25",
  "avg_bytes": "24450.706", "std_bytes": "139338.67",
  "avg_duration": "25.945", "std_duration": "73.435",
  "uniq_sip_count": 17,
  "uniq_dip_count": 28,
  "uniq_fat_count": 65,
  "highest_fat_perc_days_seen": "56.67",
  "highest_fat_avg_flow_count": "30.16", "highest_fat_std_flow_count": "17.65",
  "expires_at": 1651780714
}
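The numbers in the two example entries appear consistent with spread_days_seen being the inclusive count of calendar days from first_seen to last_seen. That reading is inferred from the examples rather than stated anywhere in this document, so treat the sketch below as a sanity check on the sample data and not as the tool's definition. The function name is hypothetical.

```python
from datetime import datetime


def spread_days(first_seen: str, last_seen: str) -> int:
    """Inclusive number of calendar days between two ISO 8601 timestamps."""
    first = datetime.fromisoformat(first_seen).date()
    last = datetime.fromisoformat(last_seen).date()
    return (last - first).days + 1


# Timestamps copied from the example FAT and PAT entries.
print(spread_days("2020-08-24T19:25:34.904", "2020-11-03T16:13:44.259"))
print(spread_days("2020-08-09T23:58:13.253", "2020-11-06T15:00:29.139"))
```

Comparing total_days_seen with the spread gives a feel for how clustered the activity is: 24 of 72 days for the FAT versus 87 of 90 for the PAT.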

Thresholds and Baseline Checks

There is a set of configurable thresholds for each protocol. Those include the following:

Sane defaults for the thresholds are already set for the SMB protocol.

{
    "protocols": [
        {
            "protocol": "smb",
            "thresholds": {
                "perc_days_seen": "26.0",
                "consistency_score": 85,
                "standard_deviations": "3.0"
            }
        }
    ]
}

You should use these sane default values initially for newly onboarded protocols, but ideally you will set these values after studying the baseline traffic and adjusting according to the data. For example, if the baseline protocol traffic is really consistent, you might decrease the standard_deviations and/or increase the consistency_score. Or if the baseline protocol traffic shows that the protocol is rather well-controlled and the vast majority of connections occur regularly, you might increase the perc_days_seen threshold. Ultimately, you determine via these values what is considered common and consistent for the given protocol.
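A small sketch of how a consumer of this configuration might look up the thresholds for a protocol. Falling back to the SMB values for a protocol that has no entry of its own mirrors the advice above, but it is an assumption of this example and not documented tool behaviour. The function name and the "rdp" protocol are hypothetical. The threshold strings are converted to numbers so they can be compared directly.

```python
import json

# Copied from the SMB defaults shown in this section.
CONFIG = json.loads("""
{
    "protocols": [
        {
            "protocol": "smb",
            "thresholds": {
                "perc_days_seen": "26.0",
                "consistency_score": 85,
                "standard_deviations": "3.0"
            }
        }
    ]
}
""")


def thresholds_for(protocol: str, config: dict, default_protocol: str = "smb") -> dict:
    """Return numeric thresholds for a protocol, falling back to the default entry."""
    by_name = {p["protocol"]: p["thresholds"] for p in config["protocols"]}
    raw = by_name.get(protocol, by_name[default_protocol])
    return {
        "perc_days_seen": float(raw["perc_days_seen"]),
        "consistency_score": int(raw["consistency_score"]),
        "standard_deviations": float(raw["standard_deviations"]),
    }


print(thresholds_for("smb", CONFIG))
print(thresholds_for("rdp", CONFIG))  # hypothetical new protocol, inherits SMB values
```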

The Consistency Score

The consistency score starts at 100, and various deviation checks are then performed. For each deviation found, some number of points is deducted from the score. Since this is an outbound analytic, for packets, bytes, and duration we only care about deviations that are higher than usual; we neither check nor deduct points for deviations that are lower than usual. Deviations in applications seen and in bytes are weighted more heavily than other deviations.
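One plausible reading of "higher than usual" is a one-sided test against the baseline mean plus some number of standard deviations, with that number taken from the standard_deviations threshold. This document does not spell out the exact formula, so the sketch below is an assumption, and the function name is hypothetical. No point values are shown because none are given here.

```python
def exceeds_upper_bound(observed: float, avg: float, std: float, k: float = 3.0) -> bool:
    """One-sided check: True only when the observed value is unusually HIGH.

    Assumes the bound is avg + k * std; values below the mean never trigger.
    """
    return observed > avg + k * std


# avg_bytes / std_bytes taken from the example FAT baseline entry.
avg_bytes, std_bytes = 613787.875, 594779.603
print(exceeds_upper_bound(3_000_000, avg_bytes, std_bytes))  # far above the bound
print(exceeds_upper_bound(10, avg_bytes, std_bytes))         # unusually low, ignored
```

Under this reading, lowering standard_deviations tightens the bound and makes deductions more likely.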

These checks are not performed at all if no matching baseline entry exists for the PAT of an incoming flow. If a matching baseline entry exists for the FAT of an incoming flow and the FAT was seen on at least 2 days and total_days_seen * avg_flow_count >= 10, then the measurements associated with the FAT entry are used for these checks; otherwise, the measurements associated with the PAT entry are used.
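The choice between FAT and PAT measurements can be sketched as a small selection function. The conditions (at least 2 days seen, and total_days_seen * avg_flow_count >= 10) come from the paragraph above; the function name and the use of plain dictionaries shaped like the example baseline entries are assumptions.

```python
from typing import Optional


def select_measurements(fat_entry: Optional[dict], pat_entry: Optional[dict]) -> Optional[dict]:
    """Pick the baseline entry whose measurements feed the deviation checks."""
    if pat_entry is None:
        # No PAT match: the consistency checks are skipped entirely.
        return None
    if fat_entry is not None:
        days = fat_entry["total_days_seen"]
        # avg_flow_count is stored as a string in the example entries.
        approx_flows = days * float(fat_entry["avg_flow_count"])
        if days >= 2 and approx_flows >= 10:
            return fat_entry
    return pat_entry


fat = {"total_days_seen": 24, "avg_flow_count": "2.33"}   # from the example FAT entry
pat = {"total_days_seen": 87, "avg_flow_count": "23.47"}  # from the example PAT entry
print(select_measurements(fat, pat) is fat)
```

The product total_days_seen * avg_flow_count approximates the total number of flows behind the FAT entry, so the rule effectively requires a minimum sample size before trusting the more specific baseline.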

The deviation checks are as follows:

So, with a default consistency_score of 85, the following sets of deviations are the minimum needed to generate an alert:

The Baseline Checks

Each incoming flow's PAT is derived and looked up in the baseline database. If no matching entry exists, it is marked as an unexpected record and the alert reason is NEVER_SEEN_IN_BASELINE.

If a matching entry does exist, then the PAT entry's perc_days_seen is compared against the perc_days_seen threshold. If the entry's perc_days_seen is less than the set threshold, it is marked as an unexpected record and the alert reason is SEEN_BUT_RARELY_OCCURRING.

Lastly, if a matching entry exists, the consistency score is calculated according to the above section. If the calculated consistency score is less than the set threshold, the alert reason will be SEEN_BUT_INCONSISTENT, but only if the incoming flow is not considered "rarely occurring." Either way, the consistency score is calculated and included in all SEEN_BUT_RARELY_OCCURRING and SEEN_BUT_INCONSISTENT alerts.
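The order of the three baseline checks can be sketched as follows. The alert reason strings are the ones named above; the function name, the dictionary shapes, and passing in an already computed consistency score are assumptions made to keep the example short.

```python
from typing import Optional


def alert_reason(pat_entry: Optional[dict], consistency_score: Optional[int],
                 thresholds: dict) -> Optional[str]:
    """Return the alert reason for an incoming flow, or None if it passes."""
    if pat_entry is None:
        return "NEVER_SEEN_IN_BASELINE"
    if float(pat_entry["perc_days_seen"]) < thresholds["perc_days_seen"]:
        # "Rarely occurring" takes precedence over "inconsistent".
        return "SEEN_BUT_RARELY_OCCURRING"
    if consistency_score is not None and consistency_score < thresholds["consistency_score"]:
        return "SEEN_BUT_INCONSISTENT"
    return None


smb = {"perc_days_seen": 26.0, "consistency_score": 85}
common = {"perc_days_seen": "96.67"}
rare = {"perc_days_seen": "10.00"}
print(alert_reason(None, None, smb))
print(alert_reason(rare, 50, smb))
print(alert_reason(common, 80, smb))
print(alert_reason(common, 90, smb))
```

In the real analytic the consistency score would still be attached to SEEN_BUT_RARELY_OCCURRING alerts; this sketch only decides the reason.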

Explicit Deny and Allow Lists

You can add entries to either of two lists. Each entry contains one or more rules that match on incoming flows and select metadata.

The Explicit Deny List applies prior to the baseline checks and allows you to match on flow characteristics that you want to alert on immediately.

The Allow List applies after the baseline checks and only on flows marked as an unexpected record. This allows you to suppress certain flow characteristics for which you no longer wish to receive alerts for any number of reasons.
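The ordering of the two lists around the baseline checks can be sketched as a small pipeline. The three callables stand in for the real matching and baseline logic, and the "EXPLICIT_DENY" label is a placeholder invented for this example; this document does not name the alert reason used for deny-list hits.

```python
from typing import Callable, Optional

Flow = dict


def evaluate(flow: Flow,
             deny_match: Callable[[Flow], bool],
             baseline_reason: Callable[[Flow], Optional[str]],
             allow_match: Callable[[Flow], bool]) -> Optional[str]:
    """Return an alert reason, or None when the flow should not alert."""
    # 1. The deny list runs first and alerts immediately on a match.
    if deny_match(flow):
        return "EXPLICIT_DENY"  # placeholder label, not from the source
    # 2. Baseline checks decide whether the flow is an unexpected record.
    reason = baseline_reason(flow)
    # 3. The allow list is consulted only for unexpected records.
    if reason is not None and allow_match(flow):
        return None
    return reason


flow = {"dport": 445, "cc": "US"}
print(evaluate(flow, lambda f: False, lambda f: "NEVER_SEEN_IN_BASELINE", lambda f: True))
```

One consequence of this ordering is that an allow-list entry cannot suppress a deny-list hit.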

Entries and rules in both lists follow the same basic format.

Entries

Entries may be defined on a global level (applies to all flows/protocol traffic incoming to UNX-OBP) or per protocol (applies to incoming flows for a single protocol). Each entry has the following fields:

Rules

There may be one or more rules defined in each entry. Each rule is a single string with fields/values defined as field=value separated by semicolons.
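A rule string of this shape can be parsed and matched roughly as shown below. The sketch follows the semantics described in this section (every field must match, and a field may not repeat), but it handles exact values only. The function names and the field names dport and cc are used purely for illustration and may not correspond to the tool's actual rule fields.

```python
def parse_rule(rule: str) -> dict:
    """Split 'field=value;field=value' into a dict, rejecting repeated fields."""
    fields = {}
    for part in rule.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        name, sep, value = part.partition("=")
        name = name.strip()
        if not sep or not name:
            raise ValueError(f"malformed rule component: {part!r}")
        if name in fields:
            raise ValueError(f"field appears more than once: {name}")
        fields[name] = value.strip()
    return fields


def rule_matches(rule: str, flow: dict) -> bool:
    """A rule matches only if every one of its fields matches the flow."""
    return all(name in flow and str(flow[name]) == value
               for name, value in parse_rule(rule).items())


print(rule_matches("dport=445;cc=US", {"dport": 445, "cc": "US", "sensor": "622"}))
print(rule_matches("dport=445;cc=US", {"dport": 445, "cc": "DE"}))
```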

A rule is considered a match only when every field in the rule matches the incoming flow. No field may appear in a rule more than once. A rule can combine one or more of the following fields, in any order:

If you specify major_csp=true in a rule, you may also specify more granular CSP criteria with the following fields:

In addition to the above fields, the following apply to allowlist rules only:

For fields that support ranges, checks are always inclusive and you must specify at least one limit (lower or upper) with a hyphen. Ranges may be included in a comma-separated list of values. See the following examples with ranges:
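The range semantics described here (inclusive bounds, at least one limit, ranges mixed into a comma-separated list) can also be sketched in code. The function name and the sample specs in the usage lines are illustrative inventions, not examples taken from the tool, and the sketch assumes non-negative integer values such as port numbers.

```python
def in_range_list(value: int, spec: str) -> bool:
    """Check a value against a comma-separated list of numbers and ranges.

    '400-500' is a closed range, '1024-' has only a lower limit,
    '-1023' has only an upper limit; all bounds are inclusive.
    """
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            low, _, high = item.partition("-")
            if not low and not high:
                raise ValueError("a range needs at least one limit")
            above_low = not low or int(low) <= value
            below_high = not high or value <= int(high)
            if above_low and below_high:
                return True
        elif int(item) == value:
            return True
    return False


print(in_range_list(445, "80,400-500"))
print(in_range_list(1023, "1024-"))
```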