super_table_creator - MySQL Database Schema Creator for super_mediator
super_table_creator [--out MYSQL_DB_HOSTNAME]
[--name MYSQL_USER_NAME]
[--pass MYSQL_PASSWORD]
[--database DATABASE_NAME]
[--version]
[--flow-only]
[--no-index]
[--dns-dedup]
[--dedup-last-seen]
[--flow-stats]
[--yaf-stats]
[--dedupflow]
[--dedup TABLE_NAME]
[--ssl-certs]
[--ssl-dedup]
super_table_creator creates the MySQL database tables for use with super_mediator(1) TEXT Exporters. super_mediator exports deep packet inspection and flow information to CSV Files. CSV files can be imported into databases using tools such as mysqlimport.
The following options configure where super_table_creator creates database tables and which tables to create.
MYSQL_DB_HOSTNAME is the hostname or IP address where the MySQL Database lives. Default is localhost.
MYSQL_USER_NAME is the user name to use when connecting to the MySQL server. Default is root.
MYSQL_PASSWORD is the password to use when connecting to the MySQL server. No Default.
DATABASE_NAME is the name of the database to create, or the name of the pre-existing database to create the table(s).
Print the version and exit.
Create the full flow table and exit. The full flow table has the following fields:
column name | size | description
stime | DATETIME | flow start time
etime | DATETIME | flow end time
duration | DECIMAL(10,3) | duration
rtt | DECIMAL(10, 3) | round trip time
protocol | TINYINT | flow protocol
sip | VARCHAR(40) | source IP address
sport | MEDIUMINT | source port
pkt | BIGINT | packetTotalCount
oct | BIGINT | octetTotalCount
att | MEDIUMINT | flow attributes
mac | VARCHAR(18) | source MAC Address
dip | VARCHAR(40) | destination IP Address
dport | MEDIUMINT | destination Transport Port
rpkt | BIGINT | reversePacketTotalCount
roct | BIGINT | reverseOctetTotalCount
ratt | MEDIUMINT | reverse flow attributes
rmac | VARCHAR(18) | destination MAC Address
iflags | VARCHAR(10) | initial TCP Flags
uflags | VARCHAR(10) | union TCP Flags
isn | VARCHAR(10) | initial sequence number (hex)
risn | VARCHAR(10) | reverse initial sequence number (hex)
vlan | VARCHAR(3) | vlan ID (hex)
app | MEDIUMINT | application label
ent | INT | entropy
rent | INT | reverse entropy
reason | VARCHAR(10) | flow end reason
If present, put flow index information into each table. Otherwise, a separate flow index table is created and it will be necessary to join the flow index table with the application protocol table to retrieve all information related to a particular flow. The following fields will be added to each DPI table in place of the flow key hash, stime, and obid:
column name | size | description
stime | DATETIME | flow start milliseconds
sip | VARCHAR(40) | source IP address
dip | VARCHAR(40) | destination IP Address
sport | MEDIUMINT | source Transport port
dport | MEDIUMINT | destination transport port
vlan | INT | vlan ID
obid | INT | observation ID of the flow sensor
If present, create the default DNS deduplication table and exit. The default table consists of 4 columns. If using the LAST_SEEN option in the super_mediator.conf(1) file, use the --dedup-last-seen option. The default DNS deduplication has the following columns:
column name | size | description
first_seen | DATETIME | flow start time DNS Record was first seen.
rrtype | MEDIUMINT | type of resource record (A, NS, CNAME, etc.)
rrname | VARCHAR(270) | domain name found in RNAME in Resource Record
rrval | VARCHAR(300) | RDATA in Resource Record
If present, create the extended DNS deduplication table and exit. The extended table consists of 6 columns, the above 4 columns plus the last time seen and the hit count:
column name | size | description
first_seen | DATETIME | flow start time DNS Record was first seen.
last_seen | DATETIME | flow start time of last record seen before export
rrtype | MEDIUMINT | type of resource record (A, NS, CNAME, etc.)
rrname | VARCHAR(270) | domain name found in RNAME in Resource Record
hitcount | INT | number of records seen between first_seen and last_seen.
rrval | VARCHAR(300) | RDATA in Resource Record
If present, create the flow statistics table. As of yaf 2.3.0, yaf(1) will export extended flow information if yaf is run with --flow-stats. super_mediator(1) will collect and export this information in CSV format, if available. The table will be created in the following format:
column name | size | description
flow_key | INT | flow key hash
stime | BIGINT | flow start time
obid | INT | observation ID
tcpurg | BIGINT | number of packets with the TCP urgent flag set.
smallpkt | BIGINT | number of packets that are smaller than 60 bytes
nonempty | BIGINT | number of packets with a non-zero payload
datalen | BIGINT | total payload byte count
avgitime | BIGINT | average interarrival time
firstpktlen | INT | length of first non-zero payload
largepktct | BIGINT | number of packets that were larger than 220 bytes
maxpktsize | INT | largest payload length transferred in the flow.
firsteight | SMALLINT | Directionality for the first 8 non-empty packets
stddevlen | BIGINT | standard deviation of payload length
stddevtime | BIGINT | standard deviation of interarrival time
avgdata | BIGINT | average payload length in forward direction
revtcpurg | BIGINT | number of packets with the TCP urgent flag set.
revsmallpkt | BIGINT | number of packets that are smaller than 60 bytes
revnonempty | BIGINT | number of packets with a non-zero payload
revdatalen | BIGINT | total payload byte count
revavgitime | BIGINT | average interarrival time
revfirstpktlen | INT | length of first non-zero payload
revlargepktct | BIGINT | number of packets that were larger than 220 bytes
revmaxpktsize | INT | largest payload length transferred in the flow.
revstddevlen | BIGINT | standard deviation of payload length
revstddevtime | BIGINT | standard deviation of interarrival time
revavgdata | BIGINT | average payload length in reverse direction
If present, create the yaf process statistics table, "yaf_stats". As of yaf 2.0.0, yaf(1) will export process statistics every 5 minutes by default. super_mediator(1) will collect and write this information to the log file and to the TEXT exporters. It is possible to configure an exporter to only process yaf statistics so they can be imported to a database. The following table can be used to store yaf process statistics.
column name | size | description
ts | TIMESTAMP | auto insert the current time
flows | BIGINT | total exported flow count
packets | BIGINT | total exported packet count
dropped | BIGINT | total packets dropped by yaf
ignored | BIGINT | total packets ignored due to improper headers
expired_frags | BIGINT | total fragments expired
assembled_frags | BIGINT | total fragments assembled
flush_events | INT | number of times flow table flushed
table_peak | INT | max. number of flows in flow table
yaf_ip | VARCHAR | exporter IP address
yaf_id | INT | observation domain of exporter
flow_rate | INT | mean flow rate
packet_rate | INT | mean packet rate
If present, add a column between id and data for the hit count which is present when super_mediator(1) is configured with DEDUP_PER_FLOW. This option can be used with existing tables as it modifies the tables after initially creating them. This option only modifies the HTTP, SLP, IMAP, SMTP, POP3, IRC, FTP, SIP, RTSP, SSH, MODBUS, and ENIP tables. The other protocols are not affected by the DEDUP_PER_FLOW option.
If present, add the table with TABLE_NAME to the database given to --database and exit. This table's schema corresponds with the CSV output format of files produced by the DEDUP_CONFIG configuration. This table will have the following schema:
column name | size | description
first_seen | DATETIME | first time ip, data tuple was seen
last_seen | DATETIME | last time ip, data tuple was seen
ip | VARCHAR(40) | src or dst ipv4 or ipv6 address
hash | INT | flow key hash of last flow with ip, data tuple
hitcount | BIGINT | number of times ip, data tuple was seen
data | VARCHAR(500) | data that corresponds with configured info element ID
If present add the following two tables to the database specified by --database and exit. These two tables correspond to the CSV output format of the SSL_DEDUP_ONLY and SSL_DEDUP configurations in super_mediator(1). The certs table stores characteristics of certificates (CERT_FILE) and the certs_dedup table stores certificate metadata and hitcounts. The certs table has the following schema:
column name | size | description
serial | VARCHAR(150) | serial number of X.509 Certificate
issuer | VARCHAR(500) | Issuer's common name in X.509 Certificate
stime | DATETIME | first time certificate was seen
id | INT | the object/member ID of the data
ISE | VARCHAR(2) | Issuer(I), Subject(S), Extension(E)
cert_no | SMALLINT | Order in certificate chain
data | VARCHAR(500) | data that corresponds with id
The certs_dedup table will have the following schema:
column name | size | description
first_seen | DATETIME | first time ip, data tuple was seen
last_seen | DATETIME | last time ip, data tuple was seen
serial | VARCHAR(150) | serial number of X.509 Certificate
hitcount | BIGINT | number of times ip, data tuple was seen
issuer | VARCHAR(500) | Issuer's common name in X.509 Certificate
If present, create the ssl_ip_dedup table that follows the same format as SSL certificate de-duplicated data configured in the DEDUP_CONFIG block of super_mediator.conf(1) and exit. This certificate information is de-duplicated by IP address and certificate chain. It has the following schema:
column name | size | description
first_seen | DATETIME | first time ip, cert chain was seen
last_seen | DATETIME | last time ip, cert chain was seen
ip | VARCHAR(40) | src or dst ipv4 or ipv6 address
hash | INT | flow key hash of last flow with ip, cert chain tuple
hitcount | BIGINT | number of times ip, cert chain was seen
serial1 | VARCHAR(150) | serial number of End-user certificate
issuer1 | VARCHAR(500) | Issuer's common name of End-user Certificate
serial2 | VARCHAR(150) | serial number of Intermediate/Root CA certificate
issuer2 | VARCHAR(500) | Issuer's common name of Intermediate/Root CA Certificate
Unless --no-index is present, the following flow table will be created:
column name | size | description
flow_key | INT | flow key hash
stime | BIGINT | flow start milliseconds
sip | VARCHAR(40) | source IP address
dip | VARCHAR(40) | destination IP Address
protocol | TINYINT | flow protocol
sport | MEDIUMINT | source transport port
dport | MEDIUMINT | destination transport port
vlan | INT | vlan ID
obid | INT | observation ID of flow sensor
Together the flow key, stime, and obid will create a primary key to join with other tables.
column name | size | description
flow_key | INT | flow key hash
stime | BIGINT | flow start milliseconds
obid | INT | observation ID of flow sensor
qr | VARCHAR(1) | Query (Q) or Response (R)
id | INT | query or response ID
section | TINYINT | section of DNS Packet
nx | TINYINT | NXDomain (1) or Not (0)
auth | TINYINT | Authoritative Response (1) or Not (0)
type | MEDIUMINT | Resource Record Type (1,2,5,6,..)
ttl | INT | Time to Live
name | VARCHAR(255) | domain name from RRNAME
val | VARCHAR(255) | resource record data from RRDATA.
The X.509 Certificate table has the following format:
column name | size | description
flow_key | INT | flow key hash
stime | BIGINT | flow start milliseconds
obid | INT | observation ID of flow sensor
id | MEDIUMINT | information element ID
cert_type | VARCHAR(5) | Subject (S) or Issuer (I)
cert_no | TINYINT | order in certificate chain
data | VARCHAR(500) | X.509 Cert Value.
These tables all have the same format:
column name | size | description
flow_key | INT | flow key hash
stime | BIGINT | flow start milliseconds
obid | INT | observation ID of flow sensor
id | MEDIUMINT | information element ID
count* | INT | hit count *optional (--dedupflow)
data | VARCHAR(500) | data
In the following examples, the dollar sign ("$") represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the back slash ("\") is used to indicate a wrapped line.
$ super_table_creator --name dbadmin --password jkab7$3 \
--database my_flows --flow-only
Bug reports may be sent directly to the Network Situational Awareness team at <netsa-help@cert.org>.
Emily Sarneso and the CERT Network Situational Awareness Group Engineering Team, <http://www.cert.org/netsa>.