CERT

Gnuplot is a scientific visualization and plotting tool that provides command-line facilities for generating charts from text data. Combined with the SiLK toolset it provides facilities for quickly visualizing data for exploratory analysis or systematic reporting.

The easiest way to combine SiLK data with gnuplot is through rwcount. For example:

  $ rwcount --bin-size=3600 sample.rwf > sample.txt
  $ gnuplot
  gnuplot> plot "sample.txt" using 2 with linespoints

This produces a simple plot like the one shown here.
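If a particular gnuplot version has trouble with the title line or the '|' delimiters in rwcount's text output, one option is to normalize the file first. The following is a minimal sketch, not part of the original workflow: the function name clean_rwcount and the sample values are hypothetical, and the input is assumed to have the four columns Date, Records, Bytes, and Packets.

```shell
# Hypothetical helper: turn rwcount-style '|'-delimited text on stdin into
# plain space-separated columns, dropping the title line.
clean_rwcount() {
    awk -F'|' 'NR > 1 {
        # Trim the padding spaces around each of the four fields.
        for (i = 1; i <= 4; i++) gsub(/^ +| +$/, "", $i)
        print $1, $2, $3, $4
    }'
}

# Invented sample data in the shape rwcount prints (values are illustrative).
clean_rwcount <<'EOF'
               Date|        Records|               Bytes|          Packets|
2009/02/12T00:00:00|        1200.00|           960000.00|          4800.00|
2009/02/12T01:00:00|      250000.00|         12500000.00|        250000.00|
EOF
```

The first line printed for this sample is "2009/02/12T00:00:00 1200.00 960000.00 4800.00". In practice the function would be used as a filter, for example by piping the rwcount output through it before redirecting to the file that gnuplot reads.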

Gnuplot is very good at producing unattractive plots with minimal instruction. In this case, we have the following problems to consider:

  • The scale is linear, which for this data lets a single spike dominate the plot
  • The x axis values are meaningless
  • The label is the filename

All of these are easy to fix. Here is an improved set of gnuplot commands:

  gnuplot> set xdata time
  gnuplot> set timefmt "%Y/%m/%dT%H:%M:%S"
  gnuplot> set logscale y
  gnuplot> set yrange [1000:]
  gnuplot> plot 'sample.txt' using 1:2 title 'Records' with linespoints 2

This results in the following image:

We now cover each of these commands in order:

  gnuplot> set xdata time
  gnuplot> set timefmt "%Y/%m/%dT%H:%M:%S"

The first line instructs gnuplot to treat the x axis values as times. The second line specifies the format of the time data; the "%Y/%m/%dT%H:%M:%S" format reads standard rwcount dates correctly.

  gnuplot> set logscale y

This sets the y axis to use a logarithmic rather than a linear scale. Practically speaking, a logarithmic scale reduces the visual effect of large outliers (such as those caused by scans and DDoS attacks) and lets you see the rest of the traffic in the plot.
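To make the effect concrete, the arithmetic behind a logarithmic axis can be sketched with awk. The counts below are invented for illustration (two ordinary hours and one scan-driven spike), and log10_of is a hypothetical helper name.

```shell
# Hypothetical helper: print the base-10 logarithm of a value, which is
# proportional to where that value lands on a logarithmic y axis.
log10_of() {
    awk -v n="$1" 'BEGIN { printf "%.2f\n", log(n) / log(10) }'
}

# Invented per-bin record counts.
for count in 1200 1500 250000; do
    echo "$count -> $(log10_of "$count")"
done
# prints:
# 1200 -> 3.08
# 1500 -> 3.18
# 250000 -> 5.40
```

On a linear axis the spike is roughly 200 times taller than the ordinary bins; on the logarithmic axis it sits only about 2.3 decades above them, so the smaller values remain distinguishable.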

  gnuplot> set yrange [1000:]

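The set yrange command tells gnuplot which range of y values to plot. In the form given above ([1000:]), the lower bound is fixed at 1000 and the upper bound is left empty, so gnuplot picks it automatically and plots everything that has a value of 1000 or more.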
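Before settling on a lower bound, it can help to see which bins would fall outside the range and so not be drawn. The following is a rough sketch under stated assumptions: bins_below is a hypothetical helper, the input is assumed to be rwcount-style '|'-delimited text with records in the second column, and the sample values are invented.

```shell
# Hypothetical helper: print the timestamp of every bin whose record count
# (column 2) is below the given threshold. Reads rwcount-style text on stdin.
bins_below() {
    awk -F'|' -v min="$1" 'NR > 1 && ($2 + 0) < min {
        gsub(/ /, "", $1)   # strip padding around the timestamp
        print $1
    }'
}

# Invented sample data; only the 01:00 bin is below 1000 records.
bins_below 1000 <<'EOF'
               Date|        Records|               Bytes|          Packets|
2009/02/12T00:00:00|        1200.00|           960000.00|          4800.00|
2009/02/12T01:00:00|         640.00|           410000.00|          2100.00|
2009/02/12T02:00:00|        1500.00|          1100000.00|          5200.00|
EOF
```

If many bins show up here, a smaller lower bound than 1000 may suit the data better.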

  gnuplot> plot 'sample.txt' using 1:2 title 'Records' with linespoints 2

Note that the new plot command specifies which columns of the data file to use (using 1:2). Gnuplot treats the date field from rwcount as the first column and each remaining value (records, bytes, and packets) as a further column. This instruction uses the first column (dates) as the x values and the second column (records) as the y values.

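The "title" option of the plot command specifies the label shown in the plot's key (in this case 'Records'). The end of the command (with linespoints 2) tells gnuplot to plot a line with points using line type 2, which is drawn in blue here; the color assigned to a line type depends on the terminal and the gnuplot version, and newer gnuplot releases expect this to be written as "with linespoints linetype 2". The resulting plot is the second plot shown above.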
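For systematic reporting, the same commands can be collected in a script file and run without the interactive prompt. This is a minimal sketch rather than a definitive recipe: the file names records.gp and records.png are hypothetical, the png terminal is assumed to be available in the local gnuplot build, and the line type is written out as "linetype 2" for newer gnuplot releases.

```shell
# Write the commands from this page to a script file (file names are hypothetical).
cat > records.gp <<'EOF'
# Assumes sample.txt was produced by: rwcount --bin-size=3600 sample.rwf
set terminal png
set output "records.png"
set xdata time
set timefmt "%Y/%m/%dT%H:%M:%S"
set logscale y
set yrange [1000:]
plot 'sample.txt' using 1:2 title 'Records' with linespoints linetype 2
EOF

# Run the script only if gnuplot is installed and the data file exists.
if command -v gnuplot >/dev/null 2>&1 && [ -f sample.txt ]; then
    gnuplot records.gp
fi
```

Given the column order described above, changing "using 1:2" to "using 1:3" or "using 1:4" (and the title to match) would plot bytes or packets instead of records; the yrange lower bound would likely need adjusting for those columns.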

Gnuplot is a fully-featured graphics programming environment. You can learn more about gnuplot using its built-in help facility. Just type "gnuplot" at the command line to enter interactive use, and type "help" to learn more. "help plot" will teach you specifically about the "plot" command.
