Netflow records contain a src-ip and a dest-ip. Is there a way to interpret "client" and "server" from the stored data where the "client" is defined as a host "initiating" the connection?
Answer 1: Use initial flags
Probably the most effective way to separate clients from servers is to use the --flags-initial switch in rwfilter. TCP conversations matching --flags-initial=S/SA are those initiated by the client (the first packet was the client's SYN, with no ACK), so the client is the source address, the server is the destination address, and the service is the destination port.
Similarly, you might look at TCP conversations matching --flags-initial=SA/SA. These are typically flows where the first packet was the server's SYN-ACK, so the source address is the server, the destination address is the client, and the service is the source port.
If you're using YAF for flow collection, you can capture initial flags. Many standard collection engines, however, record only the combined flags for the whole flow rather than the flags of the first packet, so you can't query against initial flags and this approach won't work.
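As a minimal sketch of the logic described above, the following Python models a flow record in memory and applies the two initial-flags tests. It is not SiLK code: the Flow and Role types, their field names, and the function name are assumptions made for illustration, and the flag string is assumed to use the usual one-letter abbreviations (S for SYN, A for ACK).

```python
from typing import NamedTuple, Optional


class Flow(NamedTuple):
    """Hypothetical in-memory flow record (not a SiLK data structure)."""
    sip: str
    dip: str
    sport: int
    dport: int
    initial_flags: str  # TCP flags seen on the first packet, e.g. "S" or "SA"


class Role(NamedTuple):
    client: str
    server: str
    service_port: int


def classify_by_initial_flags(flow: Flow) -> Optional[Role]:
    """Apply the S/SA and SA/SA tests; return None if neither matches."""
    flags = set(flow.initial_flags)
    syn, ack = "S" in flags, "A" in flags
    if syn and not ack:
        # Equivalent of S/SA: first packet was the client's SYN.
        return Role(client=flow.sip, server=flow.dip, service_port=flow.dport)
    if syn and ack:
        # Equivalent of SA/SA: first packet was the server's SYN-ACK.
        return Role(client=flow.dip, server=flow.sip, service_port=flow.sport)
    return None  # mid-stream flow or no initial flags recorded
```

Both directions of the same conversation resolve to the same client, server, and service port, which is what makes this test convenient when initial flags are available.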
Answer 2: Use a port-based approach
The most common service ports are below 1024, and ephemeral ports are typically 1024 or higher. Taking advantage of this, we can build IP sets of servers and clients by selecting flows in which one port is a service port and the other is ephemeral.
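The selection logic can be sketched in Python as follows. This is an illustration under stated assumptions, not the SiLK command itself: the Flow type and function name are hypothetical, only TCP (6) and UDP (17) are considered, and flows with both ports on the same side of the boundary are treated as ambiguous and skipped.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Flow(NamedTuple):
    """Hypothetical in-memory flow record (not a SiLK data structure)."""
    sip: str
    dip: str
    sport: int
    dport: int
    proto: int  # IP protocol number: 6 = TCP, 17 = UDP


SERVICE_MAX = 1023  # ports at or below this are treated as service ports


def split_clients_servers(flows: Iterable[Flow]) -> Tuple[Set[str], Set[str]]:
    """Return (servers, clients) as two sets of IP addresses."""
    servers: Set[str] = set()
    clients: Set[str] = set()
    for f in flows:
        if f.proto not in (6, 17):
            continue  # ports are only meaningful for TCP and UDP
        src_low = f.sport <= SERVICE_MAX
        dst_low = f.dport <= SERVICE_MAX
        if src_low and not dst_low:
            servers.add(f.sip)
            clients.add(f.dip)
        elif dst_low and not src_low:
            servers.add(f.dip)
            clients.add(f.sip)
        # Both low or both high: ambiguous, left for later inspection.
    return servers, clients
```

An address can end up in both sets, since the same host may act as a server for one service and a client of another.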
Now, suppose that after looking at the leftover traffic that has neither port below 1024, you find some additional common service ports such as 1935 (RTMP, used for Flash media streaming) and 8080 (HTTP proxy). You can add these extra service ports to the selection and generate a list of service addresses and ports instead of IP sets.
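A rough Python sketch of that extension follows. The names are hypothetical and the per-endpoint flow count is an assumption about what a useful listing might contain; it is not the output of any particular SiLK tool.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Flow(NamedTuple):
    """Hypothetical in-memory flow record (not a SiLK data structure)."""
    sip: str
    dip: str
    sport: int
    dport: int


# Extra service ports found by inspecting the leftover high-port traffic.
EXTRA_SERVICE_PORTS = {1935, 8080}


def is_service_port(port: int) -> bool:
    return port < 1024 or port in EXTRA_SERVICE_PORTS


def service_endpoints(flows: Iterable[Flow]) -> Counter:
    """Count flows per (server address, service port) pair."""
    counts: Counter = Counter()
    for f in flows:
        src_svc = is_service_port(f.sport)
        dst_svc = is_service_port(f.dport)
        if src_svc and not dst_svc:
            counts[(f.sip, f.sport)] += 1
        elif dst_svc and not src_svc:
            counts[(f.dip, f.dport)] += 1
        # Flows with two service ports or none are ambiguous and skipped.
    return counts
```

Sorting the result with counts.most_common() puts the busiest services first, which is often a convenient starting point for review.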
Answer 3: Use a port-based prefix map
This approach builds on Answer 2. Rather than maintaining a long list of ports on the command line, we put the list in a prefix map and query against the prefix map. Here's how it works.
First, create a text file that defines the ports on which you expect services. Note that prefix maps are hierarchical, so the generic range-based assignments are overridden by more specific entries.
Second, compile the prefix map with rwpmapbuild.
Finally, use the compiled prefix map with rwfilter to separate client and server traffic, much as in Answer 2.
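To make the override behavior concrete, here is a small Python model of a protocol/port map in which later, more specific entries take precedence over earlier range-based ones. It is only a sketch of the lookup idea, not the prefix map file format or the compiled structure; the PortMap class and the labels "service", "ephemeral", and "unknown" are hypothetical.

```python
from typing import List, Tuple


class PortMap:
    """Toy protocol/port map: later entries override earlier ones."""

    def __init__(self, default: str = "unknown") -> None:
        self.default = default
        self._ranges: List[Tuple[int, int, int, str]] = []

    def add(self, proto: int, low: int, high: int, label: str) -> None:
        """Label every port in [low, high] for the given protocol."""
        self._ranges.append((proto, low, high, label))

    def lookup(self, proto: int, port: int) -> str:
        # Scan newest-first so specific entries win over generic ranges.
        for p, low, high, label in reversed(self._ranges):
            if p == proto and low <= port <= high:
                return label
        return self.default


def build_example_map() -> PortMap:
    pmap = PortMap()
    pmap.add(6, 0, 1023, "service")        # generic range: low TCP ports
    pmap.add(6, 1024, 65535, "ephemeral")  # generic range: high TCP ports
    pmap.add(6, 1935, 1935, "service")     # specific override
    pmap.add(6, 8080, 8080, "service")     # specific override
    return pmap


def server_side(pmap: PortMap, proto: int, sport: int, dport: int) -> str:
    """Return 'source', 'destination', or 'ambiguous' for one flow."""
    src = pmap.lookup(proto, sport)
    dst = pmap.lookup(proto, dport)
    if src == "service" and dst == "ephemeral":
        return "source"
    if dst == "service" and src == "ephemeral":
        return "destination"
    return "ambiguous"
```

Keeping the port list in one map means that adding a newly discovered service is a single extra entry, and every query that uses the map picks it up.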
Answer 4: A Time-Based Approach (not recommended)
You may try to identify clients and servers by using timing information. Assuming the first flow seen was initiated by the client, its source address is the client and its destination address is the server. However, this technique is tricky and often does not work well. It assumes that you have both directions of the conversation and that the times are recorded very accurately, which is very difficult with asymmetric routing.
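For completeness, the idea can be sketched in Python as below. The Flow type, the stime field (start time in seconds), and the function name are assumptions for illustration. The sketch only reports a conversation when both directions were seen, and it inherits every weakness described above: if the timestamps are off by even a few milliseconds, the roles come out reversed.

```python
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class Flow(NamedTuple):
    """Hypothetical in-memory flow record (not a SiLK data structure)."""
    sip: str
    dip: str
    sport: int
    dport: int
    proto: int
    stime: float  # flow start time, in seconds


def classify_by_time(flows: Iterable[Flow]) -> List[Tuple[str, str, int]]:
    """Return (client, server, service_port) for two-way conversations."""
    earliest: Dict[tuple, Flow] = {}
    directions: Dict[tuple, Set[tuple]] = {}
    for f in flows:
        a, b = (f.sip, f.sport), (f.dip, f.dport)
        # Direction-independent key: protocol plus the sorted endpoints.
        key = (f.proto,) + tuple(sorted([a, b]))
        directions.setdefault(key, set()).add((a, b))
        # On equal start times, the flow seen first in the input is kept.
        if key not in earliest or f.stime < earliest[key].stime:
            earliest[key] = f
    roles = []
    for key, f in earliest.items():
        if len(directions[key]) == 2:  # both directions were observed
            roles.append((f.sip, f.dip, f.dport))
    return roles
```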
Don't forget about FTP!
Keep passive FTP data channels in mind, since they often look like high-port to high-port services. Active FTP data channels make your FTP client look like a server. There's another tooltip on identifying FTP traffic; it's best to remove FTP data channels before trying to build up a list of clients and servers.