Friday, January 24, 2014

Visualization

The human eye can often pick out patterns, connections, and outliers in data that would be very difficult to identify through other means.  Because visualization lets the eye scan the underlying data pictorially, it can be a powerful tool when leveraged appropriately.

I have seen many different types of network traffic data visualizations, but I have seen very few that add value to security operations.   To understand why, it helps to take a step back and look at what visualization is attempting to do from a higher level.

The purpose of visualization is most often to elicit patterns, connections, and outliers in the data using the human eye as the parsing and analysis mechanism.  To elicit meaning from large enterprise data, one must first reduce the data to improve the signal-to-noise ratio.  In other words, given the volume and variety of data in the modern enterprise, the level of noise is simply too high to allow for meaningful visualizations without first performing one or more data reductions.

How does one perform data reduction to produce a meaningful visualization that will be useful to security operations?  Deciding which specific question the data should answer is a good first step.  Let me try to illustrate this through an example.  Let's assume that we are trying to use visualization to understand which countries we are sending Office documents to.  Before we can think about how to visualize the data, we need to reduce the data by asking it to return only the results that meet these criteria:
  • That the data is leaving the network (as opposed to entering the network).
  • That the data contains only sessions where the file type is one of the Office file types (e.g., Word, Excel, or PowerPoint).
  • That we have a mechanism in place to map the destination to a country (be it by domain, IP address, or ASN).
Once the data has been reduced, and the signal-to-noise ratio has been increased substantially, we can begin to consider which type of visualization fits best.  Different questions asked of the data will call for different types of visualizations to elicit the patterns, connections, and outliers we are looking for.  In our example, a world map with coloring or shading to indicate volume (be it number of sessions, number of bytes, or otherwise) probably fits best.  When completed, our visualization will provide us with a graphic that we can scan by eye.  The data reduction we performed allows us to assess quickly, with a specific context in mind, whether or not there is something requiring further investigation.
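
To make the reduction step concrete, here is a minimal Python sketch of the three criteria above, followed by the per-country totals a shaded world map would be drawn from.  Everything in it is an assumption made for illustration: the Session fields, the list of Office file extensions, and the small IP-to-country table, which stands in for whatever domain, IP address, or ASN mapping is actually available.  The addresses come from the reserved documentation ranges and the country assignments are made up.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Session:
    """Hypothetical session record; the field names are illustrative only."""
    direction: str   # "outbound" or "inbound"
    file_type: str   # file extension seen in the session, e.g. "docx"
    dest_ip: str     # destination IP address
    num_bytes: int   # bytes transferred


# Assumed set of Office file extensions; extend as needed.
OFFICE_TYPES = {"doc", "docx", "xls", "xlsx", "ppt", "pptx"}

# Stand-in for a real geolocation lookup. The addresses are from the
# documentation ranges and the country codes are invented for the example.
IP_TO_COUNTRY = {
    "203.0.113.10": "AU",
    "198.51.100.7": "US",
    "192.0.2.55": "DE",
}


def office_bytes_by_country(sessions, ip_to_country):
    """Reduce the sessions, then total outbound Office bytes per country."""
    totals = Counter()
    for s in sessions:
        if s.direction != "outbound":                # criterion 1: leaving the network
            continue
        if s.file_type.lower() not in OFFICE_TYPES:  # criterion 2: Office file types only
            continue
        country = ip_to_country.get(s.dest_ip, "unknown")  # criterion 3: map to a country
        totals[country] += s.num_bytes
    return totals


sessions = [
    Session("outbound", "docx", "203.0.113.10", 120_000),
    Session("outbound", "pdf", "203.0.113.10", 900_000),   # dropped: not an Office type
    Session("inbound", "xlsx", "198.51.100.7", 50_000),    # dropped: entering the network
    Session("outbound", "xlsx", "198.51.100.7", 30_000),
    Session("outbound", "pptx", "192.0.2.55", 2_000_000),
    Session("outbound", "DOC", "203.0.113.10", 80_000),
]

totals = office_bytes_by_country(sessions, IP_TO_COUNTRY)
for country, num_bytes in totals.most_common():
    print(country, num_bytes)
```

The resulting totals are what would drive the shading of each country on the map.  Counting sessions instead of bytes only requires adding 1 rather than num_bytes.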

In my experience, unfocused attempts at visualization produce visualizations that are not particularly useful for security operations.  Visualization does have tremendous potential to bring value to security operations when leveraged properly.  Performing data reduction by posing specific, incisive queries into the data provides a good starting point for producing visualizations of high value to security operations.
