Friday, May 20, 2011

We Already Use Layer 7 Enriched Meta-Data and Don't Know It

It occurred to me the other day that many of us analyst types already use layer 7 enriched meta-data and likely don't realize it. DNS logs, proxy logs, IDS alerts, etc. are all highly specialized versions of layer 7 enriched meta-data. Think of the power of a generalized form of layer 7 enriched meta-data -- the ability to exploit all the advantages of meta-data while simultaneously retaining much of the detail necessary for determining the true nature of network traffic of interest. Possibly the uber data source?
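To make the idea a bit more concrete, here is a minimal sketch of what a generalized record might look like: flow-style envelope fields plus a small set of layer 7 attributes. The `EnrichedRecord` type, the two converter functions, and all of the log field names (`ts`, `client`, `qname`, and so on) are assumptions made up for illustration; they are not the format of any particular product.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class EnrichedRecord:
    """Flow-style envelope fields plus a small bag of layer 7 attributes."""
    timestamp: str
    src_ip: str
    dst_ip: str
    dst_port: int
    protocol: str
    l7: Dict[str, str] = field(default_factory=dict)


def from_dns_log(entry: Dict[str, str]) -> EnrichedRecord:
    # Hypothetical DNS log fields: ts, client, resolver, qname, qtype
    return EnrichedRecord(
        timestamp=entry["ts"],
        src_ip=entry["client"],
        dst_ip=entry["resolver"],
        dst_port=53,
        protocol="udp",
        l7={"app": "dns", "query": entry["qname"], "qtype": entry["qtype"]},
    )


def from_proxy_log(entry: Dict[str, str]) -> EnrichedRecord:
    # Hypothetical proxy log fields: ts, client, server, port, method, host, uri
    return EnrichedRecord(
        timestamp=entry["ts"],
        src_ip=entry["client"],
        dst_ip=entry["server"],
        dst_port=int(entry["port"]),
        protocol="tcp",
        l7={
            "app": "http",
            "method": entry["method"],
            "host": entry["host"],
            "uri": entry["uri"],
        },
    )


# Two different log sources end up in the same generalized shape.
dns_record = from_dns_log({
    "ts": "2011-05-20T09:15:00Z", "client": "192.0.2.10",
    "resolver": "192.0.2.53", "qname": "www.example.com", "qtype": "A",
})
proxy_record = from_proxy_log({
    "ts": "2011-05-20T09:15:01Z", "client": "192.0.2.10",
    "server": "198.51.100.7", "port": "8080",
    "method": "GET", "host": "www.example.com", "uri": "/index.html",
})
```

Both records share the same envelope fields, so a single query can span them, while the `l7` attributes keep the detail that makes each source useful in the first place.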

Merits of Meta-Data

In my previous post, I discussed the logic behind keeping a record of all traffic traversing the network. As we all know, the traffic that a large enterprise network generates is incredibly voluminous. So what does one do to best keep eyes on the network? I believe the key here is meta-data. Meta-data describes the envelope information about transactions/conversations on the network, but doesn't include the content of the actual conversation. Network flow data is one type of meta-data, while layer 7 enriched meta-data (discussed in a previous blog post) is another type. Working with meta-data offers several key advantages:

  • Long-term retention of data for auditing and forensics purposes without the need for large amounts of expensive disk space.
  • The ability to see all the data without needing to sample, filter, or drop certain traffic.
  • Rapid search capability over vast quantities of data collected over long periods of time.

To be sure, there is information in the packet data that is helpful for identifying the true nature of malicious or suspicious traffic. I believe that meta-data based technologies and packet-based technologies can work together beautifully here. Meta-data allows one to craft incisive queries that interrogate the data and identify network traffic that requires further investigation. I call these jumping off points (also discussed in a previous blog post). From there, the packet data can be consulted to assist in the investigation (presuming that the retention window for the packet data has not already expired).
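A rough sketch of that work flow might look like the following. Everything in it is an assumption for illustration: the record fields, the seven day packet retention window, and the query itself (large outbound transfers on ports other than 80 and 443) are made up, and a real deployment would query a flow or meta-data store rather than a Python list.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Assumed retention window for full packet data; meta-data is kept far longer.
PACKET_RETENTION = timedelta(days=7)


def find_jumping_off_points(records: List[Dict], min_bytes_out: int) -> List[Dict]:
    """Breadth: one cheap query across all of the meta-data."""
    return [
        r for r in records
        if r["bytes_out"] >= min_bytes_out and r["dst_port"] not in (80, 443)
    ]


def packets_available(record: Dict, now: datetime) -> bool:
    """Depth is only possible while the packet retention window is still open."""
    return now - record["timestamp"] <= PACKET_RETENTION


records = [
    {"timestamp": datetime(2011, 5, 19, 10, 0), "src_ip": "192.0.2.10",
     "dst_ip": "198.51.100.7", "dst_port": 443, "bytes_out": 1_200},
    {"timestamp": datetime(2011, 5, 18, 2, 30), "src_ip": "192.0.2.11",
     "dst_ip": "203.0.113.5", "dst_port": 8443, "bytes_out": 50_000_000},
    {"timestamp": datetime(2011, 4, 1, 3, 0), "src_ip": "192.0.2.12",
     "dst_ip": "203.0.113.9", "dst_port": 6667, "bytes_out": 75_000_000},
]

now = datetime(2011, 5, 20, 12, 0)
leads = find_jumping_off_points(records, min_bytes_out=10_000_000)

# For each lead, note whether packet data can still be pulled for a deeper look.
results = [(r["src_ip"], packets_available(r, now)) for r in leads]
for src_ip, can_go_deep in results:
    print(src_ip, "packets available" if can_go_deep else "meta-data only")
```

The older lead still surfaces in the meta-data query even though its packets have aged out, which is the point of retaining meta-data over the long term.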

As the amount of traffic on our networks continues to grow, I believe that we as a community will need to get used to the network traffic analysis model/workflow described above. I sometimes refer to it as breadth, then depth. I believe it to be a model capable of scaling with the data volumes of the present and future.

Seeing It All

Some network monitoring technologies and some industry practitioners sample, filter, or drop certain traffic. The logic here is that certain traffic is known to be noise that is of no concern from a cyber security perspective, and needn't be examined. Unfortunately, there is a fatal flaw in this logic. What may appear to be without value today may turn out to be priceless tomorrow. Where would I hide if I were an attacker and wanted to persist APT (Advanced Persistent Threat) style? In the traffic most commonly sampled, filtered, or dropped by most network monitoring technologies. Even the most highly skilled analyst can't find a stealthy threat if the data isn't there to analyze. We are only as good as our data. We need to see it all.
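Here is a small, contrived sketch of the problem. It assumes a sensor configured to drop DNS and NTP as "known noise", and it uses an unusually long DNS query name as a stand-in for something an analyst might later want to hunt for. The port list, field names, and length threshold are all hypothetical.

```python
from typing import Dict, List

# Assumed "known noise" that a sensor has been configured to drop.
NOISE_PORTS = {53, 123}


def collect_filtered(flows: List[Dict]) -> List[Dict]:
    """Collection that drops traffic presumed to be noise."""
    return [f for f in flows if f["dst_port"] not in NOISE_PORTS]


def collect_all(flows: List[Dict]) -> List[Dict]:
    """Collection that keeps a record of everything."""
    return list(flows)


def long_dns_queries(flows: List[Dict], min_len: int) -> List[Dict]:
    """A query an analyst might think of tomorrow, not today."""
    return [
        f for f in flows
        if f["dst_port"] == 53 and len(f.get("query", "")) >= min_len
    ]


flows = [
    {"src_ip": "192.0.2.10", "dst_port": 80},
    {"src_ip": "192.0.2.10", "dst_port": 53, "query": "www.example.com"},
    {"src_ip": "192.0.2.66", "dst_port": 53, "query": "a" * 60 + ".example.com"},
]

hits_filtered = long_dns_queries(collect_filtered(flows), min_len=50)
hits_all = long_dns_queries(collect_all(flows), min_len=50)

print(len(hits_filtered), "hit(s) with filtered collection")
print(len(hits_all), "hit(s) with full collection")
```

With the filter in place, the query has nothing to run against; the suspicious record was discarded before anyone knew to look for it.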

Wednesday, May 4, 2011

Analyst Freedom

By nature, analysts are an inquisitive bunch who enjoy discovering new ways to interrogate the data. As in many professions, analytical inspiration comes from a variety of sources and in irregular spurts. One thing I've noticed throughout my career is that environments that are more flexible and allow for more outside-the-box thinking (analyst freedom if you will) generally produce more unique and novel analytical techniques. Although the organization has less control over things, it is often the better for it. There is something about bureaucracy and rigidity that seems to work against analytical inspiration. It's fascinating.