An Analytical Approach: 2011

Friday, December 23, 2011

SOC/IRC Building

Over the last decade, I've had the privilege to help build multiple different Security Operations Centers (SOCs)/Incident Response Centers (IRCs). This is a line of work that I'm truly passionate about and have had a good amount of success in. The good news is that this skill appears to be moving from a niche line of work to a more mainstream endeavor. I see this as a tremendous positive for the world -- proper network security monitoring and a successful SOC/IRC are an integral part of helping organizations combat the security threats of today. Onward!

Time

Time is an extremely interesting concept analytically. It's a dimension that's often overlooked when performing network traffic analysis. On this blog, I've discussed the concept of looking for anomalous or unexpected traffic/behavior on an enterprise network quite a bit. But what about traffic that may be completely normal/expected at 14:00 on a weekday, but not at 02:00 on a Sunday? By considering the dimension of time analytically, one can look for traffic that would otherwise be normal but is abnormal because of the time window in which it occurs.

Consider the example of the administrative assistant who sends emails and calendar invites (amidst performing a variety of other tasks) all day long. If we study the mail logs, there is nothing particularly interesting or unusual about this. But what if that same administrative assistant sends a bunch of emails and calendar invites between 02:00 and 03:00 on Sunday? Perhaps he/she is dedicated and catching up on work while dealing with a bout of insomnia. Or, perhaps he/she is about to become a pawn in a spear phishing campaign that will await targeted personnel when they arrive to work Monday morning....
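To make the idea concrete, here is a minimal sketch of how one might flag senders with a burst of off-hours mail. The business window (Monday through Friday, 07:00 to 19:00), the threshold, and the shape of the parsed log events are all assumptions of mine for illustration; they would need to be tuned to the organization.

```python
from collections import Counter
from datetime import datetime

# Hypothetical business window: Monday-Friday, 07:00-18:59 local time.
BUSINESS_DAYS = range(0, 5)    # datetime.weekday(): Monday is 0
BUSINESS_HOURS = range(7, 19)


def is_off_hours(ts: datetime) -> bool:
    """Return True if the timestamp falls outside the assumed business window."""
    return ts.weekday() not in BUSINESS_DAYS or ts.hour not in BUSINESS_HOURS


def off_hours_senders(events, threshold=5):
    """Return {sender: count} for senders with at least `threshold` off-hours messages.

    `events` is an iterable of (sender, datetime) pairs, e.g. parsed from mail logs.
    """
    counts = Counter(sender for sender, ts in events if is_off_hours(ts))
    return {sender: n for sender, n in counts.items() if n >= threshold}
```

The output is only a jumping off point. The insomnia explanation and the spear phishing explanation look identical at this stage, and it takes a closer look at the messages themselves to tell them apart.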

Tuesday, November 22, 2011

Money Shot

Finding the money shot is key to successfully containing an emerging threat. What do I mean by the money shot? It's the moment in a security incident at which the point of no return is passed. In most malicious code incidents, this is where a binary reaches a system (via HTTP download, email, or some other means) and successfully executes. It's fairly common nowadays to see 2, 3, 4, or more re-directs from one compromised or malicious site to another before finally reaching the money shot. But trying to keep up with blocking/containing all the stage 1, stage 2, etc. re-direct domains is an exhausting and futile process. On top of that, it's an extremely false-positive-prone undertaking that could have a fair bit of collateral damage as well (in terms of blocking traffic necessary for business operations). Focus on the money shot first. That's where the most containment bang for the buck is to be found. It's the only chance we as practitioners have at keeping up with the ever-changing landscape. It's all about the money shot.

Message Clarity

Message clarity is a common sense concept that, unfortunately, is not always so common. In the practice of network security monitoring, clearly communicating a simple and straightforward message is often necessary in order to conduct proper security operations. In other words, clearly communicating and leveraging data about new tactics, infection vectors, indicators of compromise, command and control channels, and other important data can help organizations successfully contain and remediate new campaigns, rather than falling victim to them.

I've so often seen cases where the message is garbled or over-complicated (for whatever reason -- be it a lack of knowledge, lack of communication skills, or some other reason). This helps no one. I've often been told that one of my greatest strengths is being able to clearly and effectively communicate what I find through detailed analysis in an easy to understand manner. There is elegance in simplicity -- I firmly believe that. And an elegant, clear, concise, and simple message can often facilitate network security monitoring and security operations.

Tuesday, November 1, 2011

Properly Leveraging a SIEM

For some reason, when organizations have a SIEM, the overwhelming tendency is to deluge the SIEM with data. Any kind of data -- as much as can be gathered from myriad data sources, regardless of its actual value to security operations and incident response (I've previously blogged on the difference between data value and data volume). To be fair, I can completely understand the need to log as much data as possible to a SIEM for auditing and retention purposes -- one never knows what data might be needed for an investigation. However, it is also often the case that organizations have a hard time distinguishing between data that is to be retained vs. data that is to be reviewed/analyzed/monitored as part of a Security Operations Center's/Incident Response Center's operational workflow. There is a difference, and it's an important one to understand.

A SIEM can be a valuable tool to use as the foundation of an organization's security operations workflow -- but only if it is configured/set up in such a way as to provide events of value to the analysts and incident handlers. In other words, if there is more noise than value-add, human nature is to tune out. There are a number of techniques that one can employ to increase the value-add of one's SIEM by selectively and precisely engineering/tuning the rules, events, and views presented to the analyst. They mainly involve (surprise, surprise) studying, analyzing, and understanding the data on the network, and then purposely selecting those rules, events, and views that will return the most bang for the buck when presented to the analyst and/or incident handler.

I believe this to be an important point that is, unfortunately, often overlooked by organizations.

Friday, October 28, 2011

Outbound Denies

Many organizations don't allow a default route out of the enterprise. In other words, they force systems on the internal network to go through some sort of a proxy in order to reach the Internet. In general, this is a good thing, as it provides both auditing and egress control/filtering.

Monitoring this type of traffic creates two types of extremely interesting data from a network traffic analysis perspective, both involving denied outbound traffic:

1) Traffic that is proxy aware, but attempting to connect out to controlled/filtered/denied sites.
2) Traffic that is not proxy aware, and is thus dropped (no default route out allowed).

Both #1 and #2 can be caused by misconfigurations, and #1 can be caused by attempted drive-by re-directs that fail/are blocked (and thus do not result in successful compromise). In both those cases, the denied traffic is not caused by malicious code being successfully installed and executed. In other cases, however, #1 and #2 may indeed be caused by malicious code or some other type of rogue process.

It's another jumping off point that can be used to provide valuable insight into what may be occurring on the network. With the speed at which attackers maneuver, every little bit helps.

Sunday, October 16, 2011

Giving Credit Where Credit is Due

Giving credit where credit is due is a powerful and widely respected rule among analysts. Sometimes, certain types of opportunists (non-analyst types) will try to leverage the work of an analyst without giving credit where credit is due. Similarly, some of the opportunist types are inclined to use the connections/experience/reputation of an analyst without properly respecting those connections/experience/reputation, some of which may have taken years to earn. Needless to say, this doesn't go over very well with the exploited/leveraged analyst, nor the larger analytical community. Giving both respect and credit where they are due is important. People take notice of these things.

Talk is Cheap

It used to surprise me when a client would tell me something like: "Wow, it's so unusual that we get a consultant in here who actually knows what he's talking about!". This is a powerful statement, but unfortunately, it no longer surprises me when I hear it. In the words of a colleague whom I respect tremendously: "There are an awful lot of hacks out there." Sadly, this appears to be the case. The phrase "easier said than done" is a fitting one. "Talk is cheap" is an even better one. It seems that in general, there are people who talk about the need to make progress, and then there are people (perhaps like myself) who actually do something and move the state of the art forward.

I hear an awful lot of cyber security rhetoric. It seems to increase in volume almost daily. Oddly enough, despite all the talk, I see very little progress week to week, month to month, or year to year. Talk is cheap. My challenge to the talking heads out there is to spend less time talking and more time doing. Give your mouths a break and get your hands dirty making some real changes. That is the only way to progress.

Thursday, October 6, 2011

Passive DNS

Collecting passive DNS data on your network can produce a data source of extremely high analytical value. Because the data collection is passive, it has little (or no) impact on operations. What is so interesting and valuable about passive DNS data, you ask? Well, for starters, it records what domain names were assigned to what IP addresses at the time those domain names were requested. This makes all sorts of interesting analysis possible. For example, which domain names were requested that point to IANA reserved (e.g., 192.168.0.0/16) or downright silly (e.g., 111.111.111.111) IP addresses? This can be an indicator of compromise, as malware authors will often park their callback or second stage domains at these types of IP addresses. Additionally, all the standard analysis that one would do on DNS logs can be done on passive DNS data as well. For example, which domain names have been changing IP addresses frequently or have extremely short TTLs (i.e., fast flux)? Or, as another example, which domain names have been requested periodically, or in a pattern more typical of a machine than a human being?
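As a rough illustration of the first question, the sketch below filters passive DNS answers down to those that point at private, loopback, link-local, reserved, or unspecified addresses, using Python's standard `ipaddress` module. The record format (pairs of domain and IP string) and the `PARKING_IPS` list of "silly" addresses are hypothetical; a real list would be maintained locally, since an address like 111.111.111.111 is ordinary routable space as far as the library is concerned.

```python
import ipaddress

# Hypothetical, locally maintained list of "silly" parking addresses.
PARKING_IPS = frozenset({"111.111.111.111"})


def is_odd_address(ip: str, parking_ips=PARKING_IPS) -> bool:
    """Return True if the address is special-purpose or on the local parking list."""
    addr = ipaddress.ip_address(ip)
    return (
        ip in parking_ips
        or addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_reserved
        or addr.is_unspecified
    )


def suspicious_resolutions(records, parking_ips=PARKING_IPS):
    """Return the (domain, ip) pairs whose answer is an odd address.

    `records` is an iterable of (domain, ip_string) pairs from passive DNS.
    """
    return [(d, ip) for d, ip in records if is_odd_address(ip, parking_ips)]
```

Internal domain names that legitimately resolve to private space will show up here too, so a whitelist of the organization's own zones is probably the first tuning step.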

Passive DNS data is a lot of fun to experiment with and analyze, and it provides a good deal of value. Check it out!

Thursday, September 22, 2011

Seek First to Understand

Taking a step back and truly understanding what you're looking at when analyzing network traffic is extremely important. There are quite a few people who can look for and analyze a well-defined, known threat. But what happens when attackers change tactics, or a new type of attack is encountered for the first time? It requires the analyst to take a step back, think deeply, and truly understand what is going on. This is a rare skill, but one that is invaluable. This type of thinking/mindset is what we as a community need more of in order to rise to the challenges we are confronted with on a continual basis. Long live the deep thinker.

Where to Look?

When I work with clients to build their Security Operations Centers (SOCs)/Incident Response Centers (IRCs), I often see a common challenge. As I've mentioned previously, most organizations spend a good deal of time instrumenting their network to collect data. Unfortunately, they don't often give enough thought to how one might analyze all that data. In other words, the questions of "Where do we put all this data?" and "What type of questions do we want to ask of all this data?" should be asked at the same time the instrumentation of the network is being planned. As you might expect, this is almost never the case. As a result, organizations often end up with large amounts of data in various different locations. There are some data types where there is a good deal of overlap with other data types, which results in redundancy, waste, and long query times (due to excessive volumes of data). In other data types, there may be (potentially large) gaps in the data, which results in the inability to ask certain (sometimes crucial) questions of the data.

What's striking is that the first question an analyst usually needs to ask is "Where do I go to get the data to answer my question?", rather than "What is the answer that the data provides to my question?". It's unfortunate. The good news is that better coordination between the collection side of the enterprise and the analysis side of the enterprise can result in incredible gains analytically. Something to keep in mind when building a SOC/IRC for sure.

Wednesday, September 14, 2011

Don't Forget About DNS

As a community, we've paid a good deal of attention to HTTP-based threats. This includes malicious downloads, callbacks, command and control, exfiltration of data, and other threats. To combat this threat, we've deployed proxies, firewalls, and myriad other technologies. A good deal of the threat feeds these days are dominated by HTTP-based threats, if not entirely focused on HTTP-based threats.

In some sense, the near-constant attention paid to HTTP-based threats is good. Looking at it from a different perspective though, it opens up organizations to attacks from other angles. We often pay an inordinate amount of attention to HTTP-based threats at the expense of other threats. One can't fault organizations for this -- there is still a tremendous amount of badness delivered via HTTP-based mechanisms of various different sorts and defending against it is very necessary.

In this post, I'd like to remind us not to forget about DNS. DNS is an incredibly flexible protocol that can move almost any amount of data in and out of organizations, sometimes undetected. For example, recently, I've seen DNS TXT records used for command and control of malicious code and even delivery of additional payload, rather than HTTP. Why would an attacker do this? Why not? When we learn of a malicious domain, what do we most often do? Yep, that's right -- we block HTTP-based communication with it (via proxy blocks, firewall blocks, or otherwise). But what do we most often do regarding DNS requests for those very same malicious domains? Yep, that's right -- nothing. So, can you blame the attackers for being creative in their exploitation and use of the DNS protocol?

Just another reason to stay diligent in the continual analysis of all types of network traffic -- not just HTTP-based traffic.

Sender Policy Framework

As I'm sure you know, many organizations face email spoofing/spam/phishing/spear phishing as one of their major infection vectors these days. Sender Policy Framework (SPF), which is RFC 4408, can help tremendously in combating this infection vector. SPF uses a DNS TXT record to specify which IP range(s) are permitted to send email as coming from a given domain. Its implementation is optimal. SPF's elegance is in its simplicity, and I would encourage organizations to consider implementing it if they haven't already.

To think about it through a concrete example, say I wanted to relay email and spoof the sender such that the email appears to be sent from someguy@example.com. If I'm attempting to relay email from a cable modem dynamic IP address, then I'm probably not a legit mail gateway for example.com. Implementing SPF instructs your mail server to perform this "reality check" before accepting the email. Seems straightforward, right? Exactly.
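The "reality check" can be sketched in a few lines. This is a deliberately simplified illustration, not a full SPF evaluator: it takes the TXT record as a string (no DNS lookup), only understands `ip4:` mechanisms with the default pass qualifier, and ignores `include`, `a`, `mx`, redirects, and everything else the RFC defines. The function name, the example record, and the addresses are my own assumptions.

```python
import ipaddress


def spf_ip4_permits(spf_record: str, sender_ip: str) -> bool:
    """Return True if sender_ip matches an ip4 mechanism in the SPF record.

    Simplified sketch: handles only "ip4:" terms with a pass qualifier.
    """
    terms = spf_record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return False  # not an SPF record
    addr = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        term = term.lstrip("+")  # "+" is the explicit pass qualifier
        if term.lower().startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if addr in network:
                return True
    return False


# Hypothetical record published for example.com.
RECORD = "v=spf1 ip4:192.0.2.0/24 -all"
print(spf_ip4_permits(RECORD, "192.0.2.25"))   # a listed mail gateway
print(spf_ip4_permits(RECORD, "203.0.113.7"))  # e.g. a cable modem address
```

In practice one would rely on the mail server's own SPF support or a maintained library rather than hand-rolled parsing; the point here is only how little information the check needs.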

Taming Your Ingress/Egress Points

Many organizations have legacy ingress/egress points that will route traffic to and from the Internet. In some cases, these ingress/egress points may have been "forgotten" about and as a result, are not being properly monitored. A well-run Security Operations Center (SOC)/Incident Response Center (IRC) can be highly effective and can greatly improve the security posture of an organization, but only if all ingress/egress points are well known and properly instrumented. To think about it another way, it's like trying to defend the network based on data that simply isn't there. Pretty hard to do.

Thursday, September 8, 2011

Other Alphabets

Westerners often forget that there are a decent number of alphabets in use that don't use Latin characters. In 2009, internationalized domain names that use non-Latin characters were approved, allowing for a much broader array of characters from which to build domain names. The non-Latin characters are translated for our "Latin-only" DNS system using an encoding known as "Punycode". What does this mean for analysts? There are a lot more domain names in use on our network than we might realize. Unfortunately, because of this, some of our "Latin-centric" analytical methods may miss certain traffic that we may want to inspect more closely. It's something to be aware of. Don't be a "Latin-centric" analyst. :-)
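For anyone who wants to see what this looks like in log data, here is a small sketch using Python's built-in `idna` codec. Punycode-encoded labels begin with the prefix "xn--", so they are easy to spot and decode. The helper names are mine, and the built-in codec follows the older IDNA 2003 rules, so treat this as an illustration rather than a complete IDN handler.

```python
def has_punycode_label(domain: str) -> bool:
    """Return True if any label of the domain carries the "xn--" prefix."""
    return any(label.lower().startswith("xn--") for label in domain.split("."))


def decode_idn(domain: str) -> str:
    """Return the Unicode form of an ASCII domain that contains Punycode labels."""
    return domain.encode("ascii").decode("idna")


def encode_idn(domain: str) -> str:
    """Return the ASCII (Punycode) form of a Unicode domain name."""
    return domain.encode("idna").decode("ascii")


for name in ["www.example.com", "xn--bcher-kva.example"]:
    if has_punycode_label(name):
        print(name, "->", decode_idn(name))
```

Decoding before matching against watch lists (or encoding the watch list entries) keeps Latin-only string comparisons from silently missing these names.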

Window of Opportunity

As I'm sure most of you reading this blog know, drive-by web re-directs are a major malicious code infection vector for organizations these days. Many proxy vendors continually make a noble effort to stay on top of domains hosting malicious code and push blocks down to their customers' proxy devices. This is actually highly effective at preventing a large number of malicious code infections in enterprises. What's interesting analytically though is that there is usually a 24-48 hour window between when a domain begins hosting malicious code and when the proxy vendors are able to push the blocks down to their customers. That time period is a window of opportunity for the attackers, and it's often enough time to infect countless systems.

So how can we turn this tidbit into an interesting analytical technique? What about reviewing the list of blocks received from our proxy vendor and searching back a week or two in our proxy log data to see if any systems were infected before the block was pushed down? Pretty neat if you ask me, and a highly effective way to identify infected systems in the enterprise.
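A minimal sketch of that retroactive search follows. It assumes the proxy log has already been parsed into (timestamp, client IP, domain) tuples and that the vendor feed can be reduced to a mapping of domain to the time the block arrived; both shapes, along with the two-week lookback, are assumptions for illustration.

```python
from datetime import datetime, timedelta


def pre_block_hits(proxy_log, blocks, lookback=timedelta(days=14)):
    """Return proxy log entries for blocked domains seen before the block arrived.

    proxy_log: iterable of (timestamp, client_ip, domain) tuples
    blocks:    dict mapping domain -> datetime the block was received
    lookback:  how far before the block to search
    """
    hits = []
    for ts, client, domain in proxy_log:
        blocked_at = blocks.get(domain)
        if blocked_at is not None and blocked_at - lookback <= ts < blocked_at:
            hits.append((ts, client, domain))
    return hits
```

Each hit is a client that reached the domain while the window of opportunity was still open, which makes it a good candidate for a closer look.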

Sunday, August 28, 2011

Common Challenges

When I work with Security Operations Center (SOC)/Incident Response (IR) clients that are in the SOC/IR building phase, I often see them encountering similar challenges. More often than not, the clients are frustrated and overwhelmed, as they desperately want to do the right thing. The good news, I reassure them, is that all organizations have the same fundamental challenges. I've worked with quite a few different organizations, and there is always a way through the maze. Clients often find it reassuring to know that they are not the only organization with the challenges they see before them. I am more than happy to help them through the fog. In fact, one client recently told me I should write a book about SOC/IR building. Maybe I will one day....

Give and Take

The cyber security community is a community built almost entirely on trust. This is especially true in the Security Operations Center (SOC)/Incident Response (IR) world. Relationships are built over time through a give and take. In other words, an organization should expect to give to the community before expecting to take from the community. At the very least, an organization should attempt to truly understand the community and its nature before attempting to wedge in. It amazes me how many organizations attempt to take from the community with no history of or intention of giving anything back. The SOC/IR community is a close knit one, and this type of behavior is often perceived as untrustworthy and/or exploitative. Needless to say, these types of organizations aren't very successful in making any headway in the community. Perhaps what amazes me even more is how surprised some of these organizations are by the lack of progress, despite being advised to the contrary. I believe the psychological term for this type of behavior is "cognitive dissonance".

Wednesday, August 10, 2011

Uber Data Source

This afternoon I gave my GFIRST talk entitled "Uber Data Source: Holy Grail or Final Fantasy?". The purpose of the talk was to get people in the audience thinking about the challenges of complex network instrumentation/data collection and data overload confronting many Incident Response Center (IRC)/Security Operations Center (SOC) organizations today. A number of people seemed to agree that the current model of collecting dozens of different formats of data in ever larger volumes and varieties can't continue.

The community appears to be receptive to the idea that we need to consider moving towards a consolidated uber data source that allows us to successfully monitor our networks and investigate incidents/events. In addition, there are a number of vendors beginning to move in this direction, which is great to see.

My talk should be posted on the GFIRST conference website (http://www.us-cert.gov/GFIRST) after the conclusion of the conference. I'd be interested in hearing thoughts and opinions regarding both the talk and the concept of the uber data source in general.

Chess Game

I'm at the GFIRST conference this week and have been bumping into a number of contacts and colleagues. The conference has been great so far. I had an interesting discussion this week with someone who is employed at a large government agency. We were discussing the pros and cons of federal employment. One of the items we discussed was how some of the best and brightest get frustrated and burn out in the federal sector due to the politics, bureaucracy, etc. He responded by telling me about how he succeeds by approaching the politics and bureaucracy like a giant game of chess, always contemplating his next move and trying to outwit the opponent. I see his point, and I admire his ability to survive and flourish within the "system". But deep down, this troubles me.

The best and the brightest folks, those that we need to have monitoring and securing our most critical government assets, aren't interested in playing chess. They want to be put to work on analytically challenging and motivating tasks. The chess game frustrates them, burns them out, and causes the best and brightest to leave the federal sector. I think this is extremely unfortunate, and I can only hope that one day those at the top of the political pyramid will realize this and change things for the better. Until then, I see this as a major challenge for the federal sector.

In the great game of politics and bureaucracy, it's unfortunately the American people who lose.

Friday, July 22, 2011

Free and Open Discourse

In the world of network traffic analysis and network security monitoring, free and open discourse is extremely important. Analytical techniques that are subject to discussion and critique will always be better than those that aren't. Quite simply put, analysts cannot thrive in a bubble, and neither can a robust network security monitoring program.

Unfortunately, there are some organizations that keep their analysts insulated, for whatever reason. These organizations tend to wall themselves off from the free exchange of ideas. In my experience, this hurts those organizations, as their analysts often fall behind the collective intelligence.

The good news is that ideas are always willing to be heard if someone is willing to listen.

Friday, July 8, 2011

What? When? Where? How?

When conducting network traffic analysis in support of an incident investigation, it's important to remember the four questions of incident response that an analyst should seek to answer. They are:

What?
When?
Where?
How?

The other two question words in the English language, namely Who? and Why?, are best left for law enforcement to answer for a number of reasons. That's a bit beyond the scope of this blog, so I'll brush it aside for now.

The four questions of incident response can be elaborated a bit more as:

What happened? What type of incident has occurred? What damage has occurred?
When did the incident happen? When was the incident detected?
Where did the incident occur? Is it isolated or widespread? Where is the incident coming from?
How did the incident occur? How did the intruders get in (the infection vector)?

If an analyst keeps these four questions in mind, it's much easier to focus an incident investigation/analysis and ensure that the correct supporting evidence is maintained and that the correct information is reported.

It's an intuitive approach that has been proven to help analysts focus their attention on the most value-added activities. Hopefully you'll find it useful as well.

Friday, June 24, 2011

Spear Phishing

Spear phishing is a common way that attackers get into organizations. Sometimes, when attempting to spear phish an organization, an attacker will spoof one of the targeted organization's email addresses to make the spear phishing message look more legitimate. Mail protocols aren't great at prohibiting this, and thus, it's a fairly successful technique.

A simple analytical method to monitor for this is to watch mail logs or a PCAP solution for "From" addresses claiming to be from within your organization, but from mail gateway IP addresses or sender IP addresses that are outside of your organization. The data resulting from this is quite fascinating. Have a look!
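Here is a minimal sketch of that check. The internal domain list and internal network ranges below are placeholders I made up; the function assumes the "From" address and the sending IP have already been extracted from mail logs or PCAP.

```python
import ipaddress

# Hypothetical values; substitute the organization's own domains and ranges.
INTERNAL_DOMAINS = frozenset({"example.com"})
INTERNAL_NETWORKS = (ipaddress.ip_network("10.0.0.0/8"),)


def is_spoofed_internal(from_addr, sender_ip,
                        internal_domains=INTERNAL_DOMAINS,
                        internal_networks=INTERNAL_NETWORKS):
    """Return True if the From address claims an internal domain but the
    sending IP address is outside the organization's networks."""
    domain = from_addr.rpartition("@")[2].lower()
    if domain not in internal_domains:
        return False
    addr = ipaddress.ip_address(sender_ip)
    return not any(addr in net for net in internal_networks)
```

Legitimate third-party senders (a hosted newsletter service mailing on the organization's behalf, for example) will also match, so expect to whitelist a few known gateways before the results become interesting.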

GFIRST Presentation

The GFIRST Agenda came out today, and I saw that I will be presenting on the Wednesday afternoon of the conference. I'm going to be speaking about layer 7 meta-data. My goal is to get people thinking about the difference between data value and data size. I'm hoping the talk is well received!

I always enjoy GFIRST, as I seldom have the opportunity to be around so many like-minded analyst geeks at one time.

Monday, June 20, 2011

4G Hotspot

I recently picked up a 4G hotspot and am loving it so far. It did make me realize, however, that there are now more than a few options for bringing your own network with you wherever you go and hopping on-line from anywhere. Think mobile phones, tablets, 3G/4G hotspots, etc. Why am I blogging about this? Because it occurred to me that it's now possible to physically sit inside an enterprise and send and receive information over your own portable network. Guess what? There's no way for an enterprise to monitor that. Scary.

Loss of Visibility

A couple of months ago, I spoke on a panel discussing some looming challenges in the field of cyber security. As might be expected, many people asked questions of the panel relating to the move to cloud computing. At one point, I was asked what my greatest fear was relating to the cloud. My answer? Loss of visibility. When an organization moves to the cloud, that organization effectively outsources all of its logging and auditing. What if the cloud provider doesn't have all the painful lessons learned that many of us do? It pays to ask, IMO.

Remember, even the best analyst can't identify security issues on a network if the data isn't there to support the analysis....

Friday, May 20, 2011

We Already Use Layer 7 Enriched Meta-Data and Don't Know It

It occurred to me the other day that many of us analyst types already use layer 7 enriched meta-data and likely don't realize it. In thinking about it, it dawned on me that DNS logs, proxy logs, IDS alerts, etc. are all highly specialized versions of layer 7 enriched meta-data. Think of the power of generalized layer 7 enriched meta-data -- the ability to exploit all the advantages of meta-data, while simultaneously providing much of the detail necessary for determining the true nature of network traffic of interest. Possibly the uber data source?

Merits of Meta-Data

In my previous post, I discussed the logic behind keeping a record of all traffic traversing the network. As we all know, the traffic that a large, enterprise network generates is incredibly voluminous. So what does one do to best keep eyes on the network? I believe the key here is meta-data. Meta-data describes the envelope information about transactions/conversations on the network, but doesn't include the content of the actual conversation. Network flow data is one type of meta-data, while layer 7 enriched meta-data (discussed in a previous blog post) is another type. This allows for several key advantages:

Long term retention of data for auditing and forensics purposes without the need for large amounts of expensive disk space.
The ability to see all the data without needing to sample, filter, or drop certain traffic.
Rapid search capability over vast quantities of data collected over long periods of time.

Now, for sure there is information in the packet data that is helpful for identifying the true nature of malicious or suspicious traffic. I believe that meta-data based technologies and packet-based technologies can work together beautifully here. Meta-data allows one to craft incisive queries designed to interrogate the data so as to identify network traffic that requires further investigation. I call these jumping off points (also discussed in a previous blog post). From there, the packet data can be consulted to assist in the investigation (presuming that the retention window for the packet data has not already expired).

As the amount of traffic on our networks continues to grow, I believe that we as a community will need to get used to the network traffic analysis model/work flow described above. I sometimes refer to it as breadth, then depth. I believe it to be a model capable of scaling with the data volumes of the present and future.

Seeing It All

There are some network monitoring technologies and some industry practitioners that practice sampling, filtering, or dropping of certain traffic. The logic here is that certain traffic is known to be noise that is of no concern from a cyber security perspective, and needn't be examined. Unfortunately, there is a fatal flaw in this logic. What may appear to be without value today may turn out to be priceless tomorrow. Where would I hide if I were an attacker and wanted to persist APT (Advanced Persistent Threat) style? In the traffic most commonly sampled, filtered, or dropped by most network monitoring technologies. Even the most highly skilled analyst can't find a stealthy threat if the data isn't there to analyze. We are only as good as our data. We need to see it all.

Wednesday, May 4, 2011

Analyst Freedom

By nature, analysts are an inquisitive bunch who enjoy discovering new ways to interrogate the data. As in many professions, analytical inspiration comes from a variety of sources and in irregular spurts. One thing I've noticed throughout my career is that environments that are more flexible and allow for more outside-the-box thinking (analyst freedom if you will) generally produce more unique and novel analytical techniques. Although the organization has less control over things, they are often the better for it. There is something about bureaucracy and rigidity that seems to work against analytical inspiration. It's fascinating.

Thursday, April 14, 2011

Darknet

Team Cymru defines a darknet as "a portion of routed, allocated IP space in which no active services or servers reside" (http://www.team-cymru.org/Services/darknets.html). Darknets, as it turns out, are an analytical goldmine. Why is this you ask? The answer has to do with signal-to-noise ratio.

Since there are no legitimate services or servers on "dark" portions of the network, all the traffic destined for the darknet becomes suspect and therefore analytically interesting. For example, consider an aggregate analytic that looks at the top destination ports for inbound TCP traffic by number of sessions. If we were to look at traffic across the whole network, we would probably get something like this after running our analytic:

Port | Session Count
80 | #######
53 | #######
25 | #######

Not surprisingly, we see that our routine web, DNS (yes, even DNS over TCP), and SMTP traffic (all necessary and expected for normal business operations) would top the list. In this case, standard business traffic is the noise that hides the signal of the suspicious/malicious traffic. Could we somehow lower the noise to make the signal jump out at us? Yes -- through the power of darknet!

What happens if we run the same aggregate analytic, but now restrict it to traffic destined for our darknet? That's where we might get a more interesting result:

Port | Session Count
5678 | #####

Why would we have traffic inbound to TCP port 5678 (which I chose purely for illustrative purposes)? I don't know why, but I do know that what we now have is a jumping off point. Is someone attempting reconnaissance of our network? If they are, what are they looking for? Do we have other systems communicating with the outside world on that port that we never noticed before? These questions and others would need to be answered by network traffic analysis via deep-diving into the data with the end result of determining the true nature of the traffic.
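
The analytic described above can be sketched in a few lines. This is a minimal illustration, not a production tool: the darknet range, the session tuples, and the function name are all hypothetical, and the addresses come from blocks reserved for documentation.

```python
from collections import Counter
from ipaddress import ip_address, ip_network

# Hypothetical darknet range (a documentation-only block), assumed for illustration.
DARKNET = ip_network("203.0.113.0/24")

# Made-up inbound TCP sessions as (destination IP, destination port) pairs.
SESSIONS = (
    [("198.51.100.10", 80)] * 5
    + [("198.51.100.20", 53)] * 4
    + [("198.51.100.30", 25)] * 3
    + [("203.0.113.7", 5678), ("203.0.113.200", 5678)]
)


def top_dest_ports(sessions, darknet=None, n=3):
    """Count sessions per destination port, optionally keeping only
    sessions whose destination falls inside the darknet range."""
    counts = Counter()
    for dst_ip, dst_port in sessions:
        if darknet is not None and ip_address(dst_ip) not in darknet:
            continue  # skip traffic to live (non-dark) address space
        counts[dst_port] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    print("Whole network:", top_dest_ports(SESSIONS))
    print("Darknet only: ", top_dest_ports(SESSIONS, darknet=DARKNET))
```

Across the whole network the routine ports dominate. With the darknet filter applied, only the unexplained port remains, which is the jumping off point.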

So, this is just a quick example of traffic analysis triggered by a darknet-provided jumping off point. Hopefully it has helped to illustrate the broader power of the darknet. Darknets are truly an analytical goldmine.

Thursday, April 7, 2011

Analytical Platform

Network traffic analysis plays an important role both in a successful network monitoring program and in an organization's overall information security posture. So why isn't it practiced more widely within the cyber security profession? Its newness and relative obscurity (until recently) are one reason for sure, but I'd argue that there is also another reason. As previously discussed (reference the post entitled "Making Analysis About Analysis"), analysis is often just too hard. Data is diverse, complex, and voluminous, and most of us have a hard time getting any kind of a useful handle on it. When we do have ideas of how to make sense of the data, the amount of data munging and custom coding required to move our ideas from conception to implementation is discouraging at best.

So how can we best enable analysts to create new analytical techniques? I believe that analysts need to be provided with an analytical platform that allows them the freedom to quickly and easily develop, test, and implement new analytical techniques without the hassles of data munging and data manipulation. In other words, the analytical platform should abstract the data, providing the analyst with an intuitive way to interact with the data. Additionally, the analytical platform should allow the analysts to seamlessly interact with the results of their analytical queries as they conduct their investigation.

For many years, I have dreamed of such a capability. The good news is that there are now products and technologies coming onto the market that begin to address this need. Hear, hear!

Wednesday, April 6, 2011

Training

As the world awakens to the need for network monitoring, training will be an area we'll need to take a look at and put some effort into. The threat against us and the operational challenges confronting us are real. The network traffic analysis skill set, once an obscure, niche skill set, will need to be something we can rapidly imbue cyber security professionals with. There are a few challenges here:

1) There isn't a great deal of literature/background reading on the topic.
2) There aren't specialized training classes that a cyber security professional can enroll in to gain this skill set per se.
3) It turns out that it's often quite hard to do analysis for a number of reasons (reference an earlier post entitled "Making Analysis About Analysis").

For point 1, I'm looking to my recent ISSA Journal article, along with articles (past, present, and future) from others in the field to form the beginnings of a knowledge base for the industry. I envision this knowledge base growing over time to provide the necessary background material for those new to the network monitoring field.

Regarding point 2, I'm hoping that the various different cyber security training institutions/organizations that exist will begin to form curricula around the topic of network monitoring/network traffic analysis. I see this as necessary, since those organizations have trained and will continue to train a large number of professionals in the field.

On point 3, I'm looking to technology to help. There are emerging products and technologies that provide an analytical platform upon which network monitoring/network traffic analysis techniques can be developed without all the frustrations of "fighting with the data" that are commonplace today.

There is some work that we as a community need to do here. I am optimistic that we will together rise to the challenge. The time has come to get to work.

Friday, April 1, 2011

ISSA Journal Article

I was fortunate enough to have an article I wrote on a methodology for network traffic analysis published in the April ISSA Journal. The article lays out the jumping off points approach and gives some practical techniques for monitoring an enterprise network. Here is the abstract from the article:

"This article describes practical techniques for the cyber security professional to efficiently sift through the voluminous amounts of network data. These techniques leverage different views of the data to discern between patterns of normal and abnormal behavior and provide tangible jumping off points for deeper investigation."

If you are interested, give it a read and share your thoughts!

Thursday, March 31, 2011

80-20 Rule

Throughout my career, I've had the utmost respect for the Pareto principle, also referred to colloquially as the 80-20 rule. This principle is a point of frustration for the most experienced network traffic analysts. They can often get 80% of the results they need with a reasonable amount of effort. Achieving the last 20% is often an arduous task, though that last 20% is often the 20% we should be the most concerned about. In other words, the last 20% is often where the most interesting results are, and where the attackers seem to repeatedly eat our lunch.

There are some emerging technologies coming onto the scene now to get the uber analyst closer to that last 20%. At the same time, the broader cyber security community is awakening to the first 80% (in that the awareness of the need for network monitoring is rising). We live in interesting times for sure, and I'm excited to watch the evolution.

Tuesday, March 29, 2011

Sharing

Of late, I've realized that the network monitoring and analysis techniques that are well known within my particular niche professional area are not well known in the larger cyber security community. The trouble with this, of course, is that there are many skilled and talented cyber security practitioners who could make good use of this knowledge to defend their networks and improve their information security posture. In thinking of a way to begin to share some of my accumulated knowledge for the good of our networks, I wrote an article. The article discusses a methodology for network monitoring and gives some practical tips. I am working on getting it published with the hope that it can serve as the first step in sharing some network monitoring advice with a professional community increasingly thirsty for it.

My intent is to continue to share knowledge and techniques with the larger community. All indications are that the community is extremely interested in the topic.

Monday, March 21, 2011

Collection and Analysis

We as a community often focus a large portion of our attention on collection without giving analysis the forethought it deserves. I've seen lots of instances where all kinds of data are being collected, and in some cases, the volume is overwhelming. During the process of instrumenting the network to collect this data, how to make the best use of it was never discussed or considered. The result is an unstructured, disorganized set of data that is difficult to turn into the organized, well-structured data required for quality analysis.

We mustn't forget the other half of the equation: analysis. I think if we as a community thought more about what we wanted to do analytically before we instrumented our networks, we would save ourselves a lot of pain down the line. Perhaps we will improve in this area in the coming years.

Wednesday, March 16, 2011

The Future

Yesterday, I had the privilege of presenting to the University of Maryland Cyber Security Club. I spoke to them about network monitoring and some of the challenges one encounters when monitoring a real live network. We also discussed some techniques for monitoring a large, enterprise network. The students asked very insightful questions, and it was clear that they were very bright and had a firm grasp on the topic.

I issued the students this challenge: "Seek out, identify, and study the unknown unknowns and turn them into known knowns" (reference an earlier blog post regarding known knowns). I believe that this is the boiled down essence of our obligation as network monitoring professionals/analysts.

The future holds great potential for our field. I am realizing that the onus is on those of us currently in the field to capture the interest and energy of the brightest minds. The network monitoring field and broader cyber security field face many challenges, and in order to conquer them, we will need the best and the brightest.

Sunday, March 13, 2011

Jumping Off Points Revisited

I have previously blogged about the concept of jumping off points -- identifying subsets of data in a workflow that an analyst can run with. I often speak about this topic as well -- raw data can be overwhelming, and finding a way to present it to an analyst with a way forward can greatly improve productivity.

What I discovered last week was that this concept holds for other types of work as well. I was working on a paper with a co-worker. We were having trouble finding a way forward until we identified a jumping off point that we could work from. Once we figured out our jumping off point, everything else flowed. It was amazing.

I guess I shouldn't be surprised by now that a structured, well organized approach would yield good results.

Monday, March 7, 2011

Known Knowns

When I speak at conferences or on panels, I often discuss the concept of thinking about data transiting a network in terms of "Known Knowns", "Known Unknowns", and "Unknown Unknowns". This was a concept that Donald Rumsfeld spoke to in a 2002 press briefing, in the run-up to the Iraq war. It's a concept that is very relevant to the cyber security world, and specifically to network monitoring/network traffic analysis. Unfortunately, it's a very underutilized framework, but it can add great value to an organization's network monitoring approach.

"Known Knowns" are network traffic data that we understand well and can firmly identify. Members of this class of network traffic can be categorized as either benign or malicious. Detection methods here can be automated and don't require much human analyst labor on a continuing basis. Unfortunately, this is the class of network traffic that we as a community spend the bulk of our time on. Why do I use the term unfortunately? More on that later.

"Known unknowns" are network traffic data that we have detected, but are puzzled by. We don't have a good, solid understanding of how to categorize this class of network data. One would think that because of this, we should spend a decent amount of time trying to figure out what exactly this traffic is. After all, if we don't know what it is, it could be malicious, right? Unfortunately, not enough time is put into this class of network traffic, and as a result, most organizations remain puzzled and/or turn a blind eye to the known unknowns. Why don't we work harder here? We're too focused on the known knowns.

"Unknown unknowns" are network traffic data that we have not yet detected, and as a result, we aren't aware of what this class of network traffic is (or isn't) doing on our network. This is the class of network data that contains most of the large breaches (and thus most of the collateral damage), as well as most of the truly interesting network traffic. Finding this traffic takes a skilled analyst, good tools, the right data, and a structured, well-organized approach to network monitoring. Ironically, this class would be extremely interesting to a skilled analyst, but due to the known known "rut" that we as a community are in, analysts don't really get a chance to touch this class.

So now I think you can understand why I think it's unfortunate that we as a community are so focused on the known knowns. We are so busy "detecting" that which we've detected time and time again, that we ignore the bulk of the rest of the network traffic out there. That's where we get in trouble repeatedly.

On the bright side, I do see the idea of taking an analytical approach to information security slowly spreading throughout the community. I think it's only a matter of time before one organization after another wakes up to the fact that their 1990s era signature-based approaches are only one part of the larger solution. With proper analysis and monitoring of network data and network traffic comes knowledge. And with knowledge comes the realization that what you don't know is often a lot scarier than what you do know.

Friday, March 4, 2011

Data Value

Lately, I've been thinking quite a bit about the value that different types of data provide. Specifically, I've been considering the analytical, monitoring, and forensics value of different data types. Typically, organizations instrument their networks to collect a wide variety of data (flow, IDS, DNS, router logs, firewall logs, proxy logs, PCAP, etc.). Very quickly, the amount of data collected, along with the diversity of the data, can confuse and complicate the network monitoring goals of an organization. An organized, well-structured approach is critical to successfully monitoring a network, and "data overload" is a serious detractor. I've seen it with my own eyes many times.

In thinking about why organizations end up in an overloaded/confused/complicated state, I've come up with two primary reasons:

1) No one data type by itself gives them what they need analytically/forensically/legally.

2) There is great uncertainty about what data needs to be collected and maintained to ensure adequate “network knowledge”, so organizations err on the side of caution and collect everything.

To me, this seems quite wasteful. It's not only wasteful of computing resources (storage, instrumentation hardware, etc.), but it's also wasteful of precious analytical/monitoring/forensics cycles. With so few individuals skilled in how to properly monitor a network, the last thing we want to do is make doing so harder, more confusing, and further obfuscated.

The good news is, I think there is a way forward here. As I discussed in a previous post, enriching layer 4 meta-data (e.g., network flow data) with some of the layer 7 (application layer) meta-data can open up a world of analytical/monitoring/forensics possibilities. I believe that one could take the standard netflow (layer 4) meta-data fields and enrich them with some application layer (layer 7) meta-data to create an "uber" data source that would meet the network monitoring needs of most organizations. I'm not sure exactly what that "uber" data source would look like, but I know it would be much easier to collect, store, and analyze than the current state of the art. The idea would be to find the right balance between the extremes of netflow (extremely compact size, but no context) and full packet capture (full context, but extremely large size). The "uber" data source would be somewhat compact in size and have some context. Exactly how to tweak that dial should be the subject of further thought and dialogue.
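
To make the idea a little more tangible, here is one possible shape for such a record. It is purely a sketch: the post leaves the exact contents of the "uber" data source open, so every field name below, and the choice of which layer 7 fields to carry, is an assumption made for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowRecord:
    """Layer 4 meta-data, roughly what standard netflow provides."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int      # IP protocol number, e.g. 6 for TCP
    start_time: float  # seconds since the epoch
    packets: int
    byte_count: int


@dataclass(frozen=True)
class EnrichedFlowRecord(FlowRecord):
    """The same flow plus a few optional layer 7 fields (hypothetical choice)."""
    app_protocol: Optional[str] = None
    http_host: Optional[str] = None
    http_user_agent: Optional[str] = None
    dns_query: Optional[str] = None


def enrich(flow: FlowRecord, **layer7) -> EnrichedFlowRecord:
    """Copy the layer 4 fields and attach whatever layer 7 context is known."""
    return EnrichedFlowRecord(**asdict(flow), **layer7)


if __name__ == "__main__":
    flow = FlowRecord("192.0.2.15", "198.51.100.10", 49152, 80, 6, 0.0, 12, 4096)
    print(enrich(flow, app_protocol="http", http_host="example.com"))
```

Each optional field adds context at a small per-record cost, far short of storing full packets. Deciding which fields earn their place is the dial mentioned above.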

This is something I intend to continue thinking about, as I see great promise here.

Wednesday, February 23, 2011

Pow-wow Power

Today I was on-site working with a customer, which I always enjoy. For part of the day, we engaged in an analytical pow-wow -- a free exchange of analytical thoughts, ideas, and methods. The session was great fun, but also highly effective. We all learned from each other, and after an hour or two, we had some cool new analytical techniques we wanted to try out. These pow-wows are always interesting and never fail to be productive. Definitely something to keep in mind for planning purposes.

Thursday, February 3, 2011

20th Century Thinkers

I've been reading a good number of articles lately that pose critical questions relating to the cyber domain. The more I read, and the more I listen to our leaders discuss the topic, the more I realize that we have a problem. Our society is filled with great 20th Century thinkers. Unfortunately, it appears that the 21st Century will be quite different from the 20th Century, and will require different ways of thinking about and approaching the challenges we face. So far, I don't see as much 21st Century thinking as I think we need. Hopefully we as a society will rise to the challenge.

Wednesday, January 26, 2011

Making Analysis About Analysis

At FloCon this year, I spoke about pictures. Yes, that's right, pictures. My point was that analysis is too hard -- most analysts spend about 80% of their time munging data and fighting with data and only about 20% of their time actually doing analysis. This is simply something we can't continue if we are to succeed in defending our networks. I tried to communicate my strong belief that analysis should be about analysis, and that we as a community need to both provide and use better tools to make this happen. I think the community will warm to this concept, but it won't happen overnight. I see "empowering the analyst" as a strategic direction that the community will likely be heading in the coming years. This plays nicely with the realization of the larger cyber security community as a whole that the time for analysis has come. We need to know our networks. Analysis has arrived.

Enriched Flow Data

Lately I've had a number of discussions with colleagues about how enriching network flow data (netflow) can take it from being a good analytical data source to a great and incredibly powerful analytical data source. Netflow is a data source with an incredible amount of breadth -- it's more or less a record of every transaction on your network. The good news for us analysts is that nowadays there is enough technology around to enrich netflow with layer 7 (application level) data. Once you do this, there is seemingly no limit to the creative and interesting analytical techniques you can develop. Something to think about for sure.
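
As one example of a technique that only becomes possible once flows carry layer 7 context, the sketch below flags flows where the observed application protocol does not match what the destination port suggests. It assumes a sensor has already labeled each flow with an application protocol. The record layout, the port table, and the sample flows are hypothetical.

```python
from collections import namedtuple

# Hypothetical enriched flow: layer 4 fields plus one layer 7 label.
EnrichedFlow = namedtuple("EnrichedFlow", "src_ip dst_ip dst_port app_protocol")

# Illustrative table of the application protocol expected on a few ports.
EXPECTED_APP = {25: "smtp", 53: "dns", 80: "http"}


def port_protocol_mismatches(flows):
    """Return flows on a well-known port whose observed application
    protocol differs from the one expected for that port."""
    mismatches = []
    for flow in flows:
        expected = EXPECTED_APP.get(flow.dst_port)
        if expected is not None and flow.app_protocol != expected:
            mismatches.append(flow)
    return mismatches


if __name__ == "__main__":
    flows = [
        EnrichedFlow("192.0.2.15", "198.51.100.10", 80, "http"),
        EnrichedFlow("192.0.2.16", "198.51.100.20", 53, "dns"),
        EnrichedFlow("192.0.2.17", "198.51.100.99", 80, "ssh"),  # odd: SSH on port 80
    ]
    for flow in port_protocol_mismatches(flows):
        print(flow)
```

With plain netflow, all three flows look routine because only the port is visible. The layer 7 label is what makes the third one stand out.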