Tuesday, March 11, 2014

100 a Day

One of the goals of an incident response team should be to handle no more than 100 alerts a day. At first, this may sound like a ridiculous assertion. However, I think that if we examine this more closely, you will agree that it makes sense. Let's take an analytical approach and go to the numbers.

As previously discussed on this blog and elsewhere, one hour from detection to containment should be the goal in incident response. Put another way, one hour should be the time allotted to work an alert, perform all required analysis, forensics, and investigation, and take any necessary containment actions. Let's say each of our analysts works an eight-hour shift. Assuming 100% productivity, that allows each analyst to work approximately eight alerts per day. Let's assume we want to work 96 alerts properly each day (since 100 is not evenly divisible by eight). That works out to a requirement of 12 analysts on shift (or spread across multiple shifts) to give proper attention to each alert. What happens if analyst cycles are taken away from incident response and lent to other tasks? The numbers look worse. What happens if the necessary analysis, forensics, and investigation take more than an hour (due to technology, process, or other limitations)? The numbers look worse still.
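If it helps to see the arithmetic spelled out, here is a minimal back-of-the-envelope sketch in Python. The one-hour-per-alert budget, eight-hour shift, and 100% productivity figure are the assumptions from the paragraph above; everything else is illustrative.

# Back-of-the-envelope staffing math for an alert queue.
# Figures mirror the assumptions above; they are illustrative, not measured.

HOURS_PER_ALERT = 1.0   # detection-to-containment budget per alert
SHIFT_HOURS = 8.0       # length of one analyst shift
PRODUCTIVITY = 1.0      # fraction of the shift actually spent working alerts

def analysts_needed(alerts_per_day: float) -> float:
    """Analysts required so every alert gets its full hour of attention."""
    alerts_per_analyst = (SHIFT_HOURS * PRODUCTIVITY) / HOURS_PER_ALERT
    return alerts_per_day / alerts_per_analyst

print(analysts_needed(96))    # 12.0 -- the figure used in this post
print(analysts_needed(5000))  # 625.0 -- what "5,000 per day" would actually require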

So, if you're the type of enterprise that has 500 analysts sitting in your SOC or Incident Response Center, you can probably stop reading this blog post and get back to your daily routine. What's that you say? The analyst is the scarcest resource, and you don't have enough of them? Yes, of course. I know.

Let's face it -- the numbers are sobering. Even a large enterprise with a large incident response team can realistically handle no more than 100-200 alerts in a given day. Sometimes I meet people who tell me that "we handle 5,000 incidents per day". I don't believe that for a second (putting aside, for now, the fact that incidents, events, and alerts are not the same thing). Either that organization is not paying each alert the attention it deserves, or the alerts are of such low value to security operations that it wouldn't make much difference whether they fired or not. One need only look to the recent Neiman Marcus intrusion to see the devastating effects of having too large a volume of noisy, low fidelity, false-positive-prone alerts that drown out any activity of true concern (http://www.businessweek.com/articles/2014-02-21/neiman-marcus-hackers-set-off-60-000-alerts-while-bagging-credit-card-data).

Clearly, the challenge becomes populating the alerting queue with reliable, high fidelity, actionable alerts for analysts to review in priority order (priority will be the subject of an upcoming blog post). This process is sometimes referred to as content development and can be outlined at a high level as follows:
  • Collect the data of highest value and relevance to security operations and incident response. As previously discussed on this blog, the goal is fewer data sources providing higher value at lower volume/size, while still maintaining the required visibility.
  • Identify goals and priorities for detection and alerting in line with business needs, security needs, management/executive priorities, risk/exposure, and the threat landscape. Use cases can be particularly helpful here.
  • Craft human language logic designed to extract only the events relevant to the goals and priorities identified in the previous step.
  • Convert the human language logic into precise, incisive, targeted queries designed to surgically extract reliable, high fidelity, actionable alerts with few to no false positives (a minimal sketch of this step appears after this list).
  • Continually iterate through this process, identifying new goals and priorities, developing new content, and adjusting existing content based on feedback obtained through the incident response process.
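To make the last two steps concrete, here is a minimal sketch. The rule, field names, domain list, and threshold are all hypothetical assumptions, and this is plain Python rather than any particular SIEM's query language; the point is only to show a sentence of human language logic becoming a precise filter.

# Hypothetical human language logic: "Alert when an internal host sends more
# than 10 MB outbound to a domain on our high-confidence bad-domain list."
# The rule, field names, domains, and threshold are illustrative assumptions.

HIGH_CONFIDENCE_BAD_DOMAINS = {"evil.example", "exfil.example"}
BYTES_THRESHOLD = 10 * 1024 * 1024  # 10 MB

def matches(event: dict) -> bool:
    """Return True only when every condition of the rule is satisfied."""
    return (
        event.get("direction") == "outbound"
        and event.get("dest_domain") in HIGH_CONFIDENCE_BAD_DOMAINS
        and event.get("bytes_out", 0) > BYTES_THRESHOLD
    )

# Sample events -- only the first satisfies the full rule and becomes an alert.
events = [
    {"direction": "outbound", "dest_domain": "evil.example", "bytes_out": 50_000_000},
    {"direction": "outbound", "dest_domain": "partner.example", "bytes_out": 80_000_000},
    {"direction": "inbound", "dest_domain": "evil.example", "bytes_out": 0},
]
alerts = [e for e in events if matches(e)]
print(alerts)

The particular rule matters less than its shape: every condition narrows the result set, so what reaches the queue is already worth an analyst's hour.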
Resources are limited. Every alert counts. Make every alert worth the analyst's attention.
