Thursday, May 29, 2014

Collect It, Keep It, Need It

According to the 2014 Verizon Data Breach Investigations Report (DBIR), as well as the Mandiant M-Trends 2014 Threat Report, intrusions remain on networks for many months before they are detected.  Further, a majority of the time, organizations do not discover the breaches themselves, but rather are notified by third parties.  These two facts bring with them an unfortunate reality -- when it comes time to do incident response, the required data is often not available to the incident response team.

There are several common reasons why the required data may not be available:

  • Collection: In some cases, organizations may not have their network properly instrumented for collection.  In other cases, organizations may not be properly equipped to retain the volume of data created by the network instrumentation and expose it for analysis.  Either way, when it comes time to investigate, the relevant data will not be available.
  • Visibility: Some organizations may have portions of their network instrumented for collection.  But what if the breach occurs in an area of the network that is not included in the area of visibility?  In those cases, data that is relevant to the breach investigation will not be available.
  • Retention: Sometimes, the network is properly instrumented in the appropriate places, but there is simply nowhere to put the volume of data that is generated.  As the volume of data grows, either the retention period shrinks, or the storage capacity grows to compensate.  It is not uncommon for the retention period to get down to one month or even less, making incident response for breaches with long-term presence extremely difficult.

The collection and visibility issues can be addressed by improving instrumentation of the network for collection.  But what about the retention issues?  Collecting fewer data sources of higher value to security operations can help, as I discussed in my “Data Value vs. Data Volume” post (http://ananalyticalapproach.blogspot.com/2014/04/data-value-vs-data-volume.html).  Ultimately, even with optimization of data collection, a large network will generate a large volume of data.  Given the amount of time breaches remain present before detection, and the key role network traffic data plays in investigating a breach, organizations are retaining network traffic data for increasingly longer periods.  It’s probably a good idea to retain 6-12 months of meta-data (or more) if possible.  Granted, this requires budgeting the resources to accomplish it.  Given the financial damage recent breaches have inflicted, increased retention, done properly and optimally, seems like a good investment.
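As a rough illustration of the budgeting exercise involved, a back-of-envelope calculation can translate a desired retention period into a storage figure. The sketch below uses entirely hypothetical numbers; substitute measured volumes from your own environment before drawing any conclusions.

    # Back-of-envelope sizing for network meta-data retention.
    # All figures are hypothetical placeholders -- substitute measurements
    # from your own environment.
    flows_per_day = 2_000_000_000   # meta-data records generated daily (hypothetical)
    bytes_per_record = 150          # average stored size of one record (hypothetical)
    retention_days = 365            # desired retention period

    raw_bytes = flows_per_day * bytes_per_record * retention_days
    print(f"Estimated raw storage for {retention_days} days: {raw_bytes / 1e12:.1f} TB")

Even a simple estimate like this makes the trade-off between retention period and storage budget explicit, which helps when making the case for 6-12 months of meta-data.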

Tuesday, May 27, 2014

A Way Forward In Information Sharing

In a recent piece in SecurityWeek, I discussed the various challenges I perceive relating to information sharing (http://www.securityweek.com/understanding-challenges-information-sharing). Although I won’t rehash the details of the piece here, I did want to discuss a related point. I’m sure we can all understand the challenges, but what can an organization do about them?

In spite of, or perhaps because of, these challenges, much information sharing happens informally through ad hoc trusted relationships and informal forums. It may be preferable for organizations to have formal agreements and policies in place, but we are not there as a community yet. Until that time, security practitioners still need to exchange data to perform their jobs properly, and as such, informal relationships exist. That being said, there are still steps that can be taken to formalize information sharing efforts a bit more.

Communication and education are extremely important and seem to me to be good starting points. Security leadership within an organization can communicate the information sharing vision. A dialogue can begin with legal, privacy, and other relevant stakeholders within the organization, so that they can be educated as to the value and importance of information sharing and included in the efforts. People can begin tackling the challenge, building the organization’s street cred, and fostering collaborative relationships. The importance of information sharing can also be communicated to external organizations and entities that can help facilitate more formalized sharing.

Once communication and education are underway, a formal information sharing process can be developed. This process will include details regarding what type of information may and may not be shared, what to do if information is shared that should not have been, as well as the actual nuts and bolts around how information is collected, handled, used, and shared. Process in itself brings more formality to an information sharing effort and is an important part of the overall picture.

Technology is also important. Technology that facilitates, rather than fights, information sharing is a must. The data of record should be recorded with no losses or gaps. Searches for evidence of Indicators of Compromise (IOCs) should complete rapidly. It should be straightforward and smooth to both receive and share information. All of these factors contribute to enabling and empowering successful information sharing, rather than fighting it.
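To make the technology point a bit more concrete, here is a minimal sketch of the kind of rapid IOC search described above, assuming shared indicators arrive as a plain list of domains and proxy logs are available as CSV. The file names and field names (dest_domain, src_ip, timestamp) are hypothetical.

    # Minimal sketch: checking shared indicators (domains) against proxy logs.
    # File formats and field names are assumptions for illustration only.
    import csv

    def load_indicators(path):
        """Load one shared indicator (domain) per line."""
        with open(path) as f:
            return {line.strip().lower() for line in f if line.strip()}

    def find_matches(proxy_log_csv, indicators):
        """Yield proxy log rows whose destination domain matches an indicator."""
        with open(proxy_log_csv, newline="") as f:
            for row in csv.DictReader(f):
                if row.get("dest_domain", "").lower() in indicators:
                    yield row

    if __name__ == "__main__":
        iocs = load_indicators("shared_iocs.txt")
        for hit in find_matches("proxy.csv", iocs):
            print(hit["timestamp"], hit["src_ip"], hit["dest_domain"])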

Like most security endeavors, information sharing comes back to people, process, and technology. All of them are important, and all of them play an important role in a successful information sharing effort. Most people say that information sharing is a critical piece of a complete security operations program, and it should be given attention accordingly.

Friday, May 23, 2014

Servers, Shovels, and On-line Sales

eBay is in the news this week, having been the latest high profile victim of a breach. From the media reports I’ve seen, it appears that the attackers compromised certain eBay employee credentials and then used those credentials to log into internal databases containing information about eBay’s users. While the story has been widely covered, I thought it would be interesting to look at it from a different perspective than I’ve seen in the media coverage.

This breach interests me because it brings together two concepts I’ve written about in the past: “The Forgotten Servers” (http://ananalyticalapproach.blogspot.com/2014/03/the-forgotten-servers.html) and “Don’t Forget to Dig” (http://ananalyticalapproach.blogspot.com/2014/05/dont-forget-to-dig.html).

I won’t rehash the details of those posts here, but in essence, identifying server compromises is extremely difficult. Detecting infected endpoints is much easier in comparison, and because of that, it typically dominates the security operations workflow. As we know, though, servers generally contain far more valuable information, even though their compromises are far harder to detect. Server compromises generally live in the realm of the unknown unknowns, and they are usually far more serious than endpoint compromises.

This is where digging becomes important. While it may someday be possible to write reliable, high fidelity, low noise alerting for server environments, we’re not quite there as a community yet. No matter how rich and actionable our work queue of alerts is, we still need to dedicate some resources to work outside that linear flow. Those resources should be used to perform intensive analysis and “dig” through the network data using a variety of techniques. Server environments seem, to me at least, to be a great place to begin a digging initiative. Servers are valuable resources with valuable information that is worth keeping an eye on.

Bring your shovels. You’re going to need them.

Thursday, May 22, 2014

Shift Perspective

If you’ve ever worked in a security operations environment, you’re aware of how much day-to-day work revolves around systems (identified by IP address, MAC address, hostname, or otherwise). Alerts and events are generated per system. Investigations are based on detecting, analyzing, containing, and remediating infected systems. Intelligence is primarily leveraged to identify systems of interest. What’s interesting to me, though, is that if we take a step back, we see that systems aren’t actually the true pivot point -- users are.

Each user in an organization may use a number of different systems. For example, a given user may have a desktop computer, a laptop computer, a tablet, a smartphone, and other devices. Likewise, each system may be used by a number of different users. For example, a virtual desktop environment may have one IP address but be used by dozens of users. If all of our analysis is centered on systems, we miss the correlation that arises from linking a single user to multiple systems, or conversely, multiple users to a single system.

Why is this important? Let’s have a look at what happens when we shift our perspective for a few different use cases.

Insider Threat: Insider threat is a topic on many people’s minds these days. Whether the concern is a rogue employee, espionage, or something else, insider threat is a challenge best approached from the user perspective. Trying to identify insider threat activity is already extremely difficult. Trying to identify it solely by analyzing the activity of systems, rather than the activity of users, is nearly impossible.

Serial Offender: What is the difference between five different systems infected over a period of a few months and a serial offender? Correlation at the user level. Sometimes users have bad security “hygiene” that causes them to pose a greater risk to the organization. Taking a user perspective allows us to identify serial offenders and take steps to address the issue.

Lateral Movement/Staging for Exfiltration: From the system perspective, lateral movement and staging of data for exfiltration look very similar to legitimate network activity. The difference lies mainly in intent, which is nearly impossible to infer when looking at the problem from a system perspective. Looking at the problem from the user perspective allows us to gain an edge. Correlating activity to the user allows us to see if users are logging in from unusual places or logging into unusual places, among other suspect behaviors. Perspective changes everything here.

Stolen Credentials: Two systems may log in to a server or access a file share at the same time, and we would think nothing of it. But if the same user account was used at the same time from two different systems in two different divisions of the organization on two different sides of the globe, that activity becomes a bit more suspect. Looking at the activity through the lens of user-level correlation allows us to tease out the difference.
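As a minimal sketch of what user-level correlation might look like for the stolen credentials case, assume authentication events carry a user name, a source system, and a timestamp (hypothetical field names). The logic simply flags an account seen on two different systems within a short window:

    # Flag accounts that authenticate from more than one system within a
    # short window. Event fields and the window size are assumptions.
    from collections import defaultdict
    from datetime import timedelta

    WINDOW = timedelta(minutes=5)

    def concurrent_logins(events):
        """events: iterable of dicts with 'user', 'src_host', and 'timestamp'
        (a datetime). Returns suspicious (user, host_a, host_b) tuples."""
        by_user = defaultdict(list)
        for e in sorted(events, key=lambda e: e["timestamp"]):
            by_user[e["user"]].append(e)

        findings = []
        for user, evts in by_user.items():
            for i, a in enumerate(evts):
                for b in evts[i + 1:]:
                    if b["timestamp"] - a["timestamp"] > WINDOW:
                        break
                    if a["src_host"] != b["src_host"]:
                        findings.append((user, a["src_host"], b["src_host"]))
        return findings

Enriching the events with geolocation or business-unit information would let the same correlation surface the "two sides of the globe" scenario described above.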

Essentially, systems are merely tools leveraged by users. Taking a vantage point that allows us to correlate activity by user, rather than by system alone, gives us a very different perspective. That perspective allows us to better identify and analyze certain types of activity on the network that we may want to investigate further.

Monday, May 19, 2014

Knowledge Without Borders

Knowledge knows no borders. Contributions to the collective human understanding have come from all different types of people. Not surprisingly, this is also the case within the security profession. We all learn from an incredibly diverse set of peers, and this is a good thing. Each person’s experience brings with it a fresh perspective that can be applied to the challenges at hand, and in the security profession, the challenges are many.

Unfortunately, there are still some people in this world who will discount, dismiss, or exclude ideas, experience, and expertise because of the race, religion, creed, ethnicity, or national origin of the people offering them. Aside from the obvious personal pain this type of hatred inflicts on its victims, there is also a professional issue arising from it that I would like to discuss.

Each security professional has a professional duty to protect his or her organization to the best of his or her ability. The attackers are constantly learning new tricks, improving their skills, and modifying their behavior. The risks to an organization resulting from an attack continue to rise. The threat landscape is continually evolving. Even under the best circumstances, we as defenders can barely keep up. If we are to have any hope of successfully protecting the valuable assets and information we are charged with protecting, we need all the help we can get, from any reliable and trustworthy source available.

When someone turns away, discounts, dismisses, or excludes ideas, experience, and expertise because of their origin, that person is putting his or her organization at great risk for no good reason. In essence, politics and personal prejudices are being put ahead of professional duty. It should come as no surprise that this is not acceptable -- the organization and its clients, shareholders, partners, leadership, and employees deserve and demand better. Security knowledge should be valued and respected regardless of its source. Anything less would simply be unprofessional.

Friday, May 16, 2014

Operational Experience

I firmly believe that there is no substitute for operational experience. I try to make it a daily practice to read blogs, articles, and other posts from around the information security community. This allows me to keep up with the latest news and developments, as well as to understand and learn from the views of others. I find it rather interesting that I can usually tell the difference between authors who have operational experience in the security field and those who do not. I often check my assessment via LinkedIn, Google, and other means, and I am usually correct. I’ve always wondered why there is such a clear delineation between the writings of those with operational experience and the writings of those without.

As the Albert Einstein quote reminds us, “In theory, theory and practice are the same. In practice, they are not.” This is a salient point. There is no shortage of ideas, theories, and suggestions for improving the state of security, but how many of them are rational, practical, and realistic? Operational experience causes people to see the world from a different perspective. It leads them to identify practical suggestions that can be implemented and operationalized in a realistic timeframe and without an unrealistic amount of resources. Further, it causes people to place more weight on the ratio of the resulting impact to the effort required to produce that impact, rather than on other potential decision making metrics. In my experience, operational experience enables better decision making and produces a better result, whatever the undertaking. This is particularly true in the security community, where resources are quite limited and expectations are quite high.

So, perhaps it is important to think about key personnel and decision makers within your security organization, at your vendors, and at your consultancies. What is their level of operational experience and familiarity with the issues you face? Have they spent time in the trenches? Are they making decisions and offering advice based on a solid foundation of formal training and on-the-job experience? In my experience, these are important questions to consider when hiring, selecting vendors, and retaining consultants. There is really no substitute for operational experience.

Wednesday, May 14, 2014

Difference Between Logging and Alerting

Based upon some conversations I’ve had lately, I feel comfortable stating that some people don’t understand the difference between logging and alerting particularly well. There is an important difference, and it becomes noticeably more important as the volume, velocity, and variety of data within a security operations setting continue to grow. In this post, I hope to illustrate the difference in order to help people mature their security operations programs.

In essence, I think much of the confusion stems from ambiguity around the meaning of the word “alert”. In a security operations setting, an alert is something that needs to be qualified, vetted, and acted on appropriately. Every alert that presents itself to the work queue should be reviewed. Due to resource limitations, an organization can realistically handle only a small number of alerts -- say on the order of hundreds per day at a typically resourced organization. We’ve all met people who make statements like, “Our SOC handles 100,000 alerts per day”. Bollocks.

Unfortunately, many technologies use the term “alerts” when they are really producing “logs”. For illustrative purposes, consider the following example. Say we install an IDS, outfit it with the latest ruleset, and set it loose. It will produce tens of thousands or hundreds of thousands of events per day. The vendor will call them alerts, but let’s take a step back and think about what the IDS is producing in actuality. The IDS is producing a notification every time it sees a packet matching one of the signatures it maintains. To me, this is a log -- a log of network events. I don’t intend to belittle or pick on IDS -- there is huge value to be brought to security operations by IDS and the events it produces. Rather, I am saying that IDS is essentially producing logs to be sent to the SIEM or data warehouse, just like the logs any other network sensor produces.

So, I’m sure you will ask me, “Well, what do you consider an alert then?”. Great question -- thank you for asking. To me, an alert consists of one or more events that match a defined risk, threat, concern, and/or priority to the business with a low incidence of false positives. For example, it is possible that thousands of IDS events, hundreds of proxy log entries, and a DLP event could meet the criteria of a particular alert we have developed. Thus, one alert would fire that happened to involve thousands of event logs.
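A minimal sketch of that distinction, assuming events from the different sensors have already been grouped per host for a given time window (the event shape, thresholds, and criteria are illustrative assumptions, not a prescription):

    # Many log events from several sensors are correlated into a single alert.
    def build_alert(host, window_events):
        """window_events: events for one host within a time window, each a dict
        with at least a 'source' field ('ids', 'proxy', or 'dlp')."""
        ids_hits = [e for e in window_events if e["source"] == "ids"]
        proxy_hits = [e for e in window_events if e["source"] == "proxy"]
        dlp_hits = [e for e in window_events if e["source"] == "dlp"]

        # Example criterion: sustained IDS activity, related proxy traffic,
        # and at least one DLP event -- together, a single alert of interest.
        if len(ids_hits) > 1000 and len(proxy_hits) > 100 and dlp_hits:
            return {
                "host": host,
                "summary": "possible data staging/exfiltration",
                "supporting_events": len(window_events),
            }
        return None  # the logs remain logs; no alert fires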

All of the various network sensors we have are important and contribute to our overall security posture. But, it’s important to remember that they produce logs, and not alerts. Alerts should be specifically designed around our requirements and use cases. Alerts are what get bubbled up to our work queue and provide us with jumping off points into analysis, forensics, and investigation around activity of interest. Because of this, alerts need to be higher in sophistication and fewer in number than logs. The logs that our network sensors produce are an important component of our alerts, but they are not alerts in and of themselves. In my experience, this is an important concept that, when properly understood, contributes to the differentiation between weaker security programs and stronger security programs.

Tuesday, May 13, 2014

Intent

I’ve been thinking lately about why so many attempts to identify attacks and detect intrusions result in such a large number of false positives. I believe one of the main reasons is that it is difficult for us to properly understand intent. Before I explain why, I think it helps to shift perspective to think about what an alert, and specifically a signature-based alert, means conceptually. The methodology of signature-based detection is essentially a packet by packet view of the world. Each “known bad” is checked against a snapshot -- a moment in time as the packet flies across the network. From this perspective, a packet by packet approach to detecting intrusions doesn’t quite seem rational, or at least not for intrusions that involve more than a small number of packets or take place over a period of time. We wouldn’t dream of identifying winning stocks solely based on their price as of 12:34:56 PM EST -- a single moment in time. Instead, we opt for additional context, so that we can better understand the complete picture and make a more informed decision.

The first step up from a packet by packet or alert-driven view of the world is to a session-driven view of the world. Building sessions allows us to put like activity together -- at least within the same type of data. Sessions have been in use for a while now for many different types of data, and they are quite a bit more useful for analysis and forensics than data that is not sessionized. For example, summarizing communication between two hosts is far easier with netflow data than it is with firewall logs.
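For readers who prefer code to prose, here is a minimal sketch of sessionizing: rolling individual records (packets or firewall log lines) up into per-host-pair summaries, closer to a flow view of the data. The record fields are illustrative assumptions.

    from collections import defaultdict

    def sessionize(records):
        """records: iterable of dicts with 'src_ip', 'dst_ip', 'bytes', 'timestamp'.
        Returns one summary per (src_ip, dst_ip) pair."""
        sessions = defaultdict(lambda: {"count": 0, "bytes": 0,
                                        "first_seen": None, "last_seen": None})
        for r in records:
            s = sessions[(r["src_ip"], r["dst_ip"])]
            s["count"] += 1
            s["bytes"] += r["bytes"]
            if s["first_seen"] is None or r["timestamp"] < s["first_seen"]:
                s["first_seen"] = r["timestamp"]
            if s["last_seen"] is None or r["timestamp"] > s["last_seen"]:
                s["last_seen"] = r["timestamp"]
        return sessions

Summarizing communication between two hosts then becomes a single lookup rather than a walk through thousands of individual log lines.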

Sessions point us in the right direction, but I believe that building narratives is the next logical step. In a piece in SecurityWeek (http://www.securityweek.com/security-operations-moving-narrative-driven-model), I explained the concept of the narrative-driven model. To understand the motivation for moving to a narrative-driven model, I think it helps to think about what detecting intrusions means in the abstract. Essentially, when we want to detect or hunt for intrusions, we are looking to understand intent -- what a particular sample of traffic intends to do. For example, is this particular packet part of a command and control or data exfiltration channel, or is it a false positive? The answer to that question requires understanding intent. And understanding intent requires more context than packets or sessions can provide. To understand intent, we need to understand the whole story around an alert. We need the narrative.

To me, it seems that where we don’t properly understand intent, we run into false positives. A command and control channel and benign activity might look nearly identical on the wire until we understand their intent. In the analog world, what is the difference between a hard-working employee taking a laptop and proprietary information home to continue working and an employee being paid to commit industrial espionage? At the packet level (what we see at a moment in time), there isn’t a difference -- to the human eye, both scenarios look exactly the same as the employee leaves the office for the day. The difference lies only in the intent, and that is something that we need additional information to understand. We can’t understand intent from a single packet, or even a single session. To understand intent, we need a richer context. We need the narrative. In my opinion, a better understanding of intent results in fewer false positives. The challenge lies in operationalizing this concept, of course, and the solution begins with building the narrative.

Monday, May 12, 2014

Dead or Undead

Recently, Symantec’s Senior VP for Information Security, Brian Dye, made headlines when he proclaimed that AV “is dead”. This, of course, created quite a bit of buzz in the press, presumably achieving the goal of the proclamation. Even with all the buzz, I’m not entirely sure that we’re having the appropriate follow-on discussion as a community. Allow me to explain.

If we think about what anti-virus is conceptually, it is a means or an approach to detect and contain malicious code. This is a noble goal and something we ought to be doing. If that is the case, then why are AV detection and containment rates well below 25%? I think the answer lies in the fact that the attackers have changed with the times, whereas AV methodologies and techniques are largely stuck in the 1990s. I believe that the main challenge with AV is that, like all signature-based technologies, it is designed to look for “known knowns”.

I’ve discussed the difference between “known knowns” and “unknown unknowns” in previous blog posts. We should, of course, be looking for what we know is bad. We would be crazy not to do that. These are the “known knowns”. In practice, however, the “known knowns” turn out to be only a portion of the attacks that compromise organizations. The most interesting activity lies in the “unknown unknowns”, and it is in this “pile” that we find the attacks that AV cannot detect (and thus cannot contain), including attacks that do not involve any malware at all.

Is it easy to find the “unknown unknowns”? Of course not. But that doesn’t mean that we shouldn’t try, and that we shouldn’t develop methodologies and techniques for that purpose. Put another way, the attackers have changed their game, and we need to change ours. We need to work harder on developing reliable ways to detect and contain anomalous behavior and activity. That is the only way to supplement the signature-based approaches of the past and increase detection and containment rates.

The question isn’t so much whether or not AV is dead. Rather, the question would seem to be whether detection and containment of malware can be done in a more effective way. I believe the answer to that question is yes.

Wednesday, May 7, 2014

Narrative-Driven Model

In a recent piece in SecurityWeek (http://www.securityweek.com/security-operations-moving-narrative-driven-model), I explained why the current alert-driven security operations model does not scale to meet today’s or tomorrow’s challenges. While the full piece offers a detailed explanation, I thought it would be useful to summarize some key points in this blog posting.

In the current model, alerts are generated by various different technologies and then sent to the work queue. Analysis and forensics are then performed manually to build a more complete picture of what occurred -- the narrative -- around the alert. Alerts contain a snapshot of a moment in time, while narratives tell the story of what unfolded over a period of time -- the attack kill chain.

When attacked, an enterprise needs to move rapidly from detection to containment. In order to make this leap, the enterprise needs to understand what needs to be contained. In order to understand what needs to be contained, the enterprise needs to understand what occurred. In other words, the narrative needs to be built. This can be time consuming, and thus cannot be done for every alert. Because of this, analysts must make a snap decision when an alert fires with little contextual information available to help them. As a result of this, misjudgments are to be expected. The consequence of misjudgments is that sometimes true positives are missed or overlooked, as was the case with some of the intrusions recently publicized in the media.

The security community needs a paradigm shift from an alert-driven security operations model to a narrative-driven security operations model. In other words, analysts need to be presented with complete (or nearly complete) narratives in their work queue, rather than alerts. More context enables better decision making. Better decision making enables better security operations.
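To sketch what presenting a narrative might involve, consider pulling all related events for the alerted host from the available data stores over a lookback window and ordering them chronologically. The event store interface and field names below are hypothetical assumptions.

    from datetime import timedelta

    def build_narrative(alert, event_store, lookback=timedelta(days=30)):
        """alert: dict with 'host' and 'timestamp' (a datetime).
        event_store: object exposing query(host, start, end), returning event
        dicts with 'timestamp', 'source', and 'detail' (hypothetical interface)."""
        start = alert["timestamp"] - lookback
        related = event_store.query(alert["host"], start, alert["timestamp"])
        # The narrative is simply the related activity in the order it occurred.
        return sorted(related, key=lambda e: e["timestamp"])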

Have a look at the SecurityWeek piece and let me know what you think.

Monday, May 5, 2014

Heads Will Roll

Today’s big news in the security world is that Target has decided to replace CEO Gregg Steinhafel. The leadership change is reportedly a result of the much publicized breach in late 2013 and follows the departure of Target’s CIO. There are a number of different aspects that I could write about relating to this topic, but there is one aspect in particular that captures my interest. What captivates me about this news is that, in 2014, we have come to the point where a security incident can topple an executive at the top of his or her career.

Replacing leadership after a serious incident (security or otherwise) is something we see frequently. Baseball team losing too many games? Fire the manager. Talk show not getting the ratings? Fire the host. I think what’s important here is not that Target will replace its CEO and CIO, but what comes of it in the long term. Sure, making big leadership changes is one way to catalyze the cultural change that is necessary within an organization. But seeing an improvement in the overall security posture, in any organization, requires strong and competent leadership at all levels, among other requirements.

Today’s CEO needs to be aware of the security threats to the enterprise and prepared to counter them. No one expects the CEO to be a security expert, but a security conscious CEO will put a knowledgeable, trustworthy CSO or CISO in place. That CSO or CISO will have the knowledge and skills required to put a strong security program in place. This is no easy task, and a big part of being successful in this endeavor is putting competent leadership in place at all levels. It is absolutely critical that every link in the security organization’s management chain be strong -- one weak link can completely change the dynamic and result in the introduction of a large amount of risk. Is it easy to find strong and competent leaders in the security field? Absolutely not. Is it worth the investment in time to seek out the right leaders? Absolutely.

It is tempting to fire key leadership after a serious security incident, but the true test of an organization is whether or not it improves its security posture in earnest. Leadership changes can catalyze the action required to bring about this improvement, but they are not sufficient on their own. A strong management chain, from first level managers up through the CSO or CISO, is a critical component of a strong overall security posture.

Friday, May 2, 2014

Don't Forget to Dig

Detection is the first stage of the incident handling life cycle, and it is an important one. Detection allows us to identify and respond to issues, compromises, and breaches on our networks before they become big headlines. Given this, I’ve always been surprised at how much of the dialogue around detection is dominated by signature-based approaches to detection. When news of a new attack hits the airwaves, everyone rushes to grab the latest signatures to detect that attack. Of course, this is important and should be continued, but it is not sufficient. Signatures only find what we know about -- the known knowns, and they are reactive at best. What about a variant of the latest attack that might leave different footprints on the network than the signatures are designed to detect? What about another equally worrisome but yet unidentified attack that might be occurring as we speak? These are the unknown unknowns, and it is usually because of the unknown unknowns that organizations wind up in the press.

So how can we find the unknown unknowns? The answer is to dig. Digging involves performing analysis of the network traffic data. No one method or technique will meet the digging needs of an organization or find all intrusions. In my experience, it is most productive to try multiple different approaches and record the results of each one (e.g., in some sort of a knowledge base). Although there are many different approaches that can be tried, I have included some interesting examples here to help illustrate my point:
  • Mining for domains that resolved to “parked” IP addresses for a period of time, but then began resolving to “live” IP addresses
  • Aggregating over different combinations of fields (e.g., source IP, source port, destination IP, destination port, protocol, application protocol, domain, URL, byte size, etc.) and looking at outbound traffic to identify any patterns or “clustering/clumping” of activity
  • Hunting for malformed packets (e.g., byte size below minimum required by RFC)
  • Searching for a large number of DNS TXT records from the same system in a relatively short amount of time
  • Watching for domain names with excessively short TTLs or domain names that continually change the IP addresses they resolve to
This is by no means an exhaustive list of approaches that can be tried, but I believe it does serve to illustrate the concept of digging (one of the techniques above is sketched in code below). Results will vary with different approaches to digging, and it is important to track and document what was tried, and what resulted. This allows the team to continually build on and improve its digging. When an approach to digging is considered successful and mature, it can be automated, and the output/results from it can be sent to the alert queue. This allows successful digging techniques to integrate with the operational workflow just like successful signature-based detection techniques.
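As an example of what automating one of the digging techniques above might look like, here is a minimal sketch that flags systems issuing an unusually large number of DNS TXT queries in a short window, a pattern sometimes associated with tunneling. The log field names and the threshold are illustrative assumptions.

    from collections import defaultdict
    from datetime import timedelta

    WINDOW = timedelta(minutes=10)
    THRESHOLD = 50  # TXT queries per window considered worth a closer look

    def txt_bursts(dns_events):
        """dns_events: iterable of dicts with 'src_ip', 'qtype', and 'timestamp'."""
        txt = sorted((e for e in dns_events if e["qtype"] == "TXT"),
                     key=lambda e: e["timestamp"])
        by_src = defaultdict(list)
        for e in txt:
            by_src[e["src_ip"]].append(e["timestamp"])

        flagged = []
        for src, times in by_src.items():
            start = 0
            for end in range(len(times)):
                while times[end] - times[start] > WINDOW:
                    start += 1
                if end - start + 1 >= THRESHOLD:
                    flagged.append(src)
                    break
        return flagged

Once an approach like this proves itself, its output can feed the alert queue as described above.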

It will always be important to watch for the known knowns of course, but it is the unknown unknowns that really get us in trouble. I’m often surprised at how little digging is going on in many Security Operations Centers (SOCs). My hope is that this will soon change, and that more and more organizations will begin to dig.

Thursday, May 1, 2014

Drowning in Information, Starved for Knowledge

In his 1982 book, Megatrends, author John Naisbitt penned the famous quote, "We are drowning in information but starved for knowledge."

This quote is particularly relevant to the security operations field. Information, or data, comes at us faster than we can make sense of it. In a large enterprise, daily log volumes can quickly rise to 5, 10, 20 billion rows of data or more. We can gain access to 30 or 40 "intelligence" feeds in the blink of an eye. Threat reports are more plentiful than the eyes available to read them. Information sharing groups and mailing lists may deluge us with hundreds of emails per day. As we know, there is a big difference between information and knowledge. We have more than enough information. How can we turn that information into knowledge?

In my experience, most Security Operations Centers (SOCs) feel this pain continually. As security professionals, we have quantities of information at our fingertips that would have been unimaginable 10 or 20 years ago. Yet, with all that information, we struggle to answer simple questions such as, "What is happening on the network right now?". The data, feeds, reports, and emails come at us continuously, and most organizations understandably struggle to make sense of them. While there is no easy solution to this complex challenge, I can offer some guidelines and advice that I have found helpful in bringing order to the chaos:
  • Evaluate risk to the organization: Boiled down to its essence, security is about managing, reducing, and accepting risk, rather than eliminating it. Different organizations will be exposed to and concerned about different risks. Understanding what needs to be protected (e.g., sensitive information, intellectual property, money, reputation, etc.) is the first step towards understanding how to protect it. It is difficult to build a security operations program without a clear vision of what exactly that program is designed to protect.
  • Create human language goals and priorities: Before diving into any technology or creating any alert logic, it is helpful to write down, in human language sentences, what you would like to accomplish. This is similar to the programming practice of writing pseudo-code before writing any actual code. This accomplishes two important things. First, it helps the organization to organize its goals and priorities, which can subsequently be used to frame and execute a work plan. Second, it serves to document those goals and priorities. Documentation (at all levels) is seldom fun, but it is an extremely important activity for a number of reasons.
  • Identify appropriate data sources: Once goals and priorities are identified, the appropriate data sources can be identified to meet those goals and priorities. The relevant data sources can include network traffic data, various types of logs (network, end-point, and malware), intelligence feeds, threat reports, and other sources. It is important to consider all data sources relevant to a particular goal or priority and to explicitly identify and document the relevant sources alongside it.
  • Identify appropriate technologies: Different technologies suit different needs. For example, for certain goals and priorities, a SIEM or data warehouse may be the appropriate tool. For others, a network forensics platform might be the right fit. Or, for yet others, something different entirely.
  • Throw out the default rule set: This may sound radical, but for each technology, throw out the default or standard rule set. Why? Because it wasn’t written specifically for your organization. Are there specific signature sets, rules, logic, alerts, etc. that meet the needs of your organization? Absolutely, and those should be retained selectively. It’s important to remember, though, that many elements of the system’s default set won’t meet the needs of your organization. Keeping them in there will create noise and false positives that won’t help you accomplish your goals and priorities.
  • Write targeted alerting: Alerts are a powerful force in security operations and should be leveraged accordingly. Write alerting designed to identify suspicious, malicious, or anomalous activity as defined by your goals and priorities. Don’t bother writing any alerts that don’t fit your goals and priorities. Why? Because that will just produce additional noise and false positives that you’ve already decided you aren’t interested in.
  • Streamline the workflow: Regardless of how and where alerts are generated, they should all flow to one unified work queue. Priority should be used to assist the team in identifying what to work on first, second, etc. Highest priority goes to the highest fidelity, most reliable alerts covering the most critical assets. Lowest priority goes to the lowest fidelity, least reliable alerts covering the least critical assets. The most important takeaway here is that one work queue allows your team to focus and provides them jumping off points into analysis, forensics, and investigation (a minimal sketch of such a queue follows this list). More than one work queue leads to complication and confusion.
  • Practice Continuous Security Monitoring (CSM): Once you set up alerting and a work queue, use it. Every alert should be reviewed by the team and investigated appropriately to build context around it and understand what occurred. Some alerts will require more investigation, while others will require less investigation. There is really no point in setting up alerts that you never intend to look at. That’s really the point of CSM -- reviewing each alert and performing analysis, forensics, and investigation as required.
  • Follow a mature process: Process is the glue that binds people and technology together. If you have great people and have set up great technology (perhaps leveraging the suggestions provided in this blog), a great process is also required. Process helps the team focus on what tasks are value-added and converge to a conclusion. After all, analysis and forensics are not done for analysis’ and forensics’ sake, but rather, to reach a conclusion.
  • Leverage automation where appropriate: Once a mature process is in place, study the application of that process operationally. If time-consuming, manual labor exists for certain aspects of the process, consider automation. If time can be saved in one place, it means that more time can be spent elsewhere. This leads to greater overall visibility on the network.
  • Maintain a communal presence: We can learn a lot from our peers. Maintaining a presence in the broader security operations community ensures that an organization is in the loop regarding current events, topics, discussions, and issues. All of these factors play a role in ensuring that the security operations program changes with the times, and that information is shared in a timely and relevant manner. Just remember -- this relationship involves both giving and receiving.
  • Continuously improve: Never believe that the work has been completed. Technologies, methodologies, and the threat landscape change continuously and quickly. It is important to continually seek feedback and improve. The steps described here are iterative and should continually be stepped through in a cyclical fashion.
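To illustrate the single prioritized work queue mentioned above, here is a minimal sketch in which every alert, regardless of which technology produced it, lands in one queue ordered by fidelity and asset criticality. The scoring scheme is an illustrative assumption.

    import heapq

    class WorkQueue:
        def __init__(self):
            self._heap = []
            self._counter = 0  # tie-breaker keeps insertion order stable

        def add(self, alert, fidelity, asset_criticality):
            """fidelity and asset_criticality: 1 (low) through 5 (high)."""
            priority = -(fidelity * asset_criticality)  # highest score pops first
            heapq.heappush(self._heap, (priority, self._counter, alert))
            self._counter += 1

        def next_alert(self):
            """Return the next alert the team should work, or None if the queue is empty."""
            if not self._heap:
                return None
            return heapq.heappop(self._heap)[2]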
There is no one size fits all solution to security operations. That being said, I have found the above guidelines to be quite helpful when working with enterprises to improve the state of their security operations programs. My hope is that individuals who find this advice, and other advice on this blog, helpful will share it with their peers, colleagues, and friends. Contact me on Twitter (@ananalytical) or drop me a comment here to let me know what you think. Your input is important to me as well.