Monday, August 24, 2009

SIEM 101: an introduction to SIEM functionality

Just in time for "Back to School," Decurity presents "SIEM 101": an introduction to SIEM functionality. What is SIEM correlation? What does it deliver? What is its value to a business or organization? What are aggregation, normalization, and prioritization, and how do they differ from or enable correlation scenarios?

Every SIEM vendor seems to have a different definition and marketing spiel about the functionality of SIEM "correlation". Sometimes correlation is described in a manner that evokes thoughts of a magic trick; other times it is simply labeled "too confusing" and therefore not relevant. Obviously, this causes confusion and inconsistent expectations, or should I say anticipation, of the results that correlation will (or won't) deliver. The prospective customer ends up with a skewed perspective and, in all likelihood, dissatisfaction. On the other hand, it may also leave the customer unaware of the full extent of the power the solution makes available to them. Neither situation benefits anyone involved. The purpose of this post is to describe common SIEM functionality so that current and prospective users of SIEM can effectively compare the capabilities of different vendors purporting to support or deliver "correlation".

Some Basic SIEM Terminology: Let's start by outlining some basic terminology and functionality included in most SIEM solutions to provide context. After that, we can dive deeper into what correlation is and its related functionality.

Collection: Collection refers to the process of obtaining logged information from various event sources. The "battle" of agent versus agentless is meaningless and should be ignored as marketing fluff. Network architecture, network speed/latency, event source platforms, security, compliance, and other environment variables all drive the decision of where the best place is to locate an agent/collector. Simply put, your use cases and environment drive your deployment architecture decisions.

Event Sources: These are the devices/systems that generate events for consideration. Including the "right" event sources, logging in the "right" way, is absolutely critical to the success of your SIEM. The SIEM can't consider information that does not exist or is not contextually relevant to other information in the system. I'll spend more time on this topic in an upcoming "SIEM 201" blog post.

Normalization: This is the process, at either the collector (agent) or the SIEM engine, that makes sense of the event data being input into the system. The normalization process maps the different log event data formats into a common structure, taxonomy, or in some cases indices, so that common fields like names, activity types, timestamps, and IP addresses can be quickly compared using a simple taxonomy. Usually this means the data is more accessible and more efficiently stored by the SIEM solution. Each vendor performs this process differently in the background, and the level of functionality, intelligence, and capability associated with the process varies: some do it well, some don't. Some vendor solutions don't index/normalize on input into the system; they accomplish this task when the user requests output from the system.
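To make this concrete, here's a minimal sketch in Python of what a normalization step might look like. The log formats, regular expressions, and field names are illustrative assumptions on my part, not any particular vendor's taxonomy:

```python
import re

# Two illustrative raw log lines from hypothetical event sources.
SSH_LOG = "Aug 24 10:15:02 host1 sshd[2211]: Failed password for root from 10.1.2.3 port 4242 ssh2"
FW_LOG = "2009-08-24T10:15:03Z DENY src=10.1.2.3 dst=192.168.1.10 proto=tcp dport=22"

def normalize_ssh(line):
    """Map an sshd-style line into an assumed common taxonomy."""
    m = re.search(r"Failed password for (\S+) from (\S+)", line)
    if m is None:
        return None
    return {"event_type": "auth_failure", "user": m.group(1),
            "src_ip": m.group(2), "source": "sshd"}

def normalize_fw(line):
    """Map a firewall-style line into the same taxonomy."""
    m = re.search(r"(\w+) src=(\S+) dst=(\S+) proto=(\w+) dport=(\d+)", line)
    if m is None:
        return None
    return {"event_type": "firewall_" + m.group(1).lower(),
            "src_ip": m.group(2), "dst_ip": m.group(3),
            "proto": m.group(4), "dst_port": int(m.group(5)),
            "source": "firewall"}

# Both records now share field names, so src_ip can be compared directly.
print(normalize_ssh(SSH_LOG))
print(normalize_fw(FW_LOG))
```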

Aggregation: This process summarizes (counts) event data based on (hopefully) flexible pre-defined fields. The purpose of this process is to reduce the event data load, whether in terms of network traffic, data storage, and/or SIEM engine efficiency.

A typical example of this process occurs when the following situation is detected:
1. "N" number of events
2. That contain the same event characteristics
3. For a given timeframe

In this situation the aggregation process could send one event record with a count inside it, instead of sending all of the individual event records. A flexible SIEM solution should allow you to decide which fields are leveraged in the aggregation process, specify the event field characteristics that must match, and choose what information is included in the summarized event record. The downside to aggregation, if it is incorrectly configured or designed, is the loss of important information (i.e., it could cause more aggravation than aggregation).
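Here's a minimal sketch of field-based aggregation. The choice of key fields and the record layout are assumptions for illustration only:

```python
from collections import Counter

# A handful of normalized events (field names are illustrative).
events = [
    {"event_type": "auth_failure", "src_ip": "10.1.2.3", "user": "root"},
    {"event_type": "auth_failure", "src_ip": "10.1.2.3", "user": "root"},
    {"event_type": "auth_failure", "src_ip": "10.1.2.3", "user": "root"},
    {"event_type": "firewall_deny", "src_ip": "10.9.9.9", "user": None},
]

# Fields that must match for events to be summarized together; a flexible
# SIEM would let you choose these per aggregation rule.
KEY_FIELDS = ("event_type", "src_ip", "user")

counts = Counter(tuple(e.get(f) for f in KEY_FIELDS) for e in events)

# Emit one summarized record per unique key, with a count, instead of
# forwarding every individual event record.
for key, count in counts.items():
    summary = dict(zip(KEY_FIELDS, key))
    summary["count"] = count
    print(summary)
```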

Thresholding: Some consider thresholding to be correlation. I consider thresholding to be aggregation with alerting: "N" events occurred in a sliding time window, then let someone know. A popular example is "number of failed logins over a fixed period of time".
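A minimal sliding-window sketch, with an assumed threshold of 5 events in 60 seconds (pick your own numbers):

```python
from collections import deque

class SlidingWindowThreshold:
    """Alert when N matching events occur within a sliding time window.
    The threshold and window size here are arbitrary illustrative values."""

    def __init__(self, n=5, window_seconds=60):
        self.n = n
        self.window = window_seconds
        self.timestamps = deque()

    def observe(self, ts):
        self.timestamps.append(ts)
        # Drop events that have slid out of the window.
        while self.timestamps and ts - self.timestamps[0] > self.window:
            self.timestamps.popleft()
        return len(self.timestamps) >= self.n

detector = SlidingWindowThreshold(n=5, window_seconds=60)
# Five failed logins in 40 seconds trips the alert on the fifth event.
for ts in [0, 10, 20, 30, 40]:
    if detector.observe(ts):
        print(f"ALERT: threshold reached at t={ts}s")
```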

Filtering: This is the ability to ignore, suppress or block certain event records or messages from being processed or displayed. Some consideration is required if you decide to start suppressing messages or event records. It can be a great way to reduce “noise”, but it is also a very good way to lose very important context from “previously unknown” activities.

Intelligent Filtering is the process by which you forward events from a log management device to a SIEM on a per-use-case basis. This ensures the full data set remains fully searchable and easily available within the overall solution, without overloading the SIEM. It keeps costs down, increases efficiency, and enhances the value of the solution.
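A tiny sketch of per-use-case forwarding; the use case names and match conditions are made up for illustration:

```python
# Illustrative per-use-case forwarding rules: everything lands in the log
# management tier, but only events matching a defined use case move on to
# the SIEM.
USE_CASE_FILTERS = {
    "brute_force": lambda e: e.get("event_type") == "auth_failure",
    "perimeter_probes": lambda e: e.get("event_type") == "firewall_deny",
}

def forward_to_siem(event):
    """Forward only if the event matches at least one use case."""
    return any(match(event) for match in USE_CASE_FILTERS.values())

events = [
    {"event_type": "auth_failure", "src_ip": "10.1.2.3"},
    {"event_type": "dhcp_lease", "src_ip": "10.1.2.4"},  # stays in log management
]
for e in events:
    dest = "SIEM" if forward_to_siem(e) else "log management only"
    print(e["event_type"], "->", dest)
```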

Simple Prioritization: This is the process of mapping the message priority assigned by a particular event source vendor to the SIEM's message priority.
For example, IDS vendor "X" assigns an event a priority of "1a". The mapping process takes this value, translates it into the SIEM vendor's priority field, and assigns a value of "10", indicating that the priority is "High/Critical". All similar events will always have the same priority. This mapping is typically performed at the agent/collector, but can also be accomplished at the engine, depending on the vendor.
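A sketch of that mapping; the priority codes and the 1-10 scale are assumed for illustration, not any real vendor's schema:

```python
# Illustrative mapping from a hypothetical IDS vendor's priority codes
# to an assumed SIEM priority scale of 1-10.
VENDOR_X_PRIORITY_MAP = {
    "1a": 10,  # High/Critical
    "1b": 8,
    "2a": 5,
    "3a": 2,   # Informational
}

def map_priority(event):
    """Translate the vendor priority into the SIEM's priority field."""
    event["siem_priority"] = VENDOR_X_PRIORITY_MAP.get(event["vendor_priority"], 5)
    return event

print(map_priority({"signature": "SQL injection attempt", "vendor_priority": "1a"}))
```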

Advanced Prioritization: This is similar to simple prioritization, with the addition of context from the environment or from how the SIEM has been configured. It offers a more dynamic prioritization model for similar types of events. An example is a priority schema that takes into account current vulnerability information for a targeted asset. If the target has a relevant vulnerability and a corresponding IDS event is received, the priority of the alert is raised (it is relevant). On the other hand, if the vulnerability (or system) does not exist, the priority is reduced to "Informational" for that particular event. This functionality is typically performed at the SIEM engine. It is one way to highlight known-bad activity and help prioritize workflow. Some might consider advanced prioritization a very basic form of correlation.
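A sketch of vulnerability-aware prioritization; the asset table, CVE identifier, and priority values are all illustrative assumptions:

```python
# Hypothetical asset/vulnerability context; a real SIEM would populate
# this from a vulnerability scanner feed.
VULN_CONTEXT = {
    "192.168.1.10": {"CVE-2009-0001"},  # asset with a known vulnerability
}

def contextual_priority(event):
    """Raise priority if the target is actually vulnerable to the exploit."""
    asset_vulns = VULN_CONTEXT.get(event["dst_ip"], set())
    if event["exploits_cve"] in asset_vulns:
        event["siem_priority"] = 10   # target is vulnerable: relevant, raise it
    else:
        event["siem_priority"] = 1    # not relevant: informational
    return event

ids_event = {"signature": "CVE-2009-0001 exploit attempt",
             "dst_ip": "192.168.1.10", "exploits_cve": "CVE-2009-0001"}
print(contextual_priority(ids_event))
```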


OK, with that in mind, what is correlation?

As I see it, correlation includes the evaluation of collected data using one or more of the following methods:

(1) Pre-defined pattern matching
(2) Statistical analysis (anomaly detection)
(3) Basic conditional Boolean logic statements
(4) Contextually relevant and/or enhanced data set + Boolean logic

Correlation output: The goal of event correlation is to produce a meaningful "event of interest", intended either to feed other correlation criteria or to influence and/or directly enable workflow by creating actionable output (potential incident identification).

Meaning either:
(1) You have a higher degree of confidence that something bad has happened, or
(2) You now know something that you did not, or could not, know previously.



Additional functions used within Correlation:


Comparison List/Capability: IPs, subnets, ASNs, domain names, file names, MAC addresses, user names, event IDs, custom attributes, etc. Being able to dynamically update and/or query these lists, with or without Boolean logic, allows your correlation scenarios to include "fresh" information all the time. Linking lists allows even more flexibility in prioritizing events: entries can move between lists based on thresholding or other learned context, from suspicious to malicious or from malicious to normal, depending on how your correlation scenarios are defined. Decurity's Threat Intelligence offering keeps these lists current for you!
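A sketch of linked lists in action; the list names, threshold, and promotion logic are illustrative:

```python
# Illustrative dynamic watchlists; in a real deployment these would be fed
# by threat intelligence or by other correlation rules.
suspicious_ips = {"10.1.2.3"}
malicious_ips = set()

def observe(event, failure_count):
    """Promote an IP from suspicious to malicious once a threshold trips."""
    ip = event["src_ip"]
    if ip in suspicious_ips and failure_count >= 10:
        suspicious_ips.discard(ip)
        malicious_ips.add(ip)

observe({"src_ip": "10.1.2.3"}, failure_count=12)
print("malicious list:", malicious_ips)
```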

SIEM Boolean Logic: True/False evaluation and the use of IF, THEN, AND/OR, and NOT operators. This is the process whereby you articulate your logic statements. More on this in the "201" blog post coming soon.
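As a preview, here's a sketch of a correlation rule expressed as Boolean conditions over two normalized events; the field names, count threshold, and the rule itself are assumptions:

```python
# IF many failures from one source AND a later success for the same user
# from that source, THEN flag an event of interest.
def brute_force_then_success(e1, e2):
    return (e1["event_type"] == "auth_failure_summary"
            and e1["count"] >= 10
            and e2["event_type"] == "auth_success"
            and e2["src_ip"] == e1["src_ip"]
            and e2["user"] == e1["user"])

failures = {"event_type": "auth_failure_summary", "count": 12,
            "src_ip": "10.1.2.3", "user": "root"}
success = {"event_type": "auth_success", "src_ip": "10.1.2.3", "user": "root"}

if brute_force_then_success(failures, success):
    print("Event of interest: possible successful brute force")
```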

Statistical Evaluation: In my mind this is by far the most underutilized component of some SIEM solutions. Anomaly detection, thresholding, and even comparison can be accomplished in a very scalable and, in most cases, low-overhead manner using the correct set of statistical evaluations. The output of these evaluations can also become "events" for comparison in advanced correlation scenarios. Expert usage only.
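A sketch of one simple statistical evaluation, a z-score against a baseline; the baseline data and the cutoff of 3 are illustrative, not vendor defaults:

```python
import statistics

# Hypothetical hourly event counts for one event source.
baseline = [100, 95, 110, 105, 98, 102, 97, 108]
observed = 240

mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)
z_score = (observed - mean) / stdev

# A z-score cutoff of 3 is a common rule of thumb.
if z_score > 3:
    print(f"Anomaly: observed={observed}, z-score={z_score:.1f}")
```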

Contextual Comparison: Vulnerability information, system (computer or network node) information, application information, user information, or other categorized attributes describing how the network, systems, users, applications, or data are used and/or organized. The more context added to each correlation scenario, the more refined (and meaningful) the output will be. In most cases, if accomplished correctly, it also means the most efficient use of system resources. A simple example could be defining assets by PCI or PII relevance.

Meta Correlation: Using SIEM-enhanced data from previously or currently correlated events to form new correlation scenarios. This can also use the output of statistical evaluations. The meta-correlation can be between previously correlated events and new event stream data, or among multiple previously correlated events. This is also how many systems handle basic scalability or higher-tier deployment scenarios: a baseline of content is deployed at lower tiers, and matching events are forwarded upward for inclusion in "enterprise-wide" correlation scenarios.
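A sketch of meta-correlation, where the outputs of two lower-tier rules become inputs to an enterprise-wide rule; the rule names and the "two distinct rules" condition are assumptions:

```python
# Outputs of earlier correlation rules, forwarded upward from lower tiers.
correlated_events = [
    {"rule": "brute_force_success", "src_ip": "10.1.2.3", "tier": "site-A"},
    {"rule": "data_exfil_volume",   "src_ip": "10.1.2.3", "tier": "site-B"},
]

def meta_correlate(events):
    """Flag sources that trigger two or more distinct rules across tiers."""
    by_ip = {}
    for e in events:
        by_ip.setdefault(e["src_ip"], set()).add(e["rule"])
    return [ip for ip, rules in by_ip.items() if len(rules) >= 2]

print("enterprise-wide event of interest for:", meta_correlate(correlated_events))
```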


Summary:

Correlation is a very powerful SIEM function that can help you refine the identification of anomalous or malicious activity. If you (the customer) can articulate your use cases clearly, then most vendors can find a way to solve the defined problem using existing functionality within their product set. It is my hope that you will be able to use this blog post to map the various solution offerings to a common and understandable taxonomy, so you can fully comprehend what you are getting with each solution.

In the next post in this "Back to School" series (SIEM Correlation 201) we’ll talk about Use Case Definitions, Event Sources, Performance Impact, Flexibility and Scalability.

"ring, ring" class dismissed until next week.

-Rocky
