Leveraging Big Data in an Information Security Program

The emergence of “Big Data” as an IT buzzword, along with confusion over the exact meaning of the term, has left some security teams scrambling to get a grasp on something that might not even exist within their organization. So what is Big Data exactly, and how can it help strengthen an Information Security program?

Defining Big Data isn’t as easy as one would think. Vendors and clients come to the table with different definitions of the term, many of which hinge on their roles within a particular organization and the products they’re trying to develop and sell.

For some, Big Data can refer to the tools used to manage massive amounts of information. The term could also be used to describe the incoming flood of data from multiple devices, or problems related to data storage. According to one expert, when it comes to security, Big Data should focus on making sense of data sprawl and managing data as a whole in order to support the business.

“It’s about being able to capture, store and make sense of that data in a way [that supports] the different parts of the business that can take advantage of it,” said Paul Stamp, Director of Product Marketing for RSA, EMC’s security division. “This means having an infrastructure that can do the capture, storage and processing of that data, tools that can analyze and visualize that data in a way that supports decision making processes, plus intelligence to make that data relevant to the business.”

Does this mean that existing tools can be repurposed, allowing the organization to better leverage Big Data in a security program? Stamp doesn’t think so.

“I’m not sure this is about repurposing existing tools. Sure, if you have a full-blown B.I. [business intelligence] implementation in place there are tools you can leverage within that suite,” he explained, “but most organizations don’t have that sort of function in place. As for most existing security tools, forget about it—most tools lack the scalability or the analytic capability to call themselves anything near a true Big Data technology.”

Leveraging Big Data in an organization’s security program comes down to a few essentials, starting with infrastructure capable of capturing viable, time-sensitive information on both the infrastructure’s overall state and its internal machinations. “This means log data, as well as deep visibility into network activity, plus identity and asset data to tie that activity back to the most interesting people and systems,” Stamp said.

Part of this process involves identifying the data available for collection. As mentioned above, a big part of Big Data is dealing with data sprawl—so knowing where the data lives, who or what has access to it, and its life expectancy is critical.

Life Expectancy and Agility

Life expectancy may seem like an odd thing to consider, but data has a shelf life. A six-month-old report on a port scan may help with post-attack forensics, but does nothing to help with an attack in progress. If Big Data is going to be used as part of a security program, the information has to be available in real-time, or as close to it as possible.
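The shelf-life idea can be sketched in a few lines of Python. This is only an illustration, not any vendor's method: the event kinds, the `SHELF_LIFE` values, and the function names are all assumptions chosen for the example. It separates events still fresh enough to inform live response from those useful only for forensics.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SecurityEvent:
    source: str            # where the event was collected (hypothetical label)
    kind: str              # e.g. "port_scan"
    observed_at: datetime  # when the activity was seen


# Assumed shelf lives per event kind; real values depend on the organization.
SHELF_LIFE = {
    "port_scan": timedelta(hours=1),
    "auth_failure": timedelta(minutes=15),
}
DEFAULT_SHELF_LIFE = timedelta(hours=24)


def is_actionable(event: SecurityEvent, now: datetime) -> bool:
    """True if the event is recent enough to help with an attack in progress."""
    shelf_life = SHELF_LIFE.get(event.kind, DEFAULT_SHELF_LIFE)
    return now - event.observed_at <= shelf_life


def split_by_freshness(events, now):
    """Return (live, archive): live feeds response, archive feeds forensics."""
    live, archive = [], []
    for event in events:
        (live if is_actionable(event, now) else archive).append(event)
    return live, archive


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
events = [
    SecurityEvent("ids", "port_scan", now - timedelta(days=180)),
    SecurityEvent("ids", "port_scan", now - timedelta(minutes=10)),
]
live, archive = split_by_freshness(events, now)
```

Here the six-month-old scan lands in `archive`, while the ten-minute-old one stays in `live`.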

In addition, organizations will need tools that can analyze and visualize data that supports security functions, such as issue prioritization for incident response. From there, the prioritized data can be broken down into elements such as community-based OSINT (Open Source Intelligence), malware analysis, and more. This is where various logs come into play. The tools should allow security teams to monitor the health of the network, the flow of data in and out of the network, access controls, DLP, etc.
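As a rough sketch of what issue prioritization for incident response might look like, the Python below ranks alerts by combining alert severity with the importance of the affected asset. The scoring formula, the asset names, and the criticality values are assumptions made for illustration; they are not drawn from any product mentioned in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    name: str
    severity: int  # 1 (low) to 5 (critical), assumed scale
    asset: str     # system the alert relates to


# Hypothetical asset inventory: higher numbers mean more business impact.
ASSET_CRITICALITY = {
    "domain-controller": 5,
    "hr-database": 4,
    "guest-wifi": 1,
}
DEFAULT_CRITICALITY = 2  # used for assets missing from the inventory


def priority(alert: Alert) -> int:
    """Score an alert as severity multiplied by asset criticality."""
    return alert.severity * ASSET_CRITICALITY.get(alert.asset, DEFAULT_CRITICALITY)


def triage(alerts):
    """Return alerts ordered from highest to lowest priority."""
    return sorted(alerts, key=priority, reverse=True)


alerts = [
    Alert("malware beacon", 4, "guest-wifi"),
    Alert("failed admin logins", 3, "domain-controller"),
    Alert("DLP policy hit", 2, "hr-database"),
]
queue = triage(alerts)
```

With these sample values, the lower-severity alert on the domain controller outranks the higher-severity one on guest Wi-Fi, which is the point of tying activity back to identity and asset data.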

Another important aspect is agility. A security program that leverages Big Data needs to use all of the information available to the organization while maintaining flexibility. It should grow with the organization, using the new data streams created as new business initiatives come online.

Keep in mind, however, that these new initiatives and their data streams also widen the organization’s attack surface. The security program should leverage available OSINT on the risks faced by the initiative.

Security Policy

What about policy? Should a security program leveraging Big Data overlap with existing policy?

“So using Big Data techniques to aid security teams is one thing,” Stamp said. “Protecting Big Data is another—and any project that involves creating new instances of data, plus some serious data aggregation and data inference needs to be handled carefully.

“How that data gets protected at rest, in motion and in use is critical,” he added, “as is who gets to access that data and its analytical outputs. Existing policies need to be taken into account, but it’s pretty likely that existing policies aren’t going to explicitly cover many of the situations you’ll be creating.”

Policy isn’t within the scope of this article, and policy creation, monitoring and enforcement is a tricky beast in its own right. However, when it comes to tools for implementing a Big Data-centric security program, there are plenty of options, including RSA’s NetWitness.

RSA’s NetWitness has a range of tools that can help an organization get started with leveraging Big Data within a security program. The one that most organizations look at first is the NetWitness “NextGen” platform. NextGen is the first stop because it combines RSA and EMC technologies, which allows an organization to monitor and assess the network at multiple levels at the same time.

NetWitness’ power comes from the ability to manage and analyze the data collected. This is important; in a report on Big Data issued last summer, research firm Gartner wrote: “Too much volume is a storage issue, but too much data is also a massive analysis issue.” In other words, managers must rethink their approach to data in the context of all the dimensions of information management.

Another vendor worth examining is McAfee. Almost everything that company offers to the enterprise ties into ePolicy Orchestrator (ePO), including their SIEM suite of tools. Like NetWitness, McAfee can offer a multi-level view into all of the data streams that exist on a given network at the same time.

There are other vendors worth a look as well, including LogRhythm for their SIEM offerings and Q1 Labs for their QRadar offering.

In all fairness, it must be noted that, depending on the size of a given organization’s environment and its unique needs, developing a security program that leverages Big Data can be a costly, time-consuming, and daunting task. Be skeptical of any vendor that claims otherwise, or promises a fully working solution out of the box using default settings with little to no tuning.

But if implemented correctly and used as designed, there is serious value in leveraging Big Data within a security program.

