Cybersecurity Professionals Need to be Data Scientists Too


Cybersecurity practitioners don’t necessarily think of themselves as Data Scientists – but they should. In fact, most of them are already doing that work, even if they don’t recognize that the term applies to them.

Securing the network and protecting the enterprise means that information security pros do everything from administering and securing servers and workstations to implementing routers and firewalls. But they’re also looking at data from these sources that may indicate threats, in order to meet their mandate of shielding the business from security breaches. They can apply modeling methods that may take advantage of Machine Learning and Artificial Intelligence for analytics to gain business value from that data.

“The value they bring is enterprise security,” says John Omernik, distinguished technologist at Big Data platform vendor MapR. Being able to use Data Science to keep personal data safe, to protect the company’s intellectual property, or to thwart other attacks is no different from the work that personnel with “Data Scientist” in their title do in support of marketing ROI or improved customer service.

Omernik has himself made the journey to becoming a Data Scientist in the security field. At previous roles in security at US banks and Managed Security Services providers, he saw that,

“We could be better information security practitioners if we could secure and work with the Big Data platform that MapR provided rather than traditional security tools that focus on doing a specific thing such as security information and event management solutions (SIEM).”

Using MapR technology to apply Data Science to the cybersecurity realm prevented problems, including DDoS attacks, at his previous companies. (Insights were generated that helped identify hacked computers before log-in.) Omernik realized, “I could do this with data!”

Now he’s interested in generating more conversation on the topic of Data Science among information security pros. He wants to encourage them to make the same jump, not only to stay relevant in their careers but also to grow into exceptional leaders.

Today, he believes, information security pros tend to see Data Science as affiliated with a completely different group in the organization. But as organizations shift to the cloud, in whole or in part, information security pros will be charged with securing enterprise data stored there as well as on-site, and “we will have to apply our knowledge on a broader scope than just securing one network in one data center,” says Omernik. “So, Data Science and the skills that go into understanding data well are integral to what information security practitioners must do on a day-to-day basis” across multiple realms if they’re to be successful at maintaining confidentiality, integrity, and availability around enterprise data and systems.

Building the Infosec-Data Science Connection

One potential reason for hesitancy about joining the worlds of information security and Data Science could be that Big Data platforms in use at some companies may not provide the agility in working with data that information security professionals need, given that they are working in environments where threats are always changing and adapting to get around controls.

“Oftentimes, these Big Data platforms that are implemented for a different part of the organization, like marketing, may not be nimble enough for information security teams to secure the network,” Omernik says.

How to address this? Omernik thinks the answer lies in creating more nimble environments by adopting Big Data platforms that let security teams work with containers, like Docker, for running data-intensive workloads, and that enable them to deploy and test code faster and in a more automated way. “In the context of information security on a Big Data platform, those things will draw practitioners in,” he says, saving a minimum of two to three weeks that would otherwise be spent deploying specialty code to a 50-node server cluster.

“If it’s too slow, they will just disregard it and try to use other tools that they think move things faster, but that aren’t as holistic,” he says. “A nimble data platform with [container orchestration] tools like Kubernetes can help them make the change.”

Another obstacle to information security pros making use of Data Science is a habit of retaining only limited data. Sometimes this is the result of security tools or appliances whose licensing models and methodologies demand upgrading to the next licensing tier to store more data and keep performance levels high. That increased cost puts pressure on information security departments working with limited budgets, so they prune back data that could be analyzed to help deal with enterprise threats of many kinds in the future.

That situation can be fixed with a Big Data platform that doesn’t limit the amount of data that can be stored in it; data that may be useful for security purposes can be moved to that environment. “If the platform allows that, then it does you a favor because it helps your ability to see threats on the network,” says Omernik.

He notes that while working at one organization in the past, his security team’s storage of web data included full details of every web request coming in from a client, which is not normally retained by security logging solutions. Those tend to remove things like raw headers, instead taking the approach of figuring out the most important header values and storing those for ease and efficiency.

“It’s not very feasible to keep a whole web request in a data-constrained environment,” Omernik says. “We did it because the MapR Big Data platform we used didn’t punish us for storing extra data from a cost perspective.”
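To make the difference concrete, here is a minimal Python sketch of the two retention approaches described above: a summary record that keeps only a few header values, and a full record that keeps everything. The sample request, the choice of "important" headers, and all function names are hypothetical and invented for illustration; this is not how MapR or any particular logging product stores requests.

```python
# Hypothetical raw HTTP request, invented for illustration only.
RAW_REQUEST = (
    "GET /account/login HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "User-Agent: ExampleBrowser/1.0\r\n"
    "Accept-Language: en-US\r\n"
    "X-Forwarded-For: 203.0.113.7\r\n"
    "Cookie: session=abc123\r\n"
    "\r\n"
)

# Assumption: the handful of headers a storage-constrained logger might keep.
SUMMARY_HEADERS = ("Host", "User-Agent")


def parse_request(raw):
    """Split a raw request into method, path, and a dict of headers."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    method, path, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, path, headers


def summary_record(raw):
    """Keep only the pre-selected headers; everything else is discarded."""
    method, path, headers = parse_request(raw)
    kept = {h: headers[h] for h in SUMMARY_HEADERS if h in headers}
    return {"method": method, "path": path, "headers": kept}


def full_record(raw):
    """Keep every parsed header plus the untouched raw request text."""
    method, path, headers = parse_request(raw)
    return {"method": method, "path": path, "headers": headers, "raw": raw}


if __name__ == "__main__":
    print(summary_record(RAW_REQUEST)["headers"])
    print(full_record(RAW_REQUEST)["headers"])
```

In the summary record, fields such as `X-Forwarded-For` and `Cookie` are gone for good. In the full record they remain available for questions nobody had thought to ask when the data was written.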

That proved fortunate when the company was involved in a civil suit at the same time that, coincidentally, someone was anonymously blackmailing the organization.

Omernik was able to analyze “the full data stored in email and web logs to tie the anonymous email coming into our network from Yahoo to a person’s activity within our web site, where it was no longer anonymous,” thus showing the individual’s tie to the civil suit. With that information in hand, the company’s legal team got a good portion of the case thrown out of court, saving hundreds of thousands of dollars.
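The article does not say how the email and web records were matched. One plausible approach, sketched below in Python, is to join the two logs on a shared client IP address within a time window. That join key, the window, the field names, and every record shown are assumptions invented for illustration, not details from Omernik’s investigation.

```python
from datetime import datetime, timedelta

# Hypothetical log records; all values are invented for illustration.
EMAIL_LOG = [
    {"received": datetime(2020, 1, 6, 14, 3),
     "sender": "anon@example.com",
     "origin_ip": "203.0.113.7"},
]

WEB_LOG = [
    {"time": datetime(2020, 1, 6, 13, 58), "client_ip": "203.0.113.7",
     "account": "user-1042", "path": "/account/login"},
    {"time": datetime(2020, 1, 6, 14, 1), "client_ip": "198.51.100.9",
     "account": "user-2210", "path": "/products"},
    {"time": datetime(2020, 1, 8, 9, 0), "client_ip": "203.0.113.7",
     "account": "user-3001", "path": "/"},
]


def correlate(emails, requests, window=timedelta(hours=1)):
    """Pair each email with web requests from the same IP near the same time.

    Assumption: a shared IP within `window` is treated as a lead worth
    reviewing, not as proof of identity.
    """
    # Index web requests by client IP so each email needs a single lookup.
    by_ip = {}
    for req in requests:
        by_ip.setdefault(req["client_ip"], []).append(req)

    matches = []
    for mail in emails:
        for req in by_ip.get(mail["origin_ip"], []):
            if abs(req["time"] - mail["received"]) <= window:
                matches.append((mail["sender"], req["account"], req["path"]))
    return matches


if __name__ == "__main__":
    for sender, account, path in correlate(EMAIL_LOG, WEB_LOG):
        print(sender, "->", account, path)
```

A join like this only works if the originating address and the full request details were retained in the first place, which is the point of the retention argument above.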

“We didn’t know that data was going to be valuable when we stored it” – and it’s possible that there wouldn’t have been the capability to store it had there been no way to get around the problem of high licensing costs. With Big Data platforms in place that don’t restrict data storage, “maybe information security practitioners will be able to explore and analyze data and, from that, come up with new ideas about how to protect the network and the business,” he says.

Bring Other Data Scientists Around to Your Big Data Platform Choice

If information security practitioners come around to the idea that Data Science can drive data security and want to choose a Big Data platform different from what other departments are using, it’s a good idea to be ready to make a case about why the solution they want to use is a better one, Omernik says.

“If they can’t demonstrate how the Big Data platform helps them to secure the network, it’s hard to convince other Data Scientists to rely on the same platform for their Big Data security needs,” he says. In the past, Omernik used information security groups’ implementation of MapR to demonstrate why it should be the solution that all the Data Scientists in the business rely on. That included the fact that it has security baked in.

“I wound up with Data Scientists from other parts of the business knocking down my door to use it,” he says. “Using it made security just a standard part of how we all operated, versus adding on a layer or putting something else in place that users would have to work around.”

Ultimately, Data Scientists in other areas may have different skills or practice their craft in a different way than information security pros. “But it doesn’t mean that you don’t get to do Data Science because you are an information security practitioner,” Omernik says.


Photo Credit: Sergey Nivens/
