Semantic Technology is Ready to Power Next-Generation Cyber Security

Cyber security is always a hot topic, but every time another Big Data exposure makes headlines, it becomes even hotter. The leak of some 11 million records held in a database operated by offshore law firm Mossack Fonseca, which led to the Panama Papers investigation, is just the latest example of a massive breach generating massive attention.

At least in this case, the data leak can serve a social good: It brought to light secret offshore tax schemes that lawbreakers, drug masterminds, and the merely rich and powerful have been able to exploit for money laundering, tax evasion or otherwise dubious dealings. To be fair, not all Mossack Fonseca clients were involved in creating offshore shell companies for improper purposes. They were targets of the data leak, too, and indeed, the unfortunate reality is that many cyber-attacks compromise the personal information of entirely innocent parties as well as corporate IP.

Add to the rolls of those working to shore up cyber security in these high-risk times the EU’s PANOPTESEC Consortium. Its goal has been to create:

“A prototype of a cyber defense decision support system, demonstrating a risk-based approach to automated cyber defense that accounts for the dynamic nature of information and communications technologies (ICT) and the constantly evolving capabilities of cyber-attackers.”

Its operational prototype provides a means to prevent, detect, manage, and react to cyber incidents in real time. It supports breach notifications and helps to improve situational awareness and decision-making for security teams, who constantly face attackers able to penetrate systems to extract sensitive information, tamper with data accuracy, and cut off access to critical services.

One key piece of the consortium’s prototype hails from Semantic Web technology company Epistematica, whose staff of data modelers and system designers practice the art of transforming data into computable knowledge. (See our previous coverage of the vendor here.) For PANOPTESEC, Epistematica has developed three components: the Reachability Matrix Correlator (RMC), which provides algorithms for computing reachability information; the Reachability Matrix Ontology (RMO), which describes the cyber-security domain; and the Semantic Reasoning Task, which uses the RMO to help compute the Reachability Matrix (RM). An RM is a matrix that encodes an adjacency structure, which in turn represents a graph. Epistematica’s RM is used as input to PANOPTESEC modules like the Attack Graph Generator.
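To make the matrix-and-graph idea concrete, here is a minimal Python sketch. It is not taken from PANOPTESEC: the node names and links are hypothetical, and the closure step uses the textbook Warshall algorithm rather than Epistematica's Semantic approach. It shows an adjacency matrix for a small directed graph and the reachability matrix derived from it.

```python
from typing import List, Tuple

def adjacency_matrix(nodes: List[str], links: List[Tuple[str, str]]) -> List[List[bool]]:
    """Build a matrix where cell [i][j] is True if node i links directly to node j."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[False] * len(nodes) for _ in nodes]
    for src, dst in links:
        matrix[index[src]][index[dst]] = True
    return matrix

def reachability_matrix(adj: List[List[bool]]) -> List[List[bool]]:
    """Warshall's transitive closure: cell [i][j] is True if j is reachable from i."""
    n = len(adj)
    reach = [row[:] for row in adj]  # copy so the adjacency matrix is left untouched
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach

# Hypothetical four-device network with one-way links.
NODES = ["workstation", "firewall", "web_server", "database"]
LINKS = [("workstation", "firewall"), ("firewall", "web_server"), ("web_server", "database")]

ADJ = adjacency_matrix(NODES, LINKS)
RM = reachability_matrix(ADJ)

for name, row in zip(NODES, RM):
    print(name, [int(cell) for cell in row])
```

In this toy network the workstation has no direct link to the database, yet the reachability matrix marks the database as reachable from it through the intermediate hops.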

Cyber Security in the PANOPTESEC Framework

The PANOPTESEC framework offers some innovative features, such as the ability to measure the business impact of attacks, says Luca Severini, founder of Epistematica. What makes it really unique, though, he says, is the RMC:

“It takes a Semantic approach to computing an RM across a monitored network to deduce if two nodes are reachable from each other in the network, for all pairs of nodes representing devices.”

The RMC takes three files as input: a network inventory (each node’s entity and properties), a deploy/access control policy (routing, NAT, and firewalling rules), and a vulnerability inventory (known vulnerabilities). Input data are parsed as axioms of the ontological concepts to populate the RMC Knowledge Base, Severini explains. Once the Knowledge Base is created, an inferential process is launched: the Semantic Reasoner applies the logic rules and calculates the RM. At the end of the automated reasoning process, the output is a file that contains the RM data. “This file is a big dataset that represents a giant graph,” he says. It stores the output data so that other modules, including the Attack Graph Generator developed by Alcatel-Lucent Bell Laboratories, can use it to help compute ways a system can be attacked.
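The flow described above (parse three inputs into a knowledge base, run inference, write out the RM) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not describe the file formats, the ontology, or the rules, so the in-memory inputs, the predicate names (`inSubnet`, `allowsTrafficTo`, `hasVulnerability`), the single subnet-based rule, and the JSON output are all hypothetical.

```python
import json
from typing import Dict, List, Set, Tuple

Fact = Tuple[str, str, str]  # (subject, predicate, object)

def build_knowledge_base(inventory: Dict[str, str],
                         policy: List[Tuple[str, str]],
                         vulnerabilities: Dict[str, List[str]]) -> Set[Fact]:
    """Turn the three (hypothetical) inputs into one set of facts."""
    kb: Set[Fact] = set()
    for node, subnet in inventory.items():
        kb.add((node, "inSubnet", subnet))
    for src_subnet, dst_subnet in policy:
        kb.add((src_subnet, "allowsTrafficTo", dst_subnet))
    for node, vulns in vulnerabilities.items():
        for vuln in vulns:
            kb.add((node, "hasVulnerability", vuln))
    return kb

def infer_reachability(kb: Set[Fact]) -> Set[Tuple[str, str]]:
    """Assumed rule: A reaches B if they share a subnet or policy allows A's subnet to B's."""
    subnet_of = {s: o for s, p, o in kb if p == "inSubnet"}
    allowed = {(s, o) for s, p, o in kb if p == "allowsTrafficTo"}
    pairs: Set[Tuple[str, str]] = set()
    for a, subnet_a in subnet_of.items():
        for b, subnet_b in subnet_of.items():
            if a != b and (subnet_a == subnet_b or (subnet_a, subnet_b) in allowed):
                pairs.add((a, b))
    return pairs

def to_matrix_json(nodes: List[str], pairs: Set[Tuple[str, str]]) -> str:
    """Serialize the reachability matrix so that other modules could load it."""
    matrix = {a: {b: (a, b) in pairs for b in nodes} for a in nodes}
    return json.dumps(matrix, sort_keys=True)

# Hypothetical input data; the identifiers are placeholders, not real records.
INVENTORY = {"laptop": "office", "printer": "office", "web": "dmz"}
POLICY = [("office", "dmz")]
VULNERABILITIES = {"web": ["EXAMPLE-VULN-1"]}

KB = build_knowledge_base(INVENTORY, POLICY, VULNERABILITIES)
PAIRS = infer_reachability(KB)
RM_JSON = to_matrix_json(sorted(INVENTORY), PAIRS)
print(RM_JSON)
```

The vulnerability facts play no part in this sketch's reachability rule; they are loaded only to mirror the three inputs the article lists.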

Determining if a node can reach another node via ISO/OSI layer protocols is crucial to risk management. The RMC calculates the RM through ISO/OSI Layer 4 in a very short time, so that all the other modules of the PANOPTESEC framework may act or react at an accelerated pace, he explains. The RM also makes it possible to automate more processes that otherwise would require human intervention, thus increasing the efficiency of the system.

The traditional approach to RM calculation is extremely complex, relying on algorithms specialized in solving reachability problems between nodes in a network. Its defects, Severini says, include slowness, because the algorithms are designed to solve the mathematical problem rather than to optimize computing performance. The traditional approach also relies on a non-intuitive model: the mathematics used to calculate the RM do not map directly to the logic a penetration-testing expert uses to determine reachability between network nodes. It is also plagued by maintenance difficulties. “When networks’ characteristics change, also the mathematics may change. Then, the algorithms must be adapted, and the code fixed, every time,” he says.

The benefits of the Semantic approach to Reachability Matrix computation that Epistematica leverages stem from using a standard algorithm to solve the logic. “The Semantic Reasoners are designed to manage graphs (the RM is a graph) so that the computation is executed and completed much more quickly,” he says. The model is also intuitive, because knowledge is represented in an ontological form that is self-explanatory to a human reader. And, “the Semantic rules are the formalization of the logic process carried out by the experts in penetration testing to determine the reachability between network nodes,” he explains. Maintenance is also easy: If something changes in the network topology, one need only act at the data level; that is, modify the ontology, the rules or the Semantic queries. “No change is required to the algorithms and, even less, to the code,” he says.
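The maintenance argument, that rules and facts are data while the reasoning engine stays fixed, can be illustrated with a toy forward-chaining engine in Python. This is a sketch under stated assumptions, not Epistematica's reasoner: it handles simple if-then rules over triples rather than Description Logics, and the predicates (`inSubnet`, `allows`, `canReach`) and node names are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Rule = Tuple[Tuple[Triple, ...], Triple]  # (premises, conclusion); "?x" marks a variable

def match(pattern: Triple, fact: Triple, bindings: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend the bindings so the pattern equals the fact, or return None on a mismatch."""
    result = dict(bindings)
    for term, value in zip(pattern, fact):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result

def apply_rule(rule: Rule, facts: Set[Triple]) -> Set[Triple]:
    """Return every conclusion the rule yields from the given facts."""
    premises, conclusion = rule
    candidates = [{}]
    for premise in premises:
        candidates = [extended for bindings in candidates for fact in facts
                      if (extended := match(premise, fact, bindings)) is not None]
    return {tuple(b.get(term, term) for term in conclusion) for b in candidates}

def saturate(rules, facts: Set[Triple]) -> Set[Triple]:
    """Generic engine: apply all rules repeatedly until no new fact appears."""
    known = set(facts)
    while True:
        derived = set()
        for rule in rules:
            derived |= apply_rule(rule, known)
        if derived <= known:
            return known
        known |= derived

# The rule is data: A can reach B if A's subnet is allowed to talk to B's subnet.
RULES = [
    ((("?a", "inSubnet", "?s"), ("?s", "allows", "?t"), ("?b", "inSubnet", "?t")),
     ("?a", "canReach", "?b")),
]
FACTS = {
    ("laptop", "inSubnet", "office"),
    ("web", "inSubnet", "dmz"),
    ("office", "allows", "dmz"),
}

BEFORE = saturate(RULES, FACTS)
# A policy change is one more fact; match, apply_rule and saturate are untouched.
AFTER = saturate(RULES, FACTS | {("dmz", "allows", "office")})
print(("web", "canReach", "laptop") in BEFORE, ("web", "canReach", "laptop") in AFTER)
```

The second run differs from the first only in its input facts, which is the property the Semantic approach is credited with: a network change is absorbed at the data level, with the engine code left alone.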

This Semantic approach was described in a paper written by Epistematica’s researchers that was presented at the 10th International Conference on Semantic Technology for Intelligence, Defense, and Security last November.

“The problem of the reachability matrix calculation was very challenging,” Severini says. “Thanks to our great experience in knowledge representation using Description Logics, we were sure that it would have been possible to use strong Semantics to represent the knowledge of an expert in penetration testing, and using the automated reasoning to infer the reachability matrix (that’s the logical topology of the network!).”

From Research to Startup

Together with the consortium’s other partners, Epistematica now is involved in some presentations of the research results, primarily to large European organizations that manage critical infrastructures. These include Cloud providers, telecommunications operators, energy providers, governments, and so on. “Many of these have already expressed interest to use the PANOPTESEC framework,” he says.

There have not yet been commercial products for RM calculation because of the difficulties associated with the traditional mathematical approach, he says. But he sees a path to one:

“Providing the RMC with a set of APIs for creating connectors to other applications that produce network inventory and also providing it with a visualization plug-in, it will be possible to derive a commercial product from RMC.”

That product will be a software solution that automates discovery of a network’s logical topology, aimed at companies and professionals worldwide working in cyber security and penetration testing.

That’s why Epistematica has decided to found a new startup, to be launched soon, that will turn its research results from this effort into a commercial product to be sold exclusively online, he says. The new company also will benefit from access to results from another publicly funded research project from the European Space Agency (ESA): The outcome of this effort, the RARE project, is a software environment developed in Java for visualization of ontologies and browsing of Semantic Metadata. “In this context, it will be used to browse the nodes’ properties and to navigate the network’s physical and logical topology,” Severini says.

Epistematica is now searching for venture capital for the startup. At the same time, the company wants to explore creating new applications in other domains based on the RMC’s methodology and technology.

“Using strong Semantics to represent knowledge and automated reasoning to infer, applications like RMC are more autonomous than applications based on syntactic technologies. It could better support fully automated business processes, which are the business processes of the digital age.”
