It shouldn’t be surprising that Entagen, which makes the semantically enabled Big Data analytics and collaboration engine TripleMap, has had its sights set on the life sciences space. CEO Christopher Bouton has a Ph.D. in molecular neurobiology, has worked at a number of biotech firms, and has served as head of integrative data mining at Pfizer, a company that’s using TripleMap for visualized knowledge maps of associations between domain-specific entities (see our story here).
“We see some really compelling and exciting applications of this type of technology in the life sciences space,” says Bouton. But TripleMap can be applied to any scenario where Big Data dots must be connected so that users can collaborate around the understanding of the associations between entities – health care, legal, retail and finance all come to mind.
“We all now have so much data that it’s hard to understand what all the connections are between bits of information, and more specifically types of entities, the types of things represented in the data we work with,” he says. “TripleMap lets you first identify what entities you are interested in working with, and the system is capable of extracting from many different data sources all the metadata about these things and the associations between them.”
That’s what Entagen calls building its “Semantic Data Core” (SDC), a constantly updating index of entities that continuously scans and integrates data from internal and external document and data sources. Users create visual knowledge maps in TripleMap by interacting with the data sources integrated in the SDC. From across that graph of information, they can ask questions that surface unexpected associations, understand connections, and collaborate with others around those connections. “We see that all as fundamentally important to tackling Big Data,” he says.
Semantic technology, specifically Linked Data, is incredibly valuable in that pursuit, Bouton explains, providing needed flexibility in the way TripleMap identifies and integrates data and bridging the gap between structured and unstructured data sources. TripleMap uses open standards, including RDF and SPARQL, for data consumption, representation and querying. In fact, the “Triple” in the product’s name refers to the concept of semantic triples, while “Map” refers to users’ ability to work with visual knowledge maps for a bird’s-eye view of data entities and the connections between them. The technology also uses a proprietary algorithm, dubbed Inferential Connectivity Analysis (ICA), to identify connections between any two entities in its network.
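To make the idea of a semantic triple concrete, here is a minimal sketch in plain Python. It is not Entagen’s code and says nothing about how TripleMap or ICA is implemented. Each fact is a (subject, predicate, object) tuple, and the hypothetical `match` function mimics a single SPARQL-style triple pattern, with `None` standing in for a variable. All entity and predicate names are invented for illustration.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical example data; none of these names come from TripleMap.
TRIPLES = [
    ("compound:X1", "hasTarget", "target:KinaseA"),
    ("compound:X1", "studiedBy", "person:Alice"),
    ("compound:X2", "hasTarget", "target:KinaseA"),
    ("person:Alice", "memberOf", "project:Alpha"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching the pattern; None matches anything."""
    for subj, pred, obj in triples:
        if s is not None and subj != s:
            continue
        if p is not None and pred != p:
            continue
        if o is not None and obj != o:
            continue
        yield (subj, pred, obj)


# Roughly: SELECT ?t WHERE { compound:X1 hasTarget ?t }
targets_of_x1 = [obj for _, _, obj in match(TRIPLES, s="compound:X1", p="hasTarget")]

# Roughly: SELECT ?c WHERE { ?c hasTarget target:KinaseA }
compounds_hitting_kinase = sorted(
    subj for subj, _, _ in match(TRIPLES, p="hasTarget", o="target:KinaseA")
)
```

A real deployment would use an RDF store and a SPARQL engine rather than a Python list, but the shape of the data and of the question is the same.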
The trick, Bouton says, is how to do what it does at scale in a high-performance way. “That’s where it’s really important to look at an architecture that’s not only semantic technology-capable but also Big Data-scalable,” he says. “TripleMap is completely proprietary technology we built from the ground up to do this.”
TripleMap’s Semantic Data Core, which is really the heart of any given TripleMap instance, focuses on search and scalability, including back-end scalability for integrating and stitching data together. “It’s what provides millisecond-scale performance for searches across hundreds of millions to billions of triples,” he says. The company offers a cloud option as well as an on-site solution, which Bouton points out doesn’t need to be hosted on a supercomputer.
Search is what the end user has his or her eye on. “Once a user has run a search, then they have opened up a whole world of information relevant to that,” he says. Once a search has been run, TripleMap doesn’t just pull back the documents that mention the thing a user was searching for; it also dynamically pulls together the metadata properties of all the things relevant to that search and provides faceted browsing based on them.
“Say I run a search for a given compound,” he says. “Once I select the compound the system gives me all the targets known to be associated with that compound, the people in the organization that worked on it, the projects it may have been running, and so on. It represents this all as entities and associations between entities, so you can do more in provisioning of search results than just providing back documents that mention something.”
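Bouton’s compound example can be pictured as grouping everything associated with a selected entity by entity type, which is the basis of faceted browsing. The sketch below is an illustration only, under the assumption that associations are available as simple (source, type, name) records. The `facets_for` function and the sample data are hypothetical and do not reflect TripleMap’s actual data model or API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (source entity, type of associated entity, associated entity name)
Association = Tuple[str, str, str]

# Hypothetical sample data, invented for illustration.
ASSOCIATIONS = [
    ("compound:X1", "target", "KinaseA"),
    ("compound:X1", "target", "ReceptorB"),
    ("compound:X1", "person", "Alice"),
    ("compound:X1", "project", "Alpha"),
    ("compound:X2", "target", "KinaseA"),
]


def facets_for(entity: str, associations: Iterable[Association]) -> Dict[str, List[str]]:
    """Group the entities associated with `entity` by their type."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for source, entity_type, name in associations:
        if source == entity:
            grouped[entity_type].append(name)
    # Sort names within each facet so the result is stable.
    return {entity_type: sorted(names) for entity_type, names in grouped.items()}


facets = facets_for("compound:X1", ASSOCIATIONS)
for entity_type, names in facets.items():
    print(f"{entity_type}: {', '.join(names)}")
```

Each key in the result corresponds to one facet a user could browse (targets, people, projects), rather than a flat list of documents that mention the compound.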
Looking ahead, Bouton says it continues to be a priority for the company to give users ever better ways to collaborate around the information they’re working with in the system. Maps, which users can build and share with colleagues to help them understand why they may have made a particular decision or gone down another line of reasoning, are a focus here. In fact, some exciting work is going on around using maps on touch screens, he says. “The whole world is moving to touch-based computing environments and maps are perfectly suited for touch-based systems.”
Today, touch environments range from tablets to desktops running operating systems like Windows 8. But Bouton says some customers have also expressed interest in setting up large touch arrays: walls of touch screens from which they’d interact with colleagues. That’s further out there, but it’s a tantalizing prospect. “That kind of interactive capability, to really allow a user to convert all this information into knowledge…,” Bouton says. “All these computational systems are tools we can use to basically catalyze that ah-ha moment, so how to do that most effectively, especially dealing with data at the scale we are all dealing with it – that’s where larger environments and larger screens come in handy.”