Bioinformatics software provider IO Informatics recently released its free Knowledge Explorer Personal Edition. Version 3.6 of the Personal Edition can handle most of what Knowledge Explorer Professional 3.6, launched in October, can, but it does all its work in memory without direct connectivity to a back-end database.
“In particular, a lot of the strengths of Knowledge Explorer have to do with modeling data as RDF and then testing queries, visualizing and browsing the data to see that you have the ontologies and data mappings you need for your integration and application requirements,” says Robert Stanley, IO Informatics president and CEO. The Personal version is aimed at academic experts focused on data integration and semantic data modeling, as well as personal power users in life sciences and other data-intensive industries, or anyone who wants to learn the tool in anticipation of leveraging their enterprise data sets for collaboration and integration projects.
The Knowledge Explorer 3.6 feature set extends the product’s thesaurus application so that users can bring in additional thesauri and vocabularies, and it also extends the user interaction options for importing, merging and modifying ontologies. For the Pro edition, IO Informatics has also been working with database vendors to increase query and loading speed.
“One of our goals with this is to create a one-stop semantic integration tool. That fits at the intersection of ontology resources, vocabulary and thesauri resources and data sources, whether they are spreadsheets, tables from a relational database, public data, text mining output, or semantic data sources,” Stanley says. “It’s a pure GUI-driven product but you can look at SPARQL and edit if you need to. These facilities bring down the barrier to entry for semantic technology in particular, and for data integration generally.”
Customers of the Pro edition today are using the tool to bring together data, visualize relationships between data sets, identify patterns and turn them into SPARQL queries to create screening apps in the service of everything from discovering what is unique about a patient at risk of organ transplant rejection to understanding what combinations of treatment would be most effective for treating cancer based on a specific patient’s pattern of genes, proteins and clinical symptoms.
Stanley sees a number of potential applications for those who might like to try the Personal version for integrating and modeling smaller data sets. “Maybe a customer has a number of reports on protein expression experiments and a lot of clinical data associated with that, including healthcare records and various report spreadsheets, and they must integrate those to do some research for themselves or their internal customers,” he says, as one example. “You can do that even using the Personal version to create a well integrated, semantically formatted file.” The Pro version applies the same methods but delivers integrated semantic data warehouses or datamarts for secure data sharing and research collaboration. The tool can also federate distributed SPARQL endpoints and can scale to enterprise integration, master knowledge building and management systems, he says.
The Personal version also can come in handy as an exploratory tool to help the student who isn’t sure whether to pursue hands-on biology in the lab or data-driven informatics. It’s also a good tool for interrogating datasets for potential inconsistencies or connections, helping users learn what they need in order to curate and integrate data. “You need to integrate data but it is often so inconsistent or oddly constructed that you need to evaluate, curate, harmonize and normalize the data before it can be usefully linked,” Stanley says.
The free Personal version has seen some fast uptake, he says. “One of the reasons we let the Personal version out for free is that we have seen so much benefit from the work at the W3C, Stanford, MIT and others, broadly, from many academic, open source and pre-competitive standards bodies. They have provided so much value to us. We wanted to make this tool available freely, for anyone to use.”