Want to give Semantic Web technologies a leg up in your enterprise? Then you've got to make a business case for them.
That's what Vijay Bulusu, Customer Engagement Manager, Business Technology, at Pfizer Inc., has done. Granted, he is working in an industry, and at a company, that was one of the first to recognize the power of the Semantic Web to impact efforts such as drug discovery. But Bulusu's innovation was to take things beyond previous technology-based evaluations to a formal proof of concept application that was applied to real business problems.
In his discussions with business reps at Pfizer, two challenges were apparent, Bulusu says. One was dealing with compound purity verification -- the final results of impurity calculations in the substances being evaluated are hosted in one system but the raw data resides in another. 'Compound purity verification' sounds very academic as a concept, but the fact is that both internal scientists and regulatory agencies may need to see not just the final results of impurity values for various compounds, but the numbers that led to them. That kicks off a time-intensive process of having to manually match and verify data from the system containing the end results to the original values, to make sure the correct raw data is presented.
The second problem identified had to do with drug stability analysis. Pharma companies regularly take a drug and at periodic intervals test it for intactness, potency and other values, capturing the data for regulatory submissions. And Pfizer formulators who get tasked with making new drug products need to have an understanding of the impact of the excipients (the inactive ingredients that are mixed with active pharmaceutical ingredients, or APIs, to make a product) on the stability of a drug product. The question: "Is there a way I can put into a system one or many names of excipients, and can the system then give me a list of all the drug products we made in the past using them, and can it also show me how stable those drug products have been over a period of time?" Bulusu says.
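To make the shape of that question concrete, here is a minimal sketch in plain Python of how it could be answered once the data is expressed as subject-predicate-object triples. This is not Pfizer's implementation: the product identifiers, the predicate names (`containsExcipient`, `potencyAtMonth`), and every value are hypothetical placeholders invented for illustration, and a real deployment would use a triple store and a query language rather than an in-memory set.

```python
# Hypothetical triples (subject, predicate, object). All names and numbers
# are invented for illustration; none come from Pfizer data.
TRIPLES = {
    ("product:A", "containsExcipient", "lactose"),
    ("product:A", "containsExcipient", "magnesium stearate"),
    ("product:B", "containsExcipient", "lactose"),
    ("product:C", "containsExcipient", "mannitol"),
    # Stability readings stored as (month, potency in percent).
    ("product:A", "potencyAtMonth", (0, 100.0)),
    ("product:A", "potencyAtMonth", (6, 98.5)),
    ("product:B", "potencyAtMonth", (0, 100.0)),
    ("product:B", "potencyAtMonth", (6, 95.0)),
    ("product:C", "potencyAtMonth", (0, 100.0)),
}


def products_with_excipients(triples, excipients):
    """Return the products that contain at least one of the named excipients."""
    wanted = set(excipients)
    return sorted(
        {s for s, p, o in triples if p == "containsExcipient" and o in wanted}
    )


def stability_history(triples, product):
    """Return the (month, potency) readings for one product, ordered by month."""
    return sorted(
        o for s, p, o in triples if s == product and p == "potencyAtMonth"
    )


def excipient_report(triples, excipients):
    """Map each matching product to its stability readings over time."""
    return {
        product: stability_history(triples, product)
        for product in products_with_excipients(triples, excipients)
    }


if __name__ == "__main__":
    for product, readings in excipient_report(TRIPLES, ["lactose"]).items():
        print(product, readings)
```

The point of the sketch is that the excipient facts and the stability facts can come from different source systems and still be joined on the shared product identifier.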
The basic problem typically comes down to data being spread across multiple repositories and databases. A traditional data warehouse approach could combine all this information into one giant database with a reporting layer built on top of it. "But we wanted to experiment with semantic technologies to see what the true benefits are in scalability, flexibility, and so on," Bulusu says.
Based on running these proof of concept pilots, scalability and flexibility are indeed where Semantic Web technologies win, he says. "You can keep adding more and more data from multiple sources without having to change the underlying schema because you don't have a schema," he says.
"You can keep loading as many triples as you want. And it's flexible--I think the ability to not have a pre-defined concept in mind [of all the applications the data can be used for] before you build your triple store is probably the biggest advantage of semantic technologies." A company like Pfizer can build a triple store with internal data, and later it's easy to merge that with external partners' data or publicly available data as opportunities present themselves. That's very different from the relational database world, where you build the best database for a particular application and don't think about what else that data may need to integrate with down the road.
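The merging described here can be illustrated with a small sketch, again in plain Python and again under stated assumptions: the three "sources", the identifiers, and the predicate names (`hasPurity`, `testedBy`, `hasSynonym`) are hypothetical, and the set union stands in for what a real triple store does when more data is loaded into it.

```python
def merge(*graphs):
    """Combine any number of triple sets; identical triples collapse to one."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def describe(triples, subject):
    """List every (predicate, value) pair known about one subject."""
    return sorted((p, str(o)) for s, p, o in triples if s == subject)


# Hypothetical data from three sources; nothing here is real.
internal = {("compound:X", "hasPurity", 99.2)}
partner = {
    ("compound:X", "hasPurity", 99.2),          # same fact as internal
    ("compound:X", "testedBy", "partner:lab1"),  # predicate new to the store
}
public = {("compound:X", "hasSynonym", "example-name")}

merged = merge(internal, partner, public)

if __name__ == "__main__":
    for predicate, value in describe(merged, "compound:X"):
        print(predicate, value)
```

Note that the partner and public sources introduce predicates the internal data never used, and nothing had to be altered to accommodate them. In a relational design, the equivalent change would typically mean adding columns or tables first.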
The Pfizer pilot relied on integrated technology from Franz's AllegroGraph RDFStore and IO Informatics' Sentient software suite. "I always say it's not about the performance but the flexibility," agrees Jan Aasman, Franz President and CEO. "You don't have to think about all the things you want to do with this data. You can have some discussion in advance but not as much as you need with a relational database."
But even with business cases at hand, is every enterprise ready to embrace that flexibility? It's not easy, Bulusu admits. "Having been a technologist, it took me a while to get over the traditional thinking into this new way of thinking about data and relationships. It's not easy for people who devoted their life to optimizing databases and data warehousing. This is a whole new world."
Change will happen, eventually, and slowly. There are plenty of reasons why, including outsourcing trends. In the pharma industry, for example, there's a growing emphasis on outsourcing scientific studies and trials. "Historically companies like ours have exchanged data with our partners. We send data to them on disks or hard drives and they do the same," Bulusu notes. "You can stop doing all that by building a semantic infrastructure in place with partners. Obviously there are issues with trust and security, but as part of the semantic stack there are standards emerging."
The next steps after Pfizer's successful proof of concept are to be determined. Says Bulusu, "We know business problems can be solved using these technologies, and now it's a matter of Pfizer as a business trying to figure out what we want to do in the space."