George Thomas of Data.gov recently called out a number of technologies and products employed by Data.gov projects. Thomas writes, "When the Centers for Medicare and Medicaid Services (CMS) decided to publish their Clinical Quality Linked Data on Healthdata.gov, we made extensive use of DERI's RDF extension for Google Refine, helping to design the RDF Schemas we used to define the metadata to capture a controlled vocabulary for Hospital Compare."
Thomas goes on, "We did our first schema pass with Refine+DERI, using it to do rapid prototyping, leveraging the capabilities it provides for mapping data sources in CSV/TSV formats to an instance of what resulted in our RDFS (Resource Description Framework Schema) vocabularies, which provided a quick and easy way to see whether the triples that were generated from the mapping looked like what we wanted. Usually, we ended up polishing our schemas with powerful IDE-based RDF editors, like the popular TopBraid Composer from TopQuadrant."
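To make the mapping step concrete, here is a minimal Python sketch of turning tabular rows into RDF triples (serialized as N-Triples). It is not the Refine+DERI extension or the CMS schema: the `example.org` namespaces, the column names, and the sample row are all hypothetical, chosen only to show how one quickly eyeballs the triples a mapping produces.

```python
import csv
import io

# Hypothetical namespaces, for illustration only (not the CMS vocabulary).
BASE = "http://example.org/hospital/"
VOCAB = "http://example.org/vocab#"


def escape_literal(value):
    """Escape backslashes, quotes, and newlines for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def rows_to_ntriples(csv_text, id_column):
    """Map each CSV row to one triple per non-empty, non-ID column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    triples = []
    for row in reader:
        # The ID column becomes the subject URI of every triple in the row.
        subject = f"<{BASE}{row[id_column]}>"
        for column, value in row.items():
            if column == id_column or not value:
                continue
            predicate = f"<{VOCAB}{column}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Made-up sample data with hypothetical column names.
sample = "provider_id,name,city\n010001,Example General Hospital,Dothan\n"
for line in rows_to_ntriples(sample, "provider_id"):
    print(line)
```

Printing the generated triples for a handful of rows is the same kind of quick check Thomas describes: if the subjects and predicates look wrong, the mapping or the vocabulary gets adjusted before any polishing in a full RDF editor.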
He continues, "Once we launched our Virtuoso-powered Clinical Quality Linked Data site, Refine+DERI proved useful with even more powerful capabilities, such as reconciliation services that leverage our SPARQL endpoint, allowing us to resolve the string-based attributes of health domain entities like hospitals from multiple publication sites, against the CMS-published URIs that globally disambiguate the identity of those hospitals, enabling data from disparate publications to automatically aggregate around that identity."
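The reconciliation idea can be sketched in a few lines of Python. This is a simplified stand-in, not the Refine+DERI reconciliation service: an in-memory dictionary takes the place of the SPARQL endpoint, matching is done by simple string normalization, and the URIs, hospital names, and record fields are all hypothetical.

```python
import re
from collections import defaultdict

# Stand-in for the published identifiers; URIs and labels are made up.
CANONICAL = {
    "http://example.org/hospital/010001": "Example General Hospital",
    "http://example.org/hospital/010005": "Sample Regional Medical Center",
}


def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())


# Index canonical labels by their normalized form for exact lookup.
INDEX = {normalize(label): uri for uri, label in CANONICAL.items()}


def reconcile(name):
    """Return the canonical URI for a name string, or None if unmatched."""
    return INDEX.get(normalize(name))


# Records from two hypothetical publications that spell the name differently.
records = [
    {"source": "site_a", "hospital": "EXAMPLE GENERAL HOSPITAL.", "score": 91},
    {"source": "site_b", "hospital": "Example  General-Hospital", "score": 88},
    {"source": "site_b", "hospital": "Unknown Clinic", "score": 70},
]

# Group records around the shared identity; unmatched names are skipped.
by_uri = defaultdict(list)
for record in records:
    uri = reconcile(record["hospital"])
    if uri is not None:
        by_uri[uri].append(record)

for uri, matched in by_uri.items():
    print(uri, [r["source"] for r in matched])
```

Once each name string resolves to a single URI, records from separate publications fall into the same group without any further matching logic, which is the aggregation effect Thomas describes.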
Image: Courtesy Data.gov