Coming in June from start-up Meronymy is a new RDF enterprise database management system, the Meronymy SPARQL Database Server. Inge Henriksen founded the company after seeing a need for a faster and more scalable RDF database server.
The idea to focus on a database server exclusively oriented to Linked Data and the Semantic Web came as a result of Henriksen’s work over the last decade as an IT consultant implementing many semantic solutions for customers in sectors such as government and education. “One issue that always came up was performance,” he explains, especially when performing more advanced SPARQL queries against triple stores using filters, for example.
“Once the data reached a certain size, which it often did very quickly, the size of the data became unmanageable and we had to fall back on caching and the like to resolve these performance issues.” The problem there is that caching isn’t compatible with situations where there is a need for real-time data.
The company wrote every component of the database server from scratch with an eye to optimizing performance. Among the features it notes are an in-process query optimizer that determines the most efficient way to execute a query; an in-process memory manager for faster memory allocation and de-allocation; an in-process multi-threaded HTTP server that gives a much faster SPARQL Protocol endpoint than a standard out-of-process web server; and an in-process, directly coded lexical analyzer for efficient query parsing. It also lists snapshot isolation for fast transaction processing, an in-process stream-oriented XML parser for fast RDF/XML parsing, and a native RDF data model, so there are no data model abstraction layers to slow data processing.
In addition to high performance, Henriksen had other requirements in his sights. One was support for the ACID properties, which guarantee that database transactions are processed reliably. “Your data transactions are taken care of so that any data transaction isn’t committed fully to disk until it’s fully completed,” he notes.
He also saw that other offerings tended to be tied to a particular programming language and platform. He thinks that an option like Meronymy SPARQL Database Server, which can be used from virtually any modern programming language, will be a benefit to businesses, as will its full implementation of W3C recommendations like RDF/XML, RDFS, SPARQL, and SPARQL Protocol, along with the SPARQL/Update 1.1 working draft.
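The language independence follows from the SPARQL Protocol itself, which carries a query as the `query` parameter of an ordinary HTTP request. The Python sketch below builds such a request, without sending it, for a query that uses a filter. The endpoint address and the FOAF data are assumptions made for illustration; the article gives no server address or sample data.

```python
from urllib.parse import urlencode, parse_qs, urlsplit
from urllib.request import Request

# Hypothetical endpoint address; not taken from the article.
ENDPOINT = "http://localhost:8080/sparql"

# Illustrative query with a FILTER; the FOAF data is assumed.
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?age
WHERE {
  ?person foaf:name ?name ;
          foaf:age  ?age .
  FILTER (?age >= 18)
}
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL Protocol query as an HTTP GET."""
    # The protocol passes the query text in the URL-encoded "query" parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL XML results format.
    return Request(url, headers={"Accept": "application/sparql-results+xml"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.get_method(), request.full_url)
```

Any language with an HTTP client can issue the same request, and passing `request` to `urllib.request.urlopen` would send it to a running endpoint.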
Henriksen also saw an opportunity to distinguish the offering from competing database servers through its focus on security. “That is something we often see is missing,” he says, referring to the lack of support for access rights on database queries. “In some cases you don’t want the people accessing your SPARQL endpoint to query all your database systems on the database server or all the RDF documents on the database server. You might want something that is private. By having access control on the database level, you can manage who has read, write and execute access on the different databases.”
As an example scenario, an enterprise might work internally with a big semantic data set and publish a subset of it to partners or customers on the Internet, and it wants those third parties to be able to query only the subset via a SPARQL endpoint. “It’s easier to manage access control on the database level. You can use the management console [an optional piece of software that provides a graphical overview of the entire database] or write SPARQL queries to do this because the entire access control and database structure is managed in the RDF database of its own that is called System Catalogue,” Henriksen says. The management console can be used on remote workstations or on the server itself.
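The idea of database-level access control can be illustrated with a small Python sketch. It is a conceptual model only, not Meronymy's implementation: the article does not describe the System Catalogue vocabulary, so the database names, user names, and the `is_allowed` helper here are all hypothetical.

```python
from enum import Flag, auto


class Access(Flag):
    """The three rights named in the article."""
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()


# Hypothetical grants: database name -> user -> rights on that database.
GRANTS = {
    "internal": {"analyst": Access.READ | Access.WRITE | Access.EXECUTE},
    "public_subset": {
        "analyst": Access.READ | Access.WRITE,
        "partner": Access.READ,
    },
}


def is_allowed(user: str, database: str, needed: Access) -> bool:
    """Return True only if the user holds every right in `needed`."""
    # Unknown databases or users fall back to an empty set of rights.
    granted = GRANTS.get(database, {}).get(user, Access(0))
    return (granted & needed) == needed


# A partner may read the published subset but not the internal data set.
print(is_allowed("partner", "public_subset", Access.READ))
print(is_allowed("partner", "internal", Access.READ))
```

Because rights attach to a whole database, exposing the public subset to third parties comes down to a single grant on that database rather than per-document rules.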
The company expects to conduct a closed beta test soon, which users can register for on its site.