Road Blocks on the Path to Open Data

March 20, 2012

Richard Van Noorden of nature.com recently reported that although text mining and semantic search have surfaced useful connections among research papers, the publishers of that research are wary of having their rights infringed. He writes, “When he was a keen young biology graduate student in 2006, Max Haeussler wrote a computer program that would scan, or ‘crawl’, plain text and pull out any DNA sequences. To test his invention, the naive text-miner downloaded around 20,000 research papers that his institution had paid to access — and promptly found his IP address blocked by the papers’ publisher. It was not until 2009 that Haeussler, then at the University of Manchester, UK, and now at the University of California, Santa Cruz, returned to the project in earnest. He had come to realize that standard site licences do not permit systematic downloads, because publishers fear wholesale theft of their content.”

Van Noorden continues, “So Haeussler began asking for licensing terms to crawl and text-mine articles. His goal was to serve science: his program is a key part of the text2genome project, which aims to use DNA sequences in research papers to link the publications to an online record of the human genome. This could produce an annotated genome map linked to millions of research articles, so that biologists browsing a genomic region could immediately click through to any relevant papers. But Haeussler and his text2genome colleague Casey Bergman, a genomicist at the University of Manchester, have spent more than two years trying to agree terms with publishers — and often being ignored or rebuffed. ‘We’ve learned it’s a long, hard road with every journal,’ says Bergman.”
