Tag Archive for annotated corpus

Google Releases Linguistic Data based on NY Times Annotated Corpus

Dan Gillick and Dave Orr recently wrote, “Language understanding systems are largely trained on freely available data, such as the Penn Treebank, perhaps the most widely used linguistic resource ever created. We have previously released lots of linguistic data ourselves, to contribute to the language understanding community as well as encourage further research into these…

KEYNOTE: Semantics at The New York Times – SemTech 2009 Video

The first semantic search system for The New York Times was released in 1913 and was available bound in either paper ($6) or cloth ($8). In the 96 years since the advent of The Historical Index to The New York Times, semantic technology has become central to The New York Times’ daily operations and the focus of much internal research and development.