The Promise of Natural Language Processing


Nathan Safran of Search Engine World recently wrote, "The early days of search required that users query a search engine in query -> database format. That is, to extricate relevant results, the user must phrase the query in such a way that the machine can understand the request, query the database, and return results. Along the way many have claimed to solve this machine language problem, promising users can use natural language processing – the normal everyday language humans use as opposed to the 'query a database' language search has traditionally required (remember Ask Jeeves?)."

He continues, "The last few (turbulent) days in the search space with the rollout of Google's latest algorithm, Hummingbird, has seen us take additional steps toward this future. Hummingbird essentially replaces the old algorithm engine that sought to map words in a query to content with the same terms, to a new engine that seeks to actually understand the meaning behind the query and (hopefully) return more relevant results. As a user, it's hard not to be excited about the prospect of simply asking a computer a question in conversational language and getting back a relevant answer, rather than struggling to formulate a query in a format a machine can use to query a database. As data owners, we must be prepared by ensuring our data is organized in the most accessible ways possible, conforming to relevant data conventions and organized logically so the ever evolving spiders can access and make sense of it."

Presumably, the "relevant data conventions" Safran refers to are semantic, machine-readable formats such as RDF. What do you think it will take to reach the next level of natural language processing?

Image: Courtesy Flickr/db Photography | Demi-Brooke