Natural Language Processing: The What, Why, and How

By Michelle Knight  /  July 13, 2017

To the outside observer, Natural Language Processing (NLP) may seem futuristic. Only around a third of smartphone owners use their personal assistants (a hallmark of NLP technology) regularly, even though 95 percent have tried them at some point, according to Creative Strategies, a consultancy. However, NLP continues to advance in leaps and bounds as Deep Neural Networks (DNNs) and Machine Learning become more sophisticated. Both technologies are credited with improving NLP performance by up to 30 percent.

The Economist states that this shift has moved language technology from “usable at a pinch to really rather good”. So, when Barclays, a British bank, offered customer identification by voice using NLP, 84 percent of users signed up within five months, indicating that consumers are jumping on the bandwagon and want more benefits from NLP.

Natural Language Processing (NLP) has entered the mainstream and integrates with Big Data. Take the business traveler. Today, when he or she stays at a hotel like Wynn Las Vegas, the guest can bypass the front desk when requesting extra towels or ordering room service. Thanks to Amazon’s Echo and its use of NLP, a hotel concierge may no longer be necessary. In this new world, Big Data flows in the form of speech, a loop between hotel guest and computer. All guests have access to this technology: Wynn Las Vegas has already added Amazon Echo devices to each of its 4,748 hotel rooms. As consumers become more familiar with NLP and its time-saving benefits, they will be more likely to adopt it in the home and office for other tasks.

What does NLP mean for Big Data? NLP will change everything, from Business Reporting and Data Analytics/Synthesis to Security and Data Governance. The future has arrived.

What is Natural Language Processing (NLP)?

Natural Language Processing (NLP) combines Artificial Intelligence (AI) and computational linguistics so that computers and humans can talk seamlessly. Think of the bridge in Star Trek, where the crew and the spaceship’s computer talk with each other to explore and survive. NLP endeavors to bridge the divide between machines and people by enabling a computer to analyze what a user said (speech recognition of the input) and to process what the user meant. This task has proven quite complex.

To converse with humans, a program must understand syntax (grammar), semantics (word meaning), morphology (word forms, such as tense), and pragmatics (how language is used in conversation). The number of rules to track can seem overwhelming, which explains why early attempts at NLP led to disappointing results.

In 1969, John Pierce of Bell Labs wrote that both the funders and eager researchers had often fooled themselves, and that “no simple, clear, sure knowledge is gained.” After that, funding and interest in language technology went into hibernation for two decades. Then, during the 1980s, Charles Wayne of America’s Defense Advanced Research Projects Agency reframed the human-computer linguistic problem through another approach, the “common task”.

With a different system in place, NLP slowly improved, moving from a cumbersome rules-based methodology to one based on learning patterns from data. Siri appeared on the iPhone in 2011. In 2012, the use of graphics processing units (GPUs) to train deep neural networks brought further improvements to NLP.

Now Google has released its own neural-net-based engine for eight language pairs, closing much of the quality gap between its old system and a human translator and fueling increasing interest in the technology. Computers today can already produce an eerie echo of human language if fed with the appropriate material.

Using NLP to Communicate and Summarize Complex Big Data

Business managers have a Big Data problem. They puzzle over dashboards and spreadsheets, drowning in too much data while trying to compile it all into meaningful information. Arria, a company based in London, has come up with a solution. The Arria NLG Platform is a form of Artificial Intelligence that specializes in communicating information extracted from complex data sources in natural language (i.e., as if written by a human).

It literally takes an organization’s data and transforms it into language, not standard computer-generated text that is overly technical and difficult to read, but natural human language that reads like a literate and well-educated person wrote it.

Arria’s software can take a spreadsheet full of data that is dragged and dropped into it and automatically turn it into a written description of the contents, complete with trends, essentially providing business reports. The Arria NLG Platform achieves this through two main elements: an analytics component that is programmed to embody the expert knowledge of the domain in which it operates, and a natural language generation component that embodies the skill required to communicate information articulately in natural language.
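The two-element design described above can be illustrated with a toy example. The sketch below is not Arria’s implementation; the function names, the one-percent threshold, and the sample figures are all assumptions made for illustration. It only shows the general shape of such a pipeline: an analytics step that reduces raw numbers to a few facts, followed by a generation step that renders those facts as a sentence.

```python
from statistics import mean


def analyze(series):
    """Analytics step: reduce a column of numbers to a few reportable facts.

    The 1 percent threshold for calling a change a trend is an arbitrary
    assumption for this sketch.
    """
    first, last = series[0], series[-1]
    change_pct = (last - first) / first * 100
    if change_pct > 1:
        trend = "rose"
    elif change_pct < -1:
        trend = "fell"
    else:
        trend = "held steady"
    return {
        "first": first,
        "last": last,
        "average": mean(series),
        "change_pct": change_pct,
        "trend": trend,
    }


def describe(metric, facts):
    """Generation step: render the facts as one readable sentence."""
    if facts["trend"] == "held steady":
        return f"{metric} held steady at around {facts['average']:.0f}."
    return (
        f"{metric} {facts['trend']} {abs(facts['change_pct']):.1f} percent, "
        f"from {facts['first']} to {facts['last']}."
    )


# Hypothetical spreadsheet column: revenue for four consecutive quarters.
quarterly_revenue = [100, 110, 120, 125]
report = describe("Revenue", analyze(quarterly_revenue))
print(report)  # Revenue rose 25.0 percent, from 100 to 125.
```

Keeping the two steps separate means the domain knowledge (what counts as a trend) can change without touching the wording, and the wording can change without touching the analysis.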

In October of 2016, Arria announced the private beta launch of Articulator Lite, a Cloud-based toolkit that allows users to build their own applications that create content from data. What does this mean? Matt Gould, Arria’s Chief Strategy Officer, likes to think Arria will free Chief Financial Officers from having to write up the same old routine analyses for the board, giving them time to develop more creative approaches. Computers now allow an enterprise to analyze all that information and present it back to whichever person needs it, for whatever reason, in actual written natural language.

Using NLP to Turn Language into Useful Data

Question answering technology built on 200 million text pages, encyclopedias, dictionaries, thesauri, taxonomies, ontologies, and other databases has gained traction. AI has helped data-rich companies such as America’s West-Coast tech giants organize much of the world’s information into interactive databases such as Google’s Knowledge Graph.

Datalingvo, a Silicon Valley startup, answers questions phrased in natural language about a company’s business data. If a user wants to know which online ads resulted in the most sales in California last month, the software automatically translates the typed question into a database query. Behind the scenes, however, a human working for Datalingvo vets the query to correct misinterpretations by the computer. NLP enthusiasts are now driven to develop a software application smarter than Google, perhaps one that better understands the context of a user’s question (e.g., one that knows whether the person is a student needing information for a report or a field expert).
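To make the idea of translating a typed question into a database query concrete, here is a minimal sketch. It is not Datalingvo’s method: real systems rely on far richer language models, while this example uses simple keyword rules. The table `ad_sales`, its columns, the `STATES` lookup, and the sample rows are all hypothetical. The function returns the SQL text and its parameters separately, so a person could review the query before it runs, much like the human vetting step described above.

```python
import sqlite3

# Hypothetical lookup from state names in a question to codes in the data.
STATES = {"california": "CA", "oregon": "OR"}


def question_to_query(question, last_month):
    """Translate a narrow family of questions into parameterized SQL.

    `last_month` is passed in (e.g. "2017-06") so the example does not
    depend on today's date.
    """
    text = question.lower()
    conditions, params = [], []
    for name, code in STATES.items():
        if name in text:
            conditions.append("state = ?")
            params.append(code)
    if "last month" in text:
        conditions.append("month = ?")
        params.append(last_month)
    where = " WHERE " + " AND ".join(conditions) if conditions else ""
    sql = (
        "SELECT ad, SUM(sales) AS total FROM ad_sales"
        + where
        + " GROUP BY ad ORDER BY total DESC"
    )
    if "most" in text:
        sql += " LIMIT 1"  # "the most" is read as "top result only"
    return sql, params


# Invented sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ad_sales (ad TEXT, state TEXT, month TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO ad_sales VALUES (?, ?, ?, ?)",
    [
        ("banner", "CA", "2017-06", 1200.0),
        ("search", "CA", "2017-06", 3400.0),
        ("search", "OR", "2017-06", 9000.0),
        ("banner", "CA", "2017-05", 8000.0),
    ],
)

question = "Which online ads resulted in the most sales in California last month?"
sql, params = question_to_query(question, "2017-06")
top = conn.execute(sql, params).fetchall()
print(sql)
print(top)  # [('search', 3400.0)]
```

Keyword rules like these break as soon as a question is phrased in an unexpected way, which is one reason a human reviewer or a learned model is needed in practice.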

Natural Language Processing has much promise in Data Security as well. IBM’s new cognitive phishing detection capability uses Machine Learning to help businesses detect a phishing site up to 250 percent faster than traditional methods.

In addition, Watson for Cyber Security, recently launched by IBM, uses Natural Language Processing to gain insights from security documents, aiding Cyber Security in organizations. Nuance uses around 200 parameters to identify a speaker before granting access. The software is probably more secure than a fingerprint, says Brett Beranek, a senior manager at the company.

With NLP advances in security, consumers will pay by the sound of their voices. Alipay, a ubiquitous mobile payments service in China, uses a chatbot system built on Deep Learning and created by Ant to carry on conversations and provide answers. With cutting-edge advances in NLP, data security in the financial industry will continue to see substantial changes.

Natural Language Processing and the Future

Advancements in NLP have implications for Data Governance. NLP gathers copious amounts of data from users, raising important legal issues about data ownership, privacy, and security. Big Brother will take more control over what we see and do, but Big Brother isn’t the government; it’s the big tech corporations: Google, Microsoft, Facebook, Amazon, and others. Governments, to be effective, will need to develop new regulations around how data is gathered and disseminated through NLP, especially where NLP is tied to financial gain.

A European multinational is already piloting NLP (in combination with RPA) to digitize its sourcing approach for its long-tail spend: the long list of small purchases that together may account for only a few percentage points of the budget. It uses Natural Language Processing to interpret free-form text and match order requirements to groups of suppliers, cuing the procurement robot to compare bids and make a purchase. This could change business across a multitude of industries. Organizations around the world will need to be ready to benefit from NLP’s presence.
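Matching free-form order text to groups of suppliers can be sketched in a few lines. The pilot’s actual technique is not described in the source; the example below assumes the simplest possible approach, keyword overlap, and the supplier groups and keywords are invented for illustration.

```python
import re

# Hypothetical supplier groups, each with a handful of trigger keywords.
SUPPLIER_GROUPS = {
    "office supplies": {"paper", "pens", "toner", "stapler"},
    "it hardware": {"laptop", "monitor", "keyboard", "cable"},
    "catering": {"lunch", "coffee", "sandwiches"},
}


def match_group(request):
    """Return the supplier group whose keywords overlap the request most.

    Returns None when no keyword matches, so the request can be routed
    to a person instead of the procurement robot.
    """
    tokens = set(re.findall(r"[a-z]+", request.lower()))
    scores = {
        group: len(tokens & keywords)
        for group, keywords in SUPPLIER_GROUPS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


hardware = match_group("Need a monitor cable and a laptop stand for the new hire")
unknown = match_group("Please book a taxi to the airport")
print(hardware)  # it hardware
print(unknown)   # None
```

A production system would more likely use a trained text classifier, but the routing idea is the same: interpret the text, pick a supplier group, and hand off to the next automated step.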

NLP technology will continue to gain momentum. If you get into a car accident in China in the near future, you’ll be able to pull out your smartphone, take a photo, and file an insurance claim with an AI system. MIT’s Laboratory for Social Machines develops Data Science methods, primarily based on Natural Language Processing, network science, and Machine Learning, for mapping and analyzing social systems and enabling new forms of human networks for positive change.

Already the Media Lab’s Cortico provides newsrooms, advocacy and nonprofit organizations, and community influencers with tools and programs to connect with their audiences on greater common ground. Those with injuries or disabilities that make it hard to write will benefit from using machine translation based on NLP. Natural Language Processing has come to transform business, and its impact will only increase.

About the author

Michelle Knight enjoys putting her information specialist background to use by writing technical articles on enhancing Data Quality, leading to useful information. Michelle has written articles on the W3C validator for SiteProNews, SEO competitive analysis for the SLA (Special Libraries Association), Search Engine alternatives to Google for the Business Information Alert, and introductions to the Semantic Web, HTML 5, and Agile for Seabourne INC LLC, through AboutUs.com. She has worked as a software tester, a researcher, and a librarian. She has over five years of experience contracting as a quality assurance engineer at a variety of organizations, including Intel, Cigna, and Umpqua Bank. During that time, Michelle used HTML, XML, and SQL to verify software behavior through databases. Michelle graduated from Simmons College with a Masters in Library and Information and an Outstanding Information Science Student Award from the ASIST (The American Society for Information Science and Technology), and has a Bachelor of Arts in Psychology from Smith College. Michelle has a talent for digging into data, a natural eye for detail, and an abounding curiosity about finding and using data effectively.
