Identify Data Patterns with Natural Language Processing and Machine Learning


Discovering, extracting, and analyzing patterns in textual data from the myriad data sources streaming into modern data-driven organizations is no easy task. Organizations must be equipped with state-of-the-art techniques such as Natural Language Processing (NLP), within well-developed Artificial Intelligence (AI) and Machine Learning (ML) platforms, to reliably understand the pulse of their consumers in real time while also controlling the data deluge that often overwhelms under-prepared organizations.

The ability to derive patterns and insights from a plethora of structured and unstructured document types requires the skill to prioritize and understand which pieces of information are most important to act upon first. According to Karthikeyan Sankaran, Director of Data Science and Machine Learning at LatentView Analytics, in a recent DATAVERSITY® interview, such a skill requires organizations to have a platform that can “harness the textual data assets so they can then potentially solve interesting and profitable use cases.” With such potential in mind, said Sankaran, text Data Analytics is needed to accomplish two main objectives:

  1. A need for models and techniques to extract structure & meaning within each document.
  2. Given a content corpus, ways to tease out the relationships & concept similarities that exist within it.

This is where Natural Language Processing (NLP), as a branch of Artificial Intelligence, steps in, extracting interesting patterns in textual data using its own unique set of techniques. In the interview, Sankaran elaborated on the techniques required, detailing how they apply to specific areas of an organization, including brand sentiment analysis, recruitment, media and publishing, financial marketing, and call center operations.
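As an illustration of the second objective (teasing out similarities within a corpus), here is a minimal sketch that scores document similarity with TF-IDF weights and cosine similarity. It is not LatentView's method; the function names, the smoothed IDF formula, and the three sample documents are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption) keeps weights positive for common terms.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample corpus, invented for illustration only.
docs = [
    "The contract sets pricing terms for the supplier.",
    "Pricing terms in the supplier contract were updated.",
    "Customers praised the new snack flavor on social media.",
]
vectors = tfidf_vectors(docs)
sim_contracts = cosine(vectors[0], vectors[1])
sim_unrelated = cosine(vectors[0], vectors[2])
print(sim_contracts > sim_unrelated)
```

The two contract-related documents share several weighted terms, so they score closer to each other than either does to the social media comment. A production system would more likely rely on an established library and richer representations than raw term counts.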

DATAVERSITY (DV): What is the main value proposition of LatentView Analytics?

Karthikeyan Sankaran: LatentView is a leading global analytics and decision sciences provider, delivering solutions that help companies drive digital transformation and use data to gain a competitive advantage. With analytics solutions that provide a 360-degree view of the digital consumer, fuel Machine Learning capabilities, and support Artificial Intelligence initiatives, LatentView enables leading global brands to predict new revenue streams, anticipate product trends and popularity, improve customer retention rates, optimize investment decisions and turn unstructured data into a valuable business asset. LatentView is a trusted partner to enterprises worldwide, including more than two dozen Fortune 500 companies in the retail, CPG, financial, technology and healthcare sectors. LatentView has more than 550 employees in offices in Princeton, N.J., San Jose, Calif., London, Singapore and Chennai, India.

DV: What in your opinion is Natural Language Processing and how is it helpful to the Analytics industry?

Karthikeyan Sankaran: There is a data deluge and it is happening right now, accelerating all the time with the world’s data doubling every two years (the equivalent of Moore’s law for the data world). A lot of this data is unstructured, composed of images, text, speech, videos, etc. From a business standpoint, the most fundamental and important piece of unstructured data is “Text” data, which is part of business contracts, product documentation, pricing playbooks, and marketing media, to name a few. So, if organizations can harness these text data assets, which are both internal & external to the enterprise, they can potentially solve interesting and profitable use cases. Natural Language Processing is a set of techniques used to extract interesting patterns in textual data.
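To make "extracting patterns in textual data" concrete, one very simple technique is counting recurring word pairs (bigrams) across a set of texts. The sketch below is only an illustration under stated assumptions: the stopword list is a tiny made-up sample, the reviews are invented, and nothing here reflects how LatentView's tools work.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with"}


def frequent_bigrams(texts, top_n=3):
    """Count adjacent word pairs across all texts and return the most common."""
    counts = Counter()
    for text in texts:
        # Stopwords are dropped before pairing, so words that were separated
        # only by a stopword are treated as adjacent.
        tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
        counts.update(zip(tokens, tokens[1:]))
    return counts.most_common(top_n)


# Hypothetical customer comments, invented for illustration only.
reviews = [
    "The battery life of the phone is great.",
    "Battery life is too short.",
    "Great screen, but battery life could be better.",
]
top = frequent_bigrams(reviews, top_n=1)
print(top)  # [(('battery', 'life'), 3)]
```

Even this crude count surfaces the theme the three comments have in common, which is the kind of signal that is impractical to find by hand at scale.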

DV: What is LatentView doing differently with NLP, AI, Deep Learning, and Machine Learning that helps them stand out from others?

Karthikeyan Sankaran: The question that NLP tries to answer is: “Given a large corpus of text, what steps are to be carried out to generate insights and meaning?” There are a wide range of techniques LatentView uses:

Photo Credit: LatentView Analytics


LatentView has a very comprehensive set of offerings around NLP, as shown diagrammatically above. The power of analytics is multiplied when algorithms can scale up to large data sets. LatentView’s proprietary platform, Amplifyr, is a simple, easy-to-use, code-free analytics platform that helps citizen data scientists quickly discover insights from large datasets. Amplifyr breaks down barriers to data analysis by dramatically reducing the complexity involved in ingesting and preparing data and exploring relationships within data sets. With Amplifyr, businesses can build ensemble models to predict future customer behaviour, churn, and sales, visually compare diagnostics, and understand drivers of customer behaviour or risk.

Additionally, LatentView has over 20 automated proprietary solutions in the AI, ML, Deep Learning domain, focussed on solving key business problems for organizations across industries.

DV: On your website, you have a statement: “LatentView provides a 360-degree view of the digital consumer, enabling companies to predict new revenue streams, anticipate product trends and popularity, improve customer retention rates and optimize investment decisions.” Please elaborate on this? How does LatentView do this? What sort of technologies and practices has LatentView developed to bring this to a reality?

Karthikeyan Sankaran: LatentView’s proprietary technology includes Advanced Analytics solutions designed to bring innovation to its customers’ digital transformation initiatives. These solutions support disruptive technologies such as Artificial Intelligence and Machine Learning, which are growing globally as companies seek deeper insights from their data and greater freedom for personnel to innovate.

LatentView works at the intersection of Business, Data, Math (Analytics / Quantitative Techniques) and Distributed Processing (Big Data) to produce actionable insights that help answer business questions using an approach that is simple, pragmatic and effective. Quantitative Techniques cover the spectrum of Data Science and include Machine Learning, Data Mining, Process Mining, Statistical Inference, Optimization, Business Process Simulations, Text Analytics, Data Visualization and other Cognitive technologies.

Specialists in business-focused analytics:

  • Single-minded focus to create strong impact on business metrics – Use cases:
    1. For a Europe-based iconic automobile manufacturer, LatentView helped reduce warranty costs by 35%
    2. For an American multinational technology giant, LatentView helped improve customer experience, resulting in a potential monetization benefit of between $3M and $18M annually
    3. For a leading US-based beverages & snack manufacturer, LatentView identified demand spaces leading to a potential lift in sales in the range of 1.2-1.5X
    4. For a leading baby food nutrition company, LatentView drove a 30% increase in web enrollment
  • 33% of our people have a business degree:
    1. Top engineering + Business schools
    2. 1 month training program + LEAP – an extensive, skill-set based curriculum to bring them to industry standards
    3. Well-planned career path that gives them global exposure; play a strategic role in the growth of the Fortune 500 companies they consult for

Experts in unconventional data sources & Cloud-based data ecosystems

  • Partnership with Amazon AWS, & Microsoft Azure
  • Microsoft’s Gold Analytics Partner: LatentView has secured Microsoft Gold Partnership for Data Analytics. Microsoft Gold Partnership is the highest possible partnership in the Microsoft Partner ecosystem.

Cutting edge customizable proprietary solutions & reusable frameworks

  • Panel Miner: Panel Miner is a Cloud-based, fully automated data engineering, exploration, visualization and analytics solution for digital panel data, built using Hadoop (MapR), RedShift (Data Warehouse), S3, Tableau, Python, and JavaScript, and orchestrated with data pipelines.
  • TurfView: TurfView is a solution that helps firms with a digital presence analyze the competitive landscape with respect to SEM and pricing at a granular level in order to plan their online marketing strategy. It is a fully automated, cloud-based solution capable of scaling up to millions of queries across multiple sites, geographic locations and form factors.

Focus on Innovation & Thought Leadership

  • Institutionalized IdeaLabs for innovation and research: We typically reinvest close to 5% of our revenues into R&D. We have built a culture of innovation and R&D due to the nature of the industry we are in and our teams often deal with new technologies. While innovation has always been an integral part of LatentView’s culture, IdeaLabs (our in-house R&D center) was set up in early 2015 to adopt a more formal and structured approach to innovation. It aims to build market-ready analytics solutions in dynamic and emerging technology areas. It also builds on solutions created by delivery teams, to make them more comprehensive and world class.
  • Partnership with IIT Madras to create IIT Data Labs: In addition to IdeaLabs, LatentView has also established a data analytics lab in association with IIT Madras. The aim of this lab is to conduct advanced research programs and projects in data analytics. The data lab also develops thought leadership while advancing the capabilities of the entire industry.

DV: Where do you see this going into the future? One to three years? Five years?

Karthikeyan Sankaran: Natural Language Processing is growing in stature and can be applied in a variety of situations that deal with text data. At a time when businesses are flooded with more and more data, these techniques provide decisive insights that were not practical to obtain by manual means. LatentView, as an expert in harnessing unconventional sources of data, is working on cutting-edge NLP techniques that help global brands understand the pulse of the consumer.

Text is only one dimension of the unstructured data challenge. LatentView is also actively working on mining deep insights from other cognitive areas such as voice recognition, speech, image classification etc.


About Karthikeyan Sankaran


Karthikeyan Sankaran (Karthik) has close to 20 years of experience in the software industry with a specific focus on Business Intelligence, Data Science & Advanced Analytics. Karthik has provided business solutions using data-driven insights to customers in multiple industry domains such as Financial Services, Retail, and Consumer Goods & Manufacturing. In his current role as Director at LatentView Analytics, he operates at the intersection of Business, Data, Math and Technology (Big Data, Cloud) to answer business questions using an approach that is simple, pragmatic and effective. Karthik has an engineering degree from the Indian Institute of Technology, Madras and an MBA from the Indian Institute of Management, Calcutta.

More information regarding LatentView solutions can be found here.


Photo Credit: sasirin pamai/

