Waqar Hasan, who in a past life was VP for data platforms at Yahoo, hasn’t lost his fascination with the power a business can gain when it knows what to do with its data – make that its Big Data. Now CEO of InsightsOne, Hasan and his company are focused on making predictive analytics accessible to the general B2C marketing organization via the cloud.
Among its early customers is online review site Angie’s List, which in mid-January selected the cloud-based predictive analytics solution to deliver a 1-to-1 consumer marketing experience to its members.
“We’re targeting B2C marketers to increase the relevance and profitability of their consumer interactions, by applying micro-segmentation on Big Data to extract all sorts of signals from the data and turn it to a more powerful predictor for the future – who will buy what and who is likely to churn,” Hasan says.
Behind the goal is a system that marries open source technologies like Hadoop and real-time processing with capabilities including machine learning built atop its graph processing infrastructure to help companies see patterns within time-stamped data, like consumers’ online searching or buying behaviors and recorded call center interactions. Its Graph and Sequence Processor (GRASP) supports new structured data types such as graph structures, time sequences and row sets, the company says, in order to find and exploit relationships within data such as correlations and dependencies.
“What is important in terms of predictive analytics is to take sequences of actions in time and see patterns in them,” he says. “That’s a very good fit with graph processing representations vs. traditional relational representations, where in a data warehouse you lay things out in a fact table and have a very large collection of events that happened. But you can’t turn them into sequences because relational technology is not very good at joins.”
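To make the idea of turning a flat event table into per-customer sequences concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `events` rows, the field layout and the `build_sequences` and `count_transitions` helpers are hypothetical and are not drawn from InsightsOne's GRASP system.

```python
from collections import Counter, defaultdict

# Hypothetical fact-table rows: (customer_id, timestamp, action).
# Rows arrive in no particular order, as they might in a warehouse extract.
events = [
    ("c1", 3, "call_center"),
    ("c2", 1, "search"),
    ("c1", 1, "search"),
    ("c1", 2, "view_product"),
    ("c2", 2, "purchase"),
]


def build_sequences(rows):
    """Group events by customer and order each group by timestamp."""
    by_customer = defaultdict(list)
    for customer_id, timestamp, action in rows:
        by_customer[customer_id].append((timestamp, action))
    return {
        customer_id: [action for _, action in sorted(items)]
        for customer_id, items in by_customer.items()
    }


def count_transitions(sequences):
    """Count how often one action is immediately followed by another."""
    transitions = Counter()
    for actions in sequences.values():
        for current, following in zip(actions, actions[1:]):
            transitions[(current, following)] += 1
    return transitions


sequences = build_sequences(events)
transitions = count_transitions(sequences)
```

Once events are held as ordered sequences, a pattern such as "search followed by purchase" becomes a simple count over adjacent pairs rather than a self-join on the event table.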
Its configurable machine learning algorithms work on top of its graph data infrastructure. “What we are trying to do with machine learning is basically micro-segmentation, to combine lots of weak signals into strong ones,” Hasan says, to find correlations in the data that are otherwise less than apparent. The system applies whatever data a customer supplies from its interactions over the web, email or other channels, and its approach includes turning unstructured data into structured data to feed into the graph and sequence processing infrastructure. It could break down an online product review, for instance, into separate pieces, including the sentiment expressed about the product, the time the review was posted and who posted it, so that those signals are accounted for alongside the structured data.
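One common way to combine many weak signals into a stronger one is to add their log-odds contributions, in the style of naive Bayes. The sketch below assumes the signals are independent and uses made-up likelihood ratios; the article does not say which algorithms InsightsOne actually uses, so `combine_signals` is purely illustrative.

```python
import math


def combine_signals(prior, likelihood_ratios):
    """Combine independent weak signals into one probability.

    prior: baseline probability of the outcome (for example, churn).
    likelihood_ratios: for each observed signal, how much more likely the
        signal is among customers with the outcome than among those without.
        The values used below are hypothetical.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


baseline = 0.10
one_signal = combine_signals(baseline, [1.5])
six_signals = combine_signals(baseline, [1.5] * 6)
```

With a 10 percent baseline, a single signal with a likelihood ratio of 1.5 moves the estimate only slightly, while six such signals together push it past 50 percent. The independence assumption rarely holds exactly in practice, which is one reason production systems fit the weights from data instead.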
For that work, the company uses text processing techniques that are statistical in nature. Its Unstructured Data Processors are a collection of inference engines that extract signals from semi-structured data such as tweets, searches, transaction descriptions and reviews.
“We find in general that it is statistics technologies that are language independent that tend to work a lot better vs. something that seeks to build language-specific solutions or natural language processing-like techniques,” says Hasan. “Regarding sentiment, for instance, I would say it is not about building a deep understanding of language itself. It’s really about finding the right phrases that end up correlating with certain kinds of actions. So we look upon it as what is it that correlates in a statistical sense, and that is the basis for inferring it is sentiment that causes these kinds of actions.”
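The language-independent, statistical approach Hasan describes can be illustrated by measuring how strongly a phrase correlates with a later action, without parsing the language at all. The sketch below computes a simple lift ratio; the sample reviews, the outcome flags and the `phrase_lift` helper are hypothetical and are not InsightsOne's implementation.

```python
def phrase_lift(documents, phrase):
    """Return P(action | phrase present) / P(action), or None if undefined.

    documents: list of (text, acted) pairs, where acted is True when the
        customer later took the action of interest (for example, churned).
    """
    if not documents:
        return None
    base_rate = sum(acted for _, acted in documents) / len(documents)
    matches = [acted for text, acted in documents if phrase in text.lower()]
    if not matches or base_rate == 0:
        return None
    return (sum(matches) / len(matches)) / base_rate


# Hypothetical review snippets paired with whether the customer churned.
reviews = [
    ("Never again, cancel my account", True),
    ("Love it, works great", False),
    ("I want to cancel my account today", True),
    ("Fine service overall", False),
]

lift = phrase_lift(reviews, "cancel my account")
```

A lift above 1 means the phrase shows up disproportionately among customers who went on to take the action. Nothing in the calculation depends on the language the text is written in.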
Companies have traditionally tried to do predictive analytics the manual way, with statisticians building models from data extracted from a data warehouse, then handing those models over to IT to build into particular systems like web sites or other consumer interaction points, he says. “But those models are not using all the big data statistics signals. They typically don’t use unstructured data and they get stale very quickly,” he says. “So the transition is happening from coarse-grained, manual, batch-style treatment of consumer interactions to something that is Big Data-based, real time, and where the models are a lot more accurate, and that is changing the quality of the interaction and the profitability of the interaction.”
As an example of its success, he says InsightsOne’s customers include a major health insurance company that can now predict who is likely to churn four times better than it could with the prior generation of technologies.
InsightsOne, which raised VC financing just over a year ago, has focused since then on features such as making its real-time capabilities more robust. That’s key for enabling interactions in which call center employees know that a customer calling in has just been on the company’s web site looking at a certain product, and can be advised to talk to the customer about that product, the prediction being that it is what the customer is most likely to respond to. Because the predictive analytics system knows the customer looked at that product right before the call, it treats that as a more relevant area of focus than older interactions, Hasan says, though it’s also possible to look at what the same customer has done over a period of time and weigh that into the next-best action or offer for the call center operator to make.
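One simple way to let a just-viewed product outweigh older history, while still counting that history, is to score each interaction with an exponential decay on its age. This is a sketch under assumptions: the half-life parameter, the hour-based timestamps and the `next_best_product` helper are hypothetical, and the article does not describe how InsightsOne actually weighs recency.

```python
from collections import defaultdict


def next_best_product(interactions, now, half_life_hours=24.0):
    """Pick the product with the highest recency-weighted score.

    interactions: list of (product, timestamp_in_hours) pairs.
    now: current time, in the same hour units.
    half_life_hours: age at which an interaction's weight drops to half.
    """
    scores = defaultdict(float)
    for product, timestamp in interactions:
        age = now - timestamp
        scores[product] += 0.5 ** (age / half_life_hours)
    if not scores:
        return None
    return max(scores, key=scores.get)


# Hypothetical history: two older printer views and one laptop view
# half an hour before the call comes in at hour 100.
history = [("printer", 0.0), ("printer", 10.0), ("laptop", 99.5)]

suggestion = next_best_product(history, now=100.0)
long_memory = next_best_product(history, now=100.0, half_life_hours=10000.0)
```

With a one-day half-life, the laptop viewed just before the call dominates the score. With a very long half-life, the two older printer views outweigh it, which shows how the decay rate sets the balance between recent behavior and longer-term history.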
“Companies like Google, Facebook, Amazon and Yahoo have built consumer predictive analytics in house,” says Hasan. “InsightsOne wants to make similar capabilities available to the rest of the world.”