Exploratory Analysis: Parsing Event Data

by Jelani Harper

Of all the traits of Big Data that account for this term’s relevance in today’s Data Management landscape, its temporal aspect may prove the most valuable for the enterprise.

The utility of analyzing massive volumes of varied data only increases when one is able to do so in real time (or close to it). The ability to query such data with NoSQL technologies, and to ask any number of germane questions of them, can provide the sort of insight that traditional Business Intelligence tools and SQL-based approaches simply cannot match.

Traditional tools can provide knowledge of what happened and who caused it. The nearly instantaneous feedback of real-time approaches builds upon those capabilities by also providing insight into why an event occurred, and what can be done to make such an event more or less likely to happen again (depending on whether it was desirable).

Such a preoccupation with exploratory analysis on real-time Big Data sets that hinge on temporal factors is the focus of the recently released Interana platform.

“It’s all about a person looking at a data set and trying to find the meaning from it,” Interana Chief Technology Officer Bobby Johnson said. “Our angle on this is that what makes that use case really effective is having it be very fast and interactive so you can ask lots of questions quickly.”

Lots of Questions

The greatest benefit of analyzing large data sets in real time is that doing so provides a glimpse into behavior and enables the user to evaluate patterns that can affect one’s business plan or operational procedures. Whether analyzing click-stream data or machine-generated data in the Internet of Things, the number of questions that one is able to ask directly relates to how fast the data are ingested and how fast one can run analytics on them.

Traditional BI and batch-oriented processes are fairly time-consuming, and they limit the quantity and nature of queries because they are primarily historical. With real-time analytics, especially for event data, users can generate more queries with a greater degree of specificity, which allows them to understand how an occurrence is taking place and how best to capitalize on it.

In this respect, the querying of data in close to real time offered by Interana (and other vendors) proves that the data’s value is revealed in the questions.

“We really expect [our customers] to be seeing very large increases in the number of questions that they can ask,” Interana Chief Executive Officer Ann Johnson revealed. “We find that when people use something that’s quick you can be really sloppy about your questions—you can misprint them or whatever—and you get an answer back so quickly that you don’t need to think about it. However, when people do large batch jobs they end up having to cover a lot of ground, which only adds complexity and room for error.”

Visual Queries

Interana is an integrated solution with a visual interface that utilizes NoSQL. The bottom layer of the stack is a compressed column store that is distributed horizontally across machines, so that the user is able to scale out as much as desired. The product also features an analytics engine and a web app/user interface layer in which users build visual queries and receive visual outputs. This simplifies the query process and capitalizes on the pattern recognition aspects of the underlying engine.
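To make the idea of a compressed, horizontally distributed column store concrete, the following is a minimal sketch and not Interana's actual implementation, which the article does not describe. It assumes events are flat dictionaries sharing the same fields, uses run-length encoding as one simple example of column compression, and assigns events to machines by hashing a user id. The function names (`to_columns`, `rle_encode`, `rle_decode`, `shard_for`) are hypothetical.

```python
from itertools import groupby
from zlib import crc32


def to_columns(events):
    """Pivot row-oriented events into one list per field.

    Assumes every event has the same fields (an assumption of this sketch).
    """
    columns = {}
    for event in events:
        for field, value in event.items():
            columns.setdefault(field, []).append(value)
    return columns


def rle_encode(values):
    """Run-length encode a column: consecutive repeats become (value, count)."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, count) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


def shard_for(user_id, num_shards):
    """Choose a shard by hashing the user id, so one user's events stay together."""
    return crc32(user_id.encode("utf-8")) % num_shards


events = [
    {"user": "u1", "action": "view"},
    {"user": "u1", "action": "view"},
    {"user": "u2", "action": "click"},
]
columns = to_columns(events)
encoded_actions = rle_encode(columns["action"])
```

Storing each field as its own column means a query that touches only `action` never reads `user`, and low-cardinality columns with long runs compress well under a scheme like this.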

As a result, Interana can ingest hundreds of thousands of nodes of data per second in real time and also keep a long history of data, encompassing billions of events, on commodity servers. It is also capable of stratifying users according to their behavior. Additional functionality includes the ability to create funnels at will and to create sessions in real time that are specific to customers and behavior types, with a bevy of options for altering sessions in the future. The pattern recognition capabilities of the analytics engine powering the Interana platform also support a number of different ways to analyze events, which can be characterized by time to get the most insight from the solution. Bobby Johnson noted:

“It actually kind of splits into two camps in these places. One is the very operational stuff about what’s going well today or, for a SaaS company, what are my biggest accounts that are in danger of churn and let me go see what’s really going on there. And then there’s also the things that are sort of more the product end of the world. So for SaaS companies that would just be things like how are people using this feature or, we think this feature is really important, people aren’t using it, let’s figure out what click streams look like around this so we can figure out how to make the product better.”
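The sessions and funnels mentioned above can be illustrated with a short sketch. This is not Interana's engine or query language; it is a plain-Python illustration that assumes events arrive as `(user, timestamp_in_seconds, action)` tuples and that a session ends after 30 minutes of inactivity. Both the tuple layout and the timeout are assumptions, and `sessionize` and `funnel_counts` are hypothetical names.

```python
from collections import defaultdict

SESSION_GAP_SECONDS = 30 * 60  # assumed inactivity timeout, not from the article


def _events_by_user(events):
    """Group (user, timestamp, action) tuples into time-sorted lists per user."""
    by_user = defaultdict(list)
    for user, ts, action in events:
        by_user[user].append((ts, action))
    for rows in by_user.values():
        rows.sort()
    return by_user


def sessionize(events, gap=SESSION_GAP_SECONDS):
    """Split each user's events into sessions wherever the gap exceeds `gap`."""
    sessions = {}
    for user, rows in _events_by_user(events).items():
        user_sessions = [[rows[0]]]
        for prev, row in zip(rows, rows[1:]):
            if row[0] - prev[0] > gap:
                user_sessions.append([])  # inactivity gap: start a new session
            user_sessions[-1].append(row)
        sessions[user] = user_sessions
    return sessions


def funnel_counts(events, steps):
    """Count how many users reached each funnel step, in the given order."""
    counts = [0] * len(steps)
    for rows in _events_by_user(events).values():
        next_step = 0
        for _, action in rows:
            if next_step < len(steps) and action == steps[next_step]:
                counts[next_step] += 1
                next_step += 1
    return counts


events = [
    ("a", 0, "view"), ("a", 10, "cart"), ("a", 5000, "buy"),
    ("b", 0, "view"), ("b", 20, "buy"),
]
sessions = sessionize(events)
funnel = funnel_counts(events, ["view", "cart", "buy"])
```

In this toy data, user `a` completes all three steps across two sessions, while user `b` views and buys without adding to a cart, so `b` only counts toward the first step of the funnel. Changing `gap` or the `steps` list and rerunning is the kind of quick, repeated question the article describes.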

Use Cases

Interana’s focus on time-driven event data for Big Data sets makes its solution particularly viable for those within the retail and communications space, although the platform applies to numerous other vertical industries as well. Although the product was only released to the general public earlier in October, the company has already attracted some noteworthy customers and uses.

  • SaaS: Interana is of particular interest to SaaS companies because they need to monitor the habits and usage tendencies not only of individuals (end users), but also of collectives (the companies that supply the software to the end users). The complexity associated with understanding the behavior of both customer types can be reduced with the aforementioned options for analytics functionality.
  • Sony PlayStation Now Streaming Platform: This customer and others in the gaming space utilize real-time analytics for Big Data to monitor the behavior and tendencies of their players in order to enhance their products and the overall gaming experience. Analytics solutions such as Interana can free administrators to take on more meaningful tasks and enable others to perform analytics. Bobby remarked: “It’s lots of simple data that gets questions [such as] usage is up or down, figure out why. And often those aren’t earth-shattering epiphanies, but if every day you follow up on the things that have changed, then that compounds over time.”
  • Telecommunications: A large telecommunications company with headquarters in France was interested in Interana’s capabilities for monitoring call detail records to ascertain service quality, determine how many calls dropped and why, and reduce the rate of churn.
  • Internet of Things: The IoT will present a significant number of opportunities for performing analytics on event-driven data in real time. The increasing trend towards interconnected devices within the public and private space should only add to the need for real-time analytics platforms.

Exploratory Analytics

The advent of Interana is indicative of the growing tendency for analytics vendors to provide real-time insight into Big Data. As the number of such vendors grows, the applications of these types of analytics are also expanding across industries and enterprises as organizations find data-driven ways to achieve their business and operations objectives. As a result, data are being used more widely and are becoming easier to use, and that use has shifted from predefined, rigid objectives to exploratory analysis aimed at refining enterprise processes.

