Business Intelligence (BI) has emerged from the backrooms of IT departments to take up residence in the front offices of personnel in business and operations, and even of some C-level executives.
Despite advancements in Data Discovery tools that make analysis considerably faster and more insightful, in Big Data applications even the best of these platforms analyze historic data in a way that may come close to, yet frequently falls short of, real-time analysis.
How much more valued, and used, would this technology be if it could augment historical analysis with analysis of live data, showing real-time updates of trends and developments as they occur?
According to Time Warner, Thomson Reuters, Honda, and other clients of ScaleOut Software (which has 400 unique customers in 32 countries, 35 of them in the Fortune 500), the answer is: considerably more.
ScaleOut Software’s two principal products, ScaleOut hServer and ScaleOut Analytics Server, provide the means of synthesizing conventional BI with analytics for live data as it becomes available. The company’s technologies also utilize one of the most ubiquitous Big Data platforms, Hadoop, so organizations can leverage a framework with which they are already familiar to access Operational Intelligence: real-time analytics of Big Data.
“We are analyzing data while it’s changing,” said ScaleOut Software COO David Brinker. “It’s the data your operational applications are continuously updating. At the same time we’re able to run MapReduce analysis over that data so that you’re able to identify business opportunities or business issues in sub seconds or a few seconds.”
With ScaleOut Software’s technology, enterprises are able not only to perform real-time analytics on continuously arriving data (such as streaming or sensor data), but also to analyze the current state of the data as it changes, as well as the changes themselves.
Expedient Analytics
ScaleOut Software’s technologies incorporate a number of innovations that enable them to take advantage of MapReduce’s scalability and potential for analytics. On its own, MapReduce is a batch-oriented parallel computing engine that requires substantial time to perform analytics, particularly on the copious quantities of data found in Big Data applications. ScaleOut’s advancements, which significantly expedite the analytics process, include the use of:
- In-Memory Data Grids: Data in ScaleOut Software’s products is stored in memory, which grants faster response times than systems that utilize disk storage or access data from file systems such as Hadoop’s HDFS.
- Hadoop: Last fall the company integrated Hadoop into its grid, so that Hadoop’s MapReduce now uses the company’s proprietary engine, including its parallel method and data location, for computations. Using ScaleOut Software’s engine eliminates the batch processing and data motion overheads that contribute to MapReduce’s latency. After re-implementing MapReduce on its engine, the company measured performance up to 20 times faster.
- Data Shuffling: There are two principal phases of MapReduce: a map phase, which runs user programs over the input, and a reduce phase, which combines the results of the map phase. Typically, the intermediate results must be shuffled (regrouped by key) before the reduce phase can begin. ScaleOut Software expedites this process by minimizing the interaction between the map and reduce phases, as well as by having its data already in memory.
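To make the map, shuffle, and reduce steps described above concrete, here is a minimal single-process sketch in Python. It is not ScaleOut's engine or Hadoop's API; the function names and the sample sales records are hypothetical, and everything runs over an in-memory list purely for illustration.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Run the user-supplied mapper on every record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Regroup intermediate values by key so each reduce call sees one key's values."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Combine each key's grouped values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical sample data: (region, sale amount) records already held in memory.
sales = [("north", 12.5), ("south", 7.0), ("north", 3.5), ("south", 1.0)]


def sale_mapper(record):
    region, amount = record
    yield region, amount


def total_reducer(key, values):
    return sum(values)


totals = reduce_phase(shuffle(map_phase(sales, sale_mapper)), total_reducer)
print(totals)
```

In a distributed setting the shuffle step moves data between machines, which is why reducing that movement and keeping data in memory matters for latency.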
“The key aspect is not only that it goes faster, but it works on live data,” ScaleOut Software founder and CEO Dr. William Bain said. “That data is hosted in-memory and is changing continuously. For example, portfolio positions are changing in a financial services application. You’re able to analyze them even as they’re changing. That’s something that’s very different from the concept people have of MapReduce in the data warehouse.”
BI and Operational Intelligence
Perhaps the full potential of Operational Intelligence is best realized by combining it with traditional BI. Conventional BI is typically applied to static data sets, which frequently reside on disk storage and are useful for identifying long-term trends based on historic data. Recent advancements in Data Discovery tools, Cloud Computing, and in-memory technologies have enabled enterprises to accelerate BI in the warehouse from a matter of days or hours to several minutes.
However, as Hadoop increases in popularity and Big Data becomes more ubiquitous, there is a growing trend to utilize Hadoop as an integration hub to run analytics on all data, be they from legacy systems (such as CRM or MDM solutions) or from conventional Big Data sources. Doing so requires leveraging Hadoop’s scalability to move the warehouse within it. ScaleOut Software’s in-memory environment, in tandem with MapReduce, is well suited to performing the Extract, Transform, and Load (ETL) processes that are essential to getting the data into the warehouse for BI.
As a result, organizations can utilize a single skill set to perform both real-time analytics on live data and traditional BI, and to evaluate how current changes measure against historic trends. This synthesized approach provides tactical feedback from live data, strategic insight from the warehouse, and a more comprehensive overview of the factors affecting business and operational processes.
“Where we have unique value is the ability to run continuous analytics on data while it’s changing,” Brinker said. “We can analyze streaming data but we can also analyze stateful data—so fast changing states as opposed to just the changes to that state we can look at. We can look at a portfolio of stock as it changes rather than just looking at the stream of changes to that portfolio. That allows you to do much richer analysis. We can combine what you learn from your Business Intelligence activities into operational intelligence activities.”
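The distinction between analyzing a stream of changes and analyzing the state those changes produce can be sketched in a few lines of Python. This is an illustrative assumption, not ScaleOut's API: the portfolio, the symbols, and the function names are all hypothetical.

```python
def apply_change(portfolio, change):
    """Apply one (symbol, share delta) change to the portfolio state in place."""
    symbol, delta = change
    portfolio[symbol] = portfolio.get(symbol, 0) + delta
    if portfolio[symbol] == 0:
        del portfolio[symbol]  # position fully closed


def largest_holding(portfolio):
    """A state-level question: which symbol holds the most shares right now?"""
    return max(portfolio, key=portfolio.get)


# Hypothetical starting state and stream of changes.
portfolio = {"AAA": 100, "BBB": 300}
changes = [("AAA", 250), ("AAA", -200), ("CCC", 50)]

history = []
for change in changes:
    apply_change(portfolio, change)
    # Re-analyze the full state after every change, not just the change itself.
    history.append(largest_holding(portfolio))

print(history)
```

The second change only mentions AAA, yet its effect is that BBB becomes the largest holding. That conclusion is visible only when the analysis has access to the whole state rather than the individual change.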
Use Cases
The applications for Operational Intelligence, particularly in combination with conventional BI, are nearly as limitless as the applications for analytics in general. With customers in a wide variety of vertical industries, use cases range from analyzing transactional data for fraud detection to enhancing recommender engines. Some representative use cases from current customers are described below.
A fast food company in the Midwest has a number of physical point-of-sale locations as well as an Internet presence through which customers can order food online. Its use of ScaleOut Software technology allows it to ingest data from its remote locations into the data warehouse so management can evaluate product performance and regional differences. Simultaneously, the in-memory computing grid analyzes the live data at regular intervals to provide near-instant feedback on the performance of particular stores as well as of individual personnel, which adds to the data available on both customer and employee behavior. “So you see, over time it becomes a very powerful tool for tactical feedback that is an adjunct to the strategic feedback that they’re getting from their data warehouse,” Bain explained.
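One way to picture tactical feedback working alongside strategic feedback is to compare live figures against a baseline derived from the warehouse. The sketch below is an assumption made for illustration only: the store names, the sales figures, the 25 percent tolerance, and the `flag_stores` function are all hypothetical and do not describe the customer's actual system.

```python
def flag_stores(live_sales, baseline, tolerance=0.25):
    """Return stores whose live sales deviate from the historic baseline
    by more than `tolerance`, mapped to their relative deviation."""
    flagged = {}
    for store, live in live_sales.items():
        expected = baseline.get(store)
        if not expected:
            continue  # no usable history for this store, so nothing to compare
        deviation = (live - expected) / expected
        if abs(deviation) > tolerance:
            flagged[store] = round(deviation, 2)
    return flagged


# Hypothetical hourly averages from the warehouse (strategic view).
baseline = {"store_a": 400.0, "store_b": 250.0, "store_c": 300.0}
# Hypothetical sales for the current hour from the in-memory grid (tactical view).
live_sales = {"store_a": 410.0, "store_b": 150.0, "store_c": 420.0}

flagged = flag_stores(live_sales, baseline)
print(flagged)
```

Here store_b is running well below its historic average and store_c well above it, while store_a is within tolerance and is not reported.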
Another client, a hedge fund company in New York, utilizes ScaleOut hServer to manage the long and short positions in the various sectors of its hedge funds, which change numerous times throughout the course of the day as the market feed fluctuates. Prior to the implementation of ScaleOut Software’s technology, the company went through the protracted process of storing all of this data in a server, issuing queries on certain hedges, extracting the relevant data and matching it to the market feed, updating the database, and looking for alerts for a particular trade. By leveraging the power of parallel computing and MapReduce with ScaleOut Software’s products, however, the organization has reduced this 15-minute process to a matter of milliseconds.
“This is an example of how you can dramatically reduce the time to generate results by using pure MapReduce in a real time operational environment,” Bain remarked. “And what’s characteristic of this type of application is the data is a strategy that’s changing. Every few milliseconds as the market feed comes in we’re updating the positions and the prices.”
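The workflow described for this use case (update prices from the market feed, recompute positions, look for alerts) can be sketched as follows. This is a single-process illustration under stated assumptions, not the client's implementation: the positions, prices, exposure limit, and function names are hypothetical, and a real deployment would run the per-position work in parallel across the grid.

```python
from collections import defaultdict

# Hypothetical positions: positive shares are long, negative shares are short.
positions = [
    {"sector": "tech", "symbol": "AAA", "shares": 100},
    {"sector": "tech", "symbol": "BBB", "shares": -40},
    {"sector": "energy", "symbol": "CCC", "shares": -200},
]
prices = {"AAA": 10.0, "BBB": 25.0, "CCC": 5.0}


def sector_exposure(positions, prices):
    """Net market value per sector (the map step values each position,
    the reduce step sums the values by sector)."""
    exposure = defaultdict(float)
    for position in positions:
        exposure[position["sector"]] += position["shares"] * prices[position["symbol"]]
    return dict(exposure)


def on_tick(tick, positions, prices, limit):
    """Apply a market-feed tick, then return sectors whose net exposure exceeds `limit`."""
    prices.update(tick)
    exposure = sector_exposure(positions, prices)
    return {sector: value for sector, value in exposure.items() if abs(value) > limit}


alerts = on_tick({"CCC": 6.0}, positions, prices, limit=1100.0)
print(alerts)
```

With these numbers, the tech sector nets to zero, while the price move in CCC pushes the short energy exposure past the limit and triggers an alert.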
Best of Both Worlds
Operational Intelligence substantially accelerates the analytics processes of BI and is applicable to live data, which offers a far greater degree of functionality for Big Data and its applications. With Operational Intelligence, users can actually see how continuously changing data affects their business and operations processes, which enables them to realize much of the potential of Big Data as the game-changing disruptor of the data landscape that it was hailed to be.
The true genius of ScaleOut Software and its facilitation of Operational Intelligence is that it implements it with one of the most widely used and readily accessible platforms for Big Data: Hadoop and MapReduce. The combination allows the enterprise to make the most of analytics and Big Data while using a framework with which it is more than likely already comfortable.
“This ability to analyze and respond within milliseconds to seconds has many applications that are useful in live environments,” Bain said. “In a live reservation would be another environment: trying to reorganize reservations when a weather event occurs…the list just keeps growing. It’s a very useful thing to do and it’s an area in which people who have the MapReduce skill set from the data warehouse can apply it to a new environment.”