Trying to make sense of data is nothing new. Companies have been applying analytical methods since before the computer was invented. Once computers and storage became reasonably cheap and powerful, in the late 1980s, companies started using Business Intelligence (BI) software to try to find meaning in their data. Today’s Advanced Analytics go beyond the capabilities of Business Intelligence.
Business Intelligence is driven by an end user who knows the questions they want to ask and has deliberately collected data to support their inquiry. Advanced Analytics let users find answers when they don’t even know what they should be asking by identifying meaningful patterns. Rather than just looking backwards at the data to understand what has happened, Advanced Analytics use the data to look to the future and understand what is likely to happen. While both Business Intelligence and Advanced Analytics are used to improve decision making, the two techniques have different goals and use different methods.
Three Kinds of Analytics
There are three basic kinds of Analytics, increasing in both complexity and power.
- Descriptive Analytics: Like Business Intelligence, it looks backward to understand what happened. The techniques used in these Analytics are the least complicated, mostly summarizing data through counts, aggregations of metrics, and simple calculations such as averages. These Analytics may also use data mining to find correlations between variables and help identify reasons for previous success or failure.
- Predictive Analytics: Uses statistical models and other methods to predict what might happen based on previous events. These Analytics use statistics, data mining, Machine Learning methods, and business rules to make probabilistic predictions of the results of certain actions.
- Prescriptive Analytics: Gives recommendations on what should be done, including for questions that weren’t specifically asked. These Analytics build on Predictive Analytics by using optimizations and simulations to evaluate possible actions and the potential impact of each option.
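The difference between looking backward and looking forward can be sketched in a few lines of Python. This is a minimal illustration, not a production method: the sales figures are invented, the names describe, fit_line, and forecast are hypothetical, and a straight-line trend stands in for the far richer statistical models that real Predictive Analytics would use.

```python
from statistics import mean

# Hypothetical monthly unit sales, invented purely for illustration.
monthly_sales = [100, 110, 120, 130, 140, 150]


def describe(values):
    """Descriptive step: summarize what has already happened."""
    return {
        "count": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


def fit_line(values):
    """Least-squares fit of each value against its position (0, 1, 2, ...)."""
    xs = range(len(values))
    x_bar = mean(xs)
    y_bar = mean(values)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return slope, intercept


def forecast(values, steps_ahead=1):
    """Predictive step: extend the fitted trend beyond the last observation."""
    slope, intercept = fit_line(values)
    return intercept + slope * (len(values) - 1 + steps_ahead)


summary = describe(monthly_sales)    # what happened
next_month = forecast(monthly_sales)  # what is likely to happen next
```

A Prescriptive step would go further and compare candidate actions, such as different stocking levels, against forecasts like this one.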
The Power of Analytics
Each of the three kinds of Analytics serves a purpose, providing businesses with insight that can support decision making. Some use cases are general, cutting across all industries, while every industry also has its own specific applications. The sources of the data to be analyzed vary with the applications, pulling together structured transaction data, machine-generated data, and clickstream data with unstructured data from forums and social media posts.
Descriptive Analytics helps companies understand customer behavior, for example through Clickstream Analytics that quantify what customers do on a website. Telemetry data can help insurance companies understand customers’ actual driving behavior.
Predictive Analytics help companies make decisions. Common uses include generating credit scores and identifying potentially fraudulent transactions. Other uses include forecasting customer demand, managing inventory and the supply chain, and scheduling preventive maintenance to replace parts that appear ready to fail before the failure actually occurs and affects the user. By combining transaction history with clickstream and social media data, companies can deliver targeted marketing and personalized recommendations.
Prescriptive Analytics is used to help companies optimize decisions made for scheduling, planning, and operations. Package delivery services such as UPS can use Prescriptive Analytics to plan delivery routes, while hospitals can optimize the scheduling of transport services to bring patients to labs for tests. Financial firms can use Prescriptive Analytics to make trading decisions.
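As a rough illustration of the delivery-route case, the sketch below evaluates every possible visiting order for a handful of stops and recommends the shortest round trip. The depot, the stop coordinates, and the function names are all hypothetical, and this is not a description of how any delivery company actually plans routes. Exhaustive search only works for a few stops; real route planning relies on optimization solvers and heuristics.

```python
from itertools import permutations
from math import dist

# Hypothetical depot and delivery stops on a flat grid (units are arbitrary).
DEPOT = (0, 0)
STOPS = {"A": (0, 3), "B": (4, 3), "C": (4, 0)}


def route_length(order):
    """Total distance of depot -> stops in the given order -> depot."""
    points = [DEPOT] + [STOPS[name] for name in order] + [DEPOT]
    return sum(dist(p, q) for p, q in zip(points, points[1:]))


def best_route():
    """Evaluate every possible order and recommend the shortest one."""
    return min(permutations(STOPS), key=route_length)


recommended = best_route()
print(recommended, route_length(recommended))
```

The pattern is the same at any scale: enumerate or search the possible actions, score each one, and recommend the best.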
Analytics projects use a wide variety of techniques drawn from statistical methods and computer science to identify relationships in the data being analyzed. Specific methods used include:
- Arithmetic Counters: Descriptive Analytics rely largely on basic aggregation functions that count the number of times an event or value occurs, or calculate simple metrics such as average, maximum, and minimum values.
- Machine Learning and Data Mining: Teach computers to identify patterns and relationships in data. Unlike Business Intelligence, which relies on the user asking a question, these techniques identify relationships that a user doesn’t know to ask about.
- Regression Analysis: Identifies how changing an independent variable influences another, dependent, variable.
- Text Analytics: Generates insights from unstructured text including social media and online forums. These Analytics combine computational linguistics, statistics, and Machine Learning.
- Social Network Analytics: Analyzes the relationships (rather than the content shared) within a social network.
- Multimedia Analytics: Generates insights from video, still image, or audio data.
- Sentiment Analysis: Scores opinions expressed in text to evaluate them as positive or negative.
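- Monte-Carlo Simulation: Helps predict what can happen by running a model many times with randomly sampled inputs and examining the range and likelihood of the outcomes.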
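Of the methods listed above, Monte-Carlo Simulation is perhaps the easiest to show in code. The sketch below assumes a hypothetical order that passes through three stages, each taking between one and three days with every duration equally likely, and estimates the chance that the order takes more than seven days in total. The stage durations, the deadline, and the function names are invented for illustration.

```python
import random

# Hypothetical process: three stages, each taking between 1 and 3 days.
STAGES = 3
MIN_DAYS, MAX_DAYS = 1.0, 3.0
DEADLINE_DAYS = 7.0


def simulate_once(rng):
    """Draw one random duration per stage and return the total time."""
    return sum(rng.uniform(MIN_DAYS, MAX_DAYS) for _ in range(STAGES))


def probability_late(trials=20_000, seed=42):
    """Estimate the chance of missing the deadline by repeated random trials."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    late = sum(simulate_once(rng) > DEADLINE_DAYS for _ in range(trials))
    return late / trials


estimate = probability_late()
print(f"Estimated chance of exceeding {DEADLINE_DAYS} days: {estimate:.3f}")
```

For this simple setup the exact answer can be worked out by hand as 1/6, so the estimate should land close to 0.167. The value of simulation is that the same approach still works when the model is too complicated to solve exactly.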
Analytics Tools and Vendors
There are several open source projects commonly used to support Big Data projects. Hadoop simplifies the development of distributed programs that speed Big Data projects by allowing computations on subsets of the data to run in parallel through its MapReduce component. Hadoop also provides HDFS, the Hadoop Distributed File System, which supports storage of Big Data files. Another open source project commonly used for Big Data is Spark, which relies on the resilient distributed dataset model and often offers better performance than Hadoop’s MapReduce, particularly when the data being processed can be kept in memory.
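The MapReduce idea can be illustrated without a cluster. The sketch below is plain Python running on one machine and does not use the Hadoop or Spark APIs; the function names and sample text are hypothetical. It counts words by mapping each chunk of text to (word, 1) pairs, grouping the pairs by word, and reducing each group to a total. In a real cluster the map and reduce calls would run in parallel on different machines.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    # Each chunk could be mapped on a different machine; here they run in turn.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Hypothetical input, already split into two chunks.
chunks = ["big data big insight", "data beats opinion"]
counts = word_count(chunks)
print(counts)
```

Because each chunk is mapped independently and each word is reduced independently, the work divides naturally across many machines.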
Because Analytics often works with unstructured data, traditional SQL databases are not always appropriate for data storage. So-called NoSQL databases provide an alternative. Common NoSQL databases used for projects include MongoDB and Cassandra. NoSQL databases don’t use the traditional row-and-column model for storage but rely on alternative models such as graph-based, column-based, document-based, or key-value store.
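To make the contrast between storage models concrete, the sketch below represents the same hypothetical customer three ways using plain Python data structures: as related rows, as a single nested document, and as flat key-value pairs. It does not use the MongoDB or Cassandra APIs, and the record, key format, and function names are assumptions made for illustration.

```python
# Row-and-column style: data split across tables and linked by customer id.
relational_rows = {
    "customers": [(1, "Ada")],                           # (customer_id, name)
    "orders": [(101, 1, "laptop"), (102, 1, "mouse")],   # (order_id, customer_id, item)
}

# Document style: everything about the customer nested in one record.
document = {
    "_id": 1,
    "name": "Ada",
    "orders": [{"id": 101, "item": "laptop"}, {"id": 102, "item": "mouse"}],
}

# Key-value style: opaque values looked up by a single key.
key_value = {
    "customer:1:name": "Ada",
    "customer:1:orders": "laptop,mouse",
}


def items_from_rows(rows, customer_id):
    """Needs a join-like scan across the orders table."""
    return [item for _, cid, item in rows["orders"] if cid == customer_id]


def items_from_document(doc):
    """Reads the nested list directly from the single record."""
    return [order["item"] for order in doc["orders"]]


def items_from_key_value(store, customer_id):
    """Fetches one value by key and parses it in the application."""
    return store[f"customer:{customer_id}:orders"].split(",")


print(items_from_rows(relational_rows, 1))
print(items_from_document(document))
print(items_from_key_value(key_value, 1))
```

All three lookups return the same items. What differs is where the structure lives: in the schema, in the record itself, or in the application code.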
Despite these products being open source and available freely, working with them is complex and there is a vibrant market for Analytics companies. Products and services support the range of Big Data project work, from data acquisition and cleaning through developing models and analytic applications and interpreting the programs’ output.
Some companies provide supported versions of products, such as Hortonworks and Cloudera for Hadoop. Because of the storage demands of Big Data, many companies require Cloud storage, and Cloud providers such as Amazon Web Services and Google Cloud offer environments tailored for Big Data and Analytics development projects.
The shortage of Analytics professionals means many companies rely on vendors and consulting firms such as PwC and Accenture to develop their Analytics applications rather than hiring their own staff. Analytics-as-a-Service and desktop Analytics tools attempt to reduce the need for IT staff to create Analytics environments and applications and give business users more direct access to their data and results. Oracle offers pre-built Analytics for areas such as ERP and CRM. Tableau offers desktop-based Analytics programs that connect to common business data sources such as Excel.
The Future of Advanced Analytics
Technology trends, such as the development of Spark and end-user Analytics tools, are making Analytics more efficient and more widely available than ever. Automation is streamlining the mechanical aspects of managing large datasets. Many companies have adopted Descriptive and Predictive Analytics, and Prescriptive Analytics are becoming more widespread.
While all Analytics look back at historic data, new trends are developing in real-time Analytics, which are enabled by in-memory processing and in-database Analytics. Recommendations generated in real time will enhance customer support interactions and also support business decision-making through dashboards that update in real time.
Analytics are also being embedded into Internet of Things devices, enabling them to make decisions autonomously, as in self-driving cars. These embedded Analytics will add intelligent decision-making capability to devices that are remote from the data center, such as oil rigs and satellites.
With the capability to use data to make immediate decisions and to allow machine decisions, Big Data and Analytics will enable the world of business to become truly data-driven.