
Demystifying Data Analytics Models

By Prashanth Southekal

Organizations worldwide are increasingly turning to data analytics to improve business performance. Research by McKinsey found that data-driven companies not only achieve above-market growth but also see EBITDA (Earnings Before Interest, Taxes, Depreciation, and Amortization) increases of up to 25% [1]. Forrester likewise found that organizations that use data to derive insights for decision-making are nearly three times more likely to achieve double-digit growth [2].

Deriving actionable insights from data depends on building robust analytics models, which serve as the bridge between raw data and valuable insights. But what exactly constitutes an analytics model? The term “analytics models” is used frequently, yet it is prone to misuse and misinterpretation because its meaning depends on the intended purpose, available resources, and other constraints. As a result, the term has become something of a cliché, overused and rarely examined.

Technically, an analytics model is a mathematical representation of a real-world system that provides insight into the system’s behavior. It represents business categories, entities, and events as data variables and relates them through equations or algorithms. The primary objective of such a model is to take data as input and derive insights that aid decision-making and the implementation of appropriate business strategies. A high-quality analytics model exhibits three crucial characteristics:

  1. Generic: A generic analytics model performs well not only on the data it was initially trained on but also produces reliable insights when applied to new data within the existing business model. It should capture the essential system characteristics while abstracting away unnecessary details.
  2. Robust: The analytics model should possess adaptability, enabling it to reflect changes in the data, particularly in relation to evolving business models.
  3. Scalable: Effective analytics models should demonstrate the capability to swiftly process and analyze large volumes of diverse data, catering to both existing and new business models.
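
The “Generic” characteristic above can be checked by holding back some data when the model is built and comparing errors on seen and unseen data. The following is a minimal sketch, assuming a simple least-squares trend line as the model; the observations and the helper names (`fit_line`, `mean_abs_error`) are hypothetical and chosen only for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def mean_abs_error(points, slope, intercept):
    """Average absolute gap between actual and predicted values."""
    return mean([abs(y - (slope * x + intercept)) for x, y in points])

# Hypothetical (ad spend, revenue) observations, for illustration only.
observations = [(1, 12.1), (2, 13.9), (3, 16.2), (4, 17.8),
                (5, 20.1), (6, 22.0), (7, 23.8), (8, 26.1)]

# Fit on the first six points; keep the last two as unseen data.
train, holdout = observations[:6], observations[6:]
slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])

train_mae = mean_abs_error(train, slope, intercept)
holdout_mae = mean_abs_error(holdout, slope, intercept)
print(f"train MAE={train_mae:.3f}, holdout MAE={holdout_mae:.3f}")
```

If the holdout error is far larger than the training error, the model has likely memorized details of the training data rather than capturing the essential characteristics of the system.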

There are many ways to formulate analytics models. Broadly, they can be viewed from four perspectives (the question being asked, the level of uncertainty, the number of variables, and risk), as illustrated in Figure 1 below.

Figure 1: Analytics Models Taxonomy

Analytics, fundamentally, involves questioning to extract insights from data for the purpose of measuring and enhancing business outcomes [3]. Depending on the nature of these inquiries, three types of analytics models emerge:

  1. Descriptive analytics models: These models address the question, “What happened?” Descriptive analytics delves into historical data to discern past patterns, trends, and relationships. It employs exploratory, associative, and inferential data analysis techniques. Exploratory data analytics models scrutinize and summarize datasets, while associative descriptive analytics models elucidate the relationships between variables. Inferential descriptive data analysis is utilized to infer or extrapolate trends about a larger population based on a sample dataset.
  2. Predictive analytics models: These models focus on answering the question, “What will happen?” Predictive analytics involves the process of utilizing data to forecast future trends and events. Predictive analysis can either be carried out manually, commonly referred to as analyst-driven predictive analytics, or through the utilization of machine learning algorithms, also known as data-driven predictive analytics. In either case, historical data serves as the basis for making future predictions.
  3. Prescriptive analytics models: These models assist in answering the question, “How can we make it happen?” Prescriptive analytics recommends the optimal course of action, using optimization and simulation techniques. Typically, predictive and prescriptive analytics are intertwined: predictive analytics identifies potential outcomes, while prescriptive analytics explores these outcomes and identifies further options.
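
The three question types above can be sketched on one small dataset. This is an illustrative example only: the sales figures, the margin and holding-cost parameters, and the function names are assumptions, and a least-squares trend line stands in for whatever predictive technique an organization would actually use.

```python
from statistics import mean

# Hypothetical monthly unit sales for months 1..6 (illustrative data).
sales = [100, 110, 120, 130, 140, 150]
months = list(range(1, len(sales) + 1))

# Descriptive: "What happened?" Summarize the history.
average_sales = mean(sales)

# Predictive: "What will happen?" Fit a trend line and extrapolate.
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

slope, intercept = fit_line(months, sales)
forecast = slope * 7 + intercept  # predicted demand for month 7

# Prescriptive: "How can we make it happen?" Choose the stock level
# that maximizes profit under the forecast (assumed cost parameters).
def profit(stock, demand, unit_margin=5.0, unit_holding_cost=2.0):
    sold = min(stock, demand)
    unsold = stock - sold
    return sold * unit_margin - unsold * unit_holding_cost

candidate_stock_levels = [140, 150, 160, 170, 180]
best_stock = max(candidate_stock_levels, key=lambda s: profit(s, forecast))

print(average_sales, forecast, best_stock)
```

The prescriptive step consumes the predictive output directly, which mirrors how the two are intertwined: the forecast defines the likely outcome, and the optimization searches the available options against it.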

Another perspective on analytics models is based on the level of uncertainty or probability associated with deriving insights. This classification yields three types of analytics models:

  1. Deterministic models: These models involve no randomness, so the same inputs always yield the same insight. Typically, insights derived from historical data fall into this category.
  2. Stochastic models: These models incorporate randomness or uncertainty. Insights about future states derived from regression and gradient boosting machine (GBM) models carry a level of uncertainty and are considered stochastic.
  3. Numerical models: These models are utilized to approximate the behavior of complex systems by dividing them into smaller, more manageable components, which are then solved iteratively using computational methods. Examples of numerical models include the Finite Element Method (FEM) used in structural engineering, the Weather Research and Forecasting (WRF) model employed in weather prediction, and epidemiological models used to simulate the spread of diseases.
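
The contrast between the three types can be sketched in a few lines. This is a minimal illustration under stated assumptions: the compound-growth example, the normally distributed rate, and the parameter values are hypothetical, and the epidemic example uses a basic SIR model stepped with Euler's method rather than any specific production model.

```python
import random

# Deterministic: the same inputs always give the same output.
def compound_balance(principal, rate, years):
    return principal * (1 + rate) ** years

# Stochastic: the yearly rate is uncertain, so repeated simulation
# (Monte Carlo) yields a distribution of outcomes, not one value.
def simulate_balances(principal, mean_rate, rate_sd, years, runs, seed=0):
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    results = []
    for _run in range(runs):
        balance = principal
        for _year in range(years):
            balance *= 1 + rng.gauss(mean_rate, rate_sd)
        results.append(balance)
    return results

# Numerical: an SIR epidemic model advanced in small time steps.
# s, i, r are the susceptible, infected, and recovered fractions.
def simulate_sir(s, i, r, beta, gamma, dt, steps):
    for _step in range(steps):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
    return s, i, r

fixed = compound_balance(1000, 0.05, 2)
outcomes = simulate_balances(1000, 0.05, 0.02, years=2, runs=1000)
s, i, r = simulate_sir(0.99, 0.01, 0.0, beta=0.3, gamma=0.1, dt=0.1, steps=500)
print(fixed, min(outcomes), max(outcomes), (s, i, r))
```

The deterministic function returns a single number, the stochastic simulation returns a spread that can be summarized with percentiles, and the numerical model approximates a continuous system by iterating many small steps.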

Analytics models can also be classified by the number of variables they involve. In this regard, analytics models can be univariate, bivariate, or multivariate [4].

  • Univariate analytics models: Univariate analysis examines the pattern in a single variable using measures of central tendency (mean, median, mode, and so on) and variation (standard deviation, standard error, range, variance, and so on).
  • Bivariate analytics models: Bivariate analysis examines the relationship, including possible cause and effect, between two variables. These two variables could be dependent or independent of each other. Correlation is the most commonly used bivariate analysis technique.
  • Multivariate analytics models: These models are used for analyzing more than two variables. Commonly used multivariate analytics models are Linear Regression, Logistic Regression, and Support Vector Machines.
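
As a brief illustration of the first two categories, the sketch below computes univariate summary measures for one variable and a Pearson correlation between two. The data and the `pearson` helper are hypothetical; multivariate models are omitted to keep the example short.

```python
from math import sqrt
from statistics import mean, median, stdev

# Hypothetical paired observations, for illustration only.
ad_spend = [10, 20, 30, 40, 50]
revenue = [25, 45, 62, 85, 105]

# Univariate: central tendency and variation of a single variable.
summary = {
    "mean": mean(revenue),
    "median": median(revenue),
    "stdev": stdev(revenue),  # sample standard deviation
    "range": max(revenue) - min(revenue),
}

# Bivariate: Pearson correlation between two variables.
def pearson(xs, ys):
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r = pearson(ad_spend, revenue)
print(summary, round(r, 4))
```

A correlation close to 1 indicates a strong linear relationship between the two variables, but on its own it does not establish which variable, if either, drives the other.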

Finally, regardless of the type of model selected, managing associated risks is crucial in analytics. From a risk perspective, analytics models can be classified into three main types:

  1. Single models: These models utilize a single algorithm to derive insights. Examples include linear regression, decision trees, logistic regression, and support vector machines, among others. Single models are straightforward to implement and typically perform adequately for the given dataset and problem.
  2. Hybrid models: Hybrid models combine two different analytics models. For instance, a hybrid classification model might consist of one unsupervised learner (a clusterer) that preprocesses the training data and one supervised learner (a classifier) that is trained on the clustering result, or vice versa.
  3. Ensemble models: Ensemble models employ a variety of different algorithms on the same or different datasets to produce outputs. The amalgamation of models often yields superior performance compared to using a single model or a hybrid model. Essentially, ensemble models involve multiple models operating independently of each other to generate insights using a voting system.
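
The voting idea behind ensemble models can be sketched as follows. This is a simplified illustration: the three rule-based functions are hypothetical stand-ins for independently trained models, and the transaction fields are invented for the example.

```python
from collections import Counter

# Three hypothetical stand-in "models"; each judges a transaction
# independently of the others and returns a label.
def rule_amount(txn):
    return "fraud" if txn["amount"] > 1000 else "ok"

def rule_country(txn):
    return "fraud" if txn["country"] != txn["home_country"] else "ok"

def rule_hour(txn):
    return "fraud" if txn["hour"] < 5 else "ok"

def ensemble_predict(models, txn):
    """Return the label that receives the most votes."""
    votes = Counter(model(txn) for model in models)
    return votes.most_common(1)[0][0]

models = [rule_amount, rule_country, rule_hour]

large_foreign = {"amount": 1500, "country": "FR", "home_country": "CA", "hour": 14}
small_night = {"amount": 50, "country": "CA", "home_country": "CA", "hour": 3}

print(ensemble_predict(models, large_foreign))
print(ensemble_predict(models, small_night))
```

Using an odd number of models with two possible labels avoids tied votes. A single dissenting model is outvoted, which is one reason an ensemble can be less sensitive to the weaknesses of any one algorithm.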

Data analytics is revolutionizing business models by fostering the creation of new revenue streams, driving expense reduction, and mitigating risks for enterprises across diverse industry sectors. The significance of selecting the appropriate analytics model cannot be overstated, as it plays a pivotal role in enabling companies to realize the desired business impact. The four manifestations of data analytics models outlined here offer a framework for effectively deploying analytics tailored to specific business requirements, existing capabilities, and resource availability. Consequently, leveraging these models enhances the likelihood of delivering successful analytics initiatives, thereby fostering improved business outcomes.

References

  1. McKinsey, “Insights to impact: Creating and sustaining data-driven commercial growth,” January 2022
  2. Evelson, Boris, “Insights Investments Produce Tangible Benefits — Yes, They Do,” May 2020
  3. Southekal, Prashanth, “Analytics Best Practices,” Technics, 2020