The Convergence of Big Data and Small Data


by Jelani Harper

The backlash or disillusionment that accompanies virtually every technological innovation has now reached Big Data, in the form of a growing movement towards Small Data.

An analysis of the spreading Small Data phenomenon in the wake of increasing adoption of Big Data reveals that these concepts are by no means mutually exclusive. In most instances they utilize the same technologies and share the same data sources and apps.

The notion of Small Data is based on a shift in perspective: Small Data is the application of data (Big Data or otherwise) that is most relevant, easily accessible, and readily useful to a particular organization or branch of that organization.

Still, there are several aspects of the idea of Small Data that address deficiencies (or in some cases what are merely perceived deficiencies) regarding Big Data initiatives. These include:

  • Data Science Issues: Small Data alleviates the need to hire and train numerous Data Scientists, who are relatively scarce and therefore in high demand. Small Data is the application of data that laymen can understand and employ.
  • Decentralization: Big Data initiatives are still largely viewed as centralized processes that empower IT departments and alienate end users from their data. Small Data, however, emphasizes a distributed approach that gives the business greater autonomy and capacity for leveraging data.
  • Expensiveness: Amassing the hardware and software required to build the architecture and infrastructure of Big Data initiatives internally (especially when eschewing more cost-effective Cloud options) is perceived as too costly, and the necessary analytics add to the expense. Small Data initiatives, by contrast, enable organizations to use infrastructure, data sources, and data that they already have access to, which reduces resource expenses.
  • Governance Concerns: Several aspects of Data Governance, including data quality, data cleansing, and data integration, can become complicated by Big Data, particularly when the data are unstructured. Small Data offers a narrower scope of focus that more readily fits into established governance regulations.

Moreover, there is a growing perception within the field of Data Management that Big Data is unnecessary, that its initiatives are too sprawling, and that it burdens an organization with superfluous data that does not assist in achieving core business objectives. In this respect, the primary distinction between Big Data (which can encompass machine and sensor data) and Small Data is largely one of scope. Although Small Data can involve Big Data, the former is specifically tailored to deliver information that the end user needs.

According to Small Data advocate Allen Bonde:

“Small Data is about the end-user, what they need, and how they can take action…Focus on the user first, and a lot of our technology decisions become clearer. This has been the case for customer-facing systems and applications.”

More with Less

One of the most intriguing aspects of Small Data is that it doesn’t require additional investment. Small Data simply requires that organizations do more with what they already have or with what’s available to them. One of the richest sources of Small Data that doesn’t require Big Data is transactional data, particularly for marketing departments.

The prudent analysis of transactional data, such as inventory information and customers’ basic buying history, can yield insight into demographic and purchasing patterns for cross-selling and upselling. Other examples of Small Data sans Big Data include relatively simple web data, such as meteorological data, which can impact consumer purchasing behavior and market trends.
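As a rough illustration of how modest such an analysis can be, the sketch below counts which items appear together in the same order and uses those counts to suggest cross-selling candidates. The order data, item names, and both function names are hypothetical and invented for this example; it is a minimal sketch of one possible approach, not a description of any particular vendor's method.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactional data: one set of purchased items per order.
orders = [
    {"umbrella", "raincoat"},
    {"umbrella", "raincoat", "boots"},
    {"boots", "socks"},
    {"umbrella", "boots"},
]

def co_purchase_counts(orders):
    """Count how often each pair of items appears in the same order."""
    counts = Counter()
    for items in orders:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts

def cross_sell_candidates(orders, item, top_n=3):
    """Return up to top_n items most often bought together with `item`."""
    related = Counter()
    for (a, b), n in co_purchase_counts(orders).items():
        if a == item:
            related[b] += n
        elif b == item:
            related[a] += n
    return related.most_common(top_n)

if __name__ == "__main__":
    print(cross_sell_candidates(orders, "umbrella"))
```

Nothing here needs more than a spreadsheet-sized export of order history, which is the point: the data and the tooling are already within reach of a marketing department.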

Making Big Data Small

Small Data that incorporates Big Data is simply a refinement of the latter’s near infinite capabilities into an application that directly informs the end user. Small Data is about effectively bringing the power of Big Data to the individual without complicating the procedures and data sources with complex algorithms, statistics, and Data Scientists. Organizations are able to render Small Data not only by applying the vast resources of Big Data to a narrow focus, but also by building simple apps and tools with which users can access and manipulate data without extensive training in Data Science.

Social media is a particularly appropriate place to begin the conversion process from Big Data to Small Data. Responsive apps created in a consumer-driven style can use text analytics and search technologies to quickly yield results from search engines and prominent sites such as Twitter or Facebook, detecting which products or services are resonating most or least with consumers. Common usability features of such marketing apps include intuitive processes for creating profiles, issuing and reusing campaigns, and sharing reports with others. Such data becomes all the more useful when combined with transactional and additional web data, which is why Small Data is an effective means for structuring Big Data initiatives.
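To make the idea of detecting which products resonate more concrete, here is a deliberately simple sketch that scores products by the positive and negative words in the posts that mention them. The posts, product names, word lists, and the resonance function are all hypothetical and made up for illustration; real text analytics tools use far richer techniques, and nothing below reflects how any specific app works.

```python
import re
from collections import defaultdict

# Hypothetical posts; a real app would pull these from a social media feed.
posts = [
    "Love the new AcmePhone, great battery",
    "AcmePhone screen is terrible",
    "AcmeWatch is great, love it",
]
products = ["AcmePhone", "AcmeWatch"]

# Tiny hand-made word lists, assumed purely for this example.
POSITIVE = {"love", "great"}
NEGATIVE = {"terrible", "hate"}

def resonance(posts, products):
    """Sum a crude polarity score for each product across the posts naming it."""
    scores = defaultdict(int)
    for post in posts:
        # Lowercase the post and keep the distinct alphabetic words.
        words = set(re.findall(r"[a-z]+", post.lower()))
        polarity = len(words & POSITIVE) - len(words & NEGATIVE)
        for product in products:
            if product.lower() in words:
                scores[product] += polarity
    return dict(scores)

if __name__ == "__main__":
    print(resonance(posts, products))
```

A higher score suggests a product is being discussed more favorably. The end user sees a short ranked list rather than the raw stream of posts behind it.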

Empowering Consumers

Another valuable application of Small Data is to create greater connectivity between the consumer and the enterprise. Although there are some notable exceptions and instances in which consumer intelligence is offered to customers, most Big Data applications consist of organizations collecting and analyzing data about current and potential customers.

However, by providing consumers access to their own Small Data (and doing so with the easily manipulated, user-friendly apps that accompany many of today’s mobile devices), organizations allow consumers to derive their own insight and fuel further product consumption based on such data. In certain industries such as healthcare, for instance, the difference such a Small Data application can make could be considerable.

A post on Cornell Tech’s site hints at the possibilities:

“It’s about greatly enhancing with…personalized data-driven insights, insights such as early warning signs of a problem, or indicators of gradual improvement….I like to think of it as a personalized “social pulse”…that can indicate subtle but significant changes in a person’s wellbeing… Once I, as a patient and consumer, can access the data that service providers have about me, I can then use these data to fuel apps that I subscribe to.”

Decentralized Democratization

The Small Data model is largely decentralized and is based on the individual needs of specific business units or operations personnel. As previously noted, these needs can serve as the means of tailoring a Big Data program so that the proper data are collected and routed towards the users who need them the most. The centralization associated with typical Big Data efforts consolidates the data, its authority, and (subsequently) its use in the hands of a few, whereas the decentralized aspect of Small Data effectively represents a democratization of data in which access is granted to all users, each of whom has relatively equal input into which data is selected and how it is employed. The democratized model gives users greater agility and autonomy in a manner that is suggestive of the overall consumerization of IT.

Vendor Focus

Small Data represents a switch in thinking about the Big Data phenomenon and a refocusing of priorities on the individual applications of data by organizations, departments, and individual end users. Such a focus is easily lost in all of the possibilities and potential that Big Data has with its continuous data streams and surplus of sources. By instead concentrating on a decentralized model in which applications of data (instead of the data itself) are prioritized, organizations can ensure that they are deriving the most from their data, whether it is Big or Small.

This repositioning of priorities in which data is designed for simplified use in specific cases has been and will continue to be reflected in apps, products, and services offered by some of the more progressive vendors and service providers, including Cloud provider GoodData and traditional Data Discovery vendors such as Tableau and QlikTech, among others.
