What Is Big Data?

Big Data refers to extremely large data sets of varying types of data – structured, unstructured, and semi-structured – that can be collected, stored, and later analyzed to provide insights for organizations.

Big Data’s promise depends on how the data is managed. In the past, data was organized in relational models, sometimes within data warehouses, and controlled through various ETL (Extract, Transform, and Load) processes. This strategy does not work well with Big Data: the size and complexity of the data sets have led enterprises to adopt new processes and different approaches (such as NoSQL or non-relational databases) that have drastically changed many time-honored Data Management practices.


In the Data Management Body of Knowledge (DMBOK), Big Data is described by:

Volume: The amount of data. Often this consists of thousands of instances or billions of records.
Velocity: The speed at which data is captured, generated, or shared. Big Data is often generated, distributed, and analyzed in real time.
Variety/Variability: The forms in which data is captured or delivered. Data structures are often inconsistent within or across data sets.
Viscosity: How difficult the data is to use or integrate.
Volatility: How often the data changes, and therefore how long it remains useful.
Veracity: The credibility of the data.

Other Definitions of Big Data Include:

“High-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation.” (Gartner)
“Vitality, in addition to Volume, Velocity, Variety, and Variability. Vitality describes a dynamically changing Big Data environment in which analysis and predictive models must continually be updated as changes occur to seize opportunities as they arrive.” (Dr. Peter Aiken)
“A major disruption in the business intelligence and data management landscape, upending fundamental notions about governance and IT delivery.” (Forrester)
“A driving force behind many ongoing waves of digital transformation, including artificial intelligence, data science and the Internet of Things (IoT).” (Forbes)
“A holistic information management strategy that includes and integrates many new types of data and data management alongside traditional data.” (Oracle)
“The way organizations create jobs by increasing the speed and transparency, creating a lot of data.” (Daisy Ridley)
“Data that exceeds the processing capacity of conventional database systems.” (O’Reilly)

A Few Uses of Big Data Are:

Decrease expenses.
Stimulate innovation.
Find and act on business opportunities.
Define predictive and prescriptive models that anticipate customer needs, improve business interaction, and ultimately affect the ROI of a business.
Target customers in more efficient ways.
Optimize decision-making and business processes.
Enhance enterprise-wide performance.
Improve data security, compliance, and regulation.

