Six Ways to Define Big Data

Berkeley School of Information (datascience@berkeley) recently asked experts in a variety of industries to provide their definition of “Big Data”. They received over 40 definitions that ranged from the traditional 3 Vs (i.e., Volume, Velocity and Variety) to anything related to analytics or visualization. Some of these definitions were fairly narrow and focused on only a single concept (e.g., integrating data), while others were broad and included a variety of concepts (e.g., 3 Vs, analytics, visualization). Clearly, depending on which expert you ask, you will get a different definition of Big Data. But common themes emerged from these definitions.

Based on the content of the 40+ definitions, I generated 10 categories that described what the definitions covered. Some definitions touched on several categories, while others touched on only one or two. A principal components analysis of the 10 categories (mentioned = 1; not mentioned = 0) across the definitions resulted in six general categories that are used to define Big Data (N represents the number of definitions that could be described by each concept):

  1. Characteristics of the data: Big Data is about the traditional 3 Vs (Volume, Velocity, Variety) of data (N = 19) and the non-routine computing resources needed to process those data (N = 11).  Big Data is “data that contains enough observations to demand unusual handling because of its sheer size.”
  2. Insights: Big Data is about the insights/results/value (N = 17) we get from data and the people necessary for extracting these insights (N = 3). Big Data “enchants us with the promise of new insights.”
  3. Analytics: Big Data is about analytics and modeling methods (N = 12) and their application in improving decision-making (N = 4). Big Data allows us the “opportunity to gain a more complex understanding of the relationships between different factors and to uncover previously undetected patterns in data.”
  4. Data Integration: Big Data is about the integration of various disparate data sources and harnessing their combined power (N = 6). What’s big in Big Data is “the big number of data sources we have, as digital sensors and behavior trackers migrate across the world.”
  5. Visualization and Story-telling: Big Data is about being able to tell a story (N = 1) through visualization (N = 2). Big Data is “storytelling – whether it is through information graphics or other visual aids that explain it in a way that allows others to understand across sectors.”
  6. Ethical: Big Data is about being concerned about how we use the vast quantities of data we have available today (N = 1). Big Data can provide us with “endless possibilities or cradle-to-grave shackles.”
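
As a rough illustration of the coding-and-reduction step described above, here is a minimal Python sketch. It is not the analysis behind the numbers in this post: the codings are made up for eight imaginary definitions, only four of the ten categories are used, the category labels are my shorthand, and only the first principal component is extracted (by power iteration on the correlation matrix). The original analysis may have used different software, rotation, and retention rules.

```python
import math

# Hypothetical labels for four of the ten categories (shorthand, not official names).
CATEGORIES = ["3 Vs", "computing resources", "insights", "analytics"]

# Made-up codings for eight imaginary definitions: 1 = mentioned, 0 = not mentioned.
# These rows are illustrative only; they are NOT the Berkeley responses.
CODED = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
]


def column_counts(rows):
    """Return N per category: how many definitions mention it."""
    return [sum(col) for col in zip(*rows)]


def correlation_matrix(rows):
    """Pearson correlations between the 0/1 category columns."""
    n = len(rows)
    standardized = []
    for col in zip(*rows):
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        if sd == 0:
            raise ValueError("a category mentioned by all or none has no variance")
        standardized.append([(x - mean) / sd for x in col])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in standardized]
            for zi in standardized]


def leading_component(matrix, iterations=200):
    """Approximate the first principal component by power iteration."""
    # Non-uniform start vector so it is unlikely to be orthogonal to the answer.
    v = [float(i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient: eigenvalue estimate for the unit vector v.
    rv = [sum(m * x for m, x in zip(row, v)) for row in matrix]
    eigenvalue = sum(a * b for a, b in zip(v, rv))
    if v[0] < 0:  # fix the arbitrary sign so the first loading is positive
        v = [-x for x in v]
    return eigenvalue, v


if __name__ == "__main__":
    print(dict(zip(CATEGORIES, column_counts(CODED))))
    eigenvalue, loadings = leading_component(correlation_matrix(CODED))
    share = eigenvalue / len(CATEGORIES)
    print(f"first component explains {share:.0%} of the variance")
    for name, loading in zip(CATEGORIES, loadings):
        print(f"{name:>20}: {loading:+.2f}")
```

For this toy matrix, the first component has an eigenvalue of about 2.62 out of 4, with the two data-characteristic categories loading in one direction and the insights and analytics categories in the other. A full analysis would extract all components and then decide how many to retain, which is how 10 coded categories can collapse into a smaller set of general ones.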

Not surprisingly, these six areas are similar to how Big Data vendors see the field in 2014. Big Data is not just one thing. There are many different facets to this Big Data behemoth. While there is no consensus on a singular definition of Big Data, the consolidation of the current definitions shows that Big Data can be described by a handful of general areas, including characteristics of the data itself, insights you can get from the data, the analytics and modeling methods, data integration and more.

8 Responses to Six Ways to Define Big Data

  1. CyberH September 9, 2014 at 11:10 am #

    Bob, very nice definitions for Big Data. When considering a big data strategy, I think it’s worth mentioning HPCC Systems from LexisNexis. Designed by data scientists, HPCC Systems is an open source data-intensive supercomputing platform to process and solve Big Data analytical problems and can help companies derive actionable insights from their data.
    HPCC Systems provides proven solutions to handle what are now called Big Data problems, and has been doing so for more than a decade. The main advantages over other alternatives are the real-time delivery of data queries and the extremely powerful ECL language programming model. More info at http://hpccsystems.com

bob@businessoverbroadway.com | 206.372.5990