In the last few years, a long-brewing technology trend has begun to bubble up into the policy-making process at all levels of government. That trend is called “Big Data”, and industry experts expect policymakers will be dealing with the questions it raises over the next decade.
Big Data offers the potential to drastically improve our quality of life (self-driving cars), but it also raises questions about privacy and security (NSA snooping). The response of policymakers to the questions raised by Big Data technology will have an impact on every American business, from Google to Publix, and every American citizen, from high school students to cancer patients.
The seven terms below are a primer on the lingo used in the Big Data discussion.
Data Science: An emerging field that combines statistics, computer science and business analysis to gain insight from data. Google’s chief economist has called data science the sexiest job of the next decade. Florida Poly will offer degrees in data science when it opens next year.
Big Data: A term that describes data sets of massive volume that change rapidly and come from a wide variety of sources. Big data sets are so big that they cannot be maintained on a traditional database and require new methods to process and search. Big data has applications ranging from the self-driving car to decoding the human genome.
Data Mining (Undirected Discovery)*: The methods used to explore big data sets for patterns, trends and relationships within the data. A data mining project seeks to find the most compelling relationships in the data as a whole. Organizations use the insights extracted from data mining to improve business functions, discover new trends, or explain the causes behind particular events in the business.
Analytics (Directed Discovery)*: Closely related to data mining. The primary difference is that analytics tends to focus on improving a single business area or answering a specific question. Example: determining which key factors drive sales of a certain product.
Predictive Analytics: Using data to build a mathematical model that forecasts a future event. Example: an airline using data about certain parts to predict when they may be about to fail.
Business Intelligence (BI): A collection of key data sets of known significance to a business or organization. These key data sets are often formatted into charts, graphs and gauges on a “dashboard” for easy reference by decision makers.
Datatization: The increasing trend of everyday activities being digitized and recorded through sensors and Wi-Fi internet connections. It is estimated that more data was created in the last two years than in all of preceding human history. The smartphone is the primary agent of datatization in everyday life.
*There is debate in the data science community regarding the exact meanings of the terms “data mining” and “analytics.” Some even suggest ditching the term data mining completely because of its negative connotation.
Infographic: The Real World of Big Data, via Wikibon Infographics