Data mining definition

Data mining, sometimes used synonymously with “knowledge discovery,” is the process of sifting large volumes of data for correlations, patterns, and trends. It is a subset of data science that uses statistical and mathematical techniques along with machine learning and database systems. The Association for Computing Machinery’s Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD) defines it as the science of extracting useful knowledge from the huge repositories of digital data created by computing technologies.

The idea of extracting patterns from data is not new, but the modern concept of data mining began taking shape in the 1980s and 1990s with the use of database management and machine learning techniques to augment manual processes.

Data mining vs. data analytics

The terms data analytics and data mining are often conflated, but data analytics can be understood as a subset of data mining.

Data mining focuses on cleaning raw data, finding patterns, creating models, and then testing those models, according to analytics vendor Tableau. Data analytics, on the other hand, is the part of data mining focused on extracting insights from data. Its aim is to apply statistical analysis and technologies to data to find trends and solve problems.
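The four steps described above (clean, find patterns, model, test) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the records, the field meanings (monthly visits and monthly spend), and the helper names are all hypothetical, and a simple straight-line fit stands in for whatever modeling technique a real project would use.

```python
from statistics import mean

# Hypothetical raw records: (monthly_visits, monthly_spend).
# None marks a missing value; all figures are invented for illustration.
raw = [
    (2, 21.0), (4, 39.0), (None, 15.0), (6, 62.0),
    (8, 79.0), (10, None), (12, 121.0), (14, 138.0),
]


def clean(records):
    """Step 1, cleaning: drop any record that has a missing field."""
    return [r for r in records if None not in r]


def pearson(xs, ys):
    """Step 2, pattern finding: Pearson correlation between two columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Step 3, modeling: least-squares line, returned as (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def mean_abs_error(model, xs, ys):
    """Step 4, testing: average absolute prediction error on unseen records."""
    slope, intercept = model
    return mean(abs(slope * x + intercept - y) for x, y in zip(xs, ys))


rows = clean(raw)
train, test = rows[:4], rows[4:]  # hold out the last records for testing

r = pearson([x for x, _ in rows], [y for _, y in rows])
model = fit_line([x for x, _ in train], [y for _, y in train])
mae = mean_abs_error(model, [x for x, _ in test], [y for _, y in test])

print(f"rows kept: {len(rows)}, correlation: {r:.3f}")
print(f"slope: {model[0]:.2f}, intercept: {model[1]:.2f}, test error: {mae:.2f}")
```

The model is fitted on one part of the data and scored on records it has not seen, which is what testing a model means in practice. Scoring it on the same records used to fit it would overstate how well it works.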

The business value of data mining

Data mining is used at companies across a broad swathe of industries to sift through their data to understand trends and make better business decisions. Media and telecom companies mine their customer data to better understand customer behavior. Insurance companies use data mining to price their products more effectively and to create new products. Educators are now using data mining to discover patterns in student performance and identify problem areas where students might need special attention. Retailers are using data mining to better understand their customers and create highly targeted campaigns.

Data mining use cases include the following:

