Data mining is an interdisciplinary field at the confluence of several disciplines, including database systems, statistics, machine learning, visualization, and information science. Because so many disciplines contribute to it, data mining research is expected to produce a wide variety of data mining systems. A clear classification of these systems is therefore needed to help potential users distinguish among them and identify those that best match their needs. Data mining systems can be categorized according to various criteria, as follows.
Fig. 1.7 Data Mining as a Confluence of Multiple Disciplines
Classification According to the Kinds of Databases Mined
A data mining system can be classified according to the kinds of databases mined. Database systems can themselves be classified according to different criteria (such as data models, or the types of data or applications involved), each of which may require its own data mining technique. Data mining systems can therefore be classified accordingly.
For instance, if classifying according to data models, we may have a relational, transactional, object-relational, or data warehouse mining system. If classifying according to the special types of data handled, we may have a spatial, time-series, text, stream data, or multimedia data mining system, or a World Wide Web mining system.
Classification According to the Kinds of Knowledge Mined
Data mining systems can be classified according to the kinds of knowledge they mine, that is, based on data mining functionalities, such as characterization, discrimination, association and correlation analysis, classification, prediction, clustering, outlier analysis, and evolution analysis. A comprehensive data mining system usually provides multiple and/or integrated data mining functionalities.
Data mining systems can also be categorized as those that mine data regularities (commonly occurring patterns) versus those that mine data irregularities (such as exceptions, or outliers). In general, concept description, association and correlation analysis, classification, prediction, and clustering mine data regularities, rejecting outliers as noise. These methods may also help detect outliers.
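As a rough illustration of how a method that summarizes regularities can also surface irregularities, the sketch below describes a small numeric sample by its median and median absolute deviation (MAD), then flags values that lie far from that summary. The sample data, the function name `split_regular_and_outliers`, and the threshold of 3.0 are assumptions chosen for illustration; they do not come from any particular data mining system.

```python
from statistics import median


def split_regular_and_outliers(values, threshold=3.0):
    """Split values into (regular, outliers) using a median/MAD rule.

    A value is treated as an outlier when its distance from the median
    exceeds `threshold` times the median absolute deviation (MAD).
    The threshold is an illustrative assumption, not a standard setting.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # No spread to measure against: treat everything as regular.
        return list(values), []
    regular, outliers = [], []
    for v in values:
        if abs(v - center) / mad > threshold:
            outliers.append(v)
        else:
            regular.append(v)
    return regular, outliers


# Hypothetical daily transaction counts; one day is clearly unusual.
daily_counts = [102, 98, 105, 101, 97, 100, 480, 103]
regular, outliers = split_regular_and_outliers(daily_counts)
print("regular:", regular)
print("outliers:", outliers)
```

The same summary (median and MAD) serves both purposes: it characterizes the typical values and gives a yardstick against which exceptions stand out.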
Classification According to the Kinds of Techniques Utilized
Data mining systems can be categorized according to the underlying data mining techniques employed. These techniques can be described according to the degree of user interaction involved (e.g., autonomous systems, interactive exploratory systems, query-driven systems) or the methods of data analysis employed (e.g., database-oriented or data warehouse-oriented techniques, machine learning, statistics, visualization, pattern recognition, neural networks, and so on). A sophisticated data mining system will often adopt multiple data mining techniques or work out an effective, integrated technique that combines the merits of a few individual approaches.
Classification According to the Applications Adapted
Data mining systems can also be categorized according to the applications to which they are adapted. For example, data mining systems may be tailored specifically for finance, telecommunications, DNA, stock markets, e-mail, and so on. Different applications often require the integration of application-specific methods. Therefore, a generic, all-purpose data mining system may not fit domain-specific mining tasks.
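The four criteria in this section can be read as four independent attributes of a system. The sketch below records them in a small data structure and filters a catalog by a user's needs. The class `MiningSystemProfile`, the function `matching_systems`, and both catalog entries are hypothetical names and made-up examples, not real products or an established API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiningSystemProfile:
    """Describes one system along the four classification criteria."""
    name: str
    databases: frozenset      # kinds of databases mined
    knowledge: frozenset      # kinds of knowledge (functionalities) mined
    techniques: frozenset     # kinds of techniques utilized
    applications: frozenset   # applications adapted ("generic" if none)


def matching_systems(profiles, database, knowledge, application):
    """Return names of systems covering the requested database kind,
    mining functionality, and application domain."""
    return [
        p.name
        for p in profiles
        if database in p.databases
        and knowledge in p.knowledge
        and application in p.applications
    ]


# Two made-up systems, for illustration only.
catalog = [
    MiningSystemProfile(
        name="GenericMiner",
        databases=frozenset({"relational", "data warehouse"}),
        knowledge=frozenset({"classification", "clustering", "association"}),
        techniques=frozenset({"statistics", "machine learning"}),
        applications=frozenset({"generic"}),
    ),
    MiningSystemProfile(
        name="TelecomStreamMiner",
        databases=frozenset({"stream", "time-series"}),
        knowledge=frozenset({"outlier analysis", "prediction"}),
        techniques=frozenset({"neural networks"}),
        applications=frozenset({"telecommunications"}),
    ),
]

result = matching_systems(
    catalog, "stream", "outlier analysis", "telecommunications"
)
print(result)
```

In this toy catalog, a request for a finance-specific system over relational data returns no match, which mirrors the point that a generic, all-purpose system may not fit domain-specific mining tasks.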