Process
The knowledge discovery in databases (KDD) process is commonly defined with five stages: (1) Selection, (2) Pre-processing, (3) Transformation, (4) Data mining, and (5) Interpretation/evaluation. Many variations on this theme exist, such as the Cross-industry standard process for data mining (CRISP-DM), which defines six phases: (1) Business understanding, (2) Data understanding, (3) Data preparation, (4) Modeling, (5) Evaluation, and (6) Deployment. A simplified process uses only three steps: (1) Pre-processing, (2) Data Mining, and (3) Results Validation.

Polls conducted in 2002, 2004, 2007 and 2014 show that CRISP-DM is the leading methodology used by data miners. The only other data mining standard named in these polls was SEMMA; however, 3–4 times as many people reported using CRISP-DM. Several teams of researchers have published reviews of data mining process models, and Azevedo and Santos conducted a comparison of CRISP-DM and SEMMA in 2008.

Pre-processing
Before data mining algorithms can be used, a targe...
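
As a rough illustration of the simplified three-step process named in the Process section (pre-processing, data mining, results validation), here is a minimal Python sketch. The toy records, the function names `preprocess`, `mine`, and `validate`, and the "most common label per feature value" rule are all assumptions made for illustration; none of the process models mentioned here prescribes them.

```python
from collections import Counter

def preprocess(records):
    """Step 1 (pre-processing): drop records with a missing field."""
    return [r for r in records
            if r["feature"] is not None and r["label"] is not None]

def mine(train):
    """Step 2 (data mining): learn the most common label per feature value."""
    counts = {}
    for r in train:
        counts.setdefault(r["feature"], Counter())[r["label"]] += 1
    return {feature: c.most_common(1)[0][0] for feature, c in counts.items()}

def validate(model, test):
    """Step 3 (results validation): accuracy on records not used for mining."""
    if not test:
        return 0.0
    hits = sum(1 for r in test if model.get(r["feature"]) == r["label"])
    return hits / len(test)

# Hypothetical toy data; two records have missing values.
raw = [
    {"feature": "rainy", "label": "stay_in"},
    {"feature": "sunny", "label": "go_out"},
    {"feature": "rainy", "label": "stay_in"},
    {"feature": None,    "label": "go_out"},
    {"feature": "sunny", "label": "go_out"},
    {"feature": "rainy", "label": "go_out"},
    {"feature": "sunny", "label": None},
    {"feature": "sunny", "label": "go_out"},
    {"feature": "rainy", "label": "stay_in"},
]

clean = preprocess(raw)
train, test = clean[:5], clean[5:]   # simple hold-out split
model = mine(train)
accuracy = validate(model, test)
print(model, accuracy)
```

Holding the test records back from the mining step is what makes the third step a validation: patterns are checked against data they were not derived from.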