Springer, 2008. — 180 p. — ISBN: 978-3540769163, e-ISBN: 978-3540769170.
This book covers the fundamental concepts of data mining and demonstrates the potential of gathering large data sets and analyzing them to gain useful business understanding. The book is organized in three parts. Part I introduces concepts. Part II describes and demonstrates basic data mining algorithms and contains chapters on a number of techniques often used in data mining. Part III focuses on business applications of data mining. Methods are presented with simple examples, applications are reviewed, and relative advantages are evaluated.
Part I. Introduction.
What is Data Mining?, What is Needed to Do Data Mining, Business Data Mining, Data Mining Tools.
Data Mining Process.
CRISP-DM, SEMMA, Example Data Mining Process Application, Comparison of CRISP & SEMMA, Handling Data.
Part II. Data Mining Methods as Tools.
Memory-Based Reasoning Methods.
Matching, Distance Minimization, Software, Appendix: Job Application Data Set.
Association Rules in Knowledge Discovery.
Market-Basket Analysis, Real Market Basket Data.
Fuzzy Sets in Data Mining.
Fuzzy Sets and Decision Trees, Fuzzy Sets and Ordinal Classification, Fuzzy Association Rules.
Rough Sets.
A Brief Theory of Rough Sets, Some Exemplary Applications of Rough Sets, Rough Sets Software Tools, The Process of Conducting Rough Sets Analysis, A Representative Example.
Support Vector Machines.
Formal Explanation of SVM, Non-linear Classification, Use of SVM – A Process-Based Approach, Support Vector Machines versus Artificial Neural Networks, Disadvantages of Support Vector Machines.
Genetic Algorithm Support to Data Mining.
Demonstration of Genetic Algorithm, Application of Genetic Algorithms in Data Mining, Appendix: Loan Application Data Set.
Performance Evaluation for Predictive Modeling.
Performance Metrics for Predictive Modeling, Estimation Methodology for Classification Models, Simple Split (Holdout), The k-Fold Cross Validation, Bootstrapping and Jackknifing, Area Under the ROC Curve.
Part III. Applications.
Applications of Methods.
Memory-Based Application, Association Rule Application, Fuzzy Data Mining, Rough Set Models, Support Vector Machine Application, Genetic Algorithm Applications: Japanese Credit Screening, Product Quality Testing Design, Customer Targeting, Medical Analysis.
Predicting the Financial Success of Hollywood Movies.