© 2005 by Elsevier Inc.
Part I Machine learning tools and techniques.
What’s it all about?
Input: Concepts, instances, and attributes.
Output: Knowledge representation.
Algorithms: The basic methods.
Credibility: Evaluating what’s been learned.
Implementations: Real machine learning schemes.
Transformations: Engineering the input and output.
Moving on: Extensions and applications.
Part II The Weka machine learning workbench.
Introduction to Weka.
The Explorer.
The Knowledge Flow interface.
The Experimenter.
The command-line interface.
Embedded machine learning.
Writing new learning schemes.
"If you have data that you want to analyze and understand, this book and the associated Weka toolkit are an excellent way to start. "
-Jim Gray, Microsoft Research
Features
Explains how data mining algorithms work.
Helps you select appropriate approaches to particular problems and to compare and evaluate the results of different techniques.
Covers performance improvement techniques, including input preprocessing and combining output from different methods.
Shows you how to use the Weka machine learning workbench.
Translations
The book has been translated into German (first edition) and Chinese (second edition).
Table of Contents for the 2nd Edition:
Practical Machine Learning Tools and Techniques
What’s it all about?
Data mining and machine learning
Simple examples: the weather problem and others
Fielded applications
Machine learning and statistics
Generalization as search
Data mining and ethics
Further reading
Input: Concepts, instances, attributes
What’s a concept?
What’s in an example?
What’s in an attribute?
Preparing the input
Further reading
Output: Knowledge representation
Decision tables
Decision trees
Classification rules
Association rules
Rules with exceptions
Rules involving relations
Trees for numeric prediction
Instance-based representation
Clusters
Further reading
Algorithms: The basic methods
Inferring rudimentary rules
Statistical modeling
Divide-and-conquer: constructing decision trees
Covering algorithms: constructing rules
Mining association rules
Linear models
Instance-based learning
Clustering
Further reading
Credibility: Evaluating what’s been learned
Training and testing
Predicting performance
Cross-validation
Other estimates
Comparing data mining schemes
Predicting probabilities
Counting the cost
Evaluating numeric prediction
The minimum description length principle
Applying MDL to clustering
Further reading
Implementations: Real machine learning schemes
Decision trees
Classification rules
Extending linear models
Instance-based learning
Numeric prediction
Clustering
Bayesian networks
Transformations: Engineering the input and output
Attribute selection
Discretizing numeric attributes
Some useful transformations
Automatic data cleansing
Combining multiple models
Using unlabeled data
Further reading
Moving on: Extensions and applications
Learning from massive datasets
Incorporating domain knowledge
Text and Web mining
Adversarial situations
Ubiquitous data mining
Further reading
The Weka machine learning workbench
Introduction to Weka
What’s in Weka?
How do you use it?
What else can you do?
The Explorer
Getting started
Exploring the Explorer
Filtering algorithms
Learning algorithms
Meta-learning algorithms
Clustering algorithms
Association-rule learners
Attribute selection
The Knowledge Flow interface
Getting started
Knowledge Flow components
Configuring and connecting the components
Incremental learning
The Experimenter
Getting started
Simple setup
Advanced setup
Analyze panel
Distributing processing over several machines
The command-line interface
Getting started
The structure of Weka
Command-line options
Embedded machine learning
Writing new learning schemes