O’Reilly Media, 2017. — 237 p. — ISBN: 978-1-491-92413-6.
By teaching you how to code machine-learning algorithms using a test-driven approach, this practical book helps you gain the confidence you need to use machine learning effectively in a business environment.
You’ll learn how to dissect algorithms at a granular level, using various tests, and discover a framework for testing machine learning code. The author provides real-world examples to demonstrate the results of using machine-learning code effectively.
Featuring graphs and highlighted code examples throughout, Thoughtful Machine Learning with Python guides you through the process of writing problem-solving code, teaching you along the way how to approach problems through scientific deduction and clever algorithms. The book's tests use Python's NumPy, pandas, scikit-learn, and SciPy data science libraries. If you're a software engineer or business analyst interested in data science, this book will help you:
Reference real-world examples to test each algorithm through engaging, hands-on exercises
Apply test-driven development (TDD) to write and run tests before you start coding
Explore techniques for improving your machine-learning models with data extraction and feature development
Watch out for the risks of machine learning, such as underfitting or overfitting data
Work with K-Nearest Neighbors, neural networks, clustering, and other algorithms
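To give a flavor of the test-first workflow the book advocates, here is a minimal, illustrative sketch (not taken from the book): the test class is written first to pin down the expected behavior of a k-nearest-neighbors regressor, and the `knn_predict` function is then implemented to make those tests pass. The function name and data are hypothetical.

```python
import unittest
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Predict a value for x as the mean target of its k nearest training points."""
    distances = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to each row
    nearest = np.argsort(distances)[:k]              # indices of the k closest points
    return y_train[nearest].mean()

class TestKNNPredict(unittest.TestCase):
    def test_exact_match_with_k_of_one(self):
        X = np.array([[0.0], [1.0], [10.0]])
        y = np.array([0.0, 1.0, 10.0])
        # With k=1, querying a training point returns its own target.
        self.assertEqual(knn_predict(X, y, np.array([1.0]), k=1), 1.0)

    def test_prediction_averages_neighbors(self):
        X = np.array([[0.0], [2.0], [100.0]])
        y = np.array([0.0, 2.0, 100.0])
        # k=2: the neighbors of 1.0 are 0.0 and 2.0, so the mean target is 1.0.
        self.assertAlmostEqual(knn_predict(X, y, np.array([1.0]), k=2), 1.0)

# Run with: python -m unittest <this_file>
```

In TDD terms, these tests would be written (and fail) before `knn_predict` exists; the implementation above is the simplest code that satisfies them.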
Probably Approximately Correct Software
Writing Software Right
Writing the Right Software
The Plan for the Book
A Quick Introduction to Machine Learning
What Is Machine Learning?
Supervised Learning
Unsupervised Learning
Reinforcement Learning
What Can Machine Learning Accomplish?
Mathematical Notation Used Throughout the Book
K-Nearest Neighbors
How Do You Determine Whether You Want to Buy a House?
How Valuable Is That House?
Hedonic Regression
What Is a Neighborhood?
K-Nearest Neighbors
Mr. K’s Nearest Neighborhood
Distances
Curse of Dimensionality
How Do We Pick K?
Valuing Houses in Seattle
Naive Bayesian Classification
Using Bayes’ Theorem to Find Fraudulent Orders
Conditional Probabilities
Probability Symbols
Inverse Conditional Probability (aka Bayes’ Theorem)
Naive Bayesian Classifier
Naiveté in Bayesian Reasoning
Pseudocount
Spam Filter
Decision Trees and Random Forests
The Nuances of Mushrooms
Classifying Mushrooms Using a Folk Theorem
Finding an Optimal Switch Point
Pruning Trees
Hidden Markov Models
Tracking User Behavior Using State Machines
Emissions/Observations of Underlying States
Simplification Through the Markov Assumption
Hidden Markov Model
Evaluation: Forward-Backward Algorithm
The Decoding Problem Through the Viterbi Algorithm
The Learning Problem
Part-of-Speech Tagging with the Brown Corpus
Support Vector Machines
Customer Happiness as a Function of What They Say
The Theory Behind SVMs
Sentiment Analyzer
Aggregating Sentiment
Mapping Sentiment to Bottom Line
Neural Networks
What Is a Neural Network?
History of Neural Nets
Boolean Logic
Perceptrons
How to Construct Feed-Forward Neural Nets
Building Neural Networks
Using a Neural Network to Classify a Language
Clustering
Studying Data Without Any Bias
User Cohorts
Testing Cluster Mappings
K-Means Clustering
EM Clustering
The Impossibility Theorem
Example: Categorizing Music
Improving Models and Data Extraction
Debate Club
Picking Better Data
Feature Transformation and Matrix Factorization
Ensemble Learning
Putting It Together: Conclusion
Machine Learning Algorithms Revisited
How to Use This Information to Solve Problems
What’s Next for You?