O’Reilly Media, 2017. — 237 p. — ISBN: 978-1-491-92413-6.
By teaching you how to code machine-learning algorithms using a test-driven approach, this practical book helps you gain the confidence you need to use machine learning effectively in a business environment.
You’ll learn how to dissect algorithms at a granular level, using various tests, and discover a framework for testing machine learning code. The author provides real-world examples to demonstrate the results of using machine-learning code effectively.
Featuring graphs and highlighted code examples throughout, Thoughtful Machine Learning with Python guides you through the process of writing problem-solving code, teaching you along the way how to approach problems through scientific deduction and clever algorithms. The book's tests use Python's NumPy, pandas, scikit-learn, and SciPy data science libraries. If you're a software engineer or business analyst interested in data science, this book will help you:
Reference real-world examples to test each algorithm through engaging, hands-on exercises
Apply test-driven development (TDD) to write and run tests before you start coding
Explore techniques for improving your machine-learning models with data extraction and feature development
Watch out for the risks of machine learning, such as underfitting or overfitting data
Work with K-Nearest Neighbors, neural networks, clustering, and other algorithms
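To give a flavor of the test-first workflow the book advocates, here is a minimal, illustrative sketch (not taken from the book): the test class is written first to pin down the expected behavior of a k-nearest-neighbors regressor, and the `knn_predict` function is then implemented to make those tests pass. The function name and data are hypothetical.

```python
import unittest
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Predict a value for x as the mean target of its k nearest training points."""
    distances = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to each row
    nearest = np.argsort(distances)[:k]              # indices of the k closest points
    return y_train[nearest].mean()

class TestKNNPredict(unittest.TestCase):
    def test_exact_match_with_k_of_one(self):
        X = np.array([[0.0], [1.0], [10.0]])
        y = np.array([0.0, 1.0, 10.0])
        # With k=1, querying a training point returns its own target.
        self.assertEqual(knn_predict(X, y, np.array([1.0]), k=1), 1.0)

    def test_prediction_averages_neighbors(self):
        X = np.array([[0.0], [2.0], [100.0]])
        y = np.array([0.0, 2.0, 100.0])
        # k=2: the neighbors of 1.0 are 0.0 and 2.0, so the mean target is 1.0.
        self.assertAlmostEqual(knn_predict(X, y, np.array([1.0]), k=2), 1.0)

# Run with: python -m unittest <this_file>
```

In TDD terms, these tests would be written (and fail) before `knn_predict` exists; the implementation above is the simplest code that satisfies them.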
Probably Approximately Correct Software
Writing Software Right
Writing the Right Software
The Plan for the Book
A Quick Introduction to Machine Learning
What Is Machine Learning?
Supervised Learning
Unsupervised Learning
Reinforcement Learning
What Can Machine Learning Accomplish?
Mathematical Notation Used Throughout the Book
K-Nearest Neighbors
How Do You Determine Whether You Want to Buy a House?
How Valuable Is That House?
Hedonic Regression
What Is a Neighborhood?
K-Nearest Neighbors
Mr. K’s Nearest Neighborhood
Distances
Curse of Dimensionality
How Do We Pick K?
Valuing Houses in Seattle
Naive Bayesian Classification
Using Bayes’ Theorem to Find Fraudulent Orders
Conditional Probabilities
Probability Symbols
Inverse Conditional Probability (aka Bayes’ Theorem)
Naive Bayesian Classifier
Naiveté in Bayesian Reasoning
Pseudocount
Spam Filter
Decision Trees and Random Forests
The Nuances of Mushrooms
Classifying Mushrooms Using a Folk Theorem
Finding an Optimal Switch Point
Pruning Trees
Hidden Markov Models
Tracking User Behavior Using State Machines
Emissions/Observations of Underlying States
Simplification Through the Markov Assumption
Hidden Markov Model
Evaluation: Forward-Backward Algorithm
The Decoding Problem Through the Viterbi Algorithm
The Learning Problem
Part-of-Speech Tagging with the Brown Corpus
Support Vector Machines
Customer Happiness as a Function of What They Say
The Theory Behind SVMs
Sentiment Analyzer
Aggregating Sentiment
Mapping Sentiment to Bottom Line
Neural Networks
What Is a Neural Network?
History of Neural Nets
Boolean Logic
Perceptrons
How to Construct Feed-Forward Neural Nets
Building Neural Networks
Using a Neural Network to Classify a Language
Clustering
Studying Data Without Any Bias
User Cohorts
Testing Cluster Mappings
K-Means Clustering
EM Clustering
The Impossibility Theorem
Example: Categorizing Music
Improving Models and Data Extraction
Debate Club
Picking Better Data
Feature Transformation and Matrix Factorization
Ensemble Learning
Putting It Together: Conclusion
Machine Learning Algorithms Revisited
How to Use This Information to Solve Problems
What’s Next for You?