Early Release. — O’Reilly Media, 2016. — 90 p. — ISBN13: 978-1-491-95289-4.
Statistics and machine learning are key components of data science, but only a small proportion of data scientists are actually trained as statisticians. This concise guide illustrates how to apply statistical concepts essential to data science, with advice on how to avoid their misuse.
Many courses and books teach basic statistics, but rarely from a data science perspective. And while many data science resources incorporate statistical methods, they typically lack a deep statistical perspective. This quick reference book bridges that gap in an accessible, readable format.
Exploratory Data Analysis
Elements of Structured Data
Rectangular Data
Estimates of Location
Estimates of Variability
Exploring the Data Distribution
Exploring Binary and Categorical Data
Correlation
Exploring Two or More Variables
Data and Sampling Distributions
Random Sampling and Sample Bias
Selection Bias
Sampling Distribution of a Statistic
The Bootstrap
Confidence Intervals
Normal Distribution
Long-Tailed Distributions
Student’s t Distribution
Binomial Distribution
Poisson and Related Distributions