
Vermeulen Andreas Francois. Practical Data Science

Apress, 2018. — 821 p. — ISBN: 1484230531.
Learn how to build a data science technology stack and perform good data science with repeatable methods. You will learn how to turn data lakes into business assets.
The data science technology stack demonstrated in Practical Data Science is built from components in general use in the industry. Data scientist Andreas Vermeulen demonstrates in detail how to build and provision a technology stack to yield repeatable results. He shows you how to apply practical methods to extract actionable business knowledge from data lakes consisting of data from a polyglot of data types and dimensions.
Data Science Technology Stack
Rapid Information Factory Ecosystem
Data Science Storage Tools
Data Lake
Data Vault
Data Warehouse Bus Matrix
Data Science Processing Tools
Spark
Mesos
Akka
Cassandra
Kafka
Elastic Search
R
Scala
Python
MQTT (MQ Telemetry Transport)
What’s Next?
Vermeulen-Krennwallner-Hillman-Clark
Windows
Linux
It’s Now Time to Meet Your Customer
Processing Ecosystem
Example Ecosystem
Sample Data
Layered Framework
Definition of Data Science Framework
Cross-Industry Standard Process for Data Mining (CRISP-DM)
Homogeneous Ontology for Recursive Uniform Schema
The Top Layers of a Layered Framework
Layered Framework for High-Level Data Science and Engineering
Business Layer
Business Layer
Engineering a Practical Business Layer
Utility Layer
Basic Utility Design
Engineering a Practical Utility Layer
Three Management Layers
Operational Management Layer
Audit, Balance, and Control Layer
Balance
Control
Yoke Solution
Cause-and-Effect Analysis System
Functional Layer
Data Science Process
Retrieve Superstep
Data Lakes
Data Swamps
Training the Trainer Model
Understanding the Business Dynamics of the Data Lake
Actionable Business Knowledge from Data Lakes
Engineering a Practical Retrieve Superstep
Connecting to Other Data Sources
Assess Superstep
Assess Superstep
Errors
Analysis of Data
Practical Actions
Engineering a Practical Assess Superstep
Process Superstep
Data Vault
Time-Person-Object-Location-Event Data Vault
Data Science Process
Data Science
Transform Superstep
Transform Superstep
Building a Data Warehouse
Transforming with Data Science
Hypothesis Testing
Overfitting and Underfitting
Precision-Recall
Cross-Validation Test
Univariate Analysis
Bivariate Analysis
Multivariate Analysis
Linear Regression
Logistic Regression
Clustering Techniques
ANOVA
Principal Component Analysis (PCA)
Decision Trees
Support Vector Machines, Networks, Clusters, and Grids
Data Mining
Pattern Recognition
Machine Learning
Bagging Data
Random Forests
Computer Vision (CV)
Natural Language Processing (NLP)
Neural Networks
TensorFlow
Organize and Report Supersteps
Organize Superstep
Report Superstep
Graphics
Pictures
Showing the Difference
Closing Words