Klein Bernd. Data Analysis. Numpy, Matplotlib and Pandas

  • pdf file
  • size 19.22 MB
Bodenseo, 2021. — 514 p.
This tutorial can be used as an online course on numerical Python as data scientists and data analysts need it. Data science is an interdisciplinary subject that draws on, for example, statistics and computer science, especially programming and problem-solving skills. It covers everything necessary to create, prepare, manipulate, filter, cleanse, and analyze data, whether that data is structured or unstructured. Put another way, data science comprises all the techniques needed to extract information and insight from data.
Data Science is an umbrella term that incorporates data analysis, statistics, machine learning, and other related scientific fields to understand and analyze data. Another term occurring quite often in this context is "Big Data". Big Data is one of the most often-used buzzwords in the software-related marketing world. Marketing managers have found out that using this term can boost the sales of their products, regardless of whether they are dealing with big data or not. The term is often used in fuzzy ways.
Python is a general-purpose language, and as such it can be and is widely used by system administrators for operating system administration, by web developers as a tool to create dynamic websites, and by linguists for natural language processing tasks. Being a truly general-purpose language, Python can of course be used to solve numerical problems as well, even without any special numerical modules. So far so good, but the crux of the matter is execution speed. Pure Python without any numerical modules could not practically be used for the numerical tasks that MATLAB, R, and similar languages are designed for. When it comes to computational problem-solving, it is of the greatest importance to consider the performance of algorithms, both in terms of speed and data usage.
Used in combination with its modules NumPy, SciPy, Matplotlib, and Pandas, Python ranks among the top numerical programming languages. It is as efficient as MATLAB or R, if not more so.
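The speed argument above can be made concrete with a small sketch. It is not taken from the book: the function names and the array size are hypothetical choices for illustration, and the timings depend on the machine, so no figures are quoted here. It computes a sum of squares once with a plain Python loop and once with NumPy, then times both.

```python
import timeit

import numpy as np


def sum_of_squares_pure(values):
    """Sum of squares using a plain Python loop (no numerical modules)."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(array):
    """Same computation, delegated to NumPy's compiled dot product."""
    return float(np.dot(array, array))


# Hypothetical problem size, chosen only for illustration.
n = 100_000
values = [float(i) for i in range(n)]
array = np.array(values)

pure_result = sum_of_squares_pure(values)
numpy_result = sum_of_squares_numpy(array)

# Time 10 repetitions of each variant; absolute numbers vary by machine.
pure_time = timeit.timeit(lambda: sum_of_squares_pure(values), number=10)
numpy_time = timeit.timeit(lambda: sum_of_squares_numpy(array), number=10)

print(f"pure Python: {pure_time:.4f} s, result {pure_result:.6e}")
print(f"NumPy:       {numpy_time:.4f} s, result {numpy_result:.6e}")
```

Both variants return the same value up to floating-point rounding. On typical hardware the NumPy version tends to run considerably faster, though the exact ratio is best measured locally.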
Numpy Tutorial.
Numpy Tutorial: Creating Arrays.
Data Type Objects, dtype.
Numerical Operations on NumPy Arrays.
NumPy Arrays: Concatenating, Flattening, and Adding Dimensions.
Python, Random Numbers, and Probability.
Weighted Probabilities.
Synthetical Test Data With Python.
Numpy: Boolean Indexing.
Matrix Multiplication, Dot, and Cross Product.
Reading and Writing Data Files.
Overview of Matplotlib.
Format Plots.
Matplotlib Tutorial.
Shading Regions with fill_between().
Matplotlib Tutorial: Spines and Ticks.
Matplotlib Tutorial, Adding Legends and Annotations.
Matplotlib Tutorial: Subplots.
Exercise.
Exercise.
Matplotlib Tutorial: Gridspec.
GridSpec using SubplotSpec.
Matplotlib Tutorial: Histograms and Bar Plots.
Matplotlib Tutorial: Contour Plots.
Introduction to Pandas.
Data Structures.
Accessing and Changing values of DataFrames.
Pandas: groupby.
Reading and Writing Data.
Dealing with NaN.