Rating: 4 out of 5 (1 review on Udemy)

Data Visualization with Python

Understand, explore, and effectively present data using the powerful data visualization techniques of Python.
Instructor:
Packt Publishing
6 students enrolled
English [Auto-generated]
Understand and use various plot types with Python
Explore and work with different plotting libraries
Understand and create effective visualizations
Improve your Python data wrangling skills
Work with industry-standard tools like Matplotlib, Seaborn, and Bokeh
Understand different data formats and representations

You’ll begin with an introduction to data visualization and its importance. Then, you’ll learn about statistics by computing mean, median, and variance for some numbers, and observing the difference in their values. You’ll also learn about key NumPy and Pandas techniques, such as indexing, slicing, iterating, filtering, and grouping. You’ll study different types of visualizations, compare them, and find out how to select a particular type of visualization using this comparison. You’ll explore different plots, including custom creations. After you get a hang of the various visualization libraries, you’ll learn to work with Matplotlib and Seaborn to simplify the process of creating visualizations. You’ll also be introduced to advanced visualization techniques, such as geoplots and interactive plots. You’ll learn how to make sense of geospatial data, create interactive visualizations that can be integrated into any webpage, and take any dataset to build beautiful and insightful visualizations. You’ll study how to plot geospatial data on a map using Choropleth plot, and study the basics of Bokeh, extending plots by adding widgets and animating the display of information.

About the Authors

Mario Döbler is a graduate student with a focus in deep learning and AI. He previously worked at the Bosch Center for Artificial Intelligence in Silicon Valley in the field of deep learning, using state-of-the-art algorithms to develop cutting-edge products. Currently, he dedicates himself to applying deep learning to medical data to make health care accessible to everyone.

Tim Großmann is a CS student with an interest in diverse topics ranging from AI to IoT. He previously worked at the Bosch Center for Artificial Intelligence in Silicon Valley in the field of big data engineering. He’s highly involved in different open source projects and actively speaks at meetups and conferences about his projects and experiences.

Erik Sevre is a doctoral student and researcher in Computational Science and Technology at Seoul National University.

Importance of Data Visualization and Data Exploration

1
Course Overview

Python has recently emerged as a programming language that performs well for data analysis. Python has applications across data science pipelines that convert data into a usable format, analyze it, and extract useful conclusions from the data to represent it well. It provides data visualization libraries that can help you assemble graphical representations quickly.

In this course, you will learn how to use Python in combination with various libraries, such as NumPy, pandas, Matplotlib, seaborn, and geoplotlib, to create impactful data visualizations using real-world data. The GitHub link for this course is: https://github.com/TrainingByPackt/Data-Visualization-with-Python-eLearning

2
Installation and Setup

Before you start this course, we'll install Python 3.6, pip, and the other libraries used throughout this course.

3
Introduction

Unlike machines, people are not usually equipped to interpret large amounts of information from a random set of numbers and messages in a given piece of data. While they may know roughly what the data consists of, they might need help to understand it completely. Of all our cognitive capabilities, we understand things best by processing visual information. When data is represented visually, the probability of understanding complex structures and numbers increases.

4
Overview of Statistics

Statistics combines the collection, analysis, interpretation, and representation of numerical data. Probability is a measure of the likelihood that an event will occur, quantified as a number between 0 and 1. The higher the probability, the more likely the event. Let us learn about this in further detail.
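As a minimal sketch of the measures covered in this lesson, the following computes the mean, median, and variance with NumPy. The sample values are made up for illustration (100 is a deliberate outlier) and are not taken from the course material.

```python
import numpy as np

# Hypothetical sample values; 100 is a deliberate outlier.
values = np.array([2, 4, 4, 5, 100])

mean = np.mean(values)      # arithmetic average, pulled upward by the outlier
median = np.median(values)  # middle value of the sorted data, robust to the outlier
variance = np.var(values)   # population variance (ddof=0 is NumPy's default)

print(mean, median, variance)
```

Comparing the mean with the median shows how strongly a single extreme value can shift the mean while leaving the median unchanged.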

5
NumPy

When handling data, we often need a way to work with multidimensional arrays. As we discussed previously, we must also apply some basic mathematical and statistical operations to that data. This is exactly where NumPy positions itself. It provides support for large n-dimensional arrays and has built-in support for many high-level mathematical and statistical operations.
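The techniques named in the course description (indexing, slicing, and filtering) can be sketched as follows. The array contents are an arbitrary placeholder chosen only so the results are easy to check by hand.

```python
import numpy as np

# Hypothetical 2-D array with 3 rows and 4 columns holding 0..11.
data = np.arange(12).reshape(3, 4)

element = data[1, 2]            # indexing: row 1, column 2
first_two_cols = data[:, :2]    # slicing: all rows, first two columns
evens = data[data % 2 == 0]     # filtering with a boolean mask
col_means = data.mean(axis=0)   # a statistic computed per column

print(element, first_two_cols.shape, evens, col_means)
```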

6
pandas

The pandas Python library offers data structures and methods to manipulate different types of data, such as numerical and temporal. These operations are easy to use and highly optimized for performance. Let us learn about the different built-in solutions provided by pandas.
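A short sketch of filtering and grouping with pandas is shown below. The column names and values are invented for illustration and do not come from the course datasets.

```python
import pandas as pd

# Hypothetical data: two groups with two measurements each.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "year": [2017, 2018, 2017, 2018],
    "value": [1.0, 3.0, 10.0, 20.0],
})

# Filtering: keep only the rows for one year.
recent = df[df["year"] == 2018]

# Grouping: average value per group.
mean_by_group = df.groupby("group")["value"].mean()

print(recent)
print(mean_by_group)
```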

7
Lesson Summary

Let us quickly recap our learning from this lesson.

8
Test your knowledge

All You Need to Know About Plots

1
Lesson Overview

In this lesson, we will focus on various visualizations and identify which visualization is best to show certain information for a given dataset. We will describe every visualization in detail and give practical examples, such as comparing different stocks over time or comparing the ratings for different movies.

2
Comparison Plots

Comparison plots are well suited to comparing multiple variables, or variables over time. Different plots, such as bar charts, line charts, vertical bar charts, and radar charts, are used depending on the data requirements. We will learn each of these types in detail with examples.

3
Relation Plots

Relation plots are perfectly suited to show relationships among variables. A scatter plot visualizes the correlation between two variables for one or multiple groups. Bubble plots can be used to show relationships between three variables. The additional third variable is represented by the dot size. Heatmaps are great for revealing patterns or correlations between two qualitative variables. A correlogram is a perfect visualization to show the correlation among multiple variables.

4
Composition Plots

Composition plots are ideal if you think about something as a part of a whole. For static data, you can use pie charts, stacked bar charts, or Venn diagrams.

5
Distribution Plots

Distribution plots give a deep insight into how your data is distributed. For a single variable, a histogram is well-suited. For multiple variables, you can either use a box plot or a violin plot. The violin plot visualizes the densities of your variables, whereas the box plot just visualizes the median, the interquartile range, and the range for each variable.
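To make the contrast between the two multi-variable plots concrete, here is a minimal Matplotlib sketch that draws a box plot and a violin plot of the same data side by side. The two samples are randomly generated placeholders, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical samples drawn from two normal distributions.
rng = np.random.default_rng(0)
samples = [rng.normal(0, 1, 200), rng.normal(1, 2, 200)]

fig, (ax_box, ax_violin) = plt.subplots(1, 2, figsize=(8, 4))

# Box plot: median, interquartile range, and whiskers for each variable.
ax_box.boxplot(samples)
ax_box.set_title("Box plot")

# Violin plot: the estimated density of each variable, with the median marked.
ax_violin.violinplot(samples, showmedians=True)
ax_violin.set_title("Violin plot")
```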

6
Geo Plots

Geo plots are a great way to visualize geospatial data. Choropleth maps can be used to compare quantitative values for different countries, states, and so on. If you want to show connections between different locations, connection maps are the way to go.

7
What Makes a Good Visualization?

In this video, we will learn about the various factors that make a good visualization.

8
Lesson Summary

Let us quickly recap our learning from this lesson.

9
Test your knowledge

A Deep Dive into Matplotlib

1
Lesson Overview

Matplotlib is the most popular plotting library for Python, used for data science and machine learning visualizations. Several features, such as the global style of MATLAB, were introduced into Matplotlib to make the transition to Matplotlib easier. Matplotlib is a huge project, and a good deal of abstraction has been worked into it to make its usage intuitive and convenient. Let us try to understand the concepts behind the plots.

2
Overview of Plots in Matplotlib

Plots in Matplotlib have a hierarchical structure that nests Python objects to create a tree-like structure. Each plot is encapsulated in a Figure object. This Figure is the top-level container of the visualization. It can have multiple axes, which are basically individual plots inside this top-level container. Let us learn about this in more detail.
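The Figure and axes hierarchy can be sketched as follows. The plotted values are placeholders, and the Agg backend is an assumption made so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt

# The Figure is the top-level container.
fig = plt.figure(figsize=(8, 4))

# Each Axes is an individual plot nested inside the Figure.
ax_left = fig.add_subplot(1, 2, 1)
ax_right = fig.add_subplot(1, 2, 2)

ax_left.plot([1, 2, 3], [1, 4, 9])
ax_right.bar(["a", "b"], [3, 5])

# The Figure keeps track of the axes it contains.
print(fig.axes)
```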

3
Basic Text and Legend Functions

All the functions, except for the legend, create and return a matplotlib.text.Text() instance. We are mentioning it here so that you know that all the discussed properties can be used for the other functions as well. Let us learn about all the text functions in this video.
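A minimal sketch of this behavior follows: the title, axis label, free text, and annotation calls each return a Text instance (an annotation is a subclass of Text), while the legend call returns a different object. The labels and coordinates are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
from matplotlib.text import Text

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4], label="squares")

title = ax.set_title("Example title")
xlabel = ax.set_xlabel("x")
note = ax.text(0.5, 3.0, "a free-form note")
annotation = ax.annotate("last point", xy=(2, 4), xytext=(1, 3.5),
                         arrowprops=dict(arrowstyle="->"))
legend = ax.legend()

# Text properties such as color or font size apply to every Text instance.
for text_obj in (title, xlabel, note, annotation):
    text_obj.set_color("gray")

print(isinstance(title, Text), isinstance(legend, Text))
```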

4
Basic Plots

Let us look at the different types of basic plots in this video.

5
Layouts

There are multiple ways to define a visualization layout in Matplotlib. We will start with subplots and how to use the tight layout to create visually appealing plots and then cover GridSpec, which offers a more flexible way to create multi-plots. Let’s see how!
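Both layout approaches can be sketched briefly. The grid sizes and titles below are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec

# Subplots: a regular 2x2 grid, with tight_layout to avoid overlapping labels.
fig_grid, axes = plt.subplots(2, 2)
for index, ax in enumerate(axes.ravel()):
    ax.set_title(f"Plot {index + 1}")
fig_grid.tight_layout()

# GridSpec: cells can span several rows or columns.
fig_spec = plt.figure()
grid = GridSpec(2, 2, figure=fig_spec)
ax_top = fig_spec.add_subplot(grid[0, :])           # spans the whole top row
ax_bottom_left = fig_spec.add_subplot(grid[1, 0])
ax_bottom_right = fig_spec.add_subplot(grid[1, 1])
```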

6
Images

In case you want to include images in your visualizations or in case you are working with image data, Matplotlib offers several functions to deal with images. In this section, we will learn to load, save, and plot images with Matplotlib.
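Saving, loading, and plotting an image might look like the sketch below. The image is a synthetic gradient and the file is written to a temporary directory; both are assumptions made to keep the example self-contained.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical 16x16 grayscale gradient standing in for real image data.
gradient = np.linspace(0, 1, 16 * 16).reshape(16, 16)

# Save the array as a PNG file.
path = os.path.join(tempfile.mkdtemp(), "gradient.png")
plt.imsave(path, gradient, cmap="gray")

# Load the file back as a NumPy array.
loaded = plt.imread(path)

# Plot the loaded image without axis ticks.
fig, ax = plt.subplots()
ax.imshow(loaded)
ax.axis("off")
```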

7
Writing Mathematical Expressions

In case you need to write mathematical expressions within the code, Matplotlib supports TeX. You can use it in any text by placing your mathematical expression in a pair of dollar signs. There is no need to have TeX installed since Matplotlib comes with its own parser.
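For example, the following sketch places TeX expressions in a title, an axis label, and a legend entry. Raw strings are used so that backslashes reach Matplotlib's parser unchanged; the plotted function is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label=r"$\sin(x)$")

# Expressions between dollar signs are rendered by Matplotlib's own parser.
ax.set_title(r"$y = \sin(x)$")
ax.set_xlabel(r"$x \in [0, 2\pi]$")
ax.legend()

# Drawing the canvas forces the expressions to be parsed and rendered.
fig.canvas.draw()
```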

8
Lesson Summary

Let us quickly recap our learning from this lesson.

9
Test your knowledge

Simplifying Visualizations Using seaborn

1
Lesson Overview

Unlike Matplotlib, Seaborn is not a standalone Python library. It is built on top of Matplotlib and provides a higher-level abstraction to make visually appealing statistical visualizations. A neat feature of Seaborn is the ability to integrate with DataFrames from the pandas library. With Seaborn, we attempt to make visualization a central part of data exploration and understanding. Internally, Seaborn operates on DataFrames and arrays that contain the complete dataset. This enables it to perform semantic mappings and statistical aggregations that are essential for displaying informative visualizations. Seaborn can also be used solely to change the style and appearance of Matplotlib visualizations.

2
Controlling Figure Aesthetics

Matplotlib is highly customizable, but this also makes it difficult to know which settings to tweak to achieve a visually appealing plot. In contrast, Seaborn provides several customized themes and a high-level interface for controlling the appearance of Matplotlib figures.
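As a small sketch of this interface, the following applies one of Seaborn's built-in themes and then draws an ordinary Matplotlib plot, which picks up the theme. The choice of "whitegrid" and the plotted values are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import seaborn as sns

# Apply a built-in Seaborn theme; it updates Matplotlib's global settings.
sns.set_style("whitegrid")

fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [2, 3, 5, 8])
ax.set_title("Matplotlib plot with a Seaborn theme")

# The theme switches on the background grid, among other settings.
print(plt.rcParams["axes.grid"])
```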

3
Color Palettes

Color is a very important factor for your visualization. Color can reveal patterns in the data if used effectively or hide patterns if used poorly. Seaborn makes it easy to select and use color palettes that are suited to your task. The color_palette() function provides an interface for many of the possible ways to generate colors. seaborn.color_palette([palette], [n_colors], [desat]) returns a list of colors, thus defining a color palette.
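A brief sketch of color_palette() follows. The palette name "deep" and the number of colors are arbitrary choices for illustration.

```python
import seaborn as sns

# Three colors from the built-in "deep" palette, each an (r, g, b) tuple.
palette = sns.color_palette("deep", n_colors=3)

# The same colors with their saturation reduced by half.
desaturated = sns.color_palette("deep", n_colors=3, desat=0.5)

for color in palette:
    print(color)
```

The returned list can be passed to plotting functions that accept a palette or a list of colors.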

4
Interesting Plots in seaborn

Seaborn offers a very convenient way to create various bar plots. Bar plots can also be used to represent estimates of central tendency with the height of each rectangle and to indicate the uncertainty around that estimate using error bars. Let us learn this through the following example, which plots salary by employee qualification in various districts.
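A sketch of such a bar plot is given below. The DataFrame is invented for illustration (it is not the course dataset), and the column names education, district, and salary are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import pandas as pd
import seaborn as sns

# Hypothetical salary records; values are made up for illustration.
df = pd.DataFrame({
    "education": ["Bachelor", "Bachelor", "Master", "Master"] * 2,
    "district": ["A"] * 4 + ["B"] * 4,
    "salary": [40000, 44000, 52000, 58000, 42000, 47000, 55000, 61000],
})

# Bar height is the mean salary per group; error bars show the uncertainty
# around that estimate.
ax = sns.barplot(x="education", y="salary", hue="district", data=df)
ax.set_title("Mean salary by education and district")
```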

5
Multi-plots in seaborn

In the previous section, we introduced a multi-plot, namely the pair plot. In this video, let us learn about a different way to create flexible multi-plots.

6
Regression Plots

Many datasets contain multiple quantitative variables, and the goal is to find a relationship among those variables. We previously mentioned a few functions that show the joint distribution of two variables. It can be helpful to estimate relationships between two variables. We will only cover linear regression in this topic; however, Seaborn provides a wider range of regression functionality if needed. To visualize linear relationships determined through linear regression, use the function regplot(). Let us look at an example of a seaborn regression plot.
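A minimal regplot() sketch follows. The data is synthetic: points scattered around a straight line with made-up slope and noise, generated only to give the regression something to fit.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: a linear trend with added random noise.
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(0, 1, 50)
df = pd.DataFrame({"x": x, "y": y})

# Scatter plot of the data with a fitted linear regression line.
ax = sns.regplot(x="x", y="y", data=df)
```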

7
Squarify

Matplotlib and seaborn do not offer tree maps; therefore, the Squarify library, which is built on Matplotlib, is used to create them. seaborn is a great addition for creating color palettes. Let us look at the following example of a tree map using the Squarify library.

8
Lesson Summary

Let us quickly recap our learning from this lesson.

9
Test your knowledge

Plotting Geospatial Data

1
Lesson Overview

Geoplotlib is an open-source Python library for geospatial data visualizations. It has a wide range of geographical visualizations and supports hardware acceleration. It also provides high-performance rendering for large datasets with millions of data points.

2
Geoplotlib Basics

Geoplotlib is an open-source Python library for geospatial data visualizations which contains a wide range of geographical visualizations. It has a simple interface.

Some of its features include:

  • Supports hardware acceleration

  • Provides high-performance rendering for large datasets

  • Provides map tiles for interactivity and simple animations

3
Tile Providers

Geoplotlib supports the usage of different tile providers. This means that any OpenStreetMap tile server can be used as a backdrop to our visualization. Some of the popular free tile providers are Stamen Watercolor, Stamen Toner, Stamen Toner Lite, and DarkMatter.

4
Custom Layers

Custom layers allow you to create more complex data visualizations. They also help with adding more interactivity and animation to them. Creating a custom layer starts by defining a new class that extends the BaseLayer class that's provided by Geoplotlib. Besides the __init__ method that initializes the class-level variables, we must at least override the draw method of the already provided BaseLayer class.

5
Lesson Summary

Let us quickly recap our learning from this lesson.

6
Test your knowledge

Making Things Interactive with Bokeh

1
Lesson Overview

Bokeh has been around since 2013, with version 1.0.4 being released in 2018. It targets modern web browsers to present interactive visualizations to users rather than static images. In this lesson, we will design interactive plots using the Bokeh library.

2
Bokeh Basics

Let us look at some of the features of Bokeh. Since we are using Jupyter Notebook throughout this courseware, it's worth mentioning that Bokeh, including its interactivity, is natively supported in Notebook.
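A basic Bokeh plot can be sketched as follows. The data and labels are placeholders. In a notebook you would typically call output_notebook() once and then show(p) to render the plot inline; both calls are left as comments here so the sketch runs outside a notebook as well.

```python
from bokeh.plotting import figure

# Hypothetical data points.
x = [1, 2, 3, 4, 5]
y = [2, 5, 4, 8, 7]

# Create a figure and add a line glyph to it.
p = figure(title="A basic Bokeh line plot",
           x_axis_label="x", y_axis_label="y")
p.line(x, y, line_width=2)

# In Jupyter Notebook (assumed usage, not executed here):
# from bokeh.io import output_notebook, show
# output_notebook()
# show(p)
```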

3
Adding Widgets

One of the most powerful features of Bokeh is its ability to use widgets to interactively change the data that's displayed in the visualization. Bokeh widgets work best when used in combination with the Bokeh server. However, the Bokeh server approach is beyond the scope of this courseware, since we would need to work with plain Python files and couldn't leverage the power of Jupyter Notebook. Instead, we will use a hybrid approach that only works with the older Jupyter Notebook.

4
Lesson Summary

Let us quickly recap our learning from this lesson.

5
Test your knowledge
30-Day Money-Back Guarantee

Includes

3 hours on-demand video
Full lifetime access
Access on mobile and TV
Certificate of Completion