**Matplotlib** is one of the most commonly used Python packages for 2D graphics. It provides a quick way to visualize data from Python and to create publication-quality figures in a variety of formats. In this article, we are going to explore Matplotlib in interactive mode, covering 7 basic cases.

# Matplotlib, Pyplot and IPython Shell

Before we dive into the details of firing up Matplotlib and creating visualizations with it, there are a few useful things to note. Matplotlib is a multi-platform data visualization library built on NumPy arrays, which lets it work with the broader SciPy stack.

Another important feature is its ability to play well with many operating systems and graphics backends, which makes it cross-platform software. This *everything-to-everyone* approach is one of the great strengths of Matplotlib.
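
If you are curious which graphics backend Matplotlib picked on your machine, a minimal sketch like the one below should tell you. It assumes only that Matplotlib is installed; the variable name `backend_name` is ours.

```python
import matplotlib

# Name of the backend Matplotlib selected for this session.
# The value depends on your environment (notebook, desktop, headless server).
backend_name = matplotlib.get_backend()
print(backend_name)
```

The printed name will differ from system to system, which is exactly the point: the plotting code stays the same while the backend changes underneath.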

## Importing Matplotlib

You are encouraged to **follow along** with the tutorial and play around with Matplotlib, trying various things and making sure you're getting the hang of it. Let's get started!

Just as we use the np shorthand for NumPy and the pd shorthand for Pandas, we will use some standard shorthands for Matplotlib imports:

In [1]: import matplotlib as mpl

import matplotlib.pyplot as plt

## Pyplot

Pyplot, shortened above as plt, is a module within the matplotlib package that provides a convenient interface to Matplotlib's plotting classes and methods. The good news for Matlab users is that pyplot’s plotting commands and arguments closely resemble those of Matlab(TM), which gives them a head start.
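
To make the "convenient interface" idea concrete, here is a small sketch that draws the same curve twice: once through pyplot functions that act on an implicit current figure, and once through explicit figure and axes objects. The data points are made up for illustration.

```python
import matplotlib.pyplot as plt

# pyplot style: functions act on an implicit "current" figure and axes
plt.figure()
plt.plot([0, 1, 2], [0, 1, 4])
plt.title("pyplot style")

# object-oriented style: keep explicit handles to the figure and axes
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
ax.set_title("object-oriented style")

# plt.gca() returns the current axes, which is the one created last
same_axes = plt.gca() is ax
print(same_axes)
```

Both styles end up calling the same underlying classes, so you can mix them. Calling `plt.show()` afterwards displays both figures.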

## Plotting from an IPython shell

IPython is built to work well with Matplotlib if you specify Matplotlib mode. To enable this mode, you can use the %matplotlib magic command after starting ipython:

In [2]: %matplotlib inline

In an IPython notebook, you can also embed graphics directly in the page, with two possible options:

- %matplotlib notebook will lead to interactive plots embedded within the notebook
- %matplotlib inline will lead to static images of your plot embedded in the notebook

For all purposes and use cases of this notebook, we will opt for %matplotlib inline. What does it actually do? In layman’s terms, it renders the figure itself below the cell instead of printing only the text representation of the plot object. So next time your plot isn’t displayed, you’ll know why!

## Data types and Formats

In data visualization, there are three main types of variables: **Numerical**, **Categorical** and **Ordinal**. Matplotlib handles numeric data types such as int, float and uint8, and accepts input in these formats: Python list, NumPy array, Pandas Series and Pandas DataFrame.
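
As a quick illustration of these input formats, the sketch below plots the same numbers from a Python list, a float array and a uint8 array, and then draws a bar chart with string categories on the x axis. All values and category names are invented for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

values_list = [1, 4, 9, 16]                              # plain Python list of ints
values_float = np.array([1.0, 4.0, 9.0, 16.0])           # NumPy float64 array
values_uint8 = np.array([1, 4, 9, 16], dtype=np.uint8)   # NumPy uint8 array

fig, (ax_num, ax_cat) = plt.subplots(1, 2, figsize=(9, 4))

# Numerical data: all three containers draw the same curve
for values in (values_list, values_float, values_uint8):
    ax_num.plot(values)

# Categorical data: strings on the x axis become evenly spaced positions
ax_cat.bar(["low", "medium", "high"], [3, 7, 5])
```

The three lines in the left panel overlap exactly, which shows that the container type does not change the result.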

# First Look: Line Chart

Creating plots with Matplotlib can be easily accomplished with just a few lines of code. As our first choice, we will create sine and cosine waves with a **line chart**. Line charts in general are also a good choice for showing trends.

First things first, let’s import the NumPy library and create an array of data points. Next we start a figure by calling figure() from the pyplot module. Here is your chance to create your model, function or graph as you wish. Tip: Explore the official Matplotlib documentation for in-depth understanding! With this done, you are ready to display the plot.

In [3]: import numpy as np

x = np.linspace(-np.pi, np.pi, 256, endpoint=True)

# Start your figure

plt.figure()

S,C = np.sin(x), np.cos(x)

# Plot sine curve with a solid - line

plt.plot(x, S, '-')

# Plot cosine curve with a dashed -- line

plt.plot(x, C, '--')

# Display plot and show result on screen.

# It's a common practice that will be carried throughout

plt.show()

Let’s move on to setting the properties of the plot explicitly so that we can customize its appearance to suit our needs. Every setting below is written out in full, so you can interactively change the values to explore their effect.

In [4]: # Create a new figure of size 10x6 inches, using 80 dots per inch

fig = plt.figure(figsize=(10,6), dpi=80)

# Plot cosine using blue color with a dashed line of width 2.5 (points)

plt.plot(x, C, color="blue", linewidth=2.5, linestyle="--", label= "cosine")

# Plot sine using green color with a continuous line of width 2.5 (points)

plt.plot(x, S, color="green", linewidth=2.5, linestyle="-", label= "sine")

# Set x limits

plt.xlim(-4.0,4.0)

# Set x ticks

plt.xticks(np.linspace(-4,4,9,endpoint=True))

# Set y limits

plt.ylim(-1.0,1.0)

# Set y ticks

plt.yticks(np.linspace(-1,1,5,endpoint=True))

#Adding legends, title and axis names

plt.legend(loc='upper left', frameon=False)

plt.title("Graph of wave movement with Sine and Cosine functions")

plt.xlabel("Time, t")

plt.ylabel("Position, x")

#Turning on grid

plt.grid(color='b', linestyle='-', linewidth=0.1)

# Moving the spines to the center of the plot

ax = plt.gca()

ax.spines['right'].set_color('none')

ax.spines['top'].set_color('none')

ax.xaxis.set_ticks_position('bottom')

ax.xaxis.set_label_coords(1,0)

ax.spines['bottom'].set_position(('data',0))

ax.yaxis.set_ticks_position('left')

ax.yaxis.set_label_coords(0,1)

ax.spines['left'].set_position(('data',0))

plt.show()

Well, there you have it!

If you would like to save the figure instead of seeing its output in the interactive notebook, you can use the savefig() command.

In [5]: fig.savefig('my_figure.png')

There are multiple formats we can save this image in.

In [6]: fig.canvas.get_supported_filetypes()

Out[6]: {'eps': 'Encapsulated Postscript',

'jpeg': 'Joint Photographic Experts Group',

'jpg': 'Joint Photographic Experts Group',

'pdf': 'Portable Document Format',

'pgf': 'PGF code for LaTeX',

'png': 'Portable Network Graphics',

'ps': 'Postscript',

'raw': 'Raw RGBA bitmap',

'rgba': 'Raw RGBA bitmap',

'svg': 'Scalable Vector Graphics',

'svgz': 'Scalable Vector Graphics',

'tif': 'Tagged Image File Format',

'tiff': 'Tagged Image File Format'}
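
The sketch below shows how the format can be chosen explicitly when saving. It writes into in-memory buffers so that nothing is left on disk; passing a filename such as 'my_figure.png' instead works the same way. The figure content and buffer names are placeholders.

```python
import io
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# Formats the active backend can write, e.g. 'png', 'pdf', 'svg'
supported = fig.canvas.get_supported_filetypes()

# Write a PNG at 150 dpi into memory, trimming surplus whitespace
png_buffer = io.BytesIO()
fig.savefig(png_buffer, format="png", dpi=150, bbox_inches="tight")

# Write the same figure as SVG, a vector format
svg_buffer = io.BytesIO()
fig.savefig(svg_buffer, format="svg")
```

When you pass a filename, Matplotlib infers the format from the extension, so the format argument is only needed when there is no extension to go by.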

# Types of plots

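For your reference, here are the kinds of plots you can ask for by name (more on this below). These names are the values accepted by the kind argument of the pandas plot() method, which draws with Matplotlib: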

- ‘bar’ or ‘barh’ for bar charts
- ‘hist’ for histograms
- ‘box’ for boxplots
- ‘kde’ or ‘density’ for density plots
- ‘area’ for area plots
- ‘scatter’ for scatter plots
- ‘hexbin’ for hexagonal bin plots
- ‘pie’ for pie charts
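
These kind names match the `kind` argument of pandas' `DataFrame.plot`, which draws through Matplotlib. The sketch below assumes pandas is installed and uses a small made-up table; the column names are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data, invented for this illustration
df = pd.DataFrame({"men": [20, 35, 30], "women": [25, 32, 34]})

# Each call returns the Matplotlib Axes that pandas drew on
ax_bar = df.plot(kind="bar")
ax_area = df.plot(kind="area")
```

Because the return value is an ordinary Matplotlib Axes, everything shown elsewhere in this article (titles, ticks, legends) can be applied to it afterwards.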

# Bar Chart

A bar chart is a good choice when you want to show how some quantity varies among a discrete set of items. Let’s create a bar chart from the data set below.

In [7]: # Setting figure size to 7x5

fig = plt.figure(figsize=(7,5))

# Setting data set

menMeans = (20, 35, 30, 35, 27)

menStd = (2, 3, 4, 1, 2)

# Setting index

ind = np.arange(5)

# Setting argument for width

width = 0.35

# Plotting a horizontal bar graph for menMeans against index

#with errorbars equal to men standard deviation

p1 = plt.barh(ind, menMeans, width, xerr=menStd)

In [8]: # Setting figure size to 7x5

fig = plt.figure(figsize=(7,5))

# Setting data set values

womenMeans = (25, 32, 34, 20, 25)

womenStd = (3, 5, 2, 3, 3)

# Plotting a vertical stacked bar graph with women's data on top and men's data at the bottom.

p1 = plt.bar(ind, menMeans, width, yerr=menStd)

p2 = plt.bar(ind, womenMeans, width, bottom=menMeans, yerr=womenStd)

plt.show()
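
Stacking is one option; placing the bars side by side is another. The sketch below reuses the same means and shifts each series by half a bar width. The group names G1 to G5 are hypothetical, and the variable names follow Python naming conventions rather than the ones used above.

```python
import numpy as np
import matplotlib.pyplot as plt

men_means = (20, 35, 30, 35, 27)
women_means = (25, 32, 34, 20, 25)
ind = np.arange(5)   # one position per group
width = 0.35         # width of a single bar

fig, ax = plt.subplots(figsize=(7, 5))

# Shift each series by half a bar width so the pairs sit side by side
bars_men = ax.bar(ind - width / 2, men_means, width, label="Men")
bars_women = ax.bar(ind + width / 2, women_means, width, label="Women")

ax.set_xticks(ind)
ax.set_xticklabels(["G1", "G2", "G3", "G4", "G5"])  # hypothetical group names
ax.legend()
```

Grouped bars make it easier to compare the two series within a group, while stacked bars emphasize the combined total.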

# Histogram

A histogram is a plot type used to show the frequency distribution of a continuous or discrete variable. Let's have a look.

In [9]: import numpy as np

# Generating 3 different arrays

x = np.random.normal(0, 0.8, 1000)

y = np.random.normal(-2, 1, 1000)

z = np.random.normal(3, 2, 1000)

# Setting figure size to 9x6

fig = plt.figure(figsize=(9, 6))

# Configuring keyword arguments to customize histogram.

# Alpha adjusts translucency while bins define spacing.

#More features available in the documentation.

# density=True replaces the normed=True argument of older Matplotlib releases

kwargs = dict(histtype='stepfilled', alpha=0.9, density=True, bins=40)

# Plotting all 3 arrays on one graph

plt.hist([x, y, z], **kwargs)

plt.show()

In [10]: # Generating a 2-dimensional NumPy array with 1000 rows and 3 columns

X = 200 + 25*np.random.randn(1000,3)

# Setting figure size to 9x6

fig = plt.figure( figsize=(9, 6))

# Plotting histogram from 3 stacked arrays after normalizing data

n, bins, patches = plt.hist(X, 30, alpha=0.9, stacked=True, density=True)

plt.show()
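
What does normalizing a histogram actually change? With density normalization (the `density=True` option in current Matplotlib and NumPy releases), bar heights are rescaled so that the total area of the bars equals 1. The sketch below checks this with `np.histogram`; the seed and variable names are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded only to make the run repeatable
samples = rng.normal(0, 0.8, 1000)

# Raw counts per bin, then the same bins with density normalization
counts, edges = np.histogram(samples, bins=40)
density, _ = np.histogram(samples, bins=40, density=True)

bin_widths = np.diff(edges)
total_area = np.sum(density * bin_widths)   # close to 1 up to rounding
print(counts.sum(), total_area)
```

Each density value is the bin count divided by the number of samples times the bin width, which is why histograms of differently sized samples become comparable.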

# Scatter Plot

A Scatter plot is the right choice for visualizing the relationship between two paired sets of data.

In [11]: N = 100

# Generating 2 different arrays

x = np.random.rand(N)

y = np.random.rand(N)

fig = plt.figure( figsize=(9, 6))

# Plotting a scatter graph at the given x-y coordinates

plt.scatter(x,y)

plt.show()

In [12]: N = 100

# Generating 2 different arrays

x = np.random.rand(N)

y = np.random.rand(N)

fig = plt.figure( figsize=(9, 6))

# Assigning random colors and variable sizes to the bubbles

colors = np.random.rand(N)

area = np.pi * (20 * np.random.rand(N))**2 # 0 to 20 point radii

# Plotting a scatter plot on x-y coordinate with the assigned size and color

plt.scatter(x, y, s=area, c=colors, alpha=0.7)

plt.show()
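
When colors encode a value, a colorbar tells the reader how to decode them. The following sketch repeats the bubble plot with explicit axes and adds one. The seed, the "viridis" colormap and the label text are assumptions made for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)   # seeded only for repeatability
N = 100
x = rng.random(N)
y = rng.random(N)
colors = rng.random(N)                      # one value per point, mapped to a colormap
area = np.pi * (20 * rng.random(N)) ** 2    # marker area in points squared

fig, ax = plt.subplots(figsize=(9, 6))
points = ax.scatter(x, y, s=area, c=colors, alpha=0.7, cmap="viridis")

# The colorbar shows which color corresponds to which value in `colors`
cbar = fig.colorbar(points, ax=ax)
cbar.set_label("value of colors")
```

Note that `s` is an area, not a radius, so doubling the radius of a bubble means multiplying `s` by four.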

# Box and Whisker Plot

A box plot is an easy and effective way to read descriptive statistics. It summarizes the distribution of the data by displaying the minimum, first quartile, median, third quartile, and maximum in a single graph.

In [13]: np.random.seed(10)

# Generating 4 different arrays and combining them in a list called data_to_plot

u = np.random.normal(100, 10, 200)

v = np.random.normal(80, 30, 200)

w = np.random.normal(90, 20, 200)

x = np.random.normal(70, 25, 200)

data_to_plot = [u, v, w, x]

fig = plt.figure(figsize=(9, 6))

# Plotting a box plot that shows the median, quartiles and range of each array.

# Add patch_artist=True option to ax.boxplot() to get fill color

bp = plt.boxplot(data_to_plot, patch_artist=True, labels = list('ABCD'))

# change outline color, fill color and linewidth of the boxes

for box in bp['boxes']:

    # change outline color

    box.set(color='#7570b3', linewidth=2)

    # change fill color

    box.set(facecolor='#1b9e77')

# change color and linewidth of the whiskers

for whisker in bp['whiskers']:

    whisker.set(color='#7570b3', linewidth=2)

# change color and linewidth of the caps

for cap in bp['caps']:

    cap.set(color='#7570b3', linewidth=2)

# change color and linewidth of the medians

for median in bp['medians']:

    median.set(color='#b2df8a', linewidth=2)

# change the style of fliers and their fill

for flier in bp['fliers']:

    flier.set(marker='o', color='#e7298a', alpha=0.5)

plt.show()
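
The numbers behind a box can be computed by hand, which helps when reading the plot. The sketch below does so for one normally distributed sample and cross-checks the result against `matplotlib.cbook.boxplot_stats`. It assumes Matplotlib's default whisker rule of 1.5 times the interquartile range; the seed and variable names are ours.

```python
import numpy as np
from matplotlib import cbook

rng = np.random.default_rng(10)      # seeded only for repeatability
u = rng.normal(100, 10, 200)

# The box spans the first to third quartile, with a line at the median
q1, median, q3 = np.percentile(u, [25, 50, 75])
iqr = q3 - q1

# With the default rule, whiskers reach the most extreme data points
# within 1.5 * IQR of the box; anything beyond is drawn as a flier
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
whisker_low = u[u >= lower_limit].min()
whisker_high = u[u <= upper_limit].max()
fliers = u[(u < lower_limit) | (u > upper_limit)]

# Cross-check against the statistics Matplotlib computes for a box plot
stats = cbook.boxplot_stats(u)[0]
print(np.isclose(stats["q1"], q1), np.isclose(stats["whishi"], whisker_high))
```

So with default settings the whisker ends are the minimum and maximum only when the data has no fliers; otherwise the extremes appear as individual flier markers.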

# Area Plot

Area charts are used to represent cumulative totals using numbers or percentages over time. Since these plots are stacked by default, each column needs to contain either all positive or all negative values.

In [14]: x=range(1,6)

# Setting the values as a list of lists

y=[ [1,4,6,8,9], [2,2,7,10,12], [2,8,5,10,6], [1,5,2,5,2] ]

# Setting figure size to 9x6 with dpi of 80

fig = plt.figure(figsize=(9,6), dpi=80)

# Plotting a stacked area plot

plt.stackplot(x,y, labels=['A','B','C','D'], alpha= 0.8)

# Setting location of legend

plt.legend(loc='upper left')
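
To see what "stacked" means numerically, the sketch below computes the upper edge of every layer with a cumulative sum, which is what a stacked area plot with a zero baseline draws. The array repeats the values used above.

```python
import numpy as np

y = np.array([
    [1, 4, 6, 8, 9],      # series A
    [2, 2, 7, 10, 12],    # series B
    [2, 8, 5, 10, 6],     # series C
    [1, 5, 2, 5, 2],      # series D
])

# Row k holds the upper edge of layer k: the sum of series 0..k at each x
upper_edges = np.cumsum(y, axis=0)
print(upper_edges[-1])   # top of the stack: [ 6 19 20 33 29]
```

The last row is the total across all series at each x position, which is the outline of the whole chart.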

# Pie Chart

Pie charts show percentages or proportions of data. The percentage represented by each category is displayed with its corresponding slice of the pie. In Matplotlib, the slices are ordered and plotted counter-clockwise, as shown:

In [15]: # Setting keyword arguments

labels = 'Kenya', 'Tanzania', 'Uganda', 'Rwanda', 'Burundi'

sizes = [35, 30, 20, 10 ,5]

explode = (0, 0.1, 0, 0, 0) # only "explode" the 2nd slice (i.e. 'Tanzania')

fig = plt.figure( figsize=(9, 6))

# Plotting pie chart with the above set arguments

plt.pie(sizes, explode=explode, labels=labels, autopct='%1.1f%%', shadow=True, startangle=90)

plt.axis('equal') # Equal aspect ratio ensures that pie is drawn as a circle.

plt.show()
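
The percentages printed on the slices come from dividing each value by the total and applying the format string passed as autopct. The short sketch below reproduces that arithmetic by hand, assuming the default behavior where the values are scaled by their sum.

```python
labels = ["Kenya", "Tanzania", "Uganda", "Rwanda", "Burundi"]
sizes = [35, 30, 20, 10, 5]

total = sum(sizes)

# Same format string as the autopct argument: one decimal place plus a % sign
formatted = ["%1.1f%%" % (100 * size / total) for size in sizes]

for label, text in zip(labels, formatted):
    print(label, text)
```

Here the sizes already add up to 100, so the percentages equal the raw values; with other totals the same division applies.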

# For Further Exploration

- **Seaborn** is built on top of matplotlib and allows you to easily produce prettier (and more complex) visualizations.
- **D3.js** is a JavaScript library for producing sophisticated interactive visualizations for the web. Although it is not in Python, it is both trendy and widely used.
- **Bokeh** is a newer library that brings D3-style visualizations into Python.
- **ggplot** is a Python port of the popular R library ggplot2, which is widely used for creating “publication quality” charts and graphics. It’s probably most interesting if you’re already an avid ggplot2 user, and possibly a little opaque if you’re not.
- For more on Matplotlib: pyplot (Matplotlib documentation)

Before wrapping up, I'll leave you to ponder this quote from Antoine de Saint-Exupéry: "*Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away*".