Columbia Seaborn Tutorial
Created for: Research Data Services at Columbia University Libraries
Resources used to create tutorial:
  DataCamp’s Introductory Tutorial
  Pandey’s Visualization Examples
  Seaborn PyData Swarm Plots
  Seaborn PyData Heat Maps
  List of Colors in Python
In [3]: # import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn
%matplotlib inline
The seaborn library ships with many built-in example datasets; calling seaborn.get_dataset_names() lists them. We’ll start with the tips dataset.
In [4]: # load in data and save to a variable
df = seaborn.load_dataset("tips")
In [5]: # first five rows of dataset
df.head()
Out[5]:
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
In [6]: # last five rows of dataset
df.tail()
Out[6]:
     total_bill   tip     sex smoker   day    time  size
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2
Swarm Plots
In [28]: # color points by category
# create customized palette
gender_palette = ["#A833FF", "#FFAF33"]
seaborn.swarmplot(x="day", y="tip", hue="sex", palette=gender_palette, data=df)
plt.show()
In [41]: # control plot order on x-axis
seaborn.swarmplot(x="smoker", y="total_bill", data=df, palette="husl", order=["Yes", "No"])
plt.show()
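To see what the order argument does without downloading anything, here is a minimal sketch on a tiny hand-built frame. The toy data, the variable names, and the Agg backend line (used only so the sketch runs without a display) are assumptions for illustration and not part of the tips dataset.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn

# Hypothetical stand-in for a few rows of restaurant bills
toy = pd.DataFrame({
    "smoker": ["No", "No", "Yes", "Yes", "No", "Yes"],
    "total_bill": [16.99, 10.34, 27.18, 22.67, 17.82, 38.01],
})

# The list passed to `order` decides the left-to-right category order
category_order = ["Yes", "No"]
ax = seaborn.swarmplot(x="smoker", y="total_bill", data=toy, order=category_order)

# Render the figure so the tick labels are populated, then read them back
ax.figure.canvas.draw()
tick_labels = [label.get_text() for label in ax.get_xticklabels()]
print(tick_labels)
```

Swapping the two entries of category_order should flip the columns of points on the x-axis.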
Violin Plots
In [54]: # draw plot based on variable
seaborn.violinplot(x="sex", y="total_bill", data=df)
plt.show()
In [57]: # Split drawings to compare with hue/legend variables
seaborn.violinplot(x="time", y="tip", data=df, hue="sex", palette="dark", split=True)
plt.legend()
plt.show()
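The split violin idea can also be tried on a small hand-built frame. This is only a sketch: the toy values, the variable names, and the Agg backend line are assumptions made so the example runs without a display or a download.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn

# Hypothetical tips: three values per (time, sex) group
toy = pd.DataFrame({
    "time": ["Lunch"] * 6 + ["Dinner"] * 6,
    "sex": (["Male"] * 3 + ["Female"] * 3) * 2,
    "tip": [1.5, 2.0, 3.0, 1.0, 2.5, 3.5, 2.0, 3.0, 5.0, 1.75, 3.0, 4.0],
})

# With split=True each half of a violin shows one level of the hue variable
ax = seaborn.violinplot(x="time", y="tip", hue="sex", data=toy,
                        palette="dark", split=True)

# seaborn adds a legend for the hue variable; read its entries back
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```

With only a handful of points per group the density shapes are rough, so treat the picture as a layout check rather than a real comparison.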
Facet Grids
In [73]: # we can also change the type of plot
# ...and the colors around the points
fg = seaborn.FacetGrid(df, col="time", row="sex")
fg = fg.map(plt.scatter, "total_bill", "tip", color="floralwhite", edgecolor="hotpink")
In [84]: # plot by category
x = seaborn.FacetGrid(df, col="time", hue="sex")
x = x.map(plt.scatter, "total_bill", "tip")
x = x.add_legend()
Heat Maps
In [147]: # load in flights dataset
flights = seaborn.load_dataset("flights")
In [152]: # change the color map and annotate each cell with its value
x = seaborn.heatmap(flights, annot=True, fmt="d", cmap="YlGnBu")
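seaborn.heatmap expects a wide table (one row per month, one column per year), while the flights dataset loads in long form with year, month, and passengers columns. The cells shown here skip from In [147] to In [152], so a reshaping step is assumed to sit in between. A minimal sketch of that step on a small hand-built frame follows; the toy values, the names toy_long and toy_wide, and the Agg backend line are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn

# Hypothetical long-form data: one row per (year, month) pair
toy_long = pd.DataFrame({
    "year": [1949, 1949, 1949, 1950, 1950, 1950],
    "month": ["Jan", "Feb", "Mar", "Jan", "Feb", "Mar"],
    "passengers": [112, 118, 132, 115, 126, 141],
})

# An ordered categorical is used so months can sort by calendar, not alphabet
toy_long["month"] = pd.Categorical(
    toy_long["month"], categories=["Jan", "Feb", "Mar"], ordered=True
)

# Reshape to wide form: months as rows, years as columns
toy_wide = toy_long.pivot(index="month", columns="year", values="passengers")
print(toy_wide)

# Integer cells allow fmt="d"; annot=True writes each value into its cell
ax = seaborn.heatmap(toy_wide, annot=True, fmt="d", cmap="YlGnBu")
```

For the real dataset the same pivot call with index="month", columns="year", values="passengers" should produce the matrix that the heatmap cell expects.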
Now it’s time to start working with data of your own choice and produce the visualizations you like! You can use one of seaborn’s built-in datasets or load in your own. If you’d like to use your own .csv file, you can load it into a dataframe like this:
import pandas as pd
df = pd.read_csv("<filename>", sep=",")
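If you want to try read_csv before pointing it at a real file, the sketch below feeds it CSV text from memory. The CSV content and the names csv_text and own_df are made up for illustration; with a real file you would pass its path instead of the StringIO object.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk
csv_text = (
    "total_bill,tip,day\n"
    "16.99,1.01,Sun\n"
    "10.34,1.66,Sun\n"
    "21.01,3.50,Sat\n"
)

# read_csv accepts a path or any file-like object; sep="," is the default
own_df = pd.read_csv(io.StringIO(csv_text), sep=",")

print(own_df.shape)   # (3, 3)
print(own_df.dtypes)  # numeric columns are parsed as floats, text as objects/strings
```

Checking shape and dtypes right after loading is a quick way to confirm the separator and header row were read as intended.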