Columbia Seaborn Tutorial

This document provides a tutorial on data visualization using the Seaborn library in Python. It introduces Seaborn and loads a sample tips dataset. It then demonstrates various plot types in Seaborn like swarm plots, violin plots, facet grids, and heat maps to visualize the tips data in different ways. It encourages readers to explore visualizing their own data using Seaborn.

Uploaded by

Patri Zio

data_visualization_using_seaborn

April 22, 2018

0.0.1 Data Visualization using Seaborn (a Python library)


Tutorial by: Navie Narula, Digital Centers Teaching Intern

Created for: Research Data Services at Columbia University Libraries

Resources used to create tutorial:
DataCamp’s Introductory Tutorial
Pandey’s Visualization Examples
Seaborn PyData Swarm Plots
Seaborn PyData Heat Maps
List of Colors in Python
In [3]: # import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn

%matplotlib inline
The seaborn library comes with several sample datasets, which load_dataset downloads from the
seaborn-data repository on GitHub (an internet connection is needed the first time). We’ll start
with the tips dataset.
In [4]: # load in data and save to a variable
df = seaborn.load_dataset("tips")
In [5]: # first five rows of dataset
df.head()
Out[5]:    total_bill   tip     sex smoker  day    time  size
        0       16.99  1.01  Female     No  Sun  Dinner     2
        1       10.34  1.66    Male     No  Sun  Dinner     3
        2       21.01  3.50    Male     No  Sun  Dinner     3
        3       23.68  3.31    Male     No  Sun  Dinner     2
        4       24.59  3.61  Female     No  Sun  Dinner     4
In [6]: # last five rows of dataset
df.tail()
Out[6]:      total_bill   tip     sex smoker   day    time  size
        239       29.03  5.92    Male     No   Sat  Dinner     3
        240       27.18  2.00  Female    Yes   Sat  Dinner     2
        241       22.67  2.00    Male    Yes   Sat  Dinner     2
        242       17.82  1.75    Male     No   Sat  Dinner     2
        243       18.78  3.00  Female     No  Thur  Dinner     2
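
Beyond head() and tail(), it often helps to check column types and summary statistics before plotting. The sketch below rebuilds the first five rows shown above by hand so that it runs without downloading anything; the name `sample` is only illustrative, and with the full dataset you would call the same methods on `df`.

```python
import pandas as pd

# the first five rows of the tips dataset, copied from the output above
sample = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
    "smoker": ["No", "No", "No", "No", "No"],
    "day": ["Sun", "Sun", "Sun", "Sun", "Sun"],
    "time": ["Dinner"] * 5,
    "size": [2, 3, 3, 2, 4],
})

# number of rows and columns
print(sample.shape)

# data type of each column
print(sample.dtypes)

# count, mean, min, max, etc. for the numeric columns
print(sample.describe())

# how many observations fall in each category
print(sample["sex"].value_counts())
```

Knowing which columns are numeric and which are categorical makes it easier to decide what goes on each axis in the plots that follow.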

Swarm Plots

In [12]: # use swarmplot to visualize tip observations and amounts
# by day of the week
seaborn.set_style("whitegrid")  # set the style first so it applies to this plot
seaborn.swarmplot(x="day", y="tip", data=df)
plt.show()
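
Plotting functions such as swarmplot return the matplotlib Axes they draw on, which can be used to add a title and axis labels. This is a minimal sketch on a small made-up table (the `small` DataFrame and its values are illustrative, not taken from the tips data):

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn

# made-up values, only to keep the example self-contained
small = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Sat", "Sat", "Sun"],
    "tip": [2.0, 3.0, 1.5, 2.5, 4.0, 3.5],
})

# swarmplot returns the Axes object it drew on
ax = seaborn.swarmplot(x="day", y="tip", data=small)

# label the plot using ordinary matplotlib methods
ax.set_title("Tips by day of the week")
ax.set_xlabel("Day")
ax.set_ylabel("Tip amount")

# in a script, call plt.show() here to display the figure
```

The same calls should work on the full dataset by passing `data=df` instead of `data=small`.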

In [16]: # visualize tip observations
seaborn.set_style("darkgrid")  # set the style before drawing
seaborn.swarmplot(x=df["tip"])
plt.show()

In [28]: # color points by category
# create customized palette
gender_palette = ["#A833FF", "#FFAF33"]
seaborn.swarmplot(x="day", y="tip", hue="sex", palette=gender_palette, data=df)
plt.show()

In [41]: # control plot order on x-axis
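# the end of this call was cut off in the source; it is completed here
# from the two smoker values (Yes/No) shown in the data
seaborn.swarmplot(x="smoker", y="total_bill", data=df, palette="husl",
                  order=["Yes", "No"])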
plt.show()
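
Instead of typing the order by hand, the list can be derived from the data, for example by sorting the categories by their median value. This is one possible approach rather than the tutorial's own method; the `bills` table and the name `plot_order` are made up for illustration.

```python
import pandas as pd

# made-up values, only to keep the example self-contained
bills = pd.DataFrame({
    "smoker": ["Yes", "No", "No", "Yes", "No", "Yes"],
    "total_bill": [30.0, 12.0, 18.0, 26.0, 15.0, 40.0],
})

# median total bill for each category
medians = bills.groupby("smoker", observed=True)["total_bill"].median()

# categories sorted from highest to lowest median
plot_order = medians.sort_values(ascending=False).index.tolist()
print(plot_order)  # ['Yes', 'No']
```

The resulting list can then be passed as `order=plot_order` to swarmplot.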

Violin Plots

In [50]: # plot tips
seaborn.violinplot(x=df["tip"], color="gold")
plt.show()

In [54]: # draw plot based on variable
seaborn.violinplot(x="sex", y="total_bill", data=df)
plt.show()

In [57]: # Split drawings to compare with hue/legend variables
# the closing parenthesis was cut off in the source and is restored here
seaborn.violinplot(x="time", y="tip", data=df, hue="sex", palette="dark",
                   split=True)
plt.legend()
plt.show()

Facet Grids

In [72]: # draw facet grid based on tip variable
fg = seaborn.FacetGrid(df, col="time", row="sex")
fg = fg.map(plt.hist, "tip", color="tomato")
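
A facet grid draws one panel for each combination of the row and column variables, and each panel only sees the matching subset of rows. The pandas sketch below imitates that split on a small made-up table (the values are illustrative) to show which observations would land in each panel:

```python
import pandas as pd

# made-up values, only to keep the example self-contained
sample = pd.DataFrame({
    "time": ["Lunch", "Lunch", "Dinner", "Dinner", "Dinner", "Lunch"],
    "sex": ["Male", "Female", "Male", "Male", "Female", "Female"],
    "tip": [2.0, 1.5, 3.0, 4.0, 2.5, 1.0],
})

# number of observations that would fall in each (row, column) panel
panel_counts = sample.groupby(["sex", "time"]).size()
print(panel_counts)

# the tip values each panel's histogram would be built from
for (sex, time), subset in sample.groupby(["sex", "time"]):
    print(sex, time, subset["tip"].tolist())
```

Checking these counts first can reveal panels with very few observations, where a histogram is not very informative.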

In [73]: # we can also change the type of plot
# ...and the colors around the points
fg = seaborn.FacetGrid(df, col="time", row="sex")
fg = fg.map(plt.scatter, "total_bill", "tip", color="floralwhite", edgecolor="hotpink")

In [84]: # plot by category
x = seaborn.FacetGrid(df, col="time", hue="sex")
x = x.map(plt.scatter, "total_bill", "tip")
x = x.add_legend()

Heat Maps

In [146]: # create random data
uniform_data = np.random.rand(5, 3)  # five rows, three columns
print(uniform_data)
seaborn.heatmap(uniform_data)
plt.show()

[[0.39376482 0.61566449 0.94105178]
 [0.57360765 0.66858876 0.03326495]
 [0.35962929 0.46553437 0.28784689]
 [0.32919801 0.02822342 0.38018925]
 [0.69303348 0.559752   0.61115946]]
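
Because the data are random, the numbers and the heat map change on every run. One way to get the same picture each time is to use a seeded random generator, as sketched below; the seed value 42 is arbitrary, and fixing the color scale with vmin and vmax is an optional extra rather than part of the original example.

```python
import numpy as np
import seaborn

# a seeded generator produces the same "random" numbers on every run
rng = np.random.default_rng(seed=42)
uniform_data = rng.random((5, 3))  # five rows, three columns, values in [0, 1)

# vmin/vmax pin the color scale to the full 0-1 range;
# annot/fmt write each value (two decimals) inside its cell
ax = seaborn.heatmap(uniform_data, vmin=0, vmax=1, annot=True, fmt=".2f")
```

With the color scale pinned, two heat maps drawn from different random samples can be compared cell by cell.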

In [147]: # load in flights dataset
flights = seaborn.load_dataset("flights")

In [148]: # print first five rows
flights.head()

Out[148]:    year     month  passengers
          0  1949   January         112
          1  1949  February         118
          2  1949     March         132
          3  1949     April         129
          4  1949       May         121

In [149]: # print last five rows
flights.tail()

Out[149]:      year      month  passengers
          139  1960     August         606
          140  1960  September         508
          141  1960    October         461
          142  1960   November         390
          143  1960   December         432

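In [150]: # reshape to a month-by-year table
# (keyword arguments are required in pandas 2.0 and later)
flights = flights.pivot(index="month", columns="year", values="passengers")

# draw border
x = seaborn.heatmap(flights, linewidths=0.3)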
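
The pivot step turns the long table (one row per year and month) into a grid with one row per month and one column per year, which is the shape heatmap expects. A small sketch with made-up numbers (the years and passenger counts below are illustrative, not from the flights data) shows the reshaping on its own:

```python
import pandas as pd

# made-up long-format table: one row per (year, month) pair
long_form = pd.DataFrame({
    "year": [2001, 2001, 2002, 2002],
    "month": ["January", "February", "January", "February"],
    "passengers": [10, 20, 30, 40],
})

# rows become months, columns become years, cells hold passenger counts
wide_form = long_form.pivot(index="month", columns="year", values="passengers")
print(wide_form)
```

Note that with plain strings, as in this sketch, the months are sorted alphabetically in the resulting index.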

In [152]: # change color and add value
x = seaborn.heatmap(flights, annot=True, fmt="d", cmap="YlGnBu")

Now it’s time for you to start working with data of your own choice and produce the
visualizations you like! You can use one of seaborn’s in-house datasets or load in your own. If
you’d like to use your own .csv file, you can load it into a dataframe by doing something like this:

import pandas as pd
df = pd.read_csv("<filename>", sep=",")
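
To try read_csv without a file on disk, the same call can read from an in-memory string. In this sketch the CSV text and the name `my_df` are illustrative; the three rows reuse values from the tips output earlier in the tutorial.

```python
import io

import pandas as pd

# a tiny CSV held in a string instead of a file
csv_text = """total_bill,tip,day
16.99,1.01,Sun
10.34,1.66,Sun
21.01,3.50,Sun
"""

# io.StringIO makes the string behave like an open file
my_df = pd.read_csv(io.StringIO(csv_text), sep=",")

# check that the columns and types came through as expected
print(my_df.head())
print(my_df.dtypes)
```

Once the dataframe looks right, it can be passed as the `data` argument to any of the plotting functions used above.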
