03 Temporal, Geospatial Multivariate Data

This document discusses different types of data that can be visualized, including one-dimensional, two-dimensional, three-dimensional, temporal, multivariate, tree, and network data. It then describes some basic visualization techniques like bar charts, histograms, box plots, 2D and 3D scatter plots, line graphs, and scatter plot matrices that can be used to visualize different aspects of the iris data set.

Uploaded by Ng Yiu Fai
Copyright © All Rights Reserved

Temporal, Geospatial & Multivariate Data

COMP7507
Visualization & Visual Analytics
Types of Data
• One-dimensional: linear data
– e.g., age distribution
– includes sequential data such as text and program source code
• Two-dimensional: planar or map data
– e.g., geographical maps, floorplans, newspaper layouts
• Three-dimensional: real-world objects
– e.g., medical scans, architectural design
• Temporal
– e.g., timelines, weather
Types of Data
• Multidimensional or Multivariate
– e.g., financial data, customer behaviour
• Tree (hierarchical data)
– e.g., file structure, evolution
• Network (relational data)
– e.g., social network, air traffic
• The above categories are not mutually exclusive
– e.g., how about air traffic data?
– multivariate, geographical, network, and temporal
Shneiderman, B., "The eyes have it: a task by data type taxonomy for information visualizations," Proc. IEEE Symposium on Visual Languages, 1996, pp. 336-343.
The Iris Sample Data Set
• Created by R.A. Fisher
• Possibly the best known data set in the pattern recognition community
• 3 classes (types of iris)
• 50 objects in each class
• 5 attributes
– sepal length & width (cm)
– petal length & width (cm)
– class (Iris Setosa, Iris Versicolour, Iris Virginica)
[Figure: Iris Setosa, Iris Versicolour and Iris Virginica] [wikipedia]
Some Basic Plots
Bar Charts / Histograms
• To show distribution of values of a single variable
• Values are divided into bins
• A bar plot is used so that the height of each bar
indicates the number of objects in each bin
• Shape of histogram depends on the number of bins
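The binning step above can be sketched in a few lines. This is only an illustration: the helper name `histogram` and the sample widths are made up for the example (they are not the real Iris measurements), and the bins are assumed to be of equal width.

```python
def histogram(values, n_bins, lo, hi):
    """Count how many values fall into each of n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # A value on the top edge is counted in the last bin.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts

# Made-up widths in cm, for illustration only (not the real Iris data).
widths = [0.2, 0.2, 0.4, 1.0, 1.3, 1.3, 1.5, 1.8, 2.1, 2.5]
coarse = histogram(widths, 2, 0.0, 2.5)
fine = histogram(widths, 5, 0.0, 2.5)
```

With 2 bins the sample looks like a single rising step, while 5 bins reveal an empty bin between two groups, which is the sense in which the shape depends on the number of bins.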
[Figure: histograms of the Iris data's petal width, with 10 bins and with 20 bins]
Box Plots
• Conventional histograms / distribution plots might not be space-efficient.

https://machinelearningmastery.com/data-visualization-in-r/
Box Plots
• An efficient way to show the quantitative distribution of 1D data
[Figure: box plot along a value axis, annotated with:]
– outliers (shown individually)
– upper whisker: Q3 + 1.5 * IQR (or the highest datum smaller than this), where IQR = Q3 - Q1
– Q3: 75th percentile
– Q2: 50th percentile
– Q1: 25th percentile (25% of the data samples have a value below Q1)
– lower whisker: Q1 - 1.5 * IQR (or the lowest datum greater than this)
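As a rough sketch of how the quantities drawn in a box plot could be computed: the function names below are hypothetical, the sample values are made up, and the quartiles use linear interpolation, which is only one of several conventions in use.

```python
def quantile(sorted_vals, p):
    """Linearly interpolated quantile (one common convention among several)."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def box_plot_stats(values):
    data = sorted(values)
    q1, q2, q3 = (quantile(data, p) for p in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if lo_fence <= v <= hi_fence]
    return {
        "q1": q1, "q2": q2, "q3": q3,
        "whisker_low": inside[0],    # lowest datum not below Q1 - 1.5 * IQR
        "whisker_high": inside[-1],  # highest datum not above Q3 + 1.5 * IQR
        "outliers": [v for v in data if v < lo_fence or v > hi_fence],
    }

# Made-up sample with one extreme value.
stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
```

Here the value 50 lies beyond Q3 + 1.5 * IQR, so it is reported as an outlier and the upper whisker stops at 9.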
Box Plots
[Figure: box plots of right-skewed, left-skewed and symmetric (normal) distributions, each marked with Q1, Q2 and Q3]

https://towardsdatascience.com/understanding-boxplots-5e2df7bcbd51
Box Plots
With a box plot, outliers can be highlighted and further examined.
[Figure: box plots of sepal length, sepal width, petal length and petal width; value axis in cm]
2D Bar Charts
• To show the joint distribution of the values of two variables
[Figure: 3D bar chart of frequency against petal length and petal width]
• The 3D effect is not good at showing exact values, but the correlation can be seen clearly
Line Graphs
• Points connected by lines to show how something changes in value (usually over time)
[Figure: line graph of GDP growth (annual %)]
Issues:
• occlusion when there are too many lines
• when to use piecewise or smooth lines
Scatter Plots
• A plot of points showing the relationship between two variables of a set of data
• Point position determined by attribute values
• Additional attributes can be marked by size, shape, or color for each item
[Figure: scatter plot of petal width against petal length, with Iris Setosa, Iris Versicolour and Iris Virginica marked separately]
Scatter Plots
• 2D scatter plots are commonly used, and there are also 3D scatter plots
• Can compare only two attributes at a time. What if we have a lot of pairwise comparisons to show?
– have an array (a matrix) of scatter plots
Scatter Plot Matrices
[Figure: scatter plot matrix of sepal length, sepal width, petal length and petal width]
• Diagonal plots show the distribution of the 1D data
• Can also make use of the diagonal space for other kinds of 1D visualization, e.g., a histogram
• A matrix plot can be used not only for scatter plots, but for any kind of bivariate plot
Contour Plots
• To show continuous attributes measured on a spatial grid
• Partition the space into regions of similar values; the boundaries of the regions are contour lines, also called iso-value lines or isolines.

[Math, NYU]
Contour Plots
• Commonly used in scientific visualization
• Examples: height fields, temperature, rainfall, etc.

[3DFieldPro] [OriginLab]
Temporal Data
Time-Series Data
• Set of values that change over time
• Examples:
– Finance (stock prices, exchange rates)
– Science (temperatures, pollution levels, electric potentials)
– Public policy (crime rates, public health)
• Common requirements:
– Able to compare many time series simultaneously
– Able to use different visualizations in combination
Index Charts
• Interactive line chart showing % change based
on a selected index point
• Useful for showing relative changes
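The re-indexing behind an index chart is a simple normalization. A minimal sketch, with a hypothetical function name and made-up prices:

```python
def index_series(prices, index_point):
    """Percentage change of each price relative to the price at index_point."""
    base = prices[index_point]
    return [(p - base) / base * 100.0 for p in prices]

prices = [50.0, 100.0, 150.0, 75.0]  # hypothetical prices, for illustration only
relative = index_series(prices, 1)   # select the second time step as the index point
```

In an interactive chart, this computation would be repeated for every series each time the selected index point changes.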

[Figure: percentage change of selected stock prices according to the day of purchase] [Heer et al., 2010]
Stacked Graphs
[Figure: stacked graphs with a centered baseline and with a zero baseline. Total counts of unemployed US workers per industry, 2000-2010.] [Heer et al., 2010]
Stacked Graphs
• Stack area charts on top of each other
• Useful for showing summation of time-series values (aggregation)
• Limitations:
– negative numbers not supported
– difficult to interpret trends accurately
– meaningless for some kinds of data (e.g., temperatures)
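Stacking amounts to a running sum per time step. The sketch below uses a hypothetical helper and made-up counts; the centered variant simply starts each column at minus half of its total, which is one plausible way to center the stack.

```python
def stack_layers(series, centered=False):
    """Return (lower, upper) boundary lists for each series, stacked in order."""
    n = len(series[0])
    totals = [sum(s[t] for s in series) for t in range(n)]
    # A zero baseline starts at 0; a centered baseline starts at -total/2.
    lower = [-tot / 2 if centered else 0.0 for tot in totals]
    layers = []
    for s in series:
        upper = [lo + v for lo, v in zip(lower, s)]
        layers.append((lower, upper))
        lower = upper
    return layers

series = [[1.0, 2.0, 3.0], [2.0, 2.0, 1.0]]  # hypothetical counts
zero = stack_layers(series)
centered = stack_layers(series, centered=True)
```

Only the bottom layer sits on a flat baseline, which is why the trends of the upper layers are harder to read accurately.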
Horizon Graphs

[Figure: horizon graph of the US unemployment rate, 2000-2010] [Heer et al., 2010]
Positive values: above average unemployment
Negative values: below average unemployment

[ http://homes.cs.washington.edu/~jheer//files/zoo/ex/time/horizon.html ]
Horizon Graphs
• Divide the area plot into horizontal bands and layer them over each other.
• Useful for increasing the data density (i.e., saving space) without sacrificing resolution.
• Limitation: not intuitive and takes time to learn
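The banding step can be sketched as follows. This assumes one common variant in which negative values are mirrored and distinguished by sign (for example, by color); the function name, band height and sample values are all hypothetical.

```python
def horizon_bands(values, band_height, n_bands):
    """Split each value into n_bands layers; layer i holds the part of |value| inside band i."""
    bands = []
    for i in range(n_bands):
        floor = i * band_height
        bands.append([min(max(abs(v) - floor, 0.0), band_height) for v in values])
    # The sign is kept separately, e.g., to choose the color of the bands.
    signs = [1 if v >= 0 else -1 for v in values]
    return bands, signs

values = [0.5, 1.5, -2.5, 3.0]  # hypothetical deviations from the average
bands, signs = horizon_bands(values, band_height=1.0, n_bands=3)
```

Drawing all three layers in the same strip, with darker shades for higher bands, uses one third of the height of the original area plot while keeping the vertical resolution.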
Spiral Graphs
• Use a spirally shaped time axis
• Good for showing or identifying periodic structure of data
[Figure: number of influenza cases over a period of three years, shown as a time series and as spiral graphs with 27 days per cycle and 28 days per cycle]
[Aigner et al., "Visual Methods for Analyzing Time-Oriented Data", IEEE TVCG, 2008.]
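One way to lay time steps out on a spiral is sketched below. The Archimedean spiral and the parameter names (`period`, `ring_gap`) are assumptions made for the illustration, not necessarily the layout used in the cited paper.

```python
import math

def spiral_position(t, period, ring_gap=1.0):
    """Place time step t on an Archimedean spiral with `period` steps per turn."""
    turns = t / period
    angle = 2 * math.pi * turns
    radius = ring_gap * (1 + turns)
    return radius * math.cos(angle), radius * math.sin(angle)

# Two time steps exactly one cycle apart (assuming 28 days per cycle).
x0, y0 = spiral_position(3, 28)
x1, y1 = spiral_position(31, 28)
```

Steps that are exactly one cycle apart land at the same angle on neighbouring rings, so a periodic pattern shows up as radial alignment only when the chosen cycle length matches the period of the data.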
Multivariate Data
Chernoff Faces
• Relate data to facial features, something which we find easy to differentiate
• Each feature, e.g., the mouth, encodes a data dimension by its shape, size, placement and orientation
[Figure: Chernoff faces with 10 facial features, each corresponding to a parameter in [0,1]; panels: all 0, all 0.5, all 1, random parameters]
[ http://kspark.kaist.ac.kr/Human%20Engineering.files/Chernoff/Chernoff%20Faces.htm ]

Chernoff Faces
• Represent only trends but not actual values
• Drawback: affected by our perceived importance of a facial feature
Heat Maps
• Encode values stored in table entries as colors
• Rows and columns can be reordered to better expose features.

[M. Ward]
[Figure: heatmap from DNA microarray data showing genes expressed differently for two types of leukemia; column: patient, row: gene]
[Warwick, http://www2.warwick.ac.uk/fac/sci/moac/people/students/peter_cock/r/heatmap/ ]

Interactive examples: http://amp.pharm.mssm.edu/clustergrammer/


Parallel Coordinates
• How to present all n axes of the n dimensions on a 2D plane?
• Use parallel axes instead of orthogonal axes

[ http://mbostock.github.io/d3/talk/20111116/iris-parallel.html ]
• Each attribute value of a data item corresponds to a point on a coordinate axis, and the data item is represented as a polyline connecting these points
• A distinct class of objects can sometimes be seen as a group of lines on some axes
• Ordering of axes is important for seeing patterns
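The mapping from a data item to its polyline can be sketched as below. The function name, the common axis range and the sample item are assumptions for the illustration; in practice each axis usually gets its own range.

```python
def to_polyline(item, axes, ranges, axis_gap=1.0):
    """Map one data item to polyline vertices (x = axis position, y = normalized value)."""
    points = []
    for i, name in enumerate(axes):
        lo, hi = ranges[name]
        points.append((i * axis_gap, (item[name] - lo) / (hi - lo)))
    return points

axes = ["sepal length", "sepal width", "petal length", "petal width"]
ranges = {name: (0.0, 8.0) for name in axes}  # assumed common axis range in cm
# Hypothetical item, not a row from the real data set.
item = {"sepal length": 6.0, "sepal width": 3.0, "petal length": 4.0, "petal width": 1.0}
polyline = to_polyline(item, axes, ranges)
```

Reordering the `axes` list changes which attributes are adjacent and therefore which pairwise patterns the polylines can reveal.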
Parallel Coordinates

[Figure: parallel coordinate plot showing the distribution (i.e., centers and extents) of clusters] [Ward 2010]
Parallel Coordinates
• Parallel correlation

[Figure: Cartesian point plots and the corresponding PC plots for X1 & X2 proportional and for X1 & X2 inversely proportional]

[Wong and Bergeron, "30 Years of Multidimensional Multivariate Visualization," Scientific Visualization: Overviews, Methodologies & Techniques, 1997.]
Parallel Coordinates
• Parallel correlation

[Figure: parallel coordinate plot of six-dimensional data illustrating correlations of r (correlation coefficient) = 1, .8, .2, 0, -.2, -.8 and -1.]

[Wegman, "Hyperdimensional Data Analysis Using Parallel Coordinates", Journal of the American Statistical Association, 1990.]
Dimension Reduction
• Remove some of the dimensions from the display to avoid clutter
– Examples: Principal Component Analysis (PCA), Multidimensional Scaling (MDS), Self-Organizing Maps (SOM)

• Issue: the resulting dimensions are not the original ones and are not intuitive to users
[Figure: examples of PCA and MDS]
[http://commons.wikimedia.org/wiki/File:GaussianScatterPCA.png#mediaviewer/File:GaussianScatterPCA.png]
[http://commons.wikimedia.org/wiki/File:RecentVotes.svg#mediaviewer/File:RecentVotes.svg]
Dimension Ordering
• Crucial for the effectiveness of many visualization techniques
• Relationships among adjacent dimensions are easier to detect than relationships among those positioned far apart, e.g., in Parallel Coordinates and Heat Maps
• Used in attribute mapping to highlight important dimensions, e.g., in Chernoff faces
• An NP-complete problem equivalent to the Travelling Salesman Problem (TSP)
• Use an approximation to compute the ordering, or order manually (interaction needed)
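As an example of an approximate ordering, the sketch below uses a greedy nearest-neighbour heuristic borrowed from TSP practice. It is one possible approximation, not necessarily the method intended in the lecture, and the similarity matrix is made up.

```python
def greedy_order(similarity, start=0):
    """Nearest-neighbour heuristic: repeatedly append the unused dimension most similar to the last one."""
    n = len(similarity)
    order = [start]
    remaining = set(range(n)) - {start}
    while remaining:
        last = order[-1]
        # sorted() makes tie-breaking deterministic (lowest index wins).
        nxt = max(sorted(remaining), key=lambda j: similarity[last][j])
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Hypothetical pairwise similarities (e.g., absolute correlations) between 4 dimensions.
sim = [
    [1.0, 0.1, 0.9, 0.3],
    [0.1, 1.0, 0.2, 0.8],
    [0.9, 0.2, 1.0, 0.4],
    [0.3, 0.8, 0.4, 1.0],
]
order = greedy_order(sim)
```

The heuristic runs in quadratic time but gives no optimality guarantee, and its result depends on the starting dimension.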
Geospatial Data
Geospatial Data
• Data that refer to a specific location in the world
– e.g., population, health data, traffic, etc.
• Visualization techniques are used intensively in geographic information systems (GIS) and cartography.
• Issues:
– Geographical aggregation
• Recall the London Cholera Case
– Map projection
Map Projections
• A mapping from a position on Earth (spherical surface) to a position on screen (a flat plane)
• From a longitude + latitude pair (λ, φ) to screen coordinates (x, y)

All map projections must have distortions!

[Figure: meridians and circles of latitude]
https://vvvv.org/blog/polar-spherical-and-geographic-coordinates
[An Album of Map Projections, U.S. Geological Survey Professional Paper 1453]
Map Projections
• Projection methods differ by the spatial properties that they preserve
– Conformal (preserves local angle and thus shape; not area-preserving)
– Equal area (preserves area; shape can change)
– Equidistant (preserves distance from a specific point or line)
– Others: Gnomonic (great circles as straight lines), Azimuthal/Retroazimuthal (preserves direction from/to a point)
Map projections: a video lecture
https://www.youtube.com/watch?v=v5fSBQRbPR0
Cylindrical Projection
• Each point on the sphere surface is projected outward onto a cylinder that is put around the sphere.
• No distortion around the equator, where the cylinder touches the globe, but severe distortion at the poles
• Two common cylindrical map projections:
– Equirectangular projection
– Lambert cylindrical projection

[Wikipedia]
Equirectangular Projection

[Wikipedia]

Mapping: x = λ, y = φ
• Cylindrical projection
• Meridians are mapped to equally spaced vertical straight lines
• Circles of latitude are mapped to equally spaced horizontal straight lines
Equirectangular Projection

[Wikipedia]

Mapping: x = λ, y = φ

• Neither conformal nor equal area, i.e., much distortion
• Often used in thematic mapping, e.g., choropleth maps
Lambert Cylindrical Projection
• Cylindrical projection
• Area preserving
• Undistorted along the equator, but highly distorted near the poles.
Mapping: x = λ, y = sin φ

[Wikipedia]
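The two cylindrical mappings can be written down directly. In this sketch the function names are hypothetical, inputs are assumed to be in degrees, and the outputs are unscaled (a real map would still scale and translate them to screen pixels).

```python
import math

def equirectangular(lon_deg, lat_deg):
    """x = λ, y = φ, with both angles converted to radians."""
    return math.radians(lon_deg), math.radians(lat_deg)

def lambert_cylindrical(lon_deg, lat_deg):
    """x = λ, y = sin φ (area preserving)."""
    return math.radians(lon_deg), math.sin(math.radians(lat_deg))

pole_eq = equirectangular(0.0, 90.0)
pole_lambert = lambert_cylindrical(0.0, 90.0)
```

On a unit sphere, the area of the band between two latitudes is proportional to the difference of their sines, which is exactly the height that band receives under y = sin φ; this is why the Lambert mapping preserves area while squashing shapes near the poles.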
Choropleth Maps
• For showing data collected or aggregated by geographical areas
• In Greek: choro = area, pleth = value
• Use color to encode values for a region
Choropleth Maps
[Figure: Obesity in the US, 2002] [Heer et al., 2010]

• Problem: tends to highlight patterns in large areas, while highly populated but small areas might be of more interest
Cartograms
• Regions are resized so that the area directly encodes a data variable
• Cartograms differ by the properties of:
– Shape preservation
– Exact area correspondence
– Topology preservation (i.e., region connectivity)
• An optimization problem to find a good compromise between the above conflicting criteria
Circular Cartograms
a.k.a. Dorling cartograms [Heer et al., 2010]

Data represented by area faithfully, but shape & topology are not retained

[Figure: Obesity in the US, 2002. Color: % of obese people. Circle size: absolute number of obese people]
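For the circle area to encode the data faithfully, the radius has to grow with the square root of the value. A minimal sketch, with a hypothetical helper name and parameters:

```python
import math

def circle_radius(value, max_value, max_radius):
    """Radius chosen so that the circle AREA is proportional to value."""
    return max_radius * math.sqrt(value / max_value)

# A region with a quarter of the maximum value gets half the maximum radius.
r = circle_radius(25.0, 100.0, 10.0)
```

Scaling the radius linearly with the value instead would exaggerate large values, since the area would then grow quadratically.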
Noncontinuous Cartograms

Use shaded subregion to represent data

[Ward et al., 2010]

Example: http://bl.ocks.org/mbostock/4055908
• Exact area, preserves shape
• Not preserving topology, map perception still ok
• Size limited by the maximum scaling w.r.t. map region
Noncontiguous Cartograms

[Ward et al., 2010]

• Exact area, preserves shape as much as possible
• Not preserving topology
• Map perception is more difficult
Continuous Cartograms

[Ward et al., 2010]

Example: http://prag.ma/code/d3-cartogram/
• Preserves topology
• Preserves area & shape only as much as possible
• Takes a long time to compute the visualization, so interactive data change is not possible
Graduated Symbol Maps
• Data are shown by placing symbols over a map
• More dimensions can be visualized by encoding them with the attributes of the symbol

[Figure: Obesity in the US, 2002] [Heer et al., 2010]
Exploiting 3D

Microsoft GeoFlow

https://dylanbabbs.com/writing/map-data-viz-design

Coordinate Transformations
• Select a spatial reference frame which best facilitates data comparison, pattern finding, etc.
– Spherical coordinates? Cartesian coordinates?

[Figure: two panels, w.r.t. football field and w.r.t. team center]
Example from the book: Visual Analytics for Data Scientists


Volume Data
3D Data as Point Set
• Each data sample contains 3 variables
• E.g., we want to show the relationship between petal length, petal width & sepal length for the Iris data

Each data point (x, y, z) represents a sample in the data set

A 3D scatter plot will do
[Figure: 3D scatter plot with axes petal length (x), sepal length (y) and petal width (z)]

How about a volumetric data set?
Scalar Function Visualization
• Univariate
– a plot v = f(x)

• Bivariate
– a surface v = f(x,y)

[Figure: 2D surface and contours (isolines)]
Volume Data
Volume data is essentially a trivariate scalar function
A scalar value is defined at every (x, y, z) in the volume domain: v = f(x, y, z)
If we have a discrete sampling of the 3D domain, we obtain a voxel (volume element) representation.

https://minecraft.net
https://en.wikipedia.org/wiki/Volume_rendering
Isosurface Rendering

Visible Human Project

https://en.wikipedia.org/wiki/Voxel-Man
https://youtu.be/dPPjUtiAGYs

Slicing

Direct Volume Rendering


https://youtu.be/ojCNUoVfzh4
Slicing with Cut Planes
• Allow probing the 3D volume to see a (2D) subset of the data
[Figure: axis-aligned slicing and arbitrary cut plane]

https://www.uni-muenster.de/... http://www.asawicki.info/...
Isosurface Rendering
Extract an isosurface from the volume data and use standard surface rendering techniques to visualize it
The Marching Cubes Algorithm
Lorensen, W. E.; Cline, Harvey E. (1987). "Marching cubes: A high resolution 3D surface construction algorithm". ACM Computer Graphics. 21 (4): 163–169.

Basic idea: identify if an isosurface passes through a voxel

The 15 cube configurations (symmetry considered)

https://en.wikipedia.org/wiki/Marching_cubes
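The test of whether an isosurface passes through a cell can be sketched by classifying its 8 corners against the iso-value. The corner ordering and the "below the iso-value sets the bit" convention are assumptions of this illustration; the full algorithm would go on to look up the triangles for each configuration in a table, which is omitted here.

```python
def cube_index(corner_values, iso_value):
    """8-bit configuration index: bit i is set when corner i is below the iso-value."""
    index = 0
    for i, v in enumerate(corner_values):
        if v < iso_value:
            index |= 1 << i
    return index

def surface_crosses(corner_values, iso_value):
    """The isosurface passes through the cell unless all 8 corners lie on the same side."""
    return cube_index(corner_values, iso_value) not in (0, 255)

# Hypothetical corner samples: only corner 0 is below the iso-value 0.5.
corners = [0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
idx = cube_index(corners, 0.5)
```

The 8 corner bits give 2^8 = 256 possible indices, which rotations and other symmetries reduce to the 15 distinct configurations mentioned above.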
Marching Cubes Algorithm (in 2D)

https://en.wikipedia.org/wiki/Marching_squares
Isosurface Rendering

https://www.eriksmistad.no/...

Marching Cubes in Action
https://youtu.be/LfttaAepYJ8
Direct Volume Rendering
Assign color & transparency based on voxel value via a transfer function

http://cg.inf.h-bonn-rhein-sieg.de/?page_id=2700

https://youtu.be/gq8oqtnKFH4
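A transfer function can be as simple as a piecewise-linear lookup table. The sketch below assumes voxel values normalized to [0, 1] and uses made-up control points; real systems typically let the user edit such a table interactively.

```python
def transfer_function(value, control_points):
    """Piecewise-linear lookup from a scalar in [0, 1] to (r, g, b, alpha)."""
    for (v0, c0), (v1, c1) in zip(control_points, control_points[1:]):
        if v0 <= value <= v1:
            t = (value - v0) / (v1 - v0)
            return tuple(a + t * (b - a) for a, b in zip(c0, c1))
    raise ValueError("value outside the transfer function's range")

# Hypothetical control points: low values transparent, high values opaque white.
points = [
    (0.0, (0.0, 0.0, 0.0, 0.0)),
    (0.5, (1.0, 0.0, 0.0, 0.5)),
    (1.0, (1.0, 1.0, 1.0, 1.0)),
]
rgba = transfer_function(0.25, points)
```

Setting alpha to zero over a range of values makes those voxels invisible, which is how a transfer function can reveal interior structures.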
Visualization Gallery
• Take a look at:
– Tableau Public
(https://public.tableau.com/s/gallery)
– D3.js
(http://d3js.org/)
– Google Charts
(https://developers.google.com/chart/interactive/docs/gallery)

• Try to visualize the Iris data set with the different techniques taught in this class using the above tools.
• What can/cannot be done by these tools?
Reference
• Jeffrey Heer, Michael Bostock, and Vadim Ogievetsky. 2010. A tour through the visualization zoo. Commun. ACM 53, 6 (June 2010), 59-67. (http://hci.stanford.edu/jheer/files/zoo/)
• Matthew Ward, Georges Grinstein and Daniel Keim, "Interactive Data Visualization: Foundations, Techniques, and Applications", 2010 [Chapters 6 & 7]
