Chapter 2 Review
Frequency Distributions
What is a Frequency Distribution?
A frequency distribution is a list or a table containing the
values of a variable (or a set of ranges within which the data
fall) and the corresponding frequencies with which each value
occurs (or the frequencies with which data fall within each range)
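The definition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `frequency_distribution` and the sample values are assumptions made up for the example.

```python
from collections import Counter


def frequency_distribution(values):
    """Return (value, frequency) pairs, sorted by value."""
    counts = Counter(values)          # maps each distinct value to its count
    return sorted(counts.items())


# Hypothetical sample: days per week that 10 people read a newspaper.
sample = [0, 7, 3, 0, 5, 7, 7, 1, 0, 3]

table = frequency_distribution(sample)
print("Value  Frequency")
for value, freq in table:
    print(f"{value:>5}  {freq:>9}")
```

The frequencies always add up to the number of observations, which is a quick check that no value was dropped.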
Why Use Frequency Distributions?
A frequency distribution is a way to summarize data
The distribution condenses the raw data into a more useful
form and allows for a quick visual interpretation of the data
Frequency Distribution: Discrete Data
Frequency Distribution
Histograms
The classes or intervals are shown on the horizontal axis
Frequency is measured on the vertical axis
Histogram
No gaps between bars, since the data are continuous
[Figure: histogram; vertical axis: Frequency; horizontal axis: Class Midpoints]
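As a rough sketch of how such a histogram is built, the code below counts observations in equal-width classes and labels each class by its midpoint. The data values, the function name `grouped_frequencies`, and the choice of five classes of width 10 are assumptions for illustration only.

```python
def grouped_frequencies(data, start, width, n_classes):
    """Count observations in n_classes equal-width classes [lower, upper).

    Returns a list of (class midpoint, frequency) pairs.
    """
    counts = [0] * n_classes
    for x in data:
        index = int((x - start) // width)   # which class x falls into
        if 0 <= index < n_classes:          # ignore values outside all classes
            counts[index] += 1
    midpoints = [start + width * (i + 0.5) for i in range(n_classes)]
    return list(zip(midpoints, counts))


# Hypothetical continuous measurements (illustration only).
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

# Text histogram: classes run down the page, one '#' per observation.
for midpoint, count in grouped_frequencies(data, start=10, width=10, n_classes=5):
    print(f"{midpoint:>5.0f} | {'#' * count}")
```

Because each class starts where the previous one ends, the bars of the corresponding chart touch, which is the "no gaps" property of a histogram for continuous data.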
Questions for Grouping Data into Classes
1. How wide should each interval be?
(How many classes should be used?)
2. How should the endpoints of the intervals be
determined?
Often answered by trial and error, subject to user
judgment
The goal is to create a distribution that is neither too
"jagged" nor too "blocky"
The goal is to appropriately show the pattern of variation
in the data
How Many Class Intervals?
Many (narrow class intervals)
can leave gaps from empty classes
can give a poor indication of how frequency varies across classes
[Figure: histogram of Temperature with narrow classes (4, 8, ..., 60, More); vertical axis: Frequency]
Few (wide class intervals)
may compress variation too much and yield a blocky distribution
can obscure important patterns of variation
[Figure: histogram of Temperature with wide classes (0, 30, 60, More); vertical axis: Frequency]
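The trade-off between narrow and wide classes can be seen by binning one data set two ways. The sketch below is illustrative only: the temperature values, the helper `class_counts`, and the widths 2 and 30 are assumptions, not figures from the slides.

```python
def class_counts(data, start, width, n_classes):
    """Return the frequency of each equal-width class [lower, upper)."""
    counts = [0] * n_classes
    for x in data:
        i = int((x - start) // width)
        if 0 <= i < n_classes:
            counts[i] += 1
    return counts


# Hypothetical temperature readings (illustration only).
temps = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
         32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

narrow = class_counts(temps, start=0, width=2, n_classes=30)   # many narrow classes
wide = class_counts(temps, start=0, width=30, n_classes=2)     # few wide classes

print("narrow classes:", len(narrow), "of which empty:", narrow.count(0))
print("wide classes:  ", len(wide), "with counts:", wide)
```

With the narrow width, many classes are empty and the rest hold only one to three values, so the shape is jagged. With the wide width, all of the variation is squeezed into two blocks.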
General Guidelines
Class widths can typically be reduced as the number of
observations increases
Distributions with numerous observations are more likely to
be smooth and have gaps filled since data are plentiful
Class Width
The class width is the distance between the lowest possible value and
the highest possible value for a frequency class
Histograms in Excel
1. Select Tools/Data Analysis
2. Choose Histogram
3. Input data and bin ranges
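For readers without Excel, the same kind of count can be sketched in Python. This assumes that the Histogram tool treats each bin value as an inclusive upper limit and reports anything larger under "More", which would match the "More" label on the chart axes in this chapter. The function name `excel_style_histogram` and the data are hypothetical.

```python
from bisect import bisect_left


def excel_style_histogram(data, bins):
    """Count data using each bin value as an inclusive upper limit.

    Values larger than the last bin are counted under the label "More".
    This mirrors the assumed behavior described above, not a documented API.
    """
    bins = sorted(bins)
    counts = [0] * (len(bins) + 1)             # final slot is the "More" class
    for x in data:
        counts[bisect_left(bins, x)] += 1      # first bin whose limit is >= x
    labels = [str(b) for b in bins] + ["More"]
    return list(zip(labels, counts))


# Hypothetical input data and bin range (illustration only).
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]
bin_range = [20, 30, 40, 50, 60]

for label, count in excel_style_histogram(data, bin_range):
    print(f"{label:>5}  {count}")
```

Note that under this convention the value 30 is counted in the bin labeled 30, not in the next one.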
Stem and Leaf Diagram
Value   Stem   Leaf
35      3      5
613     6      1
776     7      8
...
1224    12     2
(613, 776 and 1224 are rounded to the nearest ten; the tens digit becomes the leaf)
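The splitting rule in these examples can be sketched as follows. The function name `stem_and_leaf` and the `leaf_unit` parameter are assumptions for illustration; the rounding step is inferred from 776 becoming stem 7, leaf 8.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Group non-negative values into {stem: [leaves]}.

    Each value is rounded to the nearest multiple of leaf_unit; the last
    remaining digit is the leaf and the digits before it form the stem.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        scaled = int(v / leaf_unit + 0.5)    # round half up (non-negative input)
        stem, leaf = divmod(scaled, 10)
        rows[stem].append(leaf)
    return dict(rows)


# The values quoted in the examples above.
print(stem_and_leaf([35]))                               # leaf unit of 1
for stem, leaves in stem_and_leaf([613, 776, 1224], leaf_unit=10).items():
    print(f"{stem:>3} | {' '.join(map(str, leaves))}")
```

Choosing a larger `leaf_unit` trades detail for a more compact display, in the same way that wider classes do for a histogram.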
Graphing Categorical Data
[Figure: bar chart, "Investor's Portfolio"; categories: Savings, CD, Bonds, Stocks; horizontal axis: Amount in $1000's]
Pie Chart Example
(Variables are Qualitative)
Percentages are rounded to the nearest percent
[Figure: pie chart; visible labels: Bonds, 29%]
Pareto Diagram Example
[Figure: Pareto diagram; categories: Stocks, Bonds, Savings, CD; bar graph axis: % invested in each category (0% to 45%); line graph axis: cumulative % invested (0% to 100%)]
Bar Chart Example
Number of days read   Frequency
0                     44
1                     24
2                     18
3                     16
4                     20
5                     22
6                     26
7                     30
Total                 200
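The newspaper readership frequencies (0 through 7 days, total 200) can be turned into percentages and a simple text bar chart. This is a minimal sketch: the frequencies come from the table, while the function name `relative_frequencies` and the choice of one `#` per two readers are assumptions for the example.

```python
# Frequencies from the table: number of days read -> number of people.
readership = {0: 44, 1: 24, 2: 18, 3: 16, 4: 20, 5: 22, 6: 26, 7: 30}


def relative_frequencies(freqs):
    """Return each category's share of the total frequency."""
    total = sum(freqs.values())
    return {category: f / total for category, f in freqs.items()}


shares = relative_frequencies(readership)

print("Days  Freq  Percent")
for days, freq in readership.items():
    bar = "#" * (freq // 2)                  # one '#' per two readers
    print(f"{days:>4}  {freq:>4}  {shares[days]:>7.1%}  {bar}")
```

Unlike a histogram, a bar chart of discrete or categorical data leaves gaps between bars, since each bar stands for a separate category.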
[Figure: bar chart, "Newspaper readership per week"; vertical axis: Frequency; horizontal axis: Number of days newspaper is read per week]
Comparing Investors
[Figure: bar chart comparing investors; categories: Savings, CD, Bonds, Stocks; horizontal axis: 0 to 60]
[Figure: column chart; series: East, West, North; categories: 1st Qtr to 4th Qtr]
[Figure: line chart; horizontal axis: Year (1984 to 2002)]
[Figure: scatter diagram; vertical axis: Cost per Day; horizontal axis: Volume per Day]
Chapter Summary
Data in raw form are usually not easy to use for decision making
-- Some type of organization is needed:
♦ Table ♦ Graph
Techniques reviewed in this chapter:
Frequency Distributions and Histograms
Bar Charts and Pie Charts
Stem and Leaf Diagrams
Line Charts and Scatter Diagrams