Principles of Scalable Performance
Speedup
Speedup measures the reduction in running time due to parallelism.
Speedup
Superlinearity
Speedup:
S = Time(the most efficient sequential algorithm) / Time(parallel algorithm)
Efficiency:
E = S / N, where N is the number of processors
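The two definitions can be checked with a few lines of Python. This is only a sketch: the function names and the timings (120 s sequential, 20 s on 8 processors) are hypothetical values chosen for illustration, not measurements from the slides.

```python
def speedup(t_sequential: float, t_parallel: float) -> float:
    """S = time of the best sequential algorithm / time of the parallel algorithm."""
    if t_sequential <= 0 or t_parallel <= 0:
        raise ValueError("running times must be positive")
    return t_sequential / t_parallel


def efficiency(s: float, n_processors: int) -> float:
    """E = S / N, with N the number of processors."""
    if n_processors < 1:
        raise ValueError("need at least one processor")
    return s / n_processors


# Hypothetical timings: 120 s sequentially, 20 s on 8 processors.
s = speedup(120.0, 20.0)
e = efficiency(s, 8)
print(f"S = {s:.2f}, E = {e:.2f}")  # S = 6.00, E = 0.75
```

An efficiency below 1 means the processors are not fully used; E > 1 would correspond to the superlinear case.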
Amdahl's Law
Gene Amdahl,
Implications
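$$S(n) = \frac{1}{f + (1 - f)/n} \le \frac{1}{f}$$
where f is the serial fraction of the computation and n is the number of processors.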
$$t_p = f\,t_s + \frac{(1 - f)\,t_s}{n}$$
$$S(n) = \frac{t_s}{t_p} = \frac{1}{f + \frac{1 - f}{n}}$$
$$S(n) = \frac{n}{nf + (1 - f)} = \frac{n}{1 + (n - 1)f}$$
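The fractional form and the n-over-something form of the speedup are the same expression. As a short worked step, with f the serial fraction and n the number of processors, multiplying numerator and denominator by n connects them, and letting n grow gives the upper bound:

```latex
\[
S(n) = \frac{1}{f + \frac{1 - f}{n}}
     = \frac{n}{nf + (1 - f)}
     = \frac{n}{1 + (n - 1)f},
\qquad
\lim_{n \to \infty} S(n) = \frac{1}{f}.
\]
```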
As n increases,
[Figure: speedup versus number of processors, with curves labeled n = 100, n = 1,000, and n = 10,000]
Example
$$S = \frac{1}{0.05 + (1 - 0.05)/8} \approx 5.9$$
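The example can be reproduced numerically. The sketch below assumes the values read off the example (serial fraction f = 0.05, n = 8 processors); the function name `amdahl_speedup` and the extra processor counts in the loop are illustrative choices, not part of the slides.

```python
def amdahl_speedup(f: float, n: int) -> float:
    """Amdahl's law: S(n) = 1 / (f + (1 - f) / n), with serial fraction f."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("serial fraction f must lie in [0, 1]")
    if n < 1:
        raise ValueError("need at least one processor")
    return 1.0 / (f + (1.0 - f) / n)


# The example: 5% serial code on 8 processors.
example = amdahl_speedup(0.05, 8)
print(f"f = 0.05, n = 8: S = {example:.2f}")  # S = 5.93

# Adding processors helps less and less; S stays below 1 / f = 20.
for n in (8, 100, 1_000, 10_000):
    print(f"n = {n:>6}: S = {amdahl_speedup(0.05, n):.2f}")
```

Even with only 5% serial code, the speedup can never exceed 20, however many processors are added.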
Essence
Parallelism profile
Average Parallelism
Example 3.1
Available Parallelism
Asymptotic Speedup - 1
$$W_i = i\,\Delta\,t_i, \qquad W = \sum_{i=1}^{m} W_i$$
$$t_i(1) = W_i / \Delta$$
$$t_i(k) = W_i / (k\Delta)$$
$$t_i(\infty) = W_i / (i\Delta)$$
Asymptotic Speedup - 2
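$$T(1) = \sum_{i=1}^{m} t_i(1) = \sum_{i=1}^{m} \frac{W_i}{\Delta}$$
$$T(\infty) = \sum_{i=1}^{m} t_i(\infty) = \sum_{i=1}^{m} \frac{W_i}{i\Delta}$$
$$S_\infty = \frac{T(1)}{T(\infty)} = \frac{\sum_{i=1}^{m} W_i}{\sum_{i=1}^{m} W_i / i}$$
(Asymptotic speedup $S_\infty$ in the ideal case)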
Asymptotic Speedup - 3
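$$A = \frac{\sum_{i=1}^{m} i\,t_i}{\sum_{i=1}^{m} t_i}$$
$$S_\infty = \frac{T(1)}{T(\infty)} = \frac{\sum_{i=1}^{m} W_i}{\sum_{i=1}^{m} W_i / i}$$
In the ideal case, $S_\infty = A$.
$S_\infty \le A$ if communication latency and other system overhead are considered.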
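The relation between average parallelism and asymptotic speedup can be checked on a small parallelism profile. The sketch below assumes the usual reading of the symbols: t_i is the time spent at degree of parallelism i, Delta is the computing capacity of one processor, and W_i = i * Delta * t_i. The profile values are made up for illustration.

```python
# Hypothetical parallelism profile: degree of parallelism i -> time t_i spent at it.
profile = {1: 4.0, 2: 3.0, 4: 2.0, 8: 1.0}
DELTA = 1.0  # assumed computing capacity of a single processor


def average_parallelism(profile: dict[int, float]) -> float:
    """A = sum(i * t_i) / sum(t_i)."""
    return sum(i * t for i, t in profile.items()) / sum(profile.values())


def asymptotic_speedup(profile: dict[int, float], delta: float = DELTA) -> float:
    """S_inf = T(1) / T(inf), with W_i = i * delta * t_i."""
    work = {i: i * delta * t for i, t in profile.items()}
    t_1 = sum(w / delta for w in work.values())            # one processor
    t_inf = sum(w / (i * delta) for i, w in work.items())  # unlimited processors
    return t_1 / t_inf


print(f"A     = {average_parallelism(profile):.2f}")  # A     = 2.60
print(f"S_inf = {asymptotic_speedup(profile):.2f}")   # S_inf = 2.60
```

The two values coincide because no communication latency or other overhead is modeled; any such overhead would increase T(inf) and push the speedup below A.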
Arithmetic Mean
Harmonic Mean
$$R_h = \frac{m}{\sum_{i=1}^{m} 1/R_i}$$
$$R_h^* = \frac{1}{\sum_{i=1}^{m} f_i / R_i}$$
$$S = \frac{T_1}{T^*} = \frac{1}{\sum_{i=1}^{n} f_i / R_i}$$
$$T^* = 1/R_h^*$$ (weighted arithmetic mean execution time)
Example 3.2
Amdahl's Law
$$S_n = \frac{n}{1 + (n - 1)\alpha}$$
The implication is that the best speedup possible is $1/\alpha$, regardless of n, the number of processors; this bound is approached as $n \to \infty$.
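This form of Amdahl's law can be obtained as a special case of the harmonic mean speedup S = 1 / sum(f_i / R_i). The sketch below assumes that mode i runs at rate R_i = i and that the program spends a fraction alpha in purely sequential mode and the remaining 1 - alpha in fully parallel mode on n processors. The function names and the values n = 16, alpha = 0.1 are illustrative.

```python
def harmonic_mean_speedup(weights: list[float], rates: list[float]) -> float:
    """S = 1 / sum(f_i / R_i), with the weights f_i summing to 1."""
    if len(weights) != len(rates):
        raise ValueError("weights and rates must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return 1.0 / sum(f / r for f, r in zip(weights, rates))


def amdahl(n: int, alpha: float) -> float:
    """S_n = n / (1 + (n - 1) * alpha)."""
    return n / (1.0 + (n - 1) * alpha)


n, alpha = 16, 0.1
rates = [float(i) for i in range(1, n + 1)]          # assumption: R_i = i
weights = [alpha] + [0.0] * (n - 2) + [1.0 - alpha]  # only modes 1 and n are used

print(f"harmonic mean speedup: {harmonic_mean_speedup(weights, rates):.2f}")  # 6.40
print(f"Amdahl's law:          {amdahl(n, alpha):.2f}")                       # 6.40
print(f"upper bound 1/alpha:   {1.0 / alpha:.2f}")                            # 10.00
```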
System Efficiency - 1
O(n) = total number of unit operations performed by an n-processor system in completing a program P.
T(n) = execution time required to execute the program P on an n-processor system.
System Efficiency - 2
Redundancy
System Utilization
System utilization is defined as
U(n) = R(n) E(n) = O(n) / (n T(n))
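The utilization formula can be checked numerically. The sketch below relies on definitions that are assumptions here, since they do not survive in the text above: efficiency E(n) = T(1) / (n T(n)), redundancy R(n) = O(n) / O(1), and T(1) = O(1) (each unit operation takes unit time on one processor). Under those assumptions R(n) E(n) reduces to O(n) / (n T(n)). All input numbers are hypothetical.

```python
def system_measures(o_1: float, o_n: float, t_n: float, n: int):
    """Return (E, R, U) for an n-processor run.

    Assumed definitions (not stated in the surviving slide text):
      T(1) = O(1), E(n) = T(1) / (n * T(n)), R(n) = O(n) / O(1).
    """
    t_1 = o_1                        # assumption: one unit operation per unit time
    efficiency = t_1 / (n * t_n)     # E(n)
    redundancy = o_n / o_1           # R(n)
    utilization = redundancy * efficiency  # U(n) = R(n) * E(n)
    return efficiency, redundancy, utilization


# Hypothetical run: 1000 operations sequentially, 1200 operations
# in 200 time units on 8 processors.
e, r, u = system_measures(o_1=1000.0, o_n=1200.0, t_n=200.0, n=8)
print(f"E = {e:.3f}, R = {r:.3f}, U = {u:.3f}")  # E = 0.625, R = 1.200, U = 0.750
print(f"O(n) / (n T(n)) = {1200.0 / (8 * 200.0):.3f}")  # 0.750
```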
Quality of Parallelism
Doing What?
Other Measures
Problem 1
A.
B.
Problem 1 Solution