PDC Assignment Group#7
Session: 2019-2023
Table of Contents
Performance Analysis
Speedup
Efficiency
Amdahl’s Law
Background and Origin of Amdahl’s Law
Importance of Amdahl’s Law
Limitations and Assumptions of Amdahl’s Law
Effects of Varying the Parallelizable Fraction
Practical Applications of Amdahl’s Law
Future Directions and Research Opportunities Related to Amdahl’s Law
Performance Analysis
Introduction
• Analysis of the execution time of a parallel algorithm, to determine whether it is worth the
effort to code and debug in parallel
• Understanding barriers to high performance and predicting improvement
• Goal: to figure out whether a program merits parallelization
Speedup
– Expectation is for the parallel code to run faster than the sequential counterpart
– Ratio of sequential execution time to parallel execution time
• Execution time components
– Inherently sequential computations: σ (n)
– Potentially parallel computations: ϕ (n)
– Communication operations and other redundant computations: κ (n, p)
• Speedup ψ (n, p) for solving a problem of size n on p processors
– On sequential computer
∗ Only one computation at a time
∗ No interprocess communication and other overheads
∗ Time required is given by
σ (n) + ϕ (n)
– On parallel computer
∗ Cannot do anything about the inherently sequential computations
∗ In the best case, the potentially parallel computation is divided equally among the p PEs
∗ Also need to add time for interprocess communication among the PEs
∗ Time required is given by
σ (n) + ϕ (n)/p + κ (n, p)
∗ Parallel execution time will be larger if ϕ (n) cannot be perfectly divided among the p
processors, leading to a smaller speedup
– Actual speedup will be limited by
ψ (n, p) ≤ (σ (n) + ϕ (n)) / (σ (n) + ϕ (n)/p + κ (n, p))
[Figure 1: Computation time. Figure 2: Communication time.]
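As a quick illustration (not part of the original notes), the bound above can be expressed as a small Python helper; the cost functions below are hypothetical placeholders for σ (n), ϕ (n), and κ (n, p):

def speedup_bound(sigma, phi, kappa, n, p):
    # ψ(n, p) <= (σ(n) + ϕ(n)) / (σ(n) + ϕ(n)/p + κ(n, p))
    sequential_time = sigma(n) + phi(n)
    parallel_time = sigma(n) + phi(n) / p + kappa(n, p)
    return sequential_time / parallel_time

# Hypothetical cost functions, chosen only for illustration.
sigma = lambda n: 18000 + n      # inherently sequential work
phi = lambda n: n**2 / 100       # potentially parallel work
kappa = lambda n, p: 0           # communication overhead ignored here

print(speedup_bound(sigma, phi, kappa, n=10_000, p=16))  # ≈ 11.4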
Efficiency
– Measure of processor utilization
– Ratio of speedup to number of processors used
– Efficiency ε (n, p) for a problem of size n on p processors is given by
ε (n, p) = ψ (n, p) / p ≤ (σ (n) + ϕ (n)) / (p σ (n) + ϕ (n) + p κ (n, p))
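Continuing the sketch above (same hypothetical cost functions, again an illustration rather than code from the notes), efficiency is simply the speedup divided by the processor count:

def efficiency(sigma, phi, kappa, n, p):
    # ε(n, p) = ψ(n, p) / p
    speedup = (sigma(n) + phi(n)) / (sigma(n) + phi(n) / p + kappa(n, p))
    return speedup / p

sigma = lambda n: 18000 + n      # inherently sequential work
phi = lambda n: n**2 / 100       # potentially parallel work
kappa = lambda n, p: 0           # communication overhead ignored here

print(efficiency(sigma, phi, kappa, n=10_000, p=16))  # ≈ 0.71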
Amdahl’s Law
• Consider the expression for speedup ψ (n, p) above. Dropping the communication term
κ (n, p) gives
ψ (n, p) ≤ (σ (n) + ϕ (n)) / (σ (n) + ϕ (n)/p)
This gives us the maximum possible speedup in the absence of communication overhead.
Let f denote the inherently sequential fraction of the computation:
f = σ (n) / (σ (n) + ϕ (n))
Then, we have Amdahl’s Law:
ψ (n, p) ≤ 1 / (f + (1 − f)/p)
• Based on the assumption that we are trying to solve a problem of fixed size as quickly as
possible
• Provides an upper bound on the speedup achievable with a given number of processors
trying to solve the problem in parallel
• Also useful to determine the asymptotic speedup as the number of processors increases:
ψ (n, p) ≤ 1/f as p → ∞
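As a sketch (the fraction f = 0.05 below is arbitrary, not taken from the notes), the bound is one line of Python, which makes the asymptote easy to tabulate:

def amdahl_bound(f, p):
    # f = inherently sequential fraction, p = number of processors
    return 1.0 / (f + (1.0 - f) / p)

for p in (2, 4, 8, 16, float("inf")):
    print(p, amdahl_bound(0.05, p))  # approaches 1/f = 20 as p grows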
• Example 1
Determine whether it is worthwhile to develop a parallel version of a program to solve a
particular problem
95% of the program’s execution time occurs inside a loop that can be executed in parallel
Maximum speedup expected from a parallel version of the program executing on 8 CPUs:
ψ ≤ 1 / (0.05 + 0.95/8) ≈ 5.9
Expect a speedup of 5.9 or less
• Example 2
20% of a program’s execution time is spent within inherently sequential code
What is the limit to the speedup achievable by a parallel version of the program?
ψ ≤ 1 / (0.2 + 0.8/p) → 1/0.2 = 5 as p → ∞
• Example 3
Parallel version of a sequential program with time complexity Θ(n²), n being the size of the
dataset
Time needed to input the dataset and output the result (the sequential portion of the code):
(18000 + n) µs
The computational portion takes (n²/100) µs sequentially and can be executed in parallel in
(n²/(100p)) µs on p processors
Maximum speedup on a problem of size 10,000:
ψ ≤ (18000 + n + n²/100) / (18000 + n + n²/(100p))
For n = 10,000 and p → ∞: ψ ≤ (28,000 + 1,000,000) / 28,000 ≈ 36.7
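All three examples can be checked with a few lines of Python (a sketch using only the numbers given above):

def amdahl_bound(f, p):
    # f = inherently sequential fraction
    return 1.0 / (f + (1.0 - f) / p)

print(amdahl_bound(0.05, 8))             # Example 1: ≈ 5.9
print(amdahl_bound(0.20, float("inf")))  # Example 2: 5.0

# Example 3: n = 10,000; I/O takes (18000 + n) µs, compute takes n²/100 µs sequentially.
n = 10_000
io_time = 18000 + n     # inherently sequential portion, µs
compute = n**2 / 100    # parallelizable portion, µs
print((io_time + compute) / io_time)     # p → ∞ limit: ≈ 36.7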
Background and Origin of Amdahl’s Law
Amdahl’s Law is named after Gene Amdahl, a prominent computer architect and
entrepreneur. In 1967, Amdahl introduced this law as a way to quantify the potential speedup
achievable through parallel computing.
At that time, computing systems were beginning the transition from single-processor
machines to multiprocessor systems, which allowed for parallel execution of tasks. Amdahl
recognized the need to understand how the proportion of parallel work in a task affects the
overall performance improvement.
Importance of Amdahl’s Law
By understanding the limitations imposed by the serial portion of a program, software
developers and computer architects can make informed decisions regarding parallelization
strategies, resource allocation, and system design to achieve optimal performance.
Amdahl’s Law has since been used as a guiding principle in various domains, including
high-performance computing, supercomputing, cloud computing, and parallel algorithm design. It
continues to shape the way researchers and practitioners approach parallel computing and
optimize systems for improved efficiency and speed.
Developers can use this insight to explore alternative algorithms or algorithmic
modifications that reduce the serial component and improve parallel scalability.
Limitations and Assumptions of Amdahl’s Law
Amdahl’s Law has certain limitations and assumptions that should be considered when
applying it to parallel computing scenarios. Here are some key points to understand:
a) Fixed Workload: Amdahl’s Law assumes that the total workload remains constant
regardless of the number of processors employed. It implies that the size of the
problem being solved or the amount of work to be done remains the same. This
assumption is often reasonable for many parallel computing scenarios. However, in
some cases, the workload may vary with the number of processors, which can
influence the applicability and accuracy of Amdahl’s Law.
b) Fixed Serial Fraction: Amdahl’s Law assumes a fixed proportion or fraction of the
program that must be executed serially. This assumption implies that the serial portion
remains constant regardless of the number of processors used. However, in practice,
the proportion of serial and parallel work may vary depending on factors such as the
algorithm, input size, or problem complexity. If the proportion of serial work changes
significantly with the number of processors, Amdahl’s Law may provide less accurate
predictions.
c) Independent Work: Amdahl’s Law assumes that the parallelizable portion of the
program can be executed independently by multiple processors without any
communication or synchronization overhead. This assumption implies that the parallel
work can be divided into equal-sized tasks, and each task can be executed
concurrently without requiring interaction or coordination between processors. In
reality, parallel programs often involve communication, synchronization, and data
dependencies, which can introduce additional overhead and impact the achievable
speedup.
Example 1:
Consider a program that performs image processing tasks. The parallel portion
involves applying filters to different regions of the image in parallel, while the serial portion
involves image I/O operations. Amdahl’s Law assumes a fixed proportion of the program
dedicated to image I/O, regardless of the number of processors used. However, in practice, as
the number of processors increases, the I/O overhead may become more significant relative to
the parallel computation, potentially impacting the accuracy of Amdahl’s Law predictions.
Example 2:
Suppose a parallel program divides a large dataset into equal-sized chunks, and
each processor processes a chunk independently. Amdahl’s Law assumes that the division of
data and computation is balanced, and each processor takes the same amount of time to
process its chunk. However, if the workload is unevenly distributed due to data
characteristics or load imbalances, the actual speedup achieved may deviate from the
predictions of Amdahl’s Law.
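As a rough illustration of this second example (the chunk sizes below are invented for the sketch), uneven work per processor directly caps the speedup, because the parallel phase only finishes when the slowest processor does:

# Work per processor, in arbitrary time units; total work is 100 in both cases.
balanced = [25, 25, 25, 25]
imbalanced = [40, 20, 20, 20]  # hypothetical skewed data distribution

def speedup(chunks):
    sequential_time = sum(chunks)  # one processor does everything
    parallel_time = max(chunks)    # done when the slowest chunk finishes
    return sequential_time / parallel_time

print(speedup(balanced))    # 4.0: the ideal for p = 4
print(speedup(imbalanced))  # 2.5: imbalance erodes the predicted speedup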
These examples highlight that the accuracy and applicability of Amdahl’s Law depend on the
adherence of the workload and system characteristics to its assumptions. While Amdahl’s
Law provides a valuable framework for understanding the impact of serial portions on
parallel performance, it is essential to consider its limitations and adapt it to specific
scenarios for accurate predictions and analysis.
Effects of Varying the Parallelizable Fraction
In Amdahl’s Law, the parallelizable fraction refers to the portion of a computation that can
be effectively parallelized. This fraction represents the part of the program that can benefit
from parallel execution, while the remaining fraction must be executed sequentially.
Understanding the effects of varying the parallelizable fraction is crucial for optimizing
performance in parallel computing.
When the parallelizable fraction is small, Amdahl’s Law tells us that no matter how many
processors or cores we allocate to the parallel portion, the overall speedup of the computation
will be limited. This is because the sequential portion, which cannot be parallelized, acts as a
bottleneck, constraining the potential speedup achievable by parallelization.
To illustrate this, let’s consider an example. Suppose we have a program with a total runtime
of 100 seconds, consisting of a parallelizable fraction of 0.8 (80%) and a sequential fraction
of 0.2 (20%). If we parallelize the 80% portion across multiple processors, we can calculate
the potential speedup using Amdahl’s Law.
For simplicity, let’s first assume we have an infinite number of processors. In that limit, the
speedup is bounded by the sequential fraction alone:
Speedup = 1 / (1 − P) = 1 / 0.2 = 5
- If we instead have 10 processors:
Speedup = 1 / [(1 − P) + (P / N)] = 1 / [(1 − 0.8) + (0.8 / 10)] ≈ 3.57
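To make the trend concrete, the short sketch below (fractions and processor counts chosen arbitrarily) tabulates the Amdahl bound as the parallelizable fraction P and the processor count N vary:

for P in (0.5, 0.8, 0.9, 0.99):        # parallelizable fraction
    for N in (10, 100, float("inf")):  # number of processors
        bound = 1 / ((1 - P) + P / N)
        print(f"P = {P:.2f}, N = {N}: speedup <= {bound:.2f}")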
From these examples, we can observe that as the parallelizable fraction decreases, the
potential speedup diminishes even with a larger number of processors. This highlights the
importance of identifying and optimizing the parallelizable portion of a computation to
achieve significant performance improvements in parallel computing.
By analyzing the effects of varying the parallelizable fraction, developers can determine the
optimal balance between sequential and parallel execution to achieve the best possible
speedup in their parallel computing systems.
Practical Applications of Amdahl’s Law
1. Performance Analysis: Amdahl’s Law allows for the evaluation of the potential
speedup achievable by parallelizing a given computation. It helps identify the critical
portions of a program that limit scalability and overall performance. By quantifying
the impact of the serial component, developers can focus their optimization efforts on
the most significant areas.
6. Load Balancing: Amdahl’s Law highlights the importance of load balancing in
parallel systems. If the workload is not evenly distributed across processors, the serial
portion may become a bottleneck. By balancing the workload and minimizing idle
time, developers can maximize the benefits of parallelization and achieve higher
speedup.
7. System Design and Architecture: Amdahl’s Law plays a crucial role in designing
efficient parallel systems. It helps architects determine the optimal balance between
serial and parallel components, guiding decisions related to processor count, memory
capacity, interconnect design, and other system parameters. This ensures that the
system is well-suited for the desired workload and can achieve the expected
performance gains.
By leveraging Amdahl’s Law, practitioners can gain insights into the limitations and
potential gains of parallel computing, enabling them to make informed decisions, optimize
algorithms, allocate resources effectively, and design efficient parallel systems.
Future Directions and Research Opportunities Related to Amdahl’s Law
4. Task Scheduling and Load Balancing: Task scheduling and load balancing are
critical factors in achieving efficient parallel execution. Future research may delve
into advanced scheduling algorithms and load balancing strategies that consider
dynamic workloads, communication patterns, and system heterogeneity.
5. Programming Models and Tools: Research may also focus on developing
programming models, frameworks, and tools that facilitate parallel programming and
make it easier to exploit parallelism effectively.
Overall, the points above highlight the evolving nature of parallel computing and invite
researchers to explore various avenues for advancing the understanding, application, and
optimization of parallel computing systems beyond the scope of Amdahl’s Law. It
underscores the need for continuous research and innovation to overcome challenges and
unlock the full potential of parallel computing.