PDC - Lecture - No. 2
Parallel and Distributed Computing
Lecture 2

Outline
- Amdahl's Law
- Karp-Flatt Metric
- Types of Parallelism
  - Data-parallelism
  - Functional-parallelism
  - Pipelining
- Multi-processor vs Multi-computer
Derivation
Suppose you have a sequential program for a problem that executes in total time T(s).
Let T(p) be the execution time of the same algorithm parallelized over p processors.
Then the speedup is:

    Speedup = T(s) / T(p)
1. Data-parallelism
When there are independent tasks applying the same operation to different elements of a data set.
Example code:

    for i = 0 to 99 do
        a[i] = b[i] + c[i]
    endfor
2. Functional-parallelism
When there are independent tasks applying different operations to different data elements.
Example code:

    1) a = 2
    2) b = 3
    3) m = (a + b) / 2
    4) s = (a² + b²) / 2
    5) v = s − m²

Here the third and fourth statements could be performed concurrently.
Types of Parallelism
3. Pipelining
- Usually used for problems where a single instance of the problem cannot be parallelized
- The output of one stage is the input of the next stage
- The whole computation of each instance is divided into multiple stages, provided there are multiple instances of the problem
- An effective method of attaining parallelism on uniprocessor architectures
- Depends on the pipelining abilities of the processor
Example: assembly-line analogy, contrasting sequential execution with pipelined execution.
Example: a 4-stage pipeline overlaps instructions within a single instruction cycle to achieve parallelism.
Multi-processor vs Multi-computer
Multi-Processor
i. Centralized Multi-processor
- Additional CPUs are attached to the system bus, and all the processors share the same primary memory
- All the memory is in one place and has the same access time from every processor
- Also known as a UMA (Uniform Memory Access) multiprocessor or SMP (Symmetric Multiprocessor)
Cache Coherence:
In a shared-memory multiprocessor, each CPU's cache must be kept consistent with the others (continuously updated to hold current values); this requirement is known as cache coherence.

Snooping:
Snoopy protocols achieve data consistency between the cache memories and the shared memory through a bus-based memory system. Write-invalidate and write-update policies are used for maintaining cache consistency.
Branch Prediction:
Branch prediction is a technique used in CPU design that attempts to guess the outcome of a conditional branch before it is resolved and prepare for the most likely result.
Questions
CS3006 - Fall 2021