
5 - Grid Computing

Grid computing enables the sharing of geographically distributed heterogeneous computing resources like processors, data, and applications. It provides on-demand access to these resources in a consistent and inexpensive manner for solving large-scale computationally and data intensive problems. Grid computing aims to aggregate otherwise underutilized resources to address problems too large for any individual computer. Key benefits include exploiting idle computing power, providing massive parallel processing, enabling virtual organizations for collaboration, and improving reliability through load balancing across resources.

Uploaded by

umair
Copyright
© All Rights Reserved

GRID COMPUTING

Dr. Syed Nasir Mehmood Shah


nasirsyed.utp@gmail.com
Overview
• Introduction

• What is Grid Computing?

• Evolution

• What Grid Computing can do?

• Concepts and Components


The Problem

• Consider this scenario:

– The Large Hadron Collider (LHC) at CERN, once completed, is expected to provide scientists with a large amount of data on the behavior of particles

– It poses a technological challenge because it will produce over 10 petabytes of data a year

– That is roughly equivalent to a stack of CDs 20 km high!
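The CD comparison can be sanity-checked with a few lines of arithmetic. The sketch below assumes a 700 MB disc that is 1.2 mm thick and decimal petabytes (1 PB = 10^15 bytes); none of these figures come from the slides.

```python
# Rough check of the "stack of CDs" comparison.
# Assumptions (not from the slides): 700 MB per CD, 1.2 mm per disc,
# decimal units (1 PB = 10**15 bytes).
PETABYTE = 10**15
CD_CAPACITY_BYTES = 700 * 10**6
CD_THICKNESS_MM = 1.2


def cd_stack_height_km(data_bytes):
    """Return the height in km of a stack of CDs holding data_bytes."""
    discs = data_bytes / CD_CAPACITY_BYTES
    return discs * CD_THICKNESS_MM / 1_000_000  # mm -> km


height = cd_stack_height_km(10 * PETABYTE)
print(f"{height:.1f} km")
```

Under these assumptions the stack comes out at roughly 17 km, the same order of magnitude as the 20 km quoted above.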


What we have

• In a review, IBM noted that

– over a 24-hour period, a UNIX server actually uses less than 10 percent of its capacity

– for mainframes, which ironically are targeted towards specific problems, the figure is only about 40 percent

– for desktops and laptops, the figure is a meager 5 percent
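To get a feel for what these utilization figures imply, the sketch below converts them into idle capacity, expressed as "whole machine equivalents". The fleet sizes are hypothetical and chosen only for illustration; the utilization fractions are the upper bounds quoted above.

```python
# Utilization figures quoted in the slide (fraction of capacity in use).
UTILIZATION = {"unix_server": 0.10, "mainframe": 0.40, "desktop": 0.05}


def idle_machine_equivalents(fleet):
    """Return unused capacity per machine class, in whole-machine equivalents."""
    return {kind: count * (1.0 - UTILIZATION[kind]) for kind, count in fleet.items()}


# Hypothetical fleet, for illustration only.
fleet = {"unix_server": 20, "mainframe": 2, "desktop": 500}
idle = idle_machine_equivalents(fleet)
for kind, machines in idle.items():
    print(f"{kind}: about {machines:.1f} machines' worth of idle capacity")
```

This is the capacity a grid aims to reclaim: most of it sits in the large population of lightly used desktops.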
Problem Domain

• Ever-increasing demand for a huge source of computing power that can be used for resource-hungry scientific and business processes.

• The need to utilize existing under-utilized resources, which are worth millions. This computing power has to be put to use without sacrificing availability, flexibility, accessibility, and many other factors.
Introduction to Grid Computing
• A ‘Grid’ is an infrastructure for resource sharing.
• Most Grids use the idle time of thousands or millions of computers throughout the world.
Introduction to Grid Computing

• Grid Computing
– Grid computing enables the sharing, selection, and aggregation of geographically distributed heterogeneous resources for solving large-scale problems in science, engineering, and commerce
Power Grid Analogy
• Electrical/Power Grid
– No need to worry about where the electricity comes from: simply plug in your toaster to get electrical power

– The infrastructure that makes this possible is called "the power grid", e.g. transmission lines, power stations, etc.

– Pervasive: electricity is available essentially everywhere

– Utility: you ask for electricity and pay for it

• Grid
– No need to worry about where the computing power you are using comes from: simply plug your computer into the Internet to get it

– The infrastructure that makes this possible is called "the Grid", i.e. it links PCs, servers, and network elements
– The Grid is meant to be pervasive: accessible through web services or a portal
– Utility: you ask for computing power or storage capacity, get it, and pay for it
What is Grid Computing?

• Carl Kesselman and Ian Foster wrote this definition in their book "The Grid: Blueprint for a New Computing Infrastructure":

"A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities."
What is Grid Computing?

• IBM's Grid Computing has put forward the following statement:

"Grid is the ability, using a set of open standards and protocols, to gain access to applications and data, processing power, storage capacity and other computing resources over the Internet."
Evolution
• 1996-1999 – Experimentation and core grid protocols – Globus Toolkit 1.0

• 1999 – Data grid and Globus Toolkit 2.0
• Medium-scale data management and analysis

• 2001 – OGSA with Globus Toolkit 3.0 and integration with Web services and resource virtualization, plus a number of higher-level services

• Problems – lack of common vocabulary, common infrastructure formulation, common intercommunication protocols and common interfaces or APIs.

• New services include common vocabulary and systemization

• 2003 onwards – more extensive standardization and computing

[Figure: grid evolution over time in three stages (early, second, third). Labels: personal devices, supercomputers, local cluster computing, local data grids, enterprise cluster/grid, partner grids, global grids]
Types of Grid

• Computational grid
– Setting aside resources specifically for computing power
– Most of the machines are high-performance servers

• Scavenging grid
– Used with large numbers of desktop machines
– Machines are scavenged for available CPU cycles and other
resources
– Owners of the desktop machines are usually given control
over when their resources are available to participate in the
grid
Types of Grid

• Data grid
– Housing and providing access to data across multiple
organizations
– Users are not concerned with where this data is located as
long as they have access to the data
What Grid computing can do?

• Every computer can access the resources of every other computer belonging to the network

• A scientist studying proteins logs into a computer and uses an entire network of computers to analyze data

• A businessman accesses his company's network through a PDA in order to forecast the future of a particular stock

• An Army official accesses and coordinates computer resources on three different military networks to formulate a battle strategy

• All of these scenarios have one thing in common: they rely on a concept called grid computing
What Grid computing can do?

Exploiting underutilized resources

[Figure: diagram showing two LANs]
What Grid computing can do?

Parallel CPU capacity

• Potential for massive parallel CPU capacity, of interest beyond pure scientific needs to industries such as the bio-medical field, financial modeling, oil exploration, motion picture animation, and many others

• Applications have been written to use algorithms that can be partitioned into independently running parts

• A CPU-intensive grid application can be thought of as many smaller “subjobs”, each executing on a different machine. To the extent that subjobs do not need to communicate with each other, the application is more “scalable”

• Barriers often exist to perfect scalability
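The subjob idea can be sketched on a single machine. In the example below, local worker threads stand in for grid machines (an assumption of this sketch, not how a grid dispatches work), and the job, a sum of squares, is a hypothetical workload chosen because its parts need no communication with each other.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start, stop, parts):
    """Partition [start, stop) into `parts` contiguous, non-overlapping chunks."""
    step, extra = divmod(stop - start, parts)
    chunks, lo = [], start
    for i in range(parts):
        hi = lo + step + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def subjob(chunk):
    """One independent subjob: it needs no data from any other subjob."""
    lo, hi = chunk
    return sum(n * n for n in range(lo, hi))


def run_job(start, stop, workers=4):
    """Run the subjobs concurrently and assemble their partial results."""
    chunks = split_range(start, stop, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(subjob, chunks))


total = run_job(0, 1000)
```

Because each chunk is processed in isolation, adding workers only changes how the range is split. A job whose parts had to exchange intermediate results would not partition this cleanly, which is one of the barriers to perfect scalability.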


What Grid computing can do?
Virtual resources and virtual organizations for collaboration

[Figure: simple view of heterogeneous, dispersed resources due to virtualization]

• Enable and simplify collaboration among a wider audience and allow heterogeneous systems to work together to form the image of a large virtual computing system offering a variety of virtual resources

• Sharing starts in the form of files or data

• “Data grid”
– Larger capacity than a single system
– Spanning data across systems can improve data transfer rates through the use of striping techniques
– Data can be duplicated to serve as a backup
– Sharing is not limited to files; it extends to resources such as equipment, software, services, and licenses
What Grid computing can do?

Access to additional resources

• In addition to CPU and storage resources, a grid can provide access to increased quantities of other resources and to special equipment

• Expensive licensed software

• Special devices like remote printers

• Special equipment

• Remote medical diagnostics

• Robotic surgery tools with two-way interaction from a distance


What Grid computing can do?
Resource balancing

• Can offer a resource balancing effect by scheduling grid jobs on machines with low utilization

• Larger peak loads are handled in two ways:
– An unexpected peak can be routed to relatively idle machines in the grid.
– If the grid is already fully utilized, the lowest priority work being performed on the grid can be temporarily suspended or even cancelled

[Figure: Jobs are migrated to less busy parts of the grid to balance loads]
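The two ways of handling peak load can be sketched as a single placement function. Everything here is a simplified, hypothetical model: the `Job` and `Machine` classes, the slot-based notion of utilization, and the convention that a higher number means higher priority are assumptions of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    priority: int  # higher number = more important (assumption of this sketch)


@dataclass
class Machine:
    name: str
    slots: int  # how many jobs the machine can run at once (must be >= 1)
    running: list = field(default_factory=list)

    def utilization(self):
        return len(self.running) / self.slots


def place(job, machines):
    """Place a job on the least-utilized machine. If the grid is full,
    suspend the lowest-priority running job to make room.
    Returns the suspended job, or None if nothing was suspended."""
    target = min(machines, key=lambda m: m.utilization())
    if target.utilization() < 1.0:
        target.running.append(job)
        return None
    # Grid is fully utilized: find the lowest-priority job anywhere.
    victim_machine, victim = min(
        ((m, j) for m in machines for j in m.running),
        key=lambda pair: pair[1].priority,
    )
    if victim.priority >= job.priority:
        raise RuntimeError("no lower-priority work to suspend")
    victim_machine.running.remove(victim)
    victim_machine.running.append(job)
    return victim
```

A real grid scheduler would also weigh data locality, licenses, and policy; this sketch only shows the routing and suspension decisions described above.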
What Grid computing can do?

Reliability

• High-end conventional computing systems use expensive hardware to increase reliability, at greater cost due to the duplication of high-reliability components

• Alternate approach - relies more on software technology than on expensive hardware

• Grid management software can automatically resubmit jobs to other machines on the grid when a failure is detected

• In real-time situations, multiple copies of the important jobs can be run on different machines throughout the grid

• Autonomic computing – software that automatically heals problems in the grid even before an operator or manager is aware of them
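Automatic resubmission can be illustrated with a small simulation. The machines below are plain Python callables and `MachineFailure` is a made-up exception type; both are assumptions of this sketch rather than part of any grid toolkit.

```python
class MachineFailure(Exception):
    """Raised by a (simulated) machine that fails while running a job."""


def run_with_resubmission(job, machines):
    """Submit the job to each machine in turn until one completes it.
    Returns (machine_name, result); raises MachineFailure if all fail."""
    failures = []
    for name, execute in machines:
        try:
            return name, execute(job)
        except MachineFailure as err:
            # Failure detected: record it and resubmit to the next machine.
            failures.append((name, str(err)))
    raise MachineFailure(f"job failed on all machines: {failures}")


def healthy(job):
    """Simulated machine that runs the job normally."""
    return job()


def broken(job):
    """Simulated machine that always fails."""
    raise MachineFailure("node unreachable")


name, result = run_with_resubmission(
    lambda: 6 * 7, [("node-a", broken), ("node-b", healthy)]
)
```

Here the job fails on `node-a` and is transparently rerun on `node-b`. Running several copies at once, as in the real-time case above, trades extra resource use for lower latency when a failure occurs.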
What Grid computing can do?
Management

• Offers management of priorities among different projects

• Administrators can change any number of policies that affect how the different organizations might share or compete for resources

• When maintenance is required, grid work can be rerouted to other machines without crippling the projects involved

• Autonomic computing - able to identify important trends throughout the grid, informing management of those that require attention

[Figure: Administrators can adjust policies to better allocate resources]
Types of Resources
Computation
• Computing cycles provided by the processors of the machines on the grid

• Three primary ways to exploit the computation resources:
– The simplest is to run an existing application on an available machine on the grid rather than locally
– Use an application designed to split its work so that the separate parts can execute in parallel on different processors
– Run an application that needs to be executed many times on many different machines in the grid
• Scalability is a measure of how efficiently the multiple processors on a grid are used
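One common way to put a number on scalability is speedup divided by processor count, often called parallel efficiency. The slides do not give a formula, so treat the definitions and the timings below as illustrative assumptions.

```python
def speedup(t_serial, t_parallel):
    """How many times faster the job finishes when run in parallel."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Fraction of the ideal (linear) speedup actually achieved."""
    return speedup(t_serial, t_parallel) / processors


# Hypothetical timings: 120 s on one processor, 20 s on eight.
s = speedup(120.0, 20.0)
e = efficiency(120.0, 20.0, 8)
print(f"speedup {s:.1f}x, efficiency {e:.0%}")
```

An efficiency of 1.0 would mean perfect scalability: twice the processors, half the run time. With these timings the job runs 6 times faster on 8 processors, an efficiency of 75%.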
Types of Resources
Storage

• “Data grid” - machines provide some quantity of storage for grid use

• Internal memory - temporary

• Secondary storage increases capacity, performance, sharing, and reliability of data

• Any individual file or database can span several storage devices and machines
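A file spanning several storage devices is typically laid out by striping. The sketch below is a minimal in-memory illustration, with Python lists standing in for devices; the block size and function names are assumptions of this sketch.

```python
def stripe(data, devices, block_size):
    """Distribute data round-robin, in fixed-size blocks, across `devices` lists."""
    stripes = [[] for _ in range(devices)]
    for i in range(0, len(data), block_size):
        stripes[(i // block_size) % devices].append(data[i:i + block_size])
    return stripes


def reassemble(stripes):
    """Read blocks back in round-robin order to rebuild the original data."""
    out = []
    longest = max(len(s) for s in stripes)
    for row in range(longest):
        for s in stripes:
            if row < len(s):
                out.append(s[row])
    return b"".join(out)


stripes = stripe(b"abcdefghij", devices=3, block_size=2)
restored = reassemble(stripes)
```

Because consecutive blocks live on different devices, they can be read in parallel, which is where the improved transfer rate comes from. Striping alone adds no redundancy; the backup benefit comes from keeping duplicate copies.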
Types of Resources

Communications

• Includes communications within the grid and external to the grid

• Bandwidth available for large data communications can often be a critical resource

• External communication access to the Internet

• Redundant communication paths are sometimes needed to better handle potential network failures and excessive data traffic
Types of Resources

Software and licenses

• Some software is too expensive to install on every grid machine

• Licensing fees can be a significant expense for an organization


Types of Resources

Jobs and applications

• Jobs are programs which may compute something, execute one or more system commands, move or collect data, or operate machinery

• Jobs may run in parallel on different machines in the grid

• An application is one or more jobs that are scheduled to run on machines in the grid - results are collected and assembled to produce the answer
Grid software components
Management components
• A component keeps track of the resources available to the grid users/members
• Measurement components determine both the capacities of the nodes on the grid and their current utilization rate at any given time. This information is used for scheduling and to determine the health of the grid, alerting on
– Any fault
– Congestion
– Over-commitment
• Advanced grid management software can automatically manage many aspects of the grid - known as autonomic computing, or recovery-oriented computing
Grid software components
Management components
• Such software can automatically recover from various kinds of grid failures and outages, finding alternative ways to get the workload processed
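The health checks performed by measurement components can be sketched as a function over per-node reports. The `NodeReport` fields and the 90 percent congestion threshold are hypothetical, chosen only to make the three alert types concrete.

```python
from dataclasses import dataclass


@dataclass
class NodeReport:
    name: str
    reachable: bool     # False means the node did not respond
    capacity: int       # job slots the node offers
    committed: int      # job slots already promised to jobs
    utilization: float  # fraction of capacity in use, 0.0 to 1.0


def alerts(reports, congestion_threshold=0.9):
    """Return (node, problem) pairs for faults, over-commitment and congestion."""
    found = []
    for r in reports:
        if not r.reachable:
            found.append((r.name, "fault"))
            continue  # nothing else can be said about an unreachable node
        if r.committed > r.capacity:
            found.append((r.name, "over-commitment"))
        if r.utilization >= congestion_threshold:
            found.append((r.name, "congestion"))
    return found
```

A scheduler could use the same reports to skip unhealthy nodes, while an autonomic layer could react to the alerts by moving work elsewhere.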
Grid software components
Distributed grid management

• Larger grids may have a hierarchical or other type of organizational topology

• The work involved in managing the grid is distributed to increase scalability

• A job may be passed by a central job scheduler to a lower-level scheduler that handles the assignment to a specific machine

Submission software

• Any member machine of a grid can be used to submit jobs and queries to the grid

• In some grids, this function is implemented as a separate component installed on submission nodes or submission clients

• When a grid is built using dedicated resources, separate submission software is usually installed on the user’s PC or workstation
Grid software components
Donor software

• Some sort of identification and authentication procedure must be performed before a machine can join the grid

• Certificate Authorities can be used to establish and ensure the identity of the donor machine as well as the users and the grid itself

• If it is possible to join the grid without any special authentication, or for any user to submit jobs to the grid, serious security problems arise

• The donor machine will usually have some sort of monitor

• The grid management and donor software must communicate with each other to send the job and receive the result
Grid software components
Schedulers
• Job scheduling software locates a machine to run a grid job submitted by a user
– Simplest scheduler - assigns jobs to machines in round-robin fashion
– More advanced scheduler - implements job priorities in the queue
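The two scheduling styles can be contrasted in a few lines. This is a sketch using only the standard library; the class and function names are made up for illustration, and a higher number is assumed to mean higher priority.

```python
import heapq
from itertools import count, cycle


def round_robin(jobs, machines):
    """Assign jobs to machines in turn, ignoring load and priority."""
    turn = cycle(machines)
    return [(job, next(turn)) for job in jobs]


class PriorityJobQueue:
    """Queue that releases the highest-priority job first (FIFO among equals)."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker that preserves submission order

    def submit(self, job, priority):
        # heapq is a min-heap, so negate the priority to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._order), job))

    def next_job(self):
        return heapq.heappop(self._heap)[2]


assignments = round_robin(["j1", "j2", "j3"], ["m1", "m2"])

queue = PriorityJobQueue()
queue.submit("nightly-report", priority=1)
queue.submit("urgent-analysis", priority=5)
queue.submit("urgent-rerun", priority=5)
order = [queue.next_job() for _ in range(3)]
```

Round-robin is simple and fair but blind to urgency; the priority queue lets important work jump ahead while keeping submission order among jobs of equal priority.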

Communications
• A grid system may include software to help jobs communicate with each other
• An application may split itself into a large number of subjobs
• Subjobs need to be able to locate other specific subjobs, establish communication, and send the appropriate data
Intragrid and Intergrid
• A simple grid consists of just a few machines - homogeneous systems in one department
• Intragrid - a configuration of heterogeneous machines; more types of resources are available across multiple departments, but within the same organization
• Intergrid - the grid may grow to cross organization boundaries and may be used to collaborate on projects of common interest
• The highest levels of security are usually required in this configuration
Comparison with P2P Applications

– Grid aggregates distributed high-end machines such as clusters, whereas P2P concentrates on low-end systems such as PCs

– Grid is targeted towards scientific and business domain applications

– Grid shares not only files but also IT resources

– In P2P, there is no guarantee of availability or system connectivity

– P2P is an approach towards decentralization without any particular effort towards virtualization as a single system image

– Grid is about bringing all the resources together, necessarily with equal commitment to sharing, and presenting a virtual single system image
Comparison with Clusters

• Clusters: a collection of computing nodes providing (primarily) processor or data sharing

• Clusters: connected using a high-speed local interconnection network in a local data centre

• Grid is a large-scale phenomenon, stretching over a huge geographical region

• Clusters are targeted more towards a specified number of users and a predefined set of applications, or in their most generic form are simply load-sharing systems

• Grid, on the other hand, has been designed for a dynamic number of users and applications
Comparison with Clusters
• Clusters present a single system image, wherein the user can view the complete collection of nodes as one single computer

• Whereas a Grid is about presenting a virtual image of a single large computing device

• In clusters, the guarantee of service is much more pronounced than in Grids

• Cluster nodes are expected to give their full resources and are fully devoted to the complete system

• Whereas in Grids, connected computing nodes are required to give only some or more of their computing power (as much as is available)

• Clusters are (generally) homogeneous collections of machines tightly coupled over a small region

• Grids are the complete opposite: heterogeneous collections of nodes, wherein the heterogeneity is masked by middle-layer applications
Comparison with Client/Server models
• Client/server models: a distributed form of computing nodes stretched across a large geographical region and providing end-user services (Web Services)

• These models are end-to-end oriented, session based, and coupled as group(s) of computers working in cycles and not necessarily for the same purpose

• Grids are designed with a view towards a computing machinery which presents a unified interface to the user

• For Client/Server, the interface changes with each change of service or server
Comparison with Distributed Computing

• Grid computing is a paradigm of distributed computing

• In a Grid system, the user is not required to know anything about the underlying topology or any individual node in particular

• Distributed computing is about firing requests at specific node(s)

• In a Grid, interaction is with the system as a whole and not with any node(s) in particular

• The resemblance is limited to the distributed nature


Comparison with Web Services
• The protocols of the WWW are open and general-purpose; they support access to distributed resources but not their coordinated use

• The Web is mainly focused on communication, whereas grid computing enables resource sharing and collaborative interplay towards a common goal

• The Web provides basic infrastructure for data exchange between two different distributed applications, whereas a grid aggregates high-end resources for solving large-scale problems

• Both hide the complexities of the system

• Web services are used to support grid computing, since the Web has emerged as the standards-based approach for accessing network applications
Conclusion
Grid computing appears to be a promising trend for three reasons:
• It makes more cost-effective use of a given amount of computer resources
• It provides a way to solve problems that can't be approached without an enormous amount of computing power
• It suggests that the resources of many computers can cooperate with each other to solve a common problem
