Computer Organization and Architecture
Bus arbitration is the process by which the current bus master relinquishes control of the bus and
passes it to another unit that is requesting the bus. The bus arbiter decides which device becomes the
next bus master. There are two approaches to bus arbitration:
1) Centralized:
A single bus arbiter performs the required arbitration.
2) Distributed:
All devices participate in the selection of the next bus master.
Parts of a BUS:
1) Address
2) Data
3) Control
Techniques of Arbitrating:
1) Daisy Chaining
It is a centralized bus arbitration technique. During any bus cycle the bus master may be any
device connected to the bus, such as a processor or a DMA controller. All devices are assigned static
priorities according to their positions along a bus grant control line. The device closest to the bus
arbiter has the highest priority. Requests for bus access are made on a common request line (BRQ).
Similarly, the line SACK is used to indicate that the bus is in use. When no device is using the bus,
SACK is inactive. The central bus arbiter propagates a bus grant signal BGT if the BRQ line is high
and the SACK signal indicates that the bus is idle. The first device along the chain that has issued a
bus request receives the BGT signal and stops its propagation. On completion, it resets the bus-busy
flag in the arbiter, and a new BGT signal is generated if other requests are outstanding. A device that
has not requested the bus simply passes the BGT signal on to the next device.
Advantages:
a) Simplicity
b) Scalability: Devices can be added or removed according to requirement.
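The grant propagation described above can be sketched in Python. This is only an illustration: the function name daisy_chain_grant, the list representation of the request line and the bus_busy flag (standing in for SACK) are assumptions, not part of the original notes.

```python
def daisy_chain_grant(requests, bus_busy=False):
    """Return the index of the device that captures BGT, or None.

    requests[i] is True if device i has raised BRQ; index 0 is the
    device closest to the arbiter, i.e. the highest priority.
    """
    brq = any(requests)            # common request line: high if any device requests
    if not brq or bus_busy:        # BGT is issued only if BRQ is high and the bus is idle
        return None
    for position, wants_bus in enumerate(requests):
        if wants_bus:
            return position        # this device stops the propagation of BGT
        # a device that did not request the bus passes BGT to its neighbour
    return None


print(daisy_chain_grant([False, True, True]))   # device 1 is nearer the arbiter than device 2
```

Because the scan always starts at position 0, a device far from the arbiter can wait indefinitely if nearer devices keep requesting the bus.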
2) Polling or rotating priority
In this method the devices are assigned unique priorities and compete for access to the bus, but the
priorities are changed dynamically to give every device an opportunity to access the bus. In this
scheme no central bus arbiter exists, and the bus grant line is connected from the last device back to
the first device in a closed loop.
Advantage:
This method does not favor any particular device or processor.
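A minimal sketch of one possible rotation rule, assuming that after each grant the search for the next master starts at the device just after the previous master on the loop. The notes do not fix the exact rule, so this choice and the name rotating_priority_grant are assumptions.

```python
def rotating_priority_grant(requests, last_master):
    """Pick the next bus master on a closed grant loop.

    The search starts at the device after the previous master, so the
    previous master has the lowest priority in the next round.
    """
    n = len(requests)
    for offset in range(1, n + 1):
        candidate = (last_master + offset) % n
        if requests[candidate]:
            return candidate
    return None                    # nobody is requesting the bus


master = 0
history = []
for _ in range(4):
    master = rotating_priority_grant([True, True, True], master)
    history.append(master)
print(history)                     # every device gets a turn
```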
3) Fixed priority:
In this method the bus control passes from one device to another only through the centralized
bus arbiter. Here each device has a dedicated BRQ output line and BGT input line. If there are m
devices, the bus arbiter has m BRQ inputs and m BGT outputs. The arbiter follows a priority order
in which each device has a different priority level. At any given time, the arbiter issues BGT to the
highest-priority device among those that have issued a bus request. This scheme needs more
hardware but gives a faster response.
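The arbiter with m BRQ inputs and m BGT outputs can be sketched as follows. The function name fixed_priority_grant and the priority_order parameter are hypothetical, chosen only for this illustration.

```python
def fixed_priority_grant(brq, priority_order):
    """Return the m BGT output lines of a fixed-priority arbiter.

    brq[i] is the dedicated request line of device i; priority_order
    lists the device numbers from highest to lowest priority.
    """
    bgt = [False] * len(brq)
    for device in priority_order:
        if brq[device]:
            bgt[device] = True     # exactly one grant line is asserted
            break
    return bgt


print(fixed_priority_grant([False, True, True], priority_order=[2, 0, 1]))
```

Unlike daisy chaining, the priority here does not depend on physical position: it is whatever order the arbiter is built with.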
An input/output (I/O) driver is a software module that issues commands to the I/O
controller in order to carry out various I/O operations. The following are some of the operations
performed through an I/O driver:
1) Reading a file from disk
2) Printing some lines by the printer
3) Displaying some message on the monitor
4) Storing some data on disk
The I/O driver program for a given peripheral device is developed only after the architecture
of its I/O controller is known. The I/O driver program and the I/O controller together carry out
the I/O operation on behalf of the corresponding peripheral device. After completing the I/O
operation, the I/O driver returns control to the calling program and passes back a return signal
indicating completion of the operation. The I/O drivers for the basic peripheral devices supported
by a general PC are part of the BIOS, which is physically stored in the ROM part of main memory.
The I/O drivers for other peripherals are supplied on a floppy diskette or CD. These programs are
stored on the hard disk and brought into RAM by the bootstrap program during booting.
CPU Organization
1) The different components in CPU:
a. ALU
b. CU
c. Register
One Machine Cycle:
a. Fetch Instruction
b. Decode Instruction
c. Execute Instruction
Registers:
o User-accessible registers (used by the system programmer)
General purpose register: it stores the operands.
Address register: it stores the address of an operand.
Data register: it stores data.
Condition code register
o Control and status registers (used by the CU of the CPU):
Program counter
Instruction register
Memory address register
Memory Buffer register
Stack
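The fetch, decode and execute steps and the roles of the PC, IR, MAR and MBR can be illustrated with a toy machine. This is a sketch only: the opcodes LOAD, ADD, STORE and HALT, the tuple encoding of instructions and the accumulator variable ac are assumptions made for the example.

```python
def run(memory, max_steps=100):
    """Run a toy program; each instruction is an (opcode, address) tuple."""
    pc, ac = 0, 0                      # program counter and accumulator
    for _ in range(max_steps):
        mar = pc                       # fetch: instruction address goes to MAR
        mbr = memory[mar]              # the addressed word arrives in MBR
        ir = mbr                       # and is copied into the instruction register
        pc += 1                        # PC now points at the next instruction
        opcode, address = ir           # decode
        if opcode == "LOAD":           # execute
            ac = memory[address]
        elif opcode == "ADD":
            ac += memory[address]
        elif opcode == "STORE":
            memory[address] = ac
        elif opcode == "HALT":
            break
    return memory


# Words 0-3 hold instructions, words 4-6 hold data.
program = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", None), 7, 8, 0]
print(run(program)[6])                 # 7 + 8
```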
Computer Instruction Set:
o Instruction format:
Types of Operations:
Arithmetic Operation(+,-,*,/)
Eg.
ADD R1,R2
ADD X
Logical Operation: It is based on Boolean operations
Eg.
AND X
OR X
NOT X
XOR X
CLR
Shift Operation: Here the bits of a word are shifted left or right, as in a logical shift
or a rotate.
Eg.
SHR
SHL
ROR //Rotate Right
ROL //Rotate Left
Data transfer Operation: It moves data from one location to another location in the
computer with no change in the data content.
Eg.
Load LD
Store ST
Move MOV
Push data PUSH
Pop data POP
Branch Instruction: It is sometimes called a jump instruction. One of its operands
indicates the address of the next instruction to be executed.
Eg.
BRZ Q //Branch to location Q if result is zero
BRP Q //Branch to location Q if result is positive
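The shift, rotate and NOT operations listed above can be tried out on an 8-bit word. The word width of 8 bits and the function names are assumptions for this sketch; the rotates here do not go through a carry bit.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1            # 0xFF keeps every result inside one 8-bit word


def shl(x):
    return (x << 1) & MASK         # logical shift left: 0 enters on the right


def shr(x):
    return x >> 1                  # logical shift right: 0 enters on the left


def rol(x):
    return ((x << 1) | (x >> (WIDTH - 1))) & MASK   # bit 7 wraps around to bit 0


def ror(x):
    return ((x >> 1) | (x << (WIDTH - 1))) & MASK   # bit 0 wraps around to bit 7


def not_(x):
    return ~x & MASK               # complement every bit of the word


x = 0b10010110
for name, op in [("SHL", shl), ("SHR", shr), ("ROL", rol), ("ROR", ror), ("NOT", not_)]:
    print(name, format(op(x), "08b"))
```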
o Operand Types:
Numbers
Characters
Logical Data
Address
CPU Organization:
o Single Accumulator based CPU organization // Eg. ADD X
Here X is the address of the operand. The ADD instruction performs the
operation:
AC <- AC + M[X]
Here AC is the accumulator register and M[X] denotes the memory
word located at address X, which is the operand.
MULT Y
AC <- AC*M[Y]
This organization was used in the PDP-8 processor, which was used for process
control and laboratory applications.
In the early days of computing, computers had only accumulator-based CPUs. This is
a simple CPU in which only the accumulator register is used for processing the
instructions of a program, and intermediate results are stored in this register.
The instruction format uses only one address field, which is why such a CPU is known
as a one-address machine.
Advantages:
It has short instructions, i.e., each instruction takes less memory space
The instruction cycle takes less time
General Register Based CPU Organization // Eg. ADD R1, R2
Instead of a single accumulator, multiple general purpose registers are used in this
type of CPU organization. It uses 2 or 3 address fields in its instructions:
Eg. MULT R1,R2,R3
//R1 <- R2*R3
MULT R1,R2
//R1 <- R1*R2
It was used in the PDP-11 and the IBM 360.
Advantages:
Efficiency increases
Less memory space is used to store the program since instructions are
used in a more compact way
Disadvantages:
The compiler needs to be more intelligent
Increases cost
Stack Register Based CPU Organization
The CPU uses a stack, i.e., a LIFO (last in, first out) mechanism. The register that holds
the address of the topmost operand in the stack is called the stack pointer or SP.
Two operations are performed: PUSH and POP
PUSH <memory address>
SP <- SP+1;
M[SP] <- M[<memory address>]
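A minimal sketch of PUSH and the matching POP on a memory stack that grows upward. The class name Stack, the fixed size of 16 cells and the use of -1 to mark an empty stack are assumptions; overflow and underflow checks are left out for brevity.

```python
class Stack:
    """Memory stack that grows upward, as in SP <- SP + 1 on a push."""

    def __init__(self, size=16):
        self.cells = [0] * size
        self.sp = -1                   # assumption: -1 marks an empty stack

    def push(self, value):
        self.sp += 1                   # SP <- SP + 1
        self.cells[self.sp] = value    # M[SP] <- value

    def pop(self):
        value = self.cells[self.sp]    # value <- M[SP]
        self.sp -= 1                   # SP <- SP - 1
        return value


s = Stack()
s.push(10)
s.push(20)
print(s.pop(), s.pop())                # last in, first out
```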
Addressing Scheme:
o Zero Addressing: A computer whose instructions have no address field in the instruction
format is called a zero-address computer.
A stack-organized computer does not use an address field for the instructions ADD or
MUL. The PUSH and POP instructions, however, need an address field to specify the
source or destination of the operand.
Eg. ADD
This specifies the operation
TOS <- (A+B)
where A and B are the two topmost items of the stack.
Evaluating X=(A+B) * (C+D)
//Using Zero Addressing
Push A
TOS<- A
Push B
TOS <-B
ADD TOS <- (A+B)
Push C
TOS <- C
Push D
TOS <-D
ADD TOS <- (C+D)
MUL TOS <- (A+B) * (C+D)
Pop X
M[X] <- TOS
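The zero-address program above can be checked with a small stack-machine interpreter. This is a sketch: the function name run_stack_program, the string form of the instructions and the sample values of A to D are assumptions.

```python
def run_stack_program(program, memory):
    """Execute PUSH/POP/ADD/MUL instructions on an operand stack."""
    stack = []
    for instruction in program:
        op, *addr = instruction.split()
        if op == "PUSH":
            stack.append(memory[addr[0]])      # TOS <- M[addr]
        elif op == "POP":
            memory[addr[0]] = stack.pop()      # M[addr] <- TOS
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()    # ADD has no address field
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory


memory = {"A": 2, "B": 3, "C": 4, "D": 5, "X": 0}
program = ["PUSH A", "PUSH B", "ADD", "PUSH C", "PUSH D", "ADD", "MUL", "POP X"]
print(run_stack_program(program, memory)["X"])   # (2+3)*(4+5)
```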
o One Addressing: One-address instructions use an implied accumulator register (AC) for
all data manipulation. All operations are performed between the accumulator register
and a memory operand. Here T is a temporary memory location that holds the intermediate result.
LOAD A
AC <- M[A]
ADD B
AC <- AC+M[B]
STORE T
M[T] <-AC
LOAD C
AC <- M[C]
ADD D
AC <-AC+M[D]
MUL T
AC <- AC*M[T]
STORE X
M[X] <- AC
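A sketch of an accumulator machine that runs the one-address program for X = (A+B)*(C+D). The function name run_accumulator_program, the string form of the instructions and the sample operand values are assumptions.

```python
def run_accumulator_program(program, memory):
    """Execute one-address instructions that all work through AC."""
    ac = 0                                   # the single implied accumulator
    for instruction in program:
        op, addr = instruction.split()
        if op == "LOAD":
            ac = memory[addr]                # AC <- M[addr]
        elif op == "ADD":
            ac = ac + memory[addr]           # AC <- AC + M[addr]
        elif op == "MUL":
            ac = ac * memory[addr]           # AC <- AC * M[addr]
        elif op == "STORE":
            memory[addr] = ac                # M[addr] <- AC
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory


memory = {"A": 2, "B": 3, "C": 4, "D": 5, "T": 0, "X": 0}
program = ["LOAD A", "ADD B", "STORE T", "LOAD C", "ADD D", "MUL T", "STORE X"]
print(run_accumulator_program(program, memory)["X"])   # (2+3)*(4+5)
```

The temporary location T is needed because the single accumulator cannot hold both partial sums at once.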
o Two Addressing: It is the most common format in computers. Here each address field can specify
either a processor register or a memory word
MOV R1,A R1 <-M[A]
ADD R1,B R1<-R1+M[B]
MOV R2,C R2 <- M[C]
ADD R2,D R2 <-R2+M[D]
MUL R1,R2 R1<-R1*R2
MOV X,R1 M[X] <- R1
The MOV instruction transfers the operands to and from memory and processor
registers.
o Three Addressing: A computer with a three-address instruction format can use each address
field to specify either a processor register or a memory operand
ADD R1,A,B
R1<-M[A]+M[B]
ADD R2,C,D
R2<-M[C]+M[D]
MUL X,R1,R2
M[X]<-R1*R2
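The two-address and three-address programs can be compared with one small interpreter. This is a sketch: the function names, the convention that names of the form R<number> are registers and everything else is a memory word, and the sample operand values are all assumptions.

```python
def is_register(name):
    return name.startswith("R") and name[1:].isdigit()


def run_register_program(program, memory):
    """Execute MOV/ADD/MUL with two or three address fields."""
    regs = {}

    def read(name):
        return regs[name] if is_register(name) else memory[name]

    def write(name, value):
        if is_register(name):
            regs[name] = value
        else:
            memory[name] = value

    for instruction in program:
        op, operands = instruction.split(maxsplit=1)
        fields = [f.strip() for f in operands.split(",")]
        dest = fields[0]
        if op == "MOV":
            write(dest, read(fields[1]))
            continue
        # In the two-address form the destination is also the first source.
        src1, src2 = (fields[1], fields[2]) if len(fields) == 3 else (dest, fields[1])
        if op == "ADD":
            write(dest, read(src1) + read(src2))
        elif op == "MUL":
            write(dest, read(src1) * read(src2))
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory


two_address = ["MOV R1,A", "ADD R1,B", "MOV R2,C", "ADD R2,D", "MUL R1,R2", "MOV X,R1"]
three_address = ["ADD R1,A,B", "ADD R2,C,D", "MUL X,R1,R2"]
for prog in (two_address, three_address):
    memory = {"A": 2, "B": 3, "C": 4, "D": 5, "X": 0}
    print(len(prog), "instructions ->", run_register_program(prog, memory)["X"])
```

The three-address program is shorter, but each of its instructions needs more bits to hold three address fields.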
Addressing Modes: The ALU of the CPU executes the instructions as dictated by the opcode field
of an instruction. The instructions are executed on some data stored in memory or register. The
different ways in which the location of an operand is specified in an instruction are referred to
as addressing modes.
o Implied (inherent) mode: Here the operands are indicated implicitly by the instruction.
The accumulator register is used to hold the operand, and after the instruction is
executed the result is stored in the same register.
Eg. RAL;
//Rotate the content of the accumulator left through carry
CMA;
//Complement the content of the accumulator
o Immediate Mode: Here the operand is given explicitly in the instruction. The address
field contains the operand value itself rather than the address of the operand.
Eg. MVI A, 06; //loads the equivalent binary value of 06 into the accumulator
ADD 06;
o Stack Addressing Mode: Stack-organized computers use stack-addressed instructions; here
all the operands for an operation are taken from the top of the stack.
The instruction does not have any operand field.
Eg. SUB //when the SUB instruction is executed, two operands are popped
automatically one by one. After subtraction, the result is pushed onto the stack.
o Register (Direct): Here the processor registers hold the operands, i.e., the address field is
now a register field, which names the register containing the operand required by the instruction.
Eg. ADD R1,R2; //adds the contents of registers R1 and R2 and stores the result in R1
o Register Indirect: In this mode the instruction specifies a CPU register that
holds the address of the operand in memory, i.e., the address field names a register which
contains the memory address of the operand.
NOTE:
o Effective Address: The effective address is defined to be the memory address obtained
from computation dictated by the given addressing mode.
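How the operand is located in three of the modes above can be sketched as follows. The function name fetch_operand, the mode names used as strings and the sample register and memory contents are assumptions for illustration.

```python
def fetch_operand(mode, field, registers, memory):
    """Return (effective_address, operand) for a few addressing modes.

    effective_address is None when the operand does not come from memory.
    """
    if mode == "immediate":            # the field is the operand itself
        return None, field
    if mode == "register":             # the field names a register holding the operand
        return None, registers[field]
    if mode == "register_indirect":    # the register holds the operand's memory address
        ea = registers[field]
        return ea, memory[ea]
    raise ValueError(f"unsupported mode: {mode}")


registers = {"R1": 7, "R2": 100}
memory = {100: 42}
print(fetch_operand("immediate", 6, registers, memory))
print(fetch_operand("register", "R1", registers, memory))
print(fetch_operand("register_indirect", "R2", registers, memory))
```

Only the register indirect case produces an effective address here, because it is the only one of the three that reads the operand from memory.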
Classification of parallel computer architecture:
o Flynn's classification (1966)
o Feng's classification (1972)
o Handler's classification (1977)
Flynn's classification of parallel computers:
o SISD (Single instruction stream, single data stream)
A conventional uniprocessor, which executes one instruction stream on one data stream, belongs
to this class.
o MISD (Multiple instruction stream, single data stream)
This kind of architecture is not used in practice; it is a concept. There are n processor elements,
each receiving distinct instructions to execute on the same data stream and its derivatives. The
output of one processor element becomes the input of the next processor element in the
series. This architecture is also associated with systolic arrays. No practical machine of this
class exists.
o SIMD (Single instruction stream, multiple data stream)
Array processors fall into this category. Here multiple processor elements are supervised by a
common control unit. All processor elements receive the same instruction stream, which is
broadcast from the CU, but operate on different data streams. This kind of machine is generally
used for vector-type data.
Eg. Illiac-IV
o MIMD (Multiple instruction stream, multiple data stream):
This category covers multiprocessor systems and multiple-computer systems. An MIMD system is
called tightly coupled if the degree of interaction among the processors is high; otherwise it is
called loosely coupled. Most commercial MIMD computers are loosely coupled. Eg. IBM
370, UNIVAC
Pipeline Hazards: Pipeline hazards are situations that prevent the next instruction in the
instruction stream from executing during its designated clock cycle. The instruction is then said to be
stalled. When an instruction is stalled, all instructions later in the pipeline than the stalled
instruction are also stalled. Instructions earlier than the stalled one can continue. No new
instructions are fetched during the stall.
Types:
o Control Hazard
A typical computer program consists of the following mix of instructions:
Arithmetic(60 %)
Store(15%)
Branch(5%)
Conditional Branch(20%)
Arithmetic and store instructions do not alter the sequential execution order
of the program. This implies that the pipeline flow is linear. However, a
branch-type instruction may alter the program counter's contents in order to
jump to a location other than the next instruction. This means that
branch-type instructions may have adverse effects on pipeline performance.
Solution of Control Hazards: In order to cope with the adverse effects of branch
instructions, an important technique called prefetching is used. The prefetching
technique states that instruction words ahead of the one currently being decoded
in the instruction decode (ID) stage are fetched from the memory system before
the ID stage requests them.
o Data Hazard
Inter-instruction dependencies may arise that prevent the sequential (in-order)
data flow in the pipeline when successive instructions overlap their fetch,
decode and execution in a pipelined processor. A situation caused by such
inter-instruction dependencies is called a data hazard.
We have two instructions, I1 and I2. In a pipeline the execution of I2 can start
before I1 has terminated. If in a certain stage of the pipeline, I2 needs the result
produced by I1, but this result has not yet been generated, we have a data
hazard.
According to the various data update patterns in an instruction pipeline,
three classes of data hazards exist:
Write After Read (WAR) hazards
Read After Write (RAW) hazards
Write After Write (WAW) hazards
- Note that read-after-read is not a hazard, because nothing is changed
by a read operation.
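The three classes can be detected from the registers that two instructions read and write. This is a sketch under stated assumptions: the function name classify_data_hazards, the (reads, writes) representation and the example instructions in the comments are hypothetical, and the check reports potential hazards only; whether a stall really occurs depends on the pipeline.

```python
def classify_data_hazards(first, second):
    """Classify dependencies between two instructions.

    first and second are (reads, writes) pairs of register-name sets;
    `first` comes earlier in program order.
    """
    reads1, writes1 = first
    reads2, writes2 = second
    hazards = set()
    if writes1 & reads2:
        hazards.add("RAW")     # second reads what first writes
    if reads1 & writes2:
        hazards.add("WAR")     # second overwrites what first still has to read
    if writes1 & writes2:
        hazards.add("WAW")     # both write the same register
    return hazards             # read-after-read never appears here


i1 = ({"R2", "R3"}, {"R1"})    # e.g. R1 <- R2 + R3
i2 = ({"R1", "R4"}, {"R5"})    # e.g. R5 <- R1 * R4
print(classify_data_hazards(i1, i2))
```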
o Structural Hazard
It occurs when a certain resource (memory, functional unit) is requested by more
than one instruction at the same time.
Eg. The instruction ADD R4, X fetches operand X from memory in the OF stage during the 3rd
clock period. The memory does not accept another access during that period.
Therefore the (i+2)th instruction cannot be initiated in the 3rd clock period to fetch its
instruction from memory. Thus the pipeline is stalled for one clock cycle for all
subsequent instructions. This is shown in the figure.
Clock cycle    1     2     3     4     5     6     7     8     9
ADD R4,X       IF    ID    OF    EX    WB
Instr. i+1           IF    ID    OF    EX    WB
Instr. i+2                 stall IF    ID    OF    EX    WB
Instr. i+3                             IF    ID    OF    EX    WB
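The one-cycle stall in the example can be reproduced with a simplified model in which the memory has a single port and the only conflict counted is an instruction fetch colliding with an earlier instruction's memory access in OF. The function name schedule and this conflict rule are assumptions made for the sketch.

```python
STAGES = ["IF", "ID", "OF", "EX", "WB"]


def schedule(memory_operand):
    """Return the clock cycle (1-based) in which each instruction is fetched.

    memory_operand[k] is True if instruction k reads memory in its OF stage.
    """
    fetch_cycles = []
    busy = set()                   # cycles in which memory is taken by an OF access
    cycle = 1
    for uses_memory in memory_operand:
        while cycle in busy:       # structural hazard: stall the fetch by one cycle
            cycle += 1
        fetch_cycles.append(cycle)
        if uses_memory:
            busy.add(cycle + STAGES.index("OF"))
        cycle += 1
    return fetch_cycles


# ADD R4,X reads memory in OF; the next three instructions do not.
print(schedule([True, False, False, False]))
```

With these inputs the third instruction is fetched in cycle 4 instead of cycle 3, and every later instruction is delayed by the same one cycle.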
CISC Characteristics
The essential goal of a CISC architecture is to provide a single machine instruction for
each statement that is written in a high-level language. Examples of CISC architectures are the
Digital Equipment Corporation VAX computer and the IBM 370 computer.
The major characteristics of CISC architecture are:
o A large number of instructions, typically from 100 to 250
o Some instructions perform specialized tasks and are used infrequently
o A large variety of addressing modes, typically from 5 to 20 different modes
o A small number of general purpose registers
o Clock cycles per instruction lie between 2 and 15
o Mostly microprogrammed computers use this architecture
RISC Characteristics
The concept of RISC architecture involves an attempt to reduce execution time by simplifying
the instruction set of the computer. The major characteristics of a RISC processor are:
o Relatively few instructions
o Relatively few addressing modes
o Large number of general purpose registers
o Clock cycles per instruction lie between 1 and 2
o Memory access limited to load and store instructions
o All operations done within the registers of the CPU
o Single cycle instruction execution
o Hardwired rather than microprogrammed control
The control unit is a component of a computer's central processing unit (CPU) that
directs operation of the processor. It tells the computer's memory, arithmetic/logic
unit and input and output devices how to respond to a program's instructions.