Processor Performance Enhancement
Clock Speed
The clock speed of a processor is the number of FDE (fetch-decode-execute) cycles it can perform per second; for example, 3.4 GHz is 3,400,000,000 cycles per second. The higher the clock speed, the greater the number of FDE cycles that can occur each second, so higher clock speeds improve performance.
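As a minimal worked sketch (not part of the original notes, and using the 3.4 GHz figure above only as an illustration), the time taken by a single cycle is simply the reciprocal of the clock speed:

```c
#include <stdio.h>

int main(void) {
    /* Illustrative figure only: a 3.4 GHz clock performs 3.4 billion cycles per second. */
    double clock_hz = 3.4e9;

    /* The duration of one cycle is the reciprocal of the clock speed. */
    double cycle_time_s = 1.0 / clock_hz;

    printf("Cycles per second: %.0f\n", clock_hz);
    printf("Time per cycle:    %.3g seconds (about %.3g nanoseconds)\n",
           cycle_time_s, cycle_time_s * 1e9);
    return 0;
}
```

At 3.4 GHz each cycle lasts roughly 0.3 nanoseconds, which is why even small savings per cycle add up to a noticeable performance difference.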
Cache Memory
Cache memory is memory built directly into the CPU. It is far faster to access than RAM, as it is much closer to the registers that require the data, and it operates at a speed similar to the CPU's, so it has very little lag compared to retrieving data from RAM. There are multiple levels of cache, Level 1, 2 and 3, which differ in size and distance from the CPU. Level 1 is closest to the CPU and is the smallest, whereas Level 3 is furthest from the CPU and is the biggest. The reason all the cache isn't simply made Level 1 is cost: faster memory is more expensive.
Frequently used data such as parts of the OS are stored in the cache, allowing the processor to
run faster as it doesn't have to rely on slower fetches from the RAM as often.
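The benefit of the cache can be seen from software without any special hardware access. The C sketch below (an illustration, not from the original notes; the array size is arbitrary) sums the same array twice: the row-by-row pass mostly reuses data already brought into the cache, while the column-by-column pass keeps forcing slower fetches from RAM, so on real hardware it usually takes noticeably longer.

```c
#include <stdio.h>
#include <time.h>

#define N 4096

static int grid[N][N];

/* Sum the grid row by row: consecutive elements are adjacent in memory,
   so most accesses hit data already loaded into the cache. */
static long long sum_row_major(void) {
    long long total = 0;
    for (int row = 0; row < N; row++)
        for (int col = 0; col < N; col++)
            total += grid[row][col];
    return total;
}

/* Sum the grid column by column: successive accesses are far apart in
   memory, so the cache is used poorly and RAM is consulted far more often. */
static long long sum_column_major(void) {
    long long total = 0;
    for (int col = 0; col < N; col++)
        for (int row = 0; row < N; row++)
            total += grid[row][col];
    return total;
}

int main(void) {
    /* Fill the grid so both sums have a non-trivial value. */
    for (int row = 0; row < N; row++)
        for (int col = 0; col < N; col++)
            grid[row][col] = 1;

    clock_t start = clock();
    long long a = sum_row_major();
    double row_time = (double)(clock() - start) / CLOCKS_PER_SEC;

    start = clock();
    long long b = sum_column_major();
    double col_time = (double)(clock() - start) / CLOCKS_PER_SEC;

    printf("Row-major sum    %lld in %.3f s\n", a, row_time);
    printf("Column-major sum %lld in %.3f s\n", b, col_time);
    return 0;
}
```

On a typical desktop the column-by-column pass is often several times slower, even though both loops do exactly the same arithmetic, purely because of how well the cache is used.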
Number of Cores
A multi-core processor is one which contains multiple distinct processing units within a single CPU. Each core can operate independently of the others and has its own low-level cache, as well as sharing a high-level cache with the other cores. Different cores can run different applications at the same time during multitasking, allowing more actions to occur each second.
However, multiple cores do not always improve processor performance; if a core is processing an instruction which is dependent on the outcome of another instruction being processed by another core, it will have to wait until that core returns a result before the instruction can finish processing. This can cause hanging, where a core is wasted while it waits for the result of another instruction's execution.
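As an illustration only (the worker function and the counts here are invented for the example), the C sketch below starts two independent tasks that the operating system is free to schedule on separate cores, then waits for both before performing a calculation that depends on their results, much like a core waiting on another core's output.

```c
#include <pthread.h>
#include <stdio.h>

/* Each worker stands in for an independent task that the OS may
   schedule on a separate core. */
static void *count_to_a_billion(void *arg) {
    long long *result = arg;
    long long total = 0;
    for (long long i = 0; i < 1000000000LL; i++)
        total += 1;
    *result = total;
    return NULL;
}

int main(void) {
    pthread_t worker_a, worker_b;
    long long result_a = 0, result_b = 0;

    /* Two independent tasks can genuinely run at the same time on a
       multi-core processor. */
    pthread_create(&worker_a, NULL, count_to_a_billion, &result_a);
    pthread_create(&worker_b, NULL, count_to_a_billion, &result_b);

    /* A dependent calculation cannot start until both results exist:
       the main thread simply waits, just as a core "hangs" while it
       waits for an instruction another core has not finished yet. */
    pthread_join(worker_a, NULL);
    pthread_join(worker_b, NULL);

    printf("Combined result: %lld\n", result_a + result_b);
    return 0;
}
```

Compiled with -pthread, the two workers typically finish in roughly the time one would take alone, but the final addition is still held up until the slower of the two has returned.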
Pipelining
Pipelining lets the different stages of the processor work on different instructions at the same time: one instruction is being executed while the next instruction is being decoded and the instruction after that is being fetched. This keeps the components of the CPU busy, giving high efficiency.
Pipelining does have its downsides; for instance, it is not always possible to predict which instruction needs to be fetched and decoded next. A program that contains IF statements will require 'instruction jumps'. When a jump occurs, the pipeline must be 'flushed' and the correct instruction obtained. The more often the pipeline is flushed, the smaller the benefit pipelining provides.
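The timing of a pipeline can be pictured as a table of steps and instructions. The short C sketch below (illustrative only, using a hypothetical three-stage pipeline of fetch, decode and execute) prints which instruction occupies which stage at each step; once the pipeline has filled, one instruction completes on every step.

```c
#include <stdio.h>

#define NUM_INSTRUCTIONS 5
#define NUM_STAGES 3   /* fetch, decode, execute */

int main(void) {
    const char *stages[NUM_STAGES] = {"Fetch", "Decode", "Execute"};

    /* At step s, instruction i occupies stage (s - i), provided that value
       is a valid stage index. This produces the usual diagonal layout of a
       pipeline diagram. */
    for (int step = 0; step < NUM_INSTRUCTIONS + NUM_STAGES - 1; step++) {
        printf("Step %d:", step + 1);
        for (int instr = 0; instr < NUM_INSTRUCTIONS; instr++) {
            int stage = step - instr;
            if (stage >= 0 && stage < NUM_STAGES)
                printf("  I%d=%s", instr + 1, stages[stage]);
        }
        printf("\n");
    }
    return 0;
}
```

A flush after a jump would amount to discarding the partly fetched and decoded instructions in the table and starting the diagonal again from the jump target, which is where the lost benefit comes from.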
Contemporary Architecture
This is a modified form of Harvard architecture. It relaxes the strict separation of data and instructions, but still lets the CPU use more than one memory bus where that helps. Contemporary processors use a mixture of von Neumann and Harvard architecture: von Neumann architecture between main memory and the CPU, and Harvard architecture inside the CPU, where the control unit works with separate instruction and data caches.
The details of the von Neumann and Harvard architectures are as follows.
Von Neumann Architecture
In this architecture, instructions and data are stored in the same memory and travel over the same buses. The CPU issues control signals to fetch an instruction from memory. It then decodes the instruction and finally issues more control signals to the hardware to actually execute it.
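As a minimal sketch of this idea (the three-instruction machine code below is entirely hypothetical), a von Neumann machine can be simulated with a single array acting as the shared memory for both instructions and data:

```c
#include <stdio.h>

/* A minimal von Neumann sketch: instructions and data live in the SAME
   memory array, and the CPU alternates between fetching an instruction
   and accessing data in that memory over the same path. */

enum { HALT = 0, LOAD = 1, ADD = 2 };

int main(void) {
    /* One shared memory: cells 0-5 hold instructions, cells 8-9 hold data. */
    int memory[16] = {
        LOAD, 8,      /* acc = memory[8]        */
        ADD,  9,      /* acc = acc + memory[9]  */
        HALT, 0,
        0, 0,
        40, 2         /* data: memory[8] = 40, memory[9] = 2 */
    };

    int pc = 0, acc = 0;

    for (;;) {
        int opcode  = memory[pc];      /* fetch the instruction...          */
        int operand = memory[pc + 1];  /* ...over the same memory as data   */
        pc += 2;

        if (opcode == HALT) break;               /* decode and execute */
        else if (opcode == LOAD) acc = memory[operand];
        else if (opcode == ADD)  acc += memory[operand];
    }

    printf("Accumulator: %d\n", acc);   /* prints 42 */
    return 0;
}
```

Because the single memory serves both purposes, an instruction fetch and a data access cannot happen in the same step, which is exactly the bottleneck the Harvard arrangement removes.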
Harvard Architecture
This architecture is almost identical to von Neumann; however, it stores data and instructions in separate memory units. The CPU is also capable of reading an instruction and performing a data memory access at the same time, even without a cache, because there are separate buses for the data and the instructions. A Harvard architecture can therefore have two entirely separate memory systems and still make full use of pipelining.
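For comparison, a minimal sketch of the same hypothetical machine in a Harvard arrangement keeps instructions and data in separate arrays, mirroring the separate memories and buses that let a real Harvard CPU fetch an instruction and access data at the same time:

```c
#include <stdio.h>

/* A Harvard-style variation of the earlier sketch: instructions and data
   are held in SEPARATE memories with separate buses, so on real hardware
   an instruction fetch and a data access could occur in the same step.
   The instruction set is the same hypothetical one as before. */

enum { HALT = 0, LOAD = 1, ADD = 2 };

int main(void) {
    /* Separate instruction memory ... */
    int instruction_memory[] = { LOAD, 0, ADD, 1, HALT, 0 };
    /* ... and separate data memory. */
    int data_memory[] = { 40, 2 };

    int pc = 0, acc = 0;

    for (;;) {
        int opcode  = instruction_memory[pc];     /* instruction bus */
        int operand = instruction_memory[pc + 1];
        pc += 2;

        if (opcode == HALT) break;
        else if (opcode == LOAD) acc = data_memory[operand];   /* data bus */
        else if (opcode == ADD)  acc += data_memory[operand];
    }

    printf("Accumulator: %d\n", acc);   /* prints 42 */
    return 0;
}
```

The program produces the same result as the von Neumann sketch; the only change is where the instructions and data live, which is precisely the distinction between the two architectures.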