Microprocessor of TU IT Adhikrit
Microprocessor
A Microprocessor is a programmable device that takes in input, performs some arithmetic and logical operations over it and produces the desired output.
In simple words, a Microprocessor is a digital device on a chip that can fetch instructions from memory, decode and execute them, and give results. It is the central part of a computer system; without it, the computer cannot perform any operation.
How does a Microprocessor Work?
The working of a microprocessor can be understood by breaking it down into the following four key steps −
Fetch − It is the very first function that a microprocessor performs. In this step, the microprocessor accesses data and instructions from the memory unit or an input device.
Decode − After receiving data and instructions, the microprocessor decodes them and interprets for the computing process.
Execute − In this step, the microprocessor performs the requested operations on the data.
Store − Finally, the results produced by the operations are stored in the memory unit.
Hence, a typical microprocessor completes its working in four steps, where each step represents a specific task or function.
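To make these four steps concrete, here is a minimal C sketch of a fetch-decode-execute loop. The three opcodes, the memory layout, and the register names are invented for illustration; a real microprocessor defines these in its instruction set.

```c
#include <stdio.h>
#include <stdint.h>

/* Made-up opcodes for this toy machine */
enum { LOAD = 0x01, ADD = 0x02, HALT = 0xFF };

int main(void) {
    /* program: LOAD [10]; ADD [11]; HALT -- data: 7 at address 10, 5 at 11 */
    uint8_t memory[16] = { LOAD, 10, ADD, 11, HALT, 0, 0, 0, 0, 0, 7, 5 };
    uint8_t acc = 0;  /* accumulator */
    uint8_t pc  = 0;  /* program counter */

    for (;;) {
        uint8_t opcode = memory[pc++];                      /* 1. fetch  */
        switch (opcode) {                                   /* 2. decode */
        case LOAD: acc = memory[memory[pc++]]; break;       /* 3. execute */
        case ADD:  acc = acc + memory[memory[pc++]]; break;
        case HALT: printf("result = %u\n", acc); return 0;  /* 4. result kept in acc */
        default:   return 1;                                /* unknown opcode */
        }
    }
}
```

Running it prints result = 12: the value 7 is loaded from location 10, 5 is added from location 11, and the result remains in the accumulator.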
Block Diagram of a Microprocessor
A microprocessor takes a set of instructions in machine language and executes them; the instructions tell the processor what to do. The microprocessor performs three basic things while executing an instruction:
It performs some basic operations like addition, subtraction, multiplication, division, and some logical operations using its Arithmetic and Logical Unit (ALU). New Microprocessors also perform operations on floating-point numbers.
Data in microprocessors can move from one location to another.
It has a Program Counter (PC) register that stores the address of the next instruction to be executed. Based on the value of the PC, the microprocessor can jump from one location to another and make decisions.
Features of a Microprocessor
The features of a microprocessor include −
Processing Power − Microprocessors are capable of performing millions or even billions of calculations per second, enabling them to execute complex tasks quickly.
Versatility − Microprocessors can execute a wide range of instructions and tasks, making them suitable for various applications, from simple household appliances to advanced computers.
Integration − Microprocessors integrate multiple components, such as arithmetic logic units (ALUs), control units, and memory, onto a single chip, reducing size and complexity while increasing efficiency.
Scalability − Microprocessors come in different configurations and speeds, allowing them to be modified to the specific needs of different devices and applications.
Power Efficiency − Modern microprocessors are designed to operate efficiently while consuming minimal power, making them suitable for battery-powered devices like smartphones and laptops.
Interconnectivity − Microprocessors can communicate with other components and devices through input/output ports, enabling them to interact with external sensors, displays, and storage devices.
Applications of Microprocessors
Today, microprocessors are being used in almost all electronic devices and systems used in households and industries. Some common applications of microprocessors depending on their nature are listed below −
Microprocessors are used in a wide range of common electronic and computing devices like laptops, desktops, smart watches, smart TVs, etc.
Microprocessors are also used in microcontrollers to perform data processing and control operations.
Microprocessors specially designed for digital signal processing are used in applications like telecommunication, audio processing, image processing, etc.
Microprocessors are also used in robotic and autonomous devices like surveillance drones, autonomous aircraft, etc.
Application-Specific Integrated Circuits (ASICs) are specialized processors designed and customized for one specific task, depending on the application requirements.
GPUs (Graphics Processing Units) are processors designed and used for performing high-performance graphics functions.
Bus System of Microprocessor Based System /Bus Organization
A bus is a group of conducting wires that carries information; all the peripherals are connected to the microprocessor through buses. A system bus is simply a group of wires that carries bits.
The MPU (Micro Processing Unit) performs primarily four operations:
Memory Read: Read data (or instructions) from memory.
Memory Write: Write data (or instructions) into memory.
I/O Read: Accepts data from input devices.
I/O Write: Sends data to output devices.
The bus organization of the 8085 microprocessor is usually shown as a diagram.
The types of bus in the microprocessor are:
Address Bus
Data Bus
Control Bus
i. Address Bus:-
The address bus carries information about the location of data in the memory.
The address bus is unidirectional because data flows in only one direction: from the microprocessor to memory, or from the microprocessor to input/output devices.
The address bus of the 8085 microprocessor is 16 bits wide (that is, four hexadecimal digits), ranging from 0000H to FFFFH. Since the 8085 can place a 16-bit address on the bus, it can address 2^16 = 65,536 different memory locations, i.e., 64 KB of memory.
Address Bus is used to perform the first function, identifying a peripheral or a memory location.
ii. Data Bus:-
The data bus allows data to travel between the microprocessor (CPU) and memory (RAM).
The data bus is bidirectional because data flows in both directions: from the microprocessor to memory or input/output devices, and from memory or input/output devices to the microprocessor. The data bus of the 8085 microprocessor is 8 bits wide (that is, two hexadecimal digits), ranging from 00H to FFH.
The data bus is used to perform the second function, transferring binary information.
iii. Control Bus:-
The control bus carries the control signals that coordinate all the associated peripherals. The microprocessor uses the control bus to indicate what is to be done with the selected memory location or I/O device. The main control signals are:
a. memory read
b. memory write
c. I/O read
d. I/O write
Intel 8085 Microprocessor Architecture
A microprocessor is fabricated on a single integrated circuit (IC) or chip that is used as a central processing unit (CPU). The 8085 microprocessor is an 8-bit microprocessor that was developed by Intel in the mid-1970s. It was widely used in the early days of personal computing and was a popular choice for enthusiasts due to its simplicity and ease of use. The architecture of the 8085 microprocessor consists of several key components, including the accumulator, registers, program counter, stack pointer, instruction register, flags register, data bus, address bus, and control bus.
The accumulator is an 8-bit register that is used to store arithmetic and logical results. It is the most commonly used register in the 8085 microprocessor and is used to perform arithmetic and logical operations such as addition, subtraction, and bitwise operations.
Features of 8085 Microprocessor
The 8085 microprocessor has six general-purpose registers (B, C, D, E, H, and L), which can be paired as BC, DE, and HL for 16-bit operations.
The program counter (PC) is a 16-bit register that holds the address of the next instruction and increments after each instruction execution.
The stack pointer (SP) is a 16-bit register that keeps track of the top of the stack, used for storing return addresses and temporary data.
The instruction register is an 8-bit register that holds the instruction currently being executed.
The flags register is an 8-bit register that stores status flags, including Carry (set when an arithmetic operation generates a carry), Zero (set when the result is zero), Sign (set when the result is negative), and Parity (set when the result has an even number of 1s).
The data bus is an 8-bit bidirectional bus that transfers data between the microprocessor and memory or other devices.
The address bus is a 16-bit unidirectional bus that specifies memory locations and devices the microprocessor accesses.
The control bus consists of signals that manage microprocessor operations, including Read (to fetch data), Write (to store data), Interrupt (to signal an external event), and Reset (to restart the microprocessor).
Arithmetic and Logic Unit (ALU)
It is used to perform mathematical operations like addition, multiplication, subtraction, division, decrement, increment, etc. Different operations are carried out in ALU: Logical operations, Bit-Shifting Operations, and Arithmetic Operations.
Flag Register
It is an 8-bit register whose bits are set to 0 or 1 depending on the result stored in the accumulator. Of its 8 bits, 5 are defined flags and the remaining 3 are "don't care" bits. The flag register is a dynamic register: after each operation it records whether the result is zero, positive or negative and whether a carry or overflow occurred, and the carry flag is also consulted when comparing two 8-bit numbers. In short, the flag register is a status register; it reports the status of the current operation carried out by the ALU. The individual fields are listed below, followed by a short sketch of how they can be computed.
Different Fields of Flag Register
Carry Flag
Parity Flag
Auxiliary Carry Flag
Zero Flag
Sign Flag
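A minimal C sketch of how these five flags could be derived after an 8-bit addition. This models the flag definitions above, not the 8085's actual circuitry; the bit positions follow the 8085 flag register layout (S = bit 7, Z = bit 6, AC = bit 4, P = bit 2, CY = bit 0).

```c
#include <stdio.h>
#include <stdint.h>

/* Derive 8085-style flags for the addition a + b (illustrative model). */
uint8_t flags_after_add(uint8_t a, uint8_t b) {
    uint16_t wide = (uint16_t)a + (uint16_t)b;
    uint8_t result = (uint8_t)wide;
    uint8_t flags = 0;
    int ones = 0;

    for (int i = 0; i < 8; i++)               /* count 1 bits for parity */
        ones += (result >> i) & 1;

    if (wide > 0xFF)                      flags |= 1u << 0; /* CY: carry out of bit 7 */
    if (ones % 2 == 0)                    flags |= 1u << 2; /* P: even number of 1s   */
    if (((a & 0x0F) + (b & 0x0F)) > 0x0F) flags |= 1u << 4; /* AC: carry from D3 to D4 */
    if (result == 0)                      flags |= 1u << 6; /* Z: result is zero      */
    if (result & 0x80)                    flags |= 1u << 7; /* S: bit 7 set (negative) */
    return flags;
}

int main(void) {
    printf("0x99+0x99 -> flags 0x%02X\n", (unsigned)flags_after_add(0x99, 0x99)); /* CY, AC */
    printf("0x80+0x80 -> flags 0x%02X\n", (unsigned)flags_after_add(0x80, 0x80)); /* CY, Z, P */
    return 0;
}
```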
Accumulator
The accumulator is used in I/O, arithmetic, and logical operations. It is connected to the ALU and the internal data bus. It is often called the heart of the microprocessor: one 8-bit input of the ALU is always taken from the accumulator, and most instructions leave their result in the accumulator after execution.
General Purpose Registers
There are six general-purpose registers: B, C, D, E, H, and L. Each holds an 8-bit value, and they can work as 16-bit registers when used in the pairs B-C, D-E, and H-L. Two further registers, W and Z, are reserved registers: we cannot use them in arithmetic operations, as they are reserved for the microprocessor's internal operations, such as holding a temporary 16-bit value while an instruction executes. Just as swapping two numbers needs a third variable, the W-Z pair serves as the temporary storage for such internal exchanges.
Program Counter
The Program Counter holds the memory address of the next instruction to be executed. It is a 16-bit register.
For example: suppose the current value of the Program Counter is [PC] = 4000H.
(This means that the next instruction to execute is at location 4000H. After each byte is fetched, the Program Counter increments by 1 so that it points to the next location to fetch.)
Stack Pointer
The stack pointer works with the stack, the region of memory where register contents are saved for later use in the program. It is a 16-bit special-purpose register. The stack is part of memory, but unlike random memory access, stack operations use a continuous, contiguous part of memory, whereas the Program Counter may move among arbitrary memory locations. The stack pointer is central to stack-related operations such as PUSH, POP, and the nested CALL requests initiated by the microprocessor, and it always holds the address of the most recent stack entry; a short sketch of PUSH and POP follows.
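A small C sketch of 8085-style PUSH and POP, assuming a 64 KB memory array and an initial SP of FFFFH (both arbitrary choices for illustration): the stack grows downward, so PUSH decrements SP before storing each byte and POP increments it after reading each byte.

```c
#include <stdio.h>
#include <stdint.h>

static uint8_t memory[0x10000];  /* 64 KB of simulated memory */
static uint16_t sp = 0xFFFF;     /* stack pointer, arbitrary start value */

void push16(uint16_t value) {
    memory[--sp] = value >> 8;    /* high byte stored first */
    memory[--sp] = value & 0xFF;  /* then the low byte */
}

uint16_t pop16(void) {
    uint16_t low  = memory[sp++]; /* low byte comes off first */
    uint16_t high = memory[sp++];
    return (uint16_t)((high << 8) | low);
}

int main(void) {
    push16(0x1234);               /* e.g. PUSH H with HL = 1234H */
    printf("popped %04X\n", (unsigned)pop16());
    return 0;
}
```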
Temporary Register
It is an 8-bit register that holds data values during arithmetic and logical operations.
Instruction register and decoder
It is an 8-bit register that holds the instruction code that is being decoded. The instruction is fetched from the memory.
Timing and Control Unit
The timing and control unit comes under the CPU section, and it controls the flow of data from the CPU to other devices. It is also used to control the operations performed by the microprocessor and the devices connected to it. There are certain timing and control signals like Control signals, DMA Signals, RESET signals and Status signals.
Interrupt Control
Whenever a microprocessor is executing the main program and if suddenly an interrupt occurs, the microprocessor shifts the control from the main program to process the incoming request. After the request is completed, the control goes back to the main program. There are 5 interrupt signals in 8085 microprocessors: INTR, TRAP, RST 7.5, RST 6.5, and RST 5.5.
Priorities of Interrupts: TRAP > RST 7.5 > RST 6.5 > RST 5.5 > INTR
Address Bus and Data Bus
The data bus is bidirectional and carries the data which is to be stored. The address bus is unidirectional and carries the location where data is to be stored.
In the 8085 microprocessor, the address bus and data bus are two separate buses that are used for communication between the microprocessor and external devices.
The Address bus is used to transfer the memory address of the data that needs to be read or written. The address bus is a 16-bit bus, allowing the 8085 to access up to 65,536 memory locations.
The Data bus is used to transfer data between the microprocessor and external devices such as memory and I/O devices. The data bus is an 8-bit bus, allowing the 8085 to transfer 8-bit data at a time. The data bus can also be used for instruction fetch operations, where the microprocessor fetches the instruction code from memory and decodes it.
The combination of the address bus and data bus allows the 8085 to communicate with and control external devices, allowing it to execute its program and perform various operations.
Serial Input/Output Control
It controls the serial data communication by using Serial input data and Serial output data.
Serial Input/Output control in the 8085 microprocessor refers to the communication of data between the microprocessor and external devices in a serial manner, i.e., one bit at a time. The 8085 has two dedicated pins for this purpose: the SID pin is used for serial input and the SOD pin is used for serial output. The timing and control of serial communication are managed by the 8085's internal circuitry through the SIM and RIM instructions: SIM writes a bit to the SOD pin, and RIM reads the bit present on the SID pin.
Uses of 8085 Microprocessor
The 8085 microprocessor is a versatile 8-bit microprocessor that has been used in a wide variety of applications, including:
Embedded Systems: The 8085 microprocessor is commonly used in embedded systems, such as industrial control systems, automotive electronics, and medical equipment.
Computer Peripherals: The 8085 microprocessor has been used in a variety of computer peripherals, such as printers, scanners, and disk drives.
Communication Systems: The 8085 microprocessor has been used in communication systems, such as modems and network interface cards.
Instrumentation and Control Systems: The 8085 microprocessor is commonly used in instrumentation and control systems, such as temperature and pressure controllers.
Home Appliances: The 8085 microprocessor is used in various home appliances, such as washing machines, refrigerators, and microwave ovens.
Educational Purposes: The 8085 microprocessor is also used for educational purposes, as it is an inexpensive and easily accessible microprocessor that is widely used in universities and technical schools.
Programming and Interfacing
Programming: This involves writing code in assembly language or a high-level language like C to give instructions to the microprocessor. Assembly language is often used for microprocessor programming because it provides direct control over the hardware and allows for efficient use of system resources.
• Assembly Language Programming: In assembly language, you write instructions using mnemonics that correspond to the machine instructions executed by the microprocessor. These instructions can manipulate data, perform arithmetic and logic operations, control program flow, and interact with peripherals.
• High-Level Language Programming: Some microprocessors also support programming in high-level languages like C. In this approach, you write code using C syntax, which is then compiled into assembly language or machine code that the microprocessor can execute.
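As a hedged illustration of high-level-language programming for a microprocessor, the C sketch below writes a pattern to an output port. On real hardware the port would be a fixed bus address taken from the board's memory map; here it is simulated with an ordinary variable (a made-up stand-in) so the example runs anywhere.

```c
#include <stdio.h>
#include <stdint.h>

/* Stand-in for a memory-mapped output port; on a real board this
 * would be a pointer to a fixed address from the memory map. */
static volatile uint8_t simulated_port;
#define OUTPUT_PORT (&simulated_port)

void set_leds(uint8_t pattern) {
    /* volatile forces a real write on every assignment, which is
     * what the external hardware would observe on the bus */
    *OUTPUT_PORT = pattern;
}

int main(void) {
    set_leds(0x0F);  /* light the lower four (hypothetical) LEDs */
    printf("port now holds 0x%02X\n", (unsigned)simulated_port);
    return 0;
}
```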
Interfacing in Microprocessor
Interfacing: Interfacing involves connecting the microprocessor to other hardware components such as memory, input/output devices, sensors, actuators, etc. This often requires understanding the electrical characteristics of the microprocessor's signals and the interface protocols used by the peripherals.
• I/O Interfacing: Input/output (I/O) interfacing involves connecting the microprocessor to devices like keyboards, displays, sensors, and actuators.
This may require using specialized interface circuits, such as serial or parallel communication interfaces, analog-to-digital converters (ADCs), and digital-to-analog converters (DACs).
Memory Interfacing:
• When executing any instruction, the microprocessor needs to access the memory to read instruction codes and the data stored in the memory. For this, both the memory and the microprocessor require certain signals to read from and write to registers. The interfacing process therefore involves matching the memory's signal requirements with the signals of the microprocessor, and the interfacing circuit should be designed accordingly. Memory interfacing involves connecting the microprocessor to memory devices such as RAM (Random Access Memory) and ROM (Read-Only Memory), and covers addressing schemes, data bus width considerations, timing requirements, and interfacing protocols. A small address-decoding sketch follows.
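A minimal sketch of the address-decoding side of memory interfacing: a decoder looks at the upper address lines and asserts one chip select. The memory map used here (8 KB ROM at 0000H, 8 KB RAM at 2000H) is an invented example.

```c
#include <stdio.h>
#include <stdint.h>

typedef enum { CS_NONE, CS_ROM, CS_RAM } chip_select;

/* Map a 16-bit address to a chip-select signal (illustrative memory map). */
chip_select decode(uint16_t address) {
    if (address <= 0x1FFF) return CS_ROM;   /* 0000H-1FFFH: 8 KB ROM */
    if (address <= 0x3FFF) return CS_RAM;   /* 2000H-3FFFH: 8 KB RAM */
    return CS_NONE;                         /* unmapped address */
}

int main(void) {
    printf("%d %d %d\n", decode(0x0100), decode(0x2500), decode(0x9000));
    return 0;
}
```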
8086 Microprocessor
The 8086 microprocessor is an enhanced version of the 8085 microprocessor, designed by Intel in 1976 and released in 1978. It is a 16-bit microprocessor with 20 address lines and 16 data lines, which lets it address up to 1 MB of memory. It has a powerful instruction set that directly provides operations such as multiplication and division.
It supports two modes of operation, i.e. Maximum mode and Minimum mode. Maximum mode is suitable for a system having multiple processors, and Minimum mode is suitable for a system having a single processor.
Features of 8086
The most prominent features of a 8086 microprocessor are as follows −
It has an instruction queue, which is capable of storing six instruction bytes from the memory resulting in faster processing.
It was the first 16-bit processor having 16-bit ALU, 16-bit registers, internal data bus, and 16-bit external data bus resulting in faster processing.
It is available in 3 versions based on the frequency of operation −
8086 → 5 MHz
8086-2 → 8 MHz
8086-1 → 10 MHz
It uses two stages of pipelining, i.e. Fetch Stage and Execute Stage, which improves performance.
Fetch stage can prefetch up to 6 bytes of instructions and stores them in the queue.
Execute stage executes these instructions.
It has 256 vectored interrupts.
It consists of 29,000 transistors.
Comparison between 8085 & 8086 Microprocessor
Size − 8085 is an 8-bit microprocessor, whereas 8086 is a 16-bit microprocessor.
Address Bus − 8085 has a 16-bit address bus while 8086 has a 20-bit address bus.
Memory − 8085 can access up to 64 KB, whereas 8086 can access up to 1 MB of memory.
Instruction − 8085 doesn't have an instruction queue, whereas 8086 has an instruction queue.
Pipelining − 8085 doesn't support a pipelined architecture while 8086 supports a pipelined architecture.
I/O − 8085 can address 2^8 = 256 I/O ports, whereas 8086 can access 2^16 = 65,536 I/O ports.
Cost − The cost of 8085 is low whereas that of 8086 is high.
Architecture of 8086
8086 Microprocessor is divided into two functional units, i.e., EU (Execution Unit) and BIU (Bus Interface Unit).
EU (Execution Unit)
The execution unit gives instructions to the BIU stating from where to fetch the data, and then decodes and executes those instructions. Its function is to control operations on data using the instruction decoder and ALU. The EU has no direct connection with the system buses; it performs operations over data through the BIU.
Let us now discuss the functional parts of 8086 microprocessors.
ALU
It handles all arithmetic and logical operations, like +, −, ×, /, OR, AND, NOT operations.
Flag Register
It is a 16-bit register that behaves like a flip-flop, i.e. it changes its status according to the result stored in the accumulator. It has 9 flags and they are divided into 2 groups − Conditional Flags and Control Flags.
Conditional Flags
It represents the result of the last arithmetic or logical instruction executed. Following is the list of conditional flags −
Carry flag − This flag indicates an overflow condition for arithmetic operations.
Auxiliary flag − When an ALU operation produces a carry/borrow from the lower nibble (D0-D3) into the upper nibble (D4-D7), this flag is set; i.e., a carry from bit D3 into bit D4 sets the AF flag. The processor uses this flag for BCD (binary-coded decimal) adjustment.
Parity flag − This flag is used to indicate the parity of the result, i.e. when the lower order 8-bits of the result contains even number of 1s, then the Parity Flag is set. For odd number of 1s, the Parity Flag is reset.
Zero flag − This flag is set to 1 when the result of arithmetic or logical operation is zero else it is set to 0.
Sign flag − This flag holds the sign of the result, i.e. when the result of the operation is negative, then the sign flag is set to 1 else set to 0.
Overflow flag − This flag is set when the signed result of an operation exceeds the capacity of the system, i.e., the result is too large for the destination.
Control Flags
Control flags control the operations of the execution unit. Following is the list of control flags −
Trap flag − It is used for single step control and allows the user to execute one instruction at a time for debugging. If it is set, then the program can be run in a single step mode.
Interrupt flag − It is an interrupt enable/disable flag, i.e. used to allow/prohibit the interruption of a program. It is set to 1 for interrupt enabled condition and set to 0 for interrupt disabled condition.
Direction flag − It is used in string operations. As the name suggests, when it is set, string bytes are accessed from the higher memory address towards the lower memory address, and when it is reset, the access runs from lower to higher addresses.
General purpose register
There are 8 general-purpose registers, i.e., AH, AL, BH, BL, CH, CL, DH, and DL. These registers can be used individually to store 8-bit data and can be used in pairs to store 16-bit data. The valid register pairs are AH-AL, BH-BL, CH-CL, and DH-DL, referred to as AX, BX, CX, and DX respectively (a short sketch of this pairing follows the list below).
AX register − It is also known as accumulator register. It is used to store operands for arithmetic operations.
BX register − It is used as a base register. It is used to store the starting base address of the memory area within the data segment.
CX register − It is referred to as counter. It is used in loop instruction to store the loop counter.
DX register − This register is used to hold I/O port address for I/O instruction.
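The following C sketch shows the AH/AL-inside-AX pairing using a union. Whether the low or high byte comes first in memory depends on host endianness; the layout shown matches little-endian machines such as x86, on which the example prints AH=12 AL=34.

```c
#include <stdio.h>
#include <stdint.h>

/* One 16-bit register overlaid with its two 8-bit halves. */
typedef union {
    uint16_t x;                  /* the full 16-bit register, e.g. AX */
    struct { uint8_t l, h; } b;  /* its 8-bit halves, e.g. AL and AH (little-endian) */
} reg16;

int main(void) {
    reg16 ax;
    ax.x = 0x1234;               /* write the 16-bit pair */
    printf("AH=%02X AL=%02X\n", (unsigned)ax.b.h, (unsigned)ax.b.l);
    return 0;
}
```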
Stack pointer register
It is a 16-bit register, which holds the address from the start of the segment to the memory location, where a word was most recently stored on the stack.
BIU (Bus Interface Unit)
BIU takes care of all data and address transfers on the buses for the EU, like sending addresses, fetching instructions from the memory, reading data from the ports and the memory, as well as writing data to the ports and the memory. The EU has no direct connection with the system buses, so these transfers are possible only through the BIU. The EU and BIU are connected by the internal bus.
It has the following functional parts −
Instruction queue − BIU contains the instruction queue. BIU gets up to 6 bytes of the next instructions and stores them in the instruction queue. When the EU executes an instruction and is ready for its next instruction, it simply reads the instruction from this instruction queue, resulting in increased execution speed.
Fetching the next instruction while the current instruction executes is called pipelining.
Segment register − BIU has 4 segment registers, i.e. CS, DS, SS and ES. They hold the addresses of instructions and data in memory, which are used by the processor to access memory locations. The BIU also contains the pointer register IP, which holds the address of the next instruction to be executed by the EU.
CS − It stands for Code Segment. It is used for addressing a memory location in the code segment of the memory, where the executable program is stored.
DS − It stands for Data Segment. It holds the data used by the program, which is accessed in the data segment by an offset address or the content of another register that holds the offset address.
SS − It stands for Stack Segment. It handles memory to store data and addresses during execution.
ES − It stands for Extra Segment. ES is an additional data segment, used by string operations to hold the extra destination data.
Instruction pointer − It is a 16-bit register used to hold the offset address of the next instruction to be executed. A short sketch of how a segment register and an offset combine into a physical address follows.
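The 8086 forms a 20-bit physical address by shifting the 16-bit segment register left by 4 bits (multiplying it by 16) and adding the 16-bit offset. A one-function C sketch, with arbitrary example values:

```c
#include <stdio.h>
#include <stdint.h>

/* physical address = segment * 16 + offset (the 8086 scheme) */
uint32_t physical_address(uint16_t segment, uint16_t offset) {
    return ((uint32_t)segment << 4) + offset;
}

int main(void) {
    /* e.g. CS = 2000H, IP = 0050H -> physical address 20050H */
    printf("%05X\n", (unsigned)physical_address(0x2000, 0x0050));
    return 0;
}
```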
Instruction Set
An instruction set, also known as an instruction set architecture (ISA), is a set of commands that a microprocessor can understand and execute. These instructions tell the processor what operations to perform, such as arithmetic, data manipulation, and input/output operations.
Types of Instruction Set
Generally, there are two types of instruction sets used in computers.
RISC Architecture
RISC stands for Reduced Instruction Set Computer, a microprocessor architecture with a small, highly optimized set of instructions.
It is built to minimize instruction execution time by optimizing and limiting the number of instructions. Each instruction is intended to complete in one clock cycle, and each instruction cycle has three stages: fetch, decode and execute.
Complex operations are carried out by combining simpler instructions. RISC chips require fewer transistors, which makes them cheaper to design and reduces the execution time per instruction.
Examples of RISC processors are Sun's SPARC, PowerPC, Microchip PIC processors, and RISC-V.
Characteristics of RISC:
1. It has simpler instructions and thus simple instruction decoding.
2. More general-purpose registers.
3. Each instruction takes one clock cycle to execute.
4. An instruction fits within a single word.
5. Pipelining can be easily achieved.
6. Few data types.
7. Simpler addressing modes.
CISC
The CISC architecture comprises a complex instruction set. A CISC processor has a variable-length instruction format. In this processor architecture, the instructions that require register operands can take only two bytes.
In a CISC processor architecture, the instructions which require two memory addresses can take five bytes to comprise the complete instruction code. Therefore, in a CISC processor, the execution of instructions may take a varying number of clock cycles. The CISC processor also provides direct manipulation of operands that are stored in the memory.
The primary objective of the CISC processor architecture is to support a single machine instruction for each statement that is written in a high-level programming language.
Characteristics of CISC:
• The length of the code is short, so it requires very little RAM.
• CISC (complex) instructions may take longer than a single clock cycle to execute.
• Fewer instructions are needed to write an application.
• It provides easier programming in assembly language.
• It supports complex data structures and easy compilation of high-level languages.
• It is composed of fewer registers and more addressing modes, typically 5 to 20.
• Instructions can be larger than a single word.
• It emphasizes implementing instructions in hardware, because hardware execution is faster than an equivalent software routine.
Difference Between RISC and CISC
RISC
1. It stands for Reduced Instruction Set Computer.
2.It is a microprocessor architecture that uses a small instruction set of uniform length.
3.These simple instructions are executed in one clock cycle.
4. These chips are relatively simple to design.
5. They are inexpensive.
6. Examples of RISC chips include SPARC, POWER PC.
7. It has a smaller number of instructions.
8. It has fixed-length encodings for instructions.
9. Simple addressing formats are supported.
10. It doesn't support arrays.
11. It doesn't use condition codes.
12. Registers are used for procedure arguments and return addresses.
CISC
1. It stands for Complex Instruction Set Computer.
2. It is a microprocessor architecture that offers hundreds of instructions of different sizes to the users.
3. It has a set of special-purpose circuits that help execute the instructions at high speed.
4. These chips are complex to design.
5. They are relatively expensive.
6. Examples of CISC include the Intel x86 architecture and AMD processors.
7. It has more instructions.
8. It has variable-length encodings of instructions.
9. The instructions interact with memory using complex addressing modes.
10. It supports arrays.
11. Condition codes are used.
12. The stack is used for procedure arguments and return addresses.
Which of these architectures is power efficient?
a. IANA
b. ISA
c. CISC
d. RISC
Ans:-d. RISC
Both the RISC and CISC architectures have primarily been developed to reduce _____________.
a. Time delay
b. Semantic gap
c. Cost
d. All of the above
Ans:-b. Semantic gap
Pipelining is the special feature of ____________.
a. IANA
b. ISA
c. CISC
d. RISC
Ans:-d. RISC
Instruction Format:
An instruction format defines the layout of the bits in an instruction.
An instruction includes an operation code (opcode) and the operands that the opcode operates on. The instruction format specifies fields for the opcode, the operands, and the addressing mode.
The instruction length is generally kept in multiples of the character length, which is 8 bits. When the instruction length is fixed, a set number of bits is assigned to the opcode, operands, and addressing modes.
Addressing Mode
• The data is represented in the instruction format with the help of addressing mode
• The addressing mode is the first part of the instruction format
• The data can either be stored in the memory of a computer or it can be located in the register of the CPU
Operation Code (OPCODE)
• The operation code gives instructions to the processor to perform the specific Operation
• The operation code is the second part of the instruction format
OPERAND
• It is the part of the instruction format that specifies the data or the address of the data
Types of instruction format
Depending upon the processor of the computer, the instruction format contains zero to three operand addresses.
Zero Address Instructions
• These instructions do not specify any operands or addresses. Instead, they operate on data stored in registers or memory locations implicitly defined by the instruction. For example, a zero-address ADD simply adds the top two elements of the stack without naming any operands.
• There is no address field.
• Stack is used.
Advantages
• They are simple and can be executed quickly since they do not require any operand fetching or addressing. They also take up less memory space.
Examples
• Expression: X = (A+B)*(C+D)
• Postfix: X = AB+CD+*
• TOP means top of stack
• M[X] is any memory location
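A runnable C sketch of a zero-address (stack) machine evaluating X = (A+B)*(C+D) from its postfix form AB+CD+*. The push/pop helpers and the sample operand values are invented for illustration; only PUSH and POP carry an address, while ADD and MUL implicitly use the two topmost stack entries.

```c
#include <stdio.h>

static int stack[16], top = -1;          /* tiny evaluation stack */

void push(int v) { stack[++top] = v; }
int  pop(void)   { return stack[top--]; }

int main(void) {
    int A = 1, B = 2, C = 3, D = 4;      /* sample operand values */
    push(A); push(B);
    push(pop() + pop());                 /* ADD: TOP = A + B */
    push(C); push(D);
    push(pop() + pop());                 /* ADD: TOP = C + D */
    push(pop() * pop());                 /* MUL: TOP = (A+B)*(C+D) */
    printf("X = %d\n", pop());           /* POP X: prints 21 */
    return 0;
}
```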
One Address Instructions
• The instruction format in which the instruction uses only one address field is called the one address instruction format
• In this type of instruction format, one operand is in the accumulator and the other is in the memory location
• It has only one operand
• It has two special instructions LOAD and STORE
Expression: X = (A+B)*(C+D)
ex:- AC is accumulator
M[] is any memory location
M[T] is temporary location
Advantages: They allow for a wide range of addressing modes, making them more flexible than zero-address instructions.
They also require less memory space than two or three-address instructions.
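For comparison, here is a C sketch of the same expression on a one-address (accumulator) machine. Each comment shows the corresponding one-address instruction, where AC is the accumulator and T is the temporary location mentioned above; the mnemonics are illustrative.

```c
#include <stdio.h>

int main(void) {
    int A = 1, B = 2, C = 3, D = 4, T, X, AC;

    AC = A;        /* LOAD  A : AC <- M[A]      */
    AC = AC + B;   /* ADD   B : AC <- AC + M[B] */
    T  = AC;       /* STORE T : M[T] <- AC      */
    AC = C;        /* LOAD  C : AC <- M[C]      */
    AC = AC + D;   /* ADD   D : AC <- AC + M[D] */
    AC = AC * T;   /* MUL   T : AC <- AC * M[T] */
    X  = AC;       /* STORE X : M[X] <- AC      */

    printf("X = %d\n", X);  /* prints 21 */
    return 0;
}
```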
Two Address Instruction Format
The instruction format in which the instruction uses only two address fields is called the two address instruction format
This type of instruction format is the most commonly used instruction format
Unlike the one-address instruction format, where the result is stored in the accumulator only, in the two-address instruction format the result can be stored in different locations
This type of instruction format has two operands
It requires shorter assembly language instructions
Compared with the three-address format, the number of instructions increases and the speed of execution is slower
Expression: X = (A+B)*(C+D)
Three Address Instructions format
These instructions specify three operands or addresses, which may be memory locations or registers.
The instruction operates on the contents of the operands, and the result may be stored in the same or a different location. For example, a three-address instruction might add the contents of two registers and store the result in a third register.
Expression: X = (A+B)*(C+D)
R1, R2 are registers
M[] is any memory location
Advantages:
They allow for even more complex operations and can be more efficient than two-address instructions since they allow for three operands to be processed in a single instruction. They also allow for a wide range of addressing modes.
Disadvantages: The instruction length increases, which increases the size of the instruction register (IR), and the cost increases.
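And the same computation in three-address style, where each instruction names two sources and one destination, so the whole expression takes only three instructions (the register and location names are illustrative):

```c
#include <stdio.h>

int main(void) {
    int A = 1, B = 2, C = 3, D = 4, R1, R2, X;

    R1 = A + B;    /* ADD R1, A, B  : R1 <- M[A] + M[B] */
    R2 = C + D;    /* ADD R2, C, D  : R2 <- M[C] + M[D] */
    X  = R1 * R2;  /* MUL X, R1, R2 : M[X] <- R1 * R2   */

    printf("X = %d\n", X);  /* prints 21 */
    return 0;
}
```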
Addressing mode
It explains how operands are given in instructions.
Implied mode/Implicit::
In implied addressing the operand is specified in the instruction itself.
It is used for zero and one address instructions.
Eg. STC set carry, Add
Inc A accumulator=accumulator+1
Cl A complement accumulator
Size of instruction is very small
Immediate Mode
In this mode, the operand is given in the instruction itself as a constant. An immediate-mode instruction has an operand/data field rather than an address field. A constant value is used, and no computation is required to calculate the effective address.
The data value is limited by the size of the operand field.
For example: ADD 7, which says add 7 to the contents of the accumulator; 7 is the operand here.
Eg. MOV R1, #25
ADD R1, #20
Register Mode
In this mode the operand is stored in a register, and this register is present in the CPU. The instruction has the address of the register where the operand is stored.
Eg. ADD R1, R2
MOV R1, R2
Advantages
• Shorter instructions and faster instruction fetch.
• Faster memory access to the operand(s)
Disadvantages
• Very limited address space
• Using multiple registers helps performance but it complicates the instructions.
Register Indirect Mode
In this mode, the instruction specifies the register whose contents give us the address of operand which is in memory. Thus, the register contains the address of the operand rather than the operand itself.
Eg. ADD R1, [R2]   ; R1 = R1 + M[R2]
LOAD (R1)          ; ACC = M[R1]
Direct Addressing Mode(Absolute addressing mode)
In this mode, the actual memory address of the operand is present in the instruction itself.
• Single memory reference to access data.
• No additional calculations to find the effective address of the operand.
For example: ADD R1, [4000], where 4000 is the memory address of the operand.
NOTE: The effective address (EA) is the location where the operand is present.
No computation is required to calculate the EA.
It is used to access variables.
Drawback: if the instruction size is fixed, the address field limits the range of addresses that can be specified.
Indirect Addressing Mode
In this, the address field of the instruction gives the address where the effective address is stored in memory. This slows down execution, as it involves multiple memory lookups to find the operand.
It is used to implement pointers and parameter passing.
Two memory accesses are required.
For example: ADD R1, @2000, where memory location 2000 holds the effective address of the operand.
NOTE: Effective Address is the location where operand is present.
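A small C analogy for direct versus indirect addressing (the variable names are invented): a single pointer dereference corresponds to direct addressing's one memory access, and a double dereference corresponds to indirect addressing's two.

```c
#include <stdio.h>

int main(void) {
    int data = 42;                       /* the operand itself */
    int *direct_addr = &data;            /* direct: instruction holds the operand's address */
    int **indirect_addr = &direct_addr;  /* indirect: instruction holds the address of the address */

    printf("direct:   %d\n", *direct_addr);    /* one memory access */
    printf("indirect: %d\n", **indirect_addr); /* two memory accesses */
    return 0;
}
```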
Relative Addressing Mode
• Relative addressing is the technique of addressing instructions and data areas by designating their location in relation to the location counter (program counter) or to some symbolic location. It is used for program control instructions.
• The effective address is calculated by adding the displacement (the immediate value given in the instruction) to the register value. The address part of the instruction is usually a signed number, either negative or positive, and the effective address is relative to the address of the next instruction. For example, if the PC holds 2000H and the displacement is 05H, the effective address is 2005H.
• EA = PC + displacement
• NOTE: Since the PC already points to the next instruction, the displacement is one less than the desired jump distance.
Base Index addressing mode
• In this addressing mode, the operand address is given by a base register plus an index register.
• Eg. ADD R1, [R2+R3], where R2 is the base register and R3 is the index register
• If R2 holds 100 and R3 holds 200, the operand is taken from memory address 100+200 = 300
Interrupts in 8085
In the 8085 microprocessor, an interrupt is a process in which control of the program transfers from the main program to a starting location defined by the interrupt. It is a process by which an external device or peripheral informs the microprocessor to become ready for data communication by having its request accepted. Hence, it is a signal that temporarily suspends the normal execution of a program and redirects the control to a specific interrupt service routine (ISR). Interrupts allow the microprocessor to respond to external events, such as user input, system events, or hardware signals, without the need for constant polling.
Interrupts are the signals generated by the external devices to request the microprocessor to perform a task. There are 5 interrupt signals, i.e. TRAP, RST 7.5, RST 6.5, RST 5.5, and INTR.
Types of Interrupt Signals in the 8085 Microprocessor
There are five interrupt signals in the 8085 microprocessor:
TRAP: The TRAP interrupt is a non-maskable interrupt triggered by critical external events, such as a power failure or a hardware malfunction. The TRAP interrupt has the highest priority and cannot be disabled.
RST 7.5: The RST 7.5 interrupt is a maskable interrupt generated by an external device on its dedicated pin. It has the second highest priority.
RST 6.5: The RST 6.5 interrupt is a maskable interrupt generated by an external device on its dedicated pin. It has the third highest priority.
RST 5.5: The RST 5.5 interrupt is a maskable interrupt generated by an external device on its dedicated pin. It has the fourth highest priority.
INTR: The INTR interrupt is a maskable interrupt generated by an external device, such as a keyboard or a mouse. It has the lowest priority and can be disabled.
OR,
Interrupt
• An interrupt is a signal generated by an external device that halts the normal flow of the program.
• Interrupts are generally the requests from the external device to the microprocessor for performing some actions.
• When an interrupt occurs, the microprocessor temporarily shifts to work on a different task, and later returns to its previous task. Interrupts can be internal or external. When the interrupt occurs, the program stops executing and the microcontroller begins to execute the interrupt service routine (ISR).
• Interrupts are essential for efficient computing because they allow the processor to handle multiple tasks simultaneously. For example, while the processor is running a word processing program, it can also be responding to keystrokes, checking for incoming network data, and playing music in the background. Without interrupts, the processor would have to wait for each task to finish before starting the next one, which would be much slower
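A hedged illustration of the interrupt idea on a desktop machine, using a POSIX signal as a stand-in for a hardware interrupt: the handler plays the role of the ISR and just records the event, while main() is the "main program" that is briefly suspended when the signal arrives (press Ctrl+C to trigger it).

```c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t interrupted = 0;

void isr(int sig) {          /* plays the role of the ISR */
    (void)sig;
    interrupted = 1;         /* record the event and return quickly */
}

int main(void) {
    signal(SIGINT, isr);     /* "enable" the interrupt */
    while (!interrupted) {
        /* the main program keeps doing its own work here;
         * it never polls the "device", only this flag */
        sleep(1);
    }
    printf("interrupt serviced, back in main program\n");
    return 0;
}
```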
Two types of Interrupt:
Hardware and Software
Hardware interrupt
• If the signal to the processor comes from an external device or hardware, it is called a hardware interrupt.
• Example: when we press a key on the keyboard to perform some action, a signal is generated and given to the processor so that it takes action. Such interrupts are called hardware interrupts.
There are two types of hardware interrupt.
Maskable Interrupt
• A maskable hardware interrupt can be delayed when a higher-priority interrupt has occurred to the processor. Eg. RST 7.5, RST 6.5 and RST 5.5
Non-Maskable Interrupt
• A non-maskable hardware interrupt cannot be delayed and should be processed by the processor immediately. Eg. TRAP
Software interrupt
Software interrupts are the interrupts that can be inserted into a desired location in the program. Software interrupts can also be divided into two types. They are
Normal Interrupts
• The interrupts that are caused by software instructions are called normal interrupts.
Exception
• The unplanned interrupts that occur during the execution of a program are called exceptions, e.g., division by zero.
Computer Architecture
Computer Architecture is a blueprint for design and implementation of a computer system. It refers to the overall design of a computer system, including the hardware and software components that make up the system and how they interact with each other.
Computer architecture provides the functional details and behaviour of a computer system. It involves the design of the instruction set, the microarchitecture, and the memory hierarchy, as well as the design of the hardware and software components that make up the system.
Computer Architecture mainly deals with the functional behaviour of a computer system and covers the "What to do?" part.
It gives the functional description of requirements, design, and implementation of the different parts of a computer system.
Computer Organization refers to the way in which the hardware components of a computer system are arranged and interconnected. It implements the provided computer architecture and covers the "How to do?" part.
Computer Organization is defined after the computer architecture has been decided. It provides information about how the operational attributes of a computer system are linked together and helps in realizing the architectural specification of the computer. It involves the design of the interconnections between the various hardware components, as well as the design of the memory and I/O systems.
Difference between Computer Architecture and Computer Organization.
Computer Architecture
They explain what a computer does.
They majorly focus on the functional behaviour of computer systems.
Computer architectures deal with high level design matters.
It comes before computer organisation.
It covers logical functions, such as registers, data types, instruction sets, and addressing modes.
They coordinate between the hardware and software of the system.
Computer Organisation
They explain how a computer actually does it.
They majorly focus on the structural relationship and deep knowledge of the internal working of a system.
They deal with low-level design matters.
It comes after the architecture part.
It covers physical units like peripherals, circuit designs, and adders.
They manage how the parts and peripherals of the system are connected and work together.
Instruction Cycle
A program residing in the memory unit of the computer consists of a series of instructions.
The program is executed on the computer by going through a cycle for each instruction.
• The instruction cycle (also known as the Fetch–Decode–Execute cycle or the fetch-execute cycle) is the basic operational process of a computer system.
• The time taken for the execution of an instruction is known as the Instruction Cycle.
• It is the process by which a computer retrieves a program instruction from its memory, determines what actions the instruction describes, and then carries out those actions.
• This cycle is repeated continuously by a computer's central processing unit (CPU), from boot-up until the computer has shut down.
In the basic computer, each instruction cycle includes the following procedures −
➢Fetch the instruction from memory.
➢Decode the instruction.
➢Read the effective address from memory if the instruction has an indirect address.
➢Execute the instruction.
Steps of the Instruction Cycle
➢ Fetch: The CPU fetches, or retrieves, the next instruction from memory. The program counter (PC) holds the address of the current instruction being executed. The CPU reads the instruction located at this address from the memory into its instruction register (IR).
➢ Decode: The fetched instruction is decoded to determine what operation needs to be performed and what data is involved. The CPU interprets the instruction's opcode (operation code) and identifies the specific operation or instruction to be executed.
➢ Execute: After decoding the instruction, the CPU carries out the actual operation specified by the instruction. This might involve performing arithmetic or logical operations, transferring data between registers or memory, or controlling other hardware components. The result of the operation may be stored in registers or memory as needed.
➢ Write Back (optional): In some cases, the CPU needs to write the results of the operation back to memory or registers. For instance, after performing a calculation, the CPU may store the result in a register or update the value in memory.
MACHINE CYCLE
The machine cycle is the most basic operation that a computer performs. To complete even menial tasks such as showing a single character on the screen, the CPU has to perform multiple cycles. The computer does this from the moment it boots up until it shuts down. It is also known as the instruction cycle.
The steps of a machine cycle are:
Fetch – The control unit requests the instruction from main memory, stored at the location indicated by the program counter (also known as the instruction counter).
Decode – The received instruction is decoded in the instruction register. This involves breaking the operand field into its components based on the instruction's operation code (opcode).
Execute – This involves acting on the instruction's opcode, as it specifies the CPU operation required. The program counter indicates the instruction sequence to the computer. Instructions are placed into the instruction register and, as each is executed, the program counter is incremented so that the next instruction can be fetched from memory. Appropriate circuitry is then activated to perform the requested task. As soon as the instruction has been executed, the machine cycle restarts with the fetch step.
Examples of machine cycle:
Simple processor: a processor used in a calculator might fetch an instruction to add two numbers, decode the instruction to determine the numbers to add, and then execute the addition and display the result.
Basic processor: a processor used in a simple computer might fetch an instruction, decode it to determine the memory address to load from, execute the load, and store the value in a register.
Complex processor: a processor used in a modern computer might fetch an instruction, decode it, execute it, and then perform additional steps such as fetching data from cache or memory, performing pipelining, or handling interrupts.
Control unit
The control unit, a vital part of the computer's CPU (central processing unit), orchestrates the processor's operation. The concept of a control unit was first introduced by John von Neumann in his Von Neumann Architecture. The control unit's responsibility is to guide the computer's arithmetic/logic unit, memory, and input and output devices on how to react to the commands given to the processor.
The control unit retrieves internal program commands from the main memory, moves them to the processor instruction register, and produces a control signal based on the register's contents to manage the execution of these commands.
The Working of a CPU Control Unit
A control unit gets data from the user, converts it into control signals, and then passes these signals to the central processor. The computer's processor then guides the connected hardware on the tasks to perform. Since CPU architecture varies from one manufacturer to another, the functions carried out by a control unit in a computer depend on the type of CPU. Examples of devices that require a control unit include:
CPUs or Central Processing Units
GPUs or Graphics Processing Units
The Role of the Control Unit
It manages the flow of data into, out of, and between the different subunits of a processor.
It can interpret commands and instructions.
It regulates the flow of data within the processor.
It can accept external commands or instructions, which it transforms into a series of control signals.
It oversees the multiple execution units of a CPU (such as ALUs, data buffers, and registers).
It also performs various operations, including fetching, decoding, managing execution, and storing results.
Types of Control Unit
Hardwired Control Unit
Micro Programmable Control Unit
Hardwired Control Unit
The hardwired control unit is a unique form of control signal generation that utilizes Finite State Machines (FSM). This control unit is constructed as a sequential logic circuit, made by physically interconnecting components such as flip-flops, gates, and other digital circuits to produce the final circuit. Due to its physical construction, it is commonly referred to as a hardwired controller.
The hardwired control unit is a technique used to generate control signals using Finite State Machines (FSM). The control signals necessary for executing instructions in the Hardwired Control Unit are produced by special hardware logic circuits. It is important to note that altering the mechanism of signal production isn't possible without physically modifying the circuit's structure.
Key Features of the Hardwired Control Unit
A Hardwired Control unit is composed of two decoders, a sequence counter, and logic gates.
The instruction register (IR) stores instructions fetched from the memory unit.
The instruction register comprises the operation code, the I bit, and bits 0 through 11.
A 3 x 8 decoder is used to decode the operation code in bits 12 through 14.
The outputs of the decoder are represented by the letters D0 through D7.
Bit 15 of the instruction is transferred to a flip-flop denoted by the symbol I.
Bits 0 through 11 of the instruction are applied to the control logic gates.
The sequence counter (or SC) has the ability to count from 0 to 15 in binary.
How Does a Hardwired Control Unit Work?
The operation code of an instruction holds the basic data for generating control signals. This operation code is decoded in the instruction decoder, which is a set of decoders that decode various fields of the instruction opcode.
Consequently, only a few of the instruction decoder’s output lines have active signal values. These output lines are connected to the inputs of the matrix, which provides control signals for the computer's executive units. This matrix combines the decoded signals from the instruction opcode with the outputs from the matrix that generates signals indicating consecutive control unit states, along with signals from the external environment, such as interrupt signals.
Benefits of Using a Hardwired Control Unit
The Hardwired Control Unit is fast because it uses combinational circuits to generate signals.
The delay that can occur in the creation of control signals is dependent on the number of gates.
It can be optimized to achieve the fastest mode of operation.
It is faster than a micro-programmed control unit.
Limitations of a Hardwired Control Unit
The design becomes more complex as more control signals need to be generated (requiring more encoders or decoders).
Changes to control signals are challenging as they require reconfiguring the wires in the hardware circuit.
Adding a new feature can be difficult and time-consuming.
Evaluating and fixing flaws in the initial design can be challenging.
It can be somewhat expensive.
Microprogram control unit
A Microprogrammed Control Unit is a unique type of control unit that stores binary control values as words within its memory. It operates by generating specific signal collections at every system clock beat, which in turn, direct the instructions to be executed. Every output signal triggers a micro-operation, like register transfer, which results in specific micro-operations that can be stored within memory due to the sets of control signals.
A programming approach is utilized in the implementation of a microprogrammed control unit. It involves the usage of a program, made up of microinstructions, to execute a series of micro-operations. The control memory of the control unit stores a microprogram, which is composed of these microinstructions. The generation of a set of control signals is contingent on the execution of a microinstruction.
Key Characteristics of a Microprogrammed Control Unit
The control memory address register specifies the address of the microinstruction.
The control memory holds all the control information and is typically considered a ROM.
The control register stores the microinstruction fetched from memory.
A control word in the microinstruction specifies one or several micro-operations for a data processor.
While the micro-operations are being executed, the next address is calculated in the circuit of the next address generator and then transferred to the control address register to read the next microinstruction.
The next address generator, also known as a microprogram sequencer, determines the sequence of addresses retrieved from control memory.
Understanding Instruction Words
In micro-programmed control units, instruction words are fetched into the instruction register in the usual way. However, the operation code of each instruction is not directly decoded to enable instant control signal generation; instead, it contains the initial address of a microprogram in the control store.
Utilizing a Single-level Control Store
The instruction opcode from the instruction register is received by the control store address register. This address reads the first microinstruction of a microprogram that interprets the execution of such an instruction to the microinstruction register. The operation element of this microinstruction contains encoded control signals, usually in the form of a few bit fields. These fields are decoded using a set of microinstruction field decoders. Also included in the microinstruction is the address of the next microinstruction in the provided instruction microprogram, along with a control field for controlling the actions of the microinstruction address generator.
The last-mentioned field determines the addressing mode or addressing operation to be applied to the address encoded in the current microinstruction. In the conditional addressing mode, this address is modified by employing the processor condition flags, which describe the status of calculations in the current program. The last microinstruction in the microprogram for a given instruction is the one that fetches the next instruction from main memory into the instruction register.
Making use of a Two-level Control Store
A control unit with a two-level control store also contains nano-instruction memory, in addition to the control memory for the microinstructions. In such a control unit, the microinstructions do not include encoded control signals. The operation component of microinstructions contains the address of a word in the nano-instruction memory that carries encoded control signals. The nano-instruction memory stores all combinations of control signals that exist in microprograms that interpret a computer’s entire instruction set, which is written in nano-instructions only once.
This eliminates the need to store the same operation sections of microinstructions multiple times. In this situation, micro-instruction words can be significantly shorter than in the single level control store, resulting in a significantly smaller microinstruction memory in terms of bits and, consequently, a smaller overall control memory. The control for selecting consecutive microinstructions is stored in the microinstruction memory, whereas those control signals are generated based on nano-instructions. Control signals in nano-instructions are usually encoded using the 1 bit/1 signal technique, which eliminates the need for decoding.
Advantages of a Microprogrammed Control Unit
It enables a systematic design of the control unit.
It’s easier to troubleshoot and modify.
It can maintain the basic structure of the control function.
It simplifies the design of the control unit, making it less expensive and less prone to errors or glitches.
It allows for a methodical and orderly design.
It is used to control software-based functions rather than hardware-based functions.
It’s more flexible.
It can execute complex functions with ease.
Disadvantages of a Microprogrammed Control Unit
Flexibility comes at a higher cost.
It is slower compared to a hardwired control unit.
Types of Micro-programmed Control Unit
Based on the type of Control Word stored in the Control Memory (CM), it is classified into two types:
1. Horizontal Micro-programmed Control Unit:
The control signals are represented in unencoded (decoded) binary form, i.e., 1 bit per control signal (CS). Example: if 53 control signals are present in the processor, then 53 bits are required. More than one control signal can be enabled at a time.
It supports longer control words.
It is used in parallel processing applications.
It allows a higher degree of parallelism: if the degree is n, n control signals can be enabled at a time.
It requires no additional hardware (decoders), which makes it faster than the vertical micro-programmed control unit.
It is more flexible than the vertical scheme in expressing arbitrary combinations of control signals.
2. Vertical Micro-programmed Control Unit:
The control signals are represented in encoded binary form: for N control signals, log2(N) bits (rounded up) are required. (A short calculation after this list illustrates the difference in control word width.)
It supports shorter control words.
It supports easy implementation of new control signals, and is therefore more flexible in that respect.
It allows a low degree of parallelism, i.e., the degree of parallelism is either 0 or 1.
It requires additional hardware (decoders) to generate the control signals, which makes it slower than the horizontal micro-programmed control unit.
In expressing parallel micro-operations it is less flexible than horizontal micro-programming, but it is still more flexible than a hardwired control unit.
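The width difference is easy to quantify. A minimal Python sketch (the count of 53 control signals is taken from the example above):

import math

n_signals = 53  # number of control signals in the processor (example above)

# Horizontal: one bit per control signal (1 bit/CS)
horizontal_bits = n_signals

# Vertical: signals are encoded, so only ceil(log2(N)) bits are needed
vertical_bits = math.ceil(math.log2(n_signals))

print(f"Horizontal control word: {horizontal_bits} bits")  # 53 bits
print(f"Vertical control word: {vertical_bits} bits")      # 6 bits

The vertical word is far narrower, but each 6-bit code must pass through a decoder before it can drive a control line, which is exactly why the vertical scheme is slower.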
Computer Memory
Computer memory is just like the human brain: it is used to store data/information and instructions. It is a data storage unit or device where the data to be processed and the instructions required for processing are stored. Both input and output can be stored here.
Primary memory is faster than secondary memory (e.g., hard drives).
It is usually volatile, meaning it loses data when power is turned off.
Primary memory is needed for a computer to run; a computer cannot operate without it.
Types of Computer Memory
In general, computer memory is divided into three types:
Primary memory
Secondary memory
Cache memory
1. Primary Memory
It is also known as the main memory of the computer system. It is used to store data and programs or instructions during computer operations. It uses semiconductor technology and hence is commonly called semiconductor memory. Primary memory is of two types:
RAM (Random Access Memory): It is a volatile memory. Volatile memory retains information only while power is supplied: if the power supply fails or is interrupted, all the data and information in this memory are lost. RAM is used for booting up or starting the computer, and it temporarily stores the programs/data that the processor has to execute. RAM is of two types:
SRAM (Static RAM): SRAM is built from transistors, and its circuits retain their state as long as power is applied. It consists of a number of flip-flops, each storing 1 bit. It has a shorter access time and is therefore faster.
DRAM (Dynamic RAM): DRAM uses capacitors and transistors and stores data as a charge on the capacitors. It contains thousands of memory cells. The charge on each capacitor must be refreshed every few milliseconds. This memory is slower than SRAM.
ROM (Read Only Memory): It is a non-volatile memory, which retains information even when the power supply fails or is interrupted. ROM is used to store the information needed to operate the system. As its name suggests, we can only read the programs and data stored on it. It contains electronic fuses that can be programmed for a specific piece of information. The information is stored in ROM in binary format. It is also known as permanent memory. ROM is of four types:
MROM (Masked ROM): The first ROMs were hard-wired devices with a pre-programmed collection of data or instructions. Masked ROMs are a low-cost type of ROM that works in this way.
PROM (Programmable Read Only Memory): This read-only memory is modifiable once by the user. The user purchases a blank PROM and uses a PROM program to put the required contents into the PROM. Its content can’t be erased once written.
EPROM (Erasable Programmable Read Only Memory): EPROM is an extension to PROM where you can erase the content of ROM by exposing it to Ultraviolet rays for nearly 40 minutes.
EEPROM (Electrically Erasable Programmable Read Only Memory): Here the written contents can be erased electrically. You can erase and reprogram an EEPROM up to 10,000 times. Erasing and programming take very little time, nearly 4-10 ms (milliseconds). Any area in an EEPROM can be wiped and programmed selectively.
2. Secondary Memory
It is also known as auxiliary memory or backup memory. It is a non-volatile memory used to store a large amount of data or information. The data stored in secondary memory is permanent, but access is slower than for primary memory. The CPU cannot access secondary memory directly: data from the auxiliary memory is first transferred to main memory, and only then can the CPU access it.
Characteristics of Secondary Memory
It is a slow memory but reusable.
It is a reliable and non-volatile memory.
It is cheaper than primary memory.
The storage capacity of secondary memory is large.
A computer system can run without secondary memory.
In secondary memory, data is stored permanently even when the power is off.
Types of Secondary Memory
1. Magnetic Tapes: Magnetic tape is a long, narrow strip of plastic film with a thin magnetic coating, used for magnetic recording. Bits are recorded on the tape as magnetic spots along several tracks, and groups of them form blocks called records; typically 7 or 9 bits are recorded concurrently. Each track has one read/write head, which allows data to be recorded and read as a sequence of characters. The tape can be stopped, moved forwards or backwards, or rewound.
2. Magnetic Disks: A magnetic disk is a circular metal or a plastic plate and these plates are coated with magnetic material. The disc is used on both sides. Bits are stored in magnetized surfaces in locations called tracks that run in concentric rings. Sectors are typically used to break tracks into pieces.
Hard discs are discs that are permanently attached to the system and cannot be removed by a casual user.
3. Optical Disks: An optical disc is a laser-based storage medium that can be written to and read. It is reasonably priced and has a long lifespan. Occasional users can remove the optical disc from the computer.
3. Cache Memory
It is a type of high-speed semiconductor memory that can help the CPU run faster. Between the CPU and the main memory, it serves as a buffer. It is used to store the data and programs that the CPU uses the most frequently.
Advantages of Cache Memory
It is faster than the main memory.
When compared to the main memory, it takes less time to access it.
It keeps the programs that can be run in a short amount of time.
It stores data for temporary use.
Disadvantages of Cache Memory
Because of the semiconductors used, it is very expensive.
The size of the cache (amount of data it can store) is usually small.
Memory Hierarchy
Memory Hierarchy is an enhancement that organizes memory so as to minimize access time. The memory hierarchy was developed based on a program behaviour known as locality of references (the same data or nearby data is likely to be accessed again and again). The different levels of the memory hierarchy are described below.
Types of Memory Hierarchy
This Memory Hierarchy Design is divided into 2 main types:
External Memory or Secondary Memory: comprises magnetic disk, optical disk, and magnetic tape, i.e., peripheral storage devices which are accessible to the processor via an I/O module.
Internal Memory or Primary Memory: comprises main memory, cache memory and CPU registers. This is directly accessible by the processor.
Memory Hierarchy Design
1. Registers
Registers are small, high-speed memory units located in the CPU. They are used to store the most frequently used data and instructions. Registers have the fastest access time and the smallest storage capacity, typically ranging from 16 to 64 bits.
2. Cache Memory
Cache memory is a small, fast memory unit located close to the CPU. It stores frequently used data and instructions that have been recently accessed from the main memory. Cache memory is designed to minimize the time it takes to access data by providing the CPU with quick access to frequently used data.
3. Main Memory
Main memory, also known as RAM (Random Access Memory), is the primary memory of a computer system. It has a larger storage capacity than cache memory, but it is slower. Main memory is used to store data and instructions that are currently in use by the CPU.
Types of Main Memory
Static RAM: Static RAM stores the binary information in flip flops and information remains valid until power is supplied. Static RAM has a faster access time and is used in implementing cache memory.
Dynamic RAM: It stores the binary information as a charge on the capacitor. It requires refreshing circuitry to maintain the charge on the capacitors after a few milliseconds. It contains more memory cells per unit area as compared to SRAM.
4. Secondary Storage
Secondary storage, such as hard disk drives (HDD) and solid-state drives (SSD) , is a non-volatile memory unit that has a larger storage capacity than main memory. It is used to store data and instructions that are not currently in use by the CPU. Secondary storage has the slowest access time and is typically the least expensive type of memory in the memory hierarchy.
5. Magnetic Disk
Magnetic disks are circular plates fabricated from metal or plastic and coated with magnetizable material. Magnetic disks work at high speed inside the computer and are frequently used.
6. Magnetic Tape
Magnetic tape is a magnetic recording device consisting of a plastic film with a magnetizable coating. Magnetic tape is generally used for the backup of data. In the case of magnetic tape, access is a little slower, because some amount of time is required to wind the strip to the required position.
Characteristics of Memory Hierarchy
Capacity: It is the global volume of information the memory can store. As we move from top to bottom in the Hierarchy, the capacity increases.
Access Time: It is the time interval between the read/write request and the availability of the data. As we move from top to bottom in the Hierarchy, the access time increases.
Performance: The memory hierarchy design ensures that frequently accessed data is stored in faster memory to improve system performance.
Cost Per Bit: As we move from bottom to top in the Hierarchy, the cost per bit increases i.e. Internal Memory is costlier than External Memory.
Advantages of Memory Hierarchy
Performance: Frequently used data is stored in faster memory (like cache), reducing access time and improving overall system performance.
Cost Efficiency: By combining small, fast memory (like registers and cache) with larger, slower memory (like RAM and HDD), the system achieves a balance between cost and performance, saving both money and time.
Optimized Resource Utilization: Combines the benefits of small, fast memory and large, cost-effective storage to maximize system performance.
Efficient Data Management: Frequently accessed data is kept closer to the CPU, while less frequently used data is stored in larger, slower memory, ensuring efficient data handling.
Disadvantages of Memory Hierarchy
Complex Design: Managing and coordinating data across different levels of the hierarchy adds complexity to the system’s design and operation.
Cost: Faster memory components like registers and cache are expensive, limiting their size and increasing the overall cost of the system.
Latency: Accessing data stored in slower memory (like secondary or tertiary storage) increases the latency and reduces system performance.
Maintenance Overhead: Managing and maintaining different types of memory adds overhead in terms of hardware and software.
Cache memory
Cache memory is a small, high-speed storage area in a computer. The cache is a smaller and faster memory that stores copies of the data from frequently used main memory locations. There are various independent caches in a CPU, which store instructions and data.
The most important use of cache memory is that it is used to reduce the average time to access data from the main memory.
The concept of cache works because there exists locality of reference (the same items or nearby items are more likely to be accessed next) in processes.
By storing this information closer to the CPU, cache memory helps speed up the overall processing time. Cache memory is much faster than the main memory (RAM). When the CPU needs data, it first checks the cache. If the data is there, the CPU can access it quickly. If not, it must fetch the data from the slower main memory.
or,
Cache memory is a special type of high-speed memory located close to the CPU in a computer. It stores frequently used data and instructions so that the CPU can access them quickly, improving the overall speed and efficiency of the computer.
It is a faster and smaller segment of memory whose access time is close to that of registers. In the memory hierarchy, cache memory has a smaller access time than primary memory. Generally, cache memory is used as a buffer.
The main purpose of cache memory is to improve the overall performance of a computer system by reducing the time taken to access frequently used data. By storing copies of frequently accessed data closer to the CPU, cache memory minimizes the latency associated with fetching data from slower main memory.
Characteristics of Cache Memory
Extremely fast memory type that acts as a buffer between RAM and the CPU.
Holds frequently requested data and instructions, ensuring that they are immediately available to the CPU when needed.
Costlier than main memory or disk memory but more economical than CPU registers.
Used to speed up processing and synchronize with the high-speed CPU.
Key Features of Cache Memory
Speed: Faster than the main memory (RAM), which helps the CPU retrieve data more quickly.
Proximity: Located very close to the CPU, often on the CPU chip itself, reducing data access time.
Function: Temporarily holds data and instructions that the CPU is likely to use again soon, minimizing the need to access the slower main memory.
Role of Cache Memory
The role of cache memory is explained below:
Cache memory plays a crucial role in computer systems.
It provides faster access to data.
It acts as a buffer between the CPU and main memory (RAM).
Its primary role is to reduce the average time taken to access data, thereby improving overall system performance.
Benefits of Cache Memory
Various benefits of cache memory are:
Faster access: Cache is faster than main memory because it resides closer to the CPU, typically on the same chip or in close proximity, and stores a subset of the data and instructions.
Reducing memory latency: Memory access latency is the time taken to retrieve data from memory. Caches are designed to exploit the principle of locality and so cut this latency.
Lowering bus traffic: Accessing data from main memory involves transferring it over the system bus. The bus is a shared resource, and excessive traffic can lead to congestion and slower data transfers. By using cache memory, the processor accesses main memory less often, resulting in less bus traffic and improved system efficiency.
Increasing effective CPU utilization: Cache memory allows the CPU to operate at a higher effective speed, spending more time executing instructions rather than waiting for memory access. This leads to better utilization of the CPU's processing capabilities and higher overall system performance.
Enhancing system scalability: Cache memory helps improve system scalability by reducing the impact of memory latency on overall system performance.
Working of Cache Memory
In order to understand the working of cache, we must note a few points:
Cache memory is faster, so it can be accessed very quickly.
Cache memory is smaller, so a large amount of data cannot be stored in it.
Whenever the CPU needs any data, it first searches for it in the cache (a fast process). If the data is found, it is processed according to the instructions. If the data is not found in the cache, the CPU searches for it in primary memory (a slower process) and loads it into the cache. This ensures that frequently accessed data is usually found in the cache, and hence minimizes the time required to access the data.
How does Cache Memory Improve CPU Performance?
Cache memory improves CPU performance by reducing the time it takes for the CPU to access data. By storing frequently accessed data closer to the CPU, it minimizes the need for the CPU to fetch data from the slower main memory.
What is a Cache Hit and a Cache Miss?
Cache Hit: The CPU finds the required data in the cache memory, allowing quick access. If the data is found on searching the cache, a cache hit has occurred.
Cache Miss: The required data is not found in the cache, forcing the CPU to retrieve it from the slower main memory. If the data is not found on searching the cache, a cache miss has occurred.
The performance of a cache is measured by the ratio of cache hits to the total number of searches; this parameter is known as the Hit Ratio. (A small simulation after the formula illustrates it.)
Hit Ratio = (Number of cache hits) / (Number of searches)
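A minimal Python sketch of this behaviour, using a hypothetical access sequence, shows how hits and misses accumulate and how the hit ratio is computed:

cache = set()   # addresses currently held in the cache (simplified: no capacity limit)
hits = misses = 0

# Hypothetical sequence of memory addresses requested by the CPU
accesses = [10, 20, 10, 30, 20, 10, 40, 10]

for addr in accesses:
    if addr in cache:    # cache hit: data found in the cache
        hits += 1
    else:                # cache miss: fetch from main memory, then load into the cache
        misses += 1
        cache.add(addr)

hit_ratio = hits / (hits + misses)
print(f"Hits: {hits}, Misses: {misses}, Hit ratio: {hit_ratio:.2f}")  # 4, 4, 0.50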
Types of Cache Memory
L1 or Level 1 Cache: It is the first level of cache memory that is present inside the processor. It is present in a small amount inside every core of the processor separately. The size of this memory ranges from 2KB to 64 KB.
L2 or Level 2 Cache: It is the second level of cache memory, which may be present inside or outside the CPU. If it is not present inside the core, it can be shared between two cores, depending on the architecture, and is connected to the processor by a high-speed bus. Its size ranges from 256 KB to 512 KB.
L3 or Level 3 Cache: It is the third level of cache memory, present outside the CPU and shared by all the cores of the CPU. Some high-end processors have this cache. It is used to raise the performance of the L2 and L1 caches. Its size ranges from 1 MB to 8 MB.
Application of Cache Memory
CPU Caches: Cache memory is commonly used as CPU caches, such as L1, L2, and L3 caches, to store frequently accessed instructions and data.
Web Caching: In web servers, cache memory is utilized to store frequently requested web pages and resources, reducing server load and improving response times.
Database Caching: Database systems often employ cache memory to store frequently accessed data and query results, enhancing query performance and reducing database server load.
Disk Caching: Operating systems use cache memory to temporarily store data read from and written to disk drives, reducing disk access times and improving overall system performance.
Advantages of Cache Memory
Faster Access: Cache memory provides quicker access to frequently used data compared to main memory, improving overall system performance.
Reduced Latency: By storing frequently accessed data closer to the processor, cache memory reduces memory access latency, enhancing system responsiveness.
Improved Efficiency: Cache memory reduces the need for frequent access to slower main memory, optimizing system resources and reducing memory bottlenecks.
Enhanced Throughput: Cache memory increases system throughput by minimizing the time spent waiting for data to be fetched from slower main memory.
Disadvantages of Cache Memory
Limited Capacity: Cache memory has limited capacity compared to main memory, resulting in the potential for cache thrashing and reduced effectiveness for large datasets.
High Cost per Byte: Cache memory is more expensive per unit of storage compared to main memory, making it economically impractical to have large cache sizes.
Complex Management: Cache memory requires complex management algorithms to ensure that the most relevant data is stored in the cache, which can introduce overhead and complexity.
Coherency Issues: In multiprocessor systems, maintaining cache coherency between multiple caches can be challenging and may lead to synchronization overhead and performance degradation.
Example of Cache memory
An example of cache memory is the CPU cache, which stores frequently used instructions and data to reduce the latency of memory access. Another example is web browser cache, which stores recently accessed web pages and resources for faster loading upon subsequent visits.
Cache Mapping
Cache mapping refers to the method used to store data from main memory into the cache. It determines how data from memory is mapped to specific locations in the cache. or,
Cache mapping is a technique used to bring main memory content into the cache, or to identify the cache block in which the required content is present.
Cache mapping is the procedure that decides in which cache line a main memory block will be placed. In other words, the pattern used to copy the required main memory content to a specific location in cache memory is called cache mapping.
The process of extracting, from the main memory address, the cache memory location (and other related information) in which the required content is present is called cache mapping. Cache mapping is done on collections of bytes called blocks. In the mapping, a block of main memory is moved to a line of the cache memory.
Need for Cache Mapping
Cache mapping is needed to identify where the required content is present in the cache memory. Mapping provides the cache line number where the content is present in the case of a cache hit, or where to place the content brought from main memory in the case of a cache miss.
Types of Cache mapping
There are three different types of mapping used for the purpose of cache memory, which are as follows:
Direct Mapping
Fully Associative Mapping
Set-Associative Mapping
1. Direct Mapping
Direct mapping is a simple and commonly used cache mapping technique where each block of main memory is mapped to exactly one location in the cache, called a cache line. If two memory blocks map to the same cache line, one will overwrite the other, leading to potential cache misses. Direct mapping's performance is directly proportional to the hit ratio.
In direct mapping the physical address is divided into three parts: tag bits, cache line number and byte offset. The bits in the cache line number represent the cache line in which the content is present, the tag bits are identification bits that indicate which block of main memory is present in the cache, and the bits in the byte offset decide in which byte of the identified block the required content is present.
“In direct mapping,
A particular block of main memory can map only to a particular line of the cache.
The line number of the cache to which a particular block can map is given by:
Cache line number = (Main Memory Block Address) mod (Number of lines in the cache)”
Example: Suppose the cache has 8 lines. Main memory block 25 maps to line 25 mod 8 = 1. Block 9 also maps to line 9 mod 8 = 1, so bringing block 25 into the cache would evict block 9. The sketch below repeats this computation in code.
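A minimal Python sketch of the direct-mapping rule (the cache size and block addresses are illustrative assumptions):

NUM_LINES = 8  # assumed number of cache lines

def direct_mapped_line(block_address):
    # Each main memory block maps to exactly one cache line
    return block_address % NUM_LINES

for block in (9, 25, 12):
    print(f"Block {block} -> cache line {direct_mapped_line(block)}")
# Block 9  -> cache line 1
# Block 25 -> cache line 1  (conflicts with block 9: one evicts the other)
# Block 12 -> cache line 4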
2. Fully Associative Mapping
Fully associative mapping is a type of cache mapping where any block of main memory can be stored in any cache line. Unlike direct-mapped cache, where each memory block is restricted to a specific cache line based on its index, fully associative mapping gives the cache the flexibility to place a memory block in any available cache line. This improves the hit ratio but requires a more complex system for searching and managing cache lines.
In fully associative mapping the address is divided into two parts: tag bits and byte offset. The tag bits identify which memory block is present, and the bits in the byte offset field decide in which byte of the block the required content is present.
In short: “In fully associative mapping,
A block of main memory can map to any line of the cache that is freely available at that moment.
This makes fully associative mapping more flexible than direct mapping.”
Example:
Here,
All the lines of the cache are freely available.
Thus, any block of main memory can map to any line of the cache.
Had all the cache lines been occupied, one of the existing blocks would have to be replaced.
3. Set-Associative Mapping
Set-associative mapping is a compromise between direct-mapped and fully associative mapping in cache systems. It combines the flexibility of fully associative mapping with the efficiency of direct mapping. In this scheme, multiple cache lines (typically 2, 4, or more) are grouped into sets. This reduces the conflict misses that occur in direct mapping while still limiting the search space compared with fully associative mapping.
In set-associative mapping the cache lines are divided into sets. The address is divided into three parts: tag bits, set number and byte offset. The bits in the set number decide in which set of the cache the required block is present, the tag bits identify which block of main memory is present, and the bits in the byte offset field give the byte of the block in which the content is present.
In short: “In k-way set associative mapping,
Cache lines are grouped into sets, where each set contains k lines.
A particular block of main memory can map to only one particular set of the cache.
However, within that set, the memory block can map to any cache line that is freely available.
The set of the cache to which a particular block of the main memory can map is given by:
Cache set number = (Main Memory Block Address) mod (Number of sets in the cache)”
Algorithm:
m = v * k
i = j mod v
where
i = cache set number
m = number of lines in the cache
j = main memory block number
v = number of sets
k = number of lines in each set
Example: Consider the following case of 2-way set-associative mapping.
Here,
k = 2 suggests that each set contains two cache lines.
Since the cache contains 6 lines, the number of sets in the cache = 6 / 2 = 3 sets.
Block 'j' of main memory can map only to set number (j mod 3) of the cache.
Within that set, block 'j' can map to any cache line that is freely available at that moment.
If all the cache lines are occupied, one of the existing blocks will have to be replaced. (The sketch below repeats this computation in code.)
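A minimal Python sketch of the same 2-way example (6 lines, 3 sets; the block numbers are illustrative):

K = 2                      # lines per set (2-way)
NUM_LINES = 6              # total cache lines, from the example above
NUM_SETS = NUM_LINES // K  # v = m / k = 3 sets

def cache_set(block_address):
    # i = j mod v: the block may occupy either free line of this set
    return block_address % NUM_SETS

for block in range(8):
    print(f"Block {block} -> set {cache_set(block)}")
# Blocks 0, 3, 6 -> set 0; blocks 1, 4, 7 -> set 1; blocks 2, 5 -> set 2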
Or, in short, from an understanding point of view:
Cache Mapping Techniques
Cache mapping denotes the approach that transfers content from RAM to the cache memory in computer systems. There are three different types of memory mapping in use today:
Direct Mapping
Full Associative Mapping
Set Associative Mapping (N-way)
1. Direct Mapping
The simplest memory mapping approach is direct mapping which is used to copy the block of main memory to the available cache line.
In this methodology, each memory block is assigned to a particular cache line directly, without any intermediate step. If the cache line is already occupied by another memory block, the previous block is evicted to load the new one.
The memory address is divided into two segments, an index field and a tag field: the index selects the cache line, and the tag is stored in the cache as a reference that identifies which block occupies that line.
2. Full Associative Mapping
In Full-associative mapping, associative memory is included as a middleware to store the content and addresses of the main memory. In this approach, any memory block can be aligned with any cache line freely to allow the placement of any word at any location of cache memory. This approach is highly optimized, flexible, and faster compared to direct mapping.
3. Set-Associative Mapping (N-way)
In this cache mapping, designers tried to resolve the limitations and downsides of direct mapping by adjusting the direct-mapping algorithm. Set-associative cache mapping fundamentally reduces the need to thrash (repeatedly evict) an occupied block, which is common in the direct-mapping approach.
The algorithm groups multiple lines together instead of mapping each block to a single fixed line; a memory block can then be mapped to any of the cache lines within its set. It is basically a hybrid of the two mapping techniques above and offers the best of both worlds.
Multi-level Cache Memory
(Source: https://www.geeksforgeeks.org/multilevel-cache-organisation/)
Cache is a random access memory used by the CPU to reduce the average time taken to access memory.
Multilevel Caches is one of the techniques to improve Cache Performance by reducing the “MISS PENALTY”. Miss Penalty refers to the extra time required to bring the data into cache from the Main memory whenever there is a “miss” in the cache.
For a clear understanding, let us consider an example where the CPU requires 10 memory references to access the desired information, and consider this scenario in the following three cases of system design.
Case 1 : System Design without Cache Memory
Here the CPU directly communicates with the main memory and no caches are involved.
In this case, the CPU needs to access the main memory 10 times to access the desired information.
Case 2 : System Design with Cache Memory
Here the CPU first checks whether the desired data is present in the cache memory, i.e., whether there is a “hit” or a “miss” in the cache. Suppose there are 3 misses in the cache memory; then main memory will be accessed only 3 times. We can see that the miss penalty is reduced here because main memory is accessed fewer times than in the previous case.
Case 3 : System Design with Multilevel Cache Memory
Here cache performance is optimized further by introducing multilevel caches. Consider a 2-level cache design. Suppose there are 3 misses in the L1 cache memory and, out of these 3 misses, 2 misses in the L2 cache memory; then main memory will be accessed only 2 times. It is clear that the miss penalty is reduced considerably compared with the previous case, thereby improving the performance of the cache memory.
NOTE :
We can observe from the above 3 cases that we are trying to decrease the number of main memory references and thus the miss penalty, in order to improve overall system performance. Also, it is important to note that in a multilevel cache design, the L1 cache is attached to the CPU and is small but fast, while the L2 cache is attached to the L1 cache and is larger and slower, yet still faster than main memory.
Effective Access Time = Hit rate * Cache access time + Miss rate * Lower level access time
Average access Time For Multilevel Cache:(Tavg)
Tavg = H1 * C1 + (1 - H1) * (H2 * C2 + (1 - H2) * M)
where
H1 is the Hit rate in the L1 caches.
H2 is the Hit rate in the L2 cache.
C1 is the Time to access information in the L1 caches.
C2 is the Miss penalty to transfer information from the L2 cache to an L1 cache.
M is the Miss penalty to transfer information from the main memory to the L2 cache.
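Plugging illustrative numbers into the formula above (the values are assumptions, not from the text) shows how the second cache level pulls the average access time down:

def avg_access_time(h1, c1, h2, c2, m):
    # Tavg = H1*C1 + (1 - H1) * (H2*C2 + (1 - H2)*M)
    return h1 * c1 + (1 - h1) * (h2 * c2 + (1 - h2) * m)

# Assumed values: L1 hits 90% @ 1 ns, L2 catches 80% of L1 misses @ 10 ns, memory 100 ns
print(avg_access_time(h1=0.9, c1=1, h2=0.8, c2=10, m=100))  # ≈ 3.7 ns

Without any cache every access would cost the full 100 ns, so even these modest hit rates cut the average access time by more than an order of magnitude.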
Von Neumann Architecture
Historically, there have been two types of computers: those that have a very defined function and cannot be programmed, such as calculators, and those that can be programmed (these can be configured to perform a variety of activities, and they store applications).
The contemporary computer is built on John von Neumann’s concept of stored program.
In this stored-program approach, programs and data are kept in a single storage unit called memory, and both are handled in the same way.
A computer developed with this design would be considerably easier to reprogram, thanks to this unique notion.
The Von Neumann computer architecture was proposed in 1945. It is made up of the following fundamental components:
Memory Unit
A memory unit is a collection of storage cells together with associated circuits needed to transfer information in and out of the storage. The memory stores binary information in groups of bits called words. The internal structure of a memory unit is specified by the number of words it contains and the number of bits in each word.
CPU (Central Processing Unit)
The control unit, the arithmetic-logic unit and registers make up the central processing unit (CPU), which is the most important portion of any digital computer system. The CPU is the computer's brain, including all of the circuitry required to process input, store data, and generate output. The CPU always follows the computer program's instructions, which tell it which information to process and how to process it. We couldn't run applications on a computer without a CPU.
CU (Control Unit)
It is responsible for all processor control signals and timing signals. It governs how data moves throughout the system, directs all input and output flow, and fetches the code for instructions.
ALU (Arithmetic and Logic Unit)
The arithmetic logic unit (ALU) is the portion of the CPU that handles all of the CPU’s computations, such as addition, subtraction, and comparisons. Also, Logical operations, arithmetic operations, and bit shifting operations are all performed by it.
Registers – Registers refer to high-speed storage areas in the CPU. The data processed by the CPU are fetched from the registers. There are different types of registers used in architecture :-
➢ Accumulator: Stores the results of calculations made by the ALU and holds the intermediate results of arithmetic and logical operations. It acts as a temporary storage location.
➢ Program Counter (PC): Keeps track of the memory location of the next instruction to be dealt with. The PC then passes this address to the Memory Address Register (MAR).
➢ Memory Address Register (MAR): It stores the memory locations of instructions that need to be fetched from memory or stored in memory.
➢ Memory Data Register (MDR): It stores instructions fetched from memory or any data that is to be transferred to, and stored in, memory.
➢ Current Instruction Register (CIR): It stores the most recently fetched instruction while it is waiting to be decoded and executed.
➢ Instruction Buffer Register (IBR): An instruction that is not to be executed immediately is placed in the instruction buffer register (IBR). A small sketch after this list shows some of these registers in action during a fetch.
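A minimal, purely illustrative Python sketch of one fetch loop using the PC, MAR, MDR and CIR (the program in memory is made up):

memory = {0: "LOAD 5", 1: "ADD 3", 2: "HALT"}  # hypothetical program in main memory

pc = 0                   # Program Counter: address of the next instruction
while True:
    mar = pc             # PC passes the address to the Memory Address Register
    mdr = memory[mar]    # the word at that address arrives in the Memory Data Register
    cir = mdr            # the instruction is copied to the Current Instruction Register
    pc += 1              # PC now points to the next instruction
    print(f"Fetched '{cir}' from address {mar}")
    if cir == "HALT":
        break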
Disadvantages Of Von Neumann Architecture
One of the main limitations is that the shared bus can become a bottleneck if too many devices are connected to it. This can lead to slow performance and reduced scalability.
Additionally, the CPU can only execute one instruction at a time, which can limit the overall speed of the system.
Harvard architecture
Harvard architecture is a type of computer architecture that has separate memory spaces for instructions and data. It was developed at Harvard University in the 1930s, and it is named after this institution.
In a Harvard architecture system, the CPU accesses instruction and data memory spaces separately, which can lead to improved performance.
Overall, Von Neumann architecture is more flexible and easier to program, whereas Harvard architecture is more efficient and better suited for embedded systems that require high performance and reliability.
Components:
✓CPU: The central processing unit performs all the calculations and operations required to execute instructions.
✓Instruction memory: This memory holds instructions that the CPU needs to execute. It is typically implemented as read-only memory (ROM) or flash memory.
✓Data memory: This memory holds data that the CPU needs to perform computations. It is typically implemented as random access memory (RAM).
✓Input/output (I/O) devices: These devices are used to communicate with the outside world. Examples include keyboards, displays, and printers.
✓System bus: The system bus is a collection of wires that connects the CPU, instruction memory, data memory, and I/O devices. It is used to transmit data, instructions, and control signals between these components.
Advantages Of Harvard Architecture
✓The CPU can access both instruction and data memory simultaneously.
✓This can lead to improved performance because the CPU does not have to switch between memory spaces as often as in a Von Neumann architecture.
✓Additionally, because the instruction memory is typically implemented as ROM or flash memory, it is non-volatile, meaning that it does not lose its contents when power is turned off.
✓This makes it well-suited for embedded systems that need to operate without a constant power source.
Disadvantages Of Harvard Architecture
✓As the CPU accesses instruction and data memory separately, it can be more difficult to write programs that require the CPU to modify its own code.
✓Additionally, because the instruction and data memories are separate, it can be more difficult to share data between different parts of a program.
Difference between Von Neumann Architecture & Harvard Architecture
Memory: Von Neumann uses a single memory for instructions and data; Harvard uses separate memories for instructions and data.
Access: In Von Neumann, the CPU accesses instructions and data through a shared bus; in Harvard, the CPU accesses instruction and data memory spaces separately.
Performance: Von Neumann can become a bottleneck if too many devices are connected to the shared bus; Harvard improves performance because the CPU can access instruction and data memory simultaneously.
Modification: Von Neumann programs are easier to modify, as instructions and data are stored in the same memory; in Harvard it is more difficult to write programs that require the CPU to modify its own code.
Data sharing: Easy to share data between different parts of a program in Von Neumann; more difficult in Harvard.
Applications: Von Neumann is suitable for general-purpose computing where flexibility is required; Harvard is suitable for embedded systems where performance is critical and code is not frequently modified.
RISC Architecture
RISC stands for Reduced Instruction Set Computer, a microprocessor architecture with a small and highly optimized set of instructions.
It is built to minimize instruction execution time by optimizing and limiting the number of instructions. Each instruction is designed to complete in one clock cycle, and each cycle consists of three stages: fetch, decode and execute.
A RISC processor carries out complex operations by combining simpler instructions. RISC chips require fewer transistors, which makes them cheaper to design and reduces instruction execution time.
Examples of RISC processors are SUN's SPARC, PowerPC, Microchip PIC processors and RISC-V.
Characteristics of RISC:
1. It has simpler instructions and thus simple instruction decoding.
2. More general-purpose registers.
3. The instruction takes one clock cycle in order to get executed.
4. The instruction comes under the size of a single word.
5. Pipeline can be easily achieved.
6. Few data types.
7. Simpler addressing modes.
CISC
The CISC architecture comprises a complex instruction set. A CISC processor has a variable-length instruction format. In this processor architecture, the instructions that require register operands can take only two bytes.
In a CISC processor architecture, the instructions which require two memory addresses can take five bytes to comprise the complete instruction code. Therefore, in a CISC processor, the execution of instructions may take a varying number of clock cycles. The CISC processor also provides direct manipulation of operands that are stored in the memory.
The primary objective of the CISC processor architecture is to support a single machine instruction for each statement that is written in a high-level programming language.
Characteristics of CISC:
• The length of the code is short, so it requires very little RAM.
• CISC (complex) instructions may take longer than a single clock cycle to execute.
• Fewer instructions are needed to write an application.
• It provides easier programming in assembly language.
• It supports complex data structures and easy compilation of high-level languages.
• It is composed of fewer registers and more addressing modes, typically 5 to 20.
• Instructions can be larger than a single word.
• It emphasizes implementing complex instructions directly in hardware, because hardware execution is faster than an equivalent software routine.
Difference Between RISC and CISC
RISC
1. It stands for Reduced Instruction Set Computer.
2. It is a microprocessor architecture that uses a small instruction set of uniform length.
3. These simple instructions are executed in one clock cycle.
4. These chips are relatively simple to design.
5. They are inexpensive.
6. Examples of RISC chips include SPARC and PowerPC.
7. It has a smaller number of instructions.
8. It has fixed-length encodings for instructions.
9. Simple addressing formats are supported.
10. It doesn't support arrays.
11. It doesn't use condition codes.
12. Registers are used for procedure arguments and return addresses.
CISC
1. It stands for Complex Instruction Set Computer.
2. It offers hundreds of instructions of different sizes to the users.
3. This architecture has a set of special-purpose circuits which help execute the instructions at high speed.
4. These chips are complex to design.
5. They are relatively expensive.
6. Examples of CISC chips include the Intel x86 architecture and AMD processors.
7. It has more instructions.
8. It has variable-length encodings of instructions.
9. The instructions interact with memory using complex addressing modes.
10. It supports arrays.
11. Condition codes are used.
12. The stack is used for procedure arguments and return addresses.
I/O Organization
Input/Output (I/O) programming in computer architecture involves managing the communication and data transfer between a computer's central processing unit (CPU) and external devices. This process is crucial for interacting with peripherals such as keyboards, mice, storage devices, and network interfaces.
I/O programming can be accomplished through various techniques, including programmed I/O, interrupt-driven I/O, and Direct Memory Access (DMA).
Programmed I/O is the simplest technique for exchanging data between external devices and the processor. In this technique, the processor (CPU) runs a program that gives it direct control of the I/O operations.
Processor issues a command to the I/O module and waits for the operation to complete. Also, the processor keeps checking the I/O module status until it finds the completion of the operation.
The processor's time is wasted when it is faster than the I/O module, since the I/O module is a comparatively slow device.
This approach is applied in certain low-end microcomputers. It has a single output instruction and a single input instruction.
Each instruction selects one I/O device by number and transfers a single character (byte).
Programmed I/O
• Programmed I/O instructions are the result of I/O instructions written in a computer program. Each data item transfer is initiated by the instruction in the program.
• Usually the program controls the data transfer to and from the CPU and the peripheral. Transferring data under programmed I/O requires constant monitoring of the peripherals by the CPU.
• It is the responsibility of the processor to control the transfer from I/O to main memory as input and from main memory to I/O as output.
• To overcome the difference in speed between the processor and the I/O device, a mechanism must be implemented to synchronize the transfer of data between them. This is why programmed I/O is required.
• When the processor is executing a program and encounters an instruction relating to input/output, it executes that instruction by issuing a command to the appropriate input/output module.
• With the programmed input/output, the input/output module will perform the required action and then set the appropriate bits in the input/output status register.
• The input/output module takes no further action to alert the processor; in particular, it doesn't interrupt the processor. Thus, it is the responsibility of the processor to check the status of the input/output module periodically, until it finds that the operation is complete.
Disadvantages
• In the programmed I/O method, the CPU stays in a program loop until the I/O unit indicates that it is ready for data transfer. This is a time-consuming process because it keeps the processor busy needlessly.
Interrupt-driven I/O
It is such type of I/O programming in which the processor does not wait until the I/O operation is completed. The processor performs other tasks while the I/O operation is being performed.
When the I/O operation is completed, the I/O module interrupts the processor, letting it know that the operation is complete. This technique is faster than programmed I/O.
The processor starts the I/O device and instructs it to generate and send an interrupt signal when the operation is finished. This is achieved by setting an interrupt-enable bit in the status register.
This technique requires an interrupt for each character that is written or read. It is an expensive business to interrupt a running process as it requires saving context.
It requires additional hardware such as a PIC controller chip. It is fast and efficient.
The system's performance is improved and enhanced.
Direct memory access (DMA)
DMA (Direct memory access) is the special feature within the computer system that transfers the data between memory and peripheral devices(like hard drives) without the intervention of the CPU.
The process is managed by a chip known as a DMA controller
Working of DMA Controller
• DMA controller must share the bus with the processor to make the data transfer.
• The device that holds the bus at a given time is called bus master.
When an I/O device wants to initiate a transfer, it sends a DMA request signal to the DMA controller, which acknowledges the request if it is free. The DMA controller then requests control of the bus from the processor by raising the bus request signal.
• Processor grants the bus to the controller by raising the bus grant signal, now DMA controller is the bus master.
• The processor initiates the DMA controller by sending the memory addresses, number of blocks of data to be transferred and direction of data transfer.
• After assigning the data transfer task to the DMA controller, instead of waiting idly until the transfer completes, the processor resumes execution of its program. (A small sketch after this list illustrates the sequence.)
• DMA controller now has the full control of buses and can interact directly with memory and I/O devices independent of CPU.
• It performs the data transfer according to the control information received from the processor.
• After completion of the data transfer, the controller de-asserts the bus request signal and the CPU de-asserts the bus grant signal, thereby returning control of the buses to the CPU.
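The handshake described above can be sketched in Python; the function below stands in for the DMA controller, and all names are illustrative rather than a real driver API:

main_memory = [0] * 16
device_buffer = [11, 22, 33, 44]   # data waiting in the I/O device

def dma_transfer(start_address, count, direction="device_to_memory"):
    # The processor programs the controller with address, count and direction,
    # then resumes its own work; the controller moves the data itself.
    if direction == "device_to_memory":
        for i in range(count):
            main_memory[start_address + i] = device_buffer[i]
    # ...after the last word, the controller releases the bus and interrupts the CPU

dma_transfer(start_address=4, count=4)
print(main_memory)  # [0, 0, 0, 0, 11, 22, 33, 44, 0, 0, 0, 0, 0, 0, 0, 0]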
Memory-mapped I/O
In this technique, all I/O-related references are made within the memory address space.
Memory-mapped I/O (Input/Output) is a technique used in computer architecture where the same address space is used to address both memory and I/O devices.
In a memory-mapped I/O architecture, specific addresses are assigned to both RAM (random access memory) and various peripheral devices.
This allows the CPU to communicate with these devices using load and store instructions, treating them as if they were ordinary memory locations.
Memory-mapped I/O is an interfacing technique in which memory-related instructions are used for data transfer and the device is identified by a 16-bit address. In this type, the I/O devices are treated as memory locations, and the control signals used are MEMR and MEMW. The interfacing between the I/O device and the microprocessor is the same as for a single memory location: for data transfer between an I/O device and the microprocessor, the microprocessor sends the address and generates the control signals MEMR and MEMW.
Some features:
Load and Store Instructions:
The CPU communicates with I/O devices using load (read) and store (write) instructions, just like it would with regular memory locations.
For example, reading from or writing to a specific memory address may correspond to reading from or writing to a register on an I/O device.
Simplified Programming:
Memory-mapped I/O simplifies programming by providing a consistent interface for accessing both memory and I/O devices. Programmers can use the same instructions and data transfer mechanisms for both types of operations.
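A toy Python simulation of the idea, with assumed addresses: ordinary stores below 0x100 go to RAM, while address 0x100 is treated as a device's data register (both the address map and the device are invented for illustration):

RAM_SIZE = 0x100
DEVICE_REG = 0x100        # assumed address of an I/O device register

ram = [0] * RAM_SIZE
device_output = []        # stands in for, e.g., a display device

def store(address, value):
    # The same "store" operation serves both memory and the device
    if address == DEVICE_REG:
        device_output.append(value)   # the write reaches the device register
    else:
        ram[address] = value          # ordinary memory write

store(0x10, 42)          # plain memory write
store(DEVICE_REG, 7)     # same instruction form, but it drives the device
print(ram[0x10], device_output)  # 42 [7]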
Advantages of Memory-Mapped I/O:
➢ Faster I/O Operations: Memory-mapped I/O allows the CPU to access I/O devices at the same speed as it accesses memory. This means that I/O operations can be performed much faster compared to isolated I/O.
➢ Simplified Programming: Memory-mapped I/O simplifies programming as the same instructions can be used to access memory and I/O devices. This means that software developers do not have to use specialized I/O instructions, which can reduce programming complexity.
➢ Efficient Use of Memory Space: Memory-mapped I/O is more memory-efficient as I/O devices share the same address space as the memory. This means that the same memory address space can be used to access both memory and I/O devices.
Disadvantages of Memory-Mapped I/O:
• Limited I/O Address Space: Memory-mapped I/O limits the I/O address space as I/O devices share the same address space as the memory. This means that there may not be enough address space available to address all I/O devices.
• Slower Response Time: If an I/O device is slow to respond, it can delay the CPU’s access to memory. This can lead to slower overall system performance.
Interrupt
• An interrupt is a signal, generated by an external device, that halts the normal flow of the program.
• Interrupts are generally the requests from the external device to the microprocessor for performing some actions.
• When an interrupt occurs, the microprocessor temporarily switches to a different task and later returns to its previous task. Interrupts can be internal or external. When the interrupt occurs, the program stops executing and the microcontroller begins to execute the interrupt service routine (ISR).
• Interrupts are essential for efficient computing because they allow the processor to handle multiple tasks simultaneously. For example, while the processor is running a word processing program, it can also be responding to keystrokes, checking for incoming network data, and playing music in the background. Without interrupts, the processor would have to wait for each task to finish before starting the next one, which would be much slower.
• Two types of Interrupt:
Hardware and Software
Hardware interrupt
• If the interrupt signal comes to the processor from an external device or hardware, it is called a hardware interrupt.
• Example: pressing a key on the keyboard generates a signal that is given to the processor so that it takes action; we call such interrupts hardware interrupts.
There are two types of hardware interrupt.
Maskable Interrupt
• A hardware interrupt that can be delayed when a much higher-priority interrupt occurs. E.g., RST 7.5, RST 6.5 and RST 5.5.
Non-Maskable Interrupt
• A hardware interrupt that cannot be delayed and must be processed by the processor immediately. E.g., TRAP.
Software interrupt
Software interrupts are the interrupts that can be inserted into a desired location in the program. Software interrupts can also be divided into two types. They are
Normal Interrupts
• The interrupts that are caused by software instructions are called normal (software) interrupts.
Exception
• Unplanned interrupts that occur during the execution of a program are called exceptions.
Pipelining
Pipelining is a technique used in modern processors to improve performance by executing multiple instructions simultaneously. It breaks down the execution of instructions into several stages, where each stage completes a part of the instruction. These stages can overlap, allowing the processor to work on different instructions at various stages of completion, similar to an assembly line in manufacturing.
Or, pipelining in computer architecture is the process of arranging the hardware so that multiple instructions execute simultaneously, thus improving overall performance. Pipelining is a fundamental concept in computer architecture that improves a processor's efficiency. It works by breaking down a complex instruction into smaller, more manageable steps. These steps are then executed in an assembly-line fashion, like a factory pipeline.
Example of Pipelining in computer architecture
Let us consider a real-life example of taking food from a counter:
The entire process of taking food from the counter can be divided into various steps - Picking utensils, taking salad, taking food, taking vegetables, etc. Now consider the following two ways of executing this:
One person enters and takes utensils, salad, food, vegetables, and leaves. Then another person enters and repeats the process.
People stand in a queue such that when one person is taking vegetables, some other person will be taking food, someone will be taking salad and utensils.
You can see that the first process will have much lower efficiency than the second. While one person is taking food, the utensils, salad, and vegetable stalls are unused. On the other hand, people are simultaneously using the counter in the second process. Thus we have improved the efficiency of the process just by simultaneously executing multiple processes. Note that we have not used any extra resources.
The above example is similar to what we do in pipelining.
Types of Pipeline in Computer Architecture
The pipeline is divided into 2 categories:
Arithmetic Pipeline
Instruction Pipeline
1. Arithmetic Pipeline
An arithmetic pipeline focuses on dividing a single arithmetic operation (like addition, multiplication, etc.) into smaller stages. These stages could involve fetching operands from registers, performing the actual arithmetic calculation, and storing the result back in a register.
By pipelining arithmetic operations, the processor can potentially begin processing the next instruction while the current instruction is still completing in later stages. This improves the efficiency of the processor by keeping the arithmetic logic unit (ALU) constantly working on calculations.
2. Instruction Pipeline
An instruction pipeline breaks down the entire instruction fetch-decode-execute cycle into distinct stages. This might involve fetching the instruction from memory, decoding it to understand its operation, fetching operands, performing the operation, and storing the result.
With instruction pipelining, multiple instructions can be at different stages of execution concurrently, improving overall processor performance. This is because the processor is not stuck waiting for one instruction to complete all stages before it can begin processing the next one.
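A quick calculation makes the benefit concrete. Assuming an ideal k-stage pipeline with one cycle per stage and no hazards, n instructions take k + (n - 1) cycles instead of n * k:

def cycles_without_pipeline(n, k):
    return n * k            # each instruction runs through all k stages alone

def cycles_with_pipeline(n, k):
    return k + (n - 1)      # once the pipe is full, one instruction finishes per cycle

n, k = 100, 5               # illustrative: 100 instructions, 5 pipeline stages
print(cycles_without_pipeline(n, k))   # 500 cycles
print(cycles_with_pipeline(n, k))      # 104 cycles
print(cycles_without_pipeline(n, k) / cycles_with_pipeline(n, k))  # ~4.8x speedup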
1. How many cycles must each stage in the pipelining process be completed within?
a. 4
b. 3
c. 2
d. 1
Answer – (d) 1
2. What can be used in pipelining to improve memory access speed?
a. Buffers
b. Cache
c. Special purpose registers
d. Special memory locations
Answer – (b) Cache
3. When the data for operands is unavailable, it is referred to as _________.
a. Structural hazard
b. Deadlock
c. Stock
d. Data hazard
Answer – (d) Data hazard
Pipelining Hazards and Remedies
The speed of the Central Processing Unit (CPU) is inherently limited by the memory. But there's another factor that comes into play in a pipelined design - the interdependence of instructions at various stages of execution. This interdependence can slow down the pipeline and these dependencies are known as hazards as they pose a risk to the execution process.
These terms, dependencies and hazards, are often used interchangeably in computer architecture. A hazard essentially prevents an instruction in the pipeline from being executed in the designated clock cycle. We use the term clock cycle because each instruction can be in a different machine cycle.
Pipelining boosts processor performance by overlapping instruction execution. But it's not all smooth sailing. Pipeline hazards can throw a wrench in the works, causing slowdowns and wasted cycles. These hiccups come in three flavors: structural, data, and control hazards.
Luckily, clever engineers have cooked up ways to tackle these issues. From forwarding data to predicting branches, these techniques help keep the pipeline flowing smoothly. Understanding hazards and their solutions is key to grasping how modern processors squeeze out every bit of performance.
Types of Pipeline Hazards in Computer Architecture
There are primarily three types of hazards in computer architecture:
1. Structural Hazards
2. Data Hazards
3. Control Hazards
Structural Hazard:
Structural hazards are caused by conflicts over hardware resources among the instructions in the pipeline. These resources could be memory, a General Purpose Register (GPR), or an Arithmetic Logic Unit (ALU). A resource conflict arises when multiple instructions in the pipeline require access to the same resource in the same clock cycle. This is a situation where the hardware is unable to handle all possible combinations in an overlapping pipelined execution.
Data Hazards:
Data hazards occur in pipelining when the execution of one instruction depends on the results of another instruction that is still being processed in the pipeline. Data hazards are categorized into three types based on the order of READ or WRITE operations on the registers: Read After Write (RAW), Write After Read (WAR), and Write After Write (WAW).
Control Hazards:
Control hazards, also known as branch hazards, are caused by branch instructions in computer architecture. These instructions control the flow of program execution. In higher-level languages, conditional statements are used for iterative loops and condition testing (think of while, for, and if statements), and these are converted into one of the variations of the BRANCH instruction. Hence, a control hazard arises because the pipeline cannot know which instruction to fetch next until the branch condition has been evaluated.
In Short
Structural hazards occur when hardware resources required by the pipeline stages cannot be supplied simultaneously due to resource conflicts
Example: Two instructions requiring access to the same memory unit at the same time
Data hazards arise when instructions have data dependencies between them that prevent parallel execution; the three cases below are illustrated in the sketch after this list
Read after write (RAW) dependencies occur when an instruction reads a source before a previous instruction writes to it
Write after read (WAR) dependencies occur when an instruction writes to a destination before a previous instruction reads from it
Write after write (WAW) dependencies occur when two instructions write to the same destination in a different order than intended
Control hazards, also known as branch hazards, occur when the flow of instruction execution is altered by branch or jump instructions
Causes subsequent instructions that have been fetched or decoded to be discarded
Example: A branch instruction causing the pipeline to fetch instructions from a different memory address
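As a rough illustration of the three data-dependence cases above, the Python sketch below classifies the dependence between two instructions from their register read/write sets; the instruction representation is a hypothetical simplification.

```python
# A minimal sketch (hypothetical representation) classifying the data
# dependence between two instructions from their register read/write sets.

def classify(first_writes, first_reads, second_writes, second_reads):
    hazards = []
    if first_writes & second_reads:
        hazards.append("RAW")   # second reads what first writes
    if first_reads & second_writes:
        hazards.append("WAR")   # second writes what first reads
    if first_writes & second_writes:
        hazards.append("WAW")   # both write the same register
    return hazards or ["none"]

# ADD R1, R2, R3 followed by SUB R4, R1, R5 -> RAW on R1
print(classify(first_writes={"R1"}, first_reads={"R2", "R3"},
               second_writes={"R4"}, second_reads={"R1", "R5"}))
```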
Pipeline Hazard Mitigation Techniques
Resolving Structural Hazards
Provide more hardware resources to reduce conflicting resource requirements between pipeline stages
Example: Adding additional memory ports or ALUs to allow simultaneous access (simulated in the sketch after this list)
Optimize instruction scheduling to minimize resource conflicts
Rearrange instructions to avoid multiple instructions requiring the same resource in the same cycle
Use out-of-order execution to allow instructions to execute in a different order than fetched, reducing resource conflicts
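The following sketch is a simplified model showing how an extra memory port removes stall cycles caused by conflicting memory requests; the per-cycle request trace and the one-access-per-port-per-cycle assumption are made up for illustration.

```python
# A minimal sketch of a structural hazard: memory requests issued each
# cycle compete for a limited number of memory ports. Any backlog left
# over after a cycle counts as a stall cycle. Trace values are invented.

def stall_cycles(requests_per_cycle, num_ports):
    stalls = 0
    backlog = 0
    for requests in requests_per_cycle:
        backlog += requests
        backlog -= min(backlog, num_ports)  # ports serve what they can
        if backlog:                         # leftover requests stall
            stalls += 1
    return stalls

trace = [2, 1, 2, 0, 2]                     # memory requests per cycle
print(stall_cycles(trace, num_ports=1))     # conflicts with one port: 5
print(stall_cycles(trace, num_ports=2))     # extra port removes them: 0
```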
Resolving Data Hazards
Forwarding (bypassing) forwards the required data from a later pipeline stage back to an earlier stage when available; see the sketch after this list
Avoids waiting for the data to pass through pipeline registers
Implemented using multiplexers that select between register file values and forwarded results based on the hazard type
Stalling the pipeline by inserting bubbles (empty cycles) can resolve data hazards when forwarding is not possible
Gives instructions enough time to complete and write their results back to the register file
Compiler optimizations can arrange code to minimize data hazards and stalling
Out-of-order execution allows instructions to execute in a different order than fetched, reducing data dependencies
Requires complex hardware to track dependencies and reorder instructions
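The sketch below estimates RAW stall cycles for a short instruction sequence with and without forwarding. It assumes an idealized 5-stage pipeline in which an ALU result can feed the very next instruction's EX stage when forwarded, but a consumer must wait two extra cycles for write-back when it cannot; this is a simplification, not a model of real hardware.

```python
# A minimal sketch estimating RAW stall cycles in an idealized 5-stage
# pipeline. Assumption: with forwarding, a result feeds the next EX
# stage; without it, consumers wait 2 extra cycles for write-back.

def count_stalls(instructions, forwarding):
    ready = {}        # register -> first cycle its value can be consumed
    cycle = 0         # cycle in which the current instruction enters EX
    total_stalls = 0
    for dest, srcs in instructions:
        earliest = max([cycle] + [ready[r] for r in srcs if r in ready])
        total_stalls += earliest - cycle    # bubbles inserted before EX
        cycle = earliest + 1
        ready[dest] = cycle if forwarding else cycle + 2
    return total_stalls

program = [("R1", {"R2", "R3"}),   # ADD R1, R2, R3
           ("R4", {"R1", "R5"}),   # SUB R4, R1, R5  (RAW on R1)
           ("R6", {"R4", "R1"})]   # AND R6, R4, R1  (RAW on R4, R1)
print("stalls without forwarding:", count_stalls(program, False))  # 4
print("stalls with forwarding:   ", count_stalls(program, True))   # 0
```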
Resolving Control Hazards
Branch prediction techniques attempt to predict the outcome of a branch before it is known
Allows the pipeline to speculatively fetch and execute instructions from the predicted path
Static branch prediction uses fixed rules based on branch instruction type or direction
Dynamic branch prediction uses runtime information and adaptive predictors to improve accuracy (a 2-bit predictor is sketched after this list)
Delayed branching reduces control hazard penalties by rearranging instructions to fill delay slots after a branch
Allows useful work to be done while the branch is resolved
Places the burden on the compiler to correctly fill delay slots
Branch target buffers store the target addresses of previously executed branches to reduce the cycles needed to calculate the target
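As a concrete example of dynamic prediction, the sketch below implements a 2-bit saturating counter predictor, a common textbook scheme; the branch history used here is invented for illustration.

```python
# A minimal sketch of dynamic branch prediction with a 2-bit saturating
# counter (states 0-3; 0/1 predict not-taken, 2/3 predict taken).

def predict(outcomes, state=2):
    correct = 0
    for taken in outcomes:
        prediction = state >= 2
        correct += (prediction == taken)
        # Saturate: move toward 3 on taken, toward 0 on not-taken.
        state = min(3, state + 1) if taken else max(0, state - 1)
    return correct

# A loop branch: taken nine times, then falls through once.
history = [True] * 9 + [False]
print(f"{predict(history)}/{len(history)} predictions correct")  # 9/10
```

Because the counter needs two wrong outcomes in a row to flip its prediction, a single loop exit mispredicts only once rather than twice, which is the usual advantage over a 1-bit scheme.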
Multiprocessors and Multicore architecture
A multiprocessing operating system is defined as a type of operating system that makes use of more than one CPU to improve performance. Multiple processors work in parallel in a multiprocessing operating system to perform the given task. All the available processors are connected to peripheral devices, computer buses, physical memory, and clocks. The main aim of a multiprocessing operating system is to increase the execution speed of the system and thereby its overall performance. For example, UNIX, Linux, and Solaris are widely used multiprocessing operating systems.
Or,
A system with a multiprocessor has several CPUs or processors. These systems execute multiple instructions concurrently. Throughput improves as a result. The remaining CPUs will keep operating normally even if one CPU fails. Multiprocessors are therefore more dependable.
Multiprocessor systems can take advantage of distributed or shared memory. In a shared memory multiprocessor, every processor shares the main memory and peripherals in order to execute instructions concurrently, and all CPUs access main memory over a single bus. As bus traffic increases, more and more CPUs sit idle waiting for their turn on the bus. This kind of multiprocessor is also called a symmetric multiprocessor; it gives every CPU access to a single shared memory space.
A distributed memory multiprocessor gives each CPU its own private memory. Each processor uses its local data for its computational work; if remote data is required, the processor accesses main memory or communicates with other processors over the bus.
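To make the two memory models concrete, the sketch below uses Python's multiprocessing module as an analogy at the process level: a shared counter stands in for shared memory, and a queue stands in for the message passing a distributed-memory system needs. This is an illustration of the programming models, not real multiprocessor hardware.

```python
# A minimal sketch: shared-memory updates versus message passing.
from multiprocessing import Process, Queue, Value

def shared_worker(counter, n):
    for _ in range(n):
        with counter.get_lock():    # all workers update one shared value
            counter.value += 1

def message_worker(queue, n):
    queue.put(n)                    # each worker sends its local result

if __name__ == "__main__":
    counter = Value("i", 0)
    procs = [Process(target=shared_worker, args=(counter, 1000)) for _ in range(4)]
    for p in procs: p.start()
    for p in procs: p.join()
    print("shared memory total:", counter.value)     # 4000

    q = Queue()
    procs = [Process(target=message_worker, args=(q, 1000)) for _ in range(4)]
    for p in procs: p.start()
    total = sum(q.get() for _ in range(4))
    for p in procs: p.join()
    print("message passing total:", total)           # 4000
```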
Working of Multi-Processing Operating System
Multi-processing operating system consists of multiple CPUs. Each CPU is connected to the main memory.
The task to be performed is divided among all the processors.
For faster execution and improved performance, each processor is assigned a specific task.
Once every processor completes its assigned task, the partial results are compiled together to produce a single output (a minimal sketch of this workflow follows this list).
The allocation of resources for each processor is handled by the operating system. This process results in better utilization of the available resources and improved performance.
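Here is a minimal Python sketch of that workflow: the data is divided among all available processors, each computes a partial result, and the partial results are combined into a single output. The chunking scheme and the sum-of-squares workload are illustrative choices, not part of any particular operating system.

```python
# A minimal sketch: divide a task among all processors, compute partial
# results in parallel, then combine them into a single output.
from multiprocessing import Pool, cpu_count

def partial_sum(chunk):
    return sum(x * x for x in chunk)             # one processor's task

if __name__ == "__main__":
    data = list(range(100_000))
    n = cpu_count()
    chunks = [data[i::n] for i in range(n)]      # divide the work
    with Pool(processes=n) as pool:
        partials = pool.map(partial_sum, chunks) # execute in parallel
    print("single combined output:", sum(partials))
```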
Types of Multiprocessing Operating Systems
Multi-processing operating systems are classified into two types. They are:
1. Symmetrical Multiprocessing Operating System
In a symmetrical multiprocessing operating system, each processor runs an identical copy of the operating system. The processors coordinate with one another to make sure the system works efficiently, and CPU scheduling algorithms assign each task to the CPU with the least burden. A symmetrical multiprocessing operating system is also known as a "Shared Everything System" because all the processors share the memory and the input-output bus.
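The "least burden" scheduling idea can be sketched in a few lines; the task costs and the load metric below are hypothetical simplifications of what a real scheduler would track.

```python
# A minimal sketch (hypothetical load metric) of SMP scheduling:
# each incoming task is assigned to the CPU with the least current load.

def assign(task_costs, num_cpus):
    load = [0] * num_cpus
    placement = []
    for cost in task_costs:
        cpu = load.index(min(load))   # pick the least-burdened CPU
        load[cpu] += cost
        placement.append(cpu)
    return placement, load

placement, load = assign([5, 3, 8, 2, 4, 7], num_cpus=2)
print("task -> cpu:", placement)
print("final load: ", load)
```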
Advantages
Failure of one processor does not affect the functioning of other processors.
It divides all the workload equally to the available processors.
Make use of available resources efficiently.
Disadvantages
Symmetric multiprocessing operating systems are more complex.
They are more costly.
Synchronization between multiple processors is difficult.
2. Asymmetrical Multiprocessing Operating System
In an asymmetrical multiprocessing operating system, one processor acts as a master while all the remaining processors act as slaves. The master processor maintains a ready queue and supplies ready-to-execute processes to the slave processors; a scheduler created by the master assigns processes to the slaves for execution.
Advantages
Asymmetrical multiprocessing operating systems are cost-effective.
They are easy to design and manage.
They are more scalable.
Disadvantages
There can be uneven distribution of workload among the processors.
The processors do not share the same memory.
The entire system goes down if the master processor fails.
Use Cases of Multiprocessor Organization
Let us now discuss some of the use cases of Multiprocessor Organization.
High-performance computing clusters :- Multiprocessor systems are used in clusters to distribute computational tasks among multiple processors, enabling high-performance computing for scientific simulations, weather forecasting, and financial modeling.
Database management systems :- Multiprocessor systems are utilized in database servers to handle concurrent user requests and provide efficient data processing and retrieval.
Web servers:- Multiprocessor systems are employed in web servers to handle a large number of simultaneous client connections and deliver fast response times.
Virtualization and cloud computing :- Multiprocessor systems are used in virtualized environments and cloud computing platforms to provide scalable and efficient computing resources for multiple virtual machines or containers.
Real-time systems :- Multiprocessor systems are used in real-time applications, such as flight control systems and process control systems, to ensure a timely and predictable response to critical events.
Conclusion
Multiprocessing operating systems are designed in such a way that multiple processors can work simultaneously. They provide advantages such as better performance, efficient utilization of resources and high availability.
Multicore Processor
A multicore processor comprises multiple processing units, or "cores," on a single chip, each of which can work on a distinct task. For example, if you are doing many things at once, such as watching a movie while using other applications, one core can handle the movie playback while the other cores handle the remaining tasks at the same time.
Or,
A multi-core processor is a single computing device with many cores (separate processing units). It indicates that the system has a single CPU with several cores, each of which can read and execute computer instructions. Despite being cores rather than processors, they function in a way that gives the impression that the computer system has multiple processors. These cores can carry out regular processor operations such as branching, moving data, and adding data.
A multicore system allows a single processor to execute many instructions at once, boosting the system's total program execution performance. It increases the speed at which instructions are carried out while reducing the amount of heat produced by the CPU. Applications for multicore processors include general-purpose, embedded, network, and graphics processing (GPU).
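A minimal sketch of that multitasking idea follows, using a process pool so distinct tasks can run on separate cores at the same time; the function names are invented stand-ins for real workloads, not actual media or application code.

```python
# A minimal sketch: two distinct tasks submitted to a process pool so
# they can run on separate cores concurrently. Workloads are stand-ins.
from concurrent.futures import ProcessPoolExecutor

def decode_video_frames(n):
    return sum(i * i for i in range(n))     # stands in for movie playback

def index_documents(n):
    return sum(i * 3 for i in range(n))     # stands in for another app

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:     # one worker per core by default
        movie = pool.submit(decode_video_frames, 2_000_000)
        other = pool.submit(index_documents, 2_000_000)
        print(movie.result(), other.result())  # both ran concurrently
```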
Uses of Multicore Processor
Multicore processors are used in many devices such as desktops, laptops, smartphones, and gaming systems. Some applications that use multicore processors are listed below.
Multicore processors are used in graphics-heavy 3D games such as Overwatch and Star Wars Battlefront.
Multicore processors are well suited to editing software such as Adobe Photoshop and iMovie.
Multicore processors are used in computer-aided design (CAD) software such as SolidWorks.
Database servers are also handled by multicore CPUs.
Multicore CPUs are used to handle high network traffic.
Embedded systems can also be handled by multicore processors.
Architecture of Multicore Processor
A multi-core processor's design enables communication between all the existing cores, and processing duties are divided and assigned among them appropriately. Once all of the operations are finished, each core's processed data is sent back to the motherboard through a single shared gateway. In terms of total performance, this technique beats a single-core CPU.
Advantages of Multicore Processor
Performance: A multi-core CPU can perform more work than a single-core processor, so its overall performance is better.
Reliability: Software is distributed across different cores in a multi-core processor; if one piece of software fails, the others remain unaffected.
Software Interactions: When software runs on several cores, those cores can communicate with each other to coordinate the work.
Multitasking: A multi-core CPU can perform multiple tasks at the same time, even when many applications are running simultaneously.
Power Consumption: A multi-core processor can consume less power, because only the parts of the chip that are actually doing work draw power and produce heat.
Disadvantages of Multicore Processors
Application Speed: A multi-core processor is designed for multitasking, so a single application does not necessarily run faster. When software jumps from one core to the next, the caches must be refilled, which reduces its speed.
Jitter: As the number of cores in a CPU grows, interference between them increases, resulting in excessive jitter; the performance of the operating system's programs may suffer, and failures may become more frequent.
Analysis: Running multiple tasks at once may require additional memory, and the resulting behavior of a multi-core CPU is difficult to analyze.
Resource Sharing: A multi-core CPU shares many resources, both internal and external, such as networks, system buses, and main memory. Software repeatedly running on the same core can be interrupted by contention for these shared resources.
Difference between MultiCore and MultiProcessor System
Multicore System
A processor that has more than one core is called a multicore processor, while one with a single core is called a unicore processor or uniprocessor. Nowadays, most systems have four cores (quad-core) or eight cores (octa-core). These cores can individually read and execute program instructions, giving the impression that the computer system has several processors, although in reality they are cores, not processors. The instructions can be calculations, data-transfer instructions, branch instructions, and so on.
The processor can run instructions on separate cores at the same time, which increases the overall speed of program execution while also reducing the heat generated by the processor.
Multicore systems support MultiThreading and Parallel Computing. Multicore processors are widely used across many application domains, including general-purpose, embedded, network, digital signal processing (DSP), and graphics (GPU). Efficient software algorithms should be used for the implementation of cores to achieve higher performance. Software that can run parallel is preferred because we want to achieve parallel execution with the help of multiple cores.
Multi processor System
Two or more processors or CPUs present in the same computer, sharing the system bus, memory, and I/O, constitute a multiprocessing system. It allows parallel execution by different processors. These systems are reliable, since the failure of any single processor does not affect the others. A quad-processor system can execute four processes at a time, while an octa-processor system can execute eight. The memory and other resources may be shared or distributed among the processors.
Difference Between MultiCore and MultiProcessor System
Definition − MultiCore: a single CPU or processor with two or more independent processing units, called cores, that can read and execute program instructions. MultiProcessor: a system with two or more CPUs that allows simultaneous processing of programs.
Speed − MultiCore: executes a single program faster. MultiProcessor: executes multiple programs faster.
Reliability − MultiCore: not as reliable as a multiprocessor. MultiProcessor: more reliable, since failure in one CPU does not affect the others.
Traffic − MultiCore: has less traffic. MultiProcessor: has more traffic.
Configuration − MultiCore: does not need to be configured. MultiProcessor: needs a somewhat complex configuration.
Cost − MultiCore: much cheaper (a single CPU that does not require a multiple-CPU support system). MultiProcessor: more expensive (multiple separate CPUs requiring a system that supports multiple processors).