The Processor: Datapath and Control (Multi-cycle implementation)

Presentation transcript:

Chapter 5-2. The Processor: Datapath and Control (Multi-cycle Implementation)
Prof. 吳安宇 (An-Yeu Wu), Department of Electrical Engineering, National Taiwan University
Course: Computer Architecture (計算機結構)
V1, 11/17/2004

Review of Single-cycle Implementation

Single-cycle implementation
Why isn't a single-cycle implementation used today?
- The cycle time is long for every instruction (a load takes the longest).
- All instructions take as much time as the slowest one.

Outline
5.1 Introduction
5.2 Logic Design Conventions
5.3 Building a Datapath
5.4 A Simple Implementation Scheme
5.5 A Multi-cycle Implementation

A multi-cycle Implementation
- Each step in the execution takes one clock cycle.
- A functional unit (e.g. the ALU) can be used more than once per instruction, as long as each use falls in a different clock cycle.
Advantages:
- Instructions can take different numbers of clock cycles.
- Functional units are shared within the execution of a single instruction.
Differences from the single-cycle implementation:
- A single memory unit is used for both instructions and data.
- A register, the Instruction Register (IR), saves the instruction after it is read from memory.
- A single ALU is used, rather than an ALU plus two adders.

Added Temporary Registers
- The Instruction Register (IR) and the Memory Data Register (MDR) save the output of memory for an instruction read and a data read, respectively. Two separate registers are used because both values are needed during the same clock cycle. The IR must hold the instruction until the end of that instruction's execution, and therefore requires a write control signal.
- The A and B registers hold the register operand values read from the register file.
- The ALUOut register holds the output of the ALU.

A multi-cycle Implementation
Two sources for a memory address, selected by a MUX:
- The PC (for instruction access)
- ALUOut (for data access: lw, sw)
A single ALU must accommodate all the inputs, which requires two changes to the datapath:
- An additional multiplexor on the first ALU input: choose between the A register and the PC.
- A four-way multiplexor on the second ALU input, choosing among:
  - the B register
  - the constant 4
  - the sign-extended offset field
  - the sign-extended offset field shifted left by 2 bits
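The two ALU input multiplexors described above can be sketched in Python as follows. This is only an illustration: the selector names `alu_src_a` and `alu_src_b` and their numeric encodings are assumptions borrowed from the usual textbook convention, since the transcript does not list them.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Sign-extend a 16-bit field to a 32-bit two's-complement pattern."""
    value &= 0xFFFF
    return (value | 0xFFFF0000) if value & 0x8000 else value


def alu_operands(alu_src_a, alu_src_b, pc, a, b, imm16):
    """Return the two ALU inputs chosen by the operand multiplexors.

    alu_src_a: 0 selects the PC, 1 selects register A (assumed encoding).
    alu_src_b: 0 -> register B, 1 -> constant 4,
               2 -> sign-extended offset,
               3 -> sign-extended offset shifted left by 2 (assumed encoding).
    """
    first = a if alu_src_a else pc
    ext = sign_extend16(imm16)
    second = (b, 4, ext, (ext << 2) & MASK32)[alu_src_b]
    return first & MASK32, second & MASK32
```

For example, `alu_operands(0, 1, pc, a, b, imm16)` selects the PC and the constant 4, which is the operand pair needed to increment the PC during instruction fetch.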

Multi-cycle Datapath

Adding Control Signals to Datapath

Program Counter Control
With the jump and branch instructions, there are three possible values to be written into the PC:
- Normal: the output of the ALU, PC + 4, which is stored directly into the PC.
- Branch: the register ALUOut, which holds the branch target address.
- Jump: the lower 26 bits of the IR shifted left by 2 and concatenated with the upper 4 bits of the incremented PC.
Two write control signals:
- PCWrite: causes an unconditional write of the PC.
- PCWriteCond: causes a write of the PC if the branch condition is also true.
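A minimal Python sketch of the PC update logic described above. The function names, the `pc_source` encoding (0, 1, 2) and the use of an ALU `zero` flag to represent the branch condition are assumptions made for illustration.

```python
def jump_target(incremented_pc, ir):
    """Upper 4 bits of the incremented PC, then IR[25:0], then two zero bits."""
    return (incremented_pc & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)


def next_pc(pc, alu_result, alu_out, ir, pc_source, pc_write, pc_write_cond, zero):
    """Return the PC value after the clock edge.

    pc_source: 0 -> ALU result (PC + 4), 1 -> ALUOut (branch target),
               2 -> jump target (assumed encoding).
    The PC is written when pc_write is set, or when pc_write_cond is set
    and the branch condition (zero) holds. Otherwise it keeps its value.
    """
    if not (pc_write or (pc_write_cond and zero)):
        return pc
    if pc_source == 0:
        return alu_result
    if pc_source == 1:
        return alu_out
    # By the jump step the PC already holds the incremented value.
    return jump_target(pc, ir)
```

Because the PC has already been incremented by the time a jump completes, the sketch takes the upper 4 bits from the current `pc` argument.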

Complete Datapath

Actions of the control signals
Priority: PCWrite > PCWriteCond

Actions of the 2-bit control signals

Breaking the Instruction Execution into Clock Cycles
The limit of one ALU operation, one memory access, and one register file access per clock cycle determines what can fit in one step.
1. Instruction fetch step
2. Instruction decode and register fetch step
3. Execute, memory address computation, or branch completion
4. Memory access or R-type instruction completion step
5. Memory read completion step
Each MIPS instruction needs 3 to 5 of these steps.

Complete Datapath

Instruction Fetch and Decode
Instruction fetch step:
    IR <= Memory[PC];
    PC <= PC + 4;
Instruction decode and register fetch step:
    A <= Reg[IR[25:21]];    # get rs
    B <= Reg[IR[20:16]];    # get rt
    ALUOut <= PC + (sign-extend(IR[15:0]) << 2);    # precompute branch target address
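The register transfers of these two steps can be mimicked in Python. This is a sketch under stated assumptions: memory is modeled as a dictionary from byte address to 32-bit word, the register file as a 32-entry list, and the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Sign-extend a 16-bit field to a 32-bit two's-complement pattern."""
    value &= 0xFFFF
    return (value | 0xFFFF0000) if value & 0x8000 else value


def fetch_and_decode(memory, regs, pc):
    """Perform the fetch step, then the decode/register-fetch step.

    Returns (ir, pc, a, b, alu_out) as they stand after the second cycle.
    """
    # Cycle 1: instruction fetch.
    ir = memory[pc]
    pc = (pc + 4) & MASK32
    # Cycle 2: decode and register fetch.
    rs = (ir >> 21) & 0x1F              # IR[25:21]
    rt = (ir >> 16) & 0x1F              # IR[20:16]
    a = regs[rs]
    b = regs[rt]
    # Branch target is precomputed from the already-incremented PC.
    alu_out = (pc + (sign_extend16(ir & 0xFFFF) << 2)) & MASK32
    return ir, pc, a, b, alu_out
```

The branch target is computed for every instruction, whether or not it turns out to be a branch; that is harmless because ALUOut is only used as a PC source when the branch is taken.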

Execution cycle
Execute, memory address computation, or branch completion:
    Memory reference:  ALUOut <= A + sign-extend(IR[15:0]);
    R-type:            ALUOut <= A op B;
    Branch:            if (A == B) PC <= ALUOut;
    Jump:              PC <= {PC[31:28], IR[25:0], 2'b00};
{x, y} is the Verilog notation for concatenation of bit fields x and y.

Instruction Completion Steps
Memory access or R-type instruction completion step:
    Memory reference:  MDR <= Memory[ALUOut];     # for lw
                       Memory[ALUOut] <= B;       # for sw
    R-type:            Reg[IR[15:11]] <= ALUOut;  # completion of R-type
Memory read completion step:
    Load:              Reg[IR[20:16]] <= MDR;
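As a cross-check on the step descriptions, the sequence of steps each instruction class passes through can be tabulated in Python. The step labels and the `cycles` helper are illustrative names, not part of the slides.

```python
# Illustrative labels for the five steps named in the slides.
FETCH = "instruction fetch"
DECODE = "decode and register fetch"
EXECUTE = "execute, address computation, or branch/jump completion"
MEM_OR_RTYPE = "memory access or R-type completion"
MEM_READ_DONE = "memory read completion"

# Steps taken by each instruction class, in order.
STEPS = {
    "lw": [FETCH, DECODE, EXECUTE, MEM_OR_RTYPE, MEM_READ_DONE],
    "sw": [FETCH, DECODE, EXECUTE, MEM_OR_RTYPE],
    "r-type": [FETCH, DECODE, EXECUTE, MEM_OR_RTYPE],
    "beq": [FETCH, DECODE, EXECUTE],
    "j": [FETCH, DECODE, EXECUTE],
}


def cycles(instruction_class):
    """Each step takes one clock cycle, so the cycle count is the step count."""
    return len(STEPS[instruction_class])
```

Only a load needs the fifth step, which is why it is the only class that takes five cycles.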

A multi-cycle Implementation

CPI in a Multi-cycle CPU
Clock cycles per instruction class:
- Loads: 5
- Stores: 4
- ALU instructions: 4
- Branches: 3
- Jumps: 3
Instruction mix:
- 25% loads (1% load byte + 24% load word)
- 10% stores (1% store byte + 9% store word)
- 11% branches (6% beq, 5% bne)
- 2% jumps (1% jal + 1% jr)
- 52% ALU (all the rest)
CPI = 0.25*5 + 0.10*4 + 0.52*4 + 0.11*3 + 0.02*3 = 4.12
This CPI is better than the worst-case CPI of 5.0, which would result if every instruction took the same number of clock cycles.
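The CPI figure is a weighted average, which a few lines of Python can reproduce. The dictionary layout and the function name are choices made for this sketch; the frequencies and cycle counts are the ones given on the slide.

```python
# (fraction of executed instructions, clock cycles) per instruction class.
MIX = {
    "loads": (0.25, 5),
    "stores": (0.10, 4),
    "alu": (0.52, 4),
    "branches": (0.11, 3),
    "jumps": (0.02, 3),
}


def average_cpi(mix):
    """Weighted average of cycle counts, weighted by instruction frequency."""
    return sum(fraction * cycles for fraction, cycles in mix.values())


cpi = average_cpi(MIX)
print(f"CPI = {cpi:.2f}")
```

The individual terms are 1.25, 0.40, 2.08, 0.33 and 0.06, so loads and ALU instructions dominate the average.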

Techniques to Specify the Control
Two different techniques to specify the control:
- Finite state machine (state diagram)
- Microprogramming (see Appendix)
Microprogram: a symbolic representation of control in the form of instructions, called microinstructions, that are executed on a simple micromachine.

Finite-state Machine Control
The high-level view of the finite state machine control

Instruction Fetch and Decode

Memory-reference instructions

R-type

Branch

Jump

Complete State diagram

Implementation of State Diagram
A. Conventional way to implement the Control Unit
B. Use Verilog/VHDL to implement the State Diagram
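The state diagram itself is a figure that did not survive in this transcript, so the following Python sketch of a next-state function relies on an assumption: the state numbering 0 to 9 follows the usual textbook diagram (0 fetch, 1 decode, 2 address computation, 3 load access, 4 load write-back, 5 store access, 6 R-type execute, 7 R-type completion, 8 branch completion, 9 jump completion). The opcode values are the standard MIPS encodings.

```python
# Standard MIPS opcode values (IR[31:26]).
LW, SW, RTYPE, BEQ, J = 0x23, 0x2B, 0x00, 0x04, 0x02


def next_state(state, op):
    """Next-state function of the control FSM (assumed state numbering)."""
    if state == 0:                      # fetch -> decode
        return 1
    if state == 1:                      # decode: dispatch on the opcode
        if op in (LW, SW):
            return 2
        if op == RTYPE:
            return 6
        if op == BEQ:
            return 8
        if op == J:
            return 9
        raise ValueError(f"no next state defined for opcode {op:#x}")
    if state == 2:                      # address computed: load or store access
        return 3 if op == LW else 5
    if state == 3:                      # load access -> load write-back
        return 4
    if state == 6:                      # R-type execute -> R-type completion
        return 7
    if state in (4, 5, 7, 8, 9):        # final state of every instruction
        return 0
    raise ValueError(f"unknown state {state}")


def cycles(op):
    """Count clock cycles from fetch until the FSM returns to state 0."""
    state, count = 0, 0
    while True:
        state = next_state(state, op)
        count += 1
        if state == 0:
            return count
```

Walking the machine from state 0 back to state 0 gives the per-class cycle counts, which is a quick sanity check on any hand-written Verilog or VHDL version of the same diagram.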

Interrupt and Exception
Interrupts were initially created to handle unexpected events, such as arithmetic overflow, and to signal requests for service from I/O devices.
Some events generated internally or externally:

Type of event | From where? | MIPS terminology
I/O device request | External | Interrupt
Invoke the operating system from user program | Internal | Exception
Arithmetic overflow | Internal | Exception
Using an undefined instruction | Internal | Exception
Hardware malfunctions | Either | Exception or interrupt

Interrupt and Exception
- Exception: any unexpected change in control flow, without distinguishing whether the cause is internal or external.
- Interrupt: an exception whose cause is external.
This chapter only discusses how to handle an undefined instruction or an arithmetic overflow.
How exceptions are handled:
1. Save the address of the offending instruction in the Exception Program Counter (EPC) and transfer control to the operating system at some specified address.
2. The operating system takes some predefined action in response to an overflow, or stops the execution of the program and reports an error (it executes the Interrupt Service Routine).
3. The operating system terminates the program or continues its execution, using the EPC to determine where to restart the execution of the program.

Interrupt registers
Two main methods are used to communicate the reason for an exception:
- Cause register: a status register that holds a field indicating the reason for the exception (used in the MIPS architecture).
- Vectored interrupt: an interrupt for which the address to which control is transferred is determined by the cause of the exception. The operating system knows the reason for the exception from the address at which it is initiated. The addresses are separated by 32 bytes (8 instructions), and the operating system must record the reason for the exception and may perform some limited processing in this sequence.

Exception type | Exception vector address (in hex)
Undefined instruction | C000 0000
Arithmetic overflow | C000 0020
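The vector addresses in the table follow from a base address plus a fixed 32-byte stride. A small Python sketch, in which the constant and function names are hypothetical and the ordering of causes is taken from the table:

```python
VECTOR_BASE = 0xC0000000    # address of the first vector in the table
VECTOR_STRIDE = 32          # bytes between vectors, i.e. 8 instructions

# Order of the entries as they appear in the table.
CAUSES = ("undefined instruction", "arithmetic overflow")


def vector_address(cause):
    """Address at which the handler for the given exception cause starts."""
    return VECTOR_BASE + VECTOR_STRIDE * CAUSES.index(cause)
```

With a 32-byte stride each entry has room for only 8 instructions, which is why the slide notes that only limited processing can happen there.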

Handle Interrupt in MIPS
For the MIPS exception system:
- Two additional registers in the datapath:
  - EPC (exception program counter): a 32-bit register used to hold the address of the affected instruction.
  - Cause: a register used to record the cause of the exception. In the MIPS architecture, this register is 32 bits.
- Three additional control signals: EPCWrite, CauseWrite, IntCause
- Change the 3-way multiplexor (controlled by PCSource) to a 4-way multiplexor, with the additional input wired to the constant value 8000 0180hex.
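A rough Python sketch of the register updates on exception entry. Two points are assumptions rather than statements from the slides: the IntCause encoding (0 for undefined instruction, 1 for arithmetic overflow), and the computation of the EPC as PC minus 4, which relies on the PC having already been incremented during instruction fetch.

```python
MASK32 = 0xFFFFFFFF
EXCEPTION_ENTRY = 0x80000180    # constant wired to the fourth PC multiplexor input

# Assumed IntCause encoding, not given in the slides.
UNDEFINED_INSTRUCTION = 0
ARITHMETIC_OVERFLOW = 1


def enter_exception(pc, int_cause):
    """Return (new_pc, epc, cause) after EPCWrite, CauseWrite and PCWrite.

    pc is the already-incremented PC, so the offending instruction
    sits at pc - 4 (assumption stated in the lead-in).
    """
    epc = (pc - 4) & MASK32
    cause = int_cause
    return EXCEPTION_ENTRY, epc, cause
```

After this update the operating system can read Cause to find the reason and use EPC to decide where, if anywhere, to resume the program.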

Handle Interrupt in MIPS
Two new states (10 and 11) are shown in Fig 5.40:
- Undefined instruction (10): detected when no next state is defined from state 1 for the op value.
- Arithmetic overflow (11): the Overflow signal is used in the modified finite state machine to specify an additional possible next state (11) for state 7.
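The two detection rules can be expressed as a small check layered on top of the normal next-state logic. In this Python sketch the function name is hypothetical, and the set of defined opcodes is an assumption: the standard MIPS encodings for lw, sw, R-type, beq and j.

```python
# Opcodes for which state 1 has a defined next state (assumed set):
# R-type, j, beq, lw, sw.
DEFINED_OPS = {0x00, 0x02, 0x04, 0x23, 0x2B}


def exception_state(state, op, overflow):
    """Return 10 or 11 if an exception state must be entered, else None."""
    if state == 1 and op not in DEFINED_OPS:
        return 10       # undefined instruction, detected at decode
    if state == 7 and overflow:
        return 11       # arithmetic overflow, detected at R-type completion
    return None
```

A result of `None` means the controller simply follows its ordinary transition for that state.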

Complete Datapath of MIPS CPU

Complete State Diagram of Controller

HW#6
Chapter 5 exercises: 5.4, 5.5, 5.6, 5.8, 5.11, 5.13
Due date: 11/26 (Friday, by 2 pm), to TA Thinking Shen (box in front of E2-232). No late submissions.