The Processor: Datapath and Control

Presentation transcript:

Computer Organization & Design 5th. Chapter 4: The Processor: Datapath and Control. ROBERT CHEN

Outline
- Introduction
- Logic Design Conventions
- Building a Datapath
- A Simple Implementation Scheme
- A Multicycle Implementation
- Exceptions

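Introduction
Computer performance depends on three factors:
- Instruction count
- Clock cycles per instruction (CPI), which varies by instruction class: integer, arithmetic-logical, memory-reference, and branch instructions
- Clock cycle time
The compiler and the instruction set architecture (ISA) determine how many instructions a program needs. The clock cycle time and the CPI, by contrast, are determined by how the processor itself is implemented.
In this chapter we build the datapath and control unit for two different implementations of the MIPS instruction set:
- Single-cycle implementation
- Multicycle implementation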
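
The three factors above combine multiplicatively into the classic CPU time relation. The sketch below is only an illustration: the function name and the sample numbers (one million instructions, an 8 ns cycle at CPI 1, a 2 ns cycle at CPI 4) are assumptions and do not come from the slides.

```python
def cpu_time_ns(instruction_count: int, cpi: float, cycle_time_ns: float) -> float:
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_ns


# Hypothetical program of one million instructions on two hypothetical designs.
long_cycle = cpu_time_ns(1_000_000, cpi=1.0, cycle_time_ns=8.0)   # CPI 1, slow clock
short_cycle = cpu_time_ns(1_000_000, cpi=4.0, cycle_time_ns=2.0)  # higher CPI, fast clock

print(long_cycle, short_cycle)
```

With these made-up numbers the two designs tie, which is a reminder that neither CPI nor cycle time alone decides performance.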

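Introduction
The functional units of a MIPS implementation are built from two kinds of logic elements:
- Elements that operate on data values, e.g. the ALU. These are combinational: the output depends only on the current inputs.
- Elements that contain state, e.g. memories and the register file. These are sequential (sequential logic): the output depends on both the inputs and the internal state.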

Introduction
Stages of instruction execution:
- Instruction fetch
- Decode
- Operand fetch
- Execute
- Write back
Figure 4.1 gives a high-level overview of a MIPS implementation.

Introduction
We're ready to look at an implementation of the MIPS, simplified to contain only:
- memory-reference instructions: lw, sw
- arithmetic-logical instructions: add, sub, and, or, slt
- control flow instructions: beq, j

Introduction
State elements
- Unclocked vs. clocked
- Clocks are used in synchronous logic to answer the question: when should an element that contains state be updated?
[Figure: clock waveform marking the cycle time, the rising edge, and the falling edge]

Introduction
An unclocked state element
- The set-reset latch: the output depends on present inputs and also on past inputs.
Latches and flip-flops
- Latches and flip-flops are the simplest memory elements.
- The output is equal to the stored value inside the element (there is no need to ask for permission to look at the value).
- The change of state (value) is based on the clock.
- Latch: the state changes whenever the inputs change and the clock is asserted.
- Flip-flop: the state changes only on a clock edge (edge-triggered methodology).
- A clocking methodology defines when signals can be read and written. We would not want to read a signal at the same time it is being written.

Introduction
D latch
- Two inputs: the data value to be stored (D), and the clock signal (C) indicating when to read and store D
- Two outputs: the value of the internal state (Q) and its complement
- When the latch is open (C asserted), the value of Q changes as D changes, which is why it is called a transparent latch.

Introduction
D flip-flop
- Flip-flops are not transparent: the output changes only on the clock edge.
- The first latch, called the master, is open and follows the input D when C is asserted.
- When the clock input falls, the first latch is closed, but the second latch, called the slave, is open and gets its input from the output of the master latch.
[Figure: D flip-flop built from two D latches in series, with inputs D and C and output Q]
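
The difference between a transparent latch and a master-slave flip-flop can be sketched in a few lines. This is a behavioural model only, not a gate-level one; the class names and the `update` method are hypothetical, and the slave is assumed to be clocked by the inverted C so that the stored value appears at Q on the falling edge, as described above.

```python
class DLatch:
    """Transparent latch: Q follows D while C is asserted."""

    def __init__(self):
        self.q = 0

    def update(self, d: int, c: int) -> int:
        if c:              # latch open
            self.q = d
        return self.q      # latch closed: hold the stored value


class DFlipFlop:
    """Master-slave pair; the output changes only when C falls."""

    def __init__(self):
        self.master = DLatch()
        self.slave = DLatch()

    def update(self, d: int, c: int) -> int:
        m = self.master.update(d, c)            # master open while C = 1
        return self.slave.update(m, 1 - c)      # slave open while C = 0


ff = DFlipFlop()
trace = [ff.update(d, c) for d, c in [(1, 1), (0, 1), (1, 1), (1, 0), (0, 0), (0, 1), (0, 0)]]
print(trace)  # [0, 0, 0, 1, 1, 1, 0]
```

In the trace, D toggles freely while C is high without affecting Q, and a change of D while C is low is ignored until the next time the master opens.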

Introduction
Set-up time and hold time
- Set-up time: the minimum time that the input must be valid before the clock edge
- Hold time: the minimum time that the input must remain valid after the clock edge (usually very small)
[Figure: timing diagram of D and C marking the set-up time and the hold time]

Introduction
An edge-triggered methodology
- Determines when signals are read and when they are written
- Typical execution: read the contents of some state elements, send the values through some combinational logic, and write the results to one or more state elements
[Figure: state element 1, combinational logic, and state element 2, with one clock cycle marked]

Introduction
Register file
- A register file consists of a set of registers that can be read and written by supplying the number of the register to be accessed.
- It is built using D flip-flops and decoders (which select the register number).
- Read part: supply a register number as input, and the output is the information stored in that register.
[Figure: left, the read ports, built from multiplexors selected by read register numbers 1 and 2; right, a register file with two read ports and one write port]

Introduction
Register file
- Write part: needs three inputs: a register number, the data to write, and a clock that controls the writing into the register.
- Note: we still use the real clock to determine when to write.
[Figure: the write port, with a decoder on the register number selecting which register's C input is enabled, plus the register data and write inputs]
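
A register file with two read ports and one write port can be modelled behaviourally as follows. This is a sketch under assumptions: the class and method names are hypothetical, the `reg_write` enable stands in for the write control input, and calling `clock_edge` represents the clock edge on which the write takes effect.

```python
class RegisterFile:
    """Behavioural model: 2 combinational read ports, 1 clocked write port."""

    def __init__(self, num_registers: int = 32):
        self.regs = [0] * num_registers

    def read(self, read_reg1: int, read_reg2: int) -> tuple[int, int]:
        # Reads need no clock: supply two register numbers, get two values.
        return self.regs[read_reg1], self.regs[read_reg2]

    def clock_edge(self, write_reg: int, write_data: int, reg_write: bool) -> None:
        # The write happens only on the clock edge and only if enabled.
        if reg_write:
            self.regs[write_reg] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(write_reg=9, write_data=1234, reg_write=True)
rf.clock_edge(write_reg=10, write_data=99, reg_write=False)  # ignored
print(rf.read(9, 10))  # (1234, 0)
```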

Introduction
Simple implementation: basic components
- Two state elements, the instruction memory and the program counter (PC), are needed to store and access instructions.
- An adder is needed to compute the next instruction address.
- Since the instruction memory is read-only, we can treat it as combinational logic.
[Figure: the instruction memory, the program counter, and the adder]

Introduction
Fetching instructions and incrementing the PC
- A portion of the datapath used for fetching instructions and incrementing the program counter
- After the PC supplies the address used to read the instruction, it is immediately incremented by 4 so that it points to the next instruction.
[Figure: the PC driving the read address of the instruction memory and an adder that adds 4]
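
The fetch step can be sketched as a function that returns the instruction at the PC together with PC + 4. The dictionary-based instruction memory and the two sample words are placeholders chosen for illustration, not contents taken from the slides.

```python
def fetch(instruction_memory: dict[int, int], pc: int) -> tuple[int, int]:
    """Read the 32-bit word at address pc and compute the next sequential PC."""
    instruction = instruction_memory[pc]   # read-only, so treated as combinational
    next_pc = (pc + 4) & 0xFFFFFFFF        # each instruction is 4 bytes long
    return instruction, next_pc


# Hypothetical two-word program stored at byte addresses 0 and 4.
imem = {0: 0x02324020, 4: 0x00000000}

pc = 0
first, pc = fetch(imem, pc)
second, pc = fetch(imem, pc)
print(hex(first), hex(second), pc)
```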

Introduction
R-format ALU operations
- An R-format instruction has three register operands: two are read and one is written, e.g. add $t0, $t1, $t2.
- Register numbers are 5 bits wide to select among 32 registers, the data buses are 32 bits wide, and the ALU control has 4 bits.
[Figure: register file (read registers 1 and 2, write register, write data, read data 1 and 2, RegWrite) and ALU (4-bit ALU control, Zero, ALU result)]

Introduction
Datapath for R-type instructions
- E.g. add $t0, $t1, $t2
[Figure: instruction fields select the registers, the two read outputs feed the ALU, and the ALU result is written back to the register file]

Introduction
Load and store instructions
- Load and store instructions compute a memory address by adding the base register to the 16-bit signed offset field contained in the instruction.
- The sign extension unit extends the 16-bit value to 32 bits by replicating the high-order sign bit into the upper 16 bits.
- E.g. lw $t0, 40($t1) and sw $t0, 32($t1)
[Figure: 16-to-32-bit sign extension unit and data memory (address, write data, read data, MemRead, MemWrite)]
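
Sign extension and the address calculation described above can be sketched as follows. The function names and the base address 0x1000 are hypothetical; the offset 40 matches the lw example on the slide.

```python
def sign_extend_16(value: int) -> int:
    """Extend a 16-bit field to 32 bits by replicating bit 15 into the upper half."""
    value &= 0xFFFF
    if value & 0x8000:              # sign bit set: fill the upper 16 bits with ones
        return value | 0xFFFF0000
    return value


def effective_address(base: int, offset16: int) -> int:
    """Base register plus sign-extended offset, wrapped to 32 bits."""
    return (base + sign_extend_16(offset16)) & 0xFFFFFFFF


# lw $t0, 40($t1) with a hypothetical base of 0x1000 in $t1
print(hex(effective_address(0x1000, 40)))      # 0x1028
# A negative offset of -4 is encoded as 0xFFFC
print(hex(effective_address(0x1000, 0xFFFC)))  # 0xffc
```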

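Introduction
Datapath for load and store instructions
- The registers are accessed first, followed by the memory address calculation.
- The memory is then accessed: read for a load, written for a store.
- If the instruction is a load, there is also a write into the register file.
- E.g. lw $t0, 40($t1) and sw $t0, 32($t1)
[Figure: load/store datapath with $t1 as the base register, $t0 as the data register, the offset 40 passing through sign extension into the ALU, and the ALU result addressing the data memory]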

Introduction
Branch datapath
- Needs to compute the branch target address: PC + 4 is the address of the next instruction, and the offset field is shifted left two bits so that it is a word offset.
- For a J-type (jump) instruction: PC[27:0] <- Offset[25:0] followed by 00
- Needs to compare the register contents
- E.g. beq $t1, $t2, offset
[Figure: branch datapath with sign extension, shift left 2, an adder producing the branch target from PC + 4, and the ALU Zero output going to the branch control logic]
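
The two jobs of the branch datapath, computing the target and comparing the registers, can be sketched together. The function names are hypothetical, and the comparison is modelled as a subtraction whose zero result means "equal", as the ALU does it.

```python
MASK32 = 0xFFFFFFFF


def branch_target(pc: int, offset16: int) -> int:
    """(PC + 4) + (sign-extended 16-bit offset shifted left by 2)."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:
        offset -= 0x10000            # interpret the field as a signed value
    return (pc + 4 + (offset << 2)) & MASK32


def next_pc_for_beq(pc: int, rs_value: int, rt_value: int, offset16: int) -> int:
    """Take the branch only when the two register values are equal."""
    zero = ((rs_value - rt_value) & MASK32) == 0
    return branch_target(pc, offset16) if zero else (pc + 4) & MASK32


print(hex(next_pc_for_beq(0x100, 7, 7, 3)))   # taken: 0x110
print(hex(next_pc_for_beq(0x100, 7, 8, 3)))   # not taken: 0x104
```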

Introduction
Combining the datapaths
- Use multiplexors (MUX), also called data selectors, to combine the datapaths for R-type instructions and memory instructions without duplicating identical functional units.

Introduction
Combining the datapaths
- Add the instruction-fetch portion of the datapath.

Introduction
Combining the datapaths
- Add the branch portion of the datapath.
- Branch target address = offset in the instruction (shifted left two bits) + PC + 4, the address following the branch instruction

Introduction
All done?
- The hardest part is designing the control unit.

A Simple Implementation Scheme
This simple implementation covers:
- Load word (lw) and store word (sw)
- Branch on equal (beq)
- ALU instructions: add, sub, and, or, and set on less than
Depending on the instruction class, the ALU must perform one of these operations:
- Addition, to compute the memory address for lw and sw
- Subtraction, for branch on equal
- AND, OR, subtract, add, or slt for R-type instructions (selected by the 6-bit funct field)
ALU control inputs:
  0000  AND
  0001  OR
  0010  add
  0110  subtract
  0111  set on less than
  1100  NOR (for other MIPS instructions)
[Figure: ALU with inputs a and b, a 4-bit ALU operation input, and outputs Zero, Result, Overflow, and CarryOut]
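
The six ALU control encodings listed above map directly onto a small behavioural model. This is a sketch: the function name is hypothetical, only the Result and Zero outputs are modelled (Overflow and CarryOut are left out), and set on less than is assumed to compare the operands as signed 32-bit values.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a two's complement number."""
    return x - (1 << 32) if x & 0x80000000 else x


def alu(control: int, a: int, b: int) -> tuple[int, bool]:
    """Return (result, zero) for the 4-bit ALU control input."""
    a &= MASK32
    b &= MASK32
    if control == 0b0000:
        result = a & b                                   # AND
    elif control == 0b0001:
        result = a | b                                   # OR
    elif control == 0b0010:
        result = a + b                                   # add
    elif control == 0b0110:
        result = a - b                                   # subtract
    elif control == 0b0111:
        result = 1 if to_signed(a) < to_signed(b) else 0  # set on less than
    elif control == 0b1100:
        result = ~(a | b)                                # NOR
    else:
        raise ValueError(f"unsupported ALU control {control:04b}")
    result &= MASK32
    return result, result == 0


print(alu(0b0010, 5, 7))   # (12, False)
print(alu(0b0110, 9, 9))   # (0, True), the case beq looks for
```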

A Simple Implementation Scheme
Purpose of control
- Selecting the operations to perform (ALU, read/write, etc.)
- Controlling the flow of data (multiplexor inputs)
How to get these control signals:
- The information comes from the 32 bits of the instruction.
- The ALU's operation is based on the instruction type and the function code.
Example: add $8, $17, $18
Instruction format:
  op      rs     rt     rd     shamt  funct
  000000  10001  10010  01000  00000  100000
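
Extracting the six fields from the 32-bit word is a matter of shifting and masking. The sketch below decodes the encoding of add $8, $17, $18 shown above; the function name and the dictionary layout are illustrative choices.

```python
def decode_r_type(word: int) -> dict[str, int]:
    """Split a 32-bit R-format instruction into its six fields."""
    return {
        "op":    (word >> 26) & 0x3F,   # bits 31:26
        "rs":    (word >> 21) & 0x1F,   # bits 25:21
        "rt":    (word >> 16) & 0x1F,   # bits 20:16
        "rd":    (word >> 11) & 0x1F,   # bits 15:11
        "shamt": (word >> 6) & 0x1F,    # bits 10:6
        "funct": word & 0x3F,           # bits 5:0
    }


# add $8, $17, $18, written field by field as on the slide
word = 0b000000_10001_10010_01000_00000_100000
fields = decode_r_type(word)
print(hex(word), fields)
```

The decoded register numbers come out as rs = 17, rt = 18, and rd = 8, so the destination is the third register field even though it is written first in assembly.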

What Control Signals Do We Need?

Design Method for Control
Multi-level control (decoding)
- Instruction opcode: main control unit (first level)
- ALU control: sub-control for arithmetic
- MUX control: which source and destination registers, the ALU input source, the input source of the destination register, and the input source of the PC
Result of the first level
- Seven 1-bit control lines
- A 2-bit ALUOp control signal
- These control signals can be set based solely on the opcode field of the instruction.
- Exception: PCSrc (depends on the result of the beq comparison)

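A Simple Implementation Scheme
The ALU control bits are determined by the ALUOp control bits. ALUOp is used to distinguish the instruction classes.

Opcode        ALUOp  Operation         Funct field  ALU action        ALU control input
LW            00     load word         XXXXXX       add               0010
SW            00     store word        XXXXXX       add               0010
Branch equal  01     branch equal      XXXXXX       subtract          0110
R-type        10     add               100000       add               0010
R-type        10     subtract          100010       subtract          0110
R-type        10     AND               100100       and               0000
R-type        10     OR                100101       or                0001
R-type        10     set on less than  101010       set on less than  0111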
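
The mapping from ALUOp and the funct field to the 4-bit ALU control input can be written as a lookup. This is a behavioural sketch with a hypothetical function name; real hardware would implement the same mapping as a small block of gates.

```python
# funct field -> ALU control input, used only when ALUOp = 10 (R-type)
R_TYPE_FUNCT = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # AND
    0b100101: 0b0001,  # OR
    0b101010: 0b0111,  # set on less than
}


def alu_control(alu_op: int, funct: int) -> int:
    """Second-level decoder: produce the 4-bit ALU control input."""
    if alu_op == 0b00:      # lw / sw: add for the address, funct ignored
        return 0b0010
    if alu_op == 0b01:      # beq: subtract for the comparison, funct ignored
        return 0b0110
    if alu_op == 0b10:      # R-type: funct selects the operation
        return R_TYPE_FUNCT[funct]
    raise ValueError(f"unused ALUOp {alu_op:02b}")


print(f"{alu_control(0b10, 0b101010):04b}")  # 0111
```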

ALU Control
Instructions using the ALU
- Load/store: address calculation (add), e.g. lw $t1, offset($t2)
- Branch on equal: subtract for the comparison ('taken' or 'not taken'); add/subtract for the address calculation, e.g. beq $t1, $t2, offset
- R-type: and, or, set on less than
[Figure: ALU control block with the 6-bit function field and the 2-bit ALUOp as inputs and the 4-bit ALU operation as output]

ALU Control
Multi-level control (decoding)
- First level: the main control unit decodes the instruction opcode into ALUOp: 00 = lw, sw; 01 = beq; 10 = arithmetic
- Second level: the function code for arithmetic instructions (sub-control)
- The main control unit generates the ALUOp bits as inputs to the ALU control unit.
- This reduces the size of the main control unit but may increase the delay.

ALU Control
Truth table
- X: don't-care term
- All zeros or don't-care terms are eliminated.
[Table: ALU control truth table with Input and Output columns]
Notes:
1. ALUOp currently has no '11' entry, so the original '10' is changed to '1X'.
2. In the funct field, F5 F4 is always '10', so it is changed to 'XX'.

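Designing the Main Control Unit
Instruction formats
- Op field: Op[5:0]
- For R-type, branch on equal (beq), and store instructions, the registers are given by the rs and rt fields, at instruction bits 25:21 and 20:16.
- The base register for load and store instructions is at instruction bits 25:21 (rs).
- The 16-bit offset for branch on equal (beq), load, and store instructions is at instruction bits 15:0.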

A Simple Implementation Scheme
- Seven single-bit control lines and one 2-bit ALUOp control signal
- Except for PCSrc, the control signals can be set solely based on the opcode field of the instruction.
- To generate PCSrc, we need to AND together a signal from the control unit, which we call Branch, with the Zero signal out of the ALU.
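
A main control unit of this kind can be sketched as a lookup from the opcode to the control lines. The slides name only Branch, PCSrc, and ALUOp; the other six line names (RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite) and the settings per instruction are assumed here from the usual textbook single-cycle MIPS design, and `None` marks a don't-care.

```python
# opcode -> control lines (None = don't care). Names other than Branch and
# ALUOp are assumptions taken from the standard single-cycle MIPS design.
MAIN_CONTROL = {
    0b000000: dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,            # R-type
                   MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),
    0b100011: dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,            # lw
                   MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    0b101011: dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,      # sw
                   MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    0b000100: dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,      # beq
                   MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),
}


def main_control(opcode: int) -> dict:
    """First-level decoder: every line here depends on the opcode alone."""
    return MAIN_CONTROL[opcode]


def pc_src(branch: int, zero: int) -> int:
    """PCSrc is the one exception: Branch ANDed with the ALU's Zero output."""
    return branch & zero


beq = main_control(0b000100)
print(pc_src(beq["Branch"], zero=1), pc_src(beq["Branch"], zero=0))  # 1 0
```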

The Simple Datapath with the Control Unit
[Figure: the single-cycle datapath (PC, instruction memory, registers, ALU, data memory, sign extension, multiplexors) with the control unit driven by instruction bits 31:26]

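A Simple Implementation Scheme
Why is the single-cycle implementation not used?
- The clock cycle must have the same length for every instruction (hence CPI = 1).
- The longest path through the machine among all instructions determines the clock cycle length.
- The overall performance is therefore unlikely to be very good.
Example: performance of a single-cycle machine. Assume the functional units have the following operation times:
- Memory units: 2 ns
- ALU and adders: 2 ns
- Register file (read or write): 1 ns
Which of the following implementations would be faster?
1. Every instruction completes in one clock cycle of fixed length.
2. Every instruction completes in one clock cycle, but the clock cycle length is variable.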

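A Simple Implementation Scheme
Example (continued)
To compute the performance, assume the following instruction mix: 24% loads, 12% stores, 44% R-format instructions, 18% branches, and 2% jumps.

Functional units used by each instruction class, with their latencies in ns:

Instruction class  Instruction memory  Register read  ALU operation  Data memory  Register write  Total
R-format           2                   1              2              0            1               6 ns
Load word          2                   1              2              2            1               8 ns
Store word         2                   1              2              2            0               7 ns
Branch             2                   1              2              0            0               5 ns
Jump               2                   0              0              0            0               2 ns

Solution
1. With a fixed clock, the CPU clock cycle is 8 ns, set by the slowest instruction (load word).
2. With a variable clock, the average CPU clock cycle = 8*24% + 7*12% + 6*44% + 5*18% + 2*2% = 6.3 ns.
The performance improvement ratio is 8/6.3 = 1.27.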
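
The per-class latencies and the weighted average can be reproduced from the functional unit times. In this sketch the dictionary names are hypothetical, and which units each class uses is an assumption consistent with the totals of 6, 8, 7, 5, and 2 ns given in the example.

```python
UNIT_NS = {"imem": 2, "reg_read": 1, "alu": 2, "dmem": 2, "reg_write": 1}

# Functional units on the path of each instruction class (assumed usage).
USES = {
    "load":     ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "store":    ["imem", "reg_read", "alu", "dmem"],
    "R-format": ["imem", "reg_read", "alu", "reg_write"],
    "branch":   ["imem", "reg_read", "alu"],
    "jump":     ["imem"],
}

MIX = {"load": 0.24, "store": 0.12, "R-format": 0.44, "branch": 0.18, "jump": 0.02}

latency = {cls: sum(UNIT_NS[u] for u in units) for cls, units in USES.items()}
fixed_cycle = max(latency.values())                       # slowest class sets the clock
variable_cycle = sum(MIX[c] * latency[c] for c in MIX)    # mix-weighted average
speedup = fixed_cycle / variable_cycle

print(latency)
print(f"fixed {fixed_cycle} ns, variable {variable_cycle:.2f} ns, ratio {speedup:.2f}")
```

The unrounded average is 6.34 ns, so the ratio comes out slightly below the 1.27 obtained by dividing by the rounded 6.3.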

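A Simple Implementation Scheme
Example
Suppose we add a floating-point unit:
- A floating-point add takes 8 ns.
- A floating-point multiply takes 16 ns.
All other functional unit times are as in the previous example. Which of the following implementations would be faster?
1. Every instruction completes in one clock cycle of fixed length.
2. Every instruction completes in one clock cycle, but the clock cycle length is variable.
To compute the performance, assume the following instruction mix: 31% loads, 21% stores, 27% R-format instructions, 5% branches, 2% jumps, 7% floating-point adds, and 7% floating-point multiplies.
Solution
1. The longest instruction is the floating-point multiply, so the clock cycle is 2 + 1 + 16 + 1 = 20 ns.
2. A floating-point add takes 2 + 1 + 8 + 1 = 12 ns. The average CPU clock cycle = 8*31% + 7*21% + 6*27% + 5*5% + 2*2% + 20*7% + 12*7% = 8.1 ns.
The performance improvement ratio is 20/8.1 = 2.5.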
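
Evaluating the weighted sum term by term is an easy way to check the average for the floating-point mix. The function and dictionary names below are hypothetical; the latencies and percentages are the ones stated in the example.

```python
LATENCY_NS = {
    "load": 8, "store": 7, "R-format": 6, "branch": 5, "jump": 2,
    "FP multiply": 2 + 1 + 16 + 1,   # fetch, register read, FP multiply, register write
    "FP add": 2 + 1 + 8 + 1,         # fetch, register read, FP add, register write
}

MIX = {
    "load": 0.31, "store": 0.21, "R-format": 0.27, "branch": 0.05,
    "jump": 0.02, "FP multiply": 0.07, "FP add": 0.07,
}


def average_cycle_ns(latency: dict[str, int], mix: dict[str, float]) -> float:
    """Mix-weighted average instruction time for a variable-length clock."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("instruction mix must sum to 100%")
    return sum(mix[c] * latency[c] for c in mix)


fixed = max(LATENCY_NS.values())
average = average_cycle_ns(LATENCY_NS, MIX)
print(f"fixed {fixed} ns, average {average:.2f} ns, ratio {fixed / average:.2f}")
# fixed 20 ns, average 8.10 ns, ratio 2.47
```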

Design Main Control Unit
