Chapter 6: Internet Switching Architecture. Internet Technology, Hu Yueming.


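Agenda
6.1 Switching router system architecture
6.2 Introduction to IXA and the IXP network processors
6.3 Composition of network processor application systems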

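6.1 Switching router system architecture
Components of a router
– System hardware such as the embedded CPU and network interfaces
– Embedded operating system and the various protocol software
– Network management system software
– The router's user interface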

Router performance metrics
Aggregate data rate
– Total rate at which data can arrive at or leave a network system
– Sum of the data rates on all interfaces
– Measured in bits per second (bps)
Aggregate packet rate
– Measured in packets per second (pps)
– Packet size ranges from 64 octets to 1518 octets
– Many processing operations require a fixed amount of time per packet

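Router packet processing functions
Two levels: basic processing and deep processing
– Routing table lookup
– Packet classification
– Error detection and correction
– Packet buffer management and queue management
– Fragmentation and reassembly
– Transmit queue scheduling and policing
– Traffic measurement and shaping
– Multicast processing
– Tunnel processing
– Security processing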

Early router architecture

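Early router architecture
Line card
– The attachment point for a physical network link
– Mainly performs network-layer and data-link-layer functions
Problems
– The CPU must process every packet
– Every packet must cross the bus twice
– Both the bus and the CPU are bottlenecks
Solution
– Add a processor to each line card for route lookup, so that the line card forwards most IP packets
– A packet crosses the shared bus at most once
Drawbacks
– Route lookup is limited by CPU speed
– The shared bus limits throughput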

Switching router: non-blocking transfer of packets from input ports to output ports
Switch fabric

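Switching router
Line card
– Packet processing
– Sends packets to and receives packets from the switch fabric
Switch fabric
– The physical connection within a switch between the input and output ports
– Banyan network
– Crossbar
– Parallel-access shared memory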

Router architecture with parallel-access shared memory
The data path is separated from the control path

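Port contention
Symptom: packets from several inputs need to be forwarded to the same output
Remedy: set up queues so that packets wait in line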

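Queueing schemes
– Input queueing (IQ)
– Output queueing (OQ)
– Virtual output queueing (VOQ)
– Combined input and crossbar queueing (CICQ)
– Combined input and output queueing (CIOQ)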

Input queueing
Each input port contains one packet buffer
Input queues suffer from head-of-line (HOL) blocking
An IQ switch has only 58.6% throughput due to head-of-line blocking.
M. Karol, M. Hluchyj, and S. Morgan, "Input versus Output Queueing on a Space Division Packet Switch," IEEE Transactions on Communications, Vol. 35, No. 12, December 1987.
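A small simulation can make the HOL-blocking figure concrete. The sketch below is an illustration, not the method of the cited paper: it assumes saturated input FIFOs, uniformly random destinations, and random selection among inputs contending for the same output. The function name and parameters are hypothetical.

```python
import random


def hol_saturation_throughput(n_ports=32, n_slots=5000, seed=1):
    """Estimate per-port throughput of a saturated FIFO input-queued switch."""
    rng = random.Random(seed)
    # Destination of the head-of-line packet at each (always backlogged) input.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(n_slots):
        # Group inputs by the output their head-of-line packet wants.
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        # Each requested output accepts exactly one packet per slot.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            delivered += 1
            # The next packet of the winning input moves to the head.
            hol[winner] = rng.randrange(n_ports)
        # Losing inputs keep their head packet: this is the HOL blocking.
    return delivered / (n_ports * n_slots)
```

For a finite port count the estimate sits slightly above the 58.6% asymptotic value and approaches it as the number of ports grows.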

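Output queueing
Each output port has one packet buffer
Packets arriving at the inputs at the same time can be delivered to their output queues simultaneously
The buffer memory must operate at N times the link speed, which is difficult for Gigabit networks
Virtual output queueing (VOQ)
Organize the input buffer in each input port into N parallel VOQs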

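VOQ
An input queue structure sorted by destination
Can achieve the effect of output queueing
Needs a selection algorithm that picks one of the input queues for output
With M input interfaces and N output interfaces, the input side holds M*N packet queues in total
Resolves contention among packets destined for the same output interface
Schedules the queues that come from different input interfaces and target the same output interface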
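The VOQ structure and its selection step can be sketched as follows. This is a minimal illustration that assumes a simple per-output round-robin arbiter; the slides do not name a specific selection algorithm, and the class and method names are hypothetical.

```python
from collections import deque


class VOQSwitch:
    """M inputs x N outputs, one queue per (input, output) pair."""

    def __init__(self, n_in, n_out):
        self.n_in, self.n_out = n_in, n_out
        self.voq = [[deque() for _ in range(n_out)] for _ in range(n_in)]
        self.next_in = [0] * n_out  # round-robin pointer of each output

    def enqueue(self, inp, out, pkt):
        self.voq[inp][out].append(pkt)

    def schedule_slot(self):
        """One time slot: each output accepts at most one packet,
        and each input sends at most one packet."""
        busy_inputs = set()
        sent = {}
        for out in range(self.n_out):
            for k in range(self.n_in):
                inp = (self.next_in[out] + k) % self.n_in
                if inp not in busy_inputs and self.voq[inp][out]:
                    sent[out] = self.voq[inp][out].popleft()
                    busy_inputs.add(inp)
                    self.next_in[out] = (inp + 1) % self.n_in
                    break
        return sent
```

If input 1 holds packets for outputs 0 and 1 while input 0 also holds a packet for output 0, input 1 can still send to output 1 in the slot where it loses output 0. A single FIFO per input would have blocked that packet.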

Performance comparison

Combined input and crossbar queueing (CICQ)
CICQ: combined input and crossbar queue
Input buffers plus a crossbar switch with output buffers
Makes output scheduling easier

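Combined input and crossbar queueing (CICQ)
Requires advanced integrated-circuit technology
– Switching circuits plus buffer circuits
Each crossbar input has multiple queues
– The packets in each input's queues are destined for different outputs
The output interface can implement scheduling algorithms
– Strict priority
– Weighted round robin (WRR)
– DRR and hierarchical scheduling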
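Weighted round robin, one of the output scheduling algorithms listed above, can be sketched in a few lines. This is a packet-count based variant under the assumption that all packets are of similar size; the function name and the weights are illustrative.

```python
from collections import deque


def wrr_round(queues, weights):
    """Serve one WRR round: queue i may send up to weights[i] packets."""
    sent = []
    for queue, weight in zip(queues, weights):
        for _ in range(weight):
            if not queue:
                break  # an empty queue gives up the rest of its share
            sent.append(queue.popleft())
    return sent
```

With weights 2 and 1, the first queue gets two transmission opportunities for every one of the second, as long as both stay backlogged.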

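Combined input and output queueing (CIOQ)
Suitable where CICQ is not available
Each input and each output can have multiple queues
The queues reside on the line cards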

Router with CIOQ (figure)
Ingress: classification, marking, table lookup, scheduling
Switch fabric
Egress: scheduling

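CIOQ and QoS
Functions of the input queues
– Classification: place packets into the queues of the different outputs according to the forwarding table
– Queue management: meter the queue length; police or mark packets that exceed the traffic specification
– Scheduling
Functions of the output queues
– Scheduling: implement different priorities among the packet queues
– Shaping: shape the outgoing traffic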
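Policing against a traffic specification is commonly done with a token bucket. The slides do not name a mechanism, so the sketch below is an assumed choice with hypothetical names; a non-conforming packet could be dropped or marked by the caller.

```python
class TokenBucket:
    """Single-rate token bucket meter (rate in bits/s, burst in bytes)."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0        # refill rate in bytes per second
        self.burst = burst_bytes          # bucket depth
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last = 0.0                   # time of the previous update

    def conforms(self, now, size_bytes):
        """Return True if a packet of size_bytes arriving at time now
        (seconds) is within the traffic specification."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False  # out of profile: police (drop) or mark
```

The burst size bounds how much back-to-back traffic is accepted, while the rate bounds the long-term average.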

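Division of router packet processing functions
Input processing
– Error detection and correction
– Classification and demultiplexing
– Traffic metering and policing
– Address lookup
– Security authentication
– Packet header modification
– Queue management
Packet modification at the input adds the information needed by the switch fabric, so that the packet is delivered to the right output line card.
Output processing
– Packet header modification (decrement TTL)
– Adding error detection information
– Fragmentation
– Traffic shaping
– Output security processing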
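The "decrement TTL" step also requires the IPv4 header checksum to be updated. A minimal sketch, assuming a 20-byte IPv4 header without options and using the incremental update of RFC 1624; the function names are illustrative and not taken from any router code.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over 16-bit words of the header."""
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def decrement_ttl(header: bytearray) -> bool:
    """Decrement TTL in place and patch the checksum incrementally.
    Returns False if the TTL would expire (packet must not be forwarded)."""
    if header[8] <= 1:
        return False
    old_word = (header[8] << 8) | header[9]   # TTL and protocol bytes
    header[8] -= 1
    new_word = (header[8] << 8) | header[9]
    old_sum = (header[10] << 8) | header[11]
    # RFC 1624, equation 3: HC' = ~(~HC + ~m + m')
    total = (~old_sum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    header[10:12] = struct.pack("!H", ~total & 0xFFFF)
    return True
```

The incremental form touches only the changed word, which is why it suits per-packet fast-path processing better than recomputing the checksum over the whole header.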

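Scheduling of virtual output queues: DiffServ scheduling scheme 1
Each input port has 16 virtual output queues, distinguishing 16 QoS classes
WRR scheduling is used among input ports
DRR scheduling is used within an input port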

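Scheduling of virtual output queues: DiffServ scheduling scheme 2
Adds a third scheduling level that uses strict priority (SP)
– WRR among the 16 ports
– SP between the two queue groups of a port
– DRR among the 8 queues of a queue group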

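Processors on the line card
Function
– Handle most packets
– Only control packets and exception packets need to be forwarded to the main CPU
– Also known as forwarding engine, microengine, channel processor or picoprocessor
Implementation options
– General-purpose embedded CPU
– ASIC
– Network processor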

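Processors on the line card
General-purpose CPU
– A CPU that processes and forwards IP packets
– Lower performance
ASIC
– High speed, fixed function
– Long development and manufacturing cycle
– Limited flexibility
– Hard to implement complex functions such as NAT
Network processor (NP)
– A processor type dedicated to network switching equipment
– Good performance and good flexibility
– Parallel processing architecture
– Steep learning curve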

Why NP?
Network data rates are increasing
– Less packet interval time
Protocols are becoming more dynamic and sophisticated
– L4 to L7 switching
Protocols are being introduced more rapidly
– Multicast
– RTP for VoIP
– IPTV

Data rate example
Technology                      10Base-T    100Base-T   1000Base-T
Data rate                       10 Mbps     100 Mbps    1000 Mbps
Packet rate for small packets   19.5 Kpps   195.3 Kpps  1953 Kpps
Packet rate for large packets   0.8 Kpps    8.2 Kpps    82.3 Kpps
Time per small packet           51.2 µs     5.12 µs     0.51 µs
Time per large packet           1214.4 µs   121.44 µs   12.14 µs
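The figures in this example follow directly from the link rate and the packet size (64 octets for small packets, 1518 octets for large ones). A minimal sketch of the arithmetic, ignoring preamble and inter-frame gap as the example does; the function name is illustrative.

```python
def packet_stats(rate_bps, size_octets):
    """Return (packets per second, microseconds per packet)
    for back-to-back packets of the given size on a link."""
    bits = size_octets * 8
    packets_per_second = rate_bps / bits
    microseconds_per_packet = bits / rate_bps * 1e6
    return packets_per_second, microseconds_per_packet


for name, rate in [("10Base-T", 10e6), ("100Base-T", 100e6), ("1000Base-T", 1000e6)]:
    small_pps, small_us = packet_stats(rate, 64)
    large_pps, large_us = packet_stats(rate, 1518)
    print(f"{name}: {small_pps / 1e3:.1f} Kpps / {small_us:.2f} us (small), "
          f"{large_pps / 1e3:.1f} Kpps / {large_us:.2f} us (large)")
```

The per-packet time is the processing budget: at 1000 Mbps a line card has about half a microsecond to handle each minimum-size packet.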

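Network processor
A processor designed for network applications
– Dedicated to packet processing
– Forwards packets at wire speed
– Programmable
A single-chip multiprocessor system
– Several high-speed intelligent processors that handle packets
Standard functions implemented in hardware
– For example encryption/decryption and hashing
Standardization
– Network Processing Forum (NPF)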

Product Life Cycles (figure)
Sophisticated algorithms mean longer (more costly) development and less payback
Products shown, each with its revenue opportunity: L2 Switch, 802.1p & Q, IP Forwarding, Advance Quality of Service, Firewall
Time axis: design time, selling time

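Functional positioning of the network processor (by protocol layer)
Application layer: functions keep growing and are computation-intensive; the strength of general-purpose processors
Transport layer: functions are relatively stable; handled by a general-purpose processor or by the network processor's core
Network layer: functions are partly fixed; handled jointly by the microengines and the core
Link layer: functions are fixed and processing is simple; handled by the microengines or by peripheral chips
Physical layer: functions are fixed and lack generality; handled mainly by peripheral chips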

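Functional positioning of the network processor (by plane)
Control plane
– Routing protocols
– Exception handling
Management plane
– Device configuration
– Line card management
– Interface management
– Traffic management
Data plane
– Packet header processing
– Packet reception and transmission
– Packet classification and queueing
– Data encapsulation
– Transmit scheduling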

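Main functions of the network processor
– Packet header processing
– Packet reception and transmission
– Packet classification and queueing
– Transmit scheduling
Processor hierarchy (figure): CPU, embedded processors, I/O processors
The lower levels of the processor hierarchy need the largest increase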

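Functions of the network processor
Pattern matching
– Match on fields in the packet
– Detect features in the packet (expressions that it satisfies)
Lookup
– Find a table entry by key field
Computation
– Encryption, decryption, authentication, hashing, CRC checking
Data processing
– Packet fragmentation
– TTL decrement, labeling
Queue management
– Packet buffering
– QoS-related traffic shaping and traffic engineering policies
Control processing
– Handling of exception packets, table updates, statistics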

Technologies used in network processors
Multiple processing engines per chip
– High-speed interconnections
– Special purpose or general purpose
Multithreading
– Hardware thread scheduling
– Hardware signal, mutex, synchronization
Processing engine pipelining
– Or thread pipelining
Content-addressable memory
Special acceleration hardware
– For encryption, hash, etc.
Hierarchical memory structure
– Shared internal RAM
– Large register set

Why not use a general purpose processor?
I/O speed
– Less I/O capacity
Computing speed
– Less parallelism
– General purpose processors are not as fast as network processors at data plane network processing
Memory access speed
– Received data are rarely spatially or temporally associated with each other
– General purpose processors achieve their performance by using on-chip caches to hide memory latencies

6.2 Introduction to IXA and the IXP network processors
6.2.1 Introduction to IXA
6.2.2 The IXP2400 network processor
6.2.3 The IXP2800 network processor
6.2.4 Link-layer devices

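Introduction to IXA
Internet eXchange Architecture
A network system architecture proposed by Intel
Highly programmable
Can be used to build many kinds of network equipment
Supports the NPF standards
– CSIX (Common Switch Interface Consortium)
– Implementation specifications for the various software modules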

Original IXA

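Features of Intel network processors
A general-purpose embedded core processor
– Handles the more complex tasks
– Covers control-plane and forwarding-plane work
– Determines routes and balances load across links
Multiple microengines
– Instruction set optimized for packet processing
– Handle the simpler tasks at wire speed
– Packet reception, classification, route lookup, queue management, transmission
Dedicated acceleration hardware
– Supports stack operations, hashing, encryption/decryption
On-chip memory hierarchy
– Registers, microengine local memory, scratchpad memory, on-chip SRAM
Scalable
– Supports interconnecting multiple network processors
– Supports scaling the number of microengines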

Internal Architecture (block diagram)
StrongARM core (166 MHz) with 16 KB instruction cache, 8 KB data cache and 512 B mini-data cache
Microengines 0 to 5
Hash unit, scratchpad, SRAM controller, SDRAM, PCI, IX Bus interface
Bus widths shown: 64 bit and 32 bit

StrongARM-based network processor (block diagram)
Intel StrongARM SA-1 core: 16 KB I-cache, 8 KB D-cache, 512 B mini D-cache, write buffer, read buffer, JTAG
Peripherals: UART, GPIO, 4 timers, RTC
PCI unit, SDRAM unit, SRAM unit (32-bit)
Microengines 1 to 6
Scratchpad memory (4 Kbytes), hash unit
64-bit IX Bus interface

Intel network processor product family
High end, for core routers: IXP2800 (IXP2850)
Mid range, for edge routers: IXP2400
Low end, for networked devices: IXP400 (IXP42x, IXP46x)

6.2.2 The IXP2400 network processor
XScale core
8 microengines
– Dedicated RISC processors
– 8 threads per microengine
– 4 KW control store, 640 W local memory and more registers
– CRC, CAM
Scratchpad memory
– On-chip shared memory
Shared hash unit
2 QDR SRAM channels for up to 20 Gbps
– Support for external classification engines
– Non-overlapped address space
Up to 1 Gbyte DDR SDRAM
64-bit/66 MHz PCI host CPU interface
MSF interface supporting Utopia 1/2/3, SPI-3 (POS) and CSIX interfaces
– OC-48 data rates
– Configurable RBUF and TBUF size (64, 128, 256 B)

IXP2400 (block diagram). CAP: CSR access proxy

StrongARM Characteristics
– Reduced Instruction Set Computer (RISC)
– 32-bit arithmetic
– Vector floating point provided via a coprocessor
– Byte addressable memory
– Virtual memory support
– Built-in serial port
– Facilities for a kernelized operating system

StrongARM Characteristics (continued)
– 5-stage pipeline
– Single-cycle instruction execution
– 16 KB 32-way I-cache
– 16 KB 32-way write-back D-cache
– Coprocessor support
– JTAG support

StrongARM core pipeline organization

Summary of ARM architectures
Core                                   Architecture
ARM                                    v1
ARM                                    v2
ARM2as, ARM                            v2a
ARM6, ARM600, ARM                      v3
ARM7, ARM700, ARM                      v3
ARM7TDMI, ARM710T, ARM720T, ARM740T    v4T
StrongARM, ARM8, ARM                   v4
ARM9TDMI, ARM920T, ARM940T             v4T
ARM9ES                                 v5TE
ARM10TDMI, ARM1020E                    v5TE

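XScale core
A 32-bit RISC microprocessor that uses superpipelining
– 7 to 8 stage instruction pipeline
Uses the ARM v5 fixed-point instruction set
– ARM v5 adds floating-point instructions on top of v4
– Compatible with StrongARM for user-mode applications
Supports the ARM Thumb instruction set
– ARM v5T
Supports the ARM DSP extensions
– ARM v5TE
32 KB instruction cache and 32 KB data cache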

Role Of Microengines
Ingress
– Packet receive from physical layer hardware
– Checksum verification
– Header processing and classification
– Packet buffering in memory
– Table lookup and forwarding
– Header modification
Egress
– Checksum computation
– Queue management
– Transmit schedule
– Packet transmit to physical layer hardware

Microengine Execution Pipeline
The microengines have a five-stage execution pipeline
– P0 = Fetch instruction
– P1 = Decode instruction
– P2 = Read operands
– P3 = Perform ALU/shift operation
– P4 = Write results
Developer Workbench cursors show what is happening in each pipeline stage (Stage 0 to Stage 4)
Colors of the arrows indicate
– Instruction executing
– Microengine idle
– Microengine stalled
– Instruction aborted

Microengine Enhancements
– 4/8 threads per microengine
– Multiplier unit
– Next-neighbor registers
– 640x32 local memory
– Pseudo-random number generator
– CRC calculator
– Four 32-bit timers and timer signaling
– 16-entry CAM
– Time-stamping unit

Microengine Enhancements (continued)
– Support for generalized thread signaling
– Queue manipulation mechanism that eliminates the need for mutual exclusion
– ATM segmentation and reassembly hardware
– Byte alignment facilities
– Two ME clusters with independent buses
– 4K-word instruction store
– 256 GPRs and 512 transfer registers
– 32-bit multiplication unit

SRAM Unit Features
Read/Write
– Long word
– Block of long words
Bit test/set/clear
8-entry ReadLock CAM
8-entry Push/Pop queue

SRAM Unit Architecture (block diagram of the SRAM unit internal structure)
External devices: SRAM up to 8 MB; Flash ROM 512K (nominal) to 8 MB (max); slow port for peripherals
Clock: 1/2 core clock
SA core path: AMBA translation unit and data FIFO, AMBA R/W address queue, SA core SRAM memory references
Microengine path: command reference FIFOs, SRAM transfer registers, 32-bit data
Microengine queues: 16-entry read queue, 16-entry order/write queue, 8-entry priority queue, 24-entry read lock fail queue
Address queue service arbiter, 8-entry CAM

SDRAM Unit Features
Read/Write
– Quad word
– Block of quad words
Read-Modify-Write
– Use the indirect_ref optional token
– Can modify individual bytes of the quad word
Chained Reference
– Use the chained_ref optional token
– The SDRAM unit services the same thread until the chain is completed
– Used to access non-contiguous blocks of memory

SDRAM Unit (block diagram of the SDRAM unit internal structure)
External device: SDRAM up to 256 MB, 64-bit data
Clock: 83 MHz (1/2 core clock)
SA core request logic: SA core SDRAM memory references
PCI unit request logic: PCI memory references
Microengine path: command reference FIFOs, byte aligner
Microengine queues: 16-entry ODD queue, 16-entry EVEN queue, 16-entry ORDER queue, 16-entry PRIORITY queue
Address queue service arbiter

Media or Switch Fabric (MSF) Interfaces
MSF configurable to
– Utopia 1, 2, or 3 interface
– CSIX-L1 fabric interface
– System Packet Interface Level 3 or 4 (SPI-3 or SPI-4)
– SerDes Framer Interface (SFI)
Note: The Optical Internetworking Forum (OIF) controls the SPI and SFI standards.

SPI: System Packet Interface
SPI-3
– OC-48 system interface for physical and link layer devices
– 32-bit interface, 4 Gbps
– Supports a 133 MHz clock
– Can be split into four 8-bit channels
SPI-4
– OC-192 system interface for physical and link layer devices
– Supports a 400 MHz clock, 10 Gbps
SPI-5
– OC-768 system interface for physical and link layer devices

CSIX: Common Switch Interface Consortium
Made up of vendor members
Defines network processor specifications
– CSIX L1 fabric interface
– Look-aside interface
– Stream interface

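CSIX L1 fabric interface
CFrame format
– Header
– Optional extension header
– Optional payload
– Optional padding
– Vertical parity trailer
32/64/96/128-bit parallel connection
Supports a maximum rate of 32 Gbps
Supports board-level connections
CFrame types
– Unicast frame
– Multicast frame
– Broadcast frame
– Flow control frame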

CFrame
Header
– 2 bytes in length
– Payload length (8), frame type (4), ready bits for link-level flow control
Extension header
– Type-specific, determined by the frame type, 0 to 4 bytes
– e.g. destination fabric port for unicast frames
Payload
– Maximum allowable length is 256 bytes
Padding bits
– Ensure that the CFrame has an appropriate length
Vertical parity field
– 16-bit field; use of the field is optional

IXP2800
XScale core
– 700 MHz
16 version 2 microengines
– 1.4 GHz microengine operation
– 20+ GOPS
Media / Switch Fabric Interface
– Configured as CSIX-L1 or SPI-4
– 10 Gbps full-duplex media interface
50 Gbps packet memory bandwidth
30 million packets per second L4 forwarding
60 million enqueue/dequeue operations per second
~14 W, 1357 BGA

IXP2800

IXP2800 Features (continued)
PCI interface
– 64-bit / 66 MHz interface for control
QDR interface (with parity)
– (4) 36-bit SRAM channels (QDR or co-processor)
– Network Processor Forum proposed co-processor standard interface
RDR interface (with ECC)
– (3) independent Direct Rambus DRAM interfaces
– Supports 4i banks or 16 interleaved banks
– Supports 16/32-byte bursts
– Tuned for PC800 or PC1066 RDR

IXP2850
Version of the IXP2800 with an onboard encryption processor
Symmetric-key ciphers
– Advanced Encryption Standard (AES)
– Triple Data Encryption Standard (3DES)
One-way hash function
– Secure Hash Algorithm (SHA-1)
Keyed message digest
– Hashed Message Authentication Code (HMAC)
– HMAC concatenates some private data into the message data before computing the one-way hash
A checksum accumulator
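The keyed-digest idea can be illustrated in software. The sketch below spells out the HMAC construction of RFC 2104 with SHA-1 using Python's standard hashlib; it shows what the function computes, not how the IXP2850 hardware implements it, and the function name is illustrative.

```python
import hashlib
import hmac


def hmac_sha1_manual(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA1 written out step by step (RFC 2104)."""
    block_size = 64  # SHA-1 processes 64-byte blocks
    if len(key) > block_size:
        key = hashlib.sha1(key).digest()    # long keys are hashed first
    key = key.ljust(block_size, b"\x00")    # then padded to the block size
    ipad = bytes(b ^ 0x36 for b in key)     # inner padded key
    opad = bytes(b ^ 0x5C for b in key)     # outer padded key
    inner = hashlib.sha1(ipad + message).digest()
    return hashlib.sha1(opad + inner).digest()


# The standard library produces the same digest.
reference = hmac.new(b"secret", b"packet payload", hashlib.sha1).digest()
print(hmac_sha1_manual(b"secret", b"packet payload") == reference)
```

The private data (the key) is mixed into both an inner and an outer hash, so a party without the key cannot forge a valid digest for a modified message.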

IXP23xx
Intel's first 90 nm network processor
– Microengines at 300, 600, or 900 MHz
– Intel XScale® core at 600, 900, or 1200 MHz
– Two DDR DRAM controllers
– QDR SRAM controller
– 128 kB internal SRAM
– 512 kB Layer 2 push cache
– Integrated I/O from T1/E1 through Gigabit Ethernet
– Integrated encryption engines

Intel® IXP2350 (block diagram)
Microengines: four MEv2 microengines
Control plane processing: Intel XScale® core with 512 KB L2 cache
Memory: two DDR DRAM controllers, QDR SRAM controller, message SRAM 128 KB, scratch 16 KB
Media interfaces: SPI3 or Utopia (with Rbuf and Tbuf), Gigabit Ethernet 0 and 1 (GMII, TBI), 10/100 Ethernet 0 and 1 (MII)
Network Processing Engines 0 and 1: (4) H/MVIP, 16 T1/E1/J1, 256-channel HDLC
Other blocks: PCI v2.2 bridge, hash unit (64/48/128), crypto, expansion bus controller

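IXP400 network processor family
XScale core
– Clock frequencies of 266 MHz, 400 MHz, 533 MHz, 667 MHz, etc.
Network processor engines (NPE)
– Offload typical L2 network functions
– For example Ethernet filtering, ATM SAR, HDLC
PCI interface
– 32-bit PCI v2.2
MII/RMII interface (802.3)
– Ethernet interface integrated on the NP chip
UTOPIA-2 interface
– 8-bit, 33 MHz clock
– Supports single or multiple physical interface configurations
USB interface
– Includes a USB 2.0 host controller and a USB v1.1 device controller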

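IXP400 network processor family (continued)
High-speed serial (HSS) interface
– Connects to T1/E1 or SLIC/CODEC devices
– SLIC (subscriber line interface circuit): the interface to traditional analog telephone lines
– Codec: for example A/D and D/A conversion, modulation/demodulation, compression/decompression
– 6-wire, supports 8.192 MHz, 8 HDLC channels
SDRAM interface
– Supports 32 MB to 1 GB of memory, with optional ECC
Expansion bus interface
– Up to 25 address bits; can connect a variety of other devices
– Can be used to attach Flash memory cards or other boot ROM
Encryption/authentication module
– DES, 3DES, AES with 128-bit and 256-bit keys, SHA, etc.
DSP support
– Supports TI DSPs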

IXP400 network processor family devices (feature table)
Features compared: UTOPIA, HSS, MII 0, MII 1, AES/DES, HDLC, SHA-1/MD-5
Devices: IXP425 (8), IXP422, IXP421, IXP420, IXC1100

IXP425 network processor family

IXP465 network processor family

Tolapai
A single die integrates
– IA CPU at 600, 1066 and 1200 MHz
– DDR2 memory controller (MCH)
– PCI Express*
– Standard IA PC peripherals (ICH)
– 3x Gigabit Ethernet MACs
– 3x TDM high-speed serial interfaces for 12 T1/E1 or SLIC/Codec connections
– Intel® QuickAssist Integrated Accelerator for high-performance security and IP telephony applications

Tolapai

Tolapai
– 148 million transistors
– 1,088-ball FCBGA with 1.092 mm pitch
– 37.5 mm x 37.5 mm package
– Intel's first integrated IA processor, chipset and memory controller since 1994's 80386EX

6.2.4 Link-layer devices
– 10-port Gigabit Ethernet MAC device: IXF1010
– SPI-3 framer device: IXF6048
– 4-port Gigabit Ethernet MAC device: IXF1104

4-port Gigabit Ethernet MAC device IXF1104

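Features of Intel network processors
One main core
– A general-purpose embedded processor
– Handles exception packets and network protocols
Multiple multithreaded microengines
Highly programmable
– Makes it easy to implement new packet processing functions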

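6.3 Composition of network processor application systems
– Hardware composition
– Software composition
– Example application systems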

IXP2400 Full-Duplex OC-48 System Implementation (block diagram)
Line side: IXF6048 framer, 1x OC-48 or 4x OC-12
Ingress: IXP2400 ingress processor, connected at OC-48 to the switch fabric gasket
Egress: IXP2400 egress processor, connected at OC-48 to the switch fabric gasket
Each IXP2400 has DDR SDRAM (packet memory), QDR SRAM (queues and tables) and a TCAM classification accelerator
Host CPU (IOP or IA)

Typical network edge architectures: DSLAM line card (block diagram)
Ports: 12-port ADSL PHYs, attached through an FPGA to the 128-port Utopia L2 interface
Network processor: Intel® IXP2350 with Intel® XScale core, 128 KB integrated SRAM, integrated Gigabit Ethernet MACs and integrated 10/100 MACs
Backplane side: dual Gigabit MAC, dual Gigabit PHY, Gigabit Ethernet backplane
Control plane processor: dual 10/100 MAC, dual 10/100 Ethernet PHY, 10/100 console
Memory: boot Flash, DDR DRAM, QDR SRAM

Typical network edge architectures: Node B transport card (block diagram)
Line side: octal T1/E1/J1 framer/LIUs, 16 T1/E1/J1, HDLC controller, IMA
Network processor: Intel® IXP2350 with Intel® XScale core, 128 KB on-chip SRAM, integrated encryption engine, 256-channel HDLC controller, integrated Gigabit Ethernet MACs and integrated 10/100 MACs
Backplane side: dual Gigabit MAC, dual Gigabit PHY, Gigabit Ethernet backplane
Control plane processor: dual 10/100 MAC, dual 10/100 Ethernet PHY, 10/100 console
Other components: encryption coprocessor, boot Flash, DDR DRAM, QDR SRAM

10 Gbps SONET line card (block diagram)
Line side: 10GbE WAN / PPP / ATM / OTN / SONET / SDH, OC-192c, CDR and DEMUX, SPI interface at 10 Gbps
Ingress processor (IXP2800): SAR'ing, classification, metering, policing, initial congestion management
Egress processor (IXP2800): traffic shaping with flexible choices (DiffServ, TM 4.1, ...)
Fabric side: Fabric Interface Chip (FIC), CSIX interface at 15 Gbps, fabric flow control
Each IXP2800 has RDR packet memory (three DRAM channels) and QDR SRAM for queues and tables (four QDR channels)
Control plane processor attached through PCI 64/66

Software composition: the network processor software architecture defined by the NPF

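Network processor software architecture defined by the NPF
Forwarding plane
– Processes network traffic at wire speed
– Makes packet processing decisions based on the traffic flows
– Forwarding, classification, filtering, etc.
Control plane
– Controls and configures the forwarding plane
– Runs the signaling and routing protocols
– Supplies the forwarding plane with routing information, forwarding information, port configuration and QoS configuration
– Divided into an application layer, a service layer and a functional layer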

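Control plane software
Application layer
– The highest level of abstraction
– Routing protocols, the Border Gateway Protocol, management of routing information
Service layer
– Implements system-related abstractions
Functional layer
– Implements abstractions related to the hardware units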

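Control plane software: two sets of APIs and the Control Plane Platform Development Kit (CP_PDK)
NPF application APIs
– Each targets a specific protocol type
– Run on top of the operating system
– IPv4 unicast forwarding API, MPLS API, DiffServ API
– Each API is implemented by one component
NPF management APIs
– System configuration and management
– Plug-in management for the data forwarding plane
– Namespace management
Control Plane Platform Development Kit (CP_PDK)
– Provides a standardized interface for the control plane software on the core
– Built on top of the core components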

Example application systems
1. Internet switching router (see the textbook)
2. Edge aggregation router
3. Multi-service platform

Appendix: other network processors
2.5 Gbps
– IBM PowerNP 4GS3
– Vitesse IQ2000 and IQ2200
– Motorola C-5 DCP
– Cisco Toaster 2
10 Gbps
– Bay Microsystems: BrecismsP5000
– Xstream Logic (later renamed Clearwater Networks): dynamic multithreading (DMS) processor core, intelligent packet management unit (PMU), MIPS-like architecture
– EZchip NP-1
– Lexra NetVortex

PowerNP high-level architecture
UnderstandingNPs

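PowerNP
Embedded processor complex (EPC)
– Computing resources
Ingress EDS
– Receives, transmits and schedules packets at the network interface
Egress EDS
– Receives, transmits and schedules packets at the switch fabric interface
Ingress SWI and Egress SWI
– Implement the internal packet loopback path
Ingress PPM and Egress PPM
– Connect to the physical-layer devices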

PowerNP components
Embedded processor complex (EPC)
– Contains 1 embedded PowerPC
– 8 protocol processor units
– Each protocol processor contains 2 CLPs, for 16 picoprocessors in total
– 32 packet processing threads
– Uses acceleration hardware for frame forwarding, filtering and CRC computation
– Accepts data for processing from both ingress and egress DFs
– 4 KB shared memory pool (1 KB per thread)
Data flow (DF)
– Data path for receiving and transmitting packets
Coprocessor
– Provides hardware-assist functions
– Table search, packet alteration, classification, pattern search

CLP Core language processor A 32 bit picoprocessor 1 cycle ALU ops
scaled-down RISC processor 1 cycle ALU ops 16 32bit or bit GPRs Supports 2 threads(32 threads in all) Run at 133MHz

PowerNP functional block diagram

Vitesse IQ2200
Four 200 MHz scalar RISC processor cores
– With co-processors for lookup, classification, packet order management, multicast support
– Optimized instructions for network operations
QoS engine for packet priorities and transfer
Vsc2202-pb-r10-vppd-00306

Motorola C-port C-5
16 channel processors (CP)
– Each is a RISC core with 2 Serial Data Processors (SDP)
– RISC core: classification, traffic scheduling
– SDP: talks to other CPs, field parsing, CRC validation/calculation, framing, header validation, extraction, insertion, deletion
5 co-processors
– Executive processor: coordination with external host processors
– Fabric processor: for using multiple C-5s in a fabric
– Table lookup unit: table lookup and update
– Queue management: manages packet queues
– Buffer management: memory management
1 general purpose processor

Cisco Toaster 2
Consists of 2 PXF chips
– Parallel express forwarding (PXF) architecture
– Contains 16 Express microcontrollers (XMC)
– Single-thread execution model
Used in the Cisco Edge Service Router (ESR)

Cisco Toaster 2
Express microcontroller (XMC)
– Based on a vanilla 2-way issue RISC VLIW
– Arranged in 4 pipelines
– Results in a 4x8 systolic array

Lexra NetVortex
Uses 16 MIPS R3000 32-bit RISC cores (LX8000)
– Support for single-cycle context switch among 8 contexts
– Adds special instructions to speed up packet processing: 1's complement add, insert and extract bit fields
LX480 used as the control plane processor

Clearwater CNP810
SMT core
– 10 functional units (FU)
– Uses simultaneous multi-threading
Peak throughput of 225 Gbps
Supports SPI-3, SPI-4
150 nm process, 300 MHz, 12 W

EZchip
Uses specialized processors for different tasks
Task optimized processors (TOPs)
– TOPparse
– TOPsearch
– TOPresolve
– TOPmodify
Manufactured by IBM

Questions
Can we make network processors
– Faster?
– Easier to use?
– More powerful?
– More general?
– Cheaper?
– All of the above?
