XOS Principles of Operation: Crossbeam's Management Model
Outline Overview Basic Configuration High Availability Flow Processing Applications System Monitoring & Troubleshooting Multiple Applications System Maintenance
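Overview 1. The X-Series is powered by the Crossbeam X-Series Operating System (XOS™) software. 2. At its core, XOS manages the X-Series Security Platform: mainly its individual modules (CPM, NPM, APM), the interactions among them, and how failover is handled when a fault occurs.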
Basic Configuration Privilege levels. Commands and users can be assigned to any level from 0 to 15. A user at level X can run commands at level X and below.
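The level rule above can be illustrated with a minimal sketch. This is not XOS code; the function name `can_run` and the range check are assumptions made for illustration only.

```python
# Hypothetical model of level-based command authorization (not XOS code).
MIN_LEVEL, MAX_LEVEL = 0, 15


def can_run(user_level: int, command_level: int) -> bool:
    """Return True if a user at user_level may run a command at command_level."""
    for level in (user_level, command_level):
        if not MIN_LEVEL <= level <= MAX_LEVEL:
            raise ValueError(f"level {level} is outside {MIN_LEVEL}..{MAX_LEVEL}")
    # A user at level X can run commands at level X and below.
    return command_level <= user_level
```

For example, under this model a level-10 user can run a level-5 command, while a level-4 user cannot.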
Basic Configuration Create VAP Group
Basic Configuration Flow rules. Circuits (external circuit, internal circuit). External circuit: linked to an NPM physical port through a logical interface. Internal circuit: used for communication between VAPs and VAP groups (serialization). Related topics: physical and logical interfaces; circuits and virtual network devices; circuits and IP addressing; logical interfaces and circuits; VLAN, DNS, gateway, static routes.
High Availability-Single Box Hardware Element Redundancy
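High Availability-Single Box a. Module Link State Status. Each APM and NPM keeps track of connectivity from remote modules to itself, while the CPM maintains the state of all connections in the system. Tracking is based on heartbeat messages. Heartbeat relies on redundant communication channels and message retransmission to keep communication reliable. Peers send each other messages reporting their current state; if no message arrives from a peer within the specified time, that peer is considered failed, and the resource takeover module is started to take over the resources or services that were running on the peer host.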
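The timeout rule for heartbeats can be sketched as follows. The class `HeartbeatMonitor`, its method names, and the module names in the usage note are hypothetical; the sketch only shows the "no message within the specified time means failed" logic and leaves out retransmission and resource takeover.

```python
class HeartbeatMonitor:
    """Hypothetical tracker: a peer is failed if it has been silent too long."""

    def __init__(self, timeout: float):
        self.timeout = timeout          # seconds of silence tolerated
        self.last_seen = {}             # peer name -> time of last heartbeat

    def record(self, peer: str, now: float) -> None:
        """Note that a heartbeat from peer arrived at time now."""
        self.last_seen[peer] = now

    def failed_peers(self, now: float) -> list:
        """Return peers whose last heartbeat is older than the timeout."""
        return sorted(
            peer for peer, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

With a 3-second timeout, a peer last heard at t=0 is reported as failed at t=4, while a peer heard at t=2 is not.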
High Availability-Single Box b. Interface Redundancy (NPM). A failed interface is remapped onto its backup interface, transparently to the application. Modes: Active/Active or Active/Standby. Preemption: On or Off (default). MAC-usage options: master or active.
High Availability-Single Box c. Service Availability (APM). Flows are load-balanced among all VAPs in a VAP group. Active flow state: needed to remember the VAP assignment; created on the first packet of the flow and synchronized between NPs. Options: Backup Mode and Standby VAP.
Backup Mode. Backup Mode = Group / Pair: on failure of a VAP, each flow is redirected to a predetermined alternative VAP, and established flow states are maintained. Backup Mode = None: on failure of a VAP, flows are reclassified by the NPM and load-balanced across the remaining VAPs in the VAP group, and established flow states on the failed AP are deleted. None converges more slowly than Group but gives perfect load balancing after the failure. The default setting is None.
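The difference between the two backup modes can be sketched with a toy flow table. Everything here is an illustrative assumption: the function `handle_vap_failure`, the dictionary-based flow table, and the least-loaded choice used to stand in for NPM reclassification are not taken from XOS.

```python
from collections import Counter


def handle_vap_failure(flow_table, failed, remaining, backup_mode, backup_of=None):
    """Return (new_flow_table, flows_whose_state_was_dropped).

    flow_table maps flow id -> VAP. backup_of maps VAP -> predetermined backup
    VAP and is only used in 'group' or 'pair' mode.
    """
    new_table = {}
    dropped = []
    for flow, vap in flow_table.items():
        if vap != failed:
            new_table[flow] = vap                 # unaffected flow
        elif backup_mode in ("group", "pair"):
            new_table[flow] = backup_of[failed]   # redirected, state kept
        elif backup_mode == "none":
            dropped.append(flow)                  # state deleted
        else:
            raise ValueError(f"unknown backup mode: {backup_mode}")
    # In 'none' mode each dropped flow is reclassified as if it were new;
    # here that is modelled by sending it to the least-loaded remaining VAP.
    for flow in dropped:
        load = Counter(new_table.values())
        new_table[flow] = min(remaining, key=lambda v: load[v])
    return new_table, dropped
```

In the group case every flow from the failed VAP lands on one predetermined VAP, which is fast but can leave the load uneven; in the none case the flows are spread out again at the cost of losing their established state.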
Standby VAP A Standby will be dynamically allocated to an active VAP Group in case of an APM failure. If a VAP fails, a standby VAP is preempted and rebooted to run the VAP image of the failed VAP. Preemption Mode: a preemption-priority is used to determine if APMs can be taken from another group.
High Availability-Single Box d. CP Redundancy (CPM). The CPM provides software configuration to the VAPs. CP redundancy allows continuous operation of the box in the presence of a CP failure.
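High Availability-Dual Box (1) In active/standby (AS) mode, the primary and standby units synchronize rules, objects, routes, sessions, and related information. When the primary fails, network services switch over smoothly to the standby, so users stay connected. (2) In active/active (AA) mode, both units work at the same time, and information, including sessions, is synchronized in real time. An optimization algorithm distributes traffic evenly across the two units. When one unit fails, the other takes over automatically and continues to provide network services. Once the failed unit is repaired, the working unit synchronizes sessions and other information back to it; after synchronization completes, the pair resumes normal dual-unit operation.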
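High Availability-Dual Box (active/passive) Dual-Box High Availability a. Multi-Box Redundancy. VRRP is used to allow two or more chassis to share an IP address. The VRRP Master is elected based on priority and owns the shared IP address; the second box becomes the VRRP Backup and does not forward traffic. If an interface on the master chassis fails, the second box's interface takes over. VRRP is the Virtual Router Redundancy Protocol. It rests on two pairs of concepts: VRRP routers and virtual routers, and master routers and backup routers. A VRRP router is a router running VRRP, a physical entity; a virtual router is created by the VRRP protocol and is a logical concept. A group of VRRP routers works together to form one virtual router, which appears to the outside as a single logical router with a unique, fixed IP address and MAC address. Routers in the same VRRP group take one of two mutually exclusive roles, master or backup: a group has exactly one master and one or more backups. VRRP uses an election policy to pick one router in the group as master, which answers ARP requests and forwards IP packets, while the other routers stand by as backups. When the master fails for any reason, a backup is promoted to master after a delay of a few seconds. Because the switchover is fast and requires no change of IP or MAC address, it is transparent to end-user systems.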
High Availability-Dual Box (active/passive) Priority deltas: support for multiple failures across the chassis participating in VRRP. Priority deltas assigned to various system components may affect the failover group priority. The system with the highest priority value becomes master for the failover group.
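One way to picture priority deltas is the sketch below. It assumes that the delta of each failed component is subtracted from the chassis's base priority; the slide only says deltas "may affect" the priority, so the subtraction, the function names, and the component names are all illustrative assumptions.

```python
def effective_priority(base: int, deltas: dict, failed: set) -> int:
    """Base priority minus the delta of every failed component (assumed model)."""
    return base - sum(deltas[component] for component in failed)


def elect_master(priorities: dict) -> str:
    """The chassis with the highest priority value becomes master."""
    return max(priorities, key=lambda name: priorities[name])
```

For instance, a chassis with base priority 100 that loses a component carrying a delta of 30 drops to 70, so a healthy peer at priority 90 would become master.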
High Availability-Dual Box (active/passive) Next Hop Health Checking (NHHC). If the probes fail, the route cost is increased so that an alternative route with a lower cost may be chosen.
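The effect of a failed probe on route selection can be sketched as follows. The `Route` type, the fixed `penalty` value, and the function names are assumptions for illustration; the slide does not say how much the cost is increased.

```python
from dataclasses import dataclass


@dataclass
class Route:
    next_hop: str
    cost: int


def apply_health(routes, healthy, penalty=100):
    """Return routes with the cost raised for next hops whose probes failed."""
    return [
        Route(r.next_hop, r.cost if healthy[r.next_hop] else r.cost + penalty)
        for r in routes
    ]


def best_route(routes):
    """Pick the lowest-cost route."""
    return min(routes, key=lambda r: r.cost)
```

While both next hops answer probes the cheaper route wins; once the cheaper next hop fails its probes, its penalised cost exceeds the alternative and traffic moves to the other route.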
High Availability-Dual Box (active/passive) b. Redundant HA Links. The HA links must be in an isolated VLAN.
High Availability-Dual Box (active/active) An active/active configuration enables both controller nodes to process I/O, with each node providing standby capability for the other.
High Availability-Dual Box (active/active) Configuration synchronization occurs when one or both units in a failover pair boot. The configurations are synchronized as shown: When a unit boots while the peer unit is active (with both failover groups active on it), the booting unit contacts the active unit to obtain the running configuration regardless of the primary or secondary designation of the booting unit. When both units boot simultaneously, the secondary unit obtains the running configuration from the primary unit.
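High Availability-Dual Box When a system fault triggers a failover, the failover time is determined by two factors: the time from the fault occurring to the failover being triggered, and the application recovery time. Network monitoring between the servers runs every 2 seconds. If the monitor receives no response from the other server, it retries 6 more times, so the whole process takes 14 seconds before failover is triggered. Both the monitoring interval and the retry count can be tuned in the cluster manager to suit actual needs. Application recovery time varies widely; it includes actions such as file system checks and database log recovery. The cluster manager supports Linux journaling file systems such as ext3, which can greatly reduce file system check time. When a system administrator uses the cluster manager to redeploy an application across servers, the service is stopped and restarted quickly by the stop/start scripts. This avoids all monitoring delays and the application recovery process, so it is very fast.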
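The 14-second figure follows from one initial check plus six retries at a 2-second interval. A minimal sketch of that arithmetic, with the function name `detection_time` chosen here for illustration:

```python
def detection_time(interval_s: float, retries: int) -> float:
    """Time until failover is triggered: the first check plus all retries."""
    return interval_s * (1 + retries)


print(detection_time(2, 6))  # 14
```

Lowering either the interval or the retry count shortens detection, at the price of a higher chance of a false failover on a briefly congested link.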
Flow Processing a. Flow Scheduling/Load Balancing. To make the best load-balancing decision, the NPM needs feedback from the VAP group members: (1) each VAP runs a load-balancing agent, which communicates with a load calculator on the CPM; (2) the CPM sends a scheduling vector back to the NPM every second. All of this data is sent via the Control Plane.
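The feedback loop can be pictured with the sketch below. The slide does not describe how the scheduling vector is computed, so the headroom-proportional formula, the function names, and the weighted random choice are assumptions, not the XOS algorithm.

```python
import random


def scheduling_vector(loads: dict) -> dict:
    """Turn per-VAP utilisation (0.0..1.0) into shares that sum to 1.

    Assumed model: a VAP's share is proportional to its spare capacity.
    """
    headroom = {vap: max(0.0, 1.0 - load) for vap, load in loads.items()}
    total = sum(headroom.values())
    if total == 0:
        # Every VAP is saturated: fall back to an even split.
        return {vap: 1 / len(loads) for vap in loads}
    return {vap: h / total for vap, h in headroom.items()}


def pick_vap(vector: dict, rng: random.Random) -> str:
    """Choose the VAP for a new flow according to the scheduling vector."""
    vaps = list(vector)
    return rng.choices(vaps, weights=[vector[v] for v in vaps], k=1)[0]
```

In this model a VAP at 50% load receives twice the share of new flows of a VAP at 75% load, and the shares are refreshed each time a new vector arrives.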
Flow Processing Flow rules A flow rule is associated with a specific VAP Group. Multiple flow rules may be created for each VAP Group. Flow rules priority promiscuous mode (IDS)
Flow Processing Flow Processing with NAT. Thanks to the normalization process, the return flow hits the same VAP as the inbound flow. However, if a VAP modifies the IP header information (i.e. NAT), the return traffic received by the NPM has no AFT entry matching its IP addresses. NAT stands for Network Address Translation.
Flow Processing reclassify-nat. The reclassify-nat command, applied on the circuit where the NATed flow is generated, forces the NPM to treat all flows as if they originate from the VAP: 1. This includes flows modified by the application (i.e. NAT). 2. NATed outbound flows are classified on egress. 3. An AFT entry is then created, forcing the return traffic to match the flow. AFT: Active Flow Table.
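The NAT problem and the effect of classifying on egress can be shown with a toy flow table. The `ActiveFlowTable` class, the address-pair key, and the example addresses are illustrative assumptions; a real AFT key would include ports and protocol, and the normalization shown is only one plausible reading of the term.

```python
def normalize(src: str, dst: str) -> tuple:
    """Direction-independent key: both directions of a flow give the same key."""
    return tuple(sorted((src, dst)))


class ActiveFlowTable:
    """Toy AFT mapping a normalized address pair to the VAP handling the flow."""

    def __init__(self):
        self.entries = {}

    def classify(self, src: str, dst: str, vap: str) -> None:
        """Create an entry for the flow between src and dst."""
        self.entries[normalize(src, dst)] = vap

    def lookup(self, src: str, dst: str):
        """Return the VAP for this packet, or None if no entry matches."""
        return self.entries.get(normalize(src, dst))


aft = ActiveFlowTable()
aft.classify("10.0.0.5", "198.51.100.7", "vap1")   # inbound flow classified
# The VAP rewrites the source to 203.0.113.1, so the reply misses the table:
print(aft.lookup("198.51.100.7", "203.0.113.1"))   # None
# Classifying the NATed flow on egress adds the missing entry:
aft.classify("203.0.113.1", "198.51.100.7", "vap1")
print(aft.lookup("198.51.100.7", "203.0.113.1"))   # vap1
```

Without NAT the reply already matches the original entry, because the key is the same in both directions; the extra entry is only needed once the VAP has changed an address.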
Applications As the sub-system disk is on the CPM, the CPM is responsible for installing applications onto the VAP. Each APM in an X-Series chassis can be equipped with a local disk, where temporary data can be stored. For some applications (anti-virus, proxy, etc.) you might need to enable swap space to improve application performance. Swap files are stored on the local disk. A swap file (or swap space or, in Windows NT, a pagefile) is a space on a hard disk used as the virtual memory extension of a computer's real memory (RAM). Having a swap file allows the operating system to behave as if it had more RAM than is physically installed.
System Monitoring & Troubleshooting Swatch A Linux tool to monitor XOS system state and network I/O
System Monitoring & Troubleshooting Troubleshooting using TCPdump. TCPdump is a packet analyzer, similar to snoop on a Sun Solaris box. Running TCPdump without any options will dump all packets on all interfaces. This is usually not a good idea, especially on a production system.
Multiple Applications Serialization: traffic flows through one application and then on to another (proxy applications). Parallelization: traffic flows to two or more applications simultaneously (IDS applications). Traffic Splitting: the traffic stream is split between multiple applications based on flow rule specifications (transparent web proxy).
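The three patterns differ only in how a packet is handed to the applications, which the following sketch tries to make concrete. Applications are modelled as plain functions, and all names here are hypothetical; this is not how XOS wires VAP groups together.

```python
def serialize(packet, apps):
    """Serialization: the output of each application feeds the next one."""
    for app in apps:
        packet = app(packet)
    return packet


def parallelize(packet, apps):
    """Parallelization: every application receives the same packet."""
    return [app(packet) for app in apps]


def split(packet, rules, default):
    """Traffic splitting: the first matching rule decides which app runs."""
    for matches, app in rules:
        if matches(packet):
            return app(packet)
    return default(packet)
```

A rule such as "destination port 80 goes to the web proxy, everything else to the firewall" is an example of splitting, whereas sending a copy of every packet to an IDS alongside the firewall is parallelization.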
System Maintenance Recovery XOS Upgrade Rollback Firmware Upgrade
Comments 1. What other management functions do we need? 2. What other situations should we pay attention to?