Leftmost Longest Regular Expression Matching in Reconfigurable Logic

Slides:



Advertisements
Similar presentations
Exercise 1 EECS, Peking University Exercise in Query Processing.
Advertisements

國立成功大學工程科學系 Department of Engineering Science -National Cheng Kung University 控制與訊號處理實驗室 Control & Signal Processing Lab MATLAB/Simulink 教學.
MMN Lab 未來教室與雲端化學習 Yueh-Min Huang Department of Engineering Science, National Cheng Kung University, Tainan, Taiwan
新东方多媒体数据库与考研英语
曲延棣 副教授兼系主任 義守大學 醫務管理學系
第4章 VHDL设计初步.
-Artificial Neural Network- Hopfield Neural Network(HNN) 朝陽科技大學 資訊管理系 李麗華 教授.
实践 课题 周围环境对当代大学生成长的影响 指导老师:王永章 小组成员:陈荣、刘若楠、张红艳、吕雪丹、樊金芳、李惠芬、黄婧
An Ultra-Wearable, Wireless, Low Power ECG Monitoring System
XI. Hilbert Huang Transform (HHT)
A TIME-FREQUENCY ADAPTIVE SIGNAL MODEL-BASED APPROACH FOR PARAMETRIC ECG COMPRESSION 14th European Signal Processing Conference (EUSIPCO 2006), Florence,
Rate and Distortion Optimization for Reversible Data Hiding Using Multiple Histogram Shifting Source: IEEE Transactions On Cybernetics, Vol. 47, No. 2,February.
YARN & MapReduce 2.0 Boyu Diao
! 温故知新 上下文无关文法 最左推导 最右推导 自上而下 自下而上 句柄 归约 移进-归约冲突 移进-归约分析 递归下降预测分析
親愛的老師您好 感謝您選用本書作為授課教材,博碩文化準備本書精選簡報檔,特別摘錄重點提供給您授課專用。 說明: 博碩文化:
Department of Computer Science & Information Engineering
異質計算教學課程內容 「異質計算」種子教師研習營 洪士灝 國立台灣大學資訊工程學系
微積分網路教學課程 應用統計學系 周 章.
Source: IEEE Access, vol. 5, pp , October 2017
GPU分散式演算法設計與單機系統模擬(第二季)
5 Computer Organization (計算機組織).
§5.6 Hole-Burning and The Lamb Dip in Doppler- Broadened Gas Laser
Logistics 物流 昭安國際物流園區 總經理 曾玉勤.
Flash数据管理 Zhou da
微程序控制器 刘鹏 Dept. ISEE Zhejiang University
計算機結構 – 概論 陳鍾誠 於金門大學.
文字探勘與知識工程 Text Mining & Knowledge Engineering
An Introduction to Computer Science (計算機概論)
Introduction to Multimedia Coding
Dynamic Traffic Diversion in SDN: Testbed vs Mininet
2019/1/2 Experimental Analysis on Performance Anomaly for Download Data Transfer at IEEE n Wireless LAN 在IEEE n無線LAN上下載數據傳輸的性能異常的實驗分析 Author:
Reinventing Your Business Model Christensen, C. M. et al
研究經驗與趨勢分享 黃悅民 Department of Engineering Science,
A high payload data hiding scheme based on modified AMBTC technique
Chapter 5 Recursion.
高性能计算与天文技术联合实验室 智能与计算学部 天津大学
2019/4/8 A Load Balancing Mechanism for multiple SDN Controllers based on Load Informing Strategy Miultiple controller 的 load balancing 機制,使用一個叫 Load informing.
Guide to a successful PowerPoint design – simple is best
每周三交作业,作业成绩占总成绩的15%; 平时不定期的进行小测验,占总成绩的 15%;
3.5 Region Filling Region Filling is a process of “coloring in” a definite image area or region. 2019/4/19.
中国科学技术大学计算机系 陈香兰 2013Fall 第七讲 存储器管理 中国科学技术大学计算机系 陈香兰 2013Fall.
虚 拟 仪 器 virtual instrument
中央社新聞— <LTTC:台灣學生英語聽說提升 讀寫相對下降>
Learn Question Focus and Dependency Relations from Web Search Results for Question Classification 各位老師大家好,這是我今天要報告的論文題目,…… 那在題目上的括號是因為,前陣子我們有投airs的paper,那有reviewer對model的名稱產生意見.
從 ER 到 Logical Schema ──兼談Schema Integration
通信工程专业英语 Lesson 13 Phase-Locked Loops 第13课 锁相环
Google Local Search API Research and Implementation
A Data Mining Algorithm for Generalized Web Prefetching
Course 10 削減與搜尋 Prune and Search
Or 蚂蚁的故事 一个寓言… 或者 May be not.... 不是个寓言而是个真的故事.....
The Ant A Fable... Or 蚂蚁的故事 May be not.... 一个寓言… 或者 不是个寓言而是个真的故事....
Deep Learning with Limited Numerical Precision
An Efficient MSB Prediction-based Method for High-capacity Reversible Data Hiding in Encrypted Images 基于有效MSB预测的加密图像大容量可逆数据隐藏方法。 本文目的: 做到既有较高的藏量(1bpp),
BiCuts: A fast packet classification algorithm using bit-level cutting
Introduction to Probability Theory ‧1‧
Nucleon EM form factors in a quark-gluon core model
Efficient Query Relaxation for Complex Relationship Search on Graph Data 李舒馨
Introduction to Service Science 课程概述
Mobile IPv4.
5. Combinational Logic Analysis
Arguments to the main Function and Final Project
Introduction to Computer Security and Cryptography
MATLAB 結構化財務程式之撰寫 MATLAB財務程式實作應用研習 主題五 資管所 陳竑廷
Advanced Competitive Programming
Gaussian Process Ruohua Shi Meeting
Advanced Competitive Programming
2019/12/1 An Improved CPK Identity Authentication Scheme Based on Cloud Environment Author: Yanyan Song, Jun Qin Publisher: 2017 Asia-Pacific Engineering.
Hybrid fractal zerotree wavelet image coding
Section 1 Basic concepts of web page
POWER-EFFICIENT RANGE-MATCH-BASED PACKET CLASSIFICATION ON FPGA
Presentation transcript:

Leftmost Longest Regular Expression Matching in Reconfigurable Logic 2018/5/8 Leftmost Longest Regular Expression Matching in Reconfigurable Logic Author: Kubilay Atasu (IBM Research – Zurich) Presenter: Kuan-Chieh Feng Date: 2016/10/11 Department of Computer Science and Information Engineering National Cheng Kung University CSIE CIAL Lab 1

Outline Introduction Related Work Baseline Architecture Optimized Architecture Experiment National Cheng Kung University CSIE Computer & Internet Architecture Lab

2018/5/8 Introduction The so-called big data has become a new natural resource, and discovering insights in big data will be the key capability of future computing platforms. The process of extracting information from large-scale unstructured text is called text analytics. Text analytics functions rely heavily on regexs and dictionaries for locating named entities. 按能源消耗的比例增加來增加processor的core數量相當不合成本。 所以要同時改善效能與能源消耗,就要仰賴其他的計算資源,像是GPU或是FPGA。 National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

2018/5/8 Introduction 圖1顯示使用FPGA-based在商業分析平台上從不同的data resource持續收集新的entries 並且使用local news search engine把他們index。 當大量的使用者做query,第二階段就會變成bottleneck。 要減少bottleneck就需要增加processor的core數,且會導致使用更多的空間以及能量消耗 National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

Related Work When several regex matches that end at the same offset position exist, typically only the span with the smallest start-offset value will be reported. This technique is called leftmost regex matching heuristic. National Cheng Kung University CSIE Computer & Internet Architecture Lab

2018/5/8 Related Work Definition 1: Each regex match is associated with a span (s, e), where s is the start-offset position and e the end-offset position of the regex match. Definition 2: Span (s0, e0) contains span (s1, e1) if (s1 > s0) and (e1 < e0).  1  每個Regex match對應到一組span 2 每一組span包含s與e代表位置 National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

2018/5/8 Related Work Definition 3: The leftmost regex match at offset position i is the regex match with the smallest start-offset position value that ends at offset position i.  Definition 4: A leftmost longest regex match is a regex match that is not contained in any other regex match.  3 在offset位置i的leftmost regex match則是結束在位置i的regex match中其start-offset值最小者。 4  leftmost longest regex match代表部會包含在所有其他的regex match National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

2018/5/8 Related Work 假設regex是 a|aa|aaaa能夠match input stream aaaa 圖顯示可以match有八種情況。每個情況根據其start offset與end offset有不同的span(如圖) 其中只有四個是屬於leftmost matches,包括(0,0) (0,1) (1,2) (0,3) 當leftmost match在offset位置為1時,則是span (0,1) offset 2 : span (1,2) offset 3 : span (0,3) 其中start offset最小的為span(0,3)則(0,3)為leftmost longest match National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

Baseline Architecture 2018/5/8 Baseline Architecture Leftmost longest regex matches combines : Regex matcher that supports start-offset reporting Sorting unit (increasing order of the start offsets, where the spans having the same start offset are sorted in decreasing order of their end offsets.) Containment filter sort過的span會被餵到containment filter(過濾所以包含在其他的span 且會產生指出leftmost longest regex match的過濾的span stream。 National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

Baseline Architecture 2018/5/8 Baseline Architecture 經過start offset遞增排序,且end-offset遞減排序後所得到的第一個span, 在抵達containment filter後會被儲存在local register且還沒產生其他output。 接著後面進來的span,其end offset必定小或等於原來儲存在register中的value。 總而言之,sorting的過程會保證後面進來的span必定會contain在原來的span當中。 之後span會儲存在local memory然後output,則後面的span在儲存在local register中。 National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

Baseline Architecture 2018/5/8 Baseline Architecture In general, the proposed architecture can produce multiple and possibly overlapping spans as output. Example : Assume span (0, 1), (2, 3), (2, 4), and (1, 5) . After sorting (0, 1), (1, 5), (2, 4), and (2, 3). The containment filter stores the first span (0, 1) in its registers. When the second span (1, 5) arrives, (0, 1) is written to output and (1, 5) is written to the registers. National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

Baseline Architecture 2018/5/8 Baseline Architecture 當整個網路的資料都儲存在controller,很容易就會洩漏 National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

Optimized Architecture 2018/5/8 Optimized Architecture We exploit two key properties to eliminate the redundancies that exist in the baseline architecture. Observe that to locate the leftmost longest matches, it is not necessary to generate all possible regex matches. The existing architectures that support leftmost regex matching [13], [14] produce partially sorted results. They report only the regex match with the smallest start offset. National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

Optimized Architecture 2018/5/8 Optimized Architecture 首先,在regex match我們只找出start-offset最小的span 接著,在inversion unit我們透過原來的順序來反轉,此時end-offset最大且start-offset最小的span會移到最前面 最後containment filer就可以直接取出LLM。 National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

Optimized Architecture 2018/5/8 Optimized Architecture Latency-Hiding Inversion Unit The inversion unit can be implemented using simple LIFO buffers. It will not produce any output until the leftmost regex matcher produces an eos (end-of-stream) signal. The latency of this can significantly reduce the throughput rate of the optimized architecture. National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

Optimized Architecture 2018/5/8 Optimized Architecture 透過兩種模式,我們可以同時針對兩條stream做讀寫的動作 start pointer與end pointer開始是相同的。而end 位置從LIFO buffer的end pointer讀取,而read pointer會在new span傳送到containment filter時遞減(遞增)。 當eos signal抵達時,inversion unit會執行: 1. 複製write pointer到read pointer , start pointer到end pointer 2.從forward切換到backward mode (或是相反) 3.初始化start pointer及write pointer LIFO buffer是否可能沒有空間? 發生的情況很少,因為read buffer會很快速的flush 而write buffer反而沒那麼頻繁的寫入buffer National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

Optimized Architecture Computing non-overlapping matches Definition 5: Two spans (s0, e0) and (s1, e1) overlap if (s1 > s0) and (e1 > e0) and (s1 <= e0) or if (s0 > s1) and (e0 > e1) and (s0 <= e1). Definition 6: The non-overlapping leftmost regex match at offset position i is the regex match with the smallest start offset that ends at position i and that does not overlap with any leftmost regex match that ends at a position smaller than i. National Cheng Kung University CSIE Computer & Internet Architecture Lab

Optimized Architecture 2018/5/8 Optimized Architecture Definition 7: A non-overlapping leftmost longest regex match is a non-overlapping leftmost regex match that is not contained in any other non-overlapping leftmost regex match. National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

Optimized Architecture 2018/5/8 Optimized Architecture 在位置0時,只有(0,0) 所以它是該位置的non-overlapping match 位置1時,有(0,1)與(1,1)兩者,(0,1)並沒有與(0,0) overlap,且較(1,1)符合 位置2時有(1,2)與(2,2),但是(1,2)與(0,1)是overlap因此選擇(2,2) National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

Experiment National Cheng Kung University CSIE Computer & Internet Architecture Lab