Heterogeneous Computing Course Content
"Heterogeneous Computing" Seed Teacher Workshop
Shih-Hao Hung (洪士灝)
Department of Computer Science and Information Engineering, National Taiwan University
2016-01-29
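Existing Teaching Resources
- HSA Tutorial Day
- Heterogeneous System Architecture and Applications (course)
- 2015 International Course on Heterogeneous Computing Systems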
HSA Tutorial Day
- Seven tutorial sessions over a full day
- Slides and videos are available for download
- http://teaualune.github.io/hsa-tutorial-day/
Heterogeneous System Architecture and Applications
- An 18-week course in the NTU CSIE department
- https://ceiba.ntu.edu.tw/1032CSIE5314_HSAA
- Course content
  - HSA Tutorial Day (full-day event)
  - Introduction to Parallel Architectures
  - Introduction to Parallel Programming
  - GPU and GPGPU Programming
  - OpenCL Programming by Example
  - Virtual Platforms and Application Development
  - Emulating HSA with HSAemu
  - Reconfigurable Computing and Applications
  - Optimizing Parallel Applications
  - Invited speakers from industry
  - Labs and final projects
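2015 International Course on Heterogeneous Computing Systems
- Vijay Janapa Reddi (University of Texas at Austin) and Paul Blinzer (AMD) were invited to give six lectures
- http://www.cs.ccu.edu.tw/~cinfon/WorkshopSoCDesign/index.html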
A New Book about HSA Edited by Wen-Mei Hwu (Dec. 2015)
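Should We Teach HSA?
- In hindsight, teaching this course last year was somewhat early. This year is a better fit: the material is more complete, and there is now a book that can serve as a textbook.
- If you plan to teach HSA this year, my suggestion is to refer to GPUOpen.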
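Teaching Methods
The teaching method should follow from the teaching goals:
- The learners' level
- The problems they want to solve
Beginners:
- Simple data-parallel problems
- Basic GPU architecture
- Higher-level languages (CUDA, C++ AMP)
Advanced researchers:
- More complex irregular problems
- Advanced GPU architecture (HSA)
- Lower-level languages (OpenCL, HSA, Verilog)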
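As a concrete picture of the "simple data-parallel problem" a beginner might start with, here is a minimal sketch in plain Python rather than CUDA or C++ AMP. The names `saxpy_kernel` and `launch` are hypothetical and chosen only for illustration. The point is that each output element depends only on the inputs at the same index, which is what would let a GPU assign one thread per index.

```python
def saxpy_kernel(i, a, x, y, out):
    """Work done by one hypothetical 'thread': it touches index i only."""
    out[i] = a * x[i] + y[i]


def launch(kernel, n, *args):
    """Stand-in for a GPU kernel launch: invoke the kernel once per index.

    This loop is sequential; on a GPU the invocations would run in
    parallel, so the kernel must not depend on the order of execution.
    """
    for i in range(n):
        kernel(i, *args)


def saxpy(a, x, y):
    """Compute a * x + y element by element."""
    out = [0.0] * len(x)
    launch(saxpy_kernel, len(x), a, x, y, out)
    return out
```

Because no invocation reads what another one writes, the loop in `launch` could be reordered or parallelized without changing the result. Irregular problems usually lack this property or have very uneven work per index.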
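Final Projects and Research Topics
Heterogeneous computing is an advanced course, and the field evolves very quickly.
The emphasis is on:
- Analyzing the characteristics of the application
- Considering the characteristics and strengths of the various heterogeneous architectures
- Gathering the latest material online and learning independently
Practical focus:
- Use of performance analysis tools
- Application tracing and performance analysis
- Use of optimization techniques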
Example Works Done by Master Students
- Accelerating OpenCL-based Monte Carlo Medical Applications with GPU and FPGA (Bo-Yi Huang)
- Accelerating Data Deduplication with Heterogeneous System Architecture (Yen-Po Wang)
- Virtual Hadoop: MapReduce over Docker Containers with an Auto-Scaling Mechanism for Heterogeneous Environments (Yi-Wei Chen)
- Accelerating SQL Database Applications with Heterogeneous System Architecture (Kuan-Ju Lin)
- Android Malware Detection with Deep Learning (Wen-Ting Yeh)
Case Study - MCML
- MCML (Monte Carlo modeling of light transport in multi-layered tissues)
  - Provided by University of Texas M.D. Anderson Cancer Center
  - Based on visible light
- Multiple runs and layers
  - A run includes different layers
  - Each layer/tissue can be configured separately
- To accelerate this time-consuming application, MCML was converted to OpenCL to run on GPU/FPGA
- Example: five-layer skin model using an infinitely narrow beam at 633 nm
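To give a feel for why this kind of workload maps well to accelerators, the sketch below shows one core step of Monte Carlo photon transport: sampling a photon's free path length from an exponential distribution. It is not the MCML code or its OpenCL port. The function names, the single homogeneous layer, and the coefficient values in the usage note are assumptions made for illustration.

```python
import math
import random


def sample_step(mu_t, rng):
    """Sample a free path length (cm) for interaction coefficient mu_t (1/cm).

    Uses inverse-transform sampling: s = -ln(xi) / mu_t with xi in (0, 1].
    """
    # 1 - random() lies in (0, 1], so log() is never called on zero.
    return -math.log(1.0 - rng.random()) / mu_t


def unscattered_fraction(mu_t, depth, n_photons, seed=0):
    """Estimate the fraction of photons whose first step exceeds `depth`.

    Each photon is independent of the others, which is why the real
    workload parallelizes naturally across GPU threads or FPGA pipelines.
    """
    rng = random.Random(seed)
    passed = sum(1 for _ in range(n_photons) if sample_step(mu_t, rng) > depth)
    return passed / n_photons
```

For example, `unscattered_fraction(10.0, 0.1, 100000)` should come out close to the Beer-Lambert value `math.exp(-1.0)`, which gives a quick sanity check on the sampling.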
GPU Acceleration for MCML
- Monte Carlo modeling of light transport in multi-layered tissues
- Platforms: 2.67 GHz Intel i7 (4 cores); NVIDIA GTS 450 (192 cores)
- Speedups shown: 5.4x, 52x
FPGA Acceleration for MCML
- Platform: Altera Stratix V
- Speedup shown: 21.3x
FPGA vs GPU for MCML
- In this case study, the FPGA offers better power-performance for the OpenCL code, without any coding in Verilog
- It will be interesting to see how GPUs and FPGAs compete with or complement each other in the future
- To appear at the FPGA 2016 conference: "A Platform-Oblivious Approach for Heterogeneous Computing: a Case Study with Monte Carlo-based Radiation Simulation"
Ongoing Works
- Performance methodologies and tools
  - Accelerating system characterization, the optimization process, and design space exploration
- Open source system software
  - Accelerating big data appliances, machine learning algorithms, and medical applications
November 12, 2018
Extra Thoughts
- For regular applications, use of GPU will be easier and more common
  - Application development with open, easy-to-use APIs
  - Use of optimized libraries/engines
- For irregular applications, the entry barrier is still high
  - Find ways to convert irregular applications to regular ones
  - New and specialized system architectures
  - Integrated CPU/GPU with shared memory
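One illustrative way to "convert irregular to regular" (not necessarily what the speaker had in mind) is padding: a sparse matrix whose rows have different lengths can be stored in a padded, ELLPACK-style layout so that every row performs the same number of operations. The sketch below is plain Python, and the names `to_ell` and `spmv_ell` are hypothetical.

```python
def to_ell(rows):
    """Pad variable-length sparse rows to a common width.

    rows: list of rows, each a list of (column, value) pairs.
    Returns (cols, vals, width); padding uses column 0 and value 0.0,
    so padded entries contribute nothing to a dot product.
    """
    width = max((len(r) for r in rows), default=0)
    cols = [[c for c, _ in r] + [0] * (width - len(r)) for r in rows]
    vals = [[v for _, v in r] + [0.0] * (width - len(r)) for r in rows]
    return cols, vals, width


def spmv_ell(cols, vals, width, x):
    """Sparse matrix-vector product on the padded layout.

    Every row runs exactly `width` multiply-adds, so the per-row work is
    uniform. Assumes x is non-empty so the padding column 0 is valid.
    """
    return [
        sum(vals[i][k] * x[cols[i][k]] for k in range(width))
        for i in range(len(cols))
    ]
```

The trade-off is wasted work on the padded entries: when a few rows are much longer than the rest, the padding cost can outweigh the benefit of uniform control flow.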
Thank you for listening
Q&A