Multi-threaded Algorithm 3 Michael Tsai 2014/1/2
Multithreaded matrix multiplication P-SQUARE-MATRIX-MULTIPLY(A,B) n=A.rows let C be a new n x n matrix parallel for i=1 to n parallel for j=1 to n 𝑐 𝑖𝑗 =0 for k=1 to n 𝑐 𝑖𝑗 = 𝑐 𝑖𝑗 + 𝑎 𝑖𝑘 ∙ 𝑏 𝑘𝑗 return C 𝑇 1 (𝑛)=Θ 𝑛 3 𝑇 ∞ 𝑛 =Θ log n +Θ log 𝑛 +Θ 𝑛 =Θ(𝑛) 𝑇 1 𝑛 𝑇 ∞ 𝑛 = Θ 𝑛 3 Θ 𝑛 =Θ 𝑛 2
Divide-and-conquer Multithreaded Algorithm for Matrix Multiplication (勒勒長) 𝐴= 𝐴 11 𝐴 12 𝐴 21 𝐴 22 𝐵= 𝐵 11 𝐵 12 𝐵 21 𝐵 22 𝐶= 𝐶 11 𝐶 12 𝐶 21 𝐶 22 = 𝐴 11 𝐴 12 𝐴 21 𝐴 22 𝐵 11 𝐵 12 𝐵 21 𝐵 22 = 𝐴 11 𝐵 11 𝐴 12 𝐵 12 𝐴 21 𝐵 11 𝐴 22 𝐵 12 + 𝐴 12 𝐵 21 𝐴 12 𝐵 22 𝐴 22 𝐵 21 𝐴 22 𝐵 22 C T
𝑀 1 𝑛 𝑀 ∞ 𝑛 = Θ 𝑛 3 Θ log 2 𝑛 =Θ 𝑛 3 log 2 𝑛 P-MATRIX-MULTIPLY-RECURSIVE(C,A,B) n=A.rows if n==1 𝑐 11 = 𝑎 11 𝑏 11 else let T be a new n x n matrix partition A,B,C, and T into n/2 x n/2 submatrices ( 𝐴 𝑖𝑗 , 𝐵 𝑖𝑗 , 𝐶 𝑖𝑗 , 𝑇 𝑖𝑗 , 𝑖,𝑗={1,2}) spawn P-MATRIX-MULTIPLY-RECURSIVE( 𝐶 11 , 𝐴 11 , 𝐵 11 ) spawn P-MATRIX-MULTIPLY-RECURSIVE( 𝐶 12 , 𝐴 11 , 𝐵 12 ) spawn P-MATRIX-MULTIPLY-RECURSIVE( 𝐶 21 , 𝐴 21 , 𝐵 11 ) spawn P-MATRIX-MULTIPLY-RECURSIVE( 𝐶 22 , 𝐴 21 , 𝐵 12 ) spawn P-MATRIX-MULTIPLY-RECURSIVE( 𝑇 11 , 𝐴 12 , 𝐵 21 ) spawn P-MATRIX-MULTIPLY-RECURSIVE( 𝑇 12 , 𝐴 12 , 𝐵 22 ) spawn P-MATRIX-MULTIPLY-RECURSIVE( 𝑇 21 , 𝐴 22 , 𝐵 21 ) P-MATRIX-MULTIPLY-RECURSIVE( 𝑇 22 , 𝐴 22 , 𝐵 22 ) sync parallel for i=1 to n parallel for j=1 to n 𝑐 𝑖𝑗 = 𝑐 𝑖𝑗 + 𝑡 𝑖𝑗 𝑀 1 𝑛 =8 𝑀 1 𝑛 2 +Θ 𝑛 2 =Θ 𝑛 3 𝑀 1 𝑛 𝑀 ∞ 𝑛 = Θ 𝑛 3 Θ log 2 𝑛 =Θ 𝑛 3 log 2 𝑛 𝑀 ∞ 𝑛 = 𝑀 ∞ 𝑛 2 +Θ log 𝑛 +Θ log 𝑛 =Θ log 2 𝑛
How about Strassen’s method? Reading assignment: p.795-796. Parallelism: Θ( 𝑛 log 7 log 2 𝑛 ), slightly less than the original recursive version!
Multithreaded Merge Sort A: Array to be sorted p and r: Start and end index of the range to be sorted Merge-Sort’(A,p,r) if p<r q= (𝑝+𝑟)/2 spawn MERGE-SORT’(A,p,q) MERGE-SORT’(A,q+1,r) sync MERGE(A,p,q,r) Merge() is the bottleneck! Θ(𝑛) 𝑀 𝑆 1 ′ 𝑛 =2𝑀 𝑆 1 ′ 𝑛 2 +Θ 𝑛 =Θ 𝑛 log 𝑛 Parallelism= Θ( log 𝑛 ) 𝑀 𝑆 ∞ ′ 𝑛 =𝑀 𝑆 ∞ ′ 𝑛 2 +Θ 𝑛 =Θ 𝑛
把Merge也用Divide & Conquer來解! (方便交給不同的thread去做) 2. 找出 𝑝 2 ~ 𝑟 2 中 𝑞 2 這個位置使得 𝑝 2 ~ 𝑞 2 −1都< x, 𝑞 2 ~ 𝑟 2 都≥ x 1. 挑出 𝑝 1 ~ 𝑟 1 的中位數x 3. Copy x 到新地方(根據x和 𝑞 2 的位置) Base case: 𝑝 1 ~ 𝑟 1 和 𝑝 2 ~ 𝑟 2 都是空的 4. 開分身去merge 𝑝 1 ~ 𝑞 1 −1和 𝑝 2 ~ 𝑞 2 −1, 及 𝑞 1 +1~ 𝑟 1 和 𝑞 2 ~ 𝑟 2 兩個區段
Multithreaded Merge P-MERGE(T, 𝑝 1 , 𝑟 1 , 𝑝 2 , 𝑟 2 ,A, 𝑝 3 ) 𝑛 1 = 𝑟 1 − 𝑝 1 +1 𝑛 2 = 𝑟 2 − 𝑝 2 +1 if 𝑛 1 < 𝑛 2 exchange 𝑝 1 𝑤𝑖𝑡ℎ 𝑝 2 exchange 𝑟 1 𝑤𝑖𝑡ℎ 𝑟 2 exchange 𝑛 1 𝑤𝑖𝑡ℎ 𝑛 2 if 𝑛 1 ==0 return else 𝑞 1 = ( 𝑝 1 + 𝑟 1 )/2 𝑞 2 =BINARY-SEARCH(T[ 𝑞 1 ],T, 𝑝 2 , 𝑟 2 ) 𝑞 3 = 𝑝 3 + 𝑞 1 − 𝑝 1 +( 𝑞 2 − 𝑝 2 ) A[ 𝑞 3 ]=T[ 𝑞 1 ] spawn P-MERGE(T, 𝑝 1 , 𝑞 1 −1, 𝑝 2 , 𝑞 2 −1,A, 𝑝 3 ) P-MERGE(T, 𝑞 1 +1, 𝑟 1 , 𝑞 2 , 𝑟 2 ,A, 𝑞 3 +1) sync T: Array to be merged A: Array to save the merged result 𝑝 1 , 𝑟 1 , 𝑝 2 , 𝑟 2 : Start and end index of the range to be merged 𝑝 3 : The index of the median in A
𝑛 1 ≥ 𝑛 2 , 𝑛= 𝑛 1 + 𝑛 2 𝑛 2 = 2 𝑛 2 2 ≤ 𝑛 1 + 𝑛 2 2 = 𝑛 2 最糟的狀況下, x比所有 𝑝 2 ~ 𝑟 2 的都大 要merge 𝑛 1 2 + 𝑛 2 個元素 𝑛 1 2 + 𝑛 2 ≤ 𝑛 1 2 + 𝑛 2 2 + 𝑛 2 2 = 𝑛 1 + 𝑛 2 2 + 𝑛 2 2 ≤ 𝑛 2 + 𝑛 4 = 3𝑛 4
Multithreaded Merge P-MERGE(T, 𝑝 1 , 𝑟 1 , 𝑝 2 , 𝑟 2 ,A, 𝑝 3 ) 𝑛 1 = 𝑟 1 − 𝑝 1 +1 𝑛 2 = 𝑟 2 − 𝑝 2 +1 if 𝑛 1 < 𝑛 2 exchange 𝑝 1 𝑤𝑖𝑡ℎ 𝑝 2 exchange 𝑟 1 𝑤𝑖𝑡ℎ 𝑟 2 exchange 𝑛 1 𝑤𝑖𝑡ℎ 𝑛 2 if 𝑛 1 ==0 return else 𝑞 1 = ( 𝑝 1 + 𝑟 1 )/2 𝑞 2 =BINARY-SEARCH(T[ 𝑞 1 ],T, 𝑝 2 , 𝑟 2 ) 𝑞 3 = 𝑝 3 + 𝑞 1 − 𝑝 1 +( 𝑞 2 − 𝑝 2 ) A[ 𝑞 3 ]=T[ 𝑞 1 ] spawn P-MERGE(T, 𝑝 1 , 𝑞 1 −1, 𝑝 2 , 𝑞 2 −1,A, 𝑝 3 ) P-MERGE(T, 𝑞 1 +1, 𝑟 1 , 𝑞 2 , 𝑟 2 ,A, 𝑞 3 +1) sync 𝑃 𝑀 ∞ 𝑛 =𝑃 𝑀 ∞ 3𝑛 4 +Θ log 𝑛 =Θ log 2 𝑛 𝑃 𝑀 1 𝑛 =Ω 𝑛 , 因為至少要copy n elements 𝑃 𝑀 1 𝑛 =O 𝑛 , 見課本p.802 ⇒𝑃 𝑀 1 𝑛 =Θ 𝑛 Θ log 𝑛
P-MERGE-SORT(A,p,r,B,s) n=r-p+1 if n==1 B[s]=A[p] else let T[1 P-MERGE-SORT(A,p,r,B,s) n=r-p+1 if n==1 B[s]=A[p] else let T[1..n] be a new array 𝑞= 𝑝+𝑟 2 𝑞 ′ =𝑞−𝑝+1 spawn P-MERGE-SORT(A,p,q,T,1) P-MERGE-SORT(A,q+1,r,T,q’+1) sync P-MERGE(T,1,q’,q’+1,n,B,s) A: Array to be merged p,r: Index of the range to be sorted B: Array to save the result 𝑃𝑀 𝑆 1 𝑛 =2𝑃𝑀 𝑆 1 𝑛 2 +𝑃 𝑀 1 𝑛 =2𝑃𝑀 𝑆 1 𝑛 2 +Θ 𝑛 =Θ 𝑛 log 𝑛 𝑃𝑀 𝑆 ∞ 𝑛 =𝑃𝑀 𝑆 ∞ 𝑛 2 +𝑃 𝑀 ∞ 𝑛 =𝑃𝑀 𝑆 ∞ 𝑛 2 +Θ log 2 𝑛 =Θ log 3 𝑛 Parallelism = Θ 𝑛 log 𝑛 log 3 𝑛 =Θ 𝑛 log 2 𝑛 Much better now!
怎麼自己學演算法 非常重要 這堂課沒辦法把所有”重要”的演算法都教完 但是希望過程中你已經建立了自己學習演算法的能力 準備研究所考試的時候可能會需要
我的經驗 我也跟著大家一起學習 資料結構 與 演算法 一些好方法(自以為)跟大家分享: 畫圖, 用簡單的例子圖解步驟 先了解概念(大方向), 不急著看複雜的數學 (我常說: 如果什麼都沒弄懂, 先弄懂這個) 一次了解一小部分就好 (模組化學習), 不急著弄懂所有的部分 相信自己一定可以弄懂 (非常重要) 不知道自己到底懂了沒? 用一些古怪的例子來試試 (boundary case) 練習自己看課本 (非常重要)
期末考內容&型式 Closed book, 2 x A4 cheat sheets (double-sided) 佔學期成績26% 包含整學期上課內容, 但以期中考過後的內容為主 180 minutes 題型與期中考類似 (是非+選擇+解釋) Extra office hour ??