Presentation is loading. Please wait.

Presentation is loading. Please wait.

IX. Basic Implementation Techniques and Fast Algorithm

Similar presentations


Presentation on theme: "IX. Basic Implementation Techniques and Fast Algorithm"— Presentation transcript:

1 IX. Basic Implementation Techniques and Fast Algorithm
289 IX. Basic Implementation Techniques and Fast Algorithm  9-A 快速演算法設計的原則 Fast Algorithm Design Goals: Saving Computational Time Number of Additions Number of Multiplications Number of Time Cycles Saving the Hardware Cost for Implementation

2 290  9-B 對於簡單矩陣快速演算法的設計 如何簡化下面四個運算 y[1] = ax[1] + 2ax[2]

3 291 (4) (i) z(1) = a[x(1) + x(2)], z(2) = z(1) (ii) z(3) = (b−a)x(2), z(4) = (c−a)x(1) (iii) y(1) = z(1) + z(3), y(2) = z(2) + z(4)

4 問題思考:如何對 complex number multiplication 來做 implementation?

5 293  9-C 一般化的簡化理論: 假設一個 MN sub-rectangular matrix S 可分解為 column vector 及 row vector 相乘 若 [a1, a2, …., aM]T 有 M0 個相異的 non-trivial values (am  2k, am  2kah where m  h) [b1, b2, …., bN] 有 N0 個相異的 non-trivial values 則 S 共需要 M0 + N0 個乘法

6 294 Step za = b1x[1] + b2x[2] + …. + bNx[N] Step z[1] = a1 za , z[2] = a2 za , ………, z[N] = aM za

7 295 簡化理論的變型 S1 也是一個 MN matrix 若 S1 有 P1 個值不等於 0, 則 S 的乘法量上限為 M0 + N0 + P1   以此類推

8 296 思考: 對於如下的情形需要多少乘法

9  9-D Examples 297 DFT: 3  3 DFT 可以用特殊方法簡化
Without any simplification, the DFT needs 4N2 real multiplications (x[n] may be complex) 3  3 DFT 可以用特殊方法簡化

10 298  5  5 DFT 的例子

11 299 8-point DCT 觀察對稱性質之後,令

12 300 Part 1:

13 301 Part 2: [Ref] B. G. Lee, “A new algorithm for computing the discrete cosine transform,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 32, pp , Dec

14 X. Fast Fourier Transform
302 X. Fast Fourier Transform  C. S. Burrus and T. W. Parks, “DFT / FFT and convolution algorithms”, John Wiley and Sons, New York, 1985.  R. E. Blahut, Fast Algorithm for Digital Signal Processing, Addison Wesley Publishing Company. N-point Fourier Transform: 運算量為 N2 FFT (with the Cooley Tukey algorithm): 運算量為 Nlog2N 要學到的概念:(1)快速演算法不是只有 Cooley Tukey algorithm (2) 不是只有 N = 2k 有時候才有快速演算法

15  10-A Other DFT Implementation Algorithms
(1) Cooley-Tukey algorithm (Butterfly form) (2) Radix-4, 8, 16, …. Algorithms (3) Prime Factor Algorithm (4) Goertzel Algorithm (5) Chirp Z transform (CZT) (6) Winograd algorithm

16 Reference  J. W. Cooley and J. W. Tukey, “An algorithm for the machine computation of complex Fourier series,” Mathematics of Computation, vol. 19, pp , Apr (Cooley-Tukey)  C. S. Burrus, “Index Mappings for multidimensional formulation of the DFT and convolution,” IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 25, pp , June (Prime factor)  G. Goertzel, “An algorithm for the evaluation of finite trigonometric series,” American Math. Monthly, vol. 65, pp , Jan (Goertzel)  C. R. Hewes, R. W. Broderson, and D. D. Buss, “Applications of CCD and switched capacitor filter technology,” Proc. IEEE, vol. 67, no. 10, pp , Oct (CZT)  S. Winograd, “On computing the discrete Fourier transform,” Mathematics of Computation, vol. 32, no. 141, pp , Jan (Winograd)  R. E. Blahut, Fast Algorithm for Digital Signal Processing, Reading, Mass., Addison-Wesley, 1985.

17  10-B Cooley Tukey Algorithm
305  10-B Cooley Tukey Algorithm When N = 2k twiddle factors x1[n] = x[2n], x2[n] = x[2n+1] Therefore, one N-point DFT = two (N/2)-point DFT + twiddle factors

18 306 8-point DFT X[0] X[1] X[2] X[3] X[4] X[5] X[6] X[7] x[0] x[2] x[4] x[6] 4-point DFT 1 w w2 w3 x[1] x[3] x[5] x[7] -1 4-point DFT -1 -1 -1

19 307 8-point DFT x[0] x[4] x[2] x[6] x[1] x[5] x[3] x[7] X[0] X[1] X[2] X[3] X[4] X[5] X[6] X[7] -1 1 w2 -1 -1 -1 1 w w2 w3 -1 -1 -1 1 w2 -1 -1 -1 -1 -1

20 308  Number of real multiplications 的估算 2k-point DFT 一共有 k 個 stages 每個 stage 和下一個 stage 之間有 2k-1 個 twiddle factors 所以,一共有 2k-1(k-1) 個 twiddle factors 一般而言,每個 twiddle factor 需要 3 個 real multiplications  2k-point DFT 需要 個 real multiplications Complexity of the N-point DFT: O(Nlog2N)

21 309  8-point DFT 只需要 4 個 real multiplications (Why?)  更精確的分析,使用 Cooley-Tukey algorithm 時,N-point DFT 需要 個 real multiplications (Why?)

22  10-C Radix-4 Algorithm 310 限制: N = 4k
or N = 24k (此時 Cooley-Tukey algorithm 和 radix-4 algorithm 並用) twiddle factors One N-point DFT = four (N/4)-point DFTs + twiddle factors

23 311 Note: (1) radix-4 algorithm 最後可將 N = 4k-point DFT 拆解成 4-point DFTs 的組合 4-point DFTs 不需要任何的乘法 (2) 使用 radix-4 algorithm 時,N-point DFT 需要 個 real multiplications

24  Number of real multiplications for the N-point DFT
乘法數 加法數 1 12 8 26 104 44 160 2 4 13 52 27 114 45 170 3 14 32 28 64 48 92 16 15 40 30 80 208 5 10 34 20 72 54 228 6 36 18 33 56 156 7 35 150 60 21 62 63 256 9 22 39 182 204 88 24 100 66 284 11 46 25 148 42 124 70 300

25 N 乘法數 72 164 128 560 256 1308 1024 7436 80 260 144 436 288 1160 1152 7088 81 480 160 680 312 1324 1260 7640 84 248 168 580 336 1412 2016 12728 88 412 180 360 1540 2048 16836 90 340 192 752 420 2080 2304 15868 96 280 204 976 504 2300 2520 16540 104 468 216 1020 512 3180 4032 29488 108 456 224 1016 3100 4096 37516 112 396 240 940 672 3496 4368 35828 120 380 252 1008 5356 5040 36860

26 附錄十: Complexities 一覽表 314  N-point DFT:  N-point DCT, DST, DHT:
 Two-dimensional (2-D) Nx  Ny-point DFT:  Convolution of an M-point sequence and an N-point sequence: when M/N and N/M are not large, when N >> M and M is a fixed constant. when M >> N and N is a fixed constant.

27 315  2-D Convolution of an (Mx  My )-point matrix and an (Nx  Ny )-point matrix: when MxMy/NxNy and NxNy/MxMy are not large, when MxMy >> NxNy or NxNy >> MxMy when MxMy >> NxNy or NxNy >> MxMy , and Mx, My are fixed constants.

28 316 Four important concepts that should be learned from fast algorithm design: (1) (2) (3) (4)


Download ppt "IX. Basic Implementation Techniques and Fast Algorithm"

Similar presentations


Ads by Google