Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/8937
Title: High speed hardware accelerator designs for MIMO OFDM communication systems (應用於多重輸入多重輸出正交分頻多工通訊系統之高速硬體模組設計)
Author: Hsu, Wei-Jie (徐偉傑)
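Keywords: OFDM; orthogonal frequency division multiplexing; MIMO; multiple-input multiple-output; FFT; fast Fourier transform; QRD; QR decomposition; SVD; singular value decomposition; GMD; geometric mean decomposition; pre-coding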
Publisher: Department of Electrical Engineering
Abstract:

OFDM-based communication systems use bandwidth efficiently and thus support higher data transmission rates. This thesis investigates several high-speed accelerator module designs for OFDM communication systems, together with their chip implementations. First, a high-speed 64-point FFT module targeting optical fiber-to-the-home (FTTH) systems is developed. The specified FFT sampling frequency is as high as 2 GHz; the proposed design uses 4-way parallel processing to lower the working frequency to a feasible 250 MHz. Second, a 1024-point FFT module for frequency-modulated continuous-wave (FMCW) radar systems is developed. The design also incorporates a front-end filtering module and can operate at 10 MHz. Third, various matrix decomposition modules applicable to signal detection in MIMO-OFDM systems are devised. These include complex-valued QR decomposition, singular value decomposition (SVD), and geometric mean decomposition (GMD). Starting from algorithm analysis, the investigation led to efficient computing schemes with significantly lower computational complexity than traditional methods. All of these schemes operate on real-valued elements and eliminate the redundant computations common in conventional approaches. The proposed GMD scheme also does not require performing an SVD first, and therefore avoids the convergence problem.
Based on the developed algorithms, novel architecture designs are derived, and each design is implemented as an ASIC or on an FPGA. The high-speed 64-point FFT module for FTTH applications is first implemented on an FPGA, where the effective sampling rate reaches 2.12 GHz. The alternative ASIC implementation achieves a 250 MHz working frequency with a core area of 0.94 × 0.92 mm². The FFT processor chip for FMCW radar systems, with a die size of 1.83 × 1.82 mm², functions properly at up to 20 MHz and performs one 1024-point FFT computation in 0.3075 ms. Following the 802.11n system specifications, a multi-function 2 × 2 QR/SVD matrix decomposition chip was developed. It achieves a working frequency of 120 MHz with a chip core size of 1.01 × 1 mm².
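As an illustration of what a geometric mean decomposition produces for the 2 × 2 case named in the abstract, the following is a minimal sketch in Python with NumPy. It uses the conventional SVD-based construction, not the SVD-free scheme proposed in the thesis, whose details the abstract does not give. The function name `gmd_2x2` and the test matrix are hypothetical, and the channel matrix is assumed to have full rank.

```python
import numpy as np

def gmd_2x2(H):
    """Conventional SVD-based GMD of a full-rank 2x2 complex matrix.

    Returns Q, R, P with H = Q @ R @ P.conj().T, where Q and P are unitary
    and R is upper triangular with both diagonal entries equal to the
    geometric mean of the singular values of H.
    (Illustrative only; this is NOT the SVD-free scheme of the thesis.)
    """
    U, s, Vh = np.linalg.svd(H)
    s1, s2 = s                      # s1 >= s2 >= 0
    if s2 <= 0.0:
        raise ValueError("H is assumed to have full rank")
    gm = np.sqrt(s1 * s2)           # geometric mean of the singular values

    # Real Givens rotations chosen so that G2.T @ diag(s) @ G1 is upper
    # triangular with gm on the diagonal.
    c = np.sqrt(s2 / (s1 + s2))
    sn = np.sqrt(s1 / (s1 + s2))
    G1 = np.array([[c, -sn], [sn, c]])
    G2 = np.array([[c * s1, -sn * s2], [sn * s2, c * s1]]) / gm

    R = G2.T @ np.diag(s) @ G1
    Q = U @ G2
    P = Vh.conj().T @ G1
    return Q, R, P

# Hypothetical 2x2 channel matrix used only for demonstration.
H = np.array([[1 + 2j, 0.5 - 1j], [0.3j, 2 - 0.5j]])
Q, R, P = gmd_2x2(H)
print(np.round(R, 6))
```

Both diagonal entries of `R` equal the square root of the product of the two singular values, so the two spatial sub-channels see the same gain. The sketch still relies on an iterative SVD routine, which is the step the thesis reports avoiding.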
URI: http://hdl.handle.net/11455/8937
Other Identifiers: U0005-2408201017264000
Appears in Collections: Department of Electrical Engineering

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.