National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Author:孫茂仁
Author (Eng.):Mao Jen Sun
Title:座標旋轉演算法之高效能與低功率之二維離散餘弦轉換及反餘弦轉換之架構設計
Title (Eng.):High-Efficiency and Low-Power Architectures for 2-D DCT and IDCT Based on CORDIC Rotation
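Advisor:宋志雲
Advisor (Eng.):Tze-Yun Sung
Degree:Master
Institution:Chung Hua University (中華大學)
Department:Master's Program, Department of Electrical Engineering
Narrow Field:Engineering
Detailed Field:Electrical and Information Engineering
Type of Paper:Academic thesis/dissertation
Publication Year:2006
Graduated Academic Year:94 (ROC calendar)
Language:Chinese
Number of Pages:77
Keyword (Chi.):離散餘弦轉換 (discrete cosine transform)
Keyword (Eng.):DCT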
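Abstract (translated from the Chinese)
With the rapid development of computer applications and communication system technology, image compression techniques have become increasingly sophisticated. From the standpoint of image compression, transform coding outperforms traditional linear predictive coding. In the WHT, for example, the core matrix operations require only additions and subtractions, which is a significant advantage. As multimedia data compression applications and demands continue to multiply, the discrete cosine transform has long been an indispensable component, and it is a core technology of image and video compression standards such as JPEG and MPEG-1, 2, and 4.
The 2-D discrete cosine transform and 2-D inverse discrete cosine transform are used ever more widely in image processing systems. In this thesis, we propose high-efficiency 8x8 2-D DCT and 2-D IDCT processors built on a parallel, pipelined architecture. The designs use 128 and 64 memory cells, respectively, together with a read-only memory that stores 6 coefficients, to raise performance and save memory space. In the core arithmetic unit, the 1-D DCT and 1-D IDCT processors use no multipliers at all; multiplication is replaced by the adders and shifters of the accumulation steps in the CORDIC algorithm. The proposed 2-D DCT and 2-D IDCT architectures offer a more regular data flow, lower complexity, and lower power consumption while achieving higher performance.
All hardware architectures proposed in this thesis have been implemented in the Verilog hardware description language and logic-synthesized with Synopsys Design Compiler. Astro Layout Tools were then used with the TSMC 0.18 μm 1P6M CMOS process and the Artisan memory process to lay out the circuit and generate the chip automatically. For chip analysis, power consumption was estimated with Synopsys PrimePower.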
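An 8x8 2-D DCT built from 1-D DCT processors and an intermediate memory is commonly organized as a row-column decomposition: a 1-D pass over the rows, a transpose, and a 1-D pass over the columns. The sketch below is a floating-point reference model of that idea only, assuming the orthonormal DCT-II definition. It uses ordinary multiplications and does not model the multiplierless hardware of the thesis; all function names are illustrative.

```python
import math

N = 8  # block size assumed from the 8x8 processors described in the abstract


def _scale(k):
    """Orthonormal DCT-II scaling factor for frequency index k."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def dct_1d(vec):
    """Direct (O(N^2)) orthonormal 1-D DCT-II of a length-N sequence."""
    return [
        _scale(k) * sum(vec[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                        for n in range(N))
        for k in range(N)
    ]


def idct_1d(coeffs):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    return [
        sum(_scale(k) * coeffs[k] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
            for k in range(N))
        for n in range(N)
    ]


def transpose(block):
    """Swap rows and columns; stands in for the intermediate buffer."""
    return [list(col) for col in zip(*block)]


def dct_2d(block):
    """2-D DCT as a row pass, a transpose, and a column pass."""
    row_pass = [dct_1d(row) for row in block]
    col_pass = [dct_1d(col) for col in transpose(row_pass)]
    return transpose(col_pass)


def idct_2d(coeffs):
    """2-D IDCT using the same row-column structure."""
    row_pass = [idct_1d(row) for row in coeffs]
    col_pass = [idct_1d(col) for col in transpose(row_pass)]
    return transpose(col_pass)
```

Between the two passes, all 64 intermediate values of an 8x8 block have to be held somewhere, which is why row-column designs need a transpose buffer.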
Abstract
With the rapid growth of modern communication applications and computer technologies, image compression is increasingly in demand. From the compression point of view, transform coding is superior to linear prediction coding. The Walsh-Hadamard transform is the simplest example: the computations in its kernel matrix are only additions and subtractions. Because the cosine transform closely approximates the optimal Karhunen-Loeve transform, which is much more complicated to compute in practice, the discrete cosine transform (DCT) has been widely used for image compression. Moreover, the DCT is adopted by the JPEG standard.
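To make the point about the Walsh-Hadamard kernel concrete, here is a minimal sketch of the fast Walsh-Hadamard transform in natural (Hadamard) order, unnormalized. It is an illustration rather than anything taken from the thesis, and the function name `fwht` is an assumption. Every butterfly is one addition and one subtraction, with no multiplications.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (natural order).

    Only additions and subtractions are used. The input length must be
    a power of two.
    """
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    half = 1
    while half < n:
        # Combine pairs of elements that are `half` positions apart.
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                a, b = data[i], data[i + half]
                data[i], data[i + half] = a + b, a - b
        half *= 2
    return data
```

For example, `fwht([1, 0, 1, 0, 0, 1, 1, 0])` returns `[4, 2, 0, -2, 0, 2, 0, 2]`, and applying `fwht` twice returns the input scaled by its length.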
The two-dimensional discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) are widely used in image processing systems. In this thesis, efficient architectures with parallel and pipelined structures are proposed to implement 8x8 DCT and IDCT processors. The designs use a dual-bank SRAM (128 words) and a single-bank SRAM (64 words), respectively, together with a coefficient ROM (6 words) to save memory space. The multiplier, the kernel arithmetic unit that is costly in DCT and IDCT processors, is replaced by simple adders and shifters based on the double rotation CORDIC algorithm. The proposed architectures for the 2-D DCT and IDCT processors not only simplify the hardware but also reduce power consumption while maintaining high performance.
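The way CORDIC trades multipliers for adders and shifters can be sketched in a few lines. The example below implements the conventional CORDIC rotation, not the double rotation variant used in the thesis, whose details are not given here. The word width, iteration count, and all names (`FRAC_BITS`, `ITERATIONS`, `cordic_rotate`) are assumptions chosen for illustration.

```python
import math

FRAC_BITS = 16    # assumed fixed-point fraction width for the angle accumulator
ITERATIONS = 16   # assumed number of micro-rotations

# Lookup table of atan(2^-i) in fixed point (a small ROM in hardware).
ATAN_TABLE = [round(math.atan(2.0 ** -i) * (1 << FRAC_BITS))
              for i in range(ITERATIONS)]

# Each micro-rotation stretches the vector by sqrt(1 + 2^-2i); the product
# is the constant CORDIC gain (about 1.6468) that must be compensated once.
CORDIC_GAIN = 1.0
for i in range(ITERATIONS):
    CORDIC_GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_rotate(x, y, angle):
    """Rotate the integer vector (x, y) by `angle` radians.

    Only shifts, additions, and subtractions are used inside the loop.
    The result is scaled by CORDIC_GAIN; |angle| must stay below about
    1.74 rad, the convergence range of this basic form.
    """
    z = round(angle * (1 << FRAC_BITS))  # residual angle, fixed point
    for i in range(ITERATIONS):
        if z >= 0:   # rotate counter-clockwise by atan(2^-i)
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:        # rotate clockwise by atan(2^-i)
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return x, y
```

Calling `cordic_rotate(1 << 20, 0, math.pi / 4)` and dividing both outputs by `CORDIC_GAIN * (1 << 20)` gives values close to cos(π/4) and sin(π/4). In a hardware design the gain is a fixed constant, so it can be folded into other constant scaling rather than applied with a general multiplier.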
The proposed parallel-pipelined architectures for the 2-D DCT and IDCT processors have been written in Verilog® and synthesized with TSMC 0.18 μm 1P6M CMOS cell libraries. The layout of the design is then generated automatically by the Astro Layout Tools in a 0.18 μm 1P6M CMOS technology. The core sizes and power consumption are obtained from the reports of Synopsys® Design Analyzer and PrimePower®, respectively.
Table of Contents
English Abstract
Chinese Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Methods and Procedure
Chapter 2 Discrete Cosine Transform Algorithms
2.1 Introduction to Discrete Cosine Transform Algorithms
2.2 1-D DCT/IDCT Algorithms
2.3 2-D DCT/IDCT Algorithms
2.3.1 2-D DCT Algorithm
2.3.2 2-D IDCT Algorithm
2.4 Previous Architecture Designs
2.4.1 Direct-method designs
2.4.2 CORDIC-based designs
Chapter 3 Hardware Architecture Design for the Discrete Cosine Transform
3.1 Design Concept
3.2 The CORDIC Algorithm
3.3 1-D DCT Hardware Architecture Design
3.4 1-D IDCT Hardware Architecture Design
3.5 Architecture Design by Row-Column Decomposition
3.6 2-D DCT/IDCT Hardware Architecture Design
3.6.1 Dual-Memory Hardware Architecture Design
3.6.2 Single-Memory Hardware Architecture Design
3.7 Performance Comparison
Chapter 4 Simulation Results and Hardware Implementation
4.1 Design Flow
4.2 1-D DCT/IDCT Simulation Results and Comparison
4.3 2-D DCT/IDCT Simulation Results and Comparison
4.3.1 Simulation Results of the Dual-Memory Hardware Design
4.3.2 Simulation Results of the Single-Memory Hardware Design
4.4 Chip Layout of the 2-D DCT/IDCT Designs
4.4.1 Chip Layout of the Dual-Memory Design
4.4.2 Chip Layout of the Single-Memory Design
4.5 Chip Implementation Results and Comparison
Chapter 5 Conclusion
References