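Author: Han-Jen Hsu (許瀚仁)
Thesis title: Design and Implementation of Dual-mode DCT IP Core and Reconfigurable DSP Processor
Thesis title (Chinese): 雙模式之離散餘弦轉換智產核心與可重組化數位訊號處理器之設計與實現
Advisor: Yeong-Kang Lai (賴永康)
Degree: Master's
Institution: National Chung Hsing University (國立中興大學)
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2003
Academic year of graduation: 91 (ROC calendar)
Language: English
Pages: 86
Keywords (translated from Chinese): discrete cosine transform; integrated circuit design; reconfigurable computing
Keywords (English): DCT; VLSI Design; Reconfigurable Computing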
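With the advent of the system-on-chip (SoC) era, traditional application-specific integrated circuits (ASICs) are no longer adequate, so the question becomes how to implement a variety of algorithms on the same hardware in order to save circuit area and design time. Reconfigurable computing is one of the most active research areas: with appropriate adjustment, the hardware is no longer limited to a single function and can be changed to suit different applications.
For the discrete cosine transform (DCT), this thesis proposes a two-dimensional, dual-mode, pipelined DCT architecture. It is dedicated to 8 × 8 blocks, and its row-column decomposition is well suited to VLSI implementation. The architecture contains two DCT processors that handle the odd and even data respectively. Reconfigurable pipelined design and switching between the two modes allow it to meet both low-power and high-speed requirements. The precision given by the internal word length satisfies the CCITT error requirements for the DCT, and the regular parallel architecture also supports high-speed computation.
The chip uses the Artisan 0.25 μm cell library and the TSMC 0.25 μm 1P5M process. The total transistor count is 77,822, the die size is 1.38 × 1.38 mm², and the maximum operating frequency is 56 MHz. Power consumption is 14.17 mW at 56 MHz and 7.89 mW at 28 MHz.
For the reconfigurable part, the thesis proposes a dynamic, mixed-granularity reconfigurable processor for digital signal processing. The engine comprises an array of 64 reconfigurable cells, a controller, a reconfigurable data buffer, and a microcode ROM, and it can implement algorithms commonly used in video and signal processing, such as the DCT, filters, FIR filters, and motion estimation. It acts as a co-processor within the overall reconfigurable system to increase system performance.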
In this thesis, we propose a cost-effective 2-D Discrete Cosine Transform (DCT) IP core with a reconfigurable datapath.
The chip processes 8 × 8 blocks of a video sequence. Even-odd decomposition is used because it is well suited to VLSI implementation.
The architecture includes two types of reconfigurable processors, which process the even and odd data respectively.
Two operating modes of the reconfigurable datapaths are used to achieve high speed and low power consumption.
The internal word-length precision meets the requirements of the CCITT standard.
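The even-odd decomposition mentioned above can be illustrated with a small floating-point reference model. This is only a sketch under stated assumptions: the orthonormal DCT-II scaling, the function names (`dct8_direct`, `dct8_even_odd`, `dct2d_row_column`), and the software transpose are illustrative choices, not the thesis's actual coefficients, word lengths, or hardware mapping. Sums of mirrored inputs feed the even-indexed outputs and differences feed the odd-indexed outputs, so each half needs only a 4 × 4 product; the 2-D transform then applies the 1-D transform to rows, transposes, and applies it again.

```python
import math

N = 8  # block size handled by the DCT core described in the abstract


def _c(k):
    # Normalisation factor of the orthonormal DCT-II (assumed scaling).
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct8_direct(x):
    """Reference 8-point DCT-II computed straight from the definition."""
    return [
        0.5 * _c(k) * sum(x[n] * math.cos((2 * n + 1) * k * math.pi / 16)
                          for n in range(N))
        for k in range(N)
    ]


def dct8_even_odd(x):
    """8-point DCT-II using even-odd decomposition (two 4x4 products)."""
    a = [x[n] + x[7 - n] for n in range(4)]  # sums  -> even outputs
    b = [x[n] - x[7 - n] for n in range(4)]  # diffs -> odd outputs
    out = [0.0] * N
    for k in range(N):
        src = a if k % 2 == 0 else b
        out[k] = 0.5 * _c(k) * sum(
            src[n] * math.cos((2 * n + 1) * k * math.pi / 16)
            for n in range(4)
        )
    return out


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2d_row_column(block, dct1d=dct8_even_odd):
    """2-D DCT of an 8x8 block: 1-D DCT on rows, transpose, 1-D DCT again."""
    rows = [dct1d(r) for r in block]
    cols = [dct1d(c) for c in transpose(rows)]
    return transpose(cols)
```

Comparing `dct8_even_odd` against `dct8_direct` on the same input is a quick way to confirm that the decomposition does not change the result; a fixed-point version would additionally need the word-length analysis that the thesis covers.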
A prototype chip is implemented with the Artisan 0.25 μm cell library and fabricated in TSMC 0.25 μm 1P5M technology.
The chip includes a texture transpose memory, two DCT processors, and a pre-adder; the total transistor count is 77,822. The die size is 1.38 × 1.38 mm².
Post-layout simulation shows that the chip can operate at up to 56 MHz.
Static timing analysis is also used to verify the chip.
The power consumption is 14.17 mW at 56 MHz and 7.89 mW at 28 MHz.
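As a rough check on the two power figures, dividing power by clock frequency gives the energy per clock cycle, since mW / MHz = nJ per cycle. The helper name below is hypothetical, and this arithmetic is not taken from the thesis; it only restates the reported numbers in a different unit.

```python
def energy_per_cycle_nj(power_mw, freq_mhz):
    """Energy per clock cycle in nanojoules (mW divided by MHz gives nJ)."""
    return power_mw / freq_mhz


# Figures reported in the abstract.
high_speed = energy_per_cycle_nj(14.17, 56)  # about 0.253 nJ per cycle
low_power = energy_per_cycle_nj(7.89, 28)    # about 0.282 nJ per cycle
power_ratio = 14.17 / 7.89                   # about 1.80 for a 2x clock ratio
```

Halving the clock reduces power by a factor of about 1.8 rather than 2, so the energy per cycle is slightly higher at 28 MHz. The record does not say what accounts for the difference.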
With the arrival of the System-on-Chip (SoC) era, the traditional ASIC is no longer sufficient.
It therefore becomes more important to map different algorithms onto the same hardware in order to reduce hardware cost and design time.
Reconfigurable computing is one of the most active current research topics.
In this thesis, we propose a reconfigurable DSP processor that implements algorithms for video processing and digital signal processing, such as the Discrete Cosine Transform (DCT), motion estimation, FIR filtering, and the Discrete Fourier Transform (DFT).
The reconfigurable processor acts as a co-processor in the overall system to increase system performance.
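Two of the kernels named above, FIR filtering and block-matching motion estimation, reduce to regular accumulate loops, which is what makes them candidates for an array of identical cells. The sketch below is a plain software reference model with hypothetical function names (`fir_filter`, `sad`, `full_search`); it does not describe how the thesis maps these kernels onto its reconfigurable array.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * samples[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(taps):
            if n - k >= 0:  # samples before the start are treated as zero
                acc += h * samples[n - k]
        out.append(acc)
    return out


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q)
               for row_a, row_b in zip(block_a, block_b)
               for p, q in zip(row_a, row_b))


def full_search(cur, ref, size):
    """Full-search block matching.

    Returns (dy, dx, cost) for the position in `ref` whose size x size
    window has the smallest SAD against the current block `cur`.
    """
    best = None
    for dy in range(len(ref) - size + 1):
        for dx in range(len(ref[0]) - size + 1):
            cand = [row[dx:dx + size] for row in ref[dy:dy + size]]
            cost = sad(cur, cand)
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best
```

For example, `fir_filter([1, 2, 3, 4], [1, 1])` adds each sample to its predecessor, and `full_search` on a small reference window returns the offset at which the current block matches exactly when such a position exists.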
Contents
1 Introduction
1.1 Reconfigurable Computing Overview
1.2 Classification of Reconfigurable Architecture
1.2.1 Reconfigurable Logic
1.2.2 Reconfigurable Datapath
1.2.3 Reconfigurable Arithmetic
1.2.4 Reconfigurable Control
1.3 Characteristic of Reconfigurable Architecture
1.3.1 Granularity
1.3.2 Depth of programmability
1.3.3 Reconfigurability
1.3.4 Interface
1.3.5 Computational model
1.4 FPGA-based Architecture
1.4.1 Garp
1.4.2 DRLE
1.5 Integrated Architecture
1.5.1 MorphoSys
1.5.2 REMARC
1.5.3 PipeRench
1.5.4 RaPiD
1.6 Thesis Organization
2 Architecture Design of Dual-mode DCT IP Core
2.1 Discrete Cosine Transform algorithm
2.1.1 direct method
2.1.2 using other transform
2.1.3 indirect method
2.2 Modification of the DCT algorithm
2.3 VLSI Architecture
2.4 Finite Wordlength Simulation
2.5 Low Power Consideration
3 Chip Implementation
3.1 Chip Design Flow
3.2 Functional Verification
3.2.1 RTL verification of the DCT IP Core
3.2.2 Gate-level verification of the DCT IP Core
3.2.3 Post-layout gate-level simulation of the DCT IP Core
3.2.4 Post-layout transistor-level simulation of the DCT IP Core
3.3 Static Timing Analysis
3.4 Test consideration
3.5 Implementation Result
3.6 Hardware Emulation by Aptix MVP
3.7 Chip Comparison
4 Architecture Design of Reconfigurable processor for Digital Signal Processing
4.1 Reconfigurable Computing Engine System
4.2 Reconfigurable processor for DSP
4.2.1 Reconfigurable Cell
4.2.2 Split-ALU organization
4.2.3 Instructions for ALU/Controller
4.2.4 Global Buses
4.2.5 Interconnection
4.3 Mapping algorithms and performance analysis
4.3.1 Discrete Cosine Transform
4.3.2 Discrete Fourier Transform
4.3.3 Motion Estimation
4.3.4 FIR filter
5 Conclusion
Bibliography