National Digital Library of Theses and Dissertations in Taiwan

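Graduate student: 鄭夢涵 (Meng-Han Cheng)
Thesis title: 改良型矩陣乘法器之設計 (Design of an Improved Matrix Multiplier)
Advisor: 杜迪榕 (Dyi-Rong Duh)
Degree: Master's
Institution: National Chi Nan University (國立暨南國際大學)
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical engineering and computer science
Thesis type: Academic thesis
Year of publication: 2006
Academic year of graduation: 94
Language: English
Number of pages: 34
Keywords: matrix multiplication; merged arithmetic; fixed-point arithmetic; partial product matrix reduction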
Matrix multiplication is one of the most frequently used operations in science and engineering, and much effort has gone into improving its efficiency. For decades, parallel processing architectures have been the main approach to accelerating such heavy computation. As manufacturing technology has advanced, processors with high clock rates and multiprocessor systems have also been used to speed it up. In this thesis, another approach, merged arithmetic, is incorporated into a parallel architecture. Merged arithmetic dissolves the boundary between individual multipliers and adders, so that the multiplications and additions are treated as one operation and performed together. However, none of the previously presented methods for reducing the partial product matrix is always better than the others. This study therefore proposes a combined method, covering the earlier methods and a new hybrid of them, to find the most efficient reduction. Because users' requirements differ, the simulation program reports three metrics to choose from: delay, cost, and delay × cost. The approximate hardware interconnection is also given in the results to assist later implementation. The results are intended to help in designing systems that need both high throughput and low cost.
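As a rough illustration of the ideas in the abstract, the sketch below builds the column heights of a merged partial product matrix for a sum of products and reduces it with a greedy Wallace-style scheme of full and half adders. It is not the thesis's combined method or its simulation program. The function names, the restriction to unsigned operands, and the choice to measure delay in counter stages and cost in adder count are all assumptions made for this example.

```python
def merged_column_heights(terms, bits):
    """Column heights of the merged partial product matrix for a sum of
    `terms` products of unsigned `bits`-bit operands (assumed model)."""
    heights = [0] * (2 * bits - 1)
    for i in range(bits):
        for j in range(bits):
            # Bit i of one operand AND bit j of the other has weight 2**(i + j);
            # every product in the sum contributes one such bit.
            heights[i + j] += terms
    return heights


def wallace_reduce(heights):
    """Greedy Wallace-style reduction until every column holds at most 2 bits.

    Returns (stages, full_adders, half_adders).
    """
    heights = list(heights)
    stages = full_adders = half_adders = 0
    while max(heights) > 2:
        nxt = [0] * (len(heights) + 1)
        for col, h in enumerate(heights):
            fa, rest = divmod(h, 3)          # groups of three bits -> full adders
            ha = 1 if rest == 2 else 0       # a leftover pair -> half adder
            passed = 1 if rest == 1 else 0   # a single leftover bit passes through
            nxt[col] += fa + ha + passed     # sum bits stay in this column
            nxt[col + 1] += fa + ha          # carry bits move to the next column
            full_adders += fa
            half_adders += ha
        if nxt[-1] == 0:
            nxt.pop()
        heights = nxt
        stages += 1
    return stages, full_adders, half_adders


def metrics(terms, bits):
    """Delay (stages), cost (adder count), and their product (assumed units)."""
    stages, fa, ha = wallace_reduce(merged_column_heights(terms, bits))
    cost = fa + ha
    return {"delay": stages, "cost": cost, "delay_x_cost": stages * cost}


if __name__ == "__main__":
    print(metrics(terms=4, bits=8))
```

Under these assumptions, `metrics(terms, bits)` gives one point of comparison; a different reduction scheme (for example a Dadda-style one) would be plugged in by replacing `wallace_reduce` and comparing the three values. The final carry-propagate adder that sums the last two rows is not modeled here.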
Abstract (in Chinese) i
Abstract ii
Contents iii
List of Figures v
List of Tables vi
1 Introduction 1
1.1 Background and Motivation 1
1.2 Related Work 2
1.3 About This Thesis 3
2 Matrix Multiplication 4
2.1 Matrix Multiplication 4
2.2 SIMD Matrix Multiplication 5
3 Former Work 8
3.1 Fast Multiplier 8
3.1.1 Wallace Tree 10
3.1.2 Dadda Tree 11
3.1.3 Fast Adder 13
3.2 2’s-Complement Multiplication 13
3.3 Merged Arithmetic 14
3.4 Fixed-Point Number System 19
4 Our Result 20
4.1 The Estimation Method 20
4.2 Simulation Results of the Previous Approaches 21
4.3 Two Reduction Methods on Demand 23
4.4 Our Result 26
4.5 An Improved Matrix Multiplier 28
5 Conclusion and Future Research 30
5.1 Concluding Remarks 30
5.2 Future Research 31
Bibliography 32
Appendix A 35