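Graduate student: 陳峻楓
Graduate student (English): Chen, Chun-Feng
Thesis title: 針對多埠記憶體演算法:技術與設計之權衡效能分析
Thesis title (English): Towards Algorithmic Multi-ported Memory: Techniques and Design Trade-offs
Advisor: 賴伯承
Advisor (English): Lai, Bo-Cheng
Committee members: 張添烜, 張錫嘉, 賴伯承
Committee members (English): Chang, Tian-Sheuan; Chang, Hsie-Chia; Lai, Bo-Cheng
Oral defense date: 2018-03-08
Degree: Master's
Institution: National Chiao Tung University
Department: Institute of Electronics
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Year of publication: 2018
Graduation academic year: 106 (ROC calendar)
Language: Chinese
Number of pages: 65
Keywords (Chinese): 多埠記憶體演算法; 靜態隨機存取記憶體; 效能分析; 設計和優化
Keywords (English): Algorithmic Multi-ported Memory; SRAM; performance analysis; design and optimization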
Abstract (in Chinese)
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Chapter 1. Introduction
Chapter 2. Background
2.1 Algorithmic Multi-ported Memory
2.2 Performance and Cost Evaluation
Chapter 3. Non-Table-Based Approaches
3.1 Non-Table-Based Replication Multiple Reads (NTRep-Rd)
3.2 Non-Table-Based XOR Multiple Reads (NTX-Rd)
3.3 Banking Non-Table XOR-Based Multiple Reads (B-NTX-Rd)
3.4 Hierarchical Banking Non-Table XOR-Based Multiple Reads (HB-NTX-Rd)
3.5 Non-Table-Based XOR Multiple Writes (NTX-Wr)
3.6 Hierarchical Banking Non-Table XOR-Based Multiple Writes (HB-NTX-Wr)
3.7 Hierarchical Banking Non-Table XOR-Based Multiple Reads and Writes (HB-NTX-RdWr)
3.8 Enhancing NTX-Wr Designs with Efficient Multiple Reads Techniques
Chapter 4. Table-Based Approaches
4.1 Table-Based Live Value Table (TBLVT)
4.2 Table-Based Remap (TBRemap)
4.3 Design of the Lookup Table
4.4 Enhancing Table-Based Designs with Non-Table-Based Techniques
Chapter 5. Performance and Impact of Design Factors
5.1 Circuit-Level Scheme vs. Algorithmic Scheme
5.2 Overall Performance and Cost
5.2.1 Non-Table-Based vs. Table-Based (1RnW)
5.2.2 Non-Table-Based Multiple Read Ports
5.2.3 Non-Table-Based Multiple Read and Write Ports
5.2.4 Table-Based Multiple Read and Write Ports
5.3 Impact of Banking Structure
5.4 Scalability with Memory Depths and Number of Ports
5.4.1 Non-Table-Based Designs
5.4.2 Table-Based Designs
5.5 Proper Tradeoff Between Circuit-Level and Algorithmic Memory
5.5.1 Choosing Building SRAM Modules: 1R1W vs. 2RW
5.5.2 Benefiting from More Complex Building SRAM Modules
5.5.3 mRnW Designs with 2R2W or 2R4W Building Modules
Chapter 6. Conclusions
References
Appendix A: Settings of Design Compiler and CACTI
Appendix B: Calculation of Storage Cost Ratio
Autobiography
Advisors (Bo-Cheng Charles Lai)
Students (Chun-Feng Chen)