



Author: Chang-Yuan Cheng
Thesis title: Distributed Memory Management Unit Design for Media Stream Processor Architecture
Advisor: Herming Chiueh
Keywords: Media; Stream Processor; Distributed Memory; Memory Management
In modern multimedia applications such as image processing, video compression, and two- and three-dimensional graphics, data copying and data moving are common operations. However, the bandwidth gap between processors and memory slows data transfer. To bridge this gap, this thesis proposes a distributed memory management unit (DMMU) for modern media processing architectures. The DMMU consists of an address translation unit (ATU) and a double data rate (DDR) memory controller. The ATU provides a virtual memory mechanism and is used to reduce data transfer time. The DDR memory controller is used in simple burst-read and burst-write modes. The DMMU implementation results show that the proposed ATU architecture achieves a 2-million-fold speed-up over a conventional ATU when transferring 16 MB of data. For data sizes below 16 MB, the ratio of the transfer time without the ATU to the transfer time with the ATU increases with the data size. The proposed design provides a leap in data transfer performance for modern media processing architectures with a tiny overhead in circuit area and power.
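This record does not describe the ATU's internals, so the following is only a toy Python sketch of the general idea that a virtual memory mechanism can replace a bulk data move with a translation-table update. The `ToyATU` class, the `PAGE_SIZE` value, and all method names are hypothetical illustrations and are not taken from the thesis.

```python
PAGE_SIZE = 4096  # hypothetical page size in bytes (assumption, not from the thesis)


class ToyATU:
    """Toy translation table: virtual page number -> physical frame number."""

    def __init__(self, num_frames):
        self.memory = bytearray(num_frames * PAGE_SIZE)
        self.table = {}

    def map(self, vpage, frame):
        """Bind a virtual page to a physical frame."""
        self.table[vpage] = frame

    def translate(self, vaddr):
        """Convert a virtual address into a physical address."""
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        return self.table[vpage] * PAGE_SIZE + offset

    def read(self, vaddr):
        return self.memory[self.translate(vaddr)]

    def write(self, vaddr, value):
        self.memory[self.translate(vaddr)] = value

    def move_by_copy(self, src_vpage, dst_vpage):
        """Move a page by transferring every byte; returns bytes transferred."""
        src = self.table[src_vpage] * PAGE_SIZE
        dst = self.table[dst_vpage] * PAGE_SIZE
        self.memory[dst:dst + PAGE_SIZE] = self.memory[src:src + PAGE_SIZE]
        return PAGE_SIZE

    def move_by_remap(self, src_vpage, dst_vpage):
        """Move a page by re-pointing the table entry; no bytes are transferred."""
        self.table[dst_vpage] = self.table.pop(src_vpage)
        return 0
```

In this toy model the cost of a copy grows with the amount of data, while the cost of a remap is a single table update regardless of size. That is one plausible reading of why the reported benefit grows with the data size, though the actual mechanism and timing model are described only in the thesis itself.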
English Abstract
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Motivation
1.2 Organization
Chapter 2 DMMU Design Architecture
2.1 Data Copying and Data Moving in Address Translation Unit
2.2 Distributed Memory Management Unit
2.3 Address Translation Unit
2.4 DDR Memory Controller
2.4.1 Device Operations of DDR SDRAM
2.4.2 Mode Register Definition of DDR SDRAM
2.4.3 Block Diagram of DDR Memory Controller
2.4.4 The Main Controller Module
2.4.5 Summary
Chapter 3 Implementation
3.1 Computer-Aided Design Flow
3.2 Implementation of DMMU Interface
3.3 Circuit Verification
3.4 Functional Verification
3.5 Performance Evaluation
3.5.1 Comparison
3.6 Summary
Chapter 4 Conclusions

List of Tables
Table 2.1 The supervisor and user access privileges and the corresponding PR bits
Table 3.1 The signals on the DMMU interface
Table 3.2 The results of the ATU synthesis
Table 3.3 The results of the DMMU synthesis
Table 3.4 The results of the P&R at core utilization = 0.7
Table 3.5 The results of the P&R at core utilization = 0.8
Table 3.6 The core area of the P&R results for different core utilizations and clock frequencies
Table 3.7 The specification table of the design
Table 3.8 The different parameter values during initial states
Table 3.9 The different parameter values during command states
Table 3.10 The test bench of the ATU
Table 3.11 The clock period of the micro-controller, the DMMU, and the DDR
Table 3.12 The data transfer time of data copying and data moving for different data capacities without ATU
Table 3.13 Total access time of two ATU modes

List of Figures
Figure 1.1 The performance gap of CPU and memory
Figure 1.2 Bandwidth hierarchy of an Imagine stream architecture
Figure 1.3 Time to complete a series of memory references without access scheduling
Figure 1.4 Time to complete a series of memory references with access scheduling
Figure 1.5 Memory access scheduler architecture
Figure 2.1 The page table of traditional address translation
Figure 2.2 The proposed ATU mechanism in the DMMU
Figure 2.3 The proposed DMMU micro-architecture in the streaming memory system of the Imagine stream processor
Figure 2.4 The proposed memory system
Figure 2.5 The block diagrams of the DMMU micro-architecture
Figure 2.6 The flows of the address translation
Figure 2.7 The stream register file organization
Figure 2.8 The translation table of the address translation mode
Figure 2.9 The block diagram of DDR SDRAM 512Mb B-die
Figure 2.10 The simplified state diagram of the DDR SDRAM
Figure 2.11 The mode register set of DDR SDRAM
Figure 2.12 The functional block diagram of the DDR memory controller
Figure 2.13 Initial state diagram of DDR memory controller
Figure 2.14 The command state diagram of DDR memory controller
Figure 3.1 The cell-based design flow
Figure 3.2 The physical level design flow
Figure 3.3 The DMMU interface
Figure 3.4 Layout of DMMU
Figure 3.5 The operation environment of DMMU functional verification
Figure 3.6 The initial state machine of DDR memory controller
Figure 3.7 The burst write and burst read mode of DDR memory controller
Figure 3.8 Access data (PR = 2'b00) under supervisor mode in the DMMU
Figure 3.9 Access data (PR = 2'b01) under supervisor mode in the DMMU
Figure 3.10 Access data (PR = 2'b10) under supervisor mode in the DMMU
Figure 3.11 Access data (PR = 2'b11) under supervisor mode in the DMMU
Figure 3.12 Access data (PR = 2'b00) under user mode in the DMMU
Figure 3.13 Access data (PR = 2'b01) under user mode in the DMMU
Figure 3.14 Access data (PR = 2'b10) under user mode in the DMMU
Figure 3.15 Access data (PR = 2'b11) under user mode in the DMMU
Figure 3.16 Data capacity over limitation in the DMMU
Figure 3.17 Test configuration environment
Figure 3.18 The access time versus the data capacity for different ATU
Figure 3.19 The ratio of the access time without ATU to that with ATU versus the data capacity