
National Digital Library of Theses and Dissertations in Taiwan

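Author: 吳東賢
Author (English): Tony Wu
Thesis title: 適用於多處理機系統內接網路之新匯流排配置演算法
Thesis title (English): A New Bus Allocation Algorithm for Interconnection Networks of Multiprocessor Systems
Advisor: 王國禎
Advisor (English): Kuochen Wang
Degree: Master's
Institution: National Chiao Tung University (國立交通大學)
Department: Department of Computer and Information Science (資訊科學系)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 1999
Academic year of graduation: 87 (ROC calendar)
Language: English
Number of pages: 42
Keywords (Chinese): 仲裁者; 匯流排配置; 內接網路; 多處理機系統
Keywords (English): arbiter; bus allocation; interconnection network; multiprocessor system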
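In shared-memory multiprocessor systems, the interconnection
network is often the bottleneck of system performance. Memory
references usually exhibit two kinds of locality: temporal
locality and spatial locality. If a processor references a
memory module, it tends to reference the same memory module
again. If the bus connecting the processor and the memory module
is not released immediately, it can be used directly, without
reconfiguration, when the processor references the same memory
module again. We therefore propose a new bus allocation algorithm
for interconnection networks of multiprocessor systems. A
pipelined one-sided crossbar switch, which is essentially a
multiple-bus architecture, is used to illustrate the design
approach. Connection tables record the state of each processor,
each memory module, and each bus. When a transaction is selected
for execution and the processor and the memory module are not
connected, a bus is selected and the pipelined switch is
reconfigured to handle the transaction. If the processor and the
memory module are already connected, the same bus is used
directly without reconfiguration. Experimental data show that the
new algorithm uses buses more effectively and reduces the number
of reconfigurations. Performance with the new bus allocation
algorithm is 1.5 to 3 times higher than with the original. In
addition, a 2 × 2 pipelined one-sided crossbar switch was
described in the Verilog hardware description language and
implemented on a Xilinx FPGA. The Verilog simulation results
verified the functionality of the pipelined design. An Aptix MP3A
FPCB (Field Programmable Circuit Board) and some electronic
components were used to emulate the pipelined switch and the
behavior of the processors and the memory modules, respectively.
The contribution of this research is the design of a
high-throughput interconnection network that matches
high-performance processors and thereby eliminates the
performance bottleneck.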

In shared-memory multiprocessor systems, the interconnection network
is usually the bottleneck of system performance. Memory
references usually have two kinds of locality: temporal
locality and spatial locality. If a processor references a
memory module, it tends to reference the same memory module
again. If we do not immediately release the bus that connects the
processor and the memory module, we can use the same bus
directly without reconfiguration when the processor references
the same memory module again. Thus, we propose a new bus
allocation algorithm for interconnection networks of
multiprocessor systems. A pipelined one-sided crossbar switch,
which is essentially a multiple bus, is used to illustrate our
design approach. We use connection tables to record the states
of each processor, each memory module, and each bus. When a
transaction is selected to be issued, if the processor and the
memory module are not connected, we select a bus and
reconfigure the pipelined switch to process the transaction.
If the processor and the memory module have already been
connected, we use the same bus directly without reconfiguration.
Experimental results show that the new algorithm uses buses more
efficiently and reduces the number of reconfigurations. The
performance (throughput) using the new bus allocation algorithm
is 1.5 to 3 times higher than that using the original algorithm.
In addition, we have described and realized a 2 × 2 pipelined
one-sided crossbar switch using Verilog HDL and Xilinx FPGAs,
respectively. Verilog simulation results have validated the
functionality of the pipelined design. We also use an Aptix MP3A
FPCB (Field Programmable Circuit Board) and some electronic
components to emulate the pipelined switch and the behavior
of the processors and the memory modules, respectively.
The contribution of this work is the design of a high-throughput
interconnection network that matches high-performance
multiprocessors and eliminates the performance bottleneck.
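
The allocation rule described in the abstract can be illustrated
with a small software model. The sketch below is not the thesis's
hardware design: the class name BusAllocator, its fields, and the
choice of which connection to tear down when no bus is free are
all assumptions made for illustration. It also assumes that
transactions are issued one at a time, so a retained connection is
always idle when a new transaction arrives.

```python
from typing import Dict, Tuple


class BusAllocator:
    """Toy connection-table model (illustrative names, not the thesis's design)."""

    def __init__(self, num_buses: int, retain: bool = True) -> None:
        self.num_buses = num_buses
        # retain=False models releasing the bus after every transaction.
        self.retain = retain
        self.bus_of_proc: Dict[int, int] = {}   # processor table: processor -> bus
        self.bus_of_mem: Dict[int, int] = {}    # memory table: memory module -> bus
        self.conn_of_bus: Dict[int, Tuple[int, int]] = {}  # bus table: bus -> (proc, mem)
        self.reconfigurations = 0

    def _release(self, bus: int) -> None:
        """Remove a connection from all three tables."""
        proc, mem = self.conn_of_bus.pop(bus)
        del self.bus_of_proc[proc]
        del self.bus_of_mem[mem]

    def issue(self, proc: int, mem: int) -> int:
        """Serve one transaction and return the bus that carried it."""
        bus = self.bus_of_proc.get(proc)
        if bus is None or self.conn_of_bus[bus] != (proc, mem):
            # Not connected: tear down idle connections that involve this
            # processor or this memory module (assumed to be idle).
            for stale in {self.bus_of_proc.get(proc), self.bus_of_mem.get(mem)}:
                if stale is not None:
                    self._release(stale)
            free = [b for b in range(self.num_buses) if b not in self.conn_of_bus]
            if free:
                bus = free[0]
            else:
                # Assumed policy: evict the oldest retained connection.
                bus = next(iter(self.conn_of_bus))
                self._release(bus)
            self.bus_of_proc[proc] = bus
            self.bus_of_mem[mem] = bus
            self.conn_of_bus[bus] = (proc, mem)
            self.reconfigurations += 1
        # Otherwise the pair is already connected and the bus is reused.
        if not self.retain:
            self._release(bus)
        return bus


# A short trace with locality: each processor keeps returning to one module.
trace = [(0, 1), (0, 1), (1, 2), (0, 1), (1, 2), (1, 2)]
retaining = BusAllocator(num_buses=2, retain=True)
releasing = BusAllocator(num_buses=2, retain=False)
for proc, mem in trace:
    retaining.issue(proc, mem)
    releasing.issue(proc, mem)
print(retaining.reconfigurations, releasing.reconfigurations)
```

On this hand-made trace the retaining allocator reconfigures only
when a processor and memory module pair is first connected, while
the releasing allocator reconfigures on every transaction. The
trace is illustrative only and says nothing about the 1.5 to 3
times throughput figure reported in the thesis.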

Abstract (in Chinese)
Abstract (in English)
Acknowledgements
Table of Contents
List of Figures
1 Introduction
1.1 Interconnection Networks in Multiprocessor Systems
1.2 Locality
1.3 FPGA
1.4 FPCB
2 Existing Approach
2.1 Pipelined Protocol
2.2 One-Sided Crossbar Switch
2.3 Arbiter
2.4 Processor Interface
2.5 Memory Interface
3 Design Approach
3.1 Connection Table
3.2 New Bus Allocation Algorithm
3.3 An Example
4 Evaluation and Discussion
5 FPGA and FPCB Implementation
5.1 Design and Implementation Flow
5.2 Verilog Simulation
5.3 Implementation
6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
Bibliography

