跳到主要內容

臺灣博碩士論文加值系統

(34.204.181.91) 您好!臺灣時間:2023/09/28 09:43
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

我願授權國圖
: 
twitterline
研究生:林文揚
研究生(外文):Lin,Wen-Yang
論文名稱:平行Cholesky分解考慮下稀疏矩陣之重排列
論文名稱(外文):Reorderings of Sparse Matrices for Parallel Cholesky Factorization
指導教授:陳俊良陳俊良引用關係
指導教授(外文):Chen,Chuen-Liang
學位類別:博士
校院名稱:國立臺灣大學
系所名稱:資訊工程研究所
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:1994
畢業學年度:82
語文別:中文
論文頁數:150
中文關鍵詞:平行計算稀疏矩陣重排列矩陣分解
外文關鍵詞:Parallel ComputingSparse MatrixReorderingFactorization
相關次數:
  • 被引用被引用:0
  • 點閱點閱:215
  • 評分評分:
  • 下載下載:0
  • 收藏至我的研究室書目清單書目收藏:0
稀疏矩陣的重排列對矩陣分解的效能影響至鉅.一個好的重排列演算法不
僅僅是對循序計算機極為重要,同時也會導致較佳的負載分配,進而提高
平行計算機的效能.儘管長久以來,較少的插入元素一直是評斷循序排列
方法的優劣的標準;然而至今仍無一致的標準來評斷平行排列方法.回顧
過去關於平行排列的研究,大都以消去樹的高度作為衡量的標準.換言之
,較矮的樹優於較高的樹.然而隨著平行架構的日趨複雜及各種平行矩陣
分解法的相繼出現,要取得一適合各式情況的衡量標準似乎是極為困難.
任意使用消去樹的高度作為各種情況下的衡量標準是極危險的.由一些特
殊的例子顯示,消去樹的高度不僅無法反應真正的平行計算時間且忽略了
通訊所需的額外負擔.此外,它所獲致的速率增加比也遠低於所期望的值
.在這個論文中我們提出一個重排列衡量標準的組織架構.根據一些平行
架構考慮的因素,我們希望將各種平行稀疏矩陣分解法所需要花的時間降
至最低.因此我們提出一個適合各種衡量標準的重排列方法.其所產生的
重排列能保証平行矩陣分解所需的時間是所有重排列中最短的.從一些稀
疏矩陣的實驗結果進一步証明此重排列方法的可行性.
Reordering of sparse matrices is essential for good performance
on Cholesky factorizations. A good reordering algorithm not
only is crucial on a sequential computer, but also can lead to
a much better load balance on a parallel computer, and thus to
a dramatic increase in performance compared to a "naive"
ordering. Though low fill has been long served as an
appropriate criterion for measuring the quality of sequential
ordering methods, either in space or time, there is not yet an
agreed upon objective function for the parallel ordering
problem. Reviewing the previous works on parallel ordering
problem, most have used elimination tree height as the
criterion for comparing orderings; that is, with shorter trees
assumed to be superior to taller trees. But with more
complicated the parallel architectures have been evolving into
and quite a variety of parallel factorization algorithms have
been developed, it seems hardly to achieve a well agreed upon
criterion suitable for all situations. The discrepancy of
simply using elimination tree height as the criterion for
different situations is comprehensively illustrated with some
contrived examples. As the results shown, the height of
elimination tree not only inept to indicate the actual parallel
computing time, but also neglect the underlying communication
overhead. Additionally, the speedup it can achieve is far less
than the expected. The underlying idea in this dissertation is
that we propose a framework of reordering criteria aiming to
minimize the completion time of various parallel sparse
Cholesky factorizations with respect to some architecture
issues. Under the framework we have presented a general
reordering scheme which yields an equivalent reordering that
always guarantees minimum completion time among all equivalent
reorderings of the filled graph. Empirical results on a
careful selection from the Harwell-Boeing sparse matrices
collection further assent the attraction of this reordering
scheme.
Cover
Contents
Acknowledgments
Abstract
Abstract in Chinese
Chapter 1 Introduction
1.1 Direct Solution of Sparse Linear Systems
1.1.1 Symbolic Factorization
1.1.2 Task Scheduling
1.1.3 Triangular Solution
1.2 Some Graph Theory Notations
1.3 Cholesky Factorization
1.4 Organization of this Disseration
Chapter 2 Overview of Ordering Problem
2.1 Perfect Ordering and Equivalent Reordering
2.2 Ordering for Sequential Factorization
2.3 Ordering for Parallel Factorization
2.4 Previous Related Work
Chapter 3 Criteria for Reorderings
3.1 A Reordering Framework
3.2 Evaluation Functions of Ordering Criteria
3.3 Parallel Completion Time Criterion
3.4 Consideration on Insufficient Bankwidth
3.5 Comparative Analysis of Parallel Sparse Cholesky Factorizations
3.5.1 Theoretical Results
3.5.2 An example
3.6 Deficiencies of Elimination Tree Height Criterion
3.6.1 Contrived Examples
3.6.3 Comparison with Minimum Height Criterion
Chapter 4 A Generic Algorithm for Optimal Reorderings
4.1 General Description
4.2 Minimum Completion Time Property
4.3 Combination of Criteria
4.4 Detailed Discussions
4.4.1 Representation of the Filled Graph
4.4.2 Implementation of the Simplicial Set
4.4.3 Calculation of the Completion Time
4.4.4 Detailed Description and Complexity Analysis
4.4.5 Improvement by Simplicial Clique Concept
Chapter 5 Implementation and Experimental Results
5.1 Test Data Statistics
5.2 Comparison and Analysis
5.2.1 Computation Issue
5.2.2 Communication Issue
5.3 Summary
Chapter 6 Conclusions
6.1 Effects of Task Granularity
6.2 More on the Two-Phase Ordering Approach
Bibliography
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top