National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Student: 甘宗左
Student (English): Tsung-Tso Kan
Title (Chinese): 稀疏多鋒Cholesky分解之處理器配置與工作排程
Title (English): The Processor Allocation and Task Scheduling for Sparse Multifrontal Cholesky Factorization
Advisor: 陳俊良
Advisor (English): Chuen-Liang Chen
Degree: Ph.D.
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Academic Field: Electrical and Computer Engineering
Thesis Type: Academic thesis
Publication Year: 2000
Graduation Academic Year: 88 (1999-2000)
Language: Chinese
Pages: 111
Keywords (Chinese): 稀疏矩陣、多鋒方法、分散式記憶體系統、平行虛擬機器、處理器配置、工作排程、Cholesky分解
Keywords (English): sparse matrix, multifrontal method, distributed memory environment, PVM, processor allocation, task scheduling, Cholesky factorization
Statistics:
  • Citations: 0
  • Views: 274
  • Downloads: 0
  • Bookmarks: 0
In this dissertation, we study the processor allocation and task scheduling problems of sparse multifrontal Cholesky factorization in shared and distributed memory environments. Although many researchers have studied this topic, most of them considered only processor allocation and neglected task scheduling.
In our study of sparse multifrontal Cholesky factorization, we found that parallel Cholesky factorization is closely tied to the structure of the elimination tree. Some previous work considered processor allocation only for balanced elimination trees, yet in practical applications elimination trees are almost always unbalanced. Moreover, previous work emphasized load balance and therefore adopted a cyclic allocation strategy (WRAP), which incurs much unnecessary communication. We also observed that sparse multifrontal Cholesky factorization usually exhibits limited parallelism, so it is worth asking whether using many processors on a problem of low parallelism is appropriate: several studies have found that, in a distributed environment, using fewer processors can sometimes achieve the same parallel execution time as using more, or even a shorter one.
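The elimination tree that drives the parallelism discussed above is a standard structure: node j's parent is the smallest row index i > j with a nonzero L[i][j] in the Cholesky factor. As background, the following is a compact sketch of the usual construction (in the style of Liu's algorithm, with path compression) from the lower-triangular nonzero pattern; the function and variable names are illustrative, not taken from the dissertation.

```python
def elimination_tree(rows):
    """Compute the elimination tree of a symmetric sparse matrix.

    rows[i] lists the column indices k < i where the matrix has a
    nonzero in row i (lower-triangular pattern).  Returns parent,
    where parent[j] is j's elimination-tree parent, or -1 for a root.
    """
    n = len(rows)
    parent = [-1] * n
    ancestor = [-1] * n          # path-compressed links to speed up climbing
    for i in range(n):
        for k in rows[i]:
            # Climb from k to the current root of its partial subtree.
            while ancestor[k] != -1 and ancestor[k] != i:
                next_k = ancestor[k]
                ancestor[k] = i  # path compression: point straight at i
                k = next_k
            if ancestor[k] == -1:
                ancestor[k] = i
                parent[k] = i    # first time this root is absorbed by i
    return parent
```

Nodes in disjoint subtrees of this tree can be eliminated independently, which is why the tree's (usually unbalanced) shape governs how much parallelism an allocation can exploit.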
For the processor allocation and task scheduling of sparse multifrontal Cholesky factorization in shared and distributed memory environments, we propose a new method called the small-enough sub-tree allocation method (SE method). Our method not only lets sparse multifrontal Cholesky factorization reach near-optimal execution time with fewer processors than previous methods, but also avoids much unnecessary communication.
We implemented both the previous methods and ours on a cluster of 8 PCs running Windows 95 and PVM, connected by a 10 Mbps Ethernet network, and ran them with the same parallel sparse multifrontal Cholesky factorization program. Our experiments show that, compared with the previous methods, our processor allocation and task scheduling method gives the parallel program the best performance.
In this dissertation, we discuss the parallel sparse multifrontal Cholesky factorization problem for both shared and distributed memory multiprocessor systems. Several articles have already discussed this topic, but most of them focus on processor allocation without considering task scheduling.
We address not only processor allocation but also task scheduling for the parallel sparse multifrontal Cholesky factorization problem. Parallel Cholesky factorization is strongly related to the structure of the elimination tree, which is usually unbalanced in practice. Some existing methods handle only balanced elimination trees; this is not sufficient, so we propose a method suitable for both balanced and unbalanced trees, named the small-enough sub-tree allocation method. Furthermore, the elimination tree carries many useful properties, such as node computation and communication costs, which provide additional information for exploiting parallelism during scheduling. Previous methods use only a few of these properties, whereas our method exploits them all.
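The abstract names the idea but not the algorithm, so the following is only a rough illustration of the general "small-enough sub-tree" principle, not the dissertation's actual SE method: map each maximal subtree whose estimated cost fits under a threshold to a single processor, so all communication inside that subtree stays local. All names (`allocate_small_subtrees`, `threshold`) and the one-processor-per-subtree numbering are assumptions for the sketch; the real method must also schedule the nodes above the cut and reuse processors there.

```python
def allocate_small_subtrees(parent, cost, threshold):
    """Illustrative subtree-to-processor allocation (NOT the SE method itself).

    parent[j] is the elimination-tree parent of node j (-1 for the root);
    cost[j] is an estimated work cost for node j.  Each maximal subtree
    whose total cost is <= threshold is assigned whole to one processor.
    Returns a dict mapping subtree-root node -> processor id.
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    root = -1
    for j, p in enumerate(parent):
        if p == -1:
            root = j
        else:
            children[p].append(j)

    subtotal = [0] * n
    def total(j):                 # total cost of the subtree rooted at j
        subtotal[j] = cost[j] + sum(total(c) for c in children[j])
        return subtotal[j]
    total(root)

    assignment = {}
    next_proc = [0]
    def assign(j):
        if subtotal[j] <= threshold:
            # Whole subtree is small enough: give it a processor of its own,
            # so no inter-processor communication occurs inside it.
            assignment[j] = next_proc[0]
            next_proc[0] += 1
        else:
            for c in children[j]:
                assign(c)
    assign(root)
    return assignment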
We built a PVM-based distributed memory environment and implemented a real parallel sparse multifrontal Cholesky factorization solver. In this environment we compared several existing processor allocation and task scheduling methods with ours. Our experiments show that our method performs best.
COVER
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER 1 INTRODUCTION
1.1 LINEAR SYSTEM
1.2 DIRECT SOLUTIONS OF SPARSE LINEAR SYSTEMS
1.2.1 ORDERING FOR SPARSE MATRICES
1.2.2 SYMBOLIC FACTORIZATION
1.2.3 TRIANGULAR SOLUTION
1.3 SOME GRAPH THEORY NOTATION
1.4 MULTIFRONTAL APPROACH
1.5 PROCESSOR ALLOCATION AND TASK SCHEDULING
1.6 PVM: PARALLEL VIRTUAL MACHINE
1.7 THE GOAL AND THE OVERVIEW
CHAPTER 2 PREVIOUS METHODS
2.1 COST EVALUATION
2.2 THE SPARSE-WRAP ALLOCATION METHOD (SW)
2.3 THE SUB-TREE TO SUB-CUBE ALLOCATION METHOD (SS)
2.4 THE BIN-PACK ALLOCATION METHOD (BP)
2.5 THE PROPORTIONAL ALLOCATION METHOD (PR)
2.6 DISCUSSION
CHAPTER 3 THE SMALL-ENOUGH SUB-TREE ALLOCATION
3.1 INTRODUCTION
3.2 THE ALLOCATION METHOD
3.2.1 THE SMALL-ENOUGH SUB-TREE ALLOCATION METHOD
3.2.2 DISTRIBUTED MEMORY VERSION OF THE SE METHOD
3.2.3 SHARED MEMORY VERSION OF THE SE METHOD
3.3 A CASE STUDY
3.3.1 DISTRIBUTED MEMORY ENVIRONMENT
3.3.2 SHARED MEMORY ENVIRONMENT
3.4 DISCUSSION
CHAPTER 4 THE EXPERIMENTS
4.1 A SET OF TEST MATRICES
4.2 THE PVM ON WINDOWS 95/98/NT
4.3 EXPERIMENTAL APPROACH
4.4 VALIDITY OF ESTIMATION MODEL
4.5 REMARKS
CHAPTER 5 THE COMPARISON
5.1 COMMUNICATION FREE ENVIRONMENT
5.2 COMPARISON OF THE COMMUNICATION OVERHEAD
5.3 COMPARISON ON DIFFERENT COMMUNICATION ENVIRONMENT
CHAPTER 6 CONCLUSION
6.1 SUMMARY
6.2 FUTURE EFFORTS
REFERENCES
APPENDIX A - CURRENT LINEAR ALGEBRA PACKAGES
APPENDIX B - COMMUNICATION FREE RESULTS
APPENDIX C - DISTRIBUTED MEMORY RESULTS