跳到主要內容

臺灣博碩士論文加值系統

(44.200.140.218) 您好!臺灣時間:2024/07/26 00:12
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

: 
twitterline
研究生:吳哲豪
研究生(外文):Che-Hao Wu
論文名稱:網格化系統開發且應用於高效率建構演化樹機制之研究
論文名稱(外文):An Effective Approach for Constructing the Phylogenetic Tree on a Grid-Based Architecture
指導教授:劉興民
學位類別:碩士
校院名稱:國立中正大學
系所名稱:資訊工程研究所
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2004
畢業學年度:92
語文別:英文
中文關鍵詞:生物資訊網格運算演化樹
相關次數:
  • 被引用被引用:0
  • 點閱點閱:234
  • 評分評分:
  • 下載下載:10
  • 收藏至我的研究室書目清單書目收藏:2
網格計算是當今所提出的一個新概念,其概念就是結合網路上分散的電腦資
源、儲存裝置及其他資源,來處理需要巨量運算資源的龐大問題,它的出現對於
一些需高計算量、計算時間長的系統及應用程式帶來很大的好處。過去幾年來,
網格計算逐漸受到學術研究社群的注意,高能物理、生物資訊以及資料視覺化等
領域皆積極探究網格技術的應用。目前生物資訊專家則正發展生物網格技術,以
探究並解決長久以來生物學中所面臨到大量科學運算及資料儲存的挑戰。
在生物資訊領域中,科學家經常利用物種的相關資訊來推斷其演化關係,並
以二元樹的型式來加以表示,我們稱這棵二元樹為演化樹或親緣樹。針對一堆物
種來建造演化樹的問題通常稱為演化問題(或親緣問題),這類問題的困難處在於
可能存在演化樹的數量是非常巨大的,隨著物種的增加,徹底列舉出所有可能演
化關係將是非常困難的一件事。因此發展一套更快速的方法來建造演化樹是有其
必要性的。
在這篇論文中,我們試著研究、探討一套基於quartet 概念且結合網格計算
環境的建構演化樹方法。我們的方法包含了幾個步驟,首先建立一個相似矩陣,
記錄任兩物種間的演化關係,並根據此相似矩陣及computer node(CU)的運算能
力來分配物種以期望能快速找到候選的(candidate) quartet,每個CU 將會進行相
容性(compatible)檢查以獲得相容的quartet,然後使用評估函式(evaluate function)
來決定下一輪廻的種子(quartet seed),並打散其他不是種子的quartet 重新分配給
新的CU。如此,在下一輪廻中每個CU 將擁有種子及其他的quartet 以重新進行
運算。不斷重覆著這些步驟,直到我們找到最後的compatible quartet,並利用其
所得資訊來建構演化樹。
在論文研究的過程中,我們除了設計一套有效率的建構演化樹方法外,也發
展出一個以網格計算為基礎的環境,並將其應用到需要高計算量的演化問題。在
未來,我們相信這個系統也能擴展到其他生物資訊領域的應用上,以期更有效率
地進行資料的蒐尋、整合、分析及儲存

1 Introduction 1
1.1 ResearchMotivation . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Significance of theWork . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Approach Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 Related Work 7
2.1 Methods for Phylogeny Problem . . . . . . . . . . . . . . . . . . . . . 7
2.1.1 Distance BasedMethod . . . . . . . . . . . . . . . . . . . . . 7
2.1.2 Character BasedMethod . . . . . . . . . . . . . . . . . . . . . 9
2.2 Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.1 Concepts of Grid . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.2 Grid Based Systems . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.3 BioGrid Related Systems . . . . . . . . . . . . . . . . . . . . . 13
3 Approaches 15
3.1 Grid Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.1.1 The Globus Project . . . . . . . . . . . . . . . . . . . . . . . . 15
3.1.2 The Globus Toolkit . . . . . . . . . . . . . . . . . . . . . . . . 16
3.1.3 Resource and DataManagement . . . . . . . . . . . . . . . . . 16
3.1.4 Information Service . . . . . . . . . . . . . . . . . . . . . . . . 17
3.1.5 Grid Security Infrastructure . . . . . . . . . . . . . . . . . . . 18
3.2 System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.1 Architectural Design . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.2 Architectural Components . . . . . . . . . . . . . . . . . . . . 21
3.3 Relevant Terminologies . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.3.1 Splits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.3.2 Quartet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.4 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.4.1 Creation of SimilarityMatrix . . . . . . . . . . . . . . . . . . 25
3.4.2 Decomposition of the Original Problem . . . . . . . . . . . . . 26
3.4.3 Selection of Four Species for Quartet . . . . . . . . . . . . . . 29
3.4.4 Inferring Quartet Topology . . . . . . . . . . . . . . . . . . . . 30
3.4.5 Tree Construction Algorithm . . . . . . . . . . . . . . . . . . 33
3.4.6 Evaluate Function . . . . . . . . . . . . . . . . . . . . . . . . 34
4 Implementation and Experiments 40
4.1 Implementation of Grid System . . . . . . . . . . . . . . . . . . . . . 40
4.1.1 Backend Computing . . . . . . . . . . . . . . . . . . . . . . . 40
4.1.2 Middle Tier . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.1.3 Client Tier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.2 Experiments and Performance Analysis . . . . . . . . . . . . . . . . . 46
4.2.1 Operating Environment of the Experiments . . . . . . . . . . 46
4.2.2 Description of Test Data . . . . . . . . . . . . . . . . . . . . . 47
4.2.3 Execution Time . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.2.4 CU Utilization . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.2.5 Parameter of Evaluate Function . . . . . . . . . . . . . . . . . 50
4.2.6 Quality of the Result of Constructed Tree . . . . . . . . . . . 51
4.2.7 Comparison withML . . . . . . . . . . . . . . . . . . . . . . . 52
4.2.8 Summary of Experiments . . . . . . . . . . . . . . . . . . . . 53
5 Conclusions 67

[1] Richa Agarwala and David Fernandez-Baca. A polynomial-time algorithm for
the perfect phylogeny problem when the number of character states is xed. In
IEEE Symposium on Foundations of Computer Science, pages 140—147, 1993.
[2] Hans Oisson Andrzej Lingas and Anna Ostlin. Ecient merging and construction
of evolutionary trees. In Journal of Algorithms, volume 41, pages 41—51, 2001.
[3] The Asia Pacic BioGRID Initiative. http://www.apbionet.org/apbiogrid/.
[4] H. J. Baudelt and A. Dress. Reconstructing the shape of a tree from observed
dissimilarity data. In Advances in Applied Mathematics, number 7, pages 309—
343, 1986.
[5] Vincent Berry, Tao Jiang, Paul E. Kearny, Ming Li, and ToddWareham. Quartet
cleaning: improved algorithms and simulations. In European Symposium on
Algorithms, pages 313—324, 1999.
[6] BioGRID. http://biogrid.icm.edu.pl/.
[7] The Canadian BioGrid. http://www.cbr.nrc.ca/.
[8] The Biomedical Informatics Research Network. http://www.nbirn.net/.
[9] David Bryant. Building trees, hunting for trees, and comparing trees.
Ph.D.Thesis. University of Canterbury, NZ., 1997.
[10] David Bryant and Mike Steel. Constructing optimal trees from quartets. In
Journal of Algorithms, volume 38, pages 237—259, 2001.
[11] L. Carmel, Y. Koren, and D. Harel. Visualizing and classifying odors using a similarity
matrix. In the 9th International Symposium on Olfaction and Electronic
Nose (ISOEN02), 2002.
[12] Damon Shing Min Liu Chuen Jr Shiu. A generic functional architecture for the
development of compute-intensive environments: a case study in the use of Grid.
In Master thesis, National Chung Cheng University, 2003.
[13] Mary Cryan, Leslie Ann Goldberg, and Paul W. Goldberg. Evolutionary trees
can be learned in polynomial time in the two-state general markov model. In
IEEE Symposium on Foundations of Computer Science, pages 436—445, 1998.
[14] M. Csuros. Fast recovery of evolutionary trees with thousands of nodes. In
RECOMB, 2001.
[15] P. Erdos, M. Steel, L. Sz, and T. Warnow. Local quartet splits of a binary tree
infer all quartet splits via one dyadic inference rule. In Computers and Articial
Intelligence, pages 217—227, 1997.
[16] Peter L. Erdos, Kenneth Rice, Michael A. Steel, Laszlo A. Szekely, and Tandy J.
Warnow. The short quartet method. 1997.
[17] J. Felsenstein. Evolutionary trees from DNA sequences: a maximum likelihood
approach. In J. Molecular Evolution, number 17, pages 368—376, 1981.
[18] J. Felsenstein. Numerical methods for inferring evolutionary trees. In Quarterly
Review of Biology, number 57, pages 379—404, 1982.
[19] Nir Friedman, Matan Ninio, Itsik Peter, and Tal Pupko. A structural EM algorithm
for phylogenetic inference. volume 9, pages 331—353, 2002.
[20] The Globus Project Web Page. http://www.globus.org.
[21] Jens Gramm and Rolf Niedermeier. Minimum quartet inconsistency is xed
parameter tractable. volume 2089, pages 241—271, 2001.
[22] D. Huson. Splitstree: A program for analyzing and visualizing evolutionary data.
In CABIOS, 1997.
[23] Tao Jiang, Paul E. Kearney, and Ming Li. Orchestrating quartets: approximation
and data correction. volume 4, pages 416—425, 1998.
[24] Katherine St. John, Tandy Warnow, Bernard M.E. Moret, and Lisa Vawter.
Performance study of phylogenetic methods: (unweighted) quartet methods and
neighbor-joining. In ACM-SIAM Symposium on Discrete Algorithms (SODA),
2001.
[25] Je A. Jones and Katherine A. Yelick. Parallelizing the phylogeny problem.
1995.
[26] I. Foster K. CZajkowski, S. Fitzgerald and C.kesselman. Grid information services
for distributed resource sharing. In In proceedings of the Tenth IEEE International
Symposium on High-Performance Distributed Computing(HPDC-10),
pages 181—194, 2001.
[27] Kannan and Warnow. A fast algorithm for the computation and enumeration
of perfect phylogenies when the number of character states is xed. In SODA:
ACM-SIAM Symposium on Discrete Algorithms (A Conference on Theoretical
and Experimental Analysis of Discrete Algorithms), 1995.
[28] Paul E. Kearney. The ordinal quartet method. In RECOMB, pages 125—134,
1998.
[29] J. Kim and T. Warnow. Tutorial on phylogenetic tree estimation. 1999.
[30] R.C.T. Lee. Computational biology. Department of Computer Science and Information
Engineering, National Chi-Nan University, 2001.
[31] H. Matsuda. Protein phylogenetic inference using maximum likelihood with a
genetic algorithm. In Pacific Symposium on Biocomputing, pages 512—523, 1996.
[32] Bernard M. E. Moret, David A. Bader, and Tandy Warnow. High performance
algorithm engineering for computational phylogenetics. In Lecture Notes in Computer
Science, volume 2074, pages 1012—1021, 2001.
[33] The North Carolina BioGrid Project. http://www.ncbiogrid.org/.
[34] L. Szekely P. Erdros, M. Steel and T. Warnow. Constructing big trees from short
sequences. In Proceedings of the 24th International Colloquium on Automata,
Languages, and Programming, 1997.
[35] The EUROGRID project. http://www.eurogrid.org/.
[36] N Saitou and M Nei. The neighbor-joining method: a new method for reconstructing
phylogenetic trees. In Quarterly Review of Biology, number 4, pages
406—425, 1987.
[37] L. A. Salter. Algorithms for phylogenetic tree reconstruction. In Proceeding
of the International Conference on Mathematics and Engineering Techniques in
Medicine and Biological Sciences, volume 2, pages 459—465, 2000.
[38] K. Strimmer and A. von Haeseler. Quartet puzzling: a quartet maximumlikelihood
method for reconstructing tree topologies. In Molecular Biology and
Evolution, volume 13(7), pages 964—969, 1996.
[39] Paul kearney Tao Jiang and Ming Li. A polynomial time approximation scheme
for inferring evolutionary trees from quartet topologies and its application. In
SIAM Journal on Computing, volume 30, pages 1942—1961, 2001.
[40] C. Tuey and M. Steel. Links between maximum likelihood and maximum
parsimony under a simple model of site substitution. In Bulletin of Mathematical
Biology, pages 581—607, 1997.
[41] Z. Yang. Evaluation of several methods for estimating phylogenetic trees when
substitution rates dier over nucleotide sites. volume 40, pages 689—697, 1995.

QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top