跳到主要內容

臺灣博碩士論文加值系統

(44.200.140.218) 您好!臺灣時間:2024/07/26 00:20
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

: 
twitterline
研究生:李宛茜
研究生(外文):Wan-Chian Li
論文名稱:混合樹之間的複合距離之研究
論文名稱(外文):A Study on Compound Distance Problems between Two Mixture Trees
指導教授:阮夙姿
指導教授(外文):Justie Su-Tzu Juan
學位類別:碩士
校院名稱:國立暨南國際大學
系所名稱:資訊工程學系
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2010
畢業學年度:98
語文別:英文
論文頁數:88
中文關鍵詞:演化樹混合樹生物計算距離親緣關係樹
外文關鍵詞:phylogenetic treeevolutionary treemixture treedistancetree comparison
相關次數:
  • 被引用被引用:0
  • 點閱點閱:233
  • 評分評分:
  • 下載下載:10
  • 收藏至我的研究室書目清單書目收藏:0
演化樹是一種描述物種演化關係的樹狀圖,在圖上我們可以看出物種演化的過程及物種和物種間的祖先是哪一個物種。這樣的樹可以經由物種資訊計算得到,由電腦的高速計算我們可以得到大量計算出來的演化樹,樹狀圖的比較是測量兩樹狀圖的相似程度,這是在統計學及生物資訊上重要的議題。
混合樹是在2006年由Chen和Lindsay提出的重要演化樹。混合樹帶有比傳統的演化樹更多的資訊,包含突變時間、突變位置集合等。於2008年Lin及Juan提出了一個演算法去計算二棵混合樹之距離,但是由於這個演算法所計算的距離只考慮突變時間所造成的距離,因此我們在此提出演算法去考慮突變位置集合所造成的距離。這樣一來我們便可利用我們的方法及Lin及Juan的方法去計算二棵混合樹的突變時間及突變位置集合的完整距離- 複合距離。
在本論文中,我們共定義了三種以突變位置集合所造成的距離,分別稱之為突變距離、簡單突變距離、總合突變距離。首先定義突變距離,並發展一個O(n2MaxHeight(T1, T2))演算法,其中MaxHeight(T1, T2)代表T1與T2兩棵樹中高度較高的那棵樹的高度,但為要整合Lin及Juan所提出的演算法,因此另外用Lin及Juan所提出的演算法概念發展了一個O(n2MaxHeight(T1, T2))的演算法。接著改進這個演算法使其時間複雜度縮減為O(n2)。其次,為了可以更快速的計算出二棵混合樹的突變距離,我們定義另一個測量混合樹之距離方式-簡單突變距離,發展了一個O(nlogn)的演算法,並改進時間複雜度,另外提出一個O(n)的演算法。由於突變距離及簡單突變距離計算方式相似,並且我們認為每條葉節點到根節點之路徑上的每個點在每個位置發生突變的總次數,亦可以用來當作測量距離的資料,因此定義了第三種距離定義-總合突變距離,並發展一個O(n)演算法。
The evolutionary tree is an important topic in bioinformation. In 2006, Chen and Lindsay proposed a new method to build the mixture tree from DNA sequences. Mixture tree is an evolutionary tree, and it has two information. One of the information is time parameter, and the other is the set of mutated sites. In 2008, Lin and Juan proposed an algorithm to compute the distance between two mixture trees. Their algorithm computes the distance with only considering the time parameter between two mixture trees. In this work, we proposes three methods to measure the similarity of two mixture trees with considering the set of mutated sites and develop algorithms to compute the distance between two mixture trees.
In this thesis, we give three new definitions of distance, called mutated distance, simple mutated distance, and total mutated distance between two mixture trees by considering the set of mutated sites. For mutated distance, we give three algorithms to compute the mutated distance between two mixture trees. The time complexity are O(n2MaxHeight(T1, T2)), O(n2MaxHeight(T1, T2)) and O(n2), respectively. Where MaxHeight(T1, T2) is the maximum height of tree T1 and T2. For simple mutated distance, we design two algorithms. The time complexity are O(nlogn) and O(n), respectively. Finally, we think that the total mutated number of each site in every path between root and leaf can be used to compute the distance between two mixture trees, so we define the total mutated distance, and design an algorithm for total mutated distance in O(n).
Acknowledgement I
Abstract in Chinese II
Abstract III
Contents IV
List of Figures VI
List of Tables VII
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Preview This Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 Preliminary 7
2.1 Some Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Useful Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3 Mutated Distance 13
3.1 The Definition of Mutated Distance . . . . . . . . . . . . . . . . . . . . 13
3.2 Basic Algorithm for Mutated Distance . . . . . . . . . . . . . . . . . . 16
3.2.1 The Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2.2 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.3 Modified Algorithm for Mutated Distance . . . . . . . . . . . . . . . . 22
3.3.1 The Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.3.2 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.3.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.4 Improved Algorithm for Mutated Distance . . . . . . . . . . . . . . . . 29
3.4.1 The Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.4.2 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.4.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4 Simple Mutated Distance 36
4.1 The Definition of Simple Mutated Distance . . . . . . . . . . . . . . . 36
4.2 Basic Algorithm of Simple Mutated Distance . . . . . . . . . . . . . . . 38
4.2.1 The Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.2.2 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.2.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.3 Improved Algorithm of Simple Mutated Distance . . . . . . . . . . . . 41
4.3.1 The Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.3.2 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.3.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5 Total Mutated Distance 45
5.1 The Definition and Algorithm for Total Mutated Distance . . . . . . . 45
5.2 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
6 Conclusion 50
References 51
A Appendix 54
[1] M. A. Bender and M. Farach-Colton, “The lca problem revisited,” Latin American Theoretical Informatics, pp. 88–94, 2000.
[2] J. Bluis, D. Shin, and J. Bluis, “Nodal distance algorithm: calculating a phylogenetic tree comparison metric,” Proc. Third IEEE Symposium on Bioinformatics and Bioengineering (D. Shin, ed.), pp. 87–94, 2003.
[3] G. S. Brodal, R. Fagerberg, and C. N. S. Pedersen, “Computing the quartet distance between evolutionary trees in time O(nlogn),” Algorithmica, Vol. 38, No. 2, pp. 377–395, 2003.
[4] S.-C. Chen and B. G. Lindsay, “Building mixture trees from binary sequence data,” Biometrika, Vol. 93, No. 4, pp. 843–860, 2006.
[5] B. Dasgupta, X. He, T. Jiang, M. Li, J. Tromp, and L. Zhang, “Proceedings of the dimacs workshop on discrete problems with medical applications, dimacs series in discrete mathematics and theoretical computer science,” American Mathematical Society, Vol. 55, pp. 125–143, 2000.
[6] R. P. Grimaldi, Discrete and Combinatorial Mathematics. Fifth Edition. Addison Wesley, July 2003.
[7] G.Valiente, Algorithms on Trees and Graphs. Secaucus, NJ, USA: Springer-Verlag New York, Inc., 2002.
[8] J. L. Kelley, General Topology I: Basic Concepts and Constructions Dimension Theory. Encyclopaedia of Mathematical Sciences, 1990.
[9] R. C. T. Lee, R. C. Chang, S. S. Tseng, and Y. T. Tsai, Introduction to the Design and Analysis of Algorithms. McGraw-Hill Education, 2005.
[10] M. L. Lesperance and J. D. Kalbeisch, “An algorithm for computing the nonparametric mle of a mixing distribution,” Journal of the American Statistical Association, Vol. 87, No. 417, pp. 120–126, Mar. 1992.
[11] C.-H. Lin and J. S.-Z. Juan, “Computing the mixture distance between mixture tree,” Proceedings of the 2008 international conference on bioinformatice computational
biology, Vol. I, pp. 98–103, 2008.
[12] C.-H. Lin, “A study on measuring distance between two mixture trees,” In Partial Fulfillment of the Requirements for the Degree of Master of Science, Department of Computer Science and Information Engineering National Chi Nan University, Puli, Nantou Hsien, Taiwan, Republice of China, Juan 2008.
[13] C. A. Meacham, G. F. Estabrook, and F. R. McMorris, “Comparison of undirected phylogenetic trees based on subtrees of four evolutionary units,” Systematic Zoology,
Vol. 34, No. 2, pp. 193–200, 1985.
[14] S. T. Richard C. Deonier and M. S. Waterman, Computational Genome Analysis. Springer, 2005.
[15] D. F. Robinson and L. R. Foulds, “Comparison of phylogenetic trees,” Mathematical Biosciences, Vol. 53, No. 1–2, pp. 131–147, February 1981.
[16] S. Sahni, E. Horowitz, and S. Anderson-Freed, Fundamentals of data structures in C. Computer Science Press, 1993.
[17] N. Saitou and M. Nei, “The neighbor-joining method: a new method for reconstructing phylogenetic trees,” Mol Biol Evol, Vol. 4, No. 4, pp. 406–425, Jul 1987.
[18] M. A. Steel, “The maximum likelihood point for a phylogenetic tree is not unique,” Systematic Biology, Vol. 43, pp. 560–564, 1994.
[19] M. A. Steel and D. Penny, “Distributions of tree comparison metrics-some new results,” Systematic Biology, Vol. 42, No. 2, pp. 126–141, 1993.
[20] D. Tsur, “Faster algorithms for guided tree edit distance,” Information Processing Letters, Vol. 108, No. 4, pp. 251–254, 31 October 2008.
[21] G. Valiente, “A fast algorithmic technique for comparing large phylogenetic trees,” SPIRE, pp. 370–375, 2005.
[22] M. A. Weiss, Data Structures And Algorithm Analysis In C. Pearson Education Taiwan Ltd., 2005.
[23] D. B. West, Introduction to Graph Theory 2=e. Prentice Hall, 2001.
[24] W. T. Williams and H. T. Clifford, “On the comparison of two classifications of the same set of elements,” Taxon, Vol. 20, No. 4, pp. 519–522, Aug. 1971.
[25] J. Xu, Theory and Application of Graphs. Kluwer Academic, 2003.
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top