(3.236.175.108) 您好!臺灣時間:2021/02/28 03:49
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果

詳目顯示:::

我願授權國圖
: 
twitterline
研究生:何文娟
研究生(外文):Wen-chuan Ho
論文名稱:小字符集的限制最長共同子序列之快速演算法
論文名稱(外文):A Fast Algorithm for the Constrained Longest Common Subsequence Problem with Small Alphabet
指導教授:楊昌彪楊昌彪引用關係
指導教授(外文):Chang-Biau Yang
學位類別:碩士
校院名稱:國立中山大學
系所名稱:資訊工程學系研究所
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2017
畢業學年度:105
語文別:英文
論文頁數:89
中文關鍵詞:小字符集動態規劃相似度限制最長共同子序列最長共同子序列
外文關鍵詞:Small alphabetDynamic ProgrammingSimilarCLCSLCS
相關次數:
  • 被引用被引用:0
  • 點閱點閱:79
  • 評分評分:系統版面圖檔系統版面圖檔系統版面圖檔系統版面圖檔系統版面圖檔
  • 下載下載:1
  • 收藏至我的研究室書目清單書目收藏:0
給定A與B兩條序列以及一條限制序列C,限制最長共同子序列問題即是找出A與B的最長共同子序列,而且此序列必須包含限制序列C。Chin等學者提出一個動態規劃的演算法來解決限制最長共同子序列問題,但是此方法需要計算完整的三維矩陣。我們觀察CLCS的三維矩陣發現,在小字符數的情況下,大多數的對應點在相鄰兩層的數值都相同,也就是說這些數值相同的點應該可以直接使用不需要重新計算。因此,在本論文中,我們定義需要計算以及不需要計算的點。在大多數的情況之下,我們只需要計算特定邊界點的數值,不需要計算完整的三維矩陣。但是,我們的演算法在最糟的情況之下仍然需要計算完整的三維矩陣,所以時間複雜度與空間複雜度分別需要O(mnr)以及O(mn)。根據Deorowicz and Obstój在2010年論文的實驗結果,在字符數小於等於20的時候,Chin的方法會有不錯的執行效率。而且,根據我們的實驗結果,在字符數小於等於20的時候,我們演算法的執行效率比Chin的方法更好。所以,我們的演算法在小字符數時會比現有的限制最長共同子序列演算法更有效率。
Given three sequences A, B and C with lengths of m, n and r, respectively,
the constrained longest common subsequence (CLCS) problem is to fi nd the LCS of
A and B which contains C as the subsequence of the answer. The dynamic program-
ming (DP) for solving CLCS, proposed by Chin et al., calculates two-dimensional
lattices layer by layer. We find that the values of most corresponding CLCS lattice
cells are identical in two consecutive layers when |∑| is small, where |∑| denotes the
alphabet size. In this thesis, we clarify which lattice cells need to be calculated, and
which lattices need not be calculated, since the calculation is redundant if two cells
are equal in two consecutive layers. Accordingly, our algorithm calculates only some
special boundary cells, instead of the whole 3-dimensional lattice in most cases. Our
algorithm still requires O(mnr) time and O(mn) space to solve the CLCS problem
in the worst case. In 2010, Deorowicz and Obst oj showed that the algorithm of Chin
et al. has good performance when |∑| ≤ 20. As our experimental results show, our
algorithm is faster than Chin''s algorithm when |∑| ≤ 20. So our algorithm is better
than most of the previous CLCS algorithms when |∑| is small.
學位論文審定書. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . i
THESIS VERIFIVATION FORM . . . . . . . . . . . . . . . . . . . . . . ii
論文公開授權書. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
謝辭. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
摘要. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Chapter 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Chapter 2. Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1 The Longest Common Subsequence Problem . . . . . . . . . . . . . . 6
2.2 The Edit Distance of Two Sequences . . . . . . . . . . . . . . . . . . 8
2.3 The Constrained Longest Common Subsequence Problem . . . . . . . 10
2.3.1 The Constrained Longest Common Subsequence Algorithm of Tsai . . . . . . . . . . 11
2.3.2 The Constrained Longest Common Subsequence Algorithm of Chin et al. . . . . . 12
2.3.3 The Constrained Longest Common Subsequence Algorithm of Deorowicz and Obst oj . . 16
Chapter 3. Our Algorithm for the Constrained Longest Common Subsequence Problem . . . . . 19
3.1 The Constrained Longest Common Subsequence Algorithm . . . . . . 19
3.2 Our Edit Distance Algorithm for the Constrained LCS Problem . . . 35
Chapter 4. Experimental Results . . . . . . . . . . . . . . . . . . . . . . 36
Chapter 5. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Appendixes
A. Miscellaneous Experimental Results . . . . . . . . . . . . . . . . . . 52
H.-Y. Ann, C.-B. Yang, C.-T. Tseng, and C.-Y. Hor, "Fast algorithms for computing
the constrained lcs of run-length encoded strings," Theoretical Computer
Science, Vol. 432, pp. 1-9, 2012.

O. Arbell, G. M. Landau, and J. S. Mitchell, "Edit distance of run-length
encoded strings," Information Processing Letters, Vol. 83, pp. 307-314, 2002.

A. N. Arslan and O. Omer E gecio glu, "Algorithms for the constrained longest
common subsequence problems," International Journal of Foundations of Com-
puter Science, Vol. 16, No. 6, pp. 1099-1109, 2005.

D. Becerra, W. Soto, L. Nino, and Y. Pinz on, "An algorithm for constrained
LCS," ACS/IEEE International Conference on Computer Systems and Applica-
tions, AICCSA 2010, Hammamet, Tunisia, May 16-19, 2010, Revised Selected
Papers, pp. 237-246, 2010.

F. Y. L. Chin, A. D. Santis, A. L. Ferrara, N. L. Ho, and S. K. Kim, "A simple
algorithm for the constrained sequence problems," Information Processing
Letters, Vol. 90(4), pp. 175-179, 2004.

S. Deorowicz, "Fast algorithm for constrained longest common subsequence
problem.," Theoretical and Applied Informatics, Vol. 19, No. 2, pp. 91-102,
2007.

S. Deorowicz, "Bit-parallel algorithm for the constrained longest common subsequence
problem," Fundamenta Informaticae, Vol. 99, pp. 409-433, 2010.

S. Deorowicz and A. Danek, "Bit-parallel algorithms for the merged longest
common subsequence problem," International Journal of Foundations of Com-
puter Science, Vol. 24, pp. 1281-1298, 2013.

S. Deorowicz and J. Obst oj, "Constrained longest common subsequence computing
algorithms in practice," Computing and Informatics, Vol. 29, pp. 427-
445, 2010.

D. He and A. N. Arslan, "A space-e cient algorithm for the constrained
pairwise sequence alignment problem," Genome Informatics, Vol. 16, No. 2,
pp. 237-246, 2005.

D. S. Hirschberg, "A linear space algorithm for computing maximal common
subsequences," Communications of the ACM, Vol. 18, No. 6, pp. 341-343, 1975.

K.-S. Huang, C.-B. Yang, K.-T. Tseng, H.-Y. Ann, and Y.-H. Peng, "Effi cient
algorithms for finding interleaving relationship between sequences," Informa-
tion Processing Letters, Vol. 105, pp. 188-193, 2008.

J. W. Hunt and T. G. Szymanski, "A fast algorithm for computing longest
common subsequences," Communications of the ACM, Vol. 20, No. 5, pp. 350-
353, 1977.

C. S. Iliopoulos and M. S. Rahman, "New effi cient algorithms for the LCS and
constrained LCS problems," Information Processing Letters, Vol. 106, No. 1,
pp. 13-18, 2008.

J. B. KRUSKAL, "An overview of sequence comparison: the warps, string edits,
and macromolecules," SIAM Review, Vol. 25, No. 2, pp. 201-237, 1977.

J. Liu, G. Huang, Y.Wang, and R. Lee, "Edit distance for a run-length-encoded
string and an uncompressed string," Information Processing Letters, Vol. 105,
pp. 12-16, 2007.

W. J. Masek, "A faster algorithm computing string edit distance," Journal of
Computer and System Sciences, Vol. 20, pp. 18-31, 1980.

N. Nakatsu, Y. Kambayashi, and S. Yajima, "A longest common subsequence
algorithm suitable for similar text strings," Acta Informatica, Vol. 18, pp. 171-
179, 1982.

D. B. Needleman and C. D. Wunsch, "A general method applicable to the
search for similarities in the amino acid sequence of two proteins," Journal of
Molecular Biology, Vol. 48, pp. 443-453, 1970.

C.-L. Peng, "An approach for solving the constrained longest common subsequence
problem." http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/search,
Master''s Thesis, Department of Computer Science and Engineering, National
Sun Yat-Sen University, Kaohsiung, Taiwan, 2003.

Y.-H. Peng, C.-B. Yang, K.-S. Huang, C.-T. Tseng, and C.-Y. Hor, "E -
fficient sparse dynamic programming for the merged LCS problem with block
constraints," International Journal of Innovative Computing, Information and
Control, Vol. 6, pp. 1935-1947, 2010.

Y.-H. Peng, C.-B. Yang, K.-S. Huang, and K.-T. Tseng, "An algorithm and
applications to sequence alignment with weighted constraints," International
Journal of Foundations of Computer Science, Vol. 21, pp. 51-59, 2010.

Z. S. Peng and H. F. Ting, "Time and space effi cient algorithms for constrained
sequence alignment," Implementation and Application of Automata: 9th Inter-
national Conference, CIAA 2004, Kingston, Canada, July 22-24, 2004, Revised
Selected Papers, pp. 237-246, 2005.

A. M. Rahman and M. S. Rahman, "Effictive sparse dynamic programming
algorithms for merged and block merged LCS problems," Journal of Computers,
Vol. 9, No. 8, pp. 1743-1754, 2014.

Y. T. Tsai, "The constrained longest common subsequence problem," Informa-
tion Processing Letters, Vol. 88, pp. 173-176, 2003.

R. Wagner and M. Fischer, "The string-to-string correction problem," Journal
of the ACM, Vol. 21, No. 1, pp. 168-173, 1974.

W. L. Wang, "Longest common subsequence with constraint," Master''s Thesis,
Department of Computer Science and Information Engineering, National Chi-
Nan University, Nantou, Taiwan, 2006.

S.Wu, U. Manber, and G. Myers, "An O(NP) sequence comparison algorithm,"
Information Processing Letters, Vol. 35, pp. 317-323, 1990.
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top
系統版面圖檔 系統版面圖檔