The traditional exact string matching problem is to find all locations of a pattern string with length m in a text with length n. Here we propose a new encoding method to shorten the both lengths of pattern and text by substituting the substring between a special character for its length in O(m+n). Then we use an exact matching algorithm to solve the exact string matching problem on the encoding pattern and text. As can be seen、by using the encoding method、the pattern and text can be shortened about 2/|Σ| times the lengths of the original ones. In practice、it performs better than 2/|Σ|. For instance、for an English sentence pattern whose length is 50 and a text whose length is 200000、in average、the pattern is shortened to 6% of its original length and the text is shortened to 12.4% of its original length. Thus、the exact matching can be done in a much shorter time.
|