|
When a computer system holds a huge volume of data, we need an information retrieval system to find the data we want. Many information retrieval systems are based on an inverted index. The main problem with the inverted index is that the inverted file tends to be very large. In this thesis, we propose a new method, called the "range index", for storing the index lists, which reduces the size overhead of the inverted index. Our method also lets users flexibly control the tradeoff between query time and inverted file size. We propose four algorithms for constructing a range index, discuss them, and compare the size of the inverted file each one produces and the time each one takes to construct it. The effect of the range index on query time differs from one condition to another. We also collected *. We can see the relationship between query time and these different conditions of the range index.
|