( 您好!臺灣時間:2021/03/08 00:00
字體大小: 字級放大   字級縮小   預設字形  


研究生(外文):I-Hsiang Su Wang
論文名稱(外文):ICQ-Tree: An Inverted Code Quadtree for the Top-K Spatial Keyword Query over the Incremental Database
指導教授(外文):Ye-In Chang
外文關鍵詞:Spatial DatabaseTop-kQuadtreeKeyword SearchInverted Index
  • 被引用被引用:0
  • 點閱點閱:50
  • 評分評分:系統版面圖檔系統版面圖檔系統版面圖檔系統版面圖檔系統版面圖檔
  • 下載下載:0
  • 收藏至我的研究室書目清單書目收藏:0
隨著地理空間資料的快速發展,空間資料庫的應用也愈來愈廣泛。像是我們在Google Map上輸入關鍵字`restaurant'',系統便會顯示位於你附近的多間餐廳。或是在Instagram上傳自動定位的照片,並且可以使用hashtag來為照片做分類。這些地理資訊的應用,為我們生活帶來許多方便及樂趣。由於空間物件及關鍵字物件的數量增加,資料的儲存及搜尋的效率也就顯得更加重要。一個空間關鍵字查詢是由空間資訊和關鍵字資訊組成。空間可能是地圖上的一個點、線或形狀;而關鍵字是指一段描述或使用者的註解。在一個空間查詢中回傳前k個符合關鍵字條件的結果,我們稱之為前k個空間關鍵字搜尋。Zhang學者提出的IL-Quadtree結構,結合inverted index和linear quadtree來有效減少儲存空間。為進一步提高前k個空間關鍵字查詢演算法的效率,他們提出signature的過濾技術。在查詢的過程中,此方法檢查signature來判斷該節點是否為候選節點。如此便可減少走訪節點的次數。然而,在他們的方法中,必須建立大量的quadtree來儲存物件資料。對於n個關鍵字而言,會有n個quadtree被建立。在進行signature檢查時,必須交叉檢查每個IL-Quadtree。這將會花費較多的時間來處理資料。因此,在本論文中,我們提出了一個ICQ-tree的方法。ICQ-tree結合quadtree以及我們從inverted index改進而成的inverted code。而我們的方法貢獻如下:首先我們只建立一個ICQ-tree來儲存資料,而不是對於資料庫中的n個關鍵字建立n個quadtree。再則,我們每個節點記錄不同的位置,且每個物件只記錄一次。最後,我們透過在每一個ICQ-tree節點中儲存inverted code來加強ICQ-tree。當我們發現其中一個關鍵字絕對不會出現在某節點時,我們可以馬上省略搜尋該節點以及所有該節點的子節點。從我們的實驗數據顯示,我們所提出的ICQ-tree方法會比IL-Quadtree方法來的更有效率。
With the rapid development of geo-spatial data, the use of spatial databases has become more and more widespread. For example, if we enter the keyword ‘restaurant’ on Google Map, the system will display a number of restaurants near you, or upload photos with the location information in Instagram and uses hashtags to classify photos. The application of the geographic information has brought a lot of convenience and interests to our life. As the number of spatial objects and keyword objects increases, the data storage and the search efficiency become more important. A spatial keyword query consists of the spatial information and the textual information. Spatial information can be expressed as a point, a line, or the shape on a map; a keyword is the description or the user’s comment. In the space, for querying the top-k objects that match the keyword conditions, it is called the top-k spatial keyword query. Zhang et al. propose a data structure called the IL-Quadtree which combines the inverted index and the linear quadtree for reducing the storage size effectively. In order to further improve the efficiency of the top-k spatial keyword query algorithms, they propose a signature filtering technique. During the query process, it checks the signature to verify whether the node is a candidate node or not. This will reduce the times of traces around nodes. However, in their approach, a large number of quadtrees must be built to store data of objects. For n keywords, it creates n trees. When performing a signature check, each IL-Quadtree must be cross-checked. This will take long time to process the data. Therefore, in this thesis, we propose the ICQ-tree algorithm. Our ICQ-tree combines the quadtree and the inverted code that we have improved from the inverted index. The contribution of our approach is as follows. First, we only construct an ICQ-tree to store data, instead of building n quadtrees for the n keywords in the database. Second, each node of the ICQ-tree records the non-repetitive location, and each object is recorded only once. Finally, we enhance each node in the ICQ-tree with the inverted code. We can prune a node and all of its child nodes immediately, once we know that one of the query keywords is definitely not in the node. From our simulation results, we show that our proposed approach is more efficient than the IL-Quadtree algorithm.
1. Introduction 1
1.1 The Top-k Spatial Keyword Query 1
1.2 Keyword Matching Approaches 3
1.3 Spatial Index Structure 7
1.4 Related Works 9
1.5 Motivation 11
1.6 Organization of the Thesis 12
2. A Survey of Algorithms for Top-k Spatial Keyword Query 13
2.1 Density Based Spatial Keyword Query 13
2.1.1 Density Based Search 14
2.1.2 The IR-Tree 15
2.2 NNS with Keywords Base on SI-Index 17
2.2.1 The IR2-Tree 17
2.2.2 The Spatial Inverted Index (SI-Index) 19
2.3 The Inverted Linear Quadtree 21
3. An ICQ-Tree Based Algorithm 25
3.1 Data Structure 25
3.1.1 The Quadtree 25
3.1.2 The ICQ-Tree Structure 26
3.1.3 The Inverted Code 30
3.2 Top-k Spatial Keyword Query Processing 31
3.3 Incremental Database 35
3.4 Comparison 37
4. Performance 45
4.1 The Performance Model 45
4.2 Experiments Results 47
4.2.1 Synthetic Dataset 48
4.2.2 Real Datasets 53
5. Conclusion 58
5.1 Summary 58
5.2 Future Work 59
[1] N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger, “The R*-tree: An
Efficient and Robust Access Method for Points and Rectangles,” Proc. of the
1990 ACM SIGMOD Int. Conf. on Management of Data, Vol. 19, No. 2, pp. 322–
331, June 1990.
[2] X. Cao, G. Cong, C. S. Jensen, and B. C. Ooi, “Collective Spatial Keyword
Querying,” Proc. of the 2011 Int. Conf. on Management of Data, pp. 373–384,
[3] Y. I. Chang, C. H. Liao, and H. L. Chen, “NA-Trees: A Dynamic Index for
Spatial Data,” Journal of Information Science and Engineering, Vol. 19, No. 1,
pp. 103–139, Jan. 2003.
[4] Y.-Y. Chen, T. Suel, and A. Markowetz, “Efficient Query Processing in Geographic
Web Search Engines,” Proc. of the 2006 ACM SIGMOD Int. Conf. on
Management of data, pp. 277–288, 2006.
[5] D. W. Choi and C. W. Chung, “Nearest Neighborhood Search in Spatial
Databases,” Proc. of the 2015 IEEE 31st Int. Conf. on Data Engineering,
pp. 699–710, 2015.
[6] I. De Felipe, V. Hristidis, and N. Rishe, “Keyword Search on Spatial Databases,”
Proc. of the 2008 IEEE 24th Int. Conf. on Data Engineering, pp. 656–665, 2008.
[7] A. Dimitriou, D. Theodoratos, and T. Sellis, “Top-k-Size Keyword Search on
Tree Structured Data,” Information Systems, Vol. 47, No. 1, pp. 178–193, Jan.
[8] C. Faloutsos and S. Christodoulakis, “Signature files: An Access Method for
Documents and Its Analytical Performance Evaluation,” ACM Trans. on Information
Systems, Vol. 2, No. 4, pp. 267–288, Oct. 1984.
[9] H. Fang, , P. Zhao, , V. S. Sheng, , J. Wu, , J. Xu, , A. Liu, , and Z. Cui, “Effective
Spatial Keyword Query Processing on Road Networks,” Proc. of the 26th
Australasian Database Conf. on Databases Theory and Applications, Vol. 9093,
pp. 194–206, 2015.
[10] R. A. Finkel and J. L. Bentley, “Quad Trees A Data Structure for Retrieval on
Composite Keys,” Acta Informatica, Vol. 4, No. 1, pp. 1–9, March 1974.
[11] Y. Gao, J. Zhao, B. Zheng, and G. Chen, “Efficient Collective Spatial Keyword
Query Processing on Road Networks,” IEEE Trans. on Intelligent Transportation
Systems, Vol. 17, No. 2, pp. 469–480, Feb. 2016.
[12] A. P. Gopinath and A. Salim, “An Approach for Faster Processing of Top-k Spatial
Keyword Queries,” Proc. of the 2015 Int. Conf. on Control Communication
and Computing India, pp. 622–627, 2016.
[13] A. Guttman, “R-Trees: A Dynamic Index Structure for Spatial Searching,” Proc.
of the 1984 ACM SIGMOD Int. Conf. on Management of Data, pp. 47–57, 1984.
[14] Z. Li, K. C. K. Lee, B. Zheng, W. C. Lee, D. Lee, and X. Wang, “IR-Tree: An
Efficient Index for Geographic Document Search,” IEEE Trans. on Knowledge
and Data Engineering, Vol. 23, No. 4, pp. 585–599, April 2011.
[15] G. Morton, “A Computer Oriented Geodetic Data Base; and a New Technique
in File Sequencing,” IBM, Ottawa, Canada, 1966.
[16] Y. Qiu, T. Ohmori, T. Shintani, and H. Fujita, “A New Algorithm for m-Closest
Keywords Query over Spatial Web with Grid Partitioning,” Proc. of the 2015
IEEE/ACIS 16th Int. Conf. on SNPD, pp. 1–8, 2015.
[17] T. K. Sellis, N. Roussopoulos, and C. Faloutsos, “The R+-tree: A Dynamic
Index for Multi-dimensional Objects,” Proc. of the 13th VLDB Conf., pp. 507–
518, 1987.
[18] S. B. Shinde and M. K. Chavan, “Fast Nearest Neighbor Search with Keyword
using Compressed Inverted Index,” Proc. of the 2016 Int. Conf. on Advanced
Communication Control and Computing Technologies, pp. 546–550, 2016.
[19] Y. Tao and C. Sheng, “Fast Nearest Neighbor Search with Keywords,” IEEE
Trans. on Knowledge and Data Engineering, Vol. 26, No. 4, pp. 878–888, April
[20] X. Wang, Y. Zhang, W. Zhang, X. Lin, and W. Wang, “AP-Tree: Efficiently
Support Continuous Spatial-Keyword Queries over Stream,” Proc. of the 2015
IEEE 31st Int. Conf. on Data Engineering, pp. 1107–1118, 2015.
[21] D. Wu, G. Cong, and C. S. Jensen, “A Framework for Efficient Spatial Web
Object Retrieval,” The VLDB Journal, Vol. 21, No. 6, pp. 797–822, Dec. 2012.
[22] C. Zhang, Y. Zhang, W. Zhang, and X. Lin, “Inverted Linear Quadtree: Efficient
Top K Spatial Keyword Search,” IEEE Trans. on Knowledge and Data
Engineering, Vol. 28, No. 7, pp. 1706–1721, July 2016.
[23] L. Zhang, X. Sun, and H. Zhuge, “Density-Based Spatial Keyword Querying,”
Future Generation Computer Systems, Vol. 32, No. 1, pp. 211–221, March 2014.
[24] R. Zhong, G. Li, K.-L. Tan, and L. Zhou, “G-Tree: An Efficient Index for KNN
Search on Road Networks,” IEEE Trans. on Knowledge and Data Engineering,
Vol. 27, No. 8, pp. 2175–2189, Aug. 2015.
[25] J. Zobel, A. Moffat, and K. Ramamohanarao, “Inverted Files Versus Signature
Files for Text Indexing,” ACM Trans. on Database Systems, Vol. 23, No. 4,
pp. 453–490, Dec. 1998.
電子全文 電子全文(網際網路公開日期:20230729)
第一頁 上一頁 下一頁 最後一頁 top
系統版面圖檔 系統版面圖檔