(3.230.154.160) 您好!臺灣時間:2021/05/08 00:24
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果

詳目顯示:::

: 
twitterline
研究生:Muzwandile Z. W. Makhubu
研究生(外文):Muzwandile Z. W. Makhubu
論文名稱:使用R-tree進行度量空間中頂端k物件處裡之探討
論文名稱(外文):A Study on Top-k Dominance in Metric Space using R-trees
指導教授:劉傳銘劉傳銘引用關係
口試委員:劉傳銘, 王正豪, 陳震宇
口試日期:2016-07-15
學位類別:碩士
校院名稱:國立臺北科技大學
系所名稱:資訊工程系研究所
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2016
畢業學年度:104
中文關鍵詞:distance-basedmetric spaceprogressivequerytop-k dominance
外文關鍵詞:distance-basedmetric spaceprogressivequerytop-k dominance
相關次數:
  • 被引用被引用:0
  • 點閱點閱:56
  • 評分評分:系統版面圖檔系統版面圖檔系統版面圖檔系統版面圖檔系統版面圖檔
  • 下載下載:3
  • 收藏至我的研究室書目清單書目收藏:0
Top-k dominating queries are an important tool for ‘similarity search’ in database and decision support applications. A top-k dominating query returns k data items with the highest dominance in a dataset. It combines the advantages of two powerful preference query techniques – the top-k query and the skyline query. It does so while mitigating their individual disadvantages. A great deal of work has been done on solving the top-k dominating query problem on a multivariate dataset where data items are defined as multidimensional points. Most of the work has handled the case where the dataset is static and to a lesser extent distance-based dynamic data. In this work the top-k dominating query is performed over metric space data which poses different challenges to those encountered in the abovementioned multivariate data. In this scenario we have data objects and their distances to a set of input query objects, and these distances can change dynamically as input query objects are generated. Two algorithms are developed to solve the problem of top-k dominating queries in metric space. Typically metric space index structures such as the M-tree would be used in such a situation. In this work we show how to efficiently use R-trees in a metric instead. Moreover the paper also investigates means to reduce the memory footprint of the processing algorithm. This is an important direction as this makes the top-k dominating query solution applicable to wireless broadcast environments where the processing nodes may have limited resource (e.g. wireless sensor networks). We were able to show that the R-tree can be effectively used in indexing data that would typically be indexed using metric space indexes. Moreover the algorithms described are capable of finding the top-k dominating results without first finding the exact dominance score of an object. The two algorithms proposed are called Direct-Top-k Dominating (D-TKD) Query and Enhanced-Top-k Dominating (E-TKD) Query algorithm. The D-TKD algorithm solves the problem without the use of any sophisticated indexing scheme, while the E-TKD employs the R-tree as an index. We show the performance of these two algorithms for different; dataset size, query set size and size of result. We demonstrate the performance improvement that is a result of using the R-tree index for the E-TKD method.
Top-k dominating queries are an important tool for ‘similarity search’ in database and decision support applications. A top-k dominating query returns k data items with the highest dominance in a dataset. It combines the advantages of two powerful preference query techniques – the top-k query and the skyline query. It does so while mitigating their individual disadvantages. A great deal of work has been done on solving the top-k dominating query problem on a multivariate dataset where data items are defined as multidimensional points. Most of the work has handled the case where the dataset is static and to a lesser extent distance-based dynamic data. In this work the top-k dominating query is performed over metric space data which poses different challenges to those encountered in the abovementioned multivariate data. In this scenario we have data objects and their distances to a set of input query objects, and these distances can change dynamically as input query objects are generated. Two algorithms are developed to solve the problem of top-k dominating queries in metric space. Typically metric space index structures such as the M-tree would be used in such a situation. In this work we show how to efficiently use R-trees in a metric instead. Moreover the paper also investigates means to reduce the memory footprint of the processing algorithm. This is an important direction as this makes the top-k dominating query solution applicable to wireless broadcast environments where the processing nodes may have limited resource (e.g. wireless sensor networks). We were able to show that the R-tree can be effectively used in indexing data that would typically be indexed using metric space indexes. Moreover the algorithms described are capable of finding the top-k dominating results without first finding the exact dominance score of an object. The two algorithms proposed are called Direct-Top-k Dominating (D-TKD) Query and Enhanced-Top-k Dominating (E-TKD) Query algorithm. The D-TKD algorithm solves the problem without the use of any sophisticated indexing scheme, while the E-TKD employs the R-tree as an index. We show the performance of these two algorithms for different; dataset size, query set size and size of result. We demonstrate the performance improvement that is a result of using the R-tree index for the E-TKD method.
Abstract i
Acknowledgement iii
Table of Contents iv
List of Tables vi
List of Figures vii
1. Chapter 1: Introduction 1
1.1. Study Background 1
1.2. Problem Overview 1
1.3. Contribution 9
1.4. Roadmap 11
2. Chapter 2: Background 12
2.1. Terms and Notation 12
2.1.1. Query Related Symbols Terms and Notation 12
2.1.2. R-tree Related Symbols and Terms 13
2.1.3. The Metric Space 13
2.2. Problem Definition 14
2.2.1. The Dataset 15
2.2.2. Tasks Involved 16
2.2.3. Overview of Our Proposed Solution 16
2.3. Motivation 17
3. Chapter 3: Literature Review 21
3.1. Data Indexing 21
3.2. Nearest-Neighbour Processing 23
3.3. Top-k Queries 24
3.4. Skyline Queries 25
3.5. Top-k Dominating Queries 26
4. Chapter 4: Methodology 28
4.1. Methodology 28
4.2. Dominance Calculation 36
4.3. Index Adaptation 37
4.3.1. Optimizing bounding disc size 38
5. Chapter 5: Proposed Algorithms 41
5.1. Algorithm 1: Direct-TKD Algorithm 41
5.2. Algorithm 2: Enhanced-TKD Algorithm 44
6. Chapter 6: Experimental Results 49
6.1. Size of Dataset 50
6.2. Size of Query Set 55
6.3. Size of Result - k 56
6.4. Query Set Distribution 58
7. Chapter 7: Conclusion 60
References 61
[1]W. Dargie and C. Poellabauer, Fundamentals of Wireless Sensor Networks: Theory and Practice. West Sussex: Wiley, 2010.
[2]P. Kumar, L. Reddy, and S. Varma, “Distance Measurement and Error Estimation Scheme for RSSI Based Localization in Wireless Sensor Networks,” in Proceedings of the 5th IEEE Conference on Wireless Communication and Sensor Networks (WCSN), 2009, pp. 1–4.
[3]D. Papadias, Y. Tao, G. Fu, and B. Seeger, “Progressive skyline computation in database systems,” ACM Trans. Database Syst., vol. 30, no. 1, pp. 41–82, 2005.
[4]S. Ranu, “Applications of Top- k Representative Queries,” in Proceedings of Semantic Web Information Management on Semantic Web Information Management, 2014, pp. 5–6.
[5]M. L. Yiu and N. Mamoulis, “Efficient processing of top-k dominating queries on multi-dimensional data,” in Proceedings of the 33rd international conference on Very large data bases (VLDB ’07 ), 2007, pp. 483–494.
[6]I. Ilyas, G. Beskales, and M. Soliman, “A survey of top-k query processing techniques in relational database systems,” ACM Comput. Surv., vol. 40, no. 4, pp. 1–58, 2008.
[7]E. Tiakas, A. N. Papadopoulos, and Y. Manolopoulos, “Skyline queries: An introduction,” in Proceedings of the 6th International Conference on Information, Intelligence, Systems and Applications, 2016.
[8]X. Miao, Y. Gao, G. Chen, H. Li, and B. Zheng, “Top- k Dominating Queries on Incomplete Data,” IEEE Trans. Knowl. Data Eng, vol. 28, no. 1, pp. 252–266, 2016.
[9]Y. J. Kim and J. Patel, “Performance comparison of the R*-tree and the quadtree for kNN and distance join queries,” IEEE Trans. Knowl. Data Eng., vol. 22, no. 7, pp. 1014–1027, 2010.
[10]M. L. Yiu and N. Mamoulis, “Multi-dimensional top-k dominating queries,” VLDB J., vol. 18, no. 3, pp. 695–718, 2009.
[11]E. Tiakas, G. Valkanas, A. N. Papadopoulos, Y. Manolopoulos, and D. Gunopulos, “Processing Top-k Dominating Queries in Metric Spaces,” ACM Trans. Database Syst., vol. 40, no. 4, pp. 23:1–23:38, 2016.
[12]X. Han, J. Li, and H. Gao, “TDEP: efficiently processing top-k dominating query on massive data,” Knowl. Inf. Syst., vol. 43, no. 3, pp. 689–718, 2015.
[13]J. Zhang, N. Mamoulis, D. Papadias, and Y. Tao, “All-nearest-neighbors queries in spatial databases,” in Proceedings of the 16th International Conference on Scientific and Statistical Database Management, 2004., 2004, pp. 297–306.
[14]P. Ciaccia, M. Patella, and P. Zezula, “M-tree : An Efficient Access Method for Similarity Search in Metric Spaces,” in Proceedings of the 23rd International Conference on Very Large Data Bases, 1997, pp. 426–435.
[15]M. L. Yiu and N. Mamoulis, “Efficient processing of top-k dominating queries on multi-dimensional data,” in Proceedings of the 33rd international conference on Very large data bases, 2007, pp. 483–494.
[16]L. Chen and X. Lian, “Efficient processing of metric skyline queries,” IEEE Trans. Knowl. Data Eng., vol. 21, no. 3, pp. 351–365, 2009.
[17]D. Fuhry, R. Jin, and D. Zhang, “Efficient skyline computation in metric space,” in Proceedings of the 12th International Conference on Extending Database Technology Advances in Database Technology - EDBT ’09, 2009, p. 1042.
[18]J. K. Uhlmann, “Satisfying general proximity / similarity queries with metric trees,” Inf. Process. Lett., vol. 40, no. 4, pp. 175–179, 1991.
[19]G. Wang, X. Zhou, B. Wang, B. Qiao, and D. Han, “A hyperplane based indexing technique for high-dimensional data,” Inf. Sci., vol. 177, no. 11, pp. 2255–2268, 2007.
[20]A. Guttman, “R-Trees: A Dynamic Index Structure For Spatial Searching,” in Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data, 1984, pp. 47–57.
[21]N. Roussopoulos, S. Kelley, and F. Vincent, “Nearest neighbor queries,” in Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data - SIGMOD ’95, 1995, pp. 71–79.
[22]K. L. Clarkson, “Nearest neighbor queries in metric spaces,” Discret. Comput. Geom, pp. 609–617, 1997.
[23]C.-M. Liu and S.-Y. Fu, “Effective protocols for kNN search on broadcast multi-dimensional index trees,” Inf. Syst., vol. 33, no. 1, pp. 18–35, 2008.
[24]S. Berchtold, D. A. Keim, H.-P. Kriegel, and T. Seidl, “Indexing the solution space: a new technique for nearest neighbor search in high-dimensional space,” IEEE Trans. Knowl. Data Eng., vol. 12, no. 1, pp. 45–57, 2000.
[25]A. Vlachou, C. Doulkeridis, K. Nørvåg, and M. Vazirgiannis, “On efficient top-k query processing in highly distributed environments,” in Proc. ACM SIGMOD Int. Conf. on Management of Data, 2008, pp. 753–764.
[26]M. Kontaki, A. N. Papadopoulos, and Y. Manolopoulos, “Continuous top-k dominating queries,” IEEE Trans. Knowl. Data Eng., vol. 24, no. 5, pp. 840–853, 2012.
[27]E. Tiakas and G. Valkanas, “Metric-Based Top-k Dominating Queries,” in Proceedings of the 17th Inter- national Conference on Extending Database Technology (EDBT), 2014, pp. 415–426.
[28]R. Fagin, A. Lotem, and M. Naor, “Optimal aggregation algorithms for middleware,” J. Comput. Syst. Sci., vol. 66, no. 4, pp. 614–656, 2003.
[29]S. Borzsony, D. Kossmann, and K. Stocker, “The Skyline operator,” in Proceedings of the 17th International Conference on Data Engineering, 2001, pp. 1–20.
[30]D. Papadias, Y. Tao, K. Mouratidis, and C. K. Hui, “Aggregate Nearest Neighbor Queries in Spatial Databases,” ACM Trans. Database Syst., vol. 30, no. 2, pp. 529–576, 2005.
[31]C. Zhang, X. Zhou, and C. Gao, “On Improving the Precision of Localization with Gross Error Removal,” in Proceedings of the 28th International Conference on Distributed Computing Systems Workshops, 2008, pp. 144–149.
[32]E. Tiakas, “Top- k Dominating Queries : a Survey,” in Proceedings of the 17th International Symposium on Programming and Systems (ISPS), 2013, pp. 1–10.
連結至畢業學校之論文網頁點我開啟連結
註: 此連結為研究生畢業學校所提供,不一定有電子全文可供下載,若連結有誤,請點選上方之〝勘誤回報〞功能,我們會盡快修正,謝謝!
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top
無相關論文
 
無相關期刊
 
無相關點閱論文
 
系統版面圖檔 系統版面圖檔