跳到主要內容

臺灣博碩士論文加值系統

(216.73.216.110) 您好!臺灣時間:2025/09/28 16:50
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

: 
twitterline
研究生:陳冠綸
研究生(外文):CHEN, KUAN-LUN
論文名稱:無須支配關係檢查之高效率天際線聯結查詢方法
論文名稱(外文):Efficient Processing of Skyline-Join Queries without Dominance Checking
指導教授:林明言
指導教授(外文):LIN,MING-YAN
口試委員:英家慶沈錳坤
口試委員(外文):YING,JIA-CHINGSHEN,MENG-KUN
口試日期:2017-06-28
學位類別:碩士
校院名稱:逢甲大學
系所名稱:資訊工程學系
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2017
畢業學年度:105
語文別:英文
論文頁數:85
中文關鍵詞:天際線查詢連結天際線支配檢查關聯式聯結
外文關鍵詞:Skyline QuerySkyline JoinDominance CheckingRelational Join
相關次數:
  • 被引用被引用:0
  • 點閱點閱:274
  • 評分評分:
  • 下載下載:7
  • 收藏至我的研究室書目清單書目收藏:0
天際線查詢是資料庫研究的重要研究議題,它能依據使用者的偏好,執行支配檢查並回傳使用者有興趣的項目集。天際線的複雜性會隨著資料表的維度上升而增加,因此大多數的研究皆集中在提升單一資料表天際線查詢的效能。
在現實生活中,有很多實務應用需要於兩個以上的資料表上執行天際線查詢,稱為聯結天際線查詢。但直接連結兩個資料表的天際線並無法產生最終結果,因此有效的處理天際線連結變得更加重要,因為項目與維度的數量增加會使天際線的尋找更加困難。過去的連結天際線演算法於連結資料表前刪除不必要的項目並於連結資料表上減少支配檢查的數量。在本篇論文中,我們提出了一個新的演算法稱為SWID (Skyline-join without dominance checking)來解決效能上的問題。 SWID演算法首先劃分資料表並排除不可能的項目,在每一個相同連結屬性的群中尋找當地天際線,接著建構項目的識別群,最後透過交集直接產生最終結果來完全避免在連結表格上執行支配檢查,有效的解決了最根本的問題。我們透過模擬資料與真實資料的實驗結果證實,SWID演算法比近年著名的SEPT演算法好,執行效率平均快了超過100倍;與知名的MSC演算法比較,執行效率平均快了64倍。除此之外,實驗結果也證明了SWID演算法具高效能且有良好的擴展性。

Skyline query has been actively studied in database research. The query performs dominance checking among tuples according to user preferences and returns only the interesting ones. The complexity of finding skyline tuples increases as the number of dimensions of the relation increases, so that most of the studies focus on improving the performance of skyline query on a single relation. In practice, many applications require skyline queries on a relation produced by joining two or more relations, called skyline-join queries. Joining the skyline results of the two relations cannot produce the final skyline. Efficient processing of skyline-join queries thus becomes more important as the increased number of tuples and the increased number of dimensions from join will exacerbate the skyline finding. Previous studies used strategies to prune tuples before join and reduce the number of dominance checks after join. In this study, we propose a novel algorithm called SWID (Skyline-join without dominance checking) to solve the problem efficiently. The SWID algorithm partitions relations and prunes impossible tuples first, finds local skylines in each group of tuples of same join attribute, constructs group identifications for tuples that might become part of the final skyline, and generates directly the final result by cross-products between partitions without any dominance checks after join. Our experiments using synthetic datasets and real datasets show that the SWID algorithm is more than 100 times faster than the SEPT algorithm and 64 times faster than the MSC algorithm in average. In addition, the SWID algorithm has excellent linear scalability.
誌謝……………………………………………………………………………………. i
摘要………………………………………………………………………………….... iii
Abstract iv
Table of Contents v
List of Figures vii
List of Tables ix
Chapter 1 Introduction 1
1.1 Background 1
1.2 Motivation 4
1.3 Research Objectives 5
1.4 Problem Statements 6
1.5 Organization of the Thesis 8
Chapter 2 Related Work 9
2.1 Skyline Query Algorithms 9
2.1.1 Block Nested Loops (BNL) Algorithm 9
2.1.2 Divide and Conquer (D&C) Algorithm 10
2.1.3 Sort Filter Skyline (SFS) Algorithm 11
2.1.4 Branch and Bound Skyline (BBS) Algorithm 12
2.2 Skyline Join Algorithms 14
2.2.1 Sort First Skyline Join (SFSJ) Algorithm 14
2.2.2 Skyline Sensitive Join (S2J) Algorithm 16
2.2.3 Symmetric Skyline Sensitive Join (S3J) Algorithm 20
2.2.4 Skyline join with Early Pruning and Termination (SEPT) Algorithm 21
2.2.5 Multiple Skyline Computations (MSC) Algorithm 23
Chapter 3 The Proposed Algorithm 27
3.1 Preliminaries 27
3.2 Overview of the SWID Algorithm 31
3.3 Phases of the SWID Algorithm 33
3.3.1 Phase One: Join-Group Sorting 33
3.3.2 Phase Two: Local Skyline Finding 34
3.3.3 Phase Three: Dominated Group Construction 35
3.3.4 Phase Four: Direct Skyline Generation 38
3.4 Discussion 45
Chapter 4 Experimental Results 48
4.1 Setup of the Experiments 48
4.2 Experimental Results on Small-Scale Datasets 50
4.3 Varying Number of Skyline Attributes 53
4.4 Varying Join Selectivity 56
4.5 Varying Data Cardinality 59
4.6 Effect of Data Distribution 62
4.7 Experimental Results on Real Datasets 64
4.8 Experimental Results on Three Relations 64
4.8.1 Varying Number of Skyline Attributes 65
4.8.2 Varying Join Selectivity 66
4.8.3 Varying Data Cardinality 67
4.8.4 Effect of Data Distribution 67
Chapter 5 Conclusions 69
5.1 Contributions 69
5.2 Future Work 69
References 71

[1]M. Bai, J. Xin, G. Wang, R. Zimmermann, and X. Wang, “Skyline-join Query Processing in Distributed Databases,” Frontiers of Computer Science, Volume 10, Issue 2, pp. 330-352, 2016.
[2]I. Bartolini, P. Ciaccia, and M. Patella, “SaLSa: Computing the Skyline without Scanning the Whole Sky,” Proceedings of the 2006 ACM International Conference on Information and Knowledge Management, pp. 405-414, 2006.
[3] I. Bartolini, P. Ciaccia, and M. Patella, “Efficient Sort-Based Skyline Evaluation,” ACM Transactions on Database Systems, Volume 33, Issue 4, pp. 31-49, 2008.
[4]A. Bhattacharya and B. P. Teja, “Aggregate Skyline Join Queries: Skylines with Aggregate Operations over Multiple Relations,” Proceedings of the 16th International Conference on Management of Data, pp. 15-26, 2010.
[5]S. Börzsönyi, D. Kossmann, and K. Stocker, “The Skyline Operator,” Proceedings of the 17th International Conference on Data Engineering, pp. 421-430, 2001.
[6]T. Bouadi, M. Cordier, and R. Quiniou. “Incremental Computation of Skyline Queries with Dynamic Preferences,” International Conference on Database and Expert Systems Applications, pp. 219–233, 2012.
[7]S. Chaudhuri, N. Dalvi, and R. Kaushik, “Robust Cardinality and Cost Estimation for Skyline Operator,” Proceedings of the 22nd International Conference on Data Engineering, pp. 1-10, 2006.
[8]L. Chen and X. Lian. “Dynamic Skyline Queries in Metric Spaces,” Proceedings of the 11th International Conference on Extending Database Technology, pp. 333-343, 2008.
[9]J. Chomicki, P. Ciaccia, and N. Meneghetti, “Skyline Queries, Front and Back,” ACM SIGMOD Record, Volume 42, Issue 3, pp. 6-18, 2013.
[10]J. Chomicki, P. Godfrey, J. Gryz, and D. Liang, “Skyline with Presorting,” Proceedings of the 19th International Conference on Data Engineering, pp. 717-719, 2003.
[11]J. Chomicki, P. Godfrey, J. Gryz, and D. Liang, “Skyline with Presorting: Theory and Optimizations,” Proceedings of the Intelligent Information Systems Conference, pp. 595-604, 2005.
[12]M. Endres and W. KieBling, “Semi-Skyline Optimization of Constrained Skyline Queries,” Twenty-Second Australasian Database Conference, pp. 7-16, 2011.
[13]M. Han and K. S. Candan, “Skyline-sensitive Joins with LR-pruning,” Proceedings of the 15th International Conference on Extending Database Technology, pp. 252-263, 2012.
[14]X. Han, J. Li, H. Gao, and C. Yang, “SEPT: An Efficient Skyline Join Algorithm on Massive Data,” Knowledge and Information Systems, Volume 43, Issue 2, pp. 355-388, 2015.
[15]W. Jin, M. Ester, Z. Hu, and J. Han, “The Multi-Relational Skyline Operator,” Proceedings of the 23rd International Conference on Data Engineering, pp. 1276-1280, 2007.
[16]M. L. Mortensen, S. Chester, I. Assent, and M. Magnani, “Efficient Caching for Constrained Skyline Queries,” Proceedings of the 18th International Conference on Extending Database Technology, pp. 337-348, 2015.
[17]M. Nagendra and K. S. Candan, “Efficient Processing of Skyline-Join Queries over Multiple Data Sources,” ACM Transactions on Database Systems, Volume 40, Issue 2, pp. 10:1-10:46, 2015.
[18]A. Nagendra, C. Doulkeridis, and N. Polyzotis, “Skyline Query Processing over Joins,” Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 73-84, 2011.
[19]D. Papadias, Y. Tao, G. Fu, and B. Seeger, “An Optimal and Progressive Algorithm for Skyline Queries,” Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, pp. 467-478, 2003.
[20]D. Papadias, Y. Tao, G. Fu, and B. Seeger, “Progressive Skyline Computation in Database Systems,” ACM Transactions on Database Systems, Volume 30, Issue 1, pp. 41-82, 2005.
[21]Y. Park, J.-K. Min, and K. Shim, “Parallel Computation of Skyline and Reverse Skyline Queries Using MapReduce,” Proceedings of the VLDB Endowment, Volume 6, Issue 14, pp. 2002–2013, 2013.
[22]Y. Park, J.-K. Min, and K. Shim, “Efficient Processing of Skyline Queries Using MapReduce,” IEEE Transactions on Knowledge and Data Engineering, Volume 29, Issue 5, pp. 1031-1044, 2017.
[23]M. Sharifzadeh and C. Shahabi. “The Spatial Skyline Queries,” Proceedings of the 32nd International Conference on Very Large Data Bases, pp. 751–762, 2006.
[24] D. Sun, S. Wu, J. Li, and A. K Papadias. H. Tung, “Skyline-join in Distributed Databases,” Proceedings of the 24th International Conference on Data Engineering Workshops, pp. 176-181, 2008.
[25]E. Tiakas, G. Valkanas, A. N. Papadopoulos, Y. Manolopoulos, and D. Gunopulos, “Metric-based Top-k Dominating Queries,” 17th International Conference on Extending Database Technology, pp. 415–426, 2014.
[26]K. L. Tan, P. K. Eng, and B. C. Ooi, “Efficient Progressive Skyline Computation,” Proceedings of the 27th International Conference on Very Large Data Bases, pp. 301-310, 2001.
[27]G. Trimponias, I. Bartolini, D. Papadias, and Y. Yang, “Skyline Processing on Distributed Vertical Decompositions,” IEEE Transactions on Knowledge and Data Engineering, Volume 25, Number 4, pp. 850-862, 2013.
[28]B. Zhang, S.Zhou, and J.Guan, “Adapting Skyline Computation to the MapReduce Framework: Algorithms and Experiments,” Proceedings of the 16th International Conference on Database Systems for Advanced Applications, pp. 403-414, 2011.
[29]M. Zhang and R. Alhajj, “Skyline Queries with Constraints: Integrating Skyline and Traditional Query Operators,” Data and Knowledge Engineering, Volume 69, Issue 1, pp. 153-168, 2010.
[30]J. Zhang, Z. Lin, B. Li, W. Wang, D Meng, “Skyline Join Query Processing over Multiple Relations,” International Conference on Database Systems for Advanced Applications, pp. 353-361, 2016.

QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top