National Digital Library of Theses and Dissertations in Taiwan

Author: Mbuso Gerald Dlamini
Thesis title: Privacy Preserving Data Mining and Association Rule Sharing for Business Intelligence
Advisor: Prof. 黃有評
Oral defense committee: Prof. 黃有評, Prof. 傅立成, Prof. 張玉山, Prof. 蘇順豐, Prof. 姚立德
Oral defense date: 2015-06-25
Degree: Master's
Institution: National Taipei University of Technology
Department: College of Electrical Engineering and Computer Science, International Student Program
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2015
Graduation academic year: 103 (ROC calendar)
Keywords: Privacy preservation; Business intelligence; Association rules
Growing interest in business intelligence and cloud computing, together with technological advances, has made data mining as a service necessary for business success and sustainability. As a result, data mining techniques are increasingly used to discover and understand previously unknown customer patterns and behaviors through association rules mined from transactional databases. Because data mining is complex and expensive, a lack of expertise and computational resources has compelled companies to outsource it. However, outsourcing makes the need to ensure privacy and confidentiality unavoidable.
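As a minimal illustration of what an association rule mined from a transactional database looks like, the sketch below computes the two standard measures, support and confidence, for a single rule. The baskets, item names, and function names are made up for illustration and do not come from the thesis.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(lhs: FrozenSet[str], rhs: FrozenSet[str],
               transactions: List[Transaction]) -> float:
    """Confidence of the rule lhs -> rhs: support(lhs and rhs) / support(lhs)."""
    lhs_support = support(lhs, transactions)
    if lhs_support == 0:
        return 0.0
    return support(lhs | rhs, transactions) / lhs_support


# Hypothetical market baskets (illustrative only).
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

# Rule under study: bread -> milk
rule_support = support(frozenset({"bread", "milk"}), baskets)
rule_confidence = confidence(frozenset({"bread"}), frozenset({"milk"}), baskets)
```

Here two of the four baskets contain both bread and milk, and three contain bread, so the rule bread -> milk has support 2/4 and confidence 2/3. A miner typically reports only rules whose support and confidence exceed chosen thresholds.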
Only a few studies have addressed privacy-preserving data mining in market basket analysis. Most current research has focused on developing privacy-preserving algorithms that have been tested only on small databases with ten or fewer transactions. Other work has focused on privacy-preserving methods that are complex and resource intensive, and therefore expensive for both data mining service providers and businesses. All of these studies depend on the database owner to specify the sensitive items in the database. This thesis therefore seeks to provide a low-cost solution for privacy-preserving data mining for service providers.
The study aims to reduce cost and complexity by providing a framework that automatically detects sensitive items in transactional databases, which are difficult and time consuming for the owner and the service provider to detect manually. This in turn leads to the discovery of sensitive association rules, which are then hidden using two low-cost heuristic algorithms: MDSRRC (Modified Decrease Support of R.H.S. item of Rule Clusters) and ADSRRC (Advanced Decrease Support of R.H.S. item of Rule Clusters).
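The general idea behind hiding a rule by decreasing the support of its right-hand-side item can be sketched as follows. This is a deliberately simplified illustration, not the MDSRRC or ADSRRC algorithm: it handles a single rule, ignores rule clustering and the choice of which transactions to modify first, and treats a rule as hidden once its confidence drops below an assumed threshold `min_conf`. All names and data are hypothetical.

```python
from typing import List, Set


def support_count(itemset: Set[str], transactions: List[Set[str]]) -> int:
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def rule_confidence(lhs: Set[str], rhs: Set[str],
                    transactions: List[Set[str]]) -> float:
    """Confidence of lhs -> rhs, or 0.0 if lhs never occurs."""
    lhs_count = support_count(lhs, transactions)
    if lhs_count == 0:
        return 0.0
    return support_count(lhs | rhs, transactions) / lhs_count


def hide_rule(lhs: Set[str], rhs_item: str,
              transactions: List[Set[str]], min_conf: float) -> List[Set[str]]:
    """Return a sanitized copy in which lhs -> {rhs_item} falls below min_conf.

    The R.H.S. item is deleted from supporting transactions one at a time,
    stopping as soon as the rule is no longer strong enough to be mined.
    """
    sanitized = [set(t) for t in transactions]  # leave the input untouched
    full_rule = lhs | {rhs_item}
    for t in sanitized:
        if rule_confidence(lhs, {rhs_item}, sanitized) < min_conf:
            break  # rule is already hidden
        if full_rule <= t:
            t.discard(rhs_item)  # lowers support of the R.H.S. item
    return sanitized


# Hypothetical baskets; the sensitive rule is bread -> milk.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
sanitized = hide_rule({"bread"}, "milk", baskets, min_conf=0.5)
```

In this toy run only the first basket is modified: removing milk from it drops the confidence of bread -> milk from 2/3 to 1/3, below the assumed threshold of 0.5, so the loop stops and the remaining transactions are left intact.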
The performance of the framework was analyzed using both synthetic and real transactional data. The results show that this approach to privacy-preserving data mining in market basket analysis is viable and can empower both the service provider and the customer in terms of business intelligence. Our investigation concludes that privacy-preserving data mining is possible to some extent, and that the heuristic algorithms simplify privacy preservation by focusing on pattern lengths rather than transaction lengths, thereby eliminating the possibility of distorting more valuable data.
Contents
Abstract
Acknowledgments
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Background
1.2 Privacy Preserving Data Mining and Privacy Concerns
1.2.1 Data Miner Concerns
1.2.2 Decision Maker Concerns
1.2.3 Goal of Association Rule Hiding
1.3 Previous Related Works
1.4 Problem Statement
1.5 Thesis Development
Chapter 2 Literature Review
2.1 Introduction
2.2 Business Intelligence
2.2.1 Business Intelligence Definition
2.2.2 Business Intelligence Benefits
2.3 Data Mining
2.3.1 Data Mining Definition
2.3.2 Market Basket Analysis
2.3.3 Association Rules
2.3.4 Apriori Algorithm
2.4 Association Rule Variants
2.4.1 Redundant Association Rules
2.4.2 Association Rule Sensitivities
2.4.3 Representative Association Rules
2.5 Association Rule Hiding Approaches
2.5.1 Heuristic Approaches
2.5.2 Border Based Approaches
2.5.3 Exact Approaches
2.5.4 Reform Approaches
Chapter 3 Methods
3.1 Introduction
3.2 Detection of Sensitive Items
3.3 Advanced Decrease Support of R.H.S. item of Rule Clusters (ADSRRC) Algorithm
3.4 Modified Decrease Support of R.H.S. item of Rule Clusters (MDSRRC) Algorithm
3.5 Proposed Framework
3.6 Performance Measures
Chapter 4 Data Preparation
4.1 Introduction
4.2 Real World versus Synthetic Datasets
4.3 Synthetic Data
4.4 Real Groceries Dataset
Chapter 5 Data Mining
5.1 Introduction
5.2 Synthetic Datasets Mining
5.3 Groceries Real Dataset Mining
Chapter 6 Experimental Results
6.1 Introduction
6.2 Synthetic Data Results
6.3 Groceries Real Dataset
Chapter 7 Conclusions and Future Works
7.1 Conclusions
7.2 Future Works
References
About the Author