National Digital Library of Theses and Dissertations in Taiwan

Author: Muzi Wandile Hlatshwayo
Title: Privacy Preserving Data Mining and Association Rule Sharing Using The High Lift Algorithm
Advisor: 黃有評 (Yo-Ping Huang)
Oral defense committee: 洪茂盛, 蘇國和, 黃正民, 楊棧雲, 黃有評
Oral defense date: 2017-07-10
Degree: Master's
Institution: National Taipei University of Technology (國立臺北科技大學)
Department: Graduate Institute of Electrical Engineering (電機工程研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2017
Academic year of graduation: 105
Language: English
Number of pages: 81
Keywords: Privacy preservation; Business intelligence; Association rules
Usage statistics:
  • Cited by: 0
  • Views: 103
  • Downloads: 8
The growing need for companies to succeed in business and stay relevant to customer needs and reactions has led many of them to turn to Business Intelligence. This is done through outsourcing; however, keeping the company's secrets safe throughout the process then becomes an issue of utmost importance, which brings forth the need for Privacy Preserving Data Mining (PPDM). Little research has been done in this field, especially in areas involving market basket analysis. Existing research has focused on developing privacy-preserving algorithms that have only been tested on small databases with ten or fewer transactions. This thesis seeks to provide an affordable privacy-preserving data mining solution for service providers. The proposed framework automatically discovers sensitive association rules, which are then hidden using a low-cost heuristic algorithm introduced in this study, the High Lift Algorithm (HLA). The performance of the framework was analyzed using real transactional data from a business environment and compared against other algorithms already in use. The results show that this solution to privacy-preserving data mining in market basket analysis performs better than the other approaches and will add value for both the service provider and the customer in terms of business intelligence. Our investigation concludes that privacy-preserving data mining is possible and that the HLA simplifies privacy preservation by incorporating aspects that other algorithms have been trying to address.
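The abstract names lift as the measure behind the High Lift Algorithm but does not describe the algorithm's steps. The sketch below therefore shows only the standard textbook definitions of support, confidence, and lift for an association rule. It is not the HLA itself. The function names and the toy baskets are hypothetical and chosen purely for illustration.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
               transactions: List[Transaction]) -> float:
    """support(A and B) / support(A); 0.0 if the antecedent never occurs."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


def lift(antecedent: FrozenSet[str], consequent: FrozenSet[str],
         transactions: List[Transaction]) -> float:
    """confidence(A -> B) / support(B); 0.0 if the consequent never occurs."""
    cons = support(consequent, transactions)
    if cons == 0:
        return 0.0
    return confidence(antecedent, consequent, transactions) / cons


# Hypothetical market baskets, not data from the thesis.
baskets: List[Transaction] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
    frozenset({"butter"}),
]

rule_lift = lift(frozenset({"bread"}), frozenset({"milk"}), baskets)
print(f"lift(bread -> milk) = {rule_lift:.3f}")
```

A lift above 1 means the antecedent and consequent occur together more often than they would if they were independent. This is why lift is a natural candidate for ranking rules, although how the thesis uses it to select or hide sensitive rules is not stated in this record.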
ACKNOWLEDGMENTS
ABSTRACT
CONTENTS
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Background
1.2 Privacy Preservation and Concerns
1.2.1 Data Miner Concerns
1.2.2 Decision Maker Concerns
1.2.3 Association Rule Hiding
1.3 Previous Related Works
1.4 Problem Statement
1.5 Thesis Development
Chapter 2 Literature Review
2.1 Introduction
2.2 Business Intelligence
2.2.1 Business Intelligence Definition
2.2.2 Advantages of Business Intelligence
2.3 Data Mining
2.3.1 Data Mining Definition
2.4 Association Rules
2.4.1 Association Rules Mining
2.4.2 Apriori Algorithm
2.4.3 Redundant Association Rules
2.4.4 Association Rule Sensitivities
2.4.5 Association Rules Hiding Algorithms
2.4.6 Representative Association Rules
2.5 Classes of Association Rule Algorithms
2.5.1 Heuristic Approaches
2.5.2 Border Based Approaches
2.5.3 Exact Approaches
2.5.4 Reform Approaches
2.5.5 Other Approaches
Chapter 3 Methods
3.1 Introduction
3.2 Detection of Sensitive Items
3.3 Advanced Decrease Support of R.H.S. Item of Rule Clusters (ADSRRC) Algorithm
3.4 Modified Decrease Support of R.H.S. Item of Rule Clusters (MDSRRC) Algorithm
3.5 Proposed Framework
3.5.1 Difference in Algorithm Approaches
3.6 Performance Measures
Chapter 4 Data Preparation
4.1 Introduction
4.2 Synthetic Datasets
4.3 Real Groceries Dataset
Chapter 5 Data Mining
5.1 Introduction
5.2 Pre-processing Process
5.3 Data Mining Process and Results
5.4 Sanitized Database
Chapter 6 Experimental Results
6.1 Introduction
6.2 Results on HLA Compared to Other Algorithms
6.3 Real Dataset Results
6.4 Individual Itemsets Results
Chapter 7 Conclusions and Future Works
7.1 Conclusions
7.2 Future Works
References