National Digital Library of Theses and Dissertations in Taiwan

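Graduate student: 朱恩榮
Graduate student (romanized): Khunanon chunlakan
Thesis title (Chinese): K-匿名模型於確保隱私資料探勘之研究
Thesis title (English): A Study of K-anonymity Model for Privacy Preserving Data Mining
Advisor: 江季翰
Advisor (romanized): Ji-Han Jiang
Degree: Master's
Institution: National Formosa University (國立虎尾科技大學)
Department: Department of Computer Science and Information Engineering, Master's Program
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2018
Graduation academic year: 106 (ROC calendar)
Language: English
Number of pages: 74
Keywords (Chinese, translated): k-anonymity; performance; privacy preservation; algorithm
Keywords (English): K-anonymity; Algorithm; Privacy-preservation; Performance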
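In the era of information technology, many industries collect data from their customers. These data are later released for research and analysis purposes, and such data disclosure has become widespread. K-anonymity is one of the most widely used privacy-preservation methods. In this thesis, we compare three well-known k-anonymity methods, using datasets publicly released by online sources and publicly available implementations of the k-anonymity algorithms. We run the experiments with the same selected metrics and the same datasets. We then discuss the performance of each algorithm and which of the selected algorithms performs best.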
In the information technology era, many industries collect data from their customers, and those data are later released for research or analysis purposes. The published data can contain sensitive information such as salary or disease. K-anonymity is one of the most widely used privacy-preservation models, and because of its popularity many algorithms have been proposed for it. In this thesis, we compare three of the most widely used k-anonymity algorithms, measuring their efficiency and effectiveness. We extend the scope of the evaluation with a more comprehensive set of scenarios covering different datasets and parameters, using open-source implementations of the algorithms. We run a varied set of experiments to characterize each algorithm's performance and to determine which algorithm is most appropriate in a given scenario. Through experimental evaluation, we show under which conditions one algorithm outperforms the others, depending on the input privacy requirement and the dataset. Our results can help future researchers develop new k-anonymity algorithms or methodologies.
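As a brief illustration of the k-anonymity model named in the abstract: a table is k-anonymous when every combination of quasi-identifier values is shared by at least k records. The sketch below is not taken from the thesis or from any of the algorithms it compares. The toy records, the choice of `age` and `zip` as quasi-identifiers, and the helper names `is_k_anonymous` and `generalize` are all assumptions made for this example.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[a] for a in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def generalize(record):
    """Coarsen age to a 10-year band and keep only the first three ZIP digits.

    This fixed, one-step generalization is a hypothetical stand-in for a real
    generalization hierarchy.
    """
    low = record["age"] // 10 * 10
    return {**record, "age": f"{low}-{low + 9}", "zip": record["zip"][:3] + "**"}


# Hypothetical toy data; "disease" plays the role of the sensitive attribute.
records = [
    {"age": 23, "zip": "63201", "disease": "flu"},
    {"age": 27, "zip": "63208", "disease": "asthma"},
    {"age": 31, "zip": "63210", "disease": "flu"},
    {"age": 38, "zip": "63215", "disease": "diabetes"},
]
QI = ["age", "zip"]

raw_ok = is_k_anonymous(records, QI, k=2)      # every raw record is unique
generalized = [generalize(r) for r in records]
gen_ok = is_k_anonymous(generalized, QI, k=2)  # two groups of two records each

print(raw_ok, gen_ok)
```

In this toy case the raw table fails the check for k = 2 because each (age, zip) pair is unique, while the generalized table passes because the records collapse into two groups of two. Real algorithms differ mainly in how they search for a generalization that satisfies k while losing as little information as possible.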
Abstract
Abstract (Chinese)
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Chapter 1 Introduction
1.1 Research Motivation
1.2 Thesis Objectives
1.3 Thesis Outline
Chapter 2 Background and Related Work
2.1 Data anonymization
2.2 K-anonymity
2.2.1 Generalization
2.2.1.1 Domain Generalization hierarchy
2.2.1.2 Value Generalization hierarchy
2.2.2 Suppression
2.3 K-anonymity Classification
2.4 K-anonymity Extension
2.5 Related Work
Chapter 3 Study of Methodology
3.1 Generalization Information loss (Genloss)
3.2 UTD anonymization Toolbox
3.3 Datafly Algorithms
3.4 Incognito Algorithms
3.5 Incognito with T-closeness Algorithms
Chapter 4 Comparison of K-anonymity
4.1 Dataset
4.2 Environment
4.3 Results
4.3.1 Experiment 1
4.3.2 Experiment 2
4.3.3 Experiment 3
Chapter 5 Conclusion
Reference
Extended Abstract
Curriculum Vitae