National Digital Library of Theses and Dissertations in Taiwan

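Author: 朱恩榮 (Khunanon chunlakan)
Title: A Study of K-anonymity Model for Privacy Preserving Data Mining (K-匿名模型於確保隱私資料探勘之研究)
Advisor: 江季翰 (Ji-Han Jiang)
Degree: Master's
Institution: National Formosa University (國立虎尾科技大學)
Department: Master's Program, Department of Computer Science and Information Engineering (資訊工程系碩士班)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2018
Academic year of graduation: 106
Language: English
Keywords: K-anonymity, Algorithm, Privacy-preservation, Performance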

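Abstract (Chinese, translated):

In the information technology era, many industries collect data from their customers, and these data are later published for research and analysis purposes. Data disclosure has become widespread. K-anonymity is one of the most widely used privacy-protection methods. In this thesis, we compare three well-known k-anonymity methods, using datasets publicly released by online sources and publicly available implementations of the k-anonymity algorithms. We run the experiments with the same selected metrics and the same datasets. We then discuss the performance of each algorithm and which of the selected algorithms performs best.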

Abstract (English):

In the information technology era, many industries collect data from their customers, and those data are later released for research or analysis purposes. The released data can contain sensitive information such as salary or disease. K-anonymity is one of the most widely used privacy-preservation models, and because of its popularity many algorithms have been introduced for it. In this thesis, we compare three of the most widely used k-anonymity algorithms, measuring their efficiency and their effectiveness. We extend the scope of the evaluation with a more comprehensive set of scenarios that vary the datasets and parameters, and we use open-source implementations of the algorithms. We run a varied set of experiments to characterize each algorithm's performance and to determine which algorithm is more appropriate in a given scenario. Through experimental evaluation, we show under which conditions one algorithm outperforms the others, depending on the input privacy requirements and the datasets. Our results can help future researchers create new algorithms or methodologies for k-anonymity.
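As a rough illustration of the property the compared algorithms aim to enforce, the sketch below checks whether a table is k-anonymous, meaning that every combination of quasi-identifier values occurs in at least k records, and shows how generalizing one attribute can make a table satisfy it. This is not the thesis's code or any of the three evaluated algorithms. The records, the choice of `age` and `zip` as quasi-identifiers, and the helper names `is_k_anonymous` and `generalize_age` are all assumptions made up for the example.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


def generalize_age(age, width=10):
    """Replace an exact age with a range such as '30-39' (hypothetical hierarchy level)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Toy records invented for illustration; 'disease' plays the sensitive attribute.
records = [
    {"age": 31, "zip": "63201", "disease": "flu"},
    {"age": 35, "zip": "63201", "disease": "asthma"},
    {"age": 42, "zip": "63202", "disease": "flu"},
    {"age": 47, "zip": "63202", "disease": "diabetes"},
]

# Generalize only the age column; the other attributes are left unchanged.
generalized = [{**r, "age": generalize_age(r["age"])} for r in records]

raw_ok = is_k_anonymous(records, ["age", "zip"], k=2)
generalized_ok = is_k_anonymous(generalized, ["age", "zip"], k=2)
print(raw_ok, generalized_ok)
```

In the raw table each (age, zip) pair is unique, so the check fails for k = 2. After the ages are coarsened to ten-year ranges, each pair is shared by two records and the check passes. The price is a loss of precision in the published data, which is the kind of efficiency and effectiveness trade-off the comparison in the abstract is concerned with.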

Table of contents:

Abstract
Abstract (Chinese)
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Chapter 1 Introduction
1.1 Research Motivation
1.2 Thesis Objectives
1.3 Thesis Outline
Chapter 2 Background and Related Work
2.1 Data Anonymization
2.2 K-anonymity
2.2.1 Generalization
2.2.1.1 Domain Generalization Hierarchy
2.2.1.2 Value Generalization Hierarchy
2.2.2 Suppression
2.3 K-anonymity Classification
2.4 K-anonymity Extension
2.5 Related Work
Chapter 3 Study of Methodology
3.1 Generalization Information Loss (Genloss)
3.2 UTD Anonymization Toolbox
3.3 Datafly Algorithms
3.4 Incognito Algorithms
3.5 Incognito with T-closeness Algorithms
Chapter 4 Comparison of K-anonymity
4.1 Dataset
4.2 Environment
4.3 Results
4.3.1 Experiment 1
4.3.2 Experiment 2
4.3.3 Experiment 3
Chapter 5 Conclusion
Reference
Extended Abstract
Curriculum Vitae
