Thesis title (English): The Impact of Linear Transformation on the Effectiveness and Security of the Privacy Preserving Data Mining Process
Advisor (English): Tzu-Tsung Wong
Keywords (English): classification; data perturbation; encryption; linear transformation; privacy preserving
Because data mining techniques can extract useful knowledge from data, a growing number of people work in this field. Data used for mining generally contain personal records, so people are increasingly concerned with preventing their private data from being disclosed. This study attempts to establish a procedure that ensures both the effectiveness and the security of the original data. Data are transformed by piecewise linear functions before being sent to data analysts, who apply classification methods to the transformed data. The transmitted data are additionally protected by perturbation and encryption. The classification models produced by the data analysts can be sent back to the data providers, who restore the models for the data analysts. This restoring process is designed to ensure that data analysts have models for classifying new instances. According to experimental results on ten data sets, the more pieces a linear function has, the higher the security that can be achieved for the decision tree and rule-based classifier algorithms. Data analyzed by logistic regression or support vector machines should not be transformed by multi-piece linear functions, because the accuracies these two algorithms obtain on the original and the transformed data will differ.
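The transform-then-restore idea in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: the piecewise linear function is taken to be continuous and strictly increasing, the "model" is a one-split decision tree (a stump) on a single numeric attribute, and the names `PiecewiseLinear`, `fit_stump`, the knot values, and the toy data are all hypothetical. The perturbation and encryption steps are left out.

```python
from bisect import bisect_right


def _interpolate(src, dst, value):
    """Map value from the src axis to the dst axis along the knot polyline."""
    # Locate the segment containing value; clamping the index makes the first
    # and last segments extend beyond the outermost knots.
    i = bisect_right(src, value) - 1
    i = max(0, min(i, len(src) - 2))
    slope = (dst[i + 1] - dst[i]) / (src[i + 1] - src[i])
    return dst[i] + slope * (value - src[i])


class PiecewiseLinear:
    """Strictly increasing piecewise linear map defined by (x, y) knots.

    Hypothetical helper: k + 1 knots give a k-piece function.
    """

    def __init__(self, xs, ys):
        if len(xs) != len(ys) or len(xs) < 2:
            raise ValueError("need at least two knots")
        for seq in (xs, ys):
            if any(a >= b for a, b in zip(seq, seq[1:])):
                raise ValueError("knots must be strictly increasing")
        self.xs, self.ys = list(xs), list(ys)

    def forward(self, x):
        """Original value -> transformed value (done by the data provider)."""
        return _interpolate(self.xs, self.ys, x)

    def inverse(self, y):
        """Transformed value -> original value (used to restore a model)."""
        return _interpolate(self.ys, self.xs, y)


def fit_stump(values, labels):
    """Return the threshold t that minimizes errors of 'predict 1 if value > t'."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

    def errors(t):
        return sum((v > t) != bool(y) for v, y in zip(values, labels))

    return min(candidates, key=errors)


# Toy data (assumed): one numeric attribute and a binary class label.
ages = [22, 25, 31, 38, 45, 52, 60, 67]
labels = [0, 0, 0, 1, 1, 1, 1, 1]

# Three-piece transformation with slopes 0.5, 2.0 and 0.5 (assumed knots).
f = PiecewiseLinear(xs=[20, 35, 50, 70], ys=[3.0, 10.5, 40.5, 50.5])

# Data provider transforms the attribute; the analyst only sees `transformed`.
transformed = [f.forward(a) for a in ages]
analyst_threshold = fit_stump(transformed, labels)

# Data provider restores the model by mapping the threshold back.
restored_threshold = f.inverse(analyst_threshold)
predictions = [int(a > restored_threshold) for a in ages]

print("threshold on transformed data:", analyst_threshold)
print("restored threshold on original scale:", restored_threshold)
print("predictions match labels:", predictions == labels)
```

Because the assumed map is strictly increasing, it preserves the ordering of attribute values, so a threshold split separates the same records before and after the transformation. The restored threshold need not equal the one a stump would learn directly from the original data (a midpoint between two values is not preserved by a multi-piece map), but it falls between the same two neighboring values. A multi-piece map is not a single affine function of the attribute, which is one plausible reading of why the abstract reports different accuracies for logistic regression and support vector machines.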
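Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Process
1.4 Research Limitations
Chapter 2 Literature Review
2.1 Privacy-Preserving Data Mining
2.1.1 Techniques Applied in Privacy-Preserving Data Mining
2.1.2 Practical Applications of Privacy-Preserving Data Mining
2.1.3 Challenges of Privacy-Preserving Data Mining
2.2 Data Perturbation Techniques
2.3 Encryption Techniques
2.3.1 Homomorphic Encryption
2.3.2 RSA Encryption
2.4 PPDM Evaluation Methods
2.4.1 Privacy Level
2.4.2 Data Quality
2.5 Summary
Chapter 3 Research Method
3.1 Data Transformation
3.1.1 One-Piece Linear Transformation
3.1.2 Multi-Piece Linear Transformation
3.2 Effectiveness Testing
3.2.1 Decision Trees and Rule-Based Classification
3.2.2 Logistic/Linear Regression and Support Vector Machines
3.3 Prediction on New Data
3.3.1 Model Restoration
3.3.2 Data Prediction Program
3.4 Noise Perturbation and Encryption
3.5 Evaluation Metrics
Chapter 4 Empirical Study
4.1 Data Set Descriptions
4.2 Utility Evaluation
4.2.1 One-Piece Linear Analysis
4.2.2 Multi-Piece Linear Analysis
4.3 Security Evaluation
4.4 Summary
Chapter 5 Conclusions and Suggestions
5.1 Conclusions
5.2 Future Research and Development
References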