National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author (Chinese): 吳啓玄
Author (English): Wu, Chi-Hsuan
Title (Chinese): 用於公平與穩健學習的無關模型有樣本選擇之訓練方法
Title (English): Model-Agnostic Training with Sample Selection for Fair and Robust Learning
Advisor (Chinese): 陳柏安
Advisor (English): Chen, Po-An
Committee Members (Chinese): 陳豐奇、林莊傑、郭柏志、陳柏安
Committee Members (English): Chen, Feng-Chi; Lin, Chuang-Chieh; Kuo, Po-Chih; Chen, Po-An
Oral Defense Date: 2023-06-28
Degree: Master's
University: National Yang Ming Chiao Tung University
Department: Institute of Information Management
Discipline: Computing
Academic Field: General Computing
Thesis Type: Academic thesis
Publication Year: 2023
Graduation Academic Year: 111
Language: English
Number of Pages: 39
Keywords (Chinese): 可信任人工智慧、公平與穩健訓練、無關模型訓練、不公平消弭、穩健收斂
Keywords (English): trustworthy AI, fair and robust training, model-agnostic, unfairness mitigation, convergence to robustness
Usage statistics:
  • Cited: 0
  • Views: 79
  • Rating: (none)
  • Downloads: 0
  • Bookmarked: 0
Fair algorithms are currently a popular topic. Traditional machine learning and deep learning can be applied to systems such as facial recognition for criminal identification and personal credit scoring at financial institutions. However, when these AI models are trained without considering fairness, they may already contain biases, which leads to errors and discriminatory decisions in subsequent applications. For example, in the facial recognition systems recently adopted abroad for solving crimes, algorithmic bias raises the misidentification rate for certain racial or gender groups. A report from the U.S. National Institute of Standards and Technology shows that researchers examined 189 facial recognition algorithms and found that most of them contain bias: the biased algorithms misidentify Black and Asian faces 10 to 100 times more often than white faces, and misidentify women more often than men, making Black women especially vulnerable to algorithmic bias.

Fair model training methods can effectively address the algorithmic bias described above. Two studies, FairBatch and OmniFair, perform adaptive reweighting over sample groups defined by different combinations of sensitive attribute values and target labels, so that samples that are easily discriminated against get more opportunity to be emphasized and learned by the model. These algorithms belong to the in-processing class of unfairness-mitigation methods. In-processing reweighting has the advantage of being model-agnostic: the model architecture itself does not need to be modified to satisfy fairness constraints. Instead, assigning different weights to different samples during training is enough to train a fair model. The weights can be applied either by scaling each sample's individual loss or by selecting the ratio of samples from each group that enter a training batch, as in the sketch below.
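
To make the reweighting idea concrete, the following is a minimal PyTorch-style sketch of a group-weighted loss: every (sensitive attribute, label) group carries its own weight, and per-sample losses are scaled by that weight before averaging. The function name and the example weights are illustrative assumptions; FairBatch and OmniFair adapt such weights during training to satisfy a fairness constraint rather than fixing them by hand.

import torch
import torch.nn.functional as F

def group_weighted_loss(logits, labels, sensitive, group_weights):
    # Per-sample binary cross-entropy, kept unreduced so each sample can be scaled.
    per_sample = F.binary_cross_entropy_with_logits(
        logits, labels.float(), reduction="none")
    # Look up one weight per sample from its (sensitive value, label value) group.
    weights = torch.tensor(
        [group_weights[(int(s), int(y))] for s, y in zip(sensitive, labels)],
        dtype=per_sample.dtype)
    # Weighted average of the per-sample losses.
    return (weights * per_sample).sum() / weights.sum()

# Illustrative weights: up-weight the group (s=1, y=1) that tends to be under-served.
group_weights = {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 1.5}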

Beyond mitigating model bias, another key element of trustworthy AI is model robustness. Simply put, when the labels of a portion of the training data have been tampered with, a training method that is simultaneously fair and robust is needed so that the final model retains both properties. This study adopts the in-processing framework of OmniFair and incorporates the robust-convergence method of ITLM into model training, so that the trained model is both fair and robust. In the experiments, we flip the labels of a portion of the training data to simulate tampering. The results show that the proposed algorithm can train models that preserve both fairness and robustness.
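
The following is a minimal sketch, under assumed names, of how an ITLM-style trimmed-loss selection step can be combined with group reweighting inside a single training step; it illustrates the idea described above under those assumptions, not the thesis's exact algorithm.

import torch
import torch.nn.functional as F

def fair_robust_step(model, optimizer, x, y, s, group_weights, tau=0.8):
    # 1) ITLM-style selection: keep the tau fraction of samples with the smallest
    #    current loss, treating the high-loss remainder as potentially corrupted.
    with torch.no_grad():
        losses = F.binary_cross_entropy_with_logits(
            model(x).squeeze(-1), y.float(), reduction="none")
    k = max(1, int(tau * len(losses)))
    keep = torch.topk(-losses, k).indices  # indices of the k smallest losses
    x_k, y_k, s_k = x[keep], y[keep], s[keep]
    # 2) Fairness reweighting on the selected subset, as in the previous sketch.
    per_sample = F.binary_cross_entropy_with_logits(
        model(x_k).squeeze(-1), y_k.float(), reduction="none")
    w = torch.tensor(
        [group_weights[(int(si), int(yi))] for si, yi in zip(s_k, y_k)],
        dtype=per_sample.dtype)
    loss = (w * per_sample).sum() / w.sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
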
Fair algorithms are currently a hot topic. Machine learning and deep learning can be applied to computer vision (CV), natural language processing (NLP), and so on. However, these AI models may already contain biases after being trained without considering fairness, which will lead to errors and discriminatory decisions in subsequent applications. A report from the National Institute of Standards and Technology (NIST) in the United States shows that researchers investigated 189 facial recognition algorithms and found that error rates for Black and Asian faces are 10 to 100 times higher than those for white faces, and that women are misidentified more often than men, leaving Black women especially vulnerable to algorithmic bias. Fair training methods can effectively address such algorithmic bias. Two studies, FairBatch and OmniFair, apply adaptive reweighting to each combination of sensitive attribute values and target label values, giving the sensitive group more opportunities to be recognized and learned by the model. This method is in-processing unfairness mitigation, and the benefit of in-processing training is that it is model-agnostic, meaning the model architecture does not need to be adjusted for fairness constraints during training. The ways to reweight samples include weighting individual losses or adjusting fair group ratios in a batch for model training.

Another key element of trustworthy AI is model robustness. Simply put, when a portion of the labels in the training data is corrupted, a training method is needed to ensure that the final model is both fair and robust. In this study, we merge OmniFair and the ITLM robust-convergence method into a single fair and robust training framework, resulting in a model with both fairness and robustness. In the experiments, we randomly flip a portion of the training data labels to simulate tampering with some of the data, and the results show that our proposed algorithm is able to train a fair and robust model.
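
As an illustration of the noise-injection setup, a minimal sketch of random label flipping for binary labels is given below; the flip rate and helper name are assumptions for illustration, not the thesis's experimental setting.

import numpy as np

def flip_labels(y, flip_rate=0.1, seed=0):
    # Randomly choose a flip_rate fraction of indices and flip their binary labels.
    rng = np.random.default_rng(seed)
    y_noisy = np.asarray(y).copy()
    idx = rng.choice(len(y_noisy), size=int(flip_rate * len(y_noisy)), replace=False)
    y_noisy[idx] = 1 - y_noisy[idx]  # flip 0 <-> 1 for the selected samples
    return y_noisy
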
Acknowledgements . . i
Chinese Abstract . . ii
Abstract . . iii
Table of Contents . . iv
List of Figures . . vi
List of Tables . . vii
1 Introduction . . 1
1.1 Related Work . . 2
2 Preliminaries . . 4
2.1 Fairness and Robustness Definitions . . 4
2.1.1 Fairness Measures . . 4
2.1.2 Group Fairness Constraints . . 5
2.1.3 Robustness Metric . . 7
2.2 Adaptive Reweighting for Fairness . . 8
2.3 Convergence with Robustness . . 14
3 Model-Agnostic Training with Sample Selection for Fair and Robust Learning . . 15
3.1 Loss Reweighting as Fair Group Ratio in Batch . . 17
3.2 Sample Selection for Robustness and Fairness . . 18
4 Experiments . . 20
4.1 Experimental Setup . . 20
4.1.1 Datasets . . 20
4.1.2 Model and Baselines . . 21
4.1.3 Noise Injection . . 22
4.1.4 Fairness and Robustness Metrics . . 22
4.1.5 Experimental Setting and Hyperparameters . . 22
4.2 Numerical Results and Analysis . . 23
4.2.1 Performance on the Benchmark Datasets . . 24
4.2.2 Fine-tuning Pretrained Untrustworthy Model for Fairness and Robustness . . 30
4.2.3 Search Rate for Best Fairness . . 31
5 Conclusions and Future Work . . 35
5.1 Conclusions . . 35
5.2 Future Work . . 35
References . . 36
Appendix A Loss Reweighting as Fair Group Ratio . . 39
[1] S. E. Whang, Y. Roh, H. Song, and J.-G. Lee, “Data collection and quality challenges in deep learning: A data-centric AI perspective,” The VLDB Journal, pp. 1–23, 2023.
[2] Y. Roh, K. Lee, S. Whang, and C. Suh, “Sample selection for fair and robust training,” Advances in Neural Information Processing Systems (NeurIPS), vol. 34, pp. 815–827, 2021.
[3] H. Zhang, X. Chu, A. Asudeh, and S. B. Navathe, “OmniFair: A declarative system for model-agnostic group fairness in machine learning,” in Proceedings of the 2021 International Conference on Management of Data (SIGMOD), 2021, pp. 2076–2088.
[4] Y. Shen and S. Sanghavi, “Learning with bad training data via iterative trimmed loss minimization,” in International Conference on Machine Learning (ICML). PMLR, 2019, pp. 5739–5748.
[5] A. Sinha, H. Namkoong, and J. Duchi, “Certifying some distributional robustness with principled adversarial training,” in International Conference on Learning Representations (ICLR), 2018.
[6] T. Hashimoto, M. Srivastava, H. Namkoong, and P. Liang, “Fairness without demographics in repeated loss minimization,” in International Conference on Machine Learning (ICML). PMLR, 2018, pp. 1929–1938.
[7] P. Lahoti, A. Beutel, J. Chen, K. Lee, F. Prost, N. Thain, X. Wang, and E. Chi, “Fairness without demographics through adversarially reweighted learning,” Advances in Neural Information Processing Systems (NeurIPS), vol. 33, pp. 728–740, 2020.
[8] F. Khani and P. Liang, “Removing spurious features can hurt accuracy and affect groups disproportionately,” in Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 2021, pp. 196–205.
[9] Y. Roh, K. Lee, S. Whang, and C. Suh, “FR-Train: A mutual information-based approach to fair and robust training,” in International Conference on Machine Learning (ICML). PMLR, 2020, pp. 8147–8157.
[10] D. Solans, B. Biggio, and C. Castillo, “Poisoning attacks on algorithmic fairness,” in Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2020, Ghent, Belgium, September 14–18, 2020, Proceedings, Part I. Springer, 2021, pp. 162–177.
[11] C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel, “Fairness through awareness,” in Proceedings of the 3rd Innovations in Theoretical Computer Science Conference (ITCS), 2012, pp. 214–226.
[12] A. Khademi, S. Lee, D. Foley, and V. Honavar, “Fairness in algorithmic decision making: An excursion through the lens of causality,” in The World Wide Web Conference (WWW), 2019, pp. 2907–2914.
[13] N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, and A. Galstyan, “A survey on bias and fairness in machine learning,” ACM Computing Surveys (CSUR), vol. 54, no. 6, pp. 1–35, 2021.
[14] K. H. Brodersen, C. S. Ong, K. E. Stephan, and J. M. Buhmann, “The balanced accuracy and its posterior distribution,” in 2010 20th International Conference on Pattern Recognition (ICPR). IEEE, 2010, pp. 3121–3124.
[15] S. Boyd, S. P. Boyd, and L. Vandenberghe, Convex optimization. Cambridge University Press, 2004.
[16] J. Angwin, J. Larson, S. Mattu, and L. Kirchner, “Machine bias: There’s software used across the country to predict future criminals. And it’s biased against blacks,” ProPublica, vol. 23, pp. 77–91, 2016.
[17] Z. Zhang, Y. Song, and H. Qi, “Age progression/regression by conditional adversarial autoencoder,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 5810–5818.
[18] M. B. Zafar, I. Valera, M. G. Rogriguez, and K. P. Gummadi, “Fairness constraints: Mechanisms for fair classification,” in Artificial Intelligence and Statistics. PMLR, 2017, pp. 962–970.
[19] R. K. Bellamy, K. Dey, M. Hind, S. C. Hoffman, S. Houde, K. Kannan, P. Lohia, J. Martino, S. Mehta, A. Mojsilovic et al., “AI Fairness 360: An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias,” arXiv preprint arXiv:1810.01943, 2018.
[20] Y. Roh, K. Lee, S. E. Whang, and C. Suh, “FairBatch: Batch selection for model fairness,” in 9th International Conference on Learning Representations (ICLR), 2021.
[21] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
[22] A. Howard, M. Sandler, G. Chu, L.-C. Chen, B. Chen, M. Tan, W. Wang, Y. Zhu, R. Pang, V. Vasudevan et al., “Searching for MobileNetV3,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 1314–1324.
[23] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2009, pp. 248–255.
[24] D. C. Liu and J. Nocedal, “On the limited memory BFGS method for large scale optimization,” Mathematical Programming, vol. 45, no. 1-3, pp. 503–528, 1989.
[25] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
[26] A. Paudice, L. Muñoz-González, and E. C. Lupu, “Label sanitization against label flipping poisoning attacks,” in ECML PKDD 2018 Workshops: Nemesis 2018, UrbReas 2018, SoGood 2018, IWAISe 2018, and Green Data Mining 2018, Dublin, Ireland, September 10–14, 2018, Proceedings 18. Springer, 2019, pp. 5–15.
Electronic full text (available online from 2024-07-18)