Graduate student (English): Chen, Yan-Ting
Thesis title (English): Enhancing Personalized Federated Learning Using Bidirectional Knowledge Distillation
Advisor (English): Liu, Ren-Shiou
Committee members (English): Tsai, Meng-Hsun; Chang, Yu-Ching; Li, Min-Yang
Keywords (English): Knowledge Distillation; Bidirectional Distillation; Federated Learning
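This study adds bidirectional distillation, which can overcome model differences, to the federated learning setting in order to accommodate model diversity among training devices. Models with similar characteristics are first clustered, and each cluster has one Prototype Model. A larger Meta Model (global model) is trained on top of the Prototype Models, and bidirectional distillation improves the performance of both, which in turn improves the performance of each device-side model. Experiments show that the new Prototype Models obtained by distilling from the global model all achieve higher accuracy than the original models.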
In recent years, demand for artificial intelligence (AI) has risen steadily across industries, and with it the need for larger and more diverse datasets to train AI models effectively. In practice, however, most fields have limited access to data, and businesses find it difficult to exchange information because they compete with one another. Growing public awareness of data privacy has further complicated data collection. As a result, insufficient data has become an obstacle to training high-performing models. Federated learning has emerged as a prominent response to these difficulties.
In federated learning, every device contributes to the collaborative training of a shared model using its own data. Because private data never needs to be uploaded, this approach addresses concerns about data collection and privacy. However, as federated learning has progressed rapidly in recent years, its own challenges have come to light. One is the unequal capability of devices, which hinders model generalization. Another is that model performance is affected by variations in data distribution across devices. Consequently, more personalized federated learning has become a key research direction.
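As an illustration of the collaborative training described above, the following is a minimal sketch of weighted parameter averaging in the style of FederatedAveraging. The function name `federated_average`, the flat-list representation of model parameters, and weighting by local dataset size are assumptions made for this example; the thesis's own aggregation details are not given in this record.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average flat parameter vectors, weighted by each client's sample count.

    Hypothetical helper: parameters are represented as flat lists of floats.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one dataset size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share one parameter shape")
        # Each client contributes in proportion to its share of the data.
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


if __name__ == "__main__":
    # Two clients; the second holds three times as much data as the first.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Only parameters and sample counts cross the network in this sketch; the raw local data stays on each device.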
This study incorporates bidirectional distillation, a technique that addresses model differences, into the federated learning framework. First, models with similar characteristics are grouped together. Each group has a Prototype Model, and the parameters transmitted by client devices are aggregated. Subsequently, each Prototype Model is trained using unlabeled data. A larger Meta Model is then trained based on these Prototype Models. By employing bidirectional distillation, the performance of the Prototype Models is enhanced, leading to improved performance of the individual models on each device.
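To make the idea of distilling in both directions concrete, the sketch below computes a standard temperature-softened KL-divergence distillation loss and evaluates it once with a Prototype Model as teacher and once with the Meta Model as teacher. This is an assumption-laden illustration, not the thesis's loss: the function names, the temperature value, and the use of plain KL divergence on logits are all choices made for this example.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits: Sequence[float],
                      teacher_logits: Sequence[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) between softened output distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    eps = 1e-12  # guards against log(0) if a probability underflows
    return sum(pi * (math.log(pi) - math.log(max(qi, eps)))
               for pi, qi in zip(p, q) if pi > 0.0)


def bidirectional_losses(prototype_logits: Sequence[float],
                         meta_logits: Sequence[float],
                         temperature: float = 2.0) -> Tuple[float, float]:
    """Return (loss for the Meta Model as student, loss for the Prototype Model as student)."""
    to_meta = distillation_loss(meta_logits, prototype_logits, temperature)
    to_prototype = distillation_loss(prototype_logits, meta_logits, temperature)
    return to_meta, to_prototype


if __name__ == "__main__":
    # Hypothetical logits for one unlabeled sample with three classes.
    print(bidirectional_losses([2.0, 0.5, -1.0], [1.0, 1.0, 0.0]))
```

Because the loss only compares output distributions, the teacher and student may have different architectures, which is what allows distillation between models of different sizes.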
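Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
1 Introduction
1.1 Background and Motivation
1.2 Research Objectives
1.3 Research Method
1.4 Thesis Organization
2 Literature Review
2.1 Federated Learning
2.1.1 Types of Federated Learning
2.1.2 Personalized Federated Learning
2.2 Knowledge Distillation
2.2.1 Knowledge Distillation in Federated Learning
2.2.2 Bidirectional Distillation
2.3 FederatedAveraging
2.4 Summary
3 Methodology
3.1 Problem Description and Model Architecture
3.2 Ensemble Distillation
3.2.1 Client-Side Local Update
3.2.2 Server-Side Ensemble Distillation
3.3 Bidirectional Distillation
3.3.1 Bidirectional Distillation Loss
4 Experiments and Analysis
4.1 Experimental Procedure
4.2 Overview of the Datasets
4.3 Overview of the Models
4.4 Experimental Environment and Parameter Settings
4.5 Evaluation Metrics
4.6 Results and Analysis
4.6.1 Experiment 1: Effect of the degree of non-IID data on bidirectional distillation under different values of α
4.6.2 Experiment 2: Performance of the Meta Model compared with each Prototype Model after distilling knowledge from the Prototype Models into the Meta Model
4.6.3 Experiment 3: Performance of each Prototype Model before and after distilling knowledge from the Meta Model back into it
4.6.4 Experiment 4: Adding a new Prototype Model to the existing environment and comparing the original training method with Meta Model distillation
4.6.5 Experiment 5: Effect of the number of Prototype Models on Meta Model performance
5 Conclusion and Future Work
References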
