National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed Record

Author: 蔡哲平 (Che-Ping Tsai)
Title: 使用生成式對抗網路及最佳補全序列汲取法之多標籤分類技術
Title (English): Multi-label Classification Techniques with Generative Adversarial Network and Optimal Completion Distillation
Advisor: 李琳山 (Lin-shan Lee)
Committee: 陳信宏, 王小川, 鄭秋豫, 李宏毅
Oral Defense Date: 2019-07-09
Degree: Master's
University: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Document Type: Academic thesis
Publication Year: 2019
Graduation Academic Year: 108
Language: Chinese
Pages: 96
Keywords (Chinese): 多標籤分類, 生成對抗網路
Keywords (English): Multi-label classification, Generative Adversarial Network
DOI: 10.6342/NTU201901512
Statistics:
  • Cited by: 1
  • Views: 222
  • Downloads: 0
  • Bookmarked: 0
The focus of this thesis is new techniques for multi-label classification. As machine learning advances, solutions based on deep neural networks have been proposed in succession, and prior work has shown that modeling the dependencies among labels is key to improving model performance. The first direction of this thesis uses a Generative Adversarial Network (GAN) to model label dependencies. In this framework, the classifier plays the role of the generator: its input is an object, and its output is the label set belonging to that object. The discriminator must learn the dependencies among labels in order to tell whether a given label set was produced by the generator or came from real data; the classifier therefore has to learn not only the relation between objects and labels, but also to produce label sets with realistic dependencies so as to fool the discriminator. The second direction of this thesis improves multi-label classifiers based on recurrent neural networks (RNNs); such models use an RNN decoder to model label dependencies and predict labels sequentially. However, training this kind of model requires a manually defined label order that turns the label set into a label sequence, which serves as the target sequence for training the RNN; prior work has shown that the label order has a considerable impact on performance, and a humanly imposed order may conflict with the label relations the machine infers. This thesis therefore adopts Optimal Completion Distillation (OCD), which allows the model to be trained without any label order. Through analysis of the experimental results, we also show that the proposed model not only performs better but also has stronger generalization ability, being able to predict label sets that never appeared in the training set. This thesis also provides extensive experimental results for both approaches on multi-label image classification, text classification, and sound-event classification.
Multi-label classification (MLC) assigns multiple labels to each sample. This thesis proposes two methods that improve the performance of multi-label classifiers.

Recent work has shown that exploiting relations between labels improves the performance of multi-label classification. The first direction in this thesis is to use a Generative Adversarial Network (GAN) to model label dependencies. The discriminator learns to model label dependency by discriminating between real and generated label sets. To fool the discriminator, the classifier, acting as the generator, learns to generate label sets whose dependencies are close to those of real data.
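The adversarial setup described above can be sketched with single-layer networks. This is a minimal NumPy illustration, not the thesis's actual architecture: the layer sizes, the weighting term `lam`, and the analytic logistic gradients are all simplifying assumptions made here for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)
D_FEAT, N_LABELS = 8, 5  # toy feature / label-vocabulary sizes (hypothetical)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Classifier (plays the generator): features -> per-label probabilities.
W_g = rng.normal(0.0, 0.1, (N_LABELS, D_FEAT))
# Discriminator: scores a (features, label set) pair as real vs. generated.
w_d = rng.normal(0.0, 0.1, D_FEAT + N_LABELS)

def generate(x):
    return sigmoid(W_g @ x)

def discriminate(x, y):
    return sigmoid(w_d @ np.concatenate([x, y]))

def disc_loss(x, y_real):
    """Discriminator objective: call real label sets real, generated ones fake."""
    y_fake = generate(x)
    return (-np.log(discriminate(x, y_real) + 1e-9)
            - np.log(1.0 - discriminate(x, y_fake) + 1e-9))

def gen_loss(x, y_real, lam=0.1):
    """Classifier objective: fit the labels AND fool the discriminator."""
    y_fake = generate(x)
    bce = -np.mean(y_real * np.log(y_fake + 1e-9)
                   + (1.0 - y_real) * np.log(1.0 - y_fake + 1e-9))
    adv = -np.log(discriminate(x, y_fake) + 1e-9)
    return bce + lam * adv

def disc_grad(x, y_real):
    """Analytic gradient of disc_loss w.r.t. w_d (y_fake held constant)."""
    y_fake = generate(x)
    v_real = np.concatenate([x, y_real])
    v_fake = np.concatenate([x, y_fake])
    return (-(1.0 - sigmoid(w_d @ v_real)) * v_real
            + sigmoid(w_d @ v_fake) * v_fake)

# One discriminator update on a toy example: the loss should decrease.
x = rng.normal(size=D_FEAT)
y_real = (rng.random(N_LABELS) > 0.5).astype(float)
before = disc_loss(x, y_real)
w_d = w_d - 0.1 * disc_grad(x, y_real)
after = disc_loss(x, y_real)
```

In the full model the two objectives are alternated, so the discriminator's notion of "realistic label dependencies" and the classifier improve together.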

The second direction is to improve state-of-the-art multi-label classifiers, which use a recurrent neural network (RNN) decoder to model label dependency. However, training an RNN decoder requires a predefined order of labels, which is not part of the MLC problem itself. Moreover, an RNN trained this way tends to overfit the label combinations in the training set and has difficulty generating unseen label sequences. We therefore propose a new framework for MLC that does not rely on a predefined label order and thus alleviates exposure bias. We also find that the proposed approach has a higher probability than the baseline models of generating label combinations not seen during training, indicating better generalization capability.
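The order-free supervision at the heart of this framework can be sketched as follows: because a label set has no inherent order, every ground-truth label not yet emitted is an equally optimal next token, so each decoding step can be trained against a uniform distribution over those labels instead of one imposed sequence. The sketch below illustrates this reading of Optimal Completion Distillation on a toy decode step; the token names and the `EOS` marker are illustrative, not from the thesis.

```python
import math

EOS = "<eos>"  # end-of-sequence marker that closes the label sequence

def optimal_next_tokens(prefix, target_set):
    """Tokens that optimally complete `prefix` toward `target_set`:
    any ground-truth label not emitted yet, or EOS once all are emitted."""
    remaining = set(target_set) - set(prefix)
    return remaining if remaining else {EOS}

def ocd_step_loss(logits, prefix, target_set):
    """Cross-entropy between the decoder's distribution at this step and a
    uniform target over the optimal next tokens (no label order needed)."""
    opts = optimal_next_tokens(prefix, target_set)
    log_z = math.log(sum(math.exp(v) for v in logits.values()))
    return -sum(logits[t] - log_z for t in opts) / len(opts)

# Toy decode step: after emitting "cat", both remaining labels are optimal.
target = {"cat", "dog", "bird"}
tokens = optimal_next_tokens(["cat"], target)  # {"dog", "bird"}
good = {"cat": -5.0, "dog": 2.0, "bird": 2.0, EOS: -5.0}
bad = {"cat": 2.0, "dog": -5.0, "bird": -5.0, EOS: 2.0}
loss_good = ocd_step_loss(good, ["cat"], target)
loss_bad = ocd_step_loss(bad, ["cat"], target)
```

A decoder that spreads mass over the remaining labels (`good`) is rewarded regardless of which one it emits first, which is exactly why no predefined order is required.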

This thesis also reports experimental results on several multi-label classification benchmark datasets in different domains, covering text classification, image classification, and sound-event classification.
Thesis Committee Approval
Acknowledgments
Chinese Abstract
English Abstract
Chapter 1: Introduction
1.1 Motivation
1.2 Research Directions
1.3 Thesis Organization
Chapter 2: Background
2.1 Multi-label Classification
2.1.1 Overview
2.1.2 Common Approaches
2.2 Sequence-to-Sequence Models
2.2.1 Neural Networks (NN)
2.2.2 Recurrent Neural Networks (RNN)
2.2.3 Sequence-to-Sequence Models
2.2.4 Sequence-to-Sequence Models for Multi-label Classification
2.2.5 Reinforcement Learning (RL) for Sequence-to-Sequence Models
2.3 Generative Adversarial Networks (GAN)
2.3.1 Overview
2.3.2 Conditional GAN
2.3.3 Wasserstein GAN
2.4 Chapter Summary
Chapter 3: Improving Multi-label Classifiers with a GAN
3.1 Overview
3.1.1 Motivation
3.2 Proposed Model
3.3 Training
3.3.1 Training the Classifier
3.3.2 Training the Discriminator
3.4 Evaluation
3.4.1 Experimental Setup
3.4.2 Results
3.4.3 Analysis of Results
3.4.4 Ablation Study
3.4.5 Sample Model Outputs
3.5 Chapter Summary
Chapter 4: Optimal Completion Distillation for Multi-label Classification
4.1 Overview
4.1.1 Motivation
4.2 Model Overview
4.2.1 Encoder E Architecture
4.2.2 RNN Decoder Architecture
4.2.3 Binary Relevance Decoder Architecture
4.3 Training
4.3.1 Training the RNN Decoder
4.3.2 Training the Binary Relevance Decoder
4.3.3 Multi-objective Training
4.4 Inference
4.4.1 Basic Inference
4.4.2 Inference Combining Both Decoders
4.5 Evaluation
4.5.1 Datasets
4.5.2 Baseline Models
4.5.3 Evaluation Metrics
4.5.4 Experimental Setup
4.5.5 Results
4.5.6 Discussion
4.6 Comparison with the GAN-based Multi-label Classifier
4.6.1 Datasets
4.6.2 Models
4.6.3 Experimental Setup
4.6.4 Results
4.7 Chapter Summary
Chapter 5: Conclusion and Future Work
5.1 Contributions
5.2 Future Work
5.2.1 Improving Multi-label Classifiers with a GAN
5.2.2 Optimal Completion Distillation for Multi-label Classification
References