
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed Record

Author: 沈聖堯
Author (English): Sheng-Yao Shen
Title (Chinese): 以稀疏潛在概念層改善基於變分自編碼架構的神經網絡主題模型
Title (English): Improving Variational Auto-Encoder Based Neural Topic Model with Sparse Latent Concept Layer
Advisor: 黃鐘揚
Advisor (English): Chung-Yang Huang
Oral defense date: 2017-07-28
Degree: Master's
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Electrical Engineering (電機工程學研究所)
Discipline: Engineering
Academic field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2017
Graduation academic year: 105 (ROC calendar; 2016–2017)
Language: English
Number of pages: 39
Keywords (Chinese): 潛在狄利克雷分配、主題模型、變分自編碼器
Keywords (English): Latent Dirichlet Allocation, Topic Model, Variational Auto-encoder
Usage statistics:
  • Cited by: 0
  • Views: 212
  • Ratings:
  • Downloads: 0
  • Added to bookshelf: 0
Abstract (Chinese, translated): The main contributions of this thesis are a simple variational auto-encoder based topic model and an effective method for selecting topic words. By decomposing the probability matrix into the product of a topic matrix and a word matrix, we introduce sparse latent concepts (SLC) as the dimensions of the semantic vector space shared by topics and words. Based on the assumption that each topic is sparse over latent concepts, we select topic words using the semantic similarity between topic and word vectors. Experimental results show that the SLC-based model achieves higher average topic coherence.
Abstract (English): The primary contributions of this thesis are a simple variational auto-encoder based topic model and an effective topic word selection criterion. By decomposing the probability matrix into the product of a topic matrix and a word matrix, we introduce sparse latent concepts (SLC) as the dimensions of the semantic space of the topic and word vectors. We improve the model based on the idea that a topic is represented by only a few latent concepts, and we select topic words by the semantic similarity between topic and word vectors. In the experiments, the SLC-based model outperforms the non-SLC baseline in terms of average topic coherence.
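To make the factorization and word-selection idea concrete, here is a minimal PyTorch-style sketch. All names (SLCDecoder, select_topic_words, num_concepts) are illustrative assumptions rather than the thesis's actual implementation, and the sparsity transformation shown (ReLU followed by softmax normalization, echoing Sections 3.2.1–3.2.2 in the table of contents below) is only one plausible reading of the approach.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SLCDecoder(nn.Module):
    """Hypothetical decoder: the topic-word probability matrix is the
    (softmax-normalized) product of a topic matrix and a word matrix
    that share a sparse latent-concept space."""
    def __init__(self, num_topics, vocab_size, num_concepts):
        super().__init__()
        self.topic_vecs = nn.Parameter(torch.randn(num_topics, num_concepts))
        self.word_vecs = nn.Parameter(torch.randn(vocab_size, num_concepts))

    def sparse_concepts(self, vecs):
        # Sparsity transformation: ReLU zeroes out most concepts,
        # softmax normalization turns the rest into a distribution.
        return F.softmax(F.relu(vecs), dim=-1)

    def forward(self, theta):
        # theta: (batch, num_topics) document-topic proportions from the VAE encoder
        topics = self.sparse_concepts(self.topic_vecs)   # (K, C)
        words = self.sparse_concepts(self.word_vecs)     # (V, C)
        beta = F.softmax(topics @ words.t(), dim=-1)     # (K, V) topic-word probabilities
        return theta @ beta                              # (batch, V) reconstructed word distribution

def select_topic_words(decoder, top_n=10):
    # Similarity-based criterion: rank words by cosine similarity
    # between topic and word vectors in the concept space.
    topics = decoder.sparse_concepts(decoder.topic_vecs)
    words = decoder.sparse_concepts(decoder.word_vecs)
    sim = F.normalize(topics, dim=-1) @ F.normalize(words, dim=-1).t()
    return sim.topk(top_n, dim=-1).indices               # (K, top_n) word indices per topic
```

Selecting topic words by similarity in the shared concept space, rather than by the rows of the topic-word probability matrix, is what the abstract refers to as the similarity-based criterion.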
Table of Contents
Acknowledgements i
Chinese Abstract ii
ABSTRACT iii
CONTENTS iv
LIST OF FIGURES vi
LIST OF TABLES vii
Chapter 1 Introduction 1
1.1 Related Work 2
1.2 Contributions of the Thesis 2
1.3 Organization of the Thesis 3
Chapter 2 Preliminaries 4
2.1 Latent Dirichlet Allocation 4
2.2 Variational Auto-Encoder 5
2.3 Auto-Encoding Variational Inference for Topic Model 7
2.4 ProdLDA: LDA with Product of Experts 10
Chapter 3 Sparse Latent Concept Topic Model 11
3.1 Incorporating Word Embedding 11
3.2 Sparsity Transformation 12
3.2.1 Rectified Linear Unit (ReLU) and Parametrized Rectified Linear Unit (PReLU) 12
3.2.2 Softmax Normalization (SN) 14
3.3 Word Selection by Similarity-Based Criterion 15
Chapter 4 Architecture 17
4.1 Overview 17
4.2 Implementation Issue 18
Chapter 5 Experiments 20
5.1 Evaluating Topic Coherence 20
5.1.1 Evaluation Metric 20
5.1.2 Dataset and Preprocessing 21
5.1.3 Experimental Results 22
5.1.4 Qualitative Study – 20 Newsgroups 23
5.1.5 Qualitative Study – NIPS 24
5.2 Investigating Learned Representation 29
5.2.1 Discussion 1: Impact of Rich Concept Information 30
5.2.2 Discussion 2: Effectiveness of Pre-Trained Word Embedding 31
5.2.3 Discussion 3: Effectiveness of Softmax Normalization 32
Chapter 6 Conclusion and Future Work 35
6.1 Conclusion 35
6.2 Future Work 35
6.2.1 Theory on Learning Latent Representation of Topics 36
6.2.2 Hierarchical Topic Model with Neural Network Architecture 36
REFERENCE 37