
National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author: 劉元銘
Author (English): Yuan-Ming Liou
Title: 使用隨機漫步及分佈式詞彙表示法增強個人相片之語意檢索
Title (English): Enhanced Semantic Retrieval of Personal Photos Using Random Walk and Distributed Word Representations
Advisor: 李琳山
Committee: 李宏毅, 鄭秋豫, 王小川, 陳信宏, 簡仁宗
Oral defense date: 2015-07-06
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Communication Engineering
Discipline: Engineering
Academic field: Electrical and Computer Engineering
Document type: Academic thesis
Year of publication: 2015
Academic year of graduation: 103
Language: Chinese
Number of pages: 89
Keywords (Chinese): 檢索, 隨機漫步, 詞彙表示法, 個人相片
Keywords (English): retrieval, random walk, word representation, personal photos
This thesis investigates how to perform effective semantic retrieval of personal photos when the user provides only sparse speech annotations. Because digital cameras and smartphones have become ubiquitous in recent years, users rapidly accumulate large numbers of personal photos, and an important resulting problem is how to browse and search a huge personal photo collection quickly. Most users prefer to find photos directly with a semantic query such as "Mother's Day dinner." Earlier work on personal photo retrieval, however, was mostly content-based image retrieval (CBIR), which relies on low-level image descriptors and requires a photo as the query, and is therefore unsuitable for retrieval with high-level semantic concepts. Semantic-based image retrieval, in turn, depends heavily on image tags or annotations, yet users are unlikely to annotate every photo, and speaking an annotation is more convenient than typing one. This thesis therefore targets semantic retrieval of personal photos under sparse speech annotation, i.e., where only a few photos carry speech annotations. Our approach uses a topic model to integrate speech and image features, applies a random walk model for re-ranking, and finally introduces distributed word representations to alleviate the sparsity of the speech features.
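The topic-model integration of speech and image features can be illustrated with a minimal sketch. The joint "photo document" matrix, its toy counts, and the rank k=2 below are all invented for illustration; the thesis trains PLSA or NMF models on real speech and image features. Here a non-negative matrix factorization with the standard Lee-Seung multiplicative updates stands in for the topic model:

```python
# Minimal NMF sketch: factor a joint term-by-photo matrix whose rows mix
# speech terms and visual-word terms into k latent topics.
import numpy as np

def nmf(V, k, iters=200, eps=1e-9):
    """Lee-Seung multiplicative updates minimizing ||V - W H||_F."""
    rng = np.random.default_rng(0)
    n, m = V.shape
    W = rng.random((n, k)) + eps   # term-topic weights
    H = rng.random((k, m)) + eps   # topic-photo weights
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data: rows are 2 speech terms + 2 visual words; columns are 3 photos.
V = np.array([[3., 0., 2.],
              [0., 2., 0.],
              [2., 0., 3.],
              [0., 3., 1.]])
W, H = nmf(V, k=2)
print(np.round(W @ H, 1))  # low-rank reconstruction approximates V
```

The columns of H serve as latent-topic representations of the photos, which is the kind of representation the retrieval model in Chapter 3 is built on.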
First, because speech annotations may be recorded anywhere and in a very spontaneous speaking style, recognition accuracy is low, so we extract the expected term frequency of each word from the recognition word graph (lattice) as the speech feature. Since only a few photos have speech annotations, we also extract local and global image features from every photo to supply the information the speech features miss. A topic model then integrates the speech and image features, and the latent topics it learns are used to construct the retrieval model.
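The expected term frequency can be sketched as follows. The toy lattice arcs and their posterior probabilities below are invented for illustration; a real system would obtain arc posteriors from the recognizer's forward-backward pass over the lattice:

```python
# Expected term frequency from a word lattice: E[tf(w)] is the sum of the
# posterior probabilities of all lattice arcs labeled with word w.
from collections import defaultdict

def expected_term_frequency(lattice_arcs):
    """lattice_arcs: iterable of (word, posterior) pairs, one per arc."""
    etf = defaultdict(float)
    for word, posterior in lattice_arcs:
        etf[word] += posterior
    return dict(etf)

# Toy lattice for one spoken annotation (posteriors invented).
arcs = [("beach", 0.8), ("peach", 0.2), ("beach", 0.6), ("sunset", 0.9)]
print(expected_term_frequency(arcs))  # "beach" accumulates 0.8 + 0.6
```

Summing posteriors rather than counting 1-best words lets low-confidence competing hypotheses (here "peach") contribute a small weight instead of being discarded outright.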
We further observed that the topic model's retrieval performance left considerable room for improvement. Starting from the first-pass retrieval results returned by the topic model, we compute photo-to-photo similarities from the expected term frequencies and the local and global image features, then apply a random walk algorithm so that highly similar photos receive similar relevance scores. This re-ranking yields a substantial improvement in retrieval performance.
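A minimal sketch of this re-ranking step, assuming a toy three-photo similarity matrix, invented first-pass scores, and a hypothetical interpolation weight `alpha`:

```python
# Random-walk re-ranking over a photo similarity graph: scores are repeatedly
# interpolated between the first-pass relevance and scores propagated from
# similar photos, so similar photos end up with similar relevance.
import numpy as np

def random_walk_rerank(similarity, first_pass, alpha=0.9, iters=100, tol=1e-8):
    P = similarity / similarity.sum(axis=0, keepdims=True)  # column-stochastic
    s0 = first_pass / first_pass.sum()                      # normalized scores
    s = s0.copy()
    for _ in range(iters):
        s_new = (1 - alpha) * s0 + alpha * P @ s            # propagate + restart
        if np.abs(s_new - s).sum() < tol:
            break
        s = s_new
    return s

sim = np.array([[1.0, 0.9, 0.1],     # photos 0 and 1 are very similar
                [0.9, 1.0, 0.1],
                [0.1, 0.1, 1.0]])
scores = np.array([0.5, 0.1, 0.4])   # toy first-pass relevance
print(random_walk_rerank(sim, scores))
```

After the walk, photos 0 and 1 end up with much closer scores than their first-pass gap of 0.4, which is exactly the smoothing effect the re-ranking relies on.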
Finally, because the speech features are very sparse, the topic model leans heavily on the image features during training, even though the speech features are the primary source of personalized and semantic information about the user. We therefore adopt distributed word representations, which have recently performed well at finding semantically and syntactically related words. Based on the expected term frequencies and the overall image semantic concepts, related words are identified and added to the speech features, much like automatic annotation expansion. The previously sparse speech features thus become dense, letting the topic model exploit more personalized and semantic information during training and further improving the re-ranking performance of the random walk model.
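The annotation-expansion idea can be sketched with toy word vectors. The 3-dimensional embeddings below are invented for illustration; the thesis would use distributed representations trained with models such as skip-gram on a large corpus:

```python
# Expand a sparse speech annotation with its nearest neighbors in an
# embedding space, keeping only neighbors above a similarity threshold.
import numpy as np

embeddings = {  # toy vectors, invented for illustration
    "beach":  np.array([0.9, 0.1, 0.0]),
    "ocean":  np.array([0.8, 0.2, 0.1]),
    "sunset": np.array([0.1, 0.9, 0.2]),
    "office": np.array([0.0, 0.1, 0.9]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def expand_annotation(words, k=1, threshold=0.8):
    """Add up to k related words per annotated word."""
    expanded = set(words)
    for w in words:
        if w not in embeddings:
            continue
        neighbors = sorted(
            ((cosine(embeddings[w], v), cand)
             for cand, v in embeddings.items() if cand != w),
            reverse=True,
        )
        expanded.update(cand for sim, cand in neighbors[:k] if sim >= threshold)
    return expanded

print(expand_annotation(["beach"]))  # adds "ocean", the nearest related word
```

The expanded word set is then folded back into the photo document, densifying the speech features before the topic model is retrained.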

Acknowledgments
Abstract (Chinese)
Chapter 1: Introduction
  1.1 Motivation
  1.2 Related Work
  1.3 Contributions of This Thesis
  1.4 Thesis Organization
Chapter 2: Background
  2.1 Speech Recognition
    2.1.1 Front-End Processing
    2.1.2 Statistical Speech Recognition
    2.1.3 Acoustic Models
    2.1.4 Language Models
    2.1.5 Search Algorithms and Word Graphs
  2.2 Overview of Retrieval Systems
    2.2.1 Spoken Document Retrieval
    2.2.2 Content-Based Image Retrieval
  2.3 Topic Models
    2.3.1 Probabilistic Latent Semantic Analysis
    2.3.2 Non-negative Matrix Factorization
  2.4 Graph Theory
    2.4.1 Random Walk
  2.5 Word Representations
    2.5.1 Traditional Word Representations
    2.5.2 Distributed Word Representations
Chapter 3: Semantic Retrieval of Personal Photos by Integrating Speech and Image Features with Topic Models
  3.1 Introduction
  3.2 System Architecture
  3.3 Image Feature Extraction
    3.3.1 Visual Words as Local Image Features
    3.3.2 Overall Image Semantic Concepts as Global Image Features
  3.4 Speech Feature Extraction
  3.5 Constructing Photo Documents
  3.6 Building Retrieval Models with Topic Models
    3.6.1 Probabilistic Latent Semantic Analysis Retrieval Model
    3.6.2 Non-negative Matrix Factorization Retrieval Model
  3.7 Basic Experimental Setup
    3.7.1 Personal Photo Database
    3.7.2 Collection of Speech Annotations for Personal Photos
    3.7.3 Speech Recognition Results
    3.7.4 Experimental Configuration
    3.7.5 Evaluation Metrics
  3.8 Experimental Results and Analysis
  3.9 Chapter Summary
Chapter 4: Enhancing Semantic Retrieval of Personal Photos with Random Walk Models
  4.1 Introduction
  4.2 System Architecture
  4.3 Random Walk Models
    4.3.1 Pruning
    4.3.2 Single-Layer Random Walk Model
    4.3.3 Two-Layer Random Walk Model
  4.4 Basic Experimental Configuration
  4.5 Experimental Results and Discussion
    4.5.1 Overall Experimental Results
    4.5.2 On the Interpolation Parameter and Pruning Methods
  4.6 Chapter Summary
Chapter 5: Using Distributed Word Representations to Enhance Semantic Retrieval of Personal Photos with Sparse Speech Annotations
  5.1 Introduction
  5.2 System Architecture
  5.3 Distributed Word Representation Models
    5.3.1 Recurrent Neural Network Language Models
    5.3.2 Continuous Bag-of-Words Model
    5.3.3 Skip-gram Model
  5.4 Adding Related Words from Word Representation Models to Photo Documents
  5.5 Basic Experimental Configuration
  5.6 Experimental Results and Analysis
    5.6.1 Using Distributed Word Representation Models
    5.6.2 Results of Further Applying the Random Walk Model
  5.7 Chapter Summary
Chapter 6: Conclusions and Future Work
  6.1 Conclusions
  6.2 Future Research Directions
    6.2.1 Improving the Speech Recognition System
    6.2.2 Improving Image Features
    6.2.3 Improving the Model Integrating Speech and Image Features
    6.2.4 Research on Automatic Annotation
References

