National Digital Library of Theses and Dissertations in Taiwan

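Author: 林彥文 (Yan-Wun Lin)
Title (Chinese): 從網路新聞到鏈結資料的轉換系統設計與開發
Title (English): The System Design and Development for Transforming Digital News to Linked Open Data
Advisor: 廖宜恩
Oral defense committee: 高勝助, 陳朝欽
Oral defense date: 2017-06-27
Degree: Master's
Institution: National Chung Hsing University (國立中興大學)
Department: Department of Computer Science and Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar)
Language: Chinese
Pages: 48
Keywords: Linked Data; Digital News; Semantic Web; Named Entity Recognition; Convolutional Neural Networks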
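Abstract (translated from the Chinese):
With the rise of social media and digital news, people's habits of reading news and discussing news topics have changed substantially. In this era of ubiquitous mobile devices, most people use digital news platforms as their main channel for browsing news and social networks as the place to discuss news topics. However, the digital news platforms offered by Taiwanese news publishers are mostly built from ordinary web pages (HTML), and their news pages do not provide machine-readable semantics (metadata). As a result, users can rely only on the services that each platform itself provides, and it is difficult to link the people and organizations mentioned in a news item to other resources on the web.
To address this lack of semantics in news pages, this study proposes a system for transforming digital news into Linked Data. The system converts the information that digital news sites currently provide into metadata conforming to a Resource Description Framework (RDF) schema that this study designed for Taiwanese digital news platforms. It uses Named Entity Recognition (NER) to tag the people and organizations mentioned in the news text and links these entities to knowledge bases such as Chinese Wikipedia and DBpedia, which mitigates the synonym problem users may encounter when searching for news about a specific entity. It also replaces conventional keyword search with SPARQL queries from the Semantic Web technology stack.
With the rise of social network platforms, people care not only about the news itself but also about the comments that users post on social platforms regarding specific entities in the news. This study therefore converts not only digital news platform data but also the news comments posted on social network platforms into Linked Data. It uses Convolutional Neural Networks (CNN) to analyze how strongly each comment is related to each tag of the news text, in order to infer which entity a comment is discussing and tag it accordingly. This improves the searchability of news comments, allowing users to search for comments related to specific people or organizations.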
Abstract (English):
The rise of social media and digital news has greatly changed how people read news and discuss news topics. People now browse news mainly through digital news platforms and discuss news issues on social networks. However, because the digital news platforms provided by Taiwanese news publishers use traditional web pages (HTML), their news pages do not provide semantics (metadata) that computers can parse. As a result, web resources related to the news content cannot be linked to the news.
To solve this problem, we design a system that transforms digital news data into the Resource Description Framework (RDF). The proposed system uses named entity recognition to identify two kinds of entities in digital news, persons and organizations, and then links them to knowledge bases such as Wikipedia and DBpedia. Our system also addresses the synonym problem that arises when users search for news about a specific entity. In addition, it provides Semantic Web SPARQL queries to enhance search capability.
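The linking idea above can be illustrated with a minimal sketch. It is not the thesis's implementation: the alias table stands in for NER plus knowledge-base lookup, the `example.org` namespace and predicate URI are assumptions (only the name hasTag comes from the thesis outline), and the DBpedia resource URI is illustrative. The point it shows is that once different surface forms resolve to one URI, a single triple pattern finds every related news item.

```python
from typing import List, NamedTuple, Optional, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


NEWS = "http://example.org/news/"              # hypothetical namespace
HAS_TAG = "http://example.org/schema#hasTag"   # hypothetical predicate URI
NCHU = "http://dbpedia.org/resource/National_Chung_Hsing_University"  # illustrative

# Hypothetical alias table standing in for NER plus knowledge-base lookup.
ALIASES = {
    "NCHU": NCHU,
    "National Chung Hsing University": NCHU,
    "中興大學": NCHU,
}


def link_entities(text: str) -> Set[str]:
    """Return the URIs of all entities whose alias occurs in the text."""
    return {uri for alias, uri in ALIASES.items() if alias in text}


def news_to_triples(news_id: str, text: str) -> List[Triple]:
    """Emit one (news, hasTag, entity) triple per linked entity."""
    return [Triple(NEWS + news_id, HAS_TAG, uri)
            for uri in sorted(link_entities(text))]


def match(triples: List[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Basic triple-pattern match; None plays the role of a SPARQL variable."""
    return [t for t in triples
            if (s is None or t.subject == s)
            and (p is None or t.predicate == p)
            and (o is None or t.obj == o)]


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as an N-Triples line (all terms are IRIs here)."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


store = (news_to_triples("a1", "NCHU opens a new research lab")
         + news_to_triples("a2", "中興大學舉辦資訊工程研討會")
         + news_to_triples("a3", "Weather forecast for the weekend"))

# Rough analogue of: SELECT ?news WHERE { ?news <...#hasTag> <...NCHU> }
hits = [t.subject for t in match(store, p=HAS_TAG, o=NCHU)]

for t in store:
    print(to_ntriples(t))
```

A keyword search for "NCHU" would miss article `a2`, which uses the Chinese name; the pattern match on the shared URI returns both `a1` and `a2`.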
People not only browse news on digital news platforms but also discuss news issues on social networks. Our system transforms digital news platform data into linked data. In addition, we use a Convolutional Neural Network (CNN) classifier to analyze the degree of relatedness between news opinions from social networks and the entities in the news. In this way, users can search for opinions about a specific entity.
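The record does not describe the network itself, so the following is only a sketch of the general shape such a classifier might take: token embeddings, a convolution over token windows, max-over-time pooling, and a sigmoid score per candidate tag. Every number in it is hand-set for illustration rather than trained, and the embeddings, filters, and tag names are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]

# Toy 2-dimensional embeddings (hypothetical; a real system would learn them).
EMBED: Dict[str, Vector] = {
    "mayor": [1.0, 0.0],
    "speech": [0.8, 0.1],
    "team": [0.0, 1.0],
    "won": [0.1, 0.9],
}
UNKNOWN: Vector = [0.0, 0.0]

# Two width-2 filters with hand-set weights: (kernel rows, bias).
FILTERS: List[Tuple[List[Vector], float]] = [
    ([[1.0, 0.0], [1.0, 0.0]], 0.0),   # responds to embedding dimension 0
    ([[0.0, 1.0], [0.0, 1.0]], 0.0),   # responds to embedding dimension 1
]

# One output unit per candidate tag (hypothetical tags): (weights, bias).
OUTPUT: Dict[str, Tuple[Vector, float]] = {
    "Person:Mayor": ([2.0, -2.0], 0.0),
    "Org:Team": ([-2.0, 2.0], 0.0),
}


def conv1d(seq: Sequence[Vector], kernel: List[Vector], bias: float) -> List[float]:
    """Slide the kernel over consecutive token windows and apply ReLU."""
    width = len(kernel)
    feature_map = []
    for i in range(len(seq) - width + 1):
        z = bias
        for k_row, x_row in zip(kernel, seq[i:i + width]):
            z += sum(w * x for w, x in zip(k_row, x_row))
        feature_map.append(max(0.0, z))
    return feature_map


def max_over_time(feature_map: List[float]) -> float:
    """Keep the strongest response; 0.0 if the text is shorter than the kernel."""
    return max(feature_map) if feature_map else 0.0


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def relatedness(tokens: Sequence[str]) -> Dict[str, float]:
    """Score how strongly a tokenized comment relates to each candidate tag."""
    seq = [EMBED.get(tok, UNKNOWN) for tok in tokens]
    pooled = [max_over_time(conv1d(seq, k, b)) for k, b in FILTERS]
    return {tag: sigmoid(b + sum(w * f for w, f in zip(ws, pooled)))
            for tag, (ws, b) in OUTPUT.items()}


scores = relatedness(["the", "mayor", "speech", "was", "long"])
best = max(scores, key=scores.get)
print(best, scores)
```

In a pipeline like the one the abstract describes, the highest-scoring tag could then be attached to the comment as a linked-data statement, so that comments become searchable by entity.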
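Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1-1 Research Background and Motivation
1-2 Research Objectives
1-3 Main Contributions
1-4 Thesis Organization
Chapter 2 Related Work
2-1 Semantic Web
2-2 Linked Data
2-3 Linked Data and Digital News
2-4 Systems for Transforming Digital News to Linked Data
2-5 Named Entity Recognition
2-6 Applications of Convolutional Neural Networks to Opinion Mining
Chapter 3 System Architecture and Algorithms
3-1 System Architecture
3-2 RDF Metadata for Digital News
3-3 Data Collection Module
3-3-1 Social Media Data Collection
3-3-2 Digital News Data Collection
3-4 Data Preprocessing Module
3-5 hasTag Data Extraction Module
3-6 TalkingAbout Data Extraction Module
Chapter 4 System Implementation and Experimental Analysis
4-1 Development Tools and Experimental Environment
4-2 A Sample Digital News Platform Applying Linked Data
Chapter 5 Conclusions and Future Work
5-1 Conclusions
5-2 Future Research Directions
References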