National Digital Library of Theses and Dissertations in Taiwan

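Graduate student: Chih-Kai Hsu (許智凱)
Thesis title: Clustering of Biomedical Articles based on Main Research Issues (植基於主要研究議題之生醫文獻分群)
Advisor: Rey-Long Liu (劉瑞瓏)
Oral defense committee: Wen-Cheng Lin (林紋正), Guan-Ling Lee (李官陵)
Oral defense date: 2019-07-09
Degree: Master's
Institution: Tzu Chi University (慈濟大學)
Department: Master's Program, Department of Medical Informatics
Discipline: Medicine and Health
Field: Medical Technology and Laboratory Science
Thesis type: Academic thesis
Publication year: 2019
Graduation academic year: 107 (ROC calendar)
Language: Chinese
Pages: 40
Keywords: Biomedical Articles; Overlapping Clustering; Main Research Issues; Issue-Based Clustering; Reference Titles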
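Abstract (translated from the Chinese): As the number of biomedical articles keeps growing, it is very difficult for biomedical scientists to keep a complete and timely grasp of the main research issues discussed in the articles and of their results. Clustering biomedical articles by research issue beforehand would therefore help scientists follow published results more efficiently and promote the development of biomedical research. However, successfully identifying the research issues discussed in a biomedical article is very difficult, let alone computing inter-article similarity and clustering by research issue. This study proposes a technique, ICRT (Issue-Based Clustering with Reference Titles), which uses the titles of the references in articles to find the words related to the research issues of two articles, and uses these words to improve the results of existing biomedical article clustering. Experimental results show that using ICRT as a post-processor for these clustering systems significantly improves their performance in clustering articles by research issue. For biomedical scientists, this result not only groups biomedical articles that discuss the same issue into the same cluster more efficiently, but can also facilitate the cross-analysis of highly related biomedical articles.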
Abstract (English): As the number of biomedical articles keeps growing, it is difficult for biomedical scientists to obtain, comprehensively and in a timely manner, the main research results reported in them. Clustering biomedical articles by their main research issues can help scientists find those results and hence facilitate the progress of biomedical research. However, it is challenging to identify the main research issues in articles and then cluster the articles by inter-article similarity in those issues. This thesis proposes a new technique, ICRT (Issue-Based Clustering with Reference Titles). It uses the titles of the references in articles to find the words related to the main research issues of two articles, and these words can be used to improve biomedical article clustering. Experimental results show that using ICRT as a post-processor for various clustering systems improves their performance in biomedical article clustering. ICRT can thus be a good tool for improving the cross-analysis of highly related biomedical articles.
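The abstract does not spell out how ICRT derives issue-related words from reference titles, so the following is only an illustrative sketch of the general idea, not the thesis's algorithm. It assumes that each reference title of one article is matched to its most similar reference title in the other article, and that the words shared by a matched pair are kept. The function names, the stopword list, the Jaccard measure, and the `min_sim` threshold are all hypothetical choices made for this example.

```python
import re

# Hypothetical stopword list, chosen only for this example.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "on", "to", "with", "by"}

def tokenize(title):
    """Lowercase a title and return its set of non-stopword tokens (2+ chars)."""
    words = re.findall(r"[a-z0-9]{2,}", title.lower())
    return {w for w in words if w not in STOPWORDS}

def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def shared_issue_terms(ref_titles_a, ref_titles_b, min_sim=0.2):
    """For each reference title of article A, find its most similar reference
    title in article B and collect the words the two titles share.
    min_sim is an assumed cut-off, not a value from the thesis."""
    tokens_b = [tokenize(t) for t in ref_titles_b]
    terms = set()
    for title in ref_titles_a:
        ta = tokenize(title)
        # Best-matching title in B; an empty set if B has no references.
        best = max(tokens_b, key=lambda tb: jaccard(ta, tb), default=set())
        if jaccard(ta, best) >= min_sim:
            terms |= ta & best
    return terms

# Made-up reference titles for two hypothetical articles.
article_a = ["Fuzzy c-means clustering of gene expression data",
             "Topic models for biomedical text"]
article_b = ["Clustering gene expression profiles with fuzzy c-medoids",
             "Citation analysis of web documents"]
terms = shared_issue_terms(article_a, article_b)
print(sorted(terms))  # ['clustering', 'expression', 'fuzzy', 'gene']
```

In this toy input only the first pair of titles overlaps enough to pass the threshold, so the shared words come from that pair alone. A post-processor in the spirit of the abstract could then use such words to re-score article pairs produced by an existing clustering system.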
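Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
1. Introduction
1.1 Problem Definition
1.2 Research Motivation
1.3 Main Contributions and Structure of the Thesis
2. Literature Review
2.1 Overlapping Clustering Algorithms Whose Cluster Centers Are Arithmetic Centers
2.2 Overlapping Clustering Algorithms Whose Cluster Centers Are Articles
3. Research Method
3.1 Finding the Most Similar References of an Article
3.2 Computing the Similarity of Article Pairs
3.3 Finding the Terms That Describe Article Similarity
3.4 Clustering of Article Pairs
3.4.1 Generating Article Pairs
3.4.2 Clustering Procedure
3.4.3 First Clustering Pass
3.4.4 Second Clustering Pass
3.4.5 Clustering Results of the Experimental Group
4. Experimental Evaluation
4.1 Collecting Experimental Data
4.2 Data Preprocessing
4.3 Design of the Control Group
4.4 Evaluation Criteria
4.5 Training the Thresholds
4.6 Experimental Results
4.7 Case Analysis and Discussion
5. Conclusion and Future Work
References
Appendix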
[1]Abuaiadah D. Using bisect k-means clustering technique in the analysis of Arabic documents. ACM Transactions on Asian and Low-Resource Language Information Processing, 2016, 15(3), 17.

[2]Amigo E., Gonzalo J., Artiles J., Verdejo F. A comparison of extrinsic clustering evaluation metrics based on formal constraints. Information Retrieval, 2009, 12(4), 461–486.

[3]Banerjee A., Krumpelman C., Ghosh J., Basu S., Mooney R. J. Model based overlapping clustering. In Proceedings of the International Conference on Knowledge Discovery and Data Mining, 2005, Chicago, USA, pp. 532–537.

[4]Blei D. M. Probabilistic topic models. Communications of the ACM (CACM), 2012, 55(4):77-84.

[5]Bezdek J. C., Ehrlich R., Full W. FCM: The Fuzzy c-means Clustering Algorithm. Computers & Geosciences, 1984, Vol. 10, No. 2-3, 191-203.

[6]Boyack K. W., Newman D., Duhon R. J., Klavans R., Patek M., Biberstine J. R., et al. Clustering More than Two Million Biomedical Publications: Comparing the Accuracies of Nine Text-Based Similarity Approaches. PLoS ONE, 2011, 6(3): e18029.

[7]Couto T., Cristo M., Gonçalves M. A., Calado P., Ziviani N., Moura E., Ribeiro-Neto B. A Comparative Study of Citations and Links in Document Classification. In Proc. of the 6th ACM/IEEE-CS Joint Conference on Digital Libraries, 2006, 75-84.

[8]Hang N., Honda K., Ichihashi H., Notsu A. Linear fuzzy clustering of relational data based on extended fuzzy c-medoids. In Proceedings of IEEE International Conference on Fuzzy Systems, 2008, 366–371.

[9]Kessler M. M. Bibliographic coupling between scientific papers. American Documentation, 1963, 14(1):10–25.

[10]Krishnapuram R., Joshi A., Yi L. A Fuzzy Relative of the k-Medoids Algorithm with Application to Web Document and Snippet Clustering. In Proc. of IEEE International Conference on Fuzzy Systems, 1999, 1281-1286.

[11]Lin J., Wilbur W. J. PubMed related articles: a probabilistic topic-based model for content similarity. BMC Bioinformatics, 2007, 8:423.

[12]Liu R. L. Context-based Term Frequency Assessment for Text Classification. Journal of the American Society for Information Science and Technology, 2010, Vol. 61, Issue 2, 300-309.

[13]Liu R. L. Passage-based Bibliographic Coupling: An Inter-Article Similarity Measure for Biomedical Articles. PLOS ONE, 2015, 10(10): e0139245.

[14]Liu R. L. A New Bibliographic Coupling Measure with Descriptive Capability. Scientometrics, Volume 110, Issue 2, 2017, 915–935.

[15]Nourashrafeddin S., Sherkat E., Minghim R., Milios E. E. A Visual Approach for Interactive Keyterm-Based Clustering. ACM Transactions on Interactive Intelligent Systems (TiiS), 2018, 8(1), 6.

[16]Peters G., Crespo F., Lingras P., Weber R. Soft clustering - fuzzy and rough approaches and their extensions and derivatives. International Journal of Approximate Reasoning 54, 2013, 307–322.

[17]Sisodia D. S., Verma S., Vyas O. P. A Subtractive Relational Fuzzy C-Medoids Clustering Approach To Cluster Web User Sessions from Web Server Logs. International Journal of Applied Engineering Research, 2017, Vol. 12, No. 7, 1142-1150.

[18]Wang Y., Chen L., Mei J. P. Incremental fuzzy clustering with multiple medoids for large data. IEEE transactions on fuzzy systems, 2014, 22(6), 1557-1568.

[19]Xie P., Xing E. P. Integrating Document Clustering and Topic Modeling. In Proc. of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence, 2013, 694–703.

[20]Yau C. K., Porter A. L., Newman N. C., Suominen A. Clustering scientific documents with topic modeling. Scientometrics, 2014, 100(3), 767-786.