National Digital Library of Theses and Dissertations in Taiwan

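Author: Ching-Yun Lin (林青芸)
Title: Research on Converting the Narrative Text of Scientific Research Articles into Structured Documents by Natural Language Processing
Title (translated from the Chinese): A Study on Converting Narrative Research Article Text into Structured Documents Using Natural Language Processing: The Case of Eligibility Criteria in Taiwan NHIRD Research Articles
Advisors: Der-Ming Liou (劉德明), Mei-Lien Pan (潘美連)
Degree: Master's
Institution: National Yang-Ming University (國立陽明大學)
Department: Institute of Biomedical Informatics
Discipline: Life Sciences
Subdiscipline: Biochemistry
Thesis type: Academic thesis
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar)
Language: Chinese
Number of pages: 39
Keywords: NHIRD; Eligibility Criteria; Natural Language Processing; MetaMap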
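Background: Research using the National Health Insurance Research Database (NHIRD) must first define clear eligibility criteria, that is, which patient characteristics qualify a patient as a study case. Many scholars have reported that different definitions of eligibility criteria lead to different interpretations of results, so defining suitable, objective eligibility criteria is a first step in study design that cannot be neglected.
Objective: To convert narrative eligibility criteria text into a structured form and make eligibility criteria reusable knowledge, this study developed an R package that uses a standardized format to summarize the definitions used for eligibility criteria in past NHIRD research articles.
Methods: This study used 2517 NHIRD research articles published in international journals. Eligibility criteria paragraphs were extracted from the article text, and the semantics of each sentence were detected, including medical concepts and diagnosis codes, demographic characteristics, medical specialty restrictions, and the temporal relationships of diagnosis codes. The results were stored in structured form as XML documents.
Results: Testing showed that the tool developed in this study performed better at abbreviation detection and at recognizing age restrictions and medical specialty restrictions, while detection of the minimum number of visits and of temporal relationships has room for improvement.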
Background: In studies using the National Health Insurance Research Database (NHIRD), researchers must define suitable eligibility criteria for study samples. Several studies of eligibility criteria have suggested that results for the same disease vary with how the criteria are set. Establishing eligibility criteria is therefore an important step in study design.
Aim: To convert narrative eligibility criteria into a structured form, we built a series of R-based text processing methods to analyze the eligibility criteria in NHIRD articles.
Methods: We used 2517 NHIRD papers to build the text processing tool. We classified each paper's study type from its title, identified medical concepts and abbreviations, detected basic demographic characteristics and specialty restrictions, extracted diagnosis codes and temporal relationships, and then generated structured eligibility criteria XML files.
Results: The tool performed well in detecting abbreviations, age restrictions, and specialty restrictions, which suggests it is useful for eligibility criteria analysis. Identification of the minimum number of visits and of temporal relationships still has room for improvement.
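The pipeline described in the abstract (regular-expression detection of criteria, followed by XML output) can be illustrated with a small sketch. This is not the thesis's R package: it is written in Python for illustration, and every pattern, function name, and XML element name below is an assumption, since the thesis's own regular expressions and XML schema are not reproduced on this page.

```python
import re
import xml.etree.ElementTree as ET

# Illustrative patterns only (assumptions, not the thesis's own expressions).
AGE_RE = re.compile(
    r"aged?\s+(?P<low>\d{1,3})(?:\s*(?:-|to)\s*(?P<high>\d{1,3}))?\s+years"
    r"(?P<older>\s+(?:or|and)\s+(?:older|above))?",
    re.IGNORECASE,
)
# A mention such as "ICD-9-CM codes 250 and 585.9"; codes are read up to ';' or ')'.
ICD9_MENTION_RE = re.compile(
    r"ICD-9(?:-CM)?\s+codes?\s+(?P<codes>[^;)]+)", re.IGNORECASE
)
ICD9_CODE_RE = re.compile(r"\b(?:\d{3}|V\d{2}|E\d{3})(?:\.\d{1,2})?\b")
VISITS_RE = re.compile(
    r"at least\s+(?P<n>\d+|one|two|three)\s+"
    r"(?:(?P<setting>outpatient|inpatient)\s+)?(?:visits?|claims?)",
    re.IGNORECASE,
)
WORD_NUMBERS = {"one": "1", "two": "2", "three": "3"}


def extract_criteria(text: str) -> ET.Element:
    """Return a hypothetical <eligibility_criteria> element for one sentence."""
    root = ET.Element("eligibility_criteria")

    age = AGE_RE.search(text)
    if age:
        node = ET.SubElement(root, "age")
        if age.group("high"):            # "aged 20 to 65 years"
            node.set("min", age.group("low"))
            node.set("max", age.group("high"))
        elif age.group("older"):         # "aged 20 years or older"
            node.set("min", age.group("low"))
        else:                            # "aged 20 years"
            node.set("exact", age.group("low"))

    # Only look for codes inside an explicit ICD-9 mention, to avoid
    # mistaking other three-digit numbers for diagnosis codes.
    for mention in ICD9_MENTION_RE.finditer(text):
        for code in ICD9_CODE_RE.findall(mention.group("codes")):
            ET.SubElement(root, "diagnosis", system="ICD-9-CM", code=code)

    visits = VISITS_RE.search(text)
    if visits:
        raw = visits.group("n").lower()
        node = ET.SubElement(root, "min_visits", count=WORD_NUMBERS.get(raw, raw))
        if visits.group("setting"):
            node.set("setting", visits.group("setting").lower())

    return root


if __name__ == "__main__":
    sentence = (
        "Patients aged 20 years or older with at least two outpatient visits "
        "for diabetes (ICD-9-CM codes 250 and 585.9) were included."
    )
    print(ET.tostring(extract_criteria(sentence), encoding="unicode"))
```

Restricting the code search to text that follows an explicit ICD-9 mention is a design choice of this sketch, made to cut down false positives. Real eligibility text is far more varied, which is consistent with the abstract's note that visit counts and temporal relationships were harder to detect.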
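Acknowledgements i
Abstract (Chinese) ii
Abstract (English) iii
Table of Contents iv
List of Figures v
List of Tables vi
Chapter 1 Introduction 1
Section 1 Research Background 1
Section 2 Research Motivation 3
Section 3 Research Objectives 5
Section 4 Research Framework 5
Chapter 2 Literature Review 6
Chapter 3 Research Design 10
Section 1 Research Materials 10
Section 2 Defining the Study Event 11
Section 3 Defining the Study Type 11
Section 4 Development Environment 12
Section 5 Text Processing Workflow 13
Chapter 4 Results and Evaluation 25
Chapter 5 Discussion and Conclusion 33
Section 1 Discussion 33
Section 2 Research Limitations 35
Section 3 Future Work 36
Section 4 Conclusion 36
References 37
Appendix 39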

List of Figures
Figure 4-1. Structured eligibility criteria XML document for an NHIRD research article 32

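List of Tables
Table 1-1. Comparison of eligibility criteria text in NHIRD research articles and clinical trials 4
Table 2-1. Classification dimensions of eligibility criteria 6
Table 2-2. Context patterns of eligibility criteria 9
Table 3-1. Common study types in NHIRD research article titles 12
Table 3-2. Regular expressions for study types in article titles 14
Table 3-3. Result returned when a single study event is identified 15
Table 3-4. Result returned when multiple study events are identified 15
Table 3-5. Rules for comparing candidate full forms of abbreviations 17
Table 3-6. Text substitution rules for eligibility criteria paragraphs 17
Table 3-7. Regular expressions for age restrictions 18
Table 3-8. Regular expressions for diagnosis codes 20
Table 3-9. Mapping of diagnosis codes to UMLS concepts 20
Table 3-10. Regular expressions for minimum visit count restrictions 21
Table 3-11. Regular expressions for temporal relationships 22
Table 3-12. Summary of the regular expressions used in this study 23
Table 4-1. Results of determining study type from article titles 25
Table 4-2. Examples of MetaMap inappropriately splitting medical terms 27
Table 4-3. Detection results for age restrictions 29
Table 4-4. Detection results for sex restrictions 29
Table 4-5. Detection results for medical specialty restrictions 30
Table 4-6. Detection results for minimum visit count restrictions 31
Table 4-7. Detection results for temporal relationships 32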