
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

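Graduate student: Hsiang-yu Yuan (阮相宇)
Thesis title: The Application of Natural Language Processing in a Medical Database System (自然語言處理在醫學資料庫系統上的應用)
Advisor: Jau-Min Wong (翁昭旼)
Degree: Master's
Institution: National Taiwan University
Department: Institute of Biomedical Engineering
Discipline: Engineering
Field: General Engineering
Thesis type: Academic thesis
Year of publication: 2001
Academic year of graduation: 89 (ROC calendar)
Language: English
Number of pages: 66
Keywords: Natural Language Processing; Information Extraction; Medical Database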
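The large volume of narrative text medical records accumulated over time in hospitals holds a wealth of clinical information and knowledge. However, because this information is embedded in physicians' free-text descriptions, the records cannot be used effectively for applications such as clinical decision systems, data mining, and medical statistics. Moreover, no effective full-text information retrieval system currently provides accurate queries over these text records, so looking up a patient's record is time-consuming and inconvenient for physicians.
This study explores the application of natural language processing techniques in the medical domain. The aim is to process narrative text medical records and convert their content into structured, clinically meaningful data, and then to use that structure to provide a more efficient, accurate, and reliable medical record query system.
We developed a medical text analysis tool based on natural language processing and information extraction techniques. The tool was applied to a colonoscopy medical database: 7,590 colonoscopy examination records were analyzed, and the structured results were stored in the medical database. We completed two follow-up applications: a web-based medical record query system that supports queries on various disease attributes, and the derivation of the clinical distribution of colorectal diseases, which yields information of rich clinical value.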
In hospitals, a large amount of patient data is produced each day, and extracting crucial clinical information from this mass of medical data has become an important issue for medical professionals. Because patient data is stored in narrative form, however, the information it contains is difficult to access.
In this study, we develop a natural language processor that transforms narrative data into structured information, which can then be used in decision support, data mining, and biomedical statistics.
The natural language processing system successfully processed 7,590 medical reports. The extracted clinical information is used in a web-based patient record system. In addition, the clinical distribution of diseases is computed from the extracted information.
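The idea of turning a narrative report into structured information can be illustrated with a minimal Python sketch. It is not the thesis's system: the lexicons `FINDINGS` and `LOCATIONS`, the `FindingFrame` fields, and the sample report are all hypothetical, and simple keyword and regular-expression matching stands in for the tagging, parsing, and pattern-matching stages listed in the table of contents.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical mini-lexicons; a real system would use a curated knowledge base.
FINDINGS = {"polyp", "ulcer", "tumor", "diverticulum"}
LOCATIONS = ["ascending colon", "transverse colon", "descending colon",
             "sigmoid colon", "rectum", "cecum"]

# Matches a size such as "0.5 cm" or "8 mm".
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(cm|mm)\b")


@dataclass
class FindingFrame:
    """One structured record (a 'frame') with slots filled from one sentence."""
    finding: str
    location: Optional[str] = None
    size_cm: Optional[float] = None


def extract_frames(report: str) -> List[FindingFrame]:
    """Fill one frame per known finding term found in each sentence."""
    frames = []
    # Split after '.' or ';' only when followed by whitespace, so "0.5" stays whole.
    for sentence in re.split(r"(?<=[.;])\s+", report.strip()):
        lowered = sentence.lower()
        words = set(re.findall(r"[a-z]+", lowered))
        location = next((loc for loc in LOCATIONS if loc in lowered), None)
        size_cm = None
        match = SIZE_RE.search(lowered)
        if match:
            value = float(match.group(1))
            size_cm = value / 10 if match.group(2) == "mm" else value
        for finding in sorted(FINDINGS & words):
            frames.append(FindingFrame(finding, location, size_cm))
    return frames


if __name__ == "__main__":
    sample = ("A 0.5 cm polyp was noted in the sigmoid colon. "
              "An 8 mm polyp was seen at the rectum.")
    for frame in extract_frames(sample):
        print(frame)
```

Once findings are held as records like these, a query by disease attribute (for example, all polyps in the sigmoid colon) becomes a simple filter. The sketch ignores negation, plurals, and pronouns, which a full processor would have to handle.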
1 Introduction
1.1 The importance of Narrative Data in Medicine
1.2 Difficulties in Accessing Information in Narrative Data
1.3 Natural Language Processing as a Solution
1.4 System Purpose of this Study
2 Background
2.1 Natural Language Processing and Information Extraction
2.1.1 Properties of Natural Language
2.1.2 Overview of Natural Language Processing
2.1.3 Natural Language Processing
2.1.3.1 Morphological Analysis
2.1.3.2 Syntactic Analysis
2.1.3.3 Semantic Analysis
2.1.3.4 Contextual Analysis
2.1.4 Critical Problems in Natural Language Processing
2.1.4.1 Ambiguities
2.1.4.2 Ill-form Context
2.1.4.3 Robustness Problem
2.1.5 Different Natural Language Processing Tasks
2.1.6 Information Extraction
2.1.7 System Evaluation in Message Understanding Conference
2.1.8 Natural Language in Medicine
2.1.9 Natural Language Processing in Medicine
2.1.10 System Evaluation in Medicine
2.2 Knowledge Representation
2.2.1 Ontology
2.2.2 Frame Representation
2.2.3 Knowledge representation in medicine
2.3 Colonoscopic Diseases
2.3.1 Polyps of Colon
2.3.2 Colonoscopic Findings
3 Methods
3.1 Material
3.2 System Architecture
3.3 NLP Component of Medical Texts
3.3.1 NLP System Description
3.3.2 Knowledge Base in Natural Language Processor
3.3.2.1 Knowledge Base Descriptions
3.3.3 Lexical Preprocessing
3.3.3.1 Structuring and Tokenizing
3.3.3.2 Part of Speech Tagging
3.3.4 Syntactical Analysis
3.3.4.1 Partial Parsing
3.3.5 Semantic Analysis
3.3.5.1 Semantic Tagging
3.3.5.2 Pattern Matching
3.3.5.3 Pronoun Processing
3.3.6 Frame Representation
3.3.6.1 Frame Generating
3.3.7 Frame-Slot Mapping
3.4 Clinical Information Model
3.4.1 The Role of XML Document Result
3.4.2 Representation of Clinical Information
3.4.2.1 Goal Oriented Document
3.4.2.2 Data Oriented Document
3.4.3 Controlled vocabulary mapping
3.5 Medical Query System Design
3.5.1 Query System Architecture
3.5.2 Interface Design Aspects
3.5.3 Colonoscopic Query System
3.6 Evaluation methods for NLP
3.6.1 The Gold Standard of Evaluation
4 Results
4.1 Material Source
4.2 Show Application
4.3 Clinical information
4.3.1 Information of Polyp
4.3.2 Disease-Age Distribution
4.4 Evaluation
5 Discussion
5.1 The Advantage of the Query System Using NLP
5.2 The Advantage of MeLPS
5.3 Comparisons with other NLP Systems
5.4 The reason of Error Results
5.5 Dictionaries Constructing
5.6 Contributions
5.7 Future Work
5.8 Conclusions