National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

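Graduate student: 周伯翰
Graduate student (English name): Po-Han Chou
Thesis title: 自動化的資訊擷取:擷取生物學文獻中的蛋白質對蛋白質互動關係 (Automated information extraction: extracting protein-to-protein interactions from the biological literature)
Thesis title (English, as given): Automated extraction of protein to protein interactions from biological literatures
Advisor: 留忠賢
Advisor (English name): Chung Shyan Liu
Degree: Master's
Institution: Chung Yuan Christian University (中原大學)
Department: Graduate Institute of Information Engineering (資訊工程研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2002
Academic year of graduation: 90 (ROC calendar)
Language: Chinese
Number of pages: 37
Chinese keywords: natural language processing (自然語言處理); protein interaction network (蛋白質互動網路); information extraction (資訊擷取)
English keywords: information extraction; protein to protein interaction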
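Motivation: To make more efficient use of the literature held in digital life science databases, we implemented an automated program that determines whether a document describes a protein-protein interaction, in support of biological research on heredity and evolution. The program applies information extraction techniques to extract protein-protein interaction information from digital databases automatically, and it presents the resulting interaction network to the user graphically.
The protein interactions discovered by scientific research are stored in digital databases mainly in the form of journal articles. These data are not stored in a format that computers can process easily, so searching the biological literature by computer for protein-protein interactions is inefficient. To help researchers obtain the protein-protein interaction information held in these databases, we use natural language processing techniques to implement an information extraction system that assists in analyzing document content. Tested on abstracts of yeast-related protein literature from BIND (Biomolecular Interaction Network Database), our program achieved a precision of 82%.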

Motivation: To improve the efficiency of using the life science literature stored in digital databases, we designed software that determines the interactions between proteins described in the abstracts of life science papers. The program is intended to help researchers in genetics and other life sciences. It uses information extraction techniques to determine protein interaction information automatically, and it presents the resulting protein interaction network to users graphically.
Currently, most protein interaction information is described in life science journals or at genomics conferences. These data are stored in a format that is convenient for human reading but hard for software to process and difficult to query precisely. We therefore use natural language processing techniques to implement software that determines protein interactions. This information extraction system helps analyze the life science literature and obtains protein interaction information automatically.
Our program achieves a precision of about 82%. We tested it on abstracts of yeast papers taken from the Biomolecular Interaction Network Database (BIND).
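
The kind of pipeline the abstract describes, and the precision figure it reports, can be illustrated with a small sketch. This is not the thesis's implementation: the protein dictionary, the verb list, the nearest-name matching rule, and the function names are all assumptions made for illustration, and precision is computed with the standard definition (correct extractions divided by all extractions), which may differ from the definition used in the thesis.

```python
import re

# Hypothetical protein dictionary and interaction verbs (illustration only;
# the thesis's actual dictionaries and rules are not given in this record).
PROTEIN_NAMES = {"Sos1", "GRB2", "Ras"}
INTERACTION_VERBS = {"binds", "activates", "phosphorylates", "inhibits"}


def extract_interactions(sentence):
    """Return (protein, verb, protein) triples found in one sentence."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    triples = []
    for i, tok in enumerate(tokens):
        if tok not in INTERACTION_VERBS:
            continue
        # Nearest known protein name before the verb, and nearest after it.
        left = next((t for t in reversed(tokens[:i]) if t in PROTEIN_NAMES), None)
        right = next((t for t in tokens[i + 1:] if t in PROTEIN_NAMES), None)
        if left and right and left != right:
            triples.append((left, tok, right))
    return triples


def precision(extracted, gold):
    """Standard precision: correct extractions / all extractions."""
    extracted = set(extracted)
    if not extracted:
        return 0.0
    return len(extracted & set(gold)) / len(extracted)
```

For example, `extract_interactions("Sos1 binds to GRB2.")` yields one triple linking `Sos1` and `GRB2`. A rule this simple will mislabel sentences with intervening protein names or complex clauses, which is one reason a system of this type reports precision below 100%.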

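Chinese Abstract
English Abstract
Acknowledgments
List of Figures and Tables
Chapter 1 Introduction
Chapter 2 Discussion of Existing Methods
2.1 Why a literature information extraction system is needed to identify protein-protein interactions
2.2 Introduction to information extraction and natural language processing
2.2.1 Current state and development of information extraction and data mining techniques applied to searching the biological literature for protein interactions
2.2.2 Our implementation approach and target applications
Chapter 3 Workflow of Our Research Method
3.1 Program design workflow and architecture
3.1.1 Step 1: Recognizing protein names
3.1.2 Step 2: Handling compound words and complex sentences
3.1.3 Step 3: Identifying protein-protein interactions
3.2 About part-of-speech
3.2.1 What is part-of-speech
3.2.2 The part-of-speech file format
Chapter 4 Protein-Protein Information Extraction System (Program Demonstration and Operation)
4.1 Implementation principles and workflow
4.2 Program demonstration
Chapter 5 Performance Evaluation of the Protein-Protein Interaction Information Extraction System (Experimental Design and Results)
5.1 Measuring program performance and the definition of accuracy
5.2 Discussion of shortcomings in the experimental results
Chapter 6 Conclusion
References
Appendix 1 Test records of the protein interaction information extraction system
Appendix 2 Measurement results from testing the protein interaction information extraction system
About the Author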
