National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Detailed record
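Graduate student: 陳建利
Graduate student (English): Chien-Li Chen
Thesis title: 應用語意分析技術於癌症相關基因探勘與預測整合平台
Thesis title (English): Applying the Semantic Analysis in Cancer-Related Genes Mining and Prediction Integrated System
Advisor: 陳士農
Advisor (English): Shin-Nung Chen
Degree: Master's
Institution: Asia University (亞洲大學)
Department: Department of Bioinformatics, Master's Program
Discipline: Engineering
Field: Biomedical engineering
Thesis type: Academic thesis
Graduation academic year: 99 (ROC calendar, i.e. 2010-2011)
Language: Chinese
Pages: 70
Chinese keywords: 癌症相關基因; 異型接合性損失; 比較型基因組雜交法; 醫學文獻
English keywords: cancer-related gene; loss of heterozygosity (LOH); comparative genomic hybridization (CGH); biomedical literature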
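As living standards have risen and dietary habits have changed, the incidence of cancer has increased year by year. According to the statistics on the ten leading causes of death published by the Department of Health, Executive Yuan, cancer has ranked first among the leading causes of death in Taiwan for 29 consecutive years. Given this growing threat, the causes of cancer are an important research topic. As many researchers have taken up cancer research, however, a large body of medical literature and results has been published. This literature contains rich information, such as gene-gene interactions, gene function, biochemical pathways, and gene-disease relationships, all of which is well worth consulting. How medical researchers can find the information worth studying in this overabundant literature is therefore a difficult problem.
The main goal of this study is to provide an integrated platform that uses semantic analysis to collect medical literature and to analyze and predict cancer-related gene information from it. The platform uses the PubMed search engine provided by NCBI to search for medical literature and gene sequences related to cancer-related genes. It combines the cancer name entered by the user with two cancer research methods, loss of heterozygosity (LOH) and comparative genomic hybridization (CGH), to collect, classify, and mine the medical literature. The method proposed in this thesis can mine important cancer gene information and help in understanding the characteristics of the components within genes. We hope the system can help cancer researchers quickly obtain the cancer-related medical literature and gene information they need in this era of information overload, saving time and improving research efficiency.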
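The query-combination step described above (a user-supplied cancer name paired with LOH or CGH and sent to PubMed) can be illustrated with a minimal sketch. This is not the thesis's implementation: the record does not give the actual query syntax, so the `[Title/Abstract]` field restriction, the function names `build_term` and `build_search_url`, and the `retmax` default are all assumptions made for illustration. The only external detail relied on is NCBI's public E-utilities `esearch.fcgi` endpoint and its `db`, `term`, `retmax`, and `retmode` parameters.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (public, documented by NCBI).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The two experimental methods named in the abstract, keyed by abbreviation.
METHODS = {
    "LOH": "loss of heterozygosity",
    "CGH": "comparative genomic hybridization",
}


def build_term(cancer_name: str, method: str) -> str:
    """Combine a cancer name with one method into a PubMed query string.

    Restricting to [Title/Abstract] is an illustrative assumption,
    not something the thesis record specifies.
    """
    full_name = METHODS[method]
    method_clause = f'("{full_name}"[Title/Abstract] OR {method}[Title/Abstract])'
    return f'"{cancer_name}"[Title/Abstract] AND {method_clause}'


def build_search_url(cancer_name: str, method: str, retmax: int = 20) -> str:
    """Return an esearch URL for the combined query; no request is sent."""
    params = {
        "db": "pubmed",
        "term": build_term(cancer_name, method),
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # One URL per method for a hypothetical user input.
    for method in METHODS:
        print(build_search_url("breast cancer", method))
```

The sketch only builds URLs, so it can be inspected without network access. Fetching the results, classifying them, and mining gene names are separate steps that the abstract mentions but does not detail.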
Abstract
  With rising living standards and changing eating habits, the incidence of cancer has increased year by year. According to the statistics on the ten leading causes of death published by the Bureau of Health Promotion, Department of Health, R.O.C. (Taiwan), cancer has been the leading cause of death for twenty-nine consecutive years. In the face of this growing threat, it is important to study the causes of cancer. With advances in the Human Genome Project, researchers are increasingly engaged in bioinformatics-related research, including genome sequence analysis, drug design and discovery, and curative methods. The published literature contains a wealth of information on topics such as genes and gene expression, gene function, biological pathways, and gene-disease relationships. However, biomedical researchers who try to search for and retrieve information worth studying in this literature face a problem of information overload.
  The purpose of this study is to develop a biomedical literature mining platform to predict cancer-related genes. The platform applies semantic analysis technology to increase the prediction accuracy for cancers, genes, and chromosome regions. Several value-added databases are constructed for this purpose. They contain information on genes in the unstable regions of cancer cells, based on data accumulated from LOH and CGH experiments. The proposed platform can extract important information to accelerate research and save biomedical researchers considerable time. The system can also be applied to other diseases.
Table of Contents
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
1.4 Research Methods
1.5 Thesis Organization
Chapter 2 Related Work
2.1 Biological Background
2.1.1 The Central Dogma
2.1.2 Comparative Genomic Hybridization (CGH)
2.1.3 Loss of Heterozygosity (LOH)
2.2 Text Mining
2.2.1 Information Retrieval
2.2.2 Information Extraction
2.3 Data Classification
2.3.1 Bayesian Classifier
2.3.2 Decision Tree
2.3.3 K-Nearest-Neighbor (KNN) Classifier
2.3.4 Semantic Analysis
2.4 Biological Databases
2.4.1 PubMed
2.4.2 MeSH
2.4.3 Entrez Gene
2.4.4 RefSeq
2.4.5 Ensembl
2.4.6 GeneCards
Chapter 3 Research Methods
3.1 Research Process
3.2 System Architecture
3.3 Literature and Data Source Collection
3.3.1 Cancer-Related Information Collection
3.3.2 Gene-Related Information Collection
3.3.3 Gene Sequence Collection
3.3.4 Related Medical Literature Collection
3.4 Medical Literature Mining
3.4.1 Natural Language Processing
3.4.2 Bayesian Classification
3.4.3 Medical Literature Mining
3.5 Intelligent Agent System
Chapter 4 Implementation Results and Analysis
4.1 Datasets and Document Preprocessing
4.1.1 Data Source Collection
4.1.2 Query Syntax
4.1.3 Document Format
4.2 System Implementation
4.2.1 Medical Literature Retrieval
4.2.2 Natural Language Processing
4.2.3 Two-Dimensional Association Analysis
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References
Appendix: List of Part-of-Speech Tags
Acknowledgments
Curriculum Vitae