National Digital Library of Theses and Dissertations in Taiwan

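Graduate student: Han-Hsi Chen (陳函希)
Thesis title: Building a Team-Oriented and Personalized Value-Added Sequence Database
Thesis title (Chinese): 建置團隊導向及個人化之序列加值型資料庫
Advisor: Hei-Chia Wang (王惠嘉)
Degree: Master's
Institution: National Cheng Kung University
Department: Institute of Information Management
Discipline: Computer Science
Subdiscipline: General Computer Science
Thesis type: Academic thesis
Year of publication: 2005
Academic year of graduation: 93 (ROC calendar)
Language: Chinese
Number of pages: 69
Chinese keywords: 序列註解; 蛋白質家族; 資訊過濾; 知識分享; 推薦系統; 文字探勘
English keywords: text mining; recommender system; knowledge sharing; information filtering; sequence annotation; protein family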
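  With the growth of the Internet and advances in biotechnology, the large volume of biological data that biologists around the world produce every day can now be published and shared online. Recognizing the important role that biological data play in human health, some government agencies have collected biological information on a large scale, built electronic databases, and offered users free search services. One example is NCBI (National Center for Biotechnology Information), established under the U.S. National Library of Medicine, which is currently the largest public biological database in the world. The rapid growth of these data gives researchers a rich source of information, but it also brings the problem of information overload.
  Since the Human Genome Project (HGP) began in 1990, the volume of sequence data has grown exponentially. Manual annotation of gene sequences now faces several problems. First, the volume of sequence data is large. Second, the analysis tools are complex: ordinary biologists must invest considerable time and effort to learn how to use them. Third, the literature is excessive: annotating an unknown sequence requires collecting and reading related literature from many sources to infer the sequence's function, and the sheer number of papers makes it hard for researchers to find relevant articles quickly. Fourth, analysis results cannot be reused: members of the same research team may repeat the same analysis procedures, leaving results and retrieved literature scattered in different places and wasting analysis time and storage.
  This study is motivated by the lack of an efficient way to annotate gene sequences of protein families. It therefore proposes a method by which each research team can build a dedicated knowledge base for the topics it is interested in, in the form of an information management architecture with personalized literature recommendation. The architecture combines sequence analysis tools with a decision tree method to give biologists an efficient environment for annotating protein-family genes and to help solve the sequence annotation problems listed above. It focuses on personalized recommendation of sequence-related literature: improved collaborative filtering, content-based filtering, web usage mining, and information extraction are used to analyze the research preferences of users and team members, filter documents by quality, and recommend knowledge, easing the burden of knowledge overload.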
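  The combination of collaborative and content-based filtering named in the abstract can be illustrated with a minimal sketch. This is not the thesis's improved method, which the abstract does not specify. It is a plain user-based collaborative score blended with a term-overlap content score. All names (`cosine`, `content_score`, `collaborative_score`, `hybrid_score`), the weight `alpha`, and the assumption that ratings lie in [0, 1] are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def content_score(profile_terms, doc_terms):
    """Content-based score: term-frequency cosine between a user's
    interest terms and a document's terms (both plain word lists)."""
    return cosine(Counter(profile_terms), Counter(doc_terms))


def collaborative_score(user, doc, ratings):
    """User-based collaborative score: similarity-weighted mean of the
    ratings that other team members gave to `doc`.
    `ratings` maps user -> {doc_id: rating in [0, 1]} (assumed scale)."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or doc not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        num += sim * other_ratings[doc]
        den += sim
    return num / den if den else 0.0


def hybrid_score(user, doc, ratings, profiles, docs, alpha=0.5):
    """Weighted blend of the two scores; `alpha` is a hypothetical
    tuning weight, not a value taken from the thesis."""
    cf = collaborative_score(user, doc, ratings)
    cb = content_score(profiles[user], docs[doc])
    return alpha * cf + (1 - alpha) * cb


if __name__ == "__main__":
    ratings = {
        "alice": {"d1": 1.0, "d2": 0.5},
        "bob": {"d1": 1.0, "d2": 0.5, "d3": 1.0},
    }
    profiles = {"alice": ["lea", "protein"]}
    docs = {"d3": ["lea", "protein", "stress"]}
    print(hybrid_score("alice", "d3", ratings, profiles, docs))
```

  Blending the two scores lets a document that no teammate has rated still surface through its content, while teammates' ratings can promote documents whose wording differs from the user's profile.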
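[Table of Contents]
1. Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Research Process
1.4 Research Scope and Limitations
1.5 Thesis Organization
2. Literature Review
2.1 Bioinformatics
2.2 Applications of Data Mining Techniques to Biological Data
2.2.1 Decision Tree Methods
2.3 Information Extraction
2.3.1 Classification of Information Extraction Systems
2.3.2 Information Extraction and Sequence Annotation
2.4 Recommender Systems
2.4.1 Introduction to Recommender Systems
2.4.2 Rating Methods in Recommender Systems
2.4.3 Applications of Recommender Systems
2.5 Classification of Recommendation Techniques
2.5.1 Overview of Recommendation Techniques
2.5.2 User-based and Item-based Collaborative Filtering
3. Research Method
3.1 Research Framework
3.2 Information Collection Module
3.3 Information Extraction Module
3.3.1 Learning Protein Family Features
3.3.2 Learning Document Features
3.3.3 Learning User Features
3.4 Knowledge Recommendation Module
3.4.1 Design of the Recommendation Method
3.4.2 Filtering Documents by Sequence Similarity
3.4.3 Task-Oriented, Content-Based Knowledge Recommendation
3.4.4 Knowledge Recommendation Combining User Perspectives and Document Content
4. System Implementation and Validation
4.1 System Implementation Design
4.1.1 Information Collection Module
4.1.2 Information Extraction Module
4.1.3 Knowledge Recommendation Module
4.1.4 Database Design
4.2 Experimental Method
4.2.1 Procedure
4.2.2 Evaluation Items
4.2.3 Participants
4.2.4 Data Sources
4.3 Experimental Results and Analysis
5. Conclusions and Future Research Directions
5.1 Research Results
5.2 Future Research Directions
References
Appendix 1: Rule-based Classifier of the Late Embryogenesis Abundant Protein Family
Appendix 2: Stopwords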