National Digital Library of Theses and Dissertations in Taiwan

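Graduate student: 劉其瑋 (Chi-Wei Liu)
Thesis title: 質譜資料前處理中基底線修正與波峰校準之新方法
Thesis title (English): Novel Baseline Correction and Peak Alignment Methods for Mass Spectrometry Data Preprocessing
Advisor: 曾新穆 (Shin-Mu Tseng)
Degree: Master's
Institution: National Cheng Kung University
Department: Department of Computer Science and Information Engineering (Master's and Doctoral Program)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2007
Academic year of graduation: 95
Language: Chinese
Pages: 59
Keywords (translated from Chinese): peak alignment; baseline correction; mass spectrometry data preprocessing; data mining; protein mass spectrometry
Keywords (English): baseline correction; MS data preprocessing; data mining; mass spectrometry; peak alignment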
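Mass spectrometry is one of the key techniques in proteomics research, and within mass spectrometry data preprocessing, baseline correction and peak alignment are the steps that most strongly affect the quality of the final analysis. Existing methods tend to distort the signal during baseline correction and to be overly sensitive to noise during peak alignment. In this study we propose an improved method for each step. For baseline correction, we combine the strengths of the convex hull algorithm and LOESS regression to find a more accurate baseline, which improves the quality of the mass spectrometry signal. For peak alignment, existing methods cannot locate noise, so their alignment results are easily affected by it; we therefore propose a new peak alignment algorithm, TPC (Two-Phases Clustering), which screens potential noise out of a noisy peak set and thereby improves the accuracy of alignment across mass spectrometry peak data. We evaluated performance on both real and synthetic data. On real data, our method outperformed previous methods; on synthetic data, it identified the planted noise more precisely, with high recall, precision, and F-measure. The experimental results indicate that the proposed methods are more accurate than existing analysis methods.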
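The recall, precision, and F-measure mentioned above can be illustrated with a minimal sketch. It assumes that the noise peaks planted in a synthetic dataset and the peaks flagged as noise by an alignment method are each available as collections of peak identifiers. The function name `noise_detection_scores` and this data layout are hypothetical and are not taken from the thesis.

```python
def noise_detection_scores(planted, flagged):
    """Return (precision, recall, f_measure) for flagged noise peaks.

    planted: identifiers of noise peaks deliberately added to the data
             (assumed layout, not the thesis's format).
    flagged: identifiers of peaks the method reported as potential noise.
    """
    planted, flagged = set(planted), set(flagged)
    true_positives = len(planted & flagged)

    # Precision: fraction of flagged peaks that really were planted noise.
    precision = true_positives / len(flagged) if flagged else 0.0
    # Recall: fraction of planted noise peaks that were flagged.
    recall = true_positives / len(planted) if planted else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

For example, with four planted noise peaks and five flagged peaks of which three are correct, precision is 3/5 and recall is 3/4.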
Mass spectrometry (MS) data analysis has become an important protein identification technique in most proteomic studies. Baseline correction and peak alignment are key steps in the MS data preprocessing stage that precedes further analysis. However, existing baseline correction methods may distort the original peak signals, and existing peak alignment methods may be sensitive to noise peaks across MS samples. In this study, we propose a novel algorithm for each of these two steps. For baseline correction, we combine the convex hull algorithm with LOESS regression to find a better baseline for MS data; the method corrects each MS peak profile, and the result is closer to the original profile than the results of existing methods. For peak alignment, no previous study has tried to identify the peaks that are inconsistent across MS samples, so we propose a new algorithm, TPC (Two-Phases Clustering), that aligns multiple MS samples while indexing potential noise peaks. In our experiments, we used real MS datasets and generated synthetic datasets to evaluate the accuracy of the peak alignment method. The results show that our method performs better than previous methods.
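The abstract does not spell out how the convex hull and LOESS are combined, so the following is only a rough sketch of the convex hull half of the idea. It assumes a spectrum given as strictly increasing m/z values with matching intensities, takes the lower convex hull of the points as the baseline, and uses plain linear interpolation between hull vertices where the thesis uses LOESS regression. The function names `lower_hull` and `hull_baseline` are hypothetical, and this should not be read as the author's algorithm.

```python
def _cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(points):
    """Lower convex hull of (x, y) points sorted by strictly increasing x."""
    hull = []
    for p in points:
        # Drop the last vertex while it lies on or above the chord
        # from the vertex before it to the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def hull_baseline(mz, intensity):
    """Baseline estimate: lower hull, linearly interpolated at every m/z.

    Assumes at least two points and strictly increasing mz values.
    Linear interpolation stands in for the LOESS step, which is omitted.
    """
    hull = lower_hull(list(zip(mz, intensity)))
    baseline = []
    seg = 0  # index of the hull segment containing the current m/z
    for x in mz:
        while seg < len(hull) - 2 and x > hull[seg + 1][0]:
            seg += 1
        (x0, y0), (x1, y1) = hull[seg], hull[seg + 1]
        baseline.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    return baseline
```

Subtracting the returned baseline from the intensities gives a corrected spectrum whose values are never negative, because the lower hull lies on or below every data point.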
Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Background
1.2 Research Motivation
1.3 Problem Definition
1.4 Research Method
1.5 Contributions
1.6 Thesis Organization
Chapter 2 Literature Review
2.1 Related Research in Bioinformatics
2.2 Baseline Correction
2.3 Intensity Normalization
2.4 Peak Detection
2.5 Peak Alignment
Chapter 3 Research Method
3.1 Preliminaries
3.2 Baseline Correction
3.3 Peak Alignment
3.3.1 Intensity Clustering Phase
3.3.2 Build Potential Noise List
3.3.3 M/Z Clustering Phase
Chapter 4 Experimental Analysis
4.1 Experimental Data and Environment
4.2 Real Data Experimental Results
4.3 Synthetic Data Generator
4.3.1 Mass-to-Charge Ratio Simulation and Biological Variability
4.3.2 Intensity Simulation
4.3.3 Noise Point Generation
4.4 Evaluation Method for Synthetic Data
4.4.1 Synthetic Data Experimental Results
4.5 Experimental Summary
Chapter 5 Conclusions and Future Research Directions
5.1 Conclusions
5.2 Future Work
Chapter 6 References