National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

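Graduate student: 張彙音 (Hui-Yin Chang)
Thesis title: Computational and Analytical Methods for Mass Spectrometry-Based Omics Data and Applications
Thesis title (Chinese): 應用於蛋白體及代謝體的質譜資料之計算與分析方法
Advisor: 宋定懿 (Ting-Yi Sung)
Degree: Doctoral
Institution: National Yang-Ming University (國立陽明大學)
Department: Institute of Biomedical Informatics
Discipline: Life Sciences
Field: Biochemistry
Thesis type: Academic thesis
Year of publication: 2016
Academic year of graduation: 104 (ROC calendar)
Language: English
Number of pages: 120
Keywords (translated from Chinese): metabolomics; proteomics; liquid chromatography-mass spectrometry; liquid chromatography-tandem mass spectrometry; metabolite quantification; protein sequencing; peak detection; peak alignment; missing proteins; data-independent acquisition; bioinformatics
Keywords (English): metabolomics; proteomics; liquid chromatography coupled with mass spectrometry (LC-MS); liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS); metabolite quantification; protein identification; peak detection; peak alignment; missing protein; data-independent acquisition; bioinformatics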
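With the rapid development of mass spectrometry, proteomics and metabolomics have flourished over the past two decades. Owing to its high sensitivity and high resolution, liquid chromatography coupled with mass spectrometry (LC-MS) has become a common analytical tool in both fields.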
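Mass spectrometry analysis typically produces large volumes of data, which usually record the mass-to-charge ratio (m/z), retention time, and ion intensity of each analyte. By signal acquisition mode, mass spectrometry data can be divided into data-dependent acquisition (DDA) and data-independent acquisition (DIA). In DDA, the instrument first performs a survey scan, selects the peptide ions with the highest signals in that scan, and subjects each of them to tandem mass spectrometry (MS/MS), fragmenting the selected peptide ions to obtain their fragment ions. Because DDA can obtain a large number of ion signals from complex samples, it has been widely applied in proteomic and metabolomic analyses. However, it is limited by the scan speed of the mass spectrometer and cannot detect ions comprehensively, so many low-signal but important ions are never selected for MS/MS. To address this problem, DIA instead scans precursor ions in consecutive segments of fixed mass range. Unlike DDA, each scan sends multiple precursor ions into the collision cell together for fragmentation. Because this method fragments peptide ions comprehensively and records all resulting fragment ions, it helps increase the number of detected analytes and of chromatographic peak profiles, and it is increasingly applied to the identification and quantitation of proteins and small molecules.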
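A typical procedure for quantitative analysis of mass spectrometry data consists of peak detection, peak filtering, and peak alignment, of which peak detection is the most important step. The key to peak detection is selecting meaningful analyte signals from complex mass spectrometry data and constructing extracted ion chromatograms. For metabolomics quantitation, several software packages, such as XCMS, MetAlign, and MZmine 2, are available to biologists for analyzing metabolite data. However, most of them require users to enter many algorithm-oriented parameters, which makes the software harder to operate. In this study we therefore propose an intelligent metabolomics quantitation tool that requires only a few instrument-related parameters and quantifies metabolites quickly and accurately.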
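In recent years, DIA has received growing attention in tandem mass spectrometry experiments because it analyzes peptide ions comprehensively, which benefits both qualitative and quantitative protein analysis. However, DIA selects multiple precursor ions for fragmentation at once, so many fragment ions appear together in the same spectrum, and mapping these fragment ions back to their source peptide ions is a major challenge in data analysis. To this end, we use an algorithm similar to the one developed for metabolomics quantitation to resolve these fragment ions and construct their peaks, and we propose a computational method that groups each precursor ion with its related fragment ions to produce spectra for peptide sequence identification.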
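Besides the above studies on the computation and analysis of mass spectrometry signals, we have also taken part in the Taiwan team's work for the international Chromosome-centric Human Proteome Project (C-HPP). The main aim of this project is to explore and identify all human protein sequences translated from genes. According to neXtProt, 20,055 human protein sequences are known, of which 16,491 have been confirmed in protein experiments, 616 are judged dubious or uncertain, and the remaining 2,948 have evidence only from homology analysis or from DNA- or RNA-level experiments, with no protein-level experimental evidence; these are called "missing proteins". The most important current goal of the Human Proteome Project is to confirm these missing proteins in protein experiments. Mass spectrometry is currently the principal protein experimental method, and we explore the challenges of finding missing proteins with it. That is, from an informatics perspective, we ask whether the presence of all proteins could be inferred even under ideal conditions in which every peptide is detectable in protein experiments; this is a recognized fundamental question. To answer it, we carried out a series of analyses, including in silico digestion of proteins to obtain all peptide sequences. A small number of proteins contain no unique peptide, and we also found that some proteins have 100% sequence similarity to each other. Experimental evidence for such proteins must be treated with caution and may require different mass spectrometry approaches. We also reveal the difficulty of using mass spectrometry data to identify olfactory receptors and highly similar protein sequences. Based on our overall investigation, protein variant sequences should be added to the protein database when performing protein identification.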
Dramatic technological advances in the biological sciences over the past decade have inspired rapid development in proteomics and metabolomics research. Liquid chromatography (LC) coupled with mass spectrometry (MS) has increasingly become the method of choice for metabolite and protein analyses, in which analytes are first separated by LC and then analyzed by a mass spectrometer.
In tandem mass spectrometry, an MS scan (also called a survey scan) is followed by several fragment (product) ion scans, which are generated in one of two data acquisition modes: data-dependent acquisition (DDA) or data-independent acquisition (DIA). DDA is the conventional technique, in which the precursor ions with the top n (user-adjustable) abundances in a survey scan are sequentially fragmented and their fragment ions are recorded in consecutive fragment ion scans. However, the DDA approach is limited in identifying low-abundance analytes, since they may not be selected for fragmentation because of the limited scan rate of the mass spectrometer. The DIA approach has therefore been proposed and has emerged as an alternative. Instead of selecting a limited number of precursor ions in a survey scan, DIA divides the m/z range of the survey scan into several contiguous segment windows and fragments all the precursor ions in each window together, recording them in one fragment ion scan per window. This method is very likely to increase the number of precursors available for quantitation and identification, and thus has been increasingly applied to proteomics and metabolomics research.
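The windowing idea can be sketched in a few lines of Python. The numbers below (32 windows of 25 m/z covering 400 to 1200 m/z, with the survey scan as experiment 1 of each cycle) are taken from the Figure 7 caption of this record; the function names are hypothetical and the sketch is only an illustration, not the acquisition software's logic.

```python
def swath_windows(start_mz=400.0, end_mz=1200.0, width=25.0):
    """Return contiguous (low, high) isolation windows covering [start_mz, end_mz)."""
    n = int(round((end_mz - start_mz) / width))
    return [(start_mz + i * width, start_mz + (i + 1) * width) for i in range(n)]


def experiment_number(precursor_mz, start_mz=400.0, width=25.0, n_windows=32):
    """Map a precursor m/z to the scan index within one acquisition cycle.

    Assumption: experiment 1 is the survey scan, so the fragment ion scan
    of the first window is experiment 2, the second window experiment 3, etc.
    Returns None when the precursor lies outside the covered range.
    """
    idx = int((precursor_mz - start_mz) // width)
    if not 0 <= idx < n_windows:
        return None
    return idx + 2


if __name__ == "__main__":
    print(len(swath_windows()))      # number of fragment ion scans per cycle
    print(experiment_number(430.0))  # a precursor in the 425-450 m/z window
```

With these settings a precursor at 430 m/z falls in the second window and is therefore recorded in experiment 3, which agrees with the example given in the Figure 7 caption.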
In the past few years, we have focused on the following topics: implementing automated metabolomics quantitation for LC-MS data, reconstructing large-scale in silico MS/MS spectra from DIA proteomics data, and studying the challenges of identifying missing proteins in the Chromosome-centric Human Proteome Project (C-HPP). First, because current metabolomics studies frequently involve analyzing a large number of biosamples, many LC-MS experiments conduct only MS analysis to efficiently generate MS data for comparing metabolites among the biosamples. However, manually processing high-throughput LC-MS data is time-consuming. We therefore started to work on automated metabolomics quantitation from LC-MS data, which usually comprises the following steps: peak detection, peak filtering, and peak alignment. Peak detection is the first and a critical step; it aims at selecting meaningful analyte signals for quantitation while avoiding false-positive signals. Several tools are currently available for metabolite quantitation; among them, XCMS, MetAlign, and MZmine 2 are very popular. However, most of these tools require users to select modules for computation and to input many algorithm-related parameters in different modules. For example, users have to input a fixed value for peak slope or peak width. Such a requirement limits users to detecting only a certain type of peak and leaves other possibly useful peaks undetected. Moreover, it is hard for users to optimize those algorithm-related parameters. To avoid such limitations, we proposed an intelligent method that automatically detects peaks by calculating the full width at half maximum (FWHM) and determining the peak boundaries using the FWHM.
The method is embedded in our newly developed quantitation tool, called iMet-Q (intelligent Metabolomic Quantitation), which performs a complete quantitation procedure from peak detection to peak alignment and provides user-friendly interfaces.
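As a rough illustration of the FWHM idea, the following Python sketch locates the apex of a single-peak trace, finds the two half-maximum crossings by linear interpolation, and derives peak boundaries from the resulting width. This is not iMet-Q's actual algorithm: the function names, the single-peak input, and the boundary rule (apex plus or minus `k` times the FWHM) are assumptions made only for the example.

```python
def fwhm(times, intensities):
    """Return (apex_time, left_half_time, right_half_time, width) for one peak."""
    apex = max(range(len(intensities)), key=intensities.__getitem__)
    half = intensities[apex] / 2.0

    def crossing(i_inside, i_outside):
        # Linear interpolation between a point at or above half maximum
        # and its neighbour below half maximum.
        y0, y1 = intensities[i_inside], intensities[i_outside]
        t0, t1 = times[i_inside], times[i_outside]
        return t0 + (half - y0) * (t1 - t0) / (y1 - y0)

    # Walk outwards from the apex while the signal stays at or above half maximum.
    left = apex
    while left > 0 and intensities[left - 1] >= half:
        left -= 1
    right = apex
    while right < len(intensities) - 1 and intensities[right + 1] >= half:
        right += 1

    left_t = crossing(left, left - 1) if left > 0 else times[0]
    right_t = crossing(right, right + 1) if right < len(times) - 1 else times[-1]
    return times[apex], left_t, right_t, right_t - left_t


def peak_bounds(apex_time, width, k=1.0):
    """Hypothetical boundary rule: extend k * FWHM on each side of the apex."""
    return apex_time - k * width, apex_time + k * width


if __name__ == "__main__":
    apex_t, left_t, right_t, w = fwhm([0, 1, 2, 3, 4], [0, 4, 10, 4, 0])
    print(apex_t, left_t, right_t, w)
    print(peak_bounds(apex_t, w))
```

Because the width is measured on each peak, no fixed peak width has to be supplied by the user, which is the limitation of fixed-parameter tools described above.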
As mentioned previously, the DIA approach divides a survey scan into several contiguous m/z segments and sequentially fragments all the precursor ions in each segment. Thus, unlike DDA data, which provide a direct link between a precursor ion and its fragment ions, DIA data are more complicated to analyze because the fragment ions generated from different precursor ions in an m/z segment are all mixed in one fragment ion spectrum. To utilize DIA data for protein identification and quantitation, establishing the connection between a precursor ion and its fragment ions has become an important task. For this reason, in our second task, we developed a user-friendly tool, called ProDIA, which first constructs the extracted ion chromatograms (XICs) of precursors and fragments from DIA mass spectra. Then, for each pair of precursor and fragment XICs, we check whether the retention time of the apex of the fragment XIC is within the FWHM of the precursor XIC and whether the peak shape similarity between the two XICs is above a user-defined threshold. Pairs passing both checks connect fragment ions to their precursor ions and are used to construct in silico MS/MS spectra. With these in silico MS/MS spectra, researchers can utilize DIA data for protein identification.
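The two checks can be sketched as follows in Python. This is a minimal illustration rather than ProDIA's implementation: the dictionary layout, the use of a normalized dot product (cosine) as the similarity measure, the interpretation of "within the FWHM" as lying between the two half-maximum retention times, and the assumption that both XICs are sampled on the same cycles are all choices made for this example. The default threshold of 0.9 follows the example value in the Figure 8 caption.

```python
import math


def normalized_dot(a, b):
    """Cosine similarity between two equally sampled intensity profiles."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0


def assign_fragments(precursor, fragments, threshold=0.9):
    """Return the fragments that co-elute with the precursor.

    precursor: dict with 'fwhm_left', 'fwhm_right' (retention times of the
               half-maximum points) and 'profile' (XIC intensities).
    fragments: list of dicts with 'mz', 'apex_rt' and 'profile'.
    """
    assigned = []
    for frag in fragments:
        # Check 1: fragment apex must fall inside the precursor's FWHM span.
        in_fwhm = precursor["fwhm_left"] <= frag["apex_rt"] <= precursor["fwhm_right"]
        # Check 2: peak shapes must be similar enough.
        similar = normalized_dot(precursor["profile"], frag["profile"]) >= threshold
        if in_fwhm and similar:
            assigned.append(frag)
    return assigned


if __name__ == "__main__":
    precursor = {"fwhm_left": 9.0, "fwhm_right": 11.0, "profile": [1, 5, 10, 5, 1]}
    fragments = [
        {"mz": 300.1, "apex_rt": 10.1, "profile": [2, 10, 20, 10, 2]},  # co-eluting
        {"mz": 410.2, "apex_rt": 10.2, "profile": [10, 5, 1, 0, 0]},    # wrong shape
        {"mz": 520.3, "apex_rt": 12.0, "profile": [1, 5, 10, 5, 1]},    # apex too late
    ]
    print([f["mz"] for f in assign_fragments(precursor, fragments)])
```

The m/z values and intensities of the assigned fragments could then be written out as one reconstructed spectrum per precursor.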
In addition to the above work, we have also been working on C-HPP, organized by the Human Proteome Organization, which aims at discovering and characterizing all human proteins encoded by genes in order to fill the gap between genomics and proteomics. The neXtProt 2014-09 release reported 20055 human proteins, including 16491 proteins with experimental evidence at the protein level and the remaining 3564 proteins without protein-level experimental evidence. Excluding 616 proteins at the uncertain or dubious level, 2948 proteins were regarded as missing proteins. The current main task of C-HPP is to experimentally observe these missing proteins. To explore the challenges of this task, we examined issues affecting the validation of missing proteins using an “ideal” shotgun analysis of the human proteome. By performing in silico digestions of the human proteins, we conducted systematic analyses to investigate the difficulties in identifying missing proteins without any unique peptide and those with sequence variants. We also explored the difficulties in identifying olfactory receptors and highly similar proteins. Among all missing proteins with evidence at the transcript level, G protein-coupled receptors and olfactory receptors, based on InterPro classification, were the largest protein families and exhibited more frequent variants. To identify missing proteins, our analyses suggest including sequence variants in the protein FASTA file used for database searching, using different enzymes for protein digestion, and considering different experimental techniques, e.g., top-down proteomics.
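The unique versus shared peptide analysis can be illustrated with a small Python sketch. It assumes the commonly used trypsin rule (cleave after K or R unless the next residue is P), full digestion with no missed cleavages, and a hypothetical minimum peptide length; the thesis's actual digestion settings are not given in this record, and the toy sequences below are invented for the example. The zero-width `re.split` pattern requires Python 3.7 or later.

```python
import re
from collections import defaultdict


def tryptic_digest(seq, min_len=7):
    """Fully digest a protein sequence in silico (assumed rule: after K/R, not before P)."""
    peptides = re.split(r"(?<=[KR])(?!P)", seq)
    return [p for p in peptides if len(p) >= min_len]


def classify_peptides(proteins, min_len=7):
    """Split all peptides into unique (one source protein) and shared (several)."""
    owners = defaultdict(set)
    for name, seq in proteins.items():
        for pep in tryptic_digest(seq, min_len):
            owners[pep].add(name)
    unique = {p: next(iter(o)) for p, o in owners.items() if len(o) == 1}
    shared = {p: o for p, o in owners.items() if len(o) > 1}
    return unique, shared


def proteins_without_unique(proteins, unique):
    """Proteins that cannot be distinguished by any unique peptide."""
    return set(proteins) - set(unique.values())


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    proteins = {
        "P1": "AAAAAAAKCCCCCCCR",
        "P2": "AAAAAAAKDDDDDDDR",
        "P3": "AAAAAAAK",
    }
    unique, shared = classify_peptides(proteins)
    print(sorted(unique))
    print(sorted(shared))
    print(proteins_without_unique(proteins, unique))
```

In this toy set P3 yields only a peptide that P1 and P2 also produce, so even an ideal experiment detecting every peptide could not confirm P3 on its own; two proteins with identical sequences would fall into the same category.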
Table of Contents

Acknowledgement
Chinese Abstract
English Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 General overview
1.2 Aims of the study
Chapter 2 iMet-Q: an intelligent tool for metabolomics quantitation from LC-MS data
2.1 Motivation
2.2 Materials
2.2.1 Experiment on a standard metabolite mixture (provided by Dr. Chiun-Gung Juo)
2.2.2 Public Arabidopsis metabolome data set
2.2.3 Data processing and software parameter settings
2.3 Robust Algorithms for Metabolomics Quantitation Using Dynamic Peak-Width Determination
2.3.1 Peak detection
2.3.2 Peak alignment
2.3.3 Generating output of quantitation results
2.4 Results and Discussions
2.4.1 Performance evaluation on a standard metabolite mixture
2.4.2 Performance evaluation on a public Arabidopsis metabolome data set
2.5 Friendly User Interfaces of iMet-Q
Chapter 3 ProDIA: an automated workflow for processing DIA data to construct in silico MS/MS spectra for protein identification
3.1 Motivation
3.2 Experimental Description
3.3 Efficient Algorithm for Constructing in silico MS/MS spectra from SWATH MS data
3.4 Results and Discussion
3.4.1 Evaluation of four commonly used methods for assigning fragment peaks to precursor peaks
3.4.2 Performance evaluation on a standard protein mixture
3.4.3 Performance evaluation on a large-scale E. coli sample
3.5 ProDIA: a User-friendly Tool to Automatically Construct in silico MS/MS Spectra from SWATH MS Data
Chapter 4 Rigorous in silico analyses to explore the challenges in identifying missing proteins
4.1 Motivation
4.2 Methods
4.3 Results and Discussion
4.3.1 Analysis of protein identification based on in silico fully-digested unique and shared peptides of human proteins
4.3.2 Sequence variation very likely affecting identification of missing proteins in specific families and domains
4.3.3 Difficulties in identifying the olfactory receptor family
4.3.4 Difficulties in identifying missing proteins with high similarity
Chapter 5 Conclusions
References
Appendices
Tables
Figures


List of Figures

Figure 1. Schematic depiction of iMet-Q workflow for peak detection and peak alignment.
Figure 2. A cartoon illustrating the construction of extracted ion chromatograms. The blue straight lines represent the clustered signals, w and t are the FWHM and retention time of , respectively. Signals A and B are determined as the boundaries of the XIC, and the area in light blue is the abundance.
Figure 3. The box plot of abundance correlation of 167 elucidated metabolites across replicates in the public Arabidopsis data detected by the four quantitation tools.
Figure 4. Hierarchical clustering using the quantitation results of iMet-Q, XCMS, MetAlign, and MZmine 2. Each entry in the tree leaves of a dendrogram represents a replicate. For each tool, we first combined its quantitation results of positive- and negative-ion modes. Colors were assigned to each replicate in the combined quantitation results according to the plant classes from which the replicates originated, as follows: orange for cotyledon, red for stem, green for leaf, blue for flower, light blue for shoot apex, yellow for root, pink for seed, and gray for silique. Next, the figure was produced using the MATLAB dendrogram function with PMMCC as the abundance correlation measure between any two replicates in the combined quantitation results.
Figure 5. The box plot of abundance correlations between 19 verified metabolites and their in-source fragments detected in the public Arabidopsis data.
Figure 6. The main graphical user interface of iMet-Q. The Arabidopsis data from positive-ion mode is used as an example. After processing the data, iMet-Q lists the detected peaks in the summary table, where the peaks are sorted according to their retention time. When users select peaks of interest in the summary table, the abundances of the selected peaks in different samples are plotted in the sample abundance plot, and the detailed information of the selected peak in the technical replicates of a sample is listed in the panel below the summary table. The left panel is the quantitation parameter explorer that lists the parameters of a quantitation. Users can use the provided filter function to narrow down the number of peaks in the summary table.
Figure 7. Data acquired from one technical replicate of SWATH MS. During the entire elution process, each survey scan (MS scan) is followed by consecutive fragment ion scans (MS/MS scans) from 32 windows of 25 m/z in width with precursors from 400 to 1200 m/z, and these 33 scans constitute a cycle. For example, Precursor A in the swath window of 425-450 m/z is fragmented and its fragment ions are recorded in the fragment ion scan with the experiment number 3 in each cycle. Since a peptide elutes over a period of time, its precursor ion and fragment ions can be detected in adjacent cycles in most cases.
Figure 8. Schematic depiction of ProDIA workflow for reconstructing MS/MS spectra from SWATH MS data. When a SWATH MS data set is imported, ProDIA first removes electronic noise (Step 1) and performs deisotoping (Step 2) in each scan. The highest isotopic ion and its preceding isotopic ions in an isotope envelope are kept as possible monoisotopic ions and the remaining ions are removed. Then, these processed scans are grouped into different scan groups according to their experiment number. ProDIA constructs peaks and determines the charge states of the constructed peaks (i.e., extracted ion chromatograms of fragment ions or precursor ions) from each scan group (Step 3). That is, precursor peaks are constructed from the survey scan group, and fragment peaks are constructed from fragment ion scan groups. Finally, ProDIA assigns fragment peaks to precursor peaks if the retention time of the apex of a fragment peak is within the chromatographic peak width of a precursor peak and the peak shape similarity measured by dot product is above a user-defined threshold (e.g., 0.9), and then reconstructs the MS/MS spectra in an MGF file for protein identification (Step 4).
Figure 9. Performance evaluation of different strategies to assign fragment peaks to precursor peaks using 32 different parameters. Four different strategies have been used in the literature to check whether (1) the fragment peak apex is within w scans (±w scans) of the precursor’s apex (denoted as Scan_w, w: integer); (2) the apex of the fragment peak is within 1/k of FWHM of the precursor’s apex (denoted as FWHM/k, k: integer); (3) the peak shape similarity between the fragment peak and the precursor peak measured by Pearson correlation is above a threshold s (denoted as psPC_s, s: fraction); (4) the peak shape similarity between the fragment peak and the precursor peak measured by dot product is above a threshold s (denoted as psDP_s, s: fraction). A, Average numbers of fragment peaks in a reconstructed MS/MS spectrum in the 32 different assignments. B, The percentage of 1828 peptides in the assay list that are reported or identified by Mascot from reconstructed spectra of the 32 different assignments.
Figure 10. Comparison of peptide fragment coverage and peptide fragment intensity coverage of the identified spectra from SWATH MS and DDA data acquired from a mixture of 4 standard proteins. A total of 37 identified reconstructed spectra from SWATH MS and 98 identified spectra from DDA in the protein mixture experiments were compared. The y-axis represents the accumulated percentage of spectra with coverage ≥ k % (x-axis).
Figure 11. An example of highly correlated fragmentation patterns between identified ProDIA-reconstructed MS/MS spectra and DDA MS/MS spectra. The intensities of 26 commonly matched b- and y-ions in the ProDIA-reconstructed MS/MS spectrum and DDA-generated MS/MS spectrum of SISIVGSYVGNR revealed a high correlation of 0.96.
Figure 12. Identification results of ProDIA-reconstructed MS/MS spectra from the large-scale E. coli SWATH MS dataset. Identification results of the reconstructed spectra from SWATH MS data can be decomposed into the following 7 cases based on the 5763 peptides identified from DDA as the benchmark: (a) 148 peptides additionally identified from the SWATH MS data, (b) 1747 peptides commonly identified in both SWATH MS and DDA data, (c) 1311 peptides non-confidently identified, including 311 peptides that exhibit at least 4 consecutive b- or y-ions in ProDIA-reconstructed MS/MS spectra (c1) and 1000 peptides that do not (c2), (d) 770 peptides with ProDIA-reconstructed MS/MS spectra but unidentified, (e) 1674 peptides missed, without ProDIA-reconstructed MS/MS spectra, (f) 261 peptides missed, with precursor m/z outside the precursor range of SWATH MS, and (g) 220 peptides additionally reported from the SWATH MS data by Mascot and exhibiting at least 4 consecutive b- or y-ions.
Figure 13. Peptide fragment coverage and peptide fragment intensity coverage analysis for the large-scale E. coli sample. A, Comparison of cumulative peptide fragment coverage and cumulative peptide fragment intensity coverage of the 1747 commonly identified peptides in ProDIA-reconstructed and DDA spectra, respectively. Note that ProDIA-reconstructed spectra exhibited higher peptide fragment and peptide fragment intensity coverages than DDA spectra. For example, 25% of the ProDIA-reconstructed MS/MS spectra revealed a peptide fragment coverage of at least 70%, while none of the DDA spectra exhibited such peptide fragment coverage. B, The distribution of peptide fragment coverage and peptide fragment intensity coverage of the 1895 identified peptides, 1311 non-confidently identified peptides, and 770 unidentified peptides in the ProDIA-reconstructed spectra. The coverage for unidentified peptides is calculated from the reconstructed spectra with the same precursor m/z of each peptide.
Figure 14. User-friendly main interface of ProDIA. (a) Project Explorer. (b) Project Information. (c) Sample Abundance plot: plots peptide precursor abundances in different samples. (d) Sample Table: lists the detailed information of precursors, including the m/z, retention time, charge states and isotopic ratio of the precursors and the abundance of precursors in different samples. (e) Replicate Table: lists precursor information in each replicate, including m/z, retention time, charge states, and so on. (f) Fragment Table: lists all fragments assigned to a precursor. (g) MS/MS spectrum. (h) XIC plot. (i) Spectrum Processing Parameter Explorer: documents all parameters used in each process so that users can easily trace the experiment records. ProDIA provides an additional “Filter” function in the “Spectrum Processing Parameter Explorer” for users to set up different filtering criteria on the precursors listed in the sample table to filter out unwanted precursors.
Figure 15. The peptide length distributions of unique and shared peptides based on in silico trypsin digestion. The data were derived from in silico trypsin digestion of 20189 human proteins in UniProt, where the numbers of unique and shared peptides were 613253 and 607642, respectively. Red and blue areas represent the distributions of unique and shared peptides, respectively, with respect to the peptide lengths shown on the X-axis.
Figure 16. (A) The number of non-redundant unique peptides using trypsin, Lys-C, and trypsin+Lys-C digestion. A total of 18329 unique peptides were additionally obtained using trypsin+Lys-C digestion, whereas 31571 and 246906 unique peptides were additionally obtained using trypsin and Lys-C, respectively. (B) The number of proteins with at least one non-redundant unique peptide using trypsin, Lys-C, and trypsin+Lys-C digestion. In silico digestion analysis was conducted on 20053 proteins common to UniProt and neXtProt. Among the 20053 proteins, 19908 proteins had at least one unique peptide from any of the three digestions, and 145 proteins did not have any unique peptide.
Figure 17. The Annexin A8 sequence and its identified peptide sequences annotated with MS information. The gray bars denote the peptide sequences used to identify the protein reported by Burkard et al. [110]. The green and orange bars denote positions in the amino acid sequence covered by shared peptides annotated in PeptideAtlas and SRMAtlas, respectively.
Figure 18. The distribution of proteins at different evidence levels in all protein groups classified by the number of their unique tryptic peptides. The data were derived from in silico trypsin digestion of 20,053 common proteins from the human proteomes of UniProt and neXtProt. The yellow curve, corresponding to the Y-axis on the right-hand side of the panel, shows the number of proteins in each protein group. The purple curve, corresponding to the Y-axis on the left-hand side of the panel, shows the cumulative percentage of PE1 proteins.
Figure 19. Comparisons of the numbers of unique peptides generated by trypsin and chymotrypsin digestion. The in silico analysis was conducted on missing proteins with at most 20 unique tryptic peptides (i.e., 2070 missing proteins). The 2070 proteins were divided into three groups according to the number of unique tryptic peptides: proteins with 1-10 unique tryptic peptides (Group 1), with 11-20 unique tryptic peptides (Group 2), and without any unique tryptic peptide (Group 3). The distributions of proteins in the three groups are shown in A, B and C, respectively.
Figure 20. The LSV comparison between 17479 verified (PE1) proteins and 2211 missing proteins. Groups A, B, and C denote verified proteins (PE1) with >=51, 26-50, and 0-25 in silico unique tryptic peptides, respectively. Groups Am, Bm, and Cm denote missing proteins (PE2-4) with >=51, 26-50, and 0-25 in silico unique tryptic peptides, respectively. The numbers of proteins in groups A, Am, B, Bm, C, and Cm were 2994, 137, 5455, 469, 9030, and 1605, respectively.
Figure 21. The comparison of the LSV distributions between 16490 proteins experimentally validated at protein level (PE1) and 2646 proteins validated at transcript level (PE2).
Figure 22. The LSV distribution in ten families and domains classified by InterPro. The ten families or domains containing the largest numbers of proteins with PE2 evidence were included in this analysis; they are A: IPR017452 (GPCR, rhodopsin-like, 7TM), B: IPR000276 (G protein-coupled receptor, rhodopsin-like), C: IPR000725 (Olfactory receptor), D: IPR015880 (Zinc finger, C2H2-like), E: IPR013087 (Zinc finger C2H2-type/integrase DNA-binding domain), F: IPR001909 (Krueppel-associated box), G: IPR009057 (Homeodomain-like), H: IPR001356 (Homeobox domain), I: IPR020683 (Ankyrin repeat-containing domain), and J: IPR013783 (Immunoglobulin-like fold). Family/domain X1 and X2 denote proteins with PE1 and PE2 evidence, respectively. In addition, Groups PE1 and PE2 represent the total 16490 and 2646 proteins with evidence at PE1 and PE2 levels, respectively.


List of Tables

Table 1. The quantitation error (%) of seven standard metabolites calculated by iMet-Q, XCMS, MetAlign, and MZmine 2.
Table 2. The reproducibility (Rep.) and normalized abundance (Abund.) of two internal standards detected by four quantitation tools.
Table 3. The 16 peptides of the 4 standard proteins identified from SWATH MS data.
Table 4. The improvement of protein identification in the large-scale E. coli sample by integrating search results from SWATH MS with those of conventional DDA.
Table 5. The number of in silico fully digested peptides of the 20189 human proteins in UniProt using different enzymes.
Table 6. The number of missing proteins found in the identified proteins using multiple proteases.
Table 7. The information of the ten OR proteins annotated with PE1 evidence.

References

1. Meissner F, Mann M. Quantitative Shotgun Proteomics: Considerations for a High-Quality Workflow in Immunology. Nat Immunol. 15, 112-7 (2014).
2. Mallick P, Kuster B. Proteomics: A Pragmatic Perspective. Nat Biotechnol. 28, 695-709 (2010).
3. VerBerkmoes NC, Denef VJ, Hettich RL, Banfield JF. Systems Biology: Functional Analysis of Natural Microbial Consortia Using Community Proteomics. Nat Rev Microbiol. 7, 196-205 (2009).
4. Altelaar AF, Munoz J, Heck AJ. Next-Generation Proteomics: Towards an Integrative View of Proteome Dynamics. Nat Rev Genet. 14, 35-48 (2013).
5. Dettmer K, Aronov PA, Hammock BD. Mass Spectrometry-Based Metabolomics. Mass Spectrom Rev. 26, 51-78 (2007).
6. Wishart DS. Emerging Applications of Metabolomics in Drug Discovery and Precision Medicine. Nat Rev Drug Discov. (2016).
7. Johnson CH, Ivanisevic J, Siuzdak G. Metabolomics: Beyond Biomarkers and Towards Mechanisms. Nat Rev Mol Cell Biol. (2016).
8. Sun CS, Markey MK. Recent Advances in Computational Analysis of Mass Spectrometry for Proteomic Profiling. J Mass Spectrom. 46, 443-56 (2011).
9. Aebersold R, Mann M. Mass Spectrometry-Based Proteomics. Nature. 422, 198-207 (2003).
10. Ibanez C, Simo C, Garcia-Canas V, Cifuentes A, Castro-Puyana M. Metabolomics, Peptidomics and Proteomics Applications of Capillary Electrophoresis-Mass Spectrometry in Foodomics: A Review. Anal Chim Acta. 802, 1-13 (2013).
11. Distler U, Kuharev J, Navarro P, Tenzer S. Label-Free Quantification in Ion Mobility-Enhanced Data-Independent Acquisition Proteomics. Nat Protoc. 11, 795-812 (2016).
12. Azvolinsky A, DeFrancesco L, Waltz E, Webb S. 20 Years of Nature Biotechnology Research Tools. Nat Biotechnol. 34, 256-61 (2016).
13. Larance M, Lamond AI. Multidimensional Proteomics for Cell Biology. Nat Rev Mol Cell Biol. 16, 269-80 (2015).
14. Maze I, Shen L, Zhang B, Garcia BA, Shao N, Mitchell A, et al. Analytical Tools and Current Challenges in the Modern Era of Neuroepigenomics. Nat Neurosci. 17, 1476-90 (2014).
15. Moradian A, Kalli A, Sweredoski MJ, Hess S. The Top-Down, Middle-Down, and Bottom-up Mass Spectrometry Approaches for Characterization of Histone Variants and Their Post-Translational Modifications. Proteomics. 14, 489-97 (2014).
16. Zhang Z, Wu S, Stenoien DL, Pasa-Tolic L. High-Throughput Proteomics. Annu Rev Anal Chem (Palo Alto Calif). 7, 427-54 (2014).
17. Choudhary C, Mann M. Decoding Signalling Networks by Mass Spectrometry-Based Proteomics. Nat Rev Mol Cell Biol. 11, 427-39 (2010).
18. Yates JR, 3rd, Gilchrist A, Howell KE, Bergeron JJ. Proteomics of Organelles and Large Cellular Structures. Nat Rev Mol Cell Biol. 6, 702-14 (2005).
19. Stahl DC, Swiderek KM, Davis MT, Lee TD. Data-Controlled Automation of Liquid Chromatography/Tandem Mass Spectrometry Analysis of Peptide Mixtures. J Am Soc Mass Spectrom. 7, 532-40 (1996).
20. Domon B, Aebersold R. Mass Spectrometry and Protein Analysis. Science. 312, 212-7 (2006).
21. Egertson JD, Kuehn A, Merrihew GE, Bateman NW, MacLean BX, Ting YS, et al. Multiplexed MS/MS for Improved Data-Independent Acquisition. Nat Methods. 10, 744-6 (2013).
22. Michalski A, Cox J, Mann M. More Than 100,000 Detectable Peptide Species Elute in Single Shotgun Proteomics Runs but the Majority Is Inaccessible to Data-Dependent LC-MS/MS. J Proteome Res. 10, 1785-93 (2011).
23. Liu H, Sadygov RG, Yates JR, 3rd. A Model for Random Sampling and Estimation of Relative Protein Abundance in Shotgun Proteomics. Anal Chem. 76, 4193-201 (2004).
24. Bern M, Finney G, Hoopmann MR, Merrihew G, Toth MJ, MacCoss MJ. Deconvolution of Mixture Spectra from Ion-Trap Data-Independent-Acquisition Tandem Mass Spectrometry. Anal Chem. 82, 833-41 (2010).
25. Malmstrom J, Lee H, Aebersold R. Advances in Proteomic Workflows for Systems Biology. Curr Opin Biotechnol. 18, 378-84 (2007).
26. Wu L, Han DK. Overcoming the Dynamic Range Problem in Mass Spectrometry-Based Shotgun Proteomics. Expert Rev Proteomics. 3, 611-9 (2006).
27. Wenner BR, Lynn BC. Factors That Affect Ion Trap Data-Dependent MS/MS in Proteomics. J Am Soc Mass Spectrom. 15, 150-7 (2004).
28. Blackburn K, Mbeunkui F, Mitra SK, Mentzel T, Goshe MB. Improving Protein and Proteome Coverage through Data-Independent Multiplexed Peptide Fragmentation. J Proteome Res. 9, 3621-37 (2010).
29. Carr S, Aebersold R, Baldwin M, Burlingame A, Clauser K, Nesvizhskii A, et al. The Need for Guidelines in Publication of Peptide and Protein Identification Data: Working Group on Publication Guidelines for Peptide and Protein Identification Data. Mol Cell Proteomics. 3, 531-3 (2004).
30. Wilkins MR, Appel RD, Van Eyk JE, Chung MC, Gorg A, Hecker M, et al. Guidelines for the Next 10 Years of Proteomics. Proteomics. 6, 4-8 (2006).
31. Purvine S, Eppel JT, Yi EC, Goodlett DR. Shotgun Collision-Induced Dissociation of Peptides Using a Time of Flight Mass Analyzer. Proteomics. 3, 847-50 (2003).
32. Plumb RS, Johnson KA, Rainville P, Smith BW, Wilson ID, Castro-Perez JM, et al. UPLC/MS(E); a New Approach for Generating Molecular Fragment Information for Biomarker Structure Elucidation. Rapid Commun Mass Spectrom. 20, 1989-94 (2006).
33. Geiger T, Cox J, Mann M. Proteomics on an Orbitrap Benchtop Mass Spectrometer Using All-Ion Fragmentation. Mol Cell Proteomics. 9, 2252-61 (2010).
34. Panchaud A, Scherl A, Shaffer SA, von Haller PD, Kulasekara HD, Miller SI, et al. Precursor Acquisition Independent from Ion Count: How to Dive Deeper into the Proteomics Ocean. Anal Chem. 81, 6481-8 (2009).
35. Gillet LC, Navarro P, Tate S, Rost H, Selevsek N, Reiter L, et al. Targeted Data Extraction of the MS/MS Spectra Generated by Data-Independent Acquisition: A New Concept for Consistent and Accurate Proteome Analysis. Mol Cell Proteomics. 11, O111.016717 (2012).
36. Chambers MC, Maclean B, Burke R, Amodei D, Ruderman DL, Neumann S, et al. A Cross-Platform Toolkit for Mass Spectrometry and Proteomics. Nat Biotechnol. 30, 918-20 (2012).
37. Paik YK, Jeong SK, Omenn GS, Uhlen M, Hanash S, Cho SY, et al. The Chromosome-Centric Human Proteome Project for Cataloging Proteins Encoded in the Genome. Nat Biotechnol. 30, 221-3 (2012).
38. Paik YK, Omenn GS, Uhlen M, Hanash S, Marko-Varga G, Aebersold R, et al. Standard Guidelines for the Chromosome-Centric Human Proteome Project. J Proteome Res. 11, 2005-13 (2012).
39. Marko-Varga G, Omenn GS, Paik YK, Hancock WS. A First Step toward Completion of a Genome-Wide Characterization of the Human Proteome. J Proteome Res. 12, 1-5 (2013).
40. Lane L, Bairoch A, Beavis RC, Deutsch EW, Gaudet P, Lundberg E, et al. Metrics for the Human Proteome Project 2013-2014 and Strategies for Finding Missing Proteins. J Proteome Res. 13, 15-20 (2014).
41. Omenn GS. The Strategy, Organization, and Progress of the HUPO Human Proteome Project. J Proteomics. 100, 3-7 (2014).
42. Choong WK, Chang HY, Chen CT, Tsai CF, Hsu WL, Chen YJ, et al. Informatics View on the Challenges of Identifying Missing Proteins from Shotgun Proteomics. J Proteome Res. 14, 5396-407 (2015).
43. Cho JY, Lee HJ, Jeong SK, Kim KY, Kwon KH, Yoo JS, et al. Combination of Multiple Spectral Libraries Improves the Current Search Methods Used to Identify Missing Proteins in the Chromosome-Centric Human Proteome Project. J Proteome Res. 14, 4959-66 (2015).
44. Ellis DI, Dunn WB, Griffin JL, Allwood JW, Goodacre R. Metabolic Fingerprinting as a Diagnostic Tool. Pharmacogenomics. 8, 1243-66 (2007).
45. Patti GJ, Yanes O, Siuzdak G. Innovation: Metabolomics: The Apogee of the Omics Trilogy. Nat Rev Mol Cell Biol. 13, 263-9 (2012).
46. Weiss RH, Kim K. Metabolomics in the Study of Kidney Diseases. Nat Rev Nephrol. 8, 22-33 (2012).
47. Baker M. Metabolomics: From Small Molecules to Big Ideas. Nat Methods. 8, 117-21 (2011).
48. Madsen R, Lundstedt T, Trygg J. Chemometrics in Metabolomics--a Review in Human Disease Diagnosis. Anal Chim Acta. 659, 23-33 (2010).
49. Issaq HJ, Van QN, Waybright TJ, Muschik GM, Veenstra TD. Analytical and Statistical Approaches to Metabolomics Research. J Sep Sci. 32, 2183-99 (2009).
50. Mamas M, Dunn WB, Neyses L, Goodacre R. The Role of Metabolites and Metabolomics in Clinically Applicable Biomarkers of Disease. Arch Toxicol. 85, 5-17 (2011).
51. Blekherman G, Laubenbacher R, Cortes DF, Mendes P, Torti FM, Akman S, et al. Bioinformatics Tools for Cancer Metabolomics. Metabolomics. 7, 329-43 (2011).
52. Jaitly N, Mayampurath A, Littlefield K, Adkins JN, Anderson GA, Smith RD. Decon2LS: An Open-Source Software Package for Automated Processing and Visualization of High Resolution Mass Spectrometry Data. BMC Bioinformatics. 10, 87 (2009).
53. Sugimoto M, Hirayama A, Ishikawa T, Robert M, Baran R, Uehara K, et al. Differential Metabolomics Software for Capillary Electrophoresis-Mass Spectrometry Data Analysis. Metabolomics. 6, 27-41 (2010).
54. Scalbert A, Brennan L, Fiehn O, Hankemeier T, Kristal BS, van Ommen B, et al. Mass-Spectrometry-Based Metabolomics: Limitations and Recommendations for Future Progress with Particular Focus on Nutrition Research. Metabolomics. 5, 435-58 (2009).
55. Smith CA, Want EJ, O'Maille G, Abagyan R, Siuzdak G. XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification. Anal Chem. 78, 779-87 (2006).
56. Lommen A. MetAlign: Interface-Driven, Versatile Metabolomics Tool for Hyphenated Full-Scan Mass Spectrometry Data Preprocessing. Anal Chem. 81, 3079-86 (2009).
57. Pluskal T, Castillo S, Villar-Briones A, Oresic M. MZmine 2: Modular Framework for Processing, Visualizing, and Analyzing Mass Spectrometry-Based Molecular Profile Data. BMC Bioinformatics. 11, 395 (2010).
58. Kenar E, Franken H, Forcisi S, Wormann K, Haring HU, Lehmann R, et al. Automated Label-Free Quantification of Metabolites from Liquid Chromatography-Mass Spectrometry Data. Mol Cell Proteomics. 13, 348-59 (2014).
59. Aberg KM, Torgrip RJ, Kolmert J, Schuppe-Koistinen I, Lindberg J. Feature Detection and Alignment of Hyphenated Chromatographic-Mass Spectrometric Data. Extraction of Pure Ion Chromatograms Using Kalman Tracking. J Chromatogr A. 1192, 139-46 (2008).
60. Sugimoto M, Kawakami M, Robert M, Soga T, Tomita M. Bioinformatics Tools for Mass Spectroscopy-Based Metabolomic Data Processing and Analysis. Curr Bioinform. 7, 96-108 (2012).
61. Tautenhahn R, Bottcher C, Neumann S. Highly Sensitive Feature Detection for High Resolution LC/MS. BMC Bioinformatics. 9, 504 (2008).
62. Kuhl C, Tautenhahn R, Bottcher C, Larson TR, Neumann S. CAMERA: An Integrated Strategy for Compound Spectra Extraction and Annotation of Liquid Chromatography/Mass Spectrometry Data Sets. Anal Chem. 84, 283-9 (2012).
63. Cleveland WS. Robust Locally Weighted Regression and Smoothing Scatterplots. Journal of the American Statistical Association. 74, 829-36 (1979).
64. Cleveland WS. LOWESS: A Program for Smoothing Scatterplots by Robust Locally Weighted Regression. American Statistician. 35, 54 (1981).
65. Matsuda F, Yonekura-Sakakibara K, Niida R, Kuromori T, Shinozaki K, Saito K. MS/MS Spectral Tag-Based Annotation of Non-Targeted Profile of Plant Secondary Metabolites. Plant J. 57, 555-77 (2009).
66. Katajamaa M, Oresic M. Data Processing for Mass Spectrometry-Based Metabolomics. J Chromatogr A. 1158, 318-28 (2007).
67. Keller BO, Sui J, Young AB, Whittal RM. Interferences and Contaminants Encountered in Modern Mass Spectrometry. Anal Chim Acta. 627, 71-81 (2008).
68. Tolstikov VV, Lommen A, Nakanishi K, Tanaka N, Fiehn O. Monolithic Silica-Based Capillary Reversed-Phase Liquid Chromatography/Electrospray Mass Spectrometry for Plant Metabolomics. Anal Chem. 75, 6737-40 (2003).
69. Wishart DS, Jewison T, Guo AC, Wilson M, Knox C, Liu Y, et al. HMDB 3.0--the Human Metabolome Database in 2013. Nucleic Acids Res. 41, D801-7 (2013).
70. Matsuda F, Hirai MY, Sasaki E, Akiyama K, Yonekura-Sakakibara K, Provart NJ, et al. AtMetExpress Development: A Phytochemical Atlas of Arabidopsis Development. Plant Physiol. 152, 566-78 (2010).
71. Lynn KS, Cheng ML, Chen YR, Hsu C, Chen A, Lih TM, et al. Metabolite Identification for Mass Spectrometry-Based Metabolomics Using Multiple Types of Correlated Ion Information. Anal Chem. 87, 2143-51 (2015).
72. Geromanos SJ, Vissers JPC, Silva JC, Dorschel CA, Li GZ, Gorenstein MV, et al. The Detection, Correlation, and Comparison of Peptide Precursor and Product Ions from Data Independent LC-MS with Data Dependant LC-MS/MS. Proteomics. 9, 1683-95 (2009).
73. Wong JWH, Schwahn AB, Downard KM. ETISEQ - an Algorithm for Automated Elution Time Ion Sequencing of Concurrently Fragmented Peptides for Mass Spectrometry-Based Proteomics. BMC Bioinformatics. 10, (2009).
74. Gillet LC, Navarro P, Tate S, Rost H, Selevsek N, Reiter L, et al. Targeted Data Extraction of the MS/MS Spectra Generated by Data-Independent Acquisition: A New Concept for Consistent and Accurate Proteome Analysis. Molecular & Cellular Proteomics. 11, (2012).
75. Kessner D, Chambers M, Burke R, Agus D, Mallick P. ProteoWizard: Open Source Software for Rapid Proteomics Tools Development. Bioinformatics. 24, 2534-6 (2008).
76. Eidhammer I, Flikka K, Martens L, Mikalsen S. Computational Methods for Mass Spectrometry Proteomics. John Wiley & Sons Inc.; 2007.
77. Reiter L, Rinner O, Picotti P, Huttenhain R, Beck M, Brusniak MY, et al. mProphet: Automated Data Processing and Statistical Validation for Large-Scale SRM Experiments. Nat Methods. 8, 430-U85 (2011).
78. Schluesener D, Fischer F, Kruip J, Rogner M, Poetsch A. Mapping the Membrane Proteome of Corynebacterium glutamicum. Proteomics. 5, 1317-30 (2005).
79. Ma ZQ, Chambers MC, Ham AJL, Cheek KL, Whitwell CW, Aerni HR, et al. ScanRanker: Quality Assessment of Tandem Mass Spectra Via Sequence Tagging. Journal of Proteome Research. 10, 2896-904 (2011).
80. Seol Y, Hardin AH, Strub MP, Charvin G, Neuman KC. Comparison of DNA Decatenation by Escherichia coli Topoisomerase IV and Topoisomerase III: Implications for Non-Equilibrium Topology Simplification. Nucleic Acids Research. 41, 4640-9 (2013).
81. Li C, Zhang Y, Vankemmelbeke M, Hecht O, Aleanizy FS, Macdonald C, et al. Structural Evidence That Colicin A Protein Binds to a Novel Binding Site of TolA Protein in Escherichia coli Periplasm. Journal of Biological Chemistry. 287, 19048-57 (2012).
82. Legrain P, Aebersold R, Archakov A, Bairoch A, Bala K, Beretta L, et al. The Human Proteome Project: Current State and Future Direction. Mol Cell Proteomics. 10, M111.009993 (2011).
83. Lane L, Argoud-Puy G, Britan A, Cusin I, Duek PD, Evalet O, et al. neXtProt: A Knowledge Platform for Human Proteins. Nucleic Acids Res. 40, D76-83 (2012).
84. Beck M, Claassen M, Aebersold R. Comprehensive Proteomics. Curr Opin Biotechnol. 22, 3-8 (2011).
85. Farrah T, Deutsch EW, Hoopmann MR, Hallows JL, Sun Z, Huang CY, et al. The State of the Human Proteome in 2012 as Viewed through PeptideAtlas. Journal of Proteome Research. 12, 162-71 (2013).
86. Shiromizu T, Adachi J, Watanabe S, Murakami T, Kuga T, Muraoka S, et al. Identification of Missing Proteins in the neXtProt Database and Unregistered Phosphopeptides in the PhosphoSitePlus Database as Part of the Chromosome-Centric Human Proteome Project. J Proteome Res. 12, 2414-21 (2013).
87. Ezkurdia I, Juan D, Rodriguez JM, Frankish A, Diekhans M, Harrow J, et al. Multiple Evidence Strands Suggest That There May Be as Few as 19 000 Human Protein-Coding Genes. Human Molecular Genetics. 23, 5866-78 (2014).
88. Kim MS, Pinto SM, Getnet D, Nirujogi RS, Manda SS, Chaerkady R, et al. A Draft Map of the Human Proteome. Nature. 509, 575-81 (2014).
89. Wilhelm M, Schlegl J, Hahne H, Moghaddas Gholami A, Lieberenz M, Savitski MM, et al. Mass-Spectrometry-Based Draft of the Human Proteome. Nature. 509, 582-7 (2014).
90. Nesvizhskii AI, Aebersold R. Interpretation of Shotgun Proteomic Data: The Protein Inference Problem. Mol Cell Proteomics. 4, 1419-40 (2005).
91. Nesvizhskii AI. A Survey of Computational Methods and Error Rate Estimation Procedures for Peptide and Protein Identification in Shotgun Proteomics. J Proteomics. 73, 2092-123 (2010).
92. Li J, Su ZL, Ma ZQ, Slebos RJC, Halvey P, Tabb DL, et al. A Bioinformatics Workflow for Variant Peptide Detection in Shotgun Proteomics. Molecular & Cellular Proteomics. 10, (2011).
93. Nijveen H, Kester MGD, Hassan C, Viars A, de Ru AH, de Jager M, et al. HSPVdb-the Human Short Peptide Variation Database for Improved Mass Spectrometry-Based Detection of Polymorphic HLA-Ligands. Immunogenetics. 63, 143-53 (2011).
94. Roth MJ, Forbes AJ, Boyne MT, Kim YB, Robinson DE, Kelleher NL. Precise and Parallel Characterization of Coding Polymorphisms, Alternative Splicing, and Modifications in Human Proteins by Mass Spectrometry. Molecular & Cellular Proteomics. 4, 1002-8 (2005).
95. Su ZD, Sun L, Yu DX, Li RX, Li HX, Yu ZJ, et al. Quantitative Detection of Single Amino Acid Polymorphisms by Targeted Proteomics. Journal of Molecular Cell Biology. 3, 309-15 (2011).
96. Alves G, Ogurtsov AY, Yu YK. RAId_DBS: Mass-Spectrometry Based Peptide Identification Web Server with Knowledge Integration. BMC Genomics. 9, 505 (2008).
97. Xi H, Park JS, Ding GH, Lee YH, Li YX. SysPIMP: The Web-Based Systematical Platform for Identifying Human Disease-Related Mutated Sequences from Mass Spectrometry. Nucleic Acids Research. 37, D913-D20 (2009).
98. Apweiler R, Bairoch A, Wu CH, Barker WC, Boeckmann B, Ferro S, et al. UniProt: The Universal Protein Knowledgebase. Nucleic Acids Res. 32, D115-9 (2004).
99. Lindskog C. The Potential Clinical Impact of the Tissue-Based Map of the Human Proteome. Expert Rev Proteomics. 12, 213-5 (2015).
100. Farrah T, Deutsch EW, Hoopmann MR, Hallows JL, Sun Z, Huang CY, et al. The State of the Human Proteome in 2012 as Viewed through PeptideAtlas. J Proteome Res. 12, 162-71 (2013).
101. Picotti P, Lam H, Campbell D, Deutsch EW, Mirzaei H, Ranish J, et al. A Database of Mass Spectrometric Assays for the Yeast Proteome. Nature Methods. 5, 913-4 (2008).
102. Keil B. Specificity of Proteolysis. Berlin; New York: Springer-Verlag; 1992. ix, 336 p.
103. Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, et al. dbSNP: The NCBI Database of Genetic Variation. Nucleic Acids Res. 29, 308-11 (2001).
104. Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, Beare D, et al. COSMIC: Mining Complete Cancer Genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res. 39, D945-50 (2011).
105. Perkins DN, Pappin DJ, Creasy DM, Cottrell JS. Probability-Based Protein Identification by Searching Sequence Databases Using Mass Spectrometry Data. Electrophoresis. 20, 3551-67 (1999).
106. Eng JK, McCormack AL, Yates JR. An Approach to Correlate Tandem Mass Spectral Data of Peptides with Amino Acid Sequences in a Protein Database. J Am Soc Mass Spectrom. 5, 976-89 (1994).
107. Craig R, Beavis RC. TANDEM: Matching Proteins with Tandem Mass Spectra. Bioinformatics. 20, 1466-7 (2004).
108. Meyer-Arendt K, Old WM, Houel S, Renganathan K, Eichelberger B, Resing KA, et al. IsoformResolver: A Peptide-Centric Algorithm for Protein Inference. Journal of Proteome Research. 10, 3060-75 (2011).
109. Branca RM, Orre LM, Johansson HJ, Granholm V, Huss M, Perez-Bercoff A, et al. HiRIEF LC-MS Enables Deep Proteome Coverage and Unbiased Proteogenomics. Nature Methods. 11, 59-62 (2014).
110. Burkard TR, Planyavsky M, Kaupe I, Breitwieser FP, Burckstummer T, Bennett KL, et al. Initial Characterization of the Human Central Proteome. BMC Syst Biol. 5, 17 (2011).
111. Rety S, Sopkova-de Oliveira Santos J, Dreyfuss L, Blondeau K, Hofbauerova K, Raguenes-Nicol C, et al. The Crystal Structure of Annexin A8 Is Similar to That of Annexin A3. Journal of Molecular Biology. 345, 1131-9 (2005).
112. Swaney DL, Wenger CD, Coon JJ. Value of Using Multiple Proteases for Large-Scale Mass Spectrometry-Based Proteomics. J Proteome Res. 9, 1323-9 (2010).
113. Wisniewski JR, Mann M. Consecutive Proteolytic Digestion in an Enzyme Reactor Increases Depth of Proteomic and Phosphoproteomic Analysis. Anal Chem. 84, 2631-7 (2012).
114. Guo X, Trudgian DC, Lemoff A, Yadavalli S, Mirzaei H. Confetti: A Multiprotease Map of the HeLa Proteome for Comprehensive Proteomics. Mol Cell Proteomics. 13, 1573-84 (2014).
115. Chen Q, Yan G, Zhang X. Applying Multiple Proteases to Direct Digestion of Hundred-Scale Cell Samples for Proteome Analysis. Rapid Commun Mass Spectrom. 29, 1389-94 (2015).
116. Giansanti P, Aye TT, van den Toorn H, Peng M, van Breukelen B, Heck AJ. An Augmented Multiple-Protease-Based Human Phosphopeptide Atlas. Cell Rep. 11, 1834-43 (2015).
117. Apweiler R, Attwood TK, Bairoch A, Bateman A, Birney E, Biswas M, et al. InterPro - an Integrated Documentation Resource for Protein Families, Domains and Functional Sites. Bioinformatics. 16, 1145-50 (2000).
118. Frumin I, Sobel N, Gilad Y. Does a Unique Olfactory Genome Imply a Unique Olfactory World? Nature Neuroscience. 17, 6-8 (2014).
119. Hasin-Brumshtein Y, Lancet D, Olender T. Human Olfaction: From Genomic Variation to Phenotypic Diversity. Trends in Genetics. 25, 178-84 (2009).
120. Mainland JD, Keller A, Li YR, Zhou T, Trimmer C, Snyder LL, et al. The Missense of Smell: Functional Variability in the Human Odorant Receptor Repertoire. Nature Neuroscience. 17, 114-20 (2014).
121. Olender T, Waszak SM, Viavant M, Khen M, Ben-Asher E, Reyes A, et al. Personal Receptor Repertoires: Olfaction as a Model. BMC Genomics. 13, (2012).
122. Woo S, Cha SW, Merrihew G, He Y, Castellana N, Guest C, et al. Proteogenomic Database Construction Driven from Large Scale RNA-Seq Data. J Proteome Res. 13, 21-8 (2014).
123. Wang X, Zhang B. customProDB: An R Package to Generate Customized Protein Databases from RNA-Seq Data for Proteomics Search. Bioinformatics. 29, 3235-7 (2013).
124. Sheynkman GM, Shortreed MR, Frey BL, Scalf M, Smith LM. Large-Scale Mass Spectrometric Detection of Variant Peptides Resulting from Nonsynonymous Nucleotide Differences. J Proteome Res. 13, 228-40 (2014).
125. Wang X, Slebos RJ, Wang D, Halvey PJ, Tabb DL, Liebler DC, et al. Protein Identification Using Customized Protein Sequence Databases Derived from RNA-Seq Data. J Proteome Res. 11, 1009-17 (2012).
126. Nesvizhskii AI. Proteogenomics: Concepts, Applications and Computational Strategies. Nature Methods. 11, 1114-25 (2014).
127. Neuhaus EM, Zhang W, Gelis L, Deng Y, Noldus J, Hatt H. Activation of an Olfactory Receptor Inhibits Proliferation of Prostate Cancer Cells. Journal of Biological Chemistry. 284, 16218-25 (2009).
128. Spehr M, Gisselmann G, Poplawski A, Riffell JA, Wetzel CH, Zimmer RK, et al. Identification of a Testicular Odorant Receptor Mediating Human Sperm Chemotaxis. Science. 299, 2054-8 (2003).
129. Kang N, Kim H, Jae Y, Lee N, Ku CR, Margolis F, et al. Olfactory Marker Protein Expression Is an Indicator of Olfactory Receptor-Associated Events in Non-Olfactory Tissues. PLoS One. 10, e0116097 (2015).
130. Sanz G, Leray I, Dewaele A, Sobilo J, Lerondel S, Bouet S, et al. Promotion of Cancer Cell Invasiveness and Metastasis Emergence Caused by Olfactory Receptor Stimulation. PLoS One. 9, e85110 (2014).
131. Weng J, Wang J, Hu X, Wang F, Ittmann M, Liu M. PSGR2, a Novel G-Protein Coupled Receptor, Is Overexpressed in Human Prostate Cancer. Int J Cancer. 118, 1471-80 (2006).
132. Xu LL, Stackhouse BG, Florence K, Zhang W, Shanmugam N, Sesterhenn IA, et al. PSGR, a Novel Prostate-Specific Gene with Homology to a G Protein-Coupled Receptor, Is Overexpressed in Prostate Cancer. Cancer Res. 60, 6568-72 (2000).
133. Flegel C, Manteniotis S, Osthold S, Hatt H, Gisselmann G. Expression Profile of Ectopic Olfactory Receptors Determined by Deep Sequencing. PLoS One. 8, e55368 (2013).
134. Ezkurdia I, Vazquez J, Valencia A, Tress M. Analyzing the First Drafts of the Human Proteome. Journal of Proteome Research. (2014).
135. Kyte J, Doolittle RF. A Simple Method for Displaying the Hydropathic Character of a Protein. Journal of Molecular Biology. 157, 105-32 (1982).
136. Fu LM, Niu BF, Zhu ZW, Wu ST, Li WZ. CD-HIT: Accelerated for Clustering the Next-Generation Sequencing Data. Bioinformatics. 28, 3150-2 (2012).
137. Rost HL, Rosenberger G, Navarro P, Gillet L, Miladinovic SM, Schubert OT, et al. OpenSWATH Enables Automated, Targeted Analysis of Data-Independent Acquisition MS Data. Nature Biotechnology. 32, 219-23 (2014).