Coverage and consistency

Bioinformatics aspects of the analysis of multirun iTRAQ experiments with wheat leaves

Dana Pascovici, Donald M. Gardiner, Xiaomin Song, Edmond Breen, Peter S. Solomon, Tim Keighley, Mark P. Molloy*

*Corresponding author for this work

Research output: Contribution to journal (Article)

12 Citations (Scopus)

Abstract

The hexaploid genome of bread wheat (Triticum aestivum) is large (17 Gb) and repetitive, which has delayed full sequencing and annotation of the genome, a prerequisite for effective quantitative proteomics analysis. Aware of these constraints, we investigated the most effective approaches for shotgun proteomic analyses of bread wheat that would support large-scale quantitative comparisons using iTRAQ reagents. We used a data set generated by two-dimensional LC-MS of iTRAQ-labeled peptides from wheat leaves. The main items considered in this study were the choice of sequence database for matching LC-MS data, the consistency of identification when multiple LC-MS runs were acquired, and the options for downstream functional analysis to generate useful insight. For peptide identification we examined the extensive NCBInr plant database, a smaller composite cereals database, the Brachypodium distachyon model plant genome, the EST-based SuperWheat database, and the genome sequence of the recently sequenced D-genome progenitor Aegilops tauschii. Although the SuperWheat database assigned the most spectra, this extremely large database could not be readily manipulated for the robust protein grouping that is required for large-scale, multirun quantitative experiments. We demonstrated a pragmatic alternative of using the composite cereals database for peptide spectra matching. The stochastic aspect of protein grouping across LC-MS runs was investigated using the smaller composite cereals database, where we found that attaching the Brachypodium best BLAST hit reduced this problem. Further, assigning quantitation to the best Brachypodium locus yielded promising results, enabling integration with existing downstream data mining and functional analysis tools. Our study demonstrated viable approaches for quantitative proteomics analysis of bread wheat samples and showed how these approaches could be similarly adopted for analysis of other organisms with unsequenced or incompletely sequenced genomes.
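
The idea of attaching a best Brachypodium BLAST hit to each identified protein, and then assigning quantitation to that locus, can be illustrated with a minimal sketch. This is not the authors' pipeline: the accessions, locus names, bit scores, and log ratios below are made up, and averaging log ratios per locus is an assumed roll-up rule that the abstract does not specify. The sketch only shows why two runs that report different accessions for the same underlying protein can still be compared once both are mapped to a shared locus.

```python
from collections import defaultdict
from statistics import mean


def best_hits(blast_rows):
    """Keep the highest-bit-score subject for each query accession."""
    best = {}
    for query, subject, bitscore in blast_rows:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def roll_up(run, hit_map):
    """Average the log ratios of all proteins in one run that share a locus.

    Averaging is an assumption made for this illustration only.
    """
    by_locus = defaultdict(list)
    for accession, log_ratio in run.items():
        locus = hit_map.get(accession)
        if locus is not None:  # proteins without a BLAST hit are skipped
            by_locus[locus].append(log_ratio)
    return {locus: mean(values) for locus, values in by_locus.items()}


# Hypothetical BLAST results: (query accession, Brachypodium locus, bit score).
blast_rows = [
    ("cerealA", "Bradi1g00001", 250.0),
    ("cerealA", "Bradi2g00002", 90.0),
    ("cerealB", "Bradi1g00001", 240.0),
    ("cerealC", "Bradi3g00003", 310.0),
]

# Hypothetical per-run quantitation: accession -> log ratio.
# Run 1 reports cerealA and run 2 reports cerealB for what may be the same protein.
run1 = {"cerealA": 0.5, "cerealC": -0.2}
run2 = {"cerealB": 0.7, "cerealC": -0.4}

hit_map = best_hits(blast_rows)
shared_accessions = set(run1) & set(run2)

locus1 = roll_up(run1, hit_map)
locus2 = roll_up(run2, hit_map)
shared_loci = set(locus1) & set(locus2)

print(sorted(shared_accessions))
print(sorted(shared_loci))
```

In this toy case only one accession is common to both runs, while two loci are, because the two different cereal accessions resolve to the same best hit. Keying quantitation by locus also yields identifiers that existing Brachypodium annotation resources can consume.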

Original language: English
Pages (from-to): 4870-4881
Number of pages: 12
Journal: Journal of Proteome Research
Volume: 12
Issue number: 11
Publication status: Published - 1 Nov 2013
