Accounting for isotopic clustering in Fourier transform mass spectrometry data analysis for clinical diagnostic studies. (English) Zbl 1359.92004

Summary: Mass-spectrometry-based clinical proteomics has emerged as a powerful tool for high-throughput protein profiling and biomarker discovery. Recent improvements in mass spectrometry technology have boosted the potential of proteomic studies in biomedical research. However, the complexity of proteomic expression data introduces new statistical challenges in summarizing and analyzing the acquired data, and the development of statistical methods for optimally processing such data is a growing field of research. In this paper we present simple yet appropriate methods to preprocess, summarize and analyze high-throughput MALDI-FTICR mass spectrometry data collected in a case-control fashion, while dealing with the statistical challenges that accompany such data. The known statistical properties of the isotopic distribution of the peptide molecules are used to preprocess the spectra and translate the proteomic expression into a condensed data set. Information on either the intensity level or the shape of the identified isotopic clusters is used to derive summary measures on which diagnostic rules for disease status allocation are based. Results indicate that both the shape of the identified isotopic clusters and the overall intensity level carry information on the class outcome and can be used to predict the presence or absence of the disease.
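The idea of condensing a spectrum into isotopic clusters and then summarizing each cluster by its intensity level or its shape can be illustrated with a minimal Python sketch. This is not the authors' procedure: the greedy grouping rule, the function names `find_clusters` and `summarize`, the tolerance `tol`, and the spacing constant of about 1.00235 Da between neighbouring isotopic peaks of a singly charged peptide are all assumptions made for illustration.

```python
# Hypothetical sketch: group peaks into isotopic clusters, then summarize them.
# Assumed average spacing (Da) between consecutive isotopic peaks of a peptide.
ISOTOPE_SPACING = 1.00235


def find_clusters(peaks, charge=1, tol=0.01, min_size=2):
    """Greedily group (mz, intensity) peaks spaced by ISOTOPE_SPACING / charge."""
    peaks = sorted(peaks)
    step = ISOTOPE_SPACING / charge
    used = set()
    clusters = []
    for i, (mz, _) in enumerate(peaks):
        if i in used:
            continue
        members = [i]
        expected = mz + step  # where the next isotopic peak should appear
        for j in range(i + 1, len(peaks)):
            if j in used:
                continue
            delta = peaks[j][0] - expected
            if delta > tol:
                break  # already past the expected position: the chain ends
            if abs(delta) <= tol:
                members.append(j)
                expected = peaks[j][0] + step
        if len(members) >= min_size:
            used.update(members)
            clusters.append([peaks[k] for k in members])
    return clusters


def summarize(cluster):
    """Return an intensity-level summary and a shape summary for one cluster."""
    intensities = [intensity for _, intensity in cluster]
    total = sum(intensities)
    return {
        "mono_mz": cluster[0][0],                    # lowest m/z in the cluster
        "total_intensity": total,                    # overall intensity level
        "shape": [x / total for x in intensities],   # relative peak heights
    }


# Toy spectrum: three peaks forming one cluster plus one isolated peak.
toy_peaks = [(1000.50, 100.0), (1001.50235, 60.0), (1002.5047, 20.0), (1500.0, 5.0)]
toy_clusters = find_clusters(toy_peaks)
toy_summary = summarize(toy_clusters[0])
```

In a case-control setting, per-cluster summaries of this kind would form the condensed data set from which a classifier is built; which summary to use and how to handle clusters missing in some samples are modelling choices not covered by this sketch.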

MSC:

92B15 General biostatistics
92C55 Biomedical imaging and signal processing
62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

R; MINITAB
