Correa, Elon; Goodacre, Royston (2011)
Publisher: BioMed Central
Journal: BMC Bioinformatics
Languages: English
Types: Article
Subjects: R858-859.7, Methodology Article, Computer applications to medicine. Medical informatics, Biology (General), QH301-705.5



The rapid identification of Bacillus spores and bacterial identification are paramount because of their implications in food poisoning, pathogenesis and their use as potential biowarfare agents. Many automated analytical techniques such as Curie-point pyrolysis mass spectrometry (Py-MS) have been used to identify bacterial spores, giving rise to large amounts of analytical data. This high number of features makes interpretation of the data extremely difficult. We analysed Py-MS data from 36 different strains of aerobic endospore-forming bacteria encompassing seven different species. These bacteria were grown axenically on nutrient agar, and vegetative biomass and spores were analysed by Curie-point Py-MS.


We develop a novel genetic algorithm-Bayesian network (GA-BN) algorithm that accurately identifies and selects a small subset of key relevant mass spectra (biomarkers) to be further analysed. Once identified, this subset of relevant biomarkers was then used to identify Bacillus spores successfully and to identify Bacillus species via a Bayesian network model built specifically for this reduced set of features.
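As a rough illustration of the wrapper idea described above, the sketch below runs a small genetic algorithm over binary feature masks and scores each mask by the validation accuracy of a classifier trained on the selected features only. It is not the authors' implementation. A Bernoulli naive Bayes classifier (the simplest Bayesian network structure) stands in for the learned network, the data are synthetic binary features rather than Py-MS spectra, and every name and parameter value (make_data, nb_accuracy, ga_select, penalty, p_mut and so on) is a hypothetical choice made for this example.

```python
import math
import random

def make_data(rng, n, n_features=20, informative=(3, 11)):
    """Synthetic binary data: only the 'informative' columns track the label."""
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.randint(0, 1) for _ in range(n_features)]
        for j in informative:
            # An informative feature equals the label 90% of the time.
            x[j] = y if rng.random() < 0.9 else 1 - y
        rows.append(x)
        labels.append(y)
    return rows, labels

def nb_accuracy(train, test, mask):
    """Accuracy of a Bernoulli naive Bayes model restricted to masked features."""
    feats = [j for j, keep in enumerate(mask) if keep]
    if not feats:
        return 0.0
    (x_train, y_train), (x_test, y_test) = train, test
    # Laplace-smoothed counts of x_j == 1 per class (assumed smoothing scheme).
    ones = {c: [1] * len(feats) for c in (0, 1)}
    totals = {c: 2 for c in (0, 1)}
    for x, y in zip(x_train, y_train):
        totals[y] += 1
        for k, j in enumerate(feats):
            ones[y][k] += x[j]
    correct = 0
    for x, y in zip(x_test, y_test):
        scores = {}
        for c in (0, 1):
            s = math.log(totals[c])  # unnormalised class prior
            for k, j in enumerate(feats):
                p = ones[c][k] / totals[c]
                s += math.log(p if x[j] else 1 - p)
            scores[c] = s
        correct += max(scores, key=scores.get) == y
    return correct / len(y_test)

def ga_select(train, valid, n_features, rng,
              pop_size=20, generations=15, p_mut=0.05, penalty=0.01):
    """Genetic search over feature masks; fitness = accuracy - penalty * size."""
    cache = {}

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = nb_accuracy(train, valid, mask) - penalty * sum(mask)
        return cache[key]

    def pick(pop):
        # Binary tournament selection.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    # Start from the full feature set plus random masks.
    pop = [[1] * n_features]
    pop += [[rng.randint(0, 1) for _ in range(n_features)]
            for _ in range(pop_size - 1)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = pick(pop), pick(pop)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = [1 - g if rng.random() < p_mut else g
                     for g in p1[:cut] + p2[cut:]]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best, fitness(best), fitness([1] * n_features)

rng = random.Random(0)
train = make_data(rng, 80)
valid = make_data(rng, 40)
best, best_fit, full_fit = ga_select(train, valid, 20, rng)
selected = [j for j, keep in enumerate(best) if keep]
print("selected features:", selected)
print("fitness (selected vs. all features):", best_fit, full_fit)
```

The size penalty in the fitness function is one simple way to favour a parsimonious subset. Because the full feature set is placed in the initial population and the best mask is carried forward unchanged, the returned mask never scores worse than using all features under this fitness.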


This final compact Bayesian network classification model is parsimonious and computationally fast to run, and its graphical visualization allows easy interpretation of the probabilistic relationships among selected biomarkers. In addition, we compare the features selected by the genetic algorithm-Bayesian network (GA-BN) approach with the features selected by partial least squares-discriminant analysis (PLS-DA). The classification accuracy results show that the set of features selected by the GA-BN approach is far superior to the set selected by PLS-DA.

