Doytchinova, Irini A; Flower, Darren R (2007)
Publisher: BioMed Central
Journal: BMC Bioinformatics
Languages: English
Types: Article
Subjects: R858-859.7, Methodology Article, Computer applications to medicine. Medical informatics, Biology (General), QH301-705.5



Vaccine development in the post-genomic era often begins with the in silico screening of genome information, with the most probable protective antigens being predicted rather than requiring causative microorganisms to be grown. Despite the obvious advantages of this approach, such as speed and cost efficiency, its success remains dependent on the accuracy of antigen prediction. Most approaches use sequence alignment to identify antigens. This is problematic for several reasons. Some proteins lack obvious sequence similarity, although they may share similar structures and biological properties. The antigenicity of a sequence may be encoded in a subtle and recondite manner not amenable to direct identification by sequence alignment. The discovery of truly novel antigens will be frustrated by their lack of similarity to antigens of known provenance. To overcome the limitations of alignment-dependent methods, we propose a new alignment-free approach for antigen prediction, which is based on auto cross covariance (ACC) transformation of protein sequences into uniform vectors of principal amino acid properties.
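To make the idea of a uniform vector concrete, the following is a minimal sketch of an ACC transformation, assuming the commonly used definition in which, for descriptors j and k and lag l, the term is the mean of z(j, i) * z(k, i + l) over all positions i. The descriptor table `TOY_DESCRIPTORS`, the function name `acc_transform` and the `max_lag` parameter are hypothetical illustrations; the values are placeholders, not the published amino acid property scales, and the abstract does not state which lags or descriptors the authors used.

```python
from typing import Dict, List

# Hypothetical toy descriptors (two made-up properties per residue).
# These are NOT the published principal-property values; they only
# illustrate the mechanics of the transformation.
TOY_DESCRIPTORS: Dict[str, List[float]] = {
    "A": [0.0, -1.0],
    "G": [2.0, 1.0],
    "K": [1.0, 0.5],
    "L": [-1.5, 0.0],
}


def acc_transform(
    seq: str, descriptors: Dict[str, List[float]], max_lag: int
) -> List[float]:
    """Map a sequence of any length to a fixed-length ACC vector.

    For every lag 1..max_lag and every ordered descriptor pair (j, k),
    average descriptor j at position i times descriptor k at position
    i + lag. Pairs with j == k are auto covariance terms, the rest are
    cross covariance terms.
    """
    n = len(seq)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    rows = [descriptors[aa] for aa in seq]
    d = len(rows[0])
    features: List[float] = []
    for lag in range(1, max_lag + 1):
        for j in range(d):
            for k in range(d):
                total = sum(rows[i][j] * rows[i + lag][k] for i in range(n - lag))
                features.append(total / (n - lag))
    return features


# Sequences of different lengths yield vectors of the same length.
short_vec = acc_transform("AGK", TOY_DESCRIPTORS, max_lag=2)
long_vec = acc_transform("AGKLAGGK", TOY_DESCRIPTORS, max_lag=2)
```

The output length is `max_lag * d * d`, where `d` is the number of descriptors per residue, so it depends only on the chosen settings and not on the protein length. This is what allows proteins to be compared without aligning them.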


Bacterial, viral and tumour protein datasets were used to derive models for prediction of whole protein antigenicity. Every set consisted of 100 known antigens and 100 non-antigens. The derived models were tested by internal leave-one-out cross-validation and by external validation using test sets. An additional five training sets for each class of antigens were used to test the stability of the discrimination between antigens and non-antigens. The models performed well in both validations, showing prediction accuracy of 70% to 89%. The models were implemented in a server, which we call VaxiJen.
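The internal validation described above can be sketched as follows. This is only an illustration of leave-one-out cross-validation on fixed-length feature vectors: the nearest-centroid classifier is a simple stand-in chosen for brevity, not the modelling method used for VaxiJen, and the function names and toy data are hypothetical.

```python
from typing import List, Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(
    train_x: Sequence[Sequence[float]], train_y: Sequence[int], x: Sequence[float]
) -> int:
    """Stand-in classifier: assign x to the class with the nearer centroid.

    Label 1 denotes antigen and label 0 denotes non-antigen.
    """
    pos = centroid([v for v, y in zip(train_x, train_y) if y == 1])
    neg = centroid([v for v, y in zip(train_x, train_y) if y == 0])
    return 1 if sq_dist(x, pos) < sq_dist(x, neg) else 0


def loo_accuracy(xs: List[List[float]], ys: List[int]) -> float:
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Toy, made-up feature vectors: three non-antigens and three antigens.
toy_x = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
toy_y = [0, 0, 0, 1, 1, 1]
toy_accuracy = loo_accuracy(toy_x, toy_y)
```

External validation differs only in that the model is fitted once on the training set and then scored on a separate test set that played no part in fitting.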


VaxiJen is the first server for alignment-independent prediction of protective antigens. It was developed to allow antigen classification based solely on the physicochemical properties of proteins, without recourse to sequence alignment. The server can be used on its own or in combination with alignment-based prediction methods. It is freely available online at http://www.jenner.ac.uk/VaxiJen.
