Publisher: BioMed Central
Journal: BMC Bioinformatics
Languages: English
Types: Article
Subjects: R858-859.7, Biology (General), Q, QH301-705.5, Science, DOAJ:Biology and Life Sciences, DOAJ:Biology, Computer applications to medicine. Medical informatics, Bioinformatics (life sciences), Research Article



Modelling the interaction between potentially antigenic peptides and Major Histocompatibility Complex (MHC) molecules is a key step in identifying potential T-cell epitopes. For Class II MHC alleles, the binding groove is open at both ends, causing ambiguity in the positional alignment between the groove and peptide, as well as creating uncertainty as to what parts of the peptide interact with the MHC. Moreover, the antigenic peptides have variable lengths, making naive modelling methods difficult to apply. This paper introduces a kernel method that can handle variable length peptides effectively by quantifying similarities between peptide sequences and integrating these into the kernel.
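To make the idea of a kernel over variable-length peptides concrete, here is a minimal sketch. It is not the kernel defined in the paper. It assumes a simple convolution-style kernel that compares every length-k window of one peptide with every length-k window of the other, which sidesteps the question of where the binding register lies. The residue grouping in `GROUPS`, the similarity values (1.0, 0.6, 0.1), the window length of 9, and the two example sequences are all illustrative assumptions.

```python
from itertools import product
from math import sqrt

# Hypothetical coarse physicochemical grouping of the 20 amino acids
# (an assumption for illustration, not the paper's similarity scores).
GROUPS = ("AVLIMFWC", "STNQYGP", "KRH", "DE")
GROUP_OF = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def residue_similarity(a, b):
    """Toy residue-level similarity: identical > same group > different.

    The values equal 0.1*ones + 0.5*same_group + 0.4*identity, a sum of
    positive semi-definite matrices, so this is a valid kernel on residues.
    """
    if a == b:
        return 1.0
    if GROUP_OF[a] == GROUP_OF[b]:
        return 0.6
    return 0.1


def windows(peptide, k):
    """All contiguous length-k substrings of a peptide."""
    return [peptide[i:i + k] for i in range(len(peptide) - k + 1)]


def window_kernel(x, y, k=9):
    """Sum over all window pairs of the product of residue similarities."""
    if len(x) < k or len(y) < k:
        raise ValueError("both peptides must have at least k residues")
    total = 0.0
    for u, v in product(windows(x, k), windows(y, k)):
        score = 1.0
        for a, b in zip(u, v):
            score *= residue_similarity(a, b)
        total += score
    return total


def normalized_kernel(x, y, k=9):
    """Cosine-style normalization so that K(x, x) == 1 for any length."""
    return window_kernel(x, y, k) / sqrt(
        window_kernel(x, x, k) * window_kernel(y, y, k)
    )


# Two arbitrary example sequences of different lengths (13 and 15 residues).
pep_a = "PKYVKQNTLKLAT"
pep_b = "GELIGILNAAKVPAD"
similarity = normalized_kernel(pep_a, pep_b)
```

Because the sum runs over all window pairs, peptides of different lengths yield a single comparable score, and the resulting Gram matrix could be passed to any kernel classifier that accepts a precomputed kernel.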


The kernel approach presented here shows increased prediction accuracy, with a significantly higher number of true positives and true negatives on multiple MHC class II alleles, when tested on data sets from MHCPEP [1], MHCBN [2], and MHCBench [3]. In cross-validation on the task of separating binders from non-binders, the method achieved an average area under the ROC curve (AROC) of 0.824 on the MHCBench data sets (up from 0.756) and an average AROC of 0.96 across multiple alleles of the MHCPEP database.
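For readers unfamiliar with the metric, the AROC can be computed directly from its rank interpretation: it is the probability that a randomly chosen binder receives a higher score than a randomly chosen non-binder, with ties counted as one half. The sketch below assumes labels of 1 for binders and 0 for non-binders; the scores are made up for illustration and are not taken from the paper.

```python
def aroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: predicted binding scores, higher means more likely a binder.
    labels: 1 for binder, 0 for non-binder.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # binder correctly ranked above non-binder
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Made-up scores: one binder (0.35) is ranked below one non-binder (0.6),
# so 8 of the 9 binder/non-binder pairs are ordered correctly.
example_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
example_aroc = aroc(example_scores, example_labels)
```

An AROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the 0.756, 0.824, and 0.96 figures above should be read.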


The method improves on existing state-of-the-art methods for MHC class II peptide binding prediction by using a custom, knowledge-based representation of peptides. Similarity scores, in contrast to a fixed-length, pocket-specific representation of amino acids, provide a flexible and powerful way of modelling MHC binding, and can easily be applied to other problems involving variable-length sequences.


    • 1. Brusic V, Rudy G, Harrison LC: MHCPEP, a database of MHC-binding peptides: update 1997. Nucleic Acids Res 1998, 26(1):368-371.
    • 2. Bhasin M, Singh H, Raghava GP: MHCBN: a comprehensive database of MHC binding and non-binding peptides. Bioinformatics 2003, 19(5):665-666.
    • 3. Raghava GP: MHCBench: Evaluation of MHC Binding Peptide Prediction Algorithms. 2001.
    • 4. Rhodes DA, Trowsdale J: Genetics and molecular genetics of the MHC. Rev Immunogenet 1999, 1(1):21-31.
    • 5. Yewdell JW, Bennink JR: Immunodominance in major histocompatibility complex class I-restricted T lymphocyte responses. Annu Rev Immunol 1999, 17:51-88.
    • 6. Kropshofer H, Max H, Halder T, Kalbus M, Muller CA, Kalbacher H: Self-peptides from four HLA-DR alleles share hydrophobic anchor residues near the NH2-terminal including proline as a stop signal for trimming. J Immunol 1993, 151(9):4732-4742.
    • 7. Brusic V, Rudy G, Honeyman G, Hammer J, Harrison L: Prediction of MHC class II-binding peptides using an evolutionary algorithm and artificial neural network. Bioinformatics 1998, 14(2):121-130.
    • 8. Tong JC, Zhang GL, Tan TW, August JT, Brusic V, Ranganathan S: Prediction of HLA-DQ3.2beta ligands: evidence of multiple registers in class II binding peptides. Bioinformatics 2006, 22(10):1232-1238.
    • 9. Zavala-Ruiz Z, Strug I, Anderson MW, Gorski J, Stern LJ: A polymorphic pocket at the P10 position contributes to peptide binding specificity in class II MHC proteins. Chem Biol 2004, 11(10):1395-1402.
    • 10. Xia J, Siegel M, Bergseng E, Sollid LM, Khosla C: Inhibition of HLA-DQ2-mediated antigen presentation by analogues of a high affinity 33-residue peptide from alpha2-gliadin. J Am Chem Soc 2006, 128(6):1859-1867.
    • 11. Carson RT, Vignali KM, Woodland DL, Vignali DA: T cell receptor recognition of MHC class II-bound peptide flanking residues enhances immunogenicity and results in altered TCR V region usage. Immunity 1997, 7(3):387-399.
    • 12. Bonomi G, Moschella F, Ombra MN, Del Pozzo G, Granier C, De Berardinis P, Guardiola J: Modulation of TCR recognition of MHC class II/peptide by processed remote N- and C-terminal epitope extensions. Hum Immunol 2000, 61(8):753-763.
    • 13. Arnold PY, La Gruta NL, Miller T, Vignali KM, Adams PS, Woodland DL, Vignali DA: The majority of immunogenic epitopes generate CD4+ T cells that are dependent on MHC class II-bound peptide-flanking residues. J Immunol 2002, 169(2):739-749.
    • 14. Godkin AJ, Smith KJ, Willis A, Tejada-Simon MV, Zhang J, Elliott T, Hill AV: Naturally processed HLA class II peptides reveal highly conserved immunogenic flanking region sequence preferences that reflect antigen processing rather than peptide-MHC interactions. J Immunol 2001, 166(11):6720-6727.
    • 15. Donnes P, Elofsson A: Prediction of MHC class I binding peptides, using SVMHC. BMC Bioinformatics 2002, 3:25.
    • 16. Noguchi H, Hanai T, Honda H, Harrison LC, Kobayashi T: Fuzzy neural network-based prediction of the motif for MHC class II binding peptides. J Biosci Bioeng 2001, 92(3):227-231.
    • 17. Burden FR, Winkler DA: Predictive Bayesian neural network models of MHC class II peptide binding. J Mol Graph Model 2005, 23(6):481-489.
    • 18. Bhasin M, Raghava GP: SVM based method for predicting HLA-DRB1*0401 binding peptides in an antigen sequence. Bioinformatics 2004, 20(3):421-423.
    • 19. Yang ZR, Johnson FC: Prediction of T-cell epitopes using bio-support vector machines. J Chem Inf Model 2005, 45(5):1424-1428.
    • 20. Mallios RR: Predicting class II MHC/peptide multi-level binding with an iterative stepwise discriminant analysis meta-algorithm. Bioinformatics 2001, 17(10):942-948.
    • 21. Murugan N, Dai Y: Prediction of MHC class II binding peptides based on an iterative learning model. Immunome Res 2005, 1:6.
    • 22. Doytchinova IA, Flower DR: Towards the in silico identification of class II restricted T-cell epitopes: a partial least squares iterative self-consistent algorithm for affinity prediction. Bioinformatics 2003, 19(17):2263-2270.
    • 23. Kawashima S, Ogata H, Kanehisa M: AAindex: Amino Acid Index Database. Nucleic Acids Res 1999, 27(1):368-369.
    • 24. Guan P, Doytchinova IA, Walshe VA, Borrow P, Flower DR: Analysis of peptide-protein binding using amino acid descriptors: prediction and experimental verification for human histocompatibility complex HLA-A0201. J Med Chem 2005, 48(23):7418-7425.
    • 25. Noguchi H, Kato R, Hanai T, Matsubara Y, Honda H, Brusic V, Kobayashi T: Hidden Markov model-based prediction of antigenic peptides that interact with MHC class II molecules. J Biosci Bioeng 2002, 94(3):264-270.
    • 26. Karpenko O, Shi J, Dai Y: Prediction of MHC class II binders using the ant colony search strategy. Artif Intell Med 2005, 35(1-2):147-156.
    • 27. Nielsen M, Lundegaard C, Worning P, Hvid CS, Lamberth K, Buus S, Brunak S, Lund O: Improved prediction of MHC class I and class II epitopes using a novel Gibbs sampling approach. Bioinformatics 2004, 20(9):1388-1397.
    • 28. Vapnik VN: The nature of statistical learning theory. New York, Springer; 1995:xv, 188 p.
    • 29. Schölkopf B, Smola A, Müller KR: Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Computation 1998, 10:1299-1319.
    • 30. Schölkopf B, Smola AJ: Learning with kernels: support vector machines, regularization, optimization, and beyond. In Adaptive computation and machine learning. Cambridge, Mass., MIT Press; 2002:xviii, 626 p.
    • 31. Saigo H, Vert JP, Ueda N, Akutsu T: Protein homology detection using string alignment kernels. Bioinformatics 2004, 20(11):1682-1689.
    • 32. Vert JP, Akutsu T, Saigo H: Local Alignment Kernels for Biological Sequences. In Kernel Methods in Computational Biology. Edited by: Schölkopf, Tsuda, Vert. MIT Press; 2004.
    • 33. Kuang R, Ie E, Wang K, Wang K, Siddiqi M, Freund Y, Leslie C: Profile-based string kernels for remote homology detection and motif extraction. J Bioinform Comput Biol 2005, 3(3):527-550.
    • 34. Henikoff S, Henikoff JG: Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 1992, 89(22):10915-10919.
    • 35. Efron B: The jackknife, the bootstrap, and other resampling plans. In CBMS-NSF regional conference series in applied mathematics; 38. Philadelphia, Pa., Society for Industrial and Applied Mathematics; 1982:vii, 92 p.
    • 36. Dosztanyi Z, Torda AE: Amino acid similarity matrices based on force fields. Bioinformatics 2001, 17(8):686-699.
    • 37. Hammer J, Sturniolo T, Sinigaglia F: HLA class II peptide binding specificity and autoimmunity. Adv Immunol 1997, 66:67-100.
    • 38. Hammer J, Bono E, Gallazzi F, Belunis C, Nagy Z, Sinigaglia F: Precise prediction of major histocompatibility complex class II-peptide interaction based on peptide side chain scanning. J Exp Med 1994, 180(6):2353-2358.
    • 39. Xiao YS M.: Prediction of Genomewide Conserved Epitope Profiles of HIV-1: Classifier Choice and Peptide Representation. Statistical Applications in Genetics and Molecular Biology 2005, 4(1):.
    • 40. Hattotuwagama CK, Toseland CP, Guan P, Taylor DJ, Hemsley SL, Doytchinova IA, Flower DR: Toward prediction of class II mouse major histocompatibility complex peptide binding affinity: in silico bioinformatic evaluation using partial least squares, a robust multivariate statistical technique. J Chem Inf Model 2006, 46(3):1491-1502.
    • 41. Wauben MH, van der Kraan M, Grosfeld-Stulemeyer MC, Joosten I: Definition of an extended MHC class II-peptide binding motif for the autoimmune disease-associated Lewis rat RT1.BL molecule. Int Immunol 1997, 9(2):281-290.
    • 42. Muller T, Vingron M: Modeling amino acid replacement. J Comput Biol 2000, 7(6):761-776.
    • 43. Muller T, Spang R, Vingron M: Estimating amino acid substitution models: a comparison of Dayhoff's estimator, the resolvent approach and a maximum likelihood method. Mol Biol Evol 2002, 19(1):8-13.
    • 44. Liu W, Meng X, Xu Q, Flower DR, Li T: Quantitative prediction of mouse class I MHC peptide binding affinity using support vector machine regression (SVR) models. BMC Bioinformatics 2006, 7:182.
    • 45. Zavala-Ruiz Z, Strug I, Walker BD, Norris PJ, Stern LJ: A hairpin turn in a class II MHC-bound peptide orients residues outside the binding groove for T cell recognition. Proc Natl Acad Sci U S A 2004, 101(36):13279-13284.
    • 46. Smith TF, Waterman MS: Identification of common molecular subsequences. J Mol Biol 1981, 147(1):195-197.
    • 47. Altschul SF, Koonin EV: Iterated profile searches with PSI-BLAST--a tool for discovery in protein databases. Trends Biochem Sci 1998, 23(11):444-447.
    • 48. Haussler D: Convolution Kernels on Discrete Structures. Technical Report UCS-CRL-99-10 1999.
    • 49. Mathworks: MATLAB. [http://www.mathworks.com].
    • 50. Weston J, Elisseeff A, Bakir G, Sinz F: SPIDER: object-orientated machine learning library, v. 1.6. [http://www.kyb.tuebingen.mpg.de/bs/people/spider/].
    • 51. Bengio Y, Grandvalet Y: No Unbiased Estimator of the Variance of K-Fold Cross-Validation. Journal of Machine Learning Research 2003, 2003:.
    • 52. Blake JD, Cohen FE: Pairwise sequence alignment below the twilight zone. J Mol Biol 2001, 307(2):721-735.
    • 53. Rammensee H, Bachmann J, Emmerich NP, Bachor OA, Stevanovic S: SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics 1999, 50(3-4):213-219.