Sequence-based prediction of TCR and peptide interaction
Morten Nielsen1,2, Vanessa Isabell Jurtz1*
1Department of Bio and Health Informatics, Technical University of Denmark; 2Instituto de Investigaciones Biotecnológicas, Universidad Nacional de San Martín, Buenos Aires, Argentina
A major challenge for T cell therapy and the rational identification of T cell epitopes is identifying the cognate target (the peptide-HLA complex) of a given TCR. While reliable prediction of HLA-peptide interaction is possible for most HLA class I alleles, prediction models for the interaction between a TCR and the HLA-peptide complex have, to the best of our knowledge, not yet been described. Recent sequencing projects have generated a considerable amount of data relating TCR sequences to the HLA-peptide complexes they recognize. We use such data to train sequence-based predictors of TCR and peptide interactions. Our models are based on long short-term memory (LSTM) neural networks, which are well suited to the challenge posed by the variable sequence length of the TCRs. We show that such sequence-based models allow the identification of the cognate peptide-HLA target of a given TCR from its sequence alone. Moreover, we expect predictive performance to increase as more data become available.
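To illustrate why a recurrent network suits sequences of varying length, the following minimal sketch runs a single LSTM layer over amino acid sequences and returns a fixed-size vector regardless of input length. It is not the model described in this abstract: the one-hot encoding, the hidden size, the random untrained weights, the function names, and the two example sequences are all assumptions chosen for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def one_hot(residue):
    """Encode one amino acid as a 20-dimensional one-hot vector (assumed encoding)."""
    vec = [0.0] * len(AMINO_ACIDS)
    vec[AMINO_ACIDS.index(residue)] = 1.0
    return vec


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_lstm(input_size, hidden_size, seed=0):
    """Create random, untrained weights for the four LSTM gates."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        gate: {"W": matrix(hidden_size, input_size + hidden_size),
               "b": [0.0] * hidden_size}
        for gate in ("input", "forget", "output", "candidate")
    }


def affine(gate_params, z):
    """Compute W z + b for one gate."""
    return [sum(w * v for w, v in zip(row, z)) + b
            for row, b in zip(gate_params["W"], gate_params["b"])]


def lstm_encode(sequence, params, hidden_size):
    """Run the LSTM over a sequence of any length; return the final hidden state."""
    h = [0.0] * hidden_size  # hidden state
    c = [0.0] * hidden_size  # cell state
    for residue in sequence:
        z = one_hot(residue) + h  # concatenate current input and previous hidden state
        i = [sigmoid(v) for v in affine(params["input"], z)]
        f = [sigmoid(v) for v in affine(params["forget"], z)]
        o = [sigmoid(v) for v in affine(params["output"], z)]
        g = [math.tanh(v) for v in affine(params["candidate"], z)]
        c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c, i, g)]
        h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h


HIDDEN_SIZE = 8  # arbitrary choice for this sketch
params = init_lstm(len(AMINO_ACIDS), HIDDEN_SIZE)

# Two made-up sequences of different length, used only as placeholders.
short_vec = lstm_encode("CASSLGQAYEQYF", params, HIDDEN_SIZE)
long_vec = lstm_encode("CASSPGTGGNTNEQFF", params, HIDDEN_SIZE)
print(len(short_vec), len(long_vec))
```

Both inputs yield a vector of length `HIDDEN_SIZE`, so a downstream classifier could take TCRs of any length without padding or truncation. In a trained model the weights would be learned from TCR and peptide pairs rather than drawn at random.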