LMPred: predicting antimicrobial peptides using pre-trained language models and deep learning
- Published: 31 March, 2022
- Project URL: Bioinformatics Advances (OUP) - LMPred
- Tags: Deep learning, large language models, protein sequence prediction.
Summary
This project used pre-trained large language models (BERT, T5 and XLNet) to create contextualized embedding vectors representing peptide sequences. A convolutional neural network (CNN) then classified each sequence as antimicrobial or not, showing that language models can learn some of the "language of life", i.e. the underlying biology, encoded within each sequence.
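To make the data flow concrete, here is a minimal, dependency-free sketch of the embed-then-classify idea. It is not the project's actual architecture: the `embed` function is a hypothetical stand-in that returns one fixed pseudo-random vector per residue type (a real language model would return context-dependent vectors), the CNN is a single convolution layer with global max pooling, all weights are random and untrained, and every name and size here (`embed`, `conv1d_relu`, `classify`, 8 dimensions, 4 filters of width 3) is an assumption chosen for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def embed(sequence, dim=8, seed=0):
    """Hypothetical stand-in for a language model.

    Returns one fixed pseudo-random vector per residue type. A real model
    (BERT, T5, XLNet) would produce context-dependent vectors instead.
    """
    rng = random.Random(seed)
    table = {aa: [rng.uniform(-1, 1) for _ in range(dim)] for aa in AMINO_ACIDS}
    return [table[aa] for aa in sequence]


def conv1d_relu(embeddings, kernels, biases):
    """Slide each kernel (width x dim) along the sequence, then apply ReLU."""
    width = len(kernels[0])
    if len(embeddings) < width:
        raise ValueError("sequence is shorter than the kernel width")
    feature_maps = []
    for kernel, bias in zip(kernels, biases):
        fmap = []
        for start in range(len(embeddings) - width + 1):
            total = bias
            for offset in range(width):
                for e, w in zip(embeddings[start + offset], kernel[offset]):
                    total += e * w
            fmap.append(max(0.0, total))  # ReLU
        feature_maps.append(fmap)
    return feature_maps


def classify(sequence, n_filters=4, width=3, dim=8, seed=1):
    """Return a probability-like score in (0, 1) from untrained random weights."""
    rng = random.Random(seed)
    kernels = [
        [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(width)]
        for _ in range(n_filters)
    ]
    biases = [0.0] * n_filters
    dense = [rng.uniform(-0.5, 0.5) for _ in range(n_filters)]

    feature_maps = conv1d_relu(embed(sequence, dim), kernels, biases)
    pooled = [max(fmap) for fmap in feature_maps]  # global max pooling
    logit = sum(p * w for p, w in zip(pooled, dense))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid


# Example peptide sequence; the score is meaningless because nothing is trained.
prob = classify("GIGKFLHSAKKFGKAFVGEIMNS")
print(f"score = {prob:.3f} (untrained weights, illustrative only)")
```

In a real pipeline, `embed` would be replaced by a forward pass through a pre-trained model, and the convolution and dense weights would be learned from labelled antimicrobial and non-antimicrobial peptides.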