LRRpredictor
General Info
LRRpredictor is an open-source tool for detecting LRR motifs in leucine-rich repeat (LRR) proteins. It relies on secondary structure, relative solvent accessibility and disorder predictions computed with RaptorX-Property [1-4], and on sequence variability profiles generated with HH-suite [5,6] against the Uniprot20 sequence database.
Run locally
LRRpredictor source can be found on GitHub at:
https://github.com/eliza-m/LRRpredictor_v1
A Dockerfile for building a Docker image is also provided, and a pre-installed Docker image can be pulled from our Docker repository.
Detailed installation instructions are provided in the README.md file.
Usage
At the moment, only a single sequence can be processed per job.
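Because each job takes a single sequence, a file holding several sequences has to be split before submission. The following is a minimal Python sketch of that step. It assumes the sequences are in FASTA format, which this page does not state; the function names `split_fasta`, `safe_name` and `write_single_fastas` are hypothetical helpers and are not part of LRRpredictor.

```python
import re
from pathlib import Path


def split_fasta(text):
    """Return a list of (record_id, sequence) pairs from multi-FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            # Use the first whitespace-delimited token of the header as the ID.
            tokens = line[1:].split()
            header = tokens[0] if tokens else f"seq{len(records) + 1}"
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def safe_name(record_id):
    """Replace characters that are awkward in file names with underscores."""
    return re.sub(r"[^A-Za-z0-9_.-]", "_", record_id)


def write_single_fastas(text, out_dir):
    """Write one FASTA file per record and return the created paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for record_id, seq in split_fasta(text):
        path = out_dir / f"{safe_name(record_id)}.fasta"
        path.write_text(f">{record_id}\n{seq}\n")
        paths.append(path)
    return paths
```

Each resulting file can then be submitted as its own job.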
Output files
1. Short output
"ProteinName.predshort.txt"
Lists only the potential LRR motifs identified by LRRpredictor with a probability above 0.5 (if any).
The probability from each classifier (clf1-8) is also shown.
If no LRR motif is detected, this file contains only the header.
Example:
#Protein | pos | clf1 | clf2 | clf3 | clf4 | clf5 | clf6 | clf7 | clf8 | LRRpred | -5 | -4 | -3 | -2 | -1 | L | x | x | L | x | L | +6 | +7 | +8 | +9 | +10 |
gpa216045 | 864 | 0.912 | 0.949 | 1.000 | 0.778 | 0.898 | 0.893 | 0.999 | 0.682 | 0.889 | A | D | I | T | T | L | A | L | I | D | I | F | R | C | Q | Q |
Header description:
* Protein - protein name
* pos - residue number where a detected LRR motif starts (i.e. the first `L` of the minimal `LxxLxL` motif)
* clf1-8 - each classifier's predicted probability (min: 0, max: 1)
* LRRpred - LRRpredictor probability based on all eight classifiers.
From column 12 to the end, the amino acid sequence of the detected LRR motif is shown: 5 positions upstream of the motif (-5 to -1), the minimal 'LxxLxL' motif, and 5 positions downstream (+6 to +10).
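The column layout described above can be read programmatically. The sketch below is a hedged illustration in Python: it assumes the file is pipe-delimited exactly as in the example row (the file on disk may use a different delimiter), and the names `LrrHit` and `parse_predshort` are hypothetical and not part of LRRpredictor.

```python
from dataclasses import dataclass


@dataclass
class LrrHit:
    protein: str
    pos: int          # residue number of the first L in LxxLxL
    clf: list         # eight per-classifier probabilities
    lrrpred: float    # overall LRRpredictor probability
    upstream: str     # positions -5 to -1
    motif: str        # the six LxxLxL positions
    downstream: str   # positions +6 to +10


def parse_predshort(text):
    """Parse short-output text; assumes '|' separators as in the example."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 27:
            raise ValueError(f"expected at least 27 fields, got {len(fields)}")
        # Column 12 onward (index 11) holds the 16 residues of the motif window;
        # an empty cell is shown as '-' (assumption for motifs near a terminus).
        residues = [f or "-" for f in fields[11:27]]
        hits.append(LrrHit(
            protein=fields[0],
            pos=int(fields[1]),
            clf=[float(x) for x in fields[2:10]],
            lrrpred=float(fields[10]),
            upstream="".join(residues[:5]),
            motif="".join(residues[5:11]),
            downstream="".join(residues[11:16]),
        ))
    return hits
```

Applied to the example row, this yields one hit at position 864 with upstream `ADITT`, motif `LALIDI` and downstream `FRCQQ`. A file containing only the header yields an empty list.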
2. Long output
"ProteinName.pred.txt"
Example:
#Protein | resid | aa | clf1 | clf2 | clf3 | clf4 | clf5 | clf6 | clf7 | clf8 | LRRpred |
gpa216045 | 864 | L | 0.912 | 0.949 | 1.000 | 0.778 | 0.898 | 0.893 | 0.999 | 0.682 | 0.889 |
Header description:
* Protein - protein name
* resid - residue number
* aa - amino acid one-letter code
* unused - unused field (used only in training and testing data, where it marks the position at which a true LRR motif starts; these positions were identified from structural files).
* clf1-8 - each classifier's predicted probability (min: 0, max: 1)
* LRRpred - LRRpredictor probability based on all eight classifiers.
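The per-residue table can be loaded and filtered with a few lines of Python. This is a sketch under assumptions: it takes the file to be pipe-delimited with a `#`-prefixed header as in the example, and it maps values by header name so that an extra column such as `unused` is tolerated. The function names are hypothetical. In the example row, LRRpred (0.889) equals the mean of the eight classifier probabilities rounded to three decimals; whether that holds for every row is an assumption and is not stated on this page.

```python
def parse_pred(text):
    """Parse long-output text into a list of dicts keyed by header names."""
    columns, rows = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = [f.strip() for f in line.lstrip("#").strip("|").split("|")]
        if line.startswith("#"):
            columns = fields  # header line defines the column names
            continue
        if columns is None:
            raise ValueError("header line starting with '#' not found")
        if len(fields) < len(columns):
            raise ValueError(f"row has {len(fields)} fields, expected {len(columns)}")
        row = dict(zip(columns, fields))
        row["resid"] = int(row["resid"])
        for name in columns:
            if name.startswith("clf") or name == "LRRpred":
                row[name] = float(row[name])
        rows.append(row)
    return rows


def predicted_starts(rows, threshold=0.5):
    """Return (resid, aa, LRRpred) for residues scoring above the threshold."""
    return [(r["resid"], r["aa"], r["LRRpred"])
            for r in rows if r["LRRpred"] > threshold]


def classifier_mean(row):
    """Mean of the clf1-8 probabilities of one row (for comparison only)."""
    values = [v for k, v in row.items() if k.startswith("clf")]
    return sum(values) / len(values)
```

With the 0.5 cutoff used for the short output, `predicted_starts` returns the residues that would be reported as motif start positions under the assumptions above.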
3. Data used as input
"ProteinName.input"
Data used as input for LRRpredictor: RaptorX-Property secondary structure (SS), relative solvent accessibility (RSA) and disorder predictions, plus the sequence variability profile.
Reference
If you use LRRpredictor, please cite:
Eliza C. Martin, Octavina C. A. Sukarta, Laurentiu Spiridon, Laurentiu G. Grigore, Vlad Constantinescu, Robi Tacutu, Aska Goverse, Andrei-Jose Petrescu. LRRpredictor - a new LRR motif detection method for irregular motifs of plant NLR proteins using ensemble of classifiers. Genes 2020, 11, 286.
Bibliography
[1] Wang, S.; Li, W.; Liu, S.; Xu, J. RaptorX-Property: a web server for protein structure property prediction. Nucleic Acids Res. 2016, 44, W430-W435.
[2] Wang, S.; Peng, J.; Ma, J.; Xu, J. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields. Sci. Rep. 2016, 6, 1-11.
[3] Wang, S.; Ma, J.; Xu, J. AUCpreD: Proteome-level protein disorder prediction by AUC-maximized deep convolutional neural fields. In Proceedings of the Bioinformatics; Oxford University Press, 2016; Vol. 32, pp. i672-i679.
[4] Wang, S.; Sun, S.; Xu, J. AUC-maximized deep convolutional neural fields for protein sequence labeling. In Proceedings of the Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer Verlag, 2016; Vol. 9852 LNAI, pp. 1-16.
[5] Remmert, M.; Biegert, A.; Hauser, A.; Soding, J. HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat. Methods 2012, 9, 173-175.
[6] Steinegger, M.; Meier, M.; Mirdita, M.; Vohringer, H.; Haunsberger, S. J.; Soding, J. HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics 2019, 20, 473. doi: 10.1186/s12859-019-3019-7
Contact
For any issues or suggestions, please feel free to write to us at:
e-mail: eliza.martin@biochim.ro