Prediction of protein backbone and sidechain torsion angles from NMR chemical shifts.
TALOS-N is an artificial neural network (ANN) based hybrid system for empirical prediction of protein backbone φ/ψ torsion angles, sidechain χ1 torsion angles and secondary structure using a combination of six kinds (HN, Hα, Cα, Cβ, CO, N) of chemical shift assignments for a given residue sequence.
The original TALOS approach and its successor TALOS+ are extensions of the well-known observation that many kinds of secondary chemical shifts (i.e. differences between chemical shifts and their corresponding random coil values) are highly correlated with aspects of protein secondary structure. The goal of TALOS-N is again to use secondary chemical shift and sequence information to make quantitative predictions for the protein backbone angles φ/ψ, and to provide a measure of the uncertainties in these predictions. In the original TALOS approach, we search a high-resolution structural database (for which experimental chemical shifts are available) for the 10 best matches to the secondary chemical shifts of a given residue in a target protein along with its two flanking neighbors (a residue triplet). If there is a consensus of φ and ψ angles among the 10 best database matches, then we use these database triplet structures to form a prediction for the backbone angles of the target residue. The later TALOS+ approach added an ANN classification scheme to this database mining approach. This ANN analyzed the chemical shifts and sequence to estimate the likelihood of a given residue being in an α, β, or positive-φ conformation. This ANN classification information was then combined with the database mining results, thereby increasing the number of residues for which useful backbone angle predictions can be made.
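The triplet search and consensus check described above can be illustrated with a minimal Python sketch. It is not the TALOS implementation: the unweighted squared-difference score, the 45° consensus tolerance, and all function and field names (`triplet_distance`, `best_matches`, `consensus`, the `shifts`/`phi`/`psi` keys) are assumptions made for illustration, and the actual TALOS scoring function is not specified in this description.

```python
import math

def secondary_shift(observed, random_coil):
    """Secondary shift (ppm) = observed shift minus random coil value."""
    return observed - random_coil

def triplet_distance(query, entry):
    """Sum of squared secondary-shift differences over a residue triplet.

    Each triplet is a list of three dicts mapping nucleus name -> secondary
    shift. Nuclei missing on either side are skipped. The unweighted score
    is an assumption of this sketch.
    """
    total = 0.0
    for q_res, e_res in zip(query, entry):
        for nucleus, q_val in q_res.items():
            if nucleus in e_res:
                total += (q_val - e_res[nucleus]) ** 2
    return total

def best_matches(query, database, k=10):
    """Return the k database triplets closest to the query triplet."""
    ranked = sorted(database, key=lambda e: triplet_distance(query, e["shifts"]))
    return ranked[:k]

def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def circular_mean(angles):
    """Mean of angles in degrees, computed on the unit circle."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c))

def consensus(matches, tolerance=45.0):
    """Return the mean (phi, psi) if all matches agree within tolerance, else None.

    The tolerance value is a hypothetical choice, not a TALOS parameter.
    """
    phis = [m["phi"] for m in matches]
    psis = [m["psi"] for m in matches]
    phi0, psi0 = circular_mean(phis), circular_mean(psis)
    agree = all(
        angular_diff(p, phi0) <= tolerance and angular_diff(s, psi0) <= tolerance
        for p, s in zip(phis, psis)
    )
    return (phi0, psi0) if agree else None
```

Here each database entry is assumed to carry the center residue's φ/ψ angles alongside the triplet's secondary shifts; a `None` result from `consensus` corresponds to a residue for which no prediction is made.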
TALOS-N relies far more extensively on trained ANNs than TALOS+. In the TALOS-N method, the ANN that correlates chemical shifts with backbone conformation is built on a division of the Ramachandran map into 324 voxels, rather than the three groupings used by TALOS+. TALOS-N also improves upon the original TALOS and TALOS+ database mining approaches by relying on (1) a large database of over 9500 high-quality X-ray structures to which chemical shift assignments were added by SPARTA+, and (2) an optimized database search procedure that retrieves the 25 best-matched database heptapeptides (rather than the 10 best-matched database tripeptides). The far greater reliance on ANN algorithms, together with the optimized database mining approach, allows TALOS-N to predict backbone torsion angles for a larger fraction (~90%) of residues in a given protein, and with improved precision.
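To make the 324-voxel description concrete, the following Python sketch maps a (φ, ψ) pair to a voxel index. It assumes the 324 voxels form a regular 18 × 18 grid of 20° × 20° cells covering -180° to 180° on both axes, which is consistent with 18 × 18 = 324 but is not stated above; the row-major index order and the names `voxel_index` and `voxel_center` are likewise hypothetical.

```python
VOXEL_SIZE = 20.0    # degrees per voxel edge (assumed: 360 / 18)
BINS_PER_AXIS = 18   # 18 x 18 = 324 voxels

def _bin(angle):
    """Map an angle in degrees to a bin number in 0..17, wrapping at +/-180."""
    shifted = (angle + 180.0) % 360.0            # now in [0, 360]
    # Clamp guards against float rounding pushing the value onto 360.0.
    return min(int(shifted // VOXEL_SIZE), BINS_PER_AXIS - 1)

def voxel_index(phi, psi):
    """Return a voxel index in 0..323 (row-major: psi selects the row)."""
    return _bin(psi) * BINS_PER_AXIS + _bin(phi)

def voxel_center(index):
    """Return the (phi, psi) at the center of a voxel, in degrees."""
    row, col = divmod(index, BINS_PER_AXIS)
    phi = -180.0 + (col + 0.5) * VOXEL_SIZE
    psi = -180.0 + (row + 0.5) * VOXEL_SIZE
    return phi, psi
```

Under these assumptions, an ANN output for one residue would be a vector of 324 scores, one per voxel, and `voxel_center` converts the highest-scoring index back into representative φ/ψ angles.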
TALOS-N also includes an ANN component to derive sidechain χ1 angle information, as the χ1 value is known to impact the backbone chemical shifts.
In addition, TALOS-N offers several important features:
- TALOS-N can make predictions for the frequently encountered cases where residue assignments are lacking. Although the fraction of such residues for which unambiguous predictions can be made tends to be significantly lower, the reliability of such predictions remains relatively high.
- For convenience, and to prevent assignment of backbone torsion angles to regions that are dynamically disordered, TALOS-N also reports an estimated backbone order parameter S2, derived from the chemical shifts as described by Berjanskii and Wishart (J. Am. Chem. Soc. 127: 14970-14971).
- For those residues whose backbone torsion angles cannot be predicted uniquely by TALOS-N, but whose backbone is not dynamically disordered as judged by RCI-S2, the ANN-predicted 324-state (φ,ψ) distribution often restricts the chemical-shift-compatible φ/ψ values to two small, discrete regions of the Ramachandran map, which may prove useful in structure determination efforts.
- TALOS-N provides ANN-predicted secondary structure information from the chemical shifts (and/or protein sequence), with high prediction accuracy.
|NMRbox Version||Software Version|