Skip to content

Dihedral Backbone Does Not Account for Gaps, Returns Incorrect Values #744

@cing

Description

@cing

I didn't see anything in the docs that says biotite.structure.dihedral_backbone() will return an incorrect value if there's a gap the protein backbone, i.e. it fails biotite.structure.check_res_id_continuity().

For example in the Ramachandran code sample, with PDB: 3VKH there are many gaps in that structure which would result in incorrect dihedrals at the boundaries when we pull atom coordinates between successive i and i+1 residues. In just chain A, there would be erroneous dihedrals produced at all the following positions:

Couldn't find next residue data for residue 1619
Couldn't find next residue data for residue 2060
Couldn't find next residue data for residue 2169
Couldn't find next residue data for residue 2452
Couldn't find next residue data for residue 2630
Couldn't find next residue data for residue 3211
Couldn't find next residue data for residue 3491
Couldn't find next residue data for residue 3736
Couldn't find next residue data for residue 4438
Couldn't find next residue data for residue 4465
Couldn't find next residue data for residue 4521
Couldn't find next residue data for residue 4581
Couldn't find next residue data for residue 4613
Couldn't find next residue data for residue 4624
Couldn't find next residue data for residue 4673
Couldn't find next residue data for residue 4729

Here's an example from the above PDB which isolates one of these gaps (with 2 residues on either side)
3vkh_gap.pdb.txt
In this case, phi/psi/omega, in degrees, are:

(array([        nan,   42.100914,  -94.83789 , -118.06954 ], dtype=float32), 
array([156.29105 , -47.083096, 123.67293 ,        nan], dtype=float32), 
array([ 179.54808 ,   31.235714, -179.63321 ,         nan], dtype=float32))

The phi value -94.837 and the psi/omega the values -47.084 and 31.235 would be incorrect due to the unphysical gap, both would be better as nan.


For comparison, MDAnalysis returns None when you try to get dihedrals across a gap, for example:

import MDAnalysis as mda
from MDAnalysis.analysis import dihedrals
import numpy as np
u = mda.Universe("3vkh_gap.pdb")
protein = u.select_atoms('protein')
print('There are {} residues in the protein'.format(len(protein.residues)))
omegas = [res.omega_selection() for res in protein.residues]
for omega in omegas:
    if omega:
        print(omega, omega.dihedral.value())

returns the correct non-nan angles only:

There are 4 residues in the protein
<AtomGroup [<Atom 2: CA of type C of resname LEU, resid 2059 and segid A and altLoc >, <Atom 3: C of type C of resname LEU, resid 2059 and segid A and altLoc >, <Atom 9: N of type N of resname LEU, resid 2060 and segid A and altLoc >, <Atom 10: CA of type C of resname LEU, resid 2060 and segid A and altLoc >]> 179.54809477851654
<AtomGroup [<Atom 18: CA of type C of resname ASN, resid 2064 and segid A and altLoc >, <Atom 19: C of type C of resname ASN, resid 2064 and segid A and altLoc >, <Atom 22: N of type N of resname ILE, resid 2065 and segid A and altLoc >, <Atom 23: CA of type C of resname ILE, resid 2065 and segid A and altLoc >]> -179.63322828758015

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions