You are here

Inconsistencies in full atom energy calculations

5 posts / 0 new
Last post
Inconsistencies in full atom energy calculations
#1

Hello,

I'd like to decompose the energy in one body energy terms (for each residue) and pairwise energy terms for residue-residue interactions.
I tried to check if I could decompose these terms and sum them up to retrieve the score returned by the scoring function.
The problem is that I try 3 different ways of computing the energy, and I get 3 different numbers.
Could someone please tell me what's wrong ?

Here is my code:

def test_ig_score(pose):
score_fxn = get_fa_scorefxn()
task_pack = TaskFactory.create_packer_task(pose)
parse_resfile(pose, task_pack, "data/3chy.resfile") # NATAA for all residues
mover = PackRotamersMover(score_fxn, task_pack)
mover.apply(pose)
ig = mover.ig()
print ig.get_energy_current_state_assignment()
print score_fxn(pose)
ener=0.0
for res1 in range(1,ig.get_num_nodes()+1):
ener+=ig.get_one_body_energy_for_node_state(res1,ig.get_node(res1).get_current_state())
for res2 in range(res1+1, ig.get_num_nodes()+1):
if (ig.get_edge_exists(res1, res2)):
ener+=ig.find_edge(res1,res2).get_two_body_energy(ig.get_node(res1).get_current_state(),ig.get_node(res2).get_current_state())
print ener

And here is the output :
-137.874694824 <- get_energy_current_state_assignement()
-137.865758236 <- score_fxn(pose)
-137.874761105 <- Sum of all terms

The difference is always small between 1 and 3, but sometimes it is quite big with 2 (i.e. the score returned by the scoring function).
Thank you for your help

Category: 
Post Situation: 
Tue, 2014-07-29 06:02
dsim

Hello again,

Maybe my question wasn't clear enough so I'll try to simplify it.

If I sum all one body and two bodies energy terms of an assignment in an interaction graph shouldn't
I get the total score of the corresponding pose ?

I try to sum all these terms through the following methods:
get_one_body_energy_for_node_state from the class DensePDInteractionGraph
get_two_body_energy from the class DensePDEdge

If I score the pose with the scoring object like this : score_fxn(pose) then I get a sligthly
different number than what I get if I sum all the one body and two bodies energy terms.

Thank you for your help

Mon, 2014-08-18 08:09
dsim

Could anybody at least tell me if what I try to do is correct ?

Does Sum of one body and two bodies energy terms = total score ?

Wed, 2014-09-10 05:15
dsim

Sorry for leaving you hanging on this - I tried looking into it, and even reached out to others who are more knowledgeable, and I haven't been able to get any sort of clear answer on things. (I will confirm that I see the same sort of issues you are seeing, though, even if I don't know the explanation for it.)

For your last question, while most energy terms are either one or two body energies, there are a number of terms which are "whole structure" energies. (See http://www.ncbi.nlm.nih.gov/pubmed/21187238 for details.) I don't believe that there are any whole structure terms in talaris2013, though. There are also "long range" two body interactions, which are treated differently from the short range two body interactions, and although the disulfide and constraint terms in talaris are long range two body terms, they don't seem to be playing a role in this issue.

The one complications you're running into is that the way the energies are calculated for the packer/interaction graph and the way they're calculated for simple scoring is different. In simple scoring each residue is treated as a whole and the pairwise and single body interactions are calculated based off of the whole resiude. In the packer/interaction graph, however, sidechains and backbones are treated separately. The reason for this is that during packing the backbone structure doesn't change, so all backbone-backbone interactions and one-body backbone terms can be rolled into the "constant" value, and not treated during packing. (Interactions with sidechains which aren't being repacked also get rolled into the constant value.)

From what I've played around with, though, it doesn't look like that is a major contributor to the discrepancy. There's also a difference in precision between the intermediate energies for packing and for regular scoring (single versus double precision), and that also doesn't look like it will account for the issue.

I also looked into seeing if there are structural differences for the structure that is represented by the interaction graph and the one that's being scored. Packing is a recover lowest Monte Carlo simulated annealing process, so the structure that comes out is not necessarily the last one that has been evaluated. From what I can tell, though, that shouldn' t affect things here. In the test runs I've done, the last structures from packing have all been the lowest energy, and I've still seen the discrepancy issue pop up. Unfortunately it isn't straightforward to set up an interaction graph in the absence of a packing protocol, so I can't completely rule out the issue.

A final wrinkle on the interaction graph issue is the fact that the interaction graph has a distance cutoff, whereas the standard scoring doesn't. That is, for each (short range) pairwise energy term in the normal scorefunctions there's a distance outside of which the score drops to zero. The packer takes advantage of this by ignoring residues (interactions) which are separated by more than that cutoff. To do this with a rotamer library, there's some heuristics involved, treating residues as spheres. If that's not set appropriately, marginal interactions which get included in regular scoring may not be counted in the interaction graph. I don't think that's the case here, though. I just mention it for completeness.

If you want to play around with things, though, I'd recommend taking a look at the EnergyGraph ( pose.energies().energies_graph() ). This stores the (short range) two body interactions as computed by the ScoreFunction during regular scoring. The one body energies are stored in pose.energies().onebody_energies(seqpos). These both store per-term breakdowns on a residue level, whereas the interaction graph only stores the sum. The long range energies are stored in the pose.energies().long_range_container(). You can pull out the numbers from these and see if there is a particular set of residues which are giving you the discrepancy, and see what might be the issue with that residue.

You can also take a look at the various scorefxn.eval_*() methods and the scorefxn.evaluate_rotamer_*() methods, which should calculate the various breakdowns between residues for all of the different scoreterms.

Fri, 2014-09-12 08:58
rmoretti

Thank you for your reply, it is very instructive.
The reason for that difference might be a sum of everything you describe.
I'll try to look into the scorefxn_eval functions.

Mon, 2014-09-15 07:47
dsim