I want to do binding energy calculations: The basic idea is the obvious to compare the energy of the complex with the energy of the ligand and the apo-protein.
1. Rosetta Dock does something like this: "the interaction energy between protein and ligand only (interface_delta); this is the score difference between the components together and the components pulled apart by 500A". Regarding this one question. Are the components (especially the protein) relaxed before scoring? Otherwise I guess it would b highly unrealistic, no? Since the sidechains in the binding pocket would most likely arrange different in the absence of the ligand.
2. I tried my own approach, being:
- docking of the ligand --> choosing representative structures
- relaxing the docked structures --> choosing lowest E structure --> rescoring with RosettaScore = COMPLEX
- getting rid of the ligand and relaxing this structure --> choosing lowest E structure --> rescoring with RosettaScore = APO
- relax ligand --> choosing lowest E structure --> rescoring with RosettaScore = LIGAND
While doing this I realized, that the energy of my relaxed docked structure is very different from the original pose directly after docking (both rescored):
after relax 1393.364
original docking -465.427
The major change can be attributed to fa_rep (1946.389 as opposed to 91.803) ... which might be due to the fact that I use soft_rep in docking? But I gathered from the manual that the final minimization is done with normal rep: "## Use soft-repulsive scoring during search (but not final minimization). ## This slightly improves search. Hard-rep is used for final scoring,
## however, because it gives better discrimination."
So, why this very different scores? Am I doing something wrong in docking? Is it sound to do a relaxation (FastRelax) after docking to get a "proper" structure for energy calculations? The ligand position changes slightly during relaxation.
Thanks & cheers
Quick clarification: when you say "ligand", are you refering specifically to a small molecule ligand, or are you talking about a protein ligand. (I ask because protein-small molecule and protein-protein docking are treated differently in Rosetta, and when Rosetta talks about "ligands" it's usually referring to small molecule ligands.)
I don't know about Rosetta dock specifically, but typically for protein-protein binding, the structures are repacked in the apo state prior to scoring for binding energy calculations. (So rotamer sampling, but no off-rotamer sidechain minimization or backbone movement.) The most flexible way to access this calculation is to use the ddG Filter or Mover in RosettaScripts (https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/Ros... and specifically https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/Mov... or https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/Fil...)
If you've rescored both structures with the same energy function, that means that the relax run is messing up your structures. The soft_rep in docking is not an issue because the docked structure has a good, negative energy when rescored. It's specifically the FastRelax step which is causing the energy increase. It's very strange to see the energy increase after an optimization step - especially going from negative to positive. What was the commandline you used to do the relax? The energy after relax should go down, unless you're using a different scorefunction (did you try to relax with softrep?), or you're relaxing with constraints that somehow can't be satisfied.
ligand = small molecule = Tryptophan
Regarding the energies ... sorry, typo (the worst kind: reversing the meaning), it should have read:
after relax -465.427 !!!! makes more sense
original docking 1393.364
here is the part of the E-table with the biggest contributions for a number of scenarios:
SCORE: score fa_atr fa_rep fa_sol fa_intra_rep pro_close fa_pair
#### rescored complete molecule with ligand docked after relaxation
SCORE: -465.427 -861.044 91.803 415.442 1.621 0.604 -23.395
#### test if the rescoring properly covers the ligand: rescoring of the above molecule without the ligand
SCORE: -447.297 -829.830 88.153 400.427 1.511 0.604 -22.948
#### test if the relaxation changes something to the score: rescoring of the molecule with the ligand docked WITHOUT relaxation
SCORE: 1393.364 -872.253 1946.389 425.495 1.604 0.679 -22.193
#### original output score form the rosetta dock program of the molecule with docked ligand and WITHOUT relaxation
#### rescoring of just the ligand after relaxation
SCORE: 0.048 0.000 0.000 0.000 0.048 0.000 0.000
#### rescoring of just the start structure before docking after relaxation
SCORE: -441.539 -826.359 88.196 401.614 1.508 0.645 -21.864
I will look into the RScripts advices you send me. But regarding the big energy difference I still see no reason. The energy after relaxation seems quite reasonable.
Oh, that energy difference then makes perfect sense.
Direct from the PDB structures are generally not optimized against the Rosetta energy function. They therefore typically have small (in the rmsd sense) errors which are completely irrelevant for matching the experimental crystal data, but result in very high Rosetta energies. Because of the way fa_rep works (and fa_dun), even being slightly (fractions of an angstrom) too close can cause the Rosetta energy to blow up. Frequently these "errors" aren't even anywhere near the regions of interest.
The current advice is to use an all-atom constrained relax protocol to ease your starting structure into the Rosetta energy function (http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0059004). But depending on where and what the problems are, that might not be critical.
the initial structures I used for docking 10 backrubs from the original xray structure. Each of the backrubs is FastRelaxed subsequently with the lowest poseE structure taken. So they should be in the "Rosetta Energy World", no? I rescored a start structure ant it has a fairly normal energy of ~ -400. So the high energy is only in context with the docking, if I didnt do a completely stupid mistake.
Two more things:
1. the system is a symmetric molecule of tow chains A and B with two ligand positions in the interfaces. Only one position is adressed in the standard docking (I havent looked into the RosettaScripts Symmetry so far). The other positon has the ligand in an unchanged position
2. My Input structure for docking has only the unchanged part of the ligands (the indol ring that is common to Trp and Auxin). The rest is then substituted by the appropriate .params and rotamers file.
I havent thought of both things as a problem, so far?
cheers & thanks
If you've already FastRelaxed the structures, that should have fixed the major energy issues, unless you've introduced new ones. The small molecule ligand docking energy is different from the standard energy used in relax, but they're close enough that it probably shouldn't result in that big of difference.
It might be that the second ligand (the one not docked) is in a very bad conformation. Was it subjected to any local optimization after it was placed? If you dump a PDB of a structure with the high energy, a per-residue energy breakdown will be printed at the end. You can use that to identify where the high fa_rep is coming from.
2) That should work, as long as the params files for the two ligands have the same atom names for those common core atoms. (The default behavior of molfile_to_params.py is to rename atoms, so that might not necessarily be the case, unless you were paying attention to it.) Rosetta will rebuild the missing atoms when it reads it in, depending on what params file you pass to it for that particular ligand. Depending on how things are set up, though, the conformation of the ligand that it picks might not be very good. (e.g. if surrounding residues prohibit putting it well.)