I have the .pdb structure of a complex of two chains.
I want to try the effect of a mutation in one of the interfaces on the binding energy. I am new to Rosetta. Can someone explain step by step how to compute this delta-delta-G?
There are a number of ways to do it, but one is to use the InterfaceAnalyzer application (https://www.rosettacommons.org/docs/latest/interface-analyzer.html) to compute the binding score of the interface. This only scores the structure as it exists, though, so you need to make the mutation with another program. The simplest option is probably the fixbb application (https://www.rosettacommons.org/docs/latest/fixbb.html). You give it a resfile (https://www.rosettacommons.org/docs/latest/resfiles.html) which specifies the mutation you want to make and the other sidechains you want to allow to change to accommodate the mutation. Run the protocol twice: once with the mutation you want, and once with the wild-type sequence (so that Rosetta can rearrange the wild-type residues in the same way as the mutant ones). You can then run the InterfaceAnalyzer on both structures and look at the difference in scores. Fixbb does fixed-backbone redesign; if you're looking for backbone flexibility, you may want to use the relax application to make the mutation instead (https://www.rosettacommons.org/docs/latest/relax.html). Relax does more extensive backbone rearrangement, and in recent weekly releases it can also take a resfile to specify mutations, like fixbb can.
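To make the resfile step a bit more concrete, here is a minimal Python sketch that writes the two resfiles (mutant and wild-type control) as text. The `build_resfile` helper, the residue numbers and the chain letters are all hypothetical; the commands it emits (NATRO, NATAA, PIKAA) are the ones described in the resfile documentation linked above, but please check that page before relying on the output.

```python
def build_resfile(mutations, repack_only):
    """Return the text of a simple resfile.

    mutations   -- list of (residue_number, chain, one_letter_aa) to mutate
    repack_only -- list of (residue_number, chain) that may repack but must
                   keep their current amino acid
    """
    lines = [
        "NATRO",  # default for unlisted residues: keep the input rotamer
        "start",
    ]
    for resnum, chain, aa in mutations:
        lines.append(f"{resnum} {chain} PIKAA {aa}")  # force this amino acid
    for resnum, chain in repack_only:
        lines.append(f"{resnum} {chain} NATAA")       # repack, same identity
    return "\n".join(lines) + "\n"


# Hypothetical example: mutate residue 45 of chain A to Trp and let two
# nearby residues repack around it.
neighbors = [(42, "A"), (101, "B")]
mutant_resfile = build_resfile([(45, "A", "W")], neighbors)

# Wild-type control: same residues are free to repack, nothing is mutated.
wildtype_resfile = build_resfile([], neighbors + [(45, "A")])

print(mutant_resfile)
print(wildtype_resfile)
```

Using the same set of movable residues in both files is what makes the wild-type run a fair control for the mutant run.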
There are certainly other ways of doing this, some of which give you more flexibility about backbone movement and which residues to repack. RosettaScripts (https://www.rosettacommons.org/docs/latest/RosettaScripts-Documentation....) gives you a lot of flexibility, but it may be overwhelming to put together a protocol from it.
By the way, the values that Rosetta will give you will be in "Rosetta Energy Units", rather than kcal/mol or kJ/mol. In practical experience these values can be more-or-less linearly correlated with experimental energy values, but the conversion factor can vary based on protocol and approach. Thus, the numbers you get out of any Rosetta protocol will only have limited use for examining a single mutant, and tend to be of more use when comparing a number of point mutants (or potential point mutants) with each other.
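As a small illustration of comparing point mutants with each other, the sketch below computes a delta-delta-G for each mutant as (mutant binding score) minus (wild-type binding score) and ranks the mutants. The mutant names and all numbers are made-up placeholders in Rosetta Energy Units, not results from any real run.

```python
def ddg(dg_mutant, dg_wildtype):
    """Change in binding score upon mutation; negative means tighter binding."""
    return dg_mutant - dg_wildtype


# Placeholder binding scores (REU) as they might be read from score files.
dg_wildtype = -20.0
dg_mutants = {"A45W": -22.5, "A45G": -17.0, "B102K": -20.5}

# Rank mutants from most stabilizing to most destabilizing.
ranked = sorted(
    ((name, ddg(score, dg_wildtype)) for name, score in dg_mutants.items()),
    key=lambda item: item[1],
)

for name, value in ranked:
    print(f"{name}: ddG = {value:+.1f} REU")
```

Because the conversion to kcal/mol is uncertain, the ordering of the mutants is usually more informative than any single value.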
Is there a way to tell relax or fixbb to only relax the residues in the vicinity of the mutation?
Fixbb takes a resfile (https://www.rosettacommons.org/docs/latest/resfiles.html) which specifies which sidechains can move and which can't. Just
Recent versions of relax should take both a movemap file and a resfile. The resfile specifies which sidechains can repack in the repacking step. The movemap file (https://www.rosettacommons.org/docs/latest/movemap-file.html) specifies which degrees of freedom (backbone and sidechain) are allowed to move during the minimization portion of the relax procedure.
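For illustration, here is a small Python sketch that writes a movemap file restricting minimization to a stretch of residues. The `build_movemap` helper and the residue range are hypothetical, and the `RESIDUE` / `JUMP` line syntax is my reading of the movemap file documentation linked above, so treat it as an assumption and check it against that page (including which residue numbering it expects).

```python
def build_movemap(flexible_ranges, allow_jumps=False):
    """Return movemap text that frees only the given residue ranges.

    flexible_ranges -- list of (first_residue, last_residue) pairs
    allow_jumps     -- whether rigid-body jumps may be minimized
    """
    lines = ["RESIDUE * NO"]  # assumed syntax: start with everything fixed
    for first, last in flexible_ranges:
        # assumed syntax: allow backbone and chi angles in this range
        lines.append(f"RESIDUE {first} {last} BBCHI")
    lines.append(f"JUMP * {'YES' if allow_jumps else 'NO'}")
    return "\n".join(lines) + "\n"


# Hypothetical example: free residues 40 to 50 around a mutation at 45.
movemap_text = build_movemap([(40, 50)])
print(movemap_text)
```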
If you want to do more complex things (like adaptively detect which residues are near the mutation) you'll have to use the RosettaScripts framework (https://www.rosettacommons.org/docs/latest/RosettaScripts.html) and the TaskOperations/ResidueSelectors provided there. (There are both relax movers and fixbb/PackRotamers movers available through RosettaScripts.)
Can you point out some papers where Rosetta is used this way to compute binding energies between proteins and changes of binding energies upon mutations?
I've been reading the RosettaScripts documentation and played with the demos, but I haven't figured out how to adaptively detect residues which are near the mutation and relax only those (backbone and sidechain). Can you help me with that? Thanks again.
You probably want the DesignAround task operation (https://www.rosettacommons.org/docs/latest/TaskOperations-RosettaScripts...). This can select only residues which are within a certain distance of a particular residue and turn off repacking and design of residues outside of that range. This requires manually specifying the residues which you want to center around.
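To show the idea of a distance-based selection outside of Rosetta, here is a rough Python sketch that lists residues whose C-alpha atom lies within a cutoff of a chosen residue in a PDB file. This is not how DesignAround is implemented; the function names, the 8 Å cutoff and the use of C-alpha atoms are simplifying assumptions made for this example.

```python
import math


def ca_coordinates(pdb_lines):
    """Map (chain, residue_number) -> (x, y, z) for every C-alpha atom."""
    coords = {}
    for line in pdb_lines:
        # Fixed-column PDB format: atom name in columns 13-16,
        # chain in column 22, residue number in columns 23-26.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            key = (line[21], int(line[22:26]))
            coords[key] = (
                float(line[30:38]),
                float(line[38:46]),
                float(line[46:54]),
            )
    return coords


def residues_near(coords, center, cutoff=8.0):
    """Return sorted residues whose C-alpha is within cutoff of center."""
    origin = coords[center]
    return sorted(
        key for key, xyz in coords.items() if math.dist(origin, xyz) <= cutoff
    )


# Usage (hypothetical file name and residue):
# with open("complex.pdb") as handle:
#     coords = ca_coordinates(handle)
# print(residues_near(coords, ("A", 45)))
```

The resulting list could then be turned into a resfile or movemap by hand.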
I am unaware of a TaskOperation which would compare a pose to a reference sequence, and then select residues which are a certain distance away from the residues which have been mutated. It's certainly a reasonable operation, but not one that has been implemented yet.
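Outside of Rosetta, the comparison against a reference sequence is easy to do by hand. The sketch below is a plain-Python illustration (not a Rosetta feature) that finds the mutated positions between two equal-length sequences; the helper name and the example sequences are hypothetical.

```python
def mutated_positions(reference, mutant):
    """Return (position, reference_aa, mutant_aa) for every difference.

    Positions are 1-based. Only substitutions are handled, so both
    sequences must have the same length.
    """
    if len(reference) != len(mutant):
        raise ValueError("sequences must have the same length")
    return [
        (index, ref_aa, mut_aa)
        for index, (ref_aa, mut_aa) in enumerate(zip(reference, mutant), start=1)
        if ref_aa != mut_aa
    ]


# Hypothetical example: a single T -> W substitution at position 3.
print(mutated_positions("MKTAY", "MKWAY"))
```

The positions found this way can then be used as the manually specified centers for a distance-based selection.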
I am doing enzyme redesign using the Enzdes application. But based on the conversation above, it seems fixbb and relax can do the same thing. What is the difference, then? My understanding is that fixbb is incorporated in the enzdes application and that the scoring done by Rosetta is equivalent to that done by the InterfaceAnalyzer.
Can you clarify this?
The different protocols of Rosetta use a lot of the same machinery internally. There are some basic optimization steps, like packing, minimization and scoring, which most protocols use; the difference lies in how they use them: in which order, and on which parts of the protein. (Many of the defaults mentioned below can be changed by command-line options and the like.)
Fixbb is basically just the packer - on a fixed backbone, apply a Monte Carlo Simulated Annealing optimization over a discrete set of rotamers.
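To give a feel for what "Monte Carlo simulated annealing over a discrete set of rotamers" means, here is a toy Python sketch. It is not Rosetta's packer: the energy function, the cooling schedule and all parameter values are invented for illustration, and "rotamers" are just integer labels.

```python
import math
import random


def anneal_rotamers(n_positions, n_rotamers, energy, steps=2000,
                    t_start=10.0, t_end=0.1, seed=0):
    """Minimize energy(state) over discrete rotamer choices per position."""
    rng = random.Random(seed)
    state = [rng.randrange(n_rotamers) for _ in range(n_positions)]
    current = energy(state)
    best_state, best = list(state), current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Propose a new rotamer at one random position.
        pos = rng.randrange(n_positions)
        old = state[pos]
        state[pos] = rng.randrange(n_rotamers)
        trial = energy(state)
        delta = trial - current
        # Metropolis criterion: always accept downhill, sometimes uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = trial
            if current < best:
                best_state, best = list(state), current
        else:
            state[pos] = old  # reject the move
    return best_state, best


# Toy energy: each position has one preferred rotamer (made-up values).
TARGET = [2, 0, 3, 1, 2]


def toy_energy(state):
    return float(sum((s - t) ** 2 for s, t in zip(state, TARGET)))


best_state, best_energy = anneal_rotamers(len(TARGET), 4, toy_energy)
print(best_state, best_energy)
```

The backbone never appears in this loop, which is the sense in which fixbb is "fixed backbone": only the rotamer choice at each position changes.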
The standard enzyme_design protocol uses packing, but it also adds rounds of minimization, including backbone and ligand rigid-body minimization. It also defaults to doing packing and minimization just in the region around the ligand. (Normally there are three cycles of packing and minimization: pack/min/pack/min/pack/min.)
Relax also uses packing and minimization, but it uses a more involved cycle of packing and minimization and does so over the whole protein. In addition, during the cycles of packing and minimization there's a ramping of the repulsive scoreterm. It starts low so that the protein compacts, and then it's ramped up so the protein pops back into the normal size. It does this compact-expand process several times. We've found that this process is good for really minimizing the energy of the protein.
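The compact-expand cycling can be pictured as a schedule of repack/minimize steps at increasing repulsive weight, repeated several times. The Python sketch below only builds such a schedule as a list; the number of repeats and the ramp fractions are illustrative assumptions for this example, so consult the relax documentation for the actual defaults.

```python
def relax_schedule(repeats=5, rep_fractions=(0.02, 0.25, 0.55, 1.0)):
    """List the (cycle, repulsive_fraction, step) stages of a ramped relax.

    rep_fractions are fractions of the full repulsive weight; the values
    used here are assumptions chosen for illustration.
    """
    stages = []
    for cycle in range(1, repeats + 1):
        for fraction in rep_fractions:
            # Low repulsion lets the structure compact; the last stage of
            # each cycle restores the full weight so it expands again.
            stages.append((cycle, fraction, "repack"))
            stages.append((cycle, fraction, "minimize"))
    return stages


for cycle, fraction, step in relax_schedule(repeats=2):
    print(f"cycle {cycle}: repulsive weight x{fraction:<4} -> {step}")
```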
As far as evaluation goes, there can be differences depending on whether you use the score for the entire protein or just the score of the interaction between multiple chains. Even in the latter case, there are differences between (1) summing the interaction energies in the complex, (2) scoring the complex and the separated chains and doing the subtraction, and (3) doing the holo and apo scoring with a repacking/minimization step in the apo state.
The enzyme_design application tends to do the first - it looks at the energy of interactions strictly in the complex state. The InterfaceAnalyzer uses the first for dG_cross, but uses the last for dG_separated (or the second if you turn off apo repacking.) Generally speaking for small molecule ligands the difference between the two is minimal, but for protein-protein docking you tend to get better (but noisier) results if you do apo repacking.
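The subtraction-based variants can be written down in a few lines. In this Python sketch, the binding score is the complex score minus the sum of the separated-chain scores; the only difference between the two calls is whether the separated chains were rescored as-is or after apo repacking. All numbers are made-up placeholders in Rosetta Energy Units.

```python
def binding_energy(e_complex, e_separated_chains):
    """Binding score = score(complex) - sum of scores of separated chains."""
    return e_complex - sum(e_separated_chains)


# Placeholder scores (REU), not from any real structure.
e_complex = -250.0
e_chains_rigid = [-120.0, -110.0]     # chains pulled apart, no repacking
e_chains_repacked = [-121.5, -111.0]  # chains pulled apart, then repacked

dg_rigid = binding_energy(e_complex, e_chains_rigid)
dg_repacked = binding_energy(e_complex, e_chains_repacked)

print(f"without apo repacking: {dg_rigid:.1f} REU")
print(f"with apo repacking:    {dg_repacked:.1f} REU")
```

In this toy example the repacked chains score lower on their own, so the computed binding score is less favorable than the rigid separation suggests.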