Distinctions between InterfaceAnalyzer and RosettaScript

Hi,

I am developing a protocol to compute how much a mutation destabilizes the interface between two proteins. I am looking for a method that distinguishes a perturbation caused by a surface mutation from one caused by a buried mutation, and a perturbation that destabilizes the interface from one that does not. My general strategy is to compute the binding energy for the wild type and for the mutant and then take the difference. Because the initial structures come from crystal structures, I allow the sidechains to repack for both wild type and mutant before computing the binding energy.

After perusing the forums and various papers recommended there, I modified or created three different protocols. As far as I understand, the primary distinction between them lies in either the scoring or the packing used, but I am not clear on the details. I expected them to produce the same trends, though perhaps not the same numbers. They do not, and I am left a bit perplexed. Protocol 3 produces the qualitative behavior I expected, but I am unsure how it differs from protocols 1 and 2. I know this question has been hashed and rehashed in one form or another, but having spent a lot of time trying to figure out why these methods produce wildly different results, I figured it would be nice to clarify this in a single spot. Your thoughts and insight on how the protocols differ would be very welcome.
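The strategy described above (binding energy of the mutant minus that of the wild type) reduces to simple arithmetic. A minimal sketch, assuming the two binding energies have already been produced by one of the protocols below and that more negative means more favorable binding; the function name and the numbers are hypothetical and are not results from any run:

```python
def ddg_of_binding(dg_wt, dg_mut):
    """Change in binding energy upon mutation: dG_bind(mutant) - dG_bind(wild type).

    Assuming more negative binding energies are more favorable, a positive
    result means the mutation destabilizes the interface.
    """
    return dg_mut - dg_wt


# Hypothetical binding energies in Rosetta energy units (illustration only).
print(ddg_of_binding(dg_wt=-20.0, dg_mut=-17.5))  # prints 2.5
```

A mutation far from the interface would be expected to give a value near zero under this definition, which is the behavior the protocols are being compared on.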

Protocol 1:
Commandline:
InterfaceAnalyzer.default.linuxgccrelease @options -s <FILENAME>

options file:
-overwrite
-interface A_B
-pack_input true
-pack_separated true
-out:file:score_only score.sc

The initial sidechains were built externally and the pre-built PDB was fed in, because as I understood it InterfaceAnalyzer, when called alone, does not handle resfiles. As I understand it, it uses the default score12 weights. Although the mutation is made in an external program, I assumed that since the sidechain is repacked this would not introduce a significant issue.

Protocol 2:
Commandline:
rosetta_scripts.linuxgccrelease @options -s <FILENAME>

options file:
-parser:protocol interface_analysis.xml
-score12prime true
-ignore_unrecognized_res
-no_his_his_pairE
-out:file:score_only score.sc
-no_optH false
-ex1
-ex2
-use_input_sc
-extrachi_cutoff 1
-linmem_ig 10
-ignore_unrecognized_res
-atomic_burial_cutoff 0.01
-sasa_calculator_probe_radius 1.2
-overwrite

interface_analysis.xml:
<ROSETTASCRIPTS>
<SCOREFXNS>
<s12_prime weights="score12prime"/>
</SCOREFXNS>
<TASKOPERATIONS>
</TASKOPERATIONS>
<MOVERS>
<InterfaceAnalyzerMover name=fullanalyze scorefxn=s12_prime packstat=1 pack_input=1 pack_separated=1 jump=1 tracer=0 use_jobname=1 resfile=0/>
</MOVERS>
<PROTOCOLS>
<Add mover_name=fullanalyze/>
</PROTOCOLS>
</ROSETTASCRIPTS>

As I understand it, this protocol differs from the first primarily in that it may allow extra chi angles in the rotamer sampling.

Protocol 3:
Commandline:
rosetta_scripts.linuxgccrelease @resfile.flags

resfile.flags file:
-parser:protocol mutation_script.xml
-s 2O8B.pdb
-ignore_unrecognized_res
-out:file:score_only score.sc
-nstruct 1
-overwrite

mutation_script.xml:
<dock_design>
<SCOREFXNS>
</SCOREFXNS>
<FILTERS>
Here we add two delta G filters to calculate the delta G of binding before and after mutations are made.
We specify a jump number of 3 because we want to calculate the delta G of binding the ligand, which is chain D
in the given PDB file.
<Ddg name=dg_wt threshold=1000 repeats=50 jump=1/>
<Ddg name=dg_mut threshold=1000 repeats=50 jump=1/>
</FILTERS>
<TASKOPERATIONS>
Here we add a task operation used to specify that we want only to repack residues without design
<RestrictToRepacking name=repack_only />
Here we specify the location of the resfile to use for design.
<ReadResfile name=resfile filename=mut.resfile/>
</TASKOPERATIONS>
<MOVERS>
Here is a mover to relax the crystal structure
FastRelax name=relax/>
Here we pack the rotamers without any design
<PackRotamersMover name=pack scorefxn=talaris2013 task_operations=repack_only/>
Here we pack the rotamers with design. We specify to read the resfile containing the mutation we want to design
<PackRotamersMover name=mut_and_pack scorefxn=talaris2013 task_operations=resfile/>
</MOVERS>
<PROTOCOLS>
Here we include the movers and filters in the order we want them to run
Add mover_name=relax/>
<Add mover_name=pack/>
<Add filter_name=dg_wt/>
<Add mover_name=mut_and_pack/>
<Add filter_name=dg_mut/>
</PROTOCOLS>
</dock_design>

mut.resfile:
NATAA
EX 1 EX 2
USE_INPUT_SC
AUTO
start
<RESIDUE NUMBER> A PIKAA <RESIDUE CHANGE>

The last protocol differs from the first two in that mutations are handled through a resfile rather than by reading in a pre-mutated PDB. It also allows several repeats of the initial sampling to be specified (50 in this case), rather than requiring repeated runs from the external command line (not shown) to obtain statistical sampling. Other than these distinctions, I am not clear on exactly how it differs from the preceding protocol. As I said before, thoughts and suggestions would be most welcome. Thank you in advance for your consideration.

Wed, 2015-06-17 10:14
achambe

Protocols 1 and 2 differ in their rotamer selection, and in that protocol 2 has scorefunction tweaks for better performance. Most of the flags in 2 would work in 1.

Protocols 1 and 2 are both strictly fixed backbone. Both should be very fast and require essentially no sampling (the packer is pretty good at converging for small problems).

Protocol 3 seems to have FastRelax present but commented out, which would have allowed for some backbone relaxation. I'm not sure if you wanted that.

Protocol 3 explicitly uses Talaris13 instead of Score12. I think that protocol 1 will also use Talaris13 (I don't recall writing InterfaceAnalyzer to default to score12). Protocol 2 explicitly uses score12prime. Protocol 3 would benefit from some of the command line flags that protocol 2 has (primarily -extrachi_cutoff 1; -no_his_his_pairE won't hurt either, although I think it's supposed to be fixed now). The only significant difference I can see among your 3 protocols is that you've got a different set of flags for each. It is worth remembering that documentation that says "use score12" almost certainly just pre-dates talaris13, rather than being a statement that 12 is better than 13.

For what you are doing, you should be thinking along protocol 3 lines. InterfaceAnalyzer as a standalone executable was written because I already had a big pile of existing structures I wanted to analyze. You may as well use a script to make the mutants and analyze them immediately instead of taking multiple steps.

Wed, 2015-06-17 10:33
smlewis

Thanks for the insight. I have to admit that I thought I had the doing of it down. Apparently I just overlooked the flags. I'll update accordingly and give it another shot. I was definitely seeing that protocol 3 produced clearer distinctions in energies between residues near the interface and those further away. I commented out fast relax just to get a better sense of how much time was involved and distinctions in behavior with it on and off. Thanks again.

Thu, 2015-06-18 11:44
achambe

Hi, I continue to see little distinction between the effect of mutations at the protein-protein interface and that of variants within the protein.

As the Rosetta script is implemented, it is intended to calculate the DDG of binding between the wild type and the mutant. With your suggestions in mind I have set it up in the following format, but I suspect I am still missing something. I am wondering whether the DDG of binding is calculated over all of the residues of the protein or not.

I also cannot seem to allow the backbone to relax without running into massive time constraints, and am wondering if there is a way to specify that only the backbone within 10 Å of the mutated residue be allowed to relax.

For a list of residue mutations, I use an external script to cycle sequentially through the list, changing mut.resfile each time and calling Rosetta with the command:

rosetta_scripts.linuxgccrelease <input flags file> -s <PDB>

The input flags file is:
-parser:protocol pp_script.xml
-ignore_unrecognized_res
-out:file:score_only score.sc
-nstruct 1
-overwrite
-ex1
-ex2
-ex3
-extrachi_cutoff 1
-no_his_his_pairE
-use_input_sc
-no_optH false
-linmem_ig 10

The XML script file (pp_script.xml) is:
<dock_design>
<SCOREFXNS>
</SCOREFXNS>
<FILTERS>
<Ddg name=dg_wt threshold=2000 repeats=10 jump=1/>
<Ddg name=dg_mut threshold=2000 repeats=10 jump=1/>
</FILTERS>
<TASKOPERATIONS>
<RestrictToRepacking name=repack_only />
<ReadResfile name=resfile filename=mut.resfile/>
</TASKOPERATIONS>
<MOVERS>
FastRelax name=relax/>
<PackRotamersMover name=pack scorefxn=talaris2013 task_operations=repack_only/>
<PackRotamersMover name=mut_and_pack scorefxn=talaris2013 task_operations=resfile/>
</MOVERS>
<PROTOCOLS>
Add mover_name=relax/>
<Add mover_name=pack/>
<Add filter_name=dg_wt/>
<Add mover_name=mut_and_pack/>
<Add filter_name=dg_mut/>
</PROTOCOLS>
</dock_design>

The residue selection file (mut.resfile):
NATAA
EX 1 EX 2 EX 3
USE_INPUT_SC
AUTO
start
<Residue Number> <Chain> PIKAA <1-letter amino acid>
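The external loop described above is not shown in the post. A minimal sketch of what such a driver could look like, assuming the resfile layout given here and that the flags file is passed with an `@` prefix as in Protocol 3; the function names, the file name `input.flags`, and the mutation list are hypothetical:

```python
import subprocess


def render_resfile(resnum, chain, new_aa):
    """Return resfile text that mutates one position and keeps all others NATAA."""
    header = ["NATAA", "EX 1 EX 2 EX 3", "USE_INPUT_SC", "AUTO", "start"]
    return "\n".join(header + [f"{resnum} {chain} PIKAA {new_aa}"]) + "\n"


def rosetta_command(flags_file, pdb):
    """Build the rosetta_scripts command line as an argument list."""
    return ["rosetta_scripts.linuxgccrelease", f"@{flags_file}", "-s", pdb]


def scan(mutations, flags_file, pdb, dry_run=True):
    """For each mutation, rewrite mut.resfile and call Rosetta.

    With dry_run=True nothing is written or executed; the commands that
    would be run are only collected and returned.
    """
    commands = []
    for resnum, chain, new_aa in mutations:
        cmd = rosetta_command(flags_file, pdb)
        if not dry_run:
            with open("mut.resfile", "w") as handle:
                handle.write(render_resfile(resnum, chain, new_aa))
            subprocess.run(cmd, check=True)
        commands.append(cmd)
    return commands


# Hypothetical mutation list: (residue number, chain, one-letter amino acid).
print(render_resfile(45, "A", "W"))
print(scan([(45, "A", "W"), (102, "B", "A")], "input.flags", "2O8B.pdb"))
```

Because every run here writes to the same mut.resfile and score file, a real driver would probably also need to rename or collect the output between mutations; that bookkeeping is left out of the sketch.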

Wed, 2015-08-19 13:31
achambe

The Ddg filter works by calculating the energy of the complex, separating it, (optionally) repacking it, computing the energy of the separated partners, and then subtracting the two energies to get a difference. (https://www.rosettacommons.org/docs/latest/scripting_documentation/Roset...) In that sense it works on all the residues of the complex. In practice, because the range of the Rosetta energy function is limited (~6 Å heavyatom-heavyatom distance) and the repacking in the Ddg filter is limited to residues near the interface (~8 Å by default), residues far enough from the interface will have the same score in the holo and apo states, and so will contribute nothing to the ddg.
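The subtraction described above can be written out as a toy calculation. This is only a sketch of the arithmetic, not the filter's actual code; treating `repeats` as an average over repeated evaluations is an assumption here, and the scores are hypothetical:

```python
from statistics import mean


def binding_energy(e_complex, e_separated):
    """Energy of the bound complex minus energy of the separated partners."""
    return e_complex - e_separated


def averaged_binding_energy(samples):
    """Average the binding energy over repeated (complex, separated) evaluations.

    Averaging over repeats is an assumption made for this illustration.
    """
    return mean(binding_energy(c, s) for c, s in samples)


# Hypothetical total scores from three repeats: (complex, separated partners).
samples = [(-310.0, -290.0), (-311.0, -290.5), (-309.0, -289.5)]
print(averaged_binding_energy(samples))  # prints -20.0
```

A residue whose score is identical in the bound and separated states adds the same amount to both totals, so it cancels in the subtraction.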

To reduce the time needed for the Relax, you can specify task operations to restrict where the sidechains can repack, as well as a movemap to restrict which parts can minimize. https://www.rosettacommons.org/docs/latest/scripting_documentation/Roset... Turning off one or both in regions distal to the interface/mutation can go a long way toward speeding up the process.
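Deciding which residues count as "near the mutation" can be done outside Rosetta. A minimal sketch, assuming one representative atom per residue (for example CA) and the 10 Å cutoff asked about earlier in the thread; the function name and the coordinates are hypothetical:

```python
import math


def residues_within(coords, center_res, cutoff=10.0):
    """Return sorted residue numbers whose representative atom lies within cutoff.

    coords maps residue number -> (x, y, z) of one representative atom, in Å.
    The central residue itself is included in the result.
    """
    center = coords[center_res]
    return sorted(res for res, xyz in coords.items()
                  if math.dist(center, xyz) <= cutoff)


# Hypothetical coordinates, not taken from any real structure.
coords = {
    10: (0.0, 0.0, 0.0),
    11: (3.8, 0.0, 0.0),
    40: (6.0, 7.0, 0.0),
    90: (30.0, 0.0, 0.0),
}
print(residues_within(coords, center_res=10))  # prints [10, 11, 40]
```

The resulting list could then be used to decide which positions the movemap and task operations leave free; how it is written into the XML is not covered by this sketch.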

Also, with FastRelax you need to be a bit careful, as certain settings can turn on design, which will slow things down tremendously. By default you should be fine, but if you add any task operations to the FastRelax, then you need to make sure you add a RestrictToRepacking task operation as well, otherwise you might open yourself up to design.

Wed, 2015-09-02 12:43
rmoretti

You do know that none of these are Rosetta++, by the way, correct? I'd like to move the question to the Rosetta3 forum but don't want you to lose your thread.

Wed, 2015-06-17 11:02
smlewis

Thanks for the heads up. Please do.

Thu, 2015-06-18 11:39
achambe