Should FastRelax be commented out in Calculate_protein_protein_ddg demo?

12 posts / 0 new

Top

Here's the source code of mutation_script.xml, part of the demo calculate_protein_protein_ddg included in Rosetta (see https://www.rosettacommons.org/demos/latest/public/calculate_protein_protein_ddg/README). I deleted the comments to make it shorter:

<dock_design>
    <SCOREFXNS>
    </SCOREFXNS>
    <FILTERS>
    <Ddg name=dg_wt threshold=1000 repeats=3 jump=3/>
    <Ddg name=dg_mut threshold=1000 repeats=3 jump=3/>
  </FILTERS>
    <TASKOPERATIONS>
            <RestrictToRepacking name=repack_only />
            <ReadResfile name=resfile filename=UbcH7-A98W.resfile/>
    </TASKOPERATIONS>
    <MOVERS>
        FastRelax name=relax/>
        <PackRotamersMover name=pack task_operations=repack_only/>
        <PackRotamersMover name=mut_and_pack task_operations=resfile/>
    </MOVERS>
    <PROTOCOLS>
        Add mover_name=relax/>
        <Add mover_name=pack/>
        <Add filter_name=dg_wt/>
        <Add mover_name=mut_and_pack/>
        <Add filter_name=dg_mut/>
    </PROTOCOLS>
</dock_design>

You can see that the FastRelax mover is commented out (it's missing the opening <). This is exactly like this on the original mutation_script.xml file included in Rosetta.

Is this a typo or is there a reason for this? Note that I don't care so much for computation time. I want to obtain results as accurate as possible. So what is the recommended protocol here? Use FastRelax or not?

Category:

Design

Post Situation:

Unsolved

Mon, 2016-10-31 12:01

cossio

Top

Noting that uncommenting the FastRelax makes the protocol MUCH more slower.

Mon, 2016-10-31 12:57

cossio

Top

Two more questions:

1) Why is the threshold=1000 on <Ddg .../> so high?

2) I am running the demo with -nstruct 1000, to generate 1000 pdb structures. For many of them, the ddG = 0 (exactly zero!). Is this likely? I'd expect the ddG to be small, if the mutation has a negligible effect. But a value that is exactly zero seems unlikely. Is this normal?

Thu, 2016-11-03 10:00

cossio

Top

I'm guessing that the fact that the relax slows down the run tremendously is the reason why it's commented out.

Note that the relax is the very first thing in the protocol - this means that you're pre-optimizing the structure prior to the ddG calculation - the relax is not being applied during the ddG calculation proper. If you already have a decently optimized structure (e.g. if you already ran relax on the strucutre, or if it's coming from some other Rosetta protocol which does a relax), then the initial relax is likely superflous. Instead, you can get away with just repacking the structure.

The other issue is that you typically don't do a single ddG calculation - normally it's done against a number of different mutants. If you include the relax as part of the ddG script, you re-relax the structure for each independent mutations. It's much more CPU efficient to do the relax once, as a preparation step, rather than doing it for every mutation.

Regarding the threshold, the reason is that you don't really want the ddG filters to act as "filters" - you want to run the ddG protocol, send the results to the tracer and continue. Setting the threshold that high accomplishes that. If the threshold was lower, you might actually trigger it and stop the run prior to outputing the mutated structure. (If you set `confidence=0`, the the filters wouldn't be run at all at the point in the protocol, only being run at the very end of the protocol, both on the mutated structure.)

How are you calculating the ddG values? If you are pulling the values from the score file, those values are both from the exact same (mutated) structure (the filters being run at the end of the protocol), so it's reasonable that the difference between the two are zero - if the two converge to exactly the same packed rotamers, then there may be no difference between the structures, leading to an exactly zero difference.

Mon, 2016-11-14 15:33

rmoretti

(Reply to #5)

Top

Thanks for the reply.

How are you calculating the ddG values?

I am doing what the README.md (of the demo) says:

The output file will be a repacked and mutated PDB file with the dg_wt and
dg_mut lines at the end of the file. Subtration of these two numbers yields
the change in binding energy.

So I am looking at the pdb output files. The final two lines contain ddG scores, for example, here are the last two lines of one .pdb output file from a run I did using the above script:

dg_mut -22.8434
dg_wt -22.8434

You see that the numbers are exactly the same.

Looking forward to your reply.

Tue, 2016-11-15 16:12

cossio

(Reply to #6)

Top

Yeah ... that's not going to work. -- I'm not sure why they're doing it that way.

As I mentioned, the values in the scorefile (and by extension the end of the PDB) are calculated based on the structure at the *end* of the protocol. So both ythe dg_mut and dg_wt values are being calculated on the same structure ... hence the reason they're the same.

If you have a recent weekly release, then you can use the FilterReportAsPoseExtraScoresMover (https://www.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/Movers/FilterReportAsPoseExtraScoresMover) . You would make two new movers, one for each filter, and then in the PROTOCOLS section use those movers, rather than the filters. This should do what you expect to do.

Thanks for the bug report - I'll fix the demo accordingly.

Wed, 2016-11-16 15:18

rmoretti

(Reply to #7)

Top

Thanks. When you fix the demo can you let me know here, or send me the updated version?

Thu, 2016-11-17 11:22

cossio

(Reply to #8)

Top

I don't understand why the filters are run on the same mutated structure. In the <PROTOCOLS> ... </PROTOCOLS> section, I place one Ddg filter *before* the mutation, and another *after* the mutation. Why are they both run after the mutation?

Thu, 2016-11-17 04:54

cossio

(Reply to #9)

Top

It's due to a quirk in the way RosettaScripts filters work.

The *filtering* portion of the Filter works at the time you place it in the PROTOCOLS section. That is, if you pass/fail based on the result of the filters, you do so at the point of time the filters are placed in the PROTOCOLS block.

However, the *(re-)scoring* portion of filters for the scorefile (or end of the PDB) work based on the structure at the *end* of the protocol. The philosophy is that the scores in the scorefile (or PDB file) are for the final structure, so the filter-related scores should also be for the final structure. - I admit it's confusing. It's probably not the way I would have done it.

Mon, 2016-11-21 07:12

rmoretti

(Reply to #10)

#10

Top

Thanks. It's clearer now.

Mon, 2016-11-21 07:56

cossio

#11

Top

@rmoretti Based on your comments, I modified my protocol as follows. I would like to know your opinion. Thanks.

I relaxed the crystal pdb containing the complex with FastRelax. I generated 100 decoys and selected the one with lowest score. Then I feed this structure to the following script to compute the dg of a mutated structure:

<ROSETTASCRIPTS>
	<SCOREFXNS>
	</SCOREFXNS>
	<FILTERS>
		<Ddg name=dg threshold=1000 repeats=5 jump=1 repack=true repack_bound=true/>
	</FILTERS>
	<TASKOPERATIONS>
		Here we specify the location of the resfile to use for design.
		<ReadResfile name=resfile filename="%%mutresfile%%"/>
	</TASKOPERATIONS>
	<MOVERS>
		Here we pack the rotamers with design. We specify to read the resfile containing the mutation we want to design.
		<PackRotamersMover name=mut_and_pack task_operations=resfile/>
	</MOVERS>
	<PROTOCOLS>
		Here we include the movers and filters in the order we want them to run.
		<Add mover_name=mut_and_pack/>
		<Add filter_name=dg/>
	</PROTOCOLS>
</ROSETTASCRIPTS>

I also apply this script to the wild-type (by using a resfile that only allows the native residue).

This way I obtain two distinct PDBs, the wild-type and the mutant, each with its dG. Substracting the dG's gives the predicted ddG of mutation.

Thu, 2016-11-17 14:12

cossio

(Reply to #12)

#12

Top

Sounds good to me!

Mon, 2016-11-21 07:13

rmoretti

Search form

You are here

Should FastRelax be commented out in Calculate_protein_protein_ddg demo?