Dear Rosetta developers, I am wondering whether it is possible to cross-check results between Rosetta++ and Rosetta 3 by using the same scoring function for each.
As an example, I tried evaluating the score of the attached PDB file using the two programs and obtained -79.88 from Rosetta++ and 323.668 from Rosetta 3. I also tried breaking down the energies into components to resolve the difference, without much success.
I've attached a more detailed output below.
The reason I am trying to do this is that our lab would like to make use of the protocol outlined in
"Structure-based protocol for identifying mutations that enhance protein-protein binding affinities", Sammond et al.
to design improved binders to glycoprotein hormone receptors. So far we have used the version currently running on the Rosetta Design server, which is based on Rosetta++. We were encouraged by the extensive validation studies performed in the paper, and our own results have been promising, with the protocol already producing mutations matching those known from the literature.
However, we would like to use this protocol to examine protein complexes containing post-translational modifications, support for which is only available in Rosetta 3. Thus I am learning more about the Rosetta internals, in an effort to port increase_affinity from Rosetta++ to Rosetta 3 (using PyRosetta).
I am starting this effort by trying to reproduce the Rosetta++ scoring function on Rosetta 3, given that the Sammond results were thoroughly validated against the Rosetta++ function.
Thank you and sincerely,
P.S. If anyone knows of any efforts to update increase_affinity to newer scoring functions, or to get it running on Rosetta 3.5, please let me know. We are an experimental group (genetics, Hay lab) and would definitely appreciate any assistance in this direction.
COMMAND: rosetta.gcc -paths ~/Software/Rosetta2/rosetta++/paths.txt -s a.pdb -score -intout test.txt -overwrite -remark_output -scorefile a.score
TOTAL SCORE -79.88
Rosetta 3 (using PyRosetta.ScientificLinux-r56324.64Bit)
from rosetta import *
init()  # initialize Rosetta before any other PyRosetta call
pose = pose_from_pdb("a.pdb")
scorefxn = get_fa_scorefxn()  # default full-atom score function
print("E = %f" % scorefxn(pose))
TOTAL SCORE 323.668
Term           Weight   Raw        Weighted
fa_atr         0.8      -302.098   -241.678
fa_rep         0.44      419.826    184.724
fa_sol         0.75      193.97     145.477
fa_intra_rep   0.004     202.386      0.81
fa_elec        0.7       -19.301    -13.511
pro_close      1          28.542     28.542
hbond_sr_bb    1.17       -5.507     -6.443
hbond_lr_bb    1.17      -14.493    -16.957
hbond_bb_sc    1.17       -2.204     -2.579
hbond_sc       1.1        -1.476     -1.623
dslf_fa13      1          -5.318     -5.318
rama           0.2        -7.836     -1.567
omega          0.5        35.077     17.538
fa_dun         0.56      430.751    241.22
p_aa_pp        0.32      -16.289     -5.212
ref            1           0.245      0.245
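As a sanity check on the breakdown above, the weighted components can be summed by hand and compared against the reported total. This is only a sketch in plain Python (no PyRosetta needed); the weights and raw values are copied from the table, and the names TERMS and weighted_total are made up for illustration.

```python
# (term, weight, raw score) copied from the PyRosetta breakdown above.
TERMS = [
    ("fa_atr", 0.8, -302.098),
    ("fa_rep", 0.44, 419.826),
    ("fa_sol", 0.75, 193.97),
    ("fa_intra_rep", 0.004, 202.386),
    ("fa_elec", 0.7, -19.301),
    ("pro_close", 1.0, 28.542),
    ("hbond_sr_bb", 1.17, -5.507),
    ("hbond_lr_bb", 1.17, -14.493),
    ("hbond_bb_sc", 1.17, -2.204),
    ("hbond_sc", 1.1, -1.476),
    ("dslf_fa13", 1.0, -5.318),
    ("rama", 0.2, -7.836),
    ("omega", 0.5, 35.077),
    ("fa_dun", 0.56, 430.751),
    ("p_aa_pp", 0.32, -16.289),
    ("ref", 1.0, 0.245),
]


def weighted_total(terms):
    """Sum weight * raw over all score terms."""
    return sum(weight * raw for _, weight, raw in terms)


if __name__ == "__main__":
    # List the terms from most positive to most negative contribution.
    for name, weight, raw in sorted(TERMS, key=lambda t: -t[1] * t[2]):
        print("%-14s %10.3f" % (name, weight * raw))
    print("%-14s %10.3f" % ("total", weighted_total(TERMS)))
```

The sum agrees with the reported 323.668 to within the rounding of the printed values, so the table accounts for the whole Rosetta 3 score.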
The main difference you're seeing currently between the Rosetta++ scores and the Rosetta3 scores you list is the difference between centroid mode and fullatom mode. The Rosetta++ scorefunction you're displaying is a centroid mode score, but your PyRosetta scoring is fullatom.
Looking at the paper, it looks like you probably *don't* want the centroid energy function. Instead, you'll probably want the "soft_rep_design" energy function. Rosetta3 has a weights file called "soft_rep_design.wts", which is likely mostly analogous to the energy function used in the Rosetta++ runs in the paper.
Some *major* caveats, though: while many Rosetta++ functionalities got ported over to Rosetta3, not everything was, and what was ported doesn't necessarily behave identically to Rosetta++. The two are mostly similar, but there's no guarantee that Rosetta3 functionality is the same as Rosetta++ functionality. Moreover, even within Rosetta3 itself, things get tweaked and continually improved, so the behavior of something today might be different than what it was several years ago (or even last month).
If you're using a recent PyRosetta (r56324 would count), the main thing you have to watch out for is the score12 -> talaris2013 transition. This transition introduced significant changes in how various scoring terms worked and other underlying behavior. If you want to recapitulate the behavior of earlier versions of Rosetta, you'd be well advised to add the commandline flag -restore_pre_talaris_2013_behavior when you initialize PyRosetta (see http://www.pyrosetta.org/faq#TOC-1.-How-do-I-interact-with-the-Rosetta-O... )
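A rough sketch of how the two suggestions above (the soft_rep_design weights and the pre-talaris flag) might be combined in PyRosetta. The helper names build_init_flags and score_soft_rep are made up for illustration, and I'm assuming that init() accepts an extra_options keyword and that create_score_function() takes the weights name without the .wts extension; please check both against your PyRosetta build.

```python
def build_init_flags(restore_pre_talaris=True):
    """Return the option string to hand to PyRosetta's init()."""
    flags = []
    if restore_pre_talaris:
        # Flag named in the reply above.
        flags.append("-restore_pre_talaris_2013_behavior")
    return " ".join(flags)


def score_soft_rep(pdb_path, weights="soft_rep_design"):
    """Score a PDB file with the named weights file (needs PyRosetta)."""
    # Imported here so build_init_flags() works without PyRosetta installed.
    from rosetta import init, pose_from_pdb, create_score_function

    # Assumption: init() accepts an extra_options keyword in this release.
    init(extra_options=build_init_flags())
    pose = pose_from_pdb(pdb_path)
    # Assumption: the weights are looked up by name, without ".wts".
    scorefxn = create_score_function(weights)
    return scorefxn(pose)


if __name__ == "__main__":
    print("init flags: %s" % build_init_flags())
```

Calling score_soft_rep("a.pdb") would then give a full-atom soft-rep score to set against the Rosetta++ number, with the caveats above about the two codebases not being guaranteed to agree.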
I might also recommend running some of the example systems from the paper with your PyRosetta scheme, and checking that the performance is close to or better than what they saw in the paper, to make sure things are working well.
Thanks rmoretti for the detailed reply! I will look out for centroid vs full atom modes; soft_rep_design vs default weights; and also score12 vs talaris2013 differences. This is quite helpful in navigating the different options and scoring modes available in Rosetta.
I've been comparing various scoring functions and atom representations in Rosetta 2.3 vs 3.5, and the results all make sense so far. There is just one detail I'm having difficulty with.
Right now in Rosetta 2.3, the option -soft_rep_design doesn't seem to have any effect on the score.
The command I am using is
rosetta.gcc -paths ~/Software/Rosetta2/rosetta++/paths.txt -s a.pdb -score -intout score.txt -overwrite -remark_output -scorefile a.score -soft_rep_design
rosetta.gcc -paths ~/Software/Rosetta2/rosetta++/paths.txt -s a.pdb -score -intout score.txt -overwrite -remark_output -scorefile a.score
Do you know how I can set up a calculation to compare the score with vs without the softened potential?
Thanks much, Julius
I should add that I also ran it in full atom mode (-fa_input), and the -soft_rep_design flag didn't seem to have an effect then either.
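Since both commands above write to the same score.txt and a.score with -overwrite, one way to keep the two runs apart is to give each its own output names. The sketch below only builds and prints the command lines; the flags are the ones already used in this thread, while the function name build_score_command and the output names (score_hard.txt, a_soft.score, and so on) are hypothetical. It does not address whether -soft_rep_design actually takes effect in plain scoring mode.

```python
import os


def build_score_command(tag, soft_rep=False, fa_input=False):
    """Build a Rosetta++ scoring command with per-run output names."""
    cmd = [
        "rosetta.gcc",
        "-paths", os.path.expanduser("~/Software/Rosetta2/rosetta++/paths.txt"),
        "-s", "a.pdb",
        "-score",
        "-intout", "score_%s.txt" % tag,    # hypothetical per-run name
        "-overwrite",
        "-remark_output",
        "-scorefile", "a_%s.score" % tag,   # hypothetical per-run name
    ]
    if fa_input:
        cmd.append("-fa_input")
    if soft_rep:
        cmd.append("-soft_rep_design")
    return cmd


if __name__ == "__main__":
    # Print the two command lines that differ only in the soft-rep flag.
    for tag, soft in (("hard", False), ("soft", True)):
        print(" ".join(build_score_command(tag, soft_rep=soft, fa_input=True)))
```

The two printed commands differ only in the -soft_rep_design flag and the output names, so any difference between a_hard.score and a_soft.score can be attributed to the flag.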
If the softrep flag is being read appropriately, you should be getting output messages like "using soft potentials as requested by the -soft_rep_design flag" and "!!! weights for packer modified according to the -soft_rep_design flag !!!" in the Rosetta++ logging output.
I'm not tremendously familiar with the Rosetta++ code, but it could be that the plain scoring mode doesn't actually trigger the soft_rep_design scoring; instead, you may need to use the packer in order to trigger its use. (And it may be that even when the packer is used, the structure is automatically rescored with a non-soft-rep scorefunction at the very end.)
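If the logging output is long, a quick way to look for the two messages quoted above is to scan a saved log for them. This is a minimal sketch: the message strings are the ones quoted in this reply, while the log file name rosetta.log and the function name soft_rep_messages_found are hypothetical.

```python
import os

# The two messages quoted above, expected when -soft_rep_design is read.
SOFT_REP_MESSAGES = (
    "using soft potentials as requested by the -soft_rep_design flag",
    "!!! weights for packer modified according to the -soft_rep_design flag !!!",
)


def soft_rep_messages_found(log_text):
    """Map each expected message to True if it occurs in the log text."""
    return dict((msg, msg in log_text) for msg in SOFT_REP_MESSAGES)


if __name__ == "__main__":
    log_path = "rosetta.log"  # hypothetical: wherever the output was saved
    if os.path.exists(log_path):
        with open(log_path) as handle:
            found = soft_rep_messages_found(handle.read())
        for msg, seen in found.items():
            print("%s: %s" % ("found" if seen else "missing", msg))
```

Finding both messages only shows that the flag was read; it doesn't by itself show that the score finally reported was computed with the soft-rep function.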
Thanks rmoretti. I do see the lines:
using soft potentials as requested by the -soft_rep_design flag
!!! weights for packer modified according to the -soft_rep_design flag !!!
but it looks like the score finally output is the one for the non-soft-rep scorefunction. I'll see if I can get access to the soft-rep energy somehow.
I figured it out! The solution is to pass in the additional flag