I'm writing a pilot app in Rosetta and I want to cluster the results (all PDBs) of a global docking run. After calculating the scores, they are completely different from the original scores I got from docking_protocol. I'm using the following code:
core::scoring::ScoreFunctionOP scorefxn( core::scoring::get_score_function() );
( *scorefxn )( init_pose );
I found that the input PDBs do not have sidechains. However, using the following commands only made the scores better; the difference still remains:
pack::task::PackerTaskOP task( pack::task::TaskFactory::create_packer_task( init_pose ));
task->initialize_from_command_line().or_include_current( true );
protocols::simple_moves::PackRotamersMoverOP pack( new protocols::simple_moves::PackRotamersMover( scorefxn, task, 1 ) );
pack->apply( init_pose );
Please let me know if I should ask my question somewhere else.
Thanks a lot for your help in advance
This is the best place to ask questions about Rosetta. (Except perhaps if you're a member of a RosettaCommons lab, and have access to the internal mailing lists.)
I suspect that your issue is that the docking results are scored with one scorefunction, but you're rescoring with a different one.
Normally, if you don't specify a scoring function, you'll get the default one (talaris2014 for the most recent weekly releases and Rosetta3.6). It depends slightly on how you're rescoring, but for things like score_jd2, you can use the -score:weights option to specify a weights file. The trick is to figure out which weights file was used originally. One way is to look at the code which scored your original structures, but probably an easier way is to look at the output PDBs: the weights used should be listed at the top of the per-residue energy table. You can compare those numbers with those in the weights files in Rosetta/main/database/scoring/weights/. At the very worst (e.g. if the program did custom weight manipulations), you can always try making your own weights file. (-score:weights can take a local filename, in addition to the database weights file.)
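To make the comparison above less error-prone than reading numbers by eye, here is a minimal sketch in plain C++ (no Rosetta headers). It assumes the energy table in the PDB has a "label ..." line followed by a "weights ..." line (with "NA" under the total column), and that a weights file is a list of "term weight" lines with optional # comments. The function names are hypothetical helpers, not Rosetta API, and both inputs are passed in as strings you have already read from disk.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using WeightMap = std::map<std::string, double>;

// Parse "term weight" lines from the text of a weights file (assumed format).
// Comments (#...) are stripped; lines that are not exactly "name number",
// such as METHOD_WEIGHTS directives, are skipped.
WeightMap parse_weights_file(const std::string& text) {
    WeightMap weights;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line.substr(0, line.find('#')));
        std::string term, extra;
        double value = 0.0;
        if (!(fields >> term >> value)) continue;  // blank line or non-numeric weight
        if (fields >> extra) continue;             // more than two fields: a directive
        weights[term] = value;
    }
    return weights;
}

// Read the "label" and "weights" rows of the energy table in a PDB (assumed layout)
// and pair each term name with its weight.
WeightMap parse_pdb_table_weights(const std::string& pdb_text) {
    WeightMap weights;
    std::vector<std::string> labels;
    std::istringstream in(pdb_text);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string key, token;
        if (!(fields >> key)) continue;
        if (key == "label") {
            labels.clear();
            while (fields >> token) labels.push_back(token);
        } else if (key == "weights" && !labels.empty()) {
            for (std::size_t i = 0; i < labels.size() && (fields >> token); ++i) {
                if (token == "NA") continue;  // the total column carries no weight
                std::istringstream number(token);
                double value = 0.0;
                if (number >> value) weights[labels[i]] = value;
            }
            break;
        }
    }
    return weights;
}

// List the terms whose weights differ by more than tol; a term missing from
// one side is treated as having weight zero there.
std::vector<std::string> mismatched_terms(const WeightMap& a, const WeightMap& b,
                                          double tol = 1e-6) {
    auto weight_or_zero = [](const WeightMap& m, const std::string& k) {
        auto it = m.find(k);
        return it == m.end() ? 0.0 : it->second;
    };
    WeightMap all = a;
    all.insert(b.begin(), b.end());
    std::vector<std::string> out;
    for (const auto& kv : all) {
        if (std::fabs(weight_or_zero(a, kv.first) - weight_or_zero(b, kv.first)) > tol) {
            out.push_back(kv.first);
        }
    }
    return out;
}
```

An empty result from mismatched_terms for a given weights file suggests that file is a candidate for the one used originally; it does not rule out custom weight changes made in code after the file was loaded.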
A further complication is that you say your "input pdbs do not have sidechains". (By this, I assume you mean that the results of the global docking - the inputs to rescoring - don't have sidechains.) This indicates to me that they're in centroid mode. (Are there CEN atoms in the PDBs?) This means that to get consistent rescoring, you're going to need to 1) use a centroid mode scorefunction and 2) read in the PDBs in centroid mode. So if you want to match the scoring, *don't* re-add the sidechains. Normal rescoring (e.g. with score_jd2) will probably assume that the PDBs are in full atom mode, and automatically re-add sidechains. Instead, tell the PDB loader that you want centroid mode structures. Typically this is done with -in:file:centroid or -in:file:centroid_input, but it can vary based on the exact protocol you're using.
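If you'd rather check for CEN atoms programmatically than open each file, a rough sketch in plain C++ (no Rosetta headers) could look like the following. It assumes standard fixed-column PDB records, where the atom name sits in columns 13-16 of ATOM/HETATM lines; count_cen_atoms and looks_like_centroid are hypothetical helper names, and the file contents are passed in as a string.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Count ATOM/HETATM records whose atom-name field (PDB columns 13-16) is "CEN".
std::size_t count_cen_atoms(const std::string& pdb_text) {
    std::istringstream in(pdb_text);
    std::string line;
    std::size_t count = 0;
    while (std::getline(in, line)) {
        if (line.size() < 16) continue;  // too short to hold an atom name
        const std::string record = line.substr(0, 6);
        if (record != "ATOM  " && record != "HETATM") continue;
        const std::string name = line.substr(12, 4);
        // Trim the space padding around the atom name before comparing.
        const std::size_t first = name.find_first_not_of(' ');
        if (first == std::string::npos) continue;
        const std::size_t last = name.find_last_not_of(' ');
        if (name.substr(first, last - first + 1) == "CEN") ++count;
    }
    return count;
}

// Heuristic only: any CEN atom suggests the structure was written in centroid mode.
bool looks_like_centroid(const std::string& pdb_text) {
    return count_cen_atoms(pdb_text) > 0;
}
```

A file for which looks_like_centroid returns true should be loaded in centroid mode and scored with a centroid scorefunction; a full atom file would instead show explicit sidechain atoms and no CEN records.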
Dear Rocco Moretti,
Thanks a lot for your quick response. The PDBs that I'm using in my pilot app are all results of low-res global docking and they are all in centroid mode. I've used the default docking_protocol with some command-line options, and as I didn't relax the output PDBs, I think it's normal to have results in centroid mode (does that make sense?)
I guess importing the PDBs in centroid mode and using a centroid-mode scoring function would be the solution, according to your points. I've created a score function matching the one in the DockLowres protocol:
scorefxn = core::scoring::ScoreFunctionFactory::create_score_function( "interchain_cen" );
I've also used these command line options:
I've tested the default score function with the -score:weights cen_std option and the results are the same!
Now the scores are very close to the original, but there are still some differences. Should I define the docking partners, since I'm using the interchain_cen score function? The score headers in the PDB files are as follows:
SCORE: total_score score rms cen_rms interchain_contact interchain_env interchain_pair interchain_vdw st_rmsd time description
Thanks a lot for your valuable points...
Since you're passing the structures through PDB format, which rounds the coordinates to 0.001 Å precision, you should expect to see slight numeric variability in rescoring due to the slight change in coordinates. It shouldn't be large, though, unless you have really big clashes or are on some other such knife edge of scoring.
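To get a feel for the size of that effect, here is a toy illustration in plain C++. It rounds coordinates to three decimals, as PDB output does, and evaluates a steep (sigma/r)^12 repulsion before and after. The repulsion term and the sigma value are stand-ins chosen for illustration, not an actual Rosetta score term, and all names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Round one coordinate to the 0.001 precision that PDB format keeps.
double round_to_pdb_precision(double v) {
    return std::round(v * 1000.0) / 1000.0;
}

Vec3 round_to_pdb_precision(const Vec3& p) {
    return Vec3{round_to_pdb_precision(p.x),
                round_to_pdb_precision(p.y),
                round_to_pdb_precision(p.z)};
}

double separation(const Vec3& a, const Vec3& b) {
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Toy steep repulsive term (assumption for illustration, not a Rosetta term).
double toy_repulsion(double r, double sigma = 3.0) {
    assert(r > 0.0);
    return std::pow(sigma / r, 12);
}

// Change in the toy term caused purely by writing and re-reading the coordinates.
double score_shift_after_rounding(const Vec3& a, const Vec3& b) {
    const double before = toy_repulsion(separation(a, b));
    const double after = toy_repulsion(separation(round_to_pdb_precision(a),
                                                  round_to_pdb_precision(b)));
    return after - before;
}
```

For two atoms about 4 Å apart the shift from rounding is a tiny fraction of a score unit, while for a pair about 1 Å apart the same 0.0005 Å coordinate change moves this toy term by thousands of units, which is the knife-edge behaviour described above.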
Yes, in order to get the interchain_cen scoring terms to work, you'll need to annotate the pose with the INTERFACE_INFO. There's not an easy way to do this from the command line (e.g. with score_jd2), but if you're writing C++ code (or PyRosetta) you can follow the example in src/protocols/docking/DockSetupMover.cc for DockSetupMover::apply().