I recently had some issues re-ranking RosettaDock-generated decoys with ZRANK. The basic idea is described in a paper by Pierce and Weng, 2008 (http://www.ncbi.nlm.nih.gov/pubmed/18214977), where the authors used RosettaDock to refine docking predictions from ZDOCK and then rescored them with ZRANK. ZRANK requires polar hydrogens on all decoys before it rescores them, and the paper states: "ZRANK was used to rerank the ZDOCK models as described previously, with polar hydrogens added to the unbound proteins using RosettaDock prior to scoring. For the refined structures, hydrogens were already in the structures from RosettaDock. The nonpolar hydrogens (which were also added by RosettaDock) were ignored by ZRANK."
I thought this would be fine, since I ran docking with the -no_optH false flag, so Rosetta should optimize the polar hydrogen positions. However, ZRANK did not seem to recognize the hydrogens and gave this warning: "Warning: no hydrogens found in PDB file, need to include polar hydrogens for accurate scoring".
I noticed the authors used RosettaDock 2.0, while I used version 3.5. I am not sure if this causes the problem. I also noticed that Rosetta names the H atoms differently from other programs that add polar hydrogens. When I added hydrogens with Discovery Studio instead, the file worked well with ZRANK. So I am wondering how to make the Rosetta hydrogens recognizable by ZRANK. Does anyone have suggestions on this? Thanks in advance.
Hydrogen naming conventions are a little bit of a muddle in Rosetta at the moment. We currently use an older "alternate" naming system for hydrogens, rather than the currently recommended naming conventions from the wwPDB. We don't have a flag which allows you to switch the naming conventions, unfortunately.
A slightly painful way of fixing this is to go into rosetta_database/chemical/residue_type_sets/fa_standard/residue_types/l-caa/ and edit the names of the hydrogens in the params files found there. This will cause Rosetta to use the names you specify for those atoms and residues.
The other option is a bit of scripting (e.g. with sed) to rename the hydrogens in the output PDB files.
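For anyone who prefers Python to sed, here is a minimal sketch of that kind of renaming script. It assumes standard fixed-column PDB records (atom name in columns 13-16, residue name in columns 18-20). The function names and EXAMPLE_NAME_MAP are made up for illustration, and the two mapping entries are placeholders only: the actual name pairs have to come from comparing a Rosetta decoy with a file that ZRANK accepts.

```python
# Hypothetical mapping: (residue name, Rosetta atom name) -> replacement name.
# The entries below are placeholders for illustration, NOT a verified table.
EXAMPLE_NAME_MAP = {
    ("ALA", "1HB"): "HB1",
    ("ALA", "2HB"): "HB2",
}


def format_atom_name(name):
    """Fit an atom name into the 4-character PDB field (columns 13-16).

    Names shorter than 4 characters are written starting in column 14,
    which is the usual layout for one-letter elements such as H.
    """
    return name if len(name) == 4 else (" " + name).ljust(4)


def rename_hydrogens(lines, name_map):
    """Return a copy of the PDB lines with mapped atom names replaced."""
    out = []
    for line in lines:
        if line.startswith(("ATOM  ", "HETATM")) and len(line) >= 20:
            key = (line[17:20], line[12:16].strip())
            if key in name_map:
                # Replace only the atom-name field; all other columns are kept.
                line = line[:12] + format_atom_name(name_map[key]) + line[16:]
        out.append(line)
    return out
```

To use it, read the decoy with open(path).read().splitlines(), pass the lines and your own mapping to rename_hydrogens, and write the result back with one record per line.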
Alternatively, adding hydrogens with PyMOL and then reranking with ZRANK worked fine for me.
I think I know how to make it work. I compared the working PDB file with the Rosetta output file and found the difference.
Apart from the H naming issue, Rosetta tends to output docking decoys without a "TER" entry between chains. I guess Rosetta reads PDB files by chain ID and simply ignores the TER entry in the file. Manually adding the TER entry fixed the problem with ZRANK. I am not sure whether Rosetta can be made to write a TER entry between chains, but adding one is easy with a little scripting.
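A rough sketch of that scripting step in Python is below. It assumes the chain ID sits in column 22 of ATOM/HETATM records and that a bare "TER" line is enough for ZRANK (I have not checked whether it needs the full TER record with serial number and residue fields). The function name add_ter_between_chains is just for illustration.

```python
def add_ter_between_chains(lines):
    """Insert a bare TER line wherever the chain ID changes.

    `lines` are PDB records without trailing newlines. A TER is only added
    if one is not already present at the chain break.
    """
    out = []
    prev_chain = None       # chain ID of the last coordinate record seen
    prev_was_ter = False    # True if a TER was seen since that record
    for line in lines:
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            chain = line[21] if len(line) > 21 else " "
            if prev_chain is not None and chain != prev_chain and not prev_was_ter:
                out.append("TER")
            prev_chain = chain
            prev_was_ter = False
        elif record == "TER":
            prev_was_ter = True
        out.append(line)
    return out
```

Note that this only adds TER between chains, not after the last chain, and it treats any chain ID change as a break, including HETATM records placed on a separate chain.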
I added a "TER" entry between chains and added polar hydrogens to the unbound proteins before rescoring with ZRANK. However, I still get the "Warning: no hydrogens found in PDB file, need to include polar hydrogens for accurate scoring" message. Are there any suggestions on this? Thank you.
Rosetta still uses the older naming convention for hydrogens. You'll need to use a script to change the hydrogen names from the Rosetta convention to the one that ZRANK wants. I am unaware of an existing script which will do this for you, so you'll need to write one yourself.
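As a starting point for writing such a script, it helps to see exactly which hydrogen names differ. The sketch below lists, per residue type, the hydrogen names that appear only in a Rosetta decoy and only in a reference file that ZRANK accepts (for example one with hydrogens added by another program). It assumes fixed-column PDB records and identifies hydrogens by the element column when present, otherwise by an atom name that starts with H after any leading digits. The function names are hypothetical.

```python
from collections import defaultdict


def hydrogen_names_by_residue(lines):
    """Collect the set of hydrogen atom names seen for each residue name."""
    names = defaultdict(set)
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        atom = line[12:16].strip()
        element = line[76:78].strip().upper()  # empty if the column is absent
        if element:
            is_hydrogen = element == "H"
        else:
            # Fallback: names such as "1HB" or "HB1" both count as hydrogens.
            is_hydrogen = atom.lstrip("0123456789").startswith("H")
        if is_hydrogen:
            names[line[17:20]].add(atom)
    return names


def naming_differences(rosetta_lines, reference_lines):
    """Map residue name -> (names only in Rosetta, names only in reference)."""
    ros = hydrogen_names_by_residue(rosetta_lines)
    ref = hydrogen_names_by_residue(reference_lines)
    diffs = {}
    for res in sorted(set(ros) & set(ref)):
        only_ros = sorted(ros[res] - ref[res])
        only_ref = sorted(ref[res] - ros[res])
        if only_ros or only_ref:
            diffs[res] = (only_ros, only_ref)
    return diffs
```

If the reference file contains polar hydrogens only, the Rosetta-only list will also include the nonpolar hydrogens, so the output still needs to be paired up by hand before it can be used as a renaming table.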