Dear Rosetta Community,
I am trying to test-run the RosettaDNA protocol. For background on the test, please refer to my earlier post on the forum: http://www.rosettacommons.org/content/rosettadna-error
I have a question regarding the residues designed around the DNA-binding interface by multistate design. The protein-DNA binding example I used is zfn268. When studying the output models, I saw that Rosetta mutated a few key arginine residues away. These Arg residues are very important to DNA binding affinity and specificity because they form favorable interactions with the DNA major groove. To understand why these Arg are mutated, I studied the scoring results. One observation is that the reference energy term of Arg is very high (probably the highest among all 20 AA). I guess that, at the amino acid level, the high ref energy of Arg may be due to its high degree of freedom. But in the DNA binding case, these Arg are very helpful for DNA binding. I would rather mutate some other residues to Arg than mutate Arg away.
The weight of the reference energy is set to 1 by default in RosettaScripts. I wonder if there is an easy way in RosettaScripts to lower the weight of the reference energy, or just to reparametrize the ref energy of Arg to a lower value. Any other approaches are also welcome. Thank you for your help.
The reference weight is just another scoreterm. You can just use the Reweight subtag to reweight the term to whatever you want it to be. http://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/Rose... That would scale all the reference weights, though. If you wanted to tweak the Arg reference weight specifically, you would have to modify the weights file itself. (Remember that if you use a scorefunction other than the default score12, you'll need to set the movers/filters tag to use it.)
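For the second option (tweaking only the Arg value in the weights file), here is a minimal Python sketch. It assumes the weights file carries the per-residue reference energies on a single `METHOD_WEIGHTS ref` line with 20 values in alphabetical one-letter order (ACDEFGHIKLMNPQRSTVWY); please check that against your own file before trusting it. The function name `shift_ref_energy` and the toy numbers are made up for illustration and are not real Rosetta values.

```python
# One-letter amino acid codes in alphabetical order.
# ASSUMPTION: the values on the "METHOD_WEIGHTS ref" line follow this order;
# verify this against the weights file you are actually using.
AA_ORDER = "ACDEFGHIKLMNPQRSTVWY"


def shift_ref_energy(weights_text, aa, delta):
    """Return weights-file text with the reference energy of `aa` shifted by `delta`.

    Only the "METHOD_WEIGHTS ref" line is rewritten; all other lines are
    passed through unchanged.
    """
    index = AA_ORDER.index(aa)  # raises ValueError for an unknown code
    out_lines = []
    found = False
    for line in weights_text.splitlines():
        fields = line.split()
        if fields[:2] == ["METHOD_WEIGHTS", "ref"]:
            values = [float(v) for v in fields[2:]]
            if len(values) != len(AA_ORDER):
                raise ValueError("expected 20 reference energies, got %d" % len(values))
            values[index] += delta
            line = "METHOD_WEIGHTS ref " + " ".join(f"{v:g}" for v in values)
            found = True
        out_lines.append(line)
    if not found:
        raise ValueError("no 'METHOD_WEIGHTS ref' line found")
    return "\n".join(out_lines) + "\n"


if __name__ == "__main__":
    # Toy weights text with made-up numbers (0..19), NOT real score12 values.
    toy = "METHOD_WEIGHTS ref " + " ".join(str(i) for i in range(20)) + "\nfa_atr 0.8\n"
    # Lower the Arg ("R") reference energy by 0.5 and show the result.
    print(shift_ref_energy(toy, "R", -0.5))
```

You would then save the edited text as a new weights file and point your scorefunction at it, rather than overwriting the one in the database.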
All that said, I don't think that the reference weight is your main problem. Rather, I'm guessing that Rosetta isn't correctly calling all the favorable interactions that the residue is making. You may need to bump up the sampling (for example, try -ex3 and -ex4 to increase sampling on the chi 3 and 4) to get rotamers in the correct orientation to make the good hydrogen bonds.
The other approach is that since you know those residues have to be Arg, you can force in that identity by using something like a resfile, or a TaskOperation which would limit design at that position.
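As a rough sketch of the resfile route, the Python snippet below writes out a resfile that pins a list of positions to Arg with `PIKAA R` and leaves everything else on a default command. The positions and chain ID in the example are hypothetical placeholders, not the actual key Arg positions in your structure, and the `ALLAA` default is only an assumption; pick whatever default behaviour suits your protocol.

```python
def make_resfile(arg_positions, default_command="ALLAA"):
    """Build resfile text that restricts the given positions to arginine.

    arg_positions   -- iterable of (residue_number, chain_id) tuples
    default_command -- resfile command applied to all unlisted residues
                       (ASSUMPTION: ALLAA, i.e. full design; NATAA or NATRO
                       are alternatives if you want less freedom elsewhere)
    """
    lines = [default_command, "start"]
    # Sort and de-duplicate so the output is stable and easy to diff.
    for resnum, chain in sorted(set(arg_positions)):
        # PIKAA R allows only Arg at this position; it is still repacked.
        lines.append(f"{resnum} {chain} PIKAA R")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical positions on chain A, for illustration only.
    key_args = [(24, "A"), (18, "A"), (46, "A")]
    print(make_resfile(key_args))
```

The resulting text can be saved to a file and passed to the protocol wherever it accepts a resfile.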
Thank you for your suggestion. I tried including -ex3 and -ex4. Unfortunately, the Arg are still mutated. There are 5 key Arg that bind to the DNA major groove. They are all mutated to other AA after the calculation. So I think the influence of the Arg ref energy is significant.
Using a resfile to restrict the Arg is a practical method for this test. But for real-world DNA-binding protein design, where not many Arg are present in the wt protein and new Arg are expected to be introduced for DNA binding specificity, the resfile doesn't know where to introduce new Arg. I understand that manually forcing the weight contributions in the scoring function, or reparametrizing the ref energy of Arg, is not preferred if the calculation can simply follow the scoring rules, which act like a "force field" defined inside the application. But this is the last option I could think of to make the "force field" more suitable for the DNA-binding interface.
I guess the most direct answer is that we don't HAVE a scorefunction premade for your purpose, other than the one that Justin's DNA code is already using. Devising scorefunctions for various different simulations is an ultrahard problem: for one, we know that real physics has exactly one scorefunction, so using specialized ones is cheating, and for two, determining what the scorefunction should be for any particular simulation is something we can't do with great reliability.
If you wanted to try to create the perfect scorefunction for DNA-protein interfaces, you'd need to generate DNA-protein interfaces with Rosetta, express the proteins, crystallize them, use the differences between the crystal and the model to fix the scorefunction, and cycle through this indefinitely until you had a perfect solution. This is obviously untenable. For some problems we attempt to short-circuit this by running models of known crystal structures with different scorefunctions until we find a scorefunction that best recapitulates crystals, but this isn't particularly successful.
Manually hacking around with the scorefunction (in your case, by respecifying just the ARG weight in the reference energies) is a perfectly valid way to improve your simulation. It's laborious and error-prone, exactly like all the other methods for improving scorefunctions.
The problem you are describing is a persistent one: Rosetta is better at handling hydrophobic interactions than highly charged interactions like those found at a protein-DNA interface. As others have pointed out, this probably won’t be corrected by tweaking reference energies. That said, there are a couple of easy things you can do to try to remedy the situation. If you are not doing so already, I would recommend pre-relaxing your input structure using the Rosetta Relax application. Sometimes Rosetta dislikes good interactions because of minor unfavorable interactions with nearby residues that are not being redesigned or repacked.
A more sophisticated workaround should appear in the 3.5 release, using something known as MotifDnaPacker. This approach supplies Rosetta with a database of canonical protein-DNA interactions found in the PDB, and provides a score term bonus for incorporating similar interactions in interface designs. (See http://www.ncbi.nlm.nih.gov/pubmed/22426128 for details.)
Thanks, everyone, for the constructive suggestions. I would like to experiment with the scoring function to some extent. I do realize that this is a risky approach and may not yield favorable results. Meanwhile, I look forward to the new 3.5 release and to trying MotifDnaPacker.