I am having a problem getting the packer to design positions that hold noncanonical residues in the PDB file I provide. My example is 2ovq, where the short peptide, chain C, has two phosphorylated residues (TPO at 380 and SEP at 384). When I make a resfile like this:
384 C ALLAA
The packer will only sample this position with phosphoserine, phosphothreonine or phosphotyrosine. My interpretation is that the packer is secretly aware of the class of modified residues and only considers the phosphorylated residues as "canonical" replacements for this class. I am seeing the problem when running the ddg_monomer application, but it seems to be a feature of the packer and not the app.
The question is then: how can I turn this off? I want the packer to sample all the residues that we normally know as canonical (ACFGILM ... ).
Rosetta treats phosphorylation as a "variant type". The packer refuses to alter "variant types" during packing. Consider it this way: most users would WANT Rosetta to keep their phosphoresidues during packing instead of removing them, or even worse, just putting phosphates in when it feels like it!
Broadly, the easy solution is to modify your inputs to not include the phosphoresidues. This will screw up the ddG's, of course. I'm not sure that ddG was benchmarked on phosphoresidues.
I guess you can get the energies for the pT and the T separately, and then the dGs to the mutations, and then do some of the addition outside of Rosetta?
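The bookkeeping suggested here could look roughly like the sketch below. It is only an illustration under loud assumptions: the function name and all numbers are hypothetical, the scores are taken to be total energies of the relaxed phosphorylated and dephosphorylated structures, and it assumes those two scores can be compared directly, which has not been benchmarked.

```python
def ddg_relative_to_phospho(score_phospho, score_dephospho, ddg_from_dephospho):
    """Combine scores outside Rosetta (hypothetical helper, not a Rosetta API).

    score_phospho      -- total score of the structure with the phosphoresidue
    score_dephospho    -- total score of the same structure with the phosphate removed
    ddg_from_dephospho -- dict of mutation label -> ddG computed on the
                          dephosphorylated input (e.g. from ddg_monomer)
    """
    # Step 1: energy change of removing the phosphate (pS -> S).
    dephospho_step = score_dephospho - score_phospho
    # Step 2: add the ordinary ddG of each mutation starting from S.
    return {mut: dephospho_step + ddg for mut, ddg in ddg_from_dephospho.items()}


# Made-up numbers, purely to show the arithmetic.
example = ddg_relative_to_phospho(
    score_phospho=-250.0,
    score_dephospho=-247.5,
    ddg_from_dephospho={"S384A": 0.5, "S384D": -0.25},
)
```

With these made-up inputs the dephosphorylation step is 2.5, so each mutation's value is simply shifted by that amount. Whether that shift is physically meaningful is exactly the unbenchmarked part.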
That makes sense but it surprises me that you say that it cannot be done at all.
You are right that ddG methods in Rosetta are not benchmarked with phosphoresidues, and also that there is a possible (and very ugly) way around it: doing the subtraction outside Rosetta after running on both the native protein and then the protein with its phosphoresidues removed.
Thanks for your response,
" cannot be done at all"
Well, I'm assuming you don't want to rewrite the C++ code. You can go in and special-case the variant matching code in PackerTask to ignore phosphorylation variants when determining allowed ResidueTypes. Users generally frown on that. (I mean, I wrote a chunk of PackerTask and I don't want to try doing it...)
There is a large upcoming change (it's still in development) to the packer called the "PackerPalette" that is likely to be able to address this issue. It's designed in a general sense to address Rosetta's issues in switching between the exact 20 canonical amino acids and the wider world of chemical modifications and noncanonicals. No idea on the timeframe.
Haha, yeah the "cannot be done at all" should be put in the context of doing it with the Rosetta code base as it currently is.
Certainly I would not mind making the change in the C++ code and recompiling, at least if it only requires editing a conditional statement or similar, but finding where it is and testing it afterwards (running unit tests etc.) is more troublesome. So I guess I will have to wait for the new PackerPalette to be committed to main.
Thanks Steven that was really helpful to get clarified.
Oh, and another thing related to the Packer that you can probably easily answer:
How come the Packer includes two histidines in packing when specifying "ALLAA"? I presume that one of them is D-histidine but the ddg_monomer output unfortunately does not specify which is D and which is L. Is there a default order? I also noticed that D-histidine is included under the "l-caa" .params files:
The behaviour of the Packer to include D-histidine stands in contrast to the resfile documentation that says 20 amino acids are included:
Placing HIS_D.params in the "l-caa" folder... Well, that must be for historical reasons, because namespace-wise it does not make much sense.
HIS, which is implicitly HIS_E, and HIS_D are the different chemical tautomers of L-histidine. E and D indicate which ring nitrogen carries the hydrogen - one is NE2 (epsilon) and one is ND1 (delta). The D-histidine that is the stereoisomer at the alpha carbon lives with the DCAAs where you'd expect, as DHI probably (didn't check) - there are probably both hydrogen placements separately there, too.
There are two params files because there are two ResidueTypes, because they have different atoms (same elements, but that hydrogen changes names and chemical connectivity). A ResidueType is extremely rigid in terms of chemistry - any change in chemical connectivity is a new type. This is somewhat historical - Rosetta3 knew it needed HIS handled properly from day 1, but a lot of the other residue fiddly bits got figured out much later. If we were doing it today it would be via "patches".
I see. Then it makes most sense to choose the lowest-energy type in the ddG calculations in my case.
Once more, thank you so much, this helped me a lot.
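For anyone post-processing the output the same way, picking the lower-energy histidine tautomer could be sketched as below. This is a minimal illustration, not part of ddg_monomer: the helper name is hypothetical, the labels HIS and HIS_D follow the ResidueType names discussed above, and the ddG values are made up.

```python
def lowest_energy_variant(ddg_by_variant):
    """Return (label, ddG) for the variant with the lowest ddG.

    ddg_by_variant -- dict of variant label -> ddG for one sequence position,
                      e.g. the two histidine tautomers (hypothetical input
                      collected by hand from the ddG output).
    """
    best = min(ddg_by_variant, key=ddg_by_variant.get)
    return best, ddg_by_variant[best]


# Made-up values for the two L-histidine tautomers at one position.
label, ddg = lowest_energy_variant({"HIS": 1.2, "HIS_D": 0.7})
```

Here `label` would be the tautomer to report for that position, with the other one discarded.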