Access sigma values for rotamers

Is it possible to get access, through the class structure, to *other* fields of the Dunbrack rotamer library? After building the rotamers for a particular pose I would like to be able to get the sigma values of each chi as well as the Dunbrack probability values, for each rotamer.

I've been drilling down through the class structure and think I see how to get the probability values, but I'm not seeing the sigmas anywhere.

Jason

Thu, 2014-05-15 07:10
pachecoj

Possibly the easiest way to get the values is to use the get_rotamer() or get_all_rotamer_samples() methods of core::pack::dunbrack::RotamericSingleResidueDunbrackLibrary< T > (which you can get, after dynamic casting, with a call to the get_library_by_aa() method of the core::pack::dunbrack::RotamerLibrary singleton). These return a DunbrackRotamerSampleData object (or a vector1 of them). That object has a chi_sd() method, which returns a fixed-length (4), one-indexed array of the sd values for that particular rotamer.

The values are stored in the core::pack::dunbrack::DunbrackRotamerMeanSD< S, P > class or one of its subclasses.
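The cast-then-read pattern described above can be sketched in plain C++. This is only an illustration under stated assumptions: SampleData, SingleResidueLibrary, RotamericLibrary and samples_if_nchi are hypothetical stand-ins, not the Rosetta headers, and the real chi_sd() array is one-indexed whereas std::array here is zero-indexed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the Rosetta classes.
struct SampleData {                  // plays the role of DunbrackRotamerSampleData
    std::size_t nchi = 0;
    double probability = 0.0;
    std::array<double, 4> chi_sd{};  // fixed length 4 (zero-indexed here, unlike Rosetta)
};

struct SingleResidueLibrary {        // plays the role of the base library type
    virtual ~SingleResidueLibrary() = default;
};

template <std::size_t T>
struct RotamericLibrary : SingleResidueLibrary {  // role of the templated rotameric library
    std::vector<SampleData> samples;
};

// Try the cast for one value of T. A failed dynamic_cast yields a null pointer,
// in which case an empty vector is returned.
template <std::size_t T>
std::vector<SampleData> samples_if_nchi(const SingleResidueLibrary& lib) {
    const auto* typed = dynamic_cast<const RotamericLibrary<T>*>(&lib);
    return typed ? typed->samples : std::vector<SampleData>{};
}
```

Because T is a template parameter, the caller has to try each candidate value of T in turn; only the one matching the library's actual type gives a non-null pointer.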

Thu, 2014-05-15 08:24
rmoretti

Based on your suggestion of using a dynamic_cast to RotamericSingleResidueDunbrackLibrary<T> I am able to get at most of the chi_sd() values. There are, however, two issues:

1) The nchi() method of RotamerLibrary seems to return the wrong number of chi's for some aa's. Three failure cases I've observed are SER=2, THR=2, TYR=3. Based on my understanding these should be SER=1, THR=1, TYR=2.

2) I'm not quite sure I'm handling the non-rotameric cases correctly. For TYR the only dynamic_cast that works (i.e. returns a non-null pointer) is RotamericSingleResidueDunbrackLibrary<2>, but I only get 6 rotamers, whereas I think I should have 9 according to the Dunbrack docs (3 for chi1 and 6 for chi2). Am I only getting chi2? I tried casting to SemiRotamericSingleResidueDunbrackLibrary<T> for various values of T, but it always returns a null pointer. I don't know if this issue is only for TYR, or whether I have similar issues for other aa's with non-rotameric chi's.

Any suggestions on either of these points would be helpful.

Jason

Sat, 2014-05-17 09:54
pachecoj

Chi angles numbering in Rosetta includes proton chis. That's what the extra chi angles in Ser, Thr, and Tyr are - the chi angle for the hydroxyl proton. These chi angles are special cased internally. The values don't come from the Dunbrack library, but are added on top of the Dunbrack rotamers.

For multiple chi rotamers, you can't simply add the number of chi1 and chi2 variants -- there's an inter-chi dependence (such that a chi1 of a given value can prohibit chi2s that otherwise would be valid). Even if they were independent, it would be multiplicative rather than additive.
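As a toy illustration of the counting argument, the sketch below enumerates (chi1, chi2) bin pairs and keeps those that a caller-supplied rule allows. The function name count_rotamers and the "allowed" rule are hypothetical, and the bin counts are made-up numbers rather than values from the Dunbrack library.

```cpp
#include <cassert>
#include <functional>

// Count the (chi1, chi2) bin pairs accepted by a toy "allowed" rule.
// With no restrictions the result is n_chi1 * n_chi2 (multiplicative);
// any inter-chi dependence can only lower that number.
int count_rotamers(int n_chi1, int n_chi2,
                   const std::function<bool(int, int)>& allowed) {
    assert(n_chi1 >= 0 && n_chi2 >= 0);
    int count = 0;
    for (int i = 0; i < n_chi1; ++i) {
        for (int j = 0; j < n_chi2; ++j) {
            if (allowed(i, j)) ++count;
        }
    }
    return count;
}
```

With 3 chi1 bins and 6 chi2 bins, a rule that allows everything gives 18, not 9. A rule that forbids two chi2 bins for one particular chi1 bin gives 16.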

The first thing to check is if you're referring to the correct Dunbrack library version for your counting reference. Rosetta 3.5 and before use the 2002 version of the Dunbrack library by default, but the weekly releases use the 2010 version of the Dunbrack library. (The defaults can sometimes be changed with the "-dun10" or "-dun10 false" flags.) The other thing to realize is that the Dunbrack library is by default backbone dependent, so the rotamers you see can vary based on what backbone conformation you use.

Another issue is that Rosetta uses a probability cutoff when building rotamers. Often it's not worth building rotamers for very low probability states, so only the most common rotamers are built, rather than all of them. The way this is specified is with a bulk probability cutoff. For example, if set to 0.95, Rosetta would work its way down the list from most probable to least probable, building rotamers until the accumulated probability for the rotamers reaches 95%. So the rotamers which represent the least probable 5% or so would be ignored. This cutoff can be adjusted with the -packing:dunbrack_prob_buried, -packing:dunbrack_prob_nonburied and -packing:dunbrack_prob_nonburied_semirotameric options. (Defaults are 0.98, 0.95 and 0.95, respectively. Set to 1.0 to include all Dunbrack rotamers.)
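The cumulative cutoff can be sketched as follows. This is a minimal stand-alone illustration, not Rosetta's code: count_rotamers_built is a hypothetical name, and whether the rotamer that crosses the threshold is itself built is an assumption made here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Return how many rotamers would be built under a bulk probability cutoff.
// Assumption: the rotamer whose probability pushes the running total to or
// past the cutoff is still built.
std::size_t count_rotamers_built(std::vector<double> probabilities, double cutoff) {
    assert(cutoff > 0.0 && cutoff <= 1.0);
    // Walk the list from most probable to least probable.
    std::sort(probabilities.begin(), probabilities.end(), std::greater<double>());
    double accumulated = 0.0;
    std::size_t built = 0;
    for (double p : probabilities) {
        ++built;
        accumulated += p;
        if (accumulated >= cutoff) break;  // enough probability mass covered
    }
    return built;
}
```

With a cutoff of 1.0 the loop never stops early unless the running total actually reaches 1.0, so every rotamer in the list is counted.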

Regarding "semirotameric", this is a feature of the 2010 Dunbrack library (used by weekly releases) - the 2002 version (used by Rosetta 3.5 and earlier) doesn't have these. Even for 2010, not every amino acid is semirotameric (only asn, asp, gln, glu, his, phe, trp, and tyr). If you're not using the 2010 library, or if you're using the 2010 library but not one of the semirotameric amino acids, trying to dynamic cast to SemiRotamericSingleResidueDunbrackLibrary will give you a null pointer, as those libraries are simply RotamericSingleResidueDunbrackLibrary, not the SemiRotameric ones.

Sat, 2014-05-17 14:10
rmoretti

Thanks.

I'm a little confused as to how the rotamer library is built. Looking at RotamerSet::build_rotamers_for_concrete(), there seem to be a number of options. I am working with a protein, not DNA/RNA, so I think it *should* build the Dunbrack library. As a test, though, I completely removed the rosetta_database/rotamer directory, and it still builds the rotamers without error.

Any clarification on this process would be very helpful.

Jason

Thu, 2014-05-15 12:28
pachecoj

Yes, you should be hitting the "All other residues" clause. What happens is that the RotamerSet asks the RotamerLibrary for the SingleResidueRotamerLibrary that corresponds to whatever particular ResidueType you want to build the rotamers for. The SingleResidueRotamerLibrary is then responsible, in the fill_rotamer_vector() method, for creating Residue objects representing the rotamers. The SingleResidueDunbrackLibrary class (or rather the RotamericSingleResidueDunbrackLibrary< T > and SemiRotamericSingleResidueDunbrackLibrary< T > subclasses) is the subtype of SingleResidueRotamerLibrary responsible for handling the standard amino acids (those in the Dunbrack library). Each amino acid has its own SingleResidueDunbrackLibrary instance, which is loaded and cached by the RotamerLibrary early on in the run. The loading code is somewhat convoluted, as it has to handle both dun02 and dun10 formats, and for each there's a cached binary format to speed loading, which is generated the first time things are run, and then stored in the database/rotamer directory. (There should be log statements to this effect, and for normal runs you'll see mention of loading the Dunbrack rotamers from binary.)
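The lookup-then-fill flow described above can be sketched with stand-in types. Everything here (Rotamer, SingleResidueLibrary, ToyDunbrackLibrary, LibraryRegistry, build_rotamers) is hypothetical and heavily simplified. It is not the Rosetta class hierarchy, and it skips the database and binary-cache loading entirely.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; not the Rosetta classes.
struct Rotamer { std::vector<double> chi; };

struct SingleResidueLibrary {
    virtual ~SingleResidueLibrary() = default;
    // Each library type decides how to create rotamers for its residue type.
    virtual void fill_rotamer_vector(std::vector<Rotamer>& out) const = 0;
};

struct ToyDunbrackLibrary : SingleResidueLibrary {
    std::vector<Rotamer> rotamers;
    void fill_rotamer_vector(std::vector<Rotamer>& out) const override {
        out.insert(out.end(), rotamers.begin(), rotamers.end());
    }
};

// Plays the role of the central library: one cached entry per amino acid.
class LibraryRegistry {
public:
    void add(const std::string& aa, std::shared_ptr<const SingleResidueLibrary> lib) {
        cache_[aa] = std::move(lib);
    }
    std::shared_ptr<const SingleResidueLibrary> get(const std::string& aa) const {
        auto it = cache_.find(aa);
        if (it == cache_.end()) return nullptr;  // nothing cached for this residue
        return it->second;
    }
private:
    std::map<std::string, std::shared_ptr<const SingleResidueLibrary>> cache_;
};

// Plays the role of the rotamer set: ask for the library, then let it fill the vector.
std::vector<Rotamer> build_rotamers(const LibraryRegistry& registry, const std::string& aa) {
    std::vector<Rotamer> out;
    if (auto lib = registry.get(aa)) lib->fill_rotamer_vector(out);
    return out;
}
```

The point of the virtual fill_rotamer_vector() is that the caller never needs to know which concrete library type it was handed.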

I'm not sure why things still work when you delete the rosetta_database/rotamer directory. That needs to exist to load the rotamer libraries. The only thing I can think of is that your program is either reading the database from a different location than the one you deleted it from, or you deleted the directory in the middle of a run, after the libraries were loaded into memory.

Fri, 2014-05-16 13:02
rmoretti