Sorry for the newbie question, but I'm looking for some PDBs to practice packing methodologies on. This will be packing only to start with, not design, and I'm looking for a locked backbone (I've seen a function somewhere in the docs for this), just perturbing the side chains exactly like in this video http://www.youtube.com/watch?feature=player_detailpage&v=_SDHZ-jxP4o....
Are there suitable (as yet non-optimised) PDBs like this to test in the Rosetta tarball? Maybe a non-optimised PDB from an old Foldit challenge would do the job as well?
Thanks in advance
It's easy enough to download PDBs from the RCSB (www.pdb.org) - they're all experimentally determined, so they should hopefully be free of Rosetta optimization. That's especially true of older structures (particularly pre-2000), which were determined before Rosetta was up and running. (Keep in mind that some NMR models may have been determined with CS-Rosetta, and even some of the X-ray structures might have been refined with Rosetta, though the higher the resolution, the less bias from whatever fitting method was used.)
As for structures included with the Rosetta distribution, take a look at the files in main/tests/scientific/biweekly/sequence_recovery/inputs/, which are used for benchmarking Rosetta's ability to do design. (They should also work for benchmarking rotamer recovery and packing.)
Thx for the reply.
OK, that confused me slightly. Surely the PDBs in the RCSB are as they appear in the wild (presumably derived from X-ray crystallography) and hence are fully packed by natural folding? None of them will be suboptimal guesses from a partial run of a packing simulation? What you seem to be saying is that a lot of the entries are inaccurate. Looking at the site, it has a figure for resolution, but these won't be "unpacked" proteins, will they? More like best attempts at a folded structure using incomplete data?
Will have a look through that directory for suitable candidates. Thx for the pointer.
Apologies for the newbie-ness again, I'm on a steep learning curve here!
Sorry, I misinterpreted your request. Usually when I think about "non-optimized" structures, I'm thinking in terms of structures which haven't been subjected to Rosetta's energy function, and hence don't necessarily have any of Rosetta's biases in them. You're correct that naturally folded proteins should already have "ideal" packing. That's not to say that Rosetta won't try to change the packing - it will. But that would be considered to indicate an error in Rosetta's energy function, rather than the experimental structure being inaccurate. The directory I gave you is all experimental structures - it's likely no different than if you had downloaded them directly from the RCSB website.
I'm not aware of any "dis-packed"/"mis-packed" proteins in the standard distribution, but it should be easy enough to make them. All you'd have to do is feed Rosetta an inaccurate energy function. One possibility is to use just the rotamer potential. That would put the most probable rotamer at each position, regardless of whether there's a steric clash, bad hydrogen bonds, etc. To do this, make a text file with just the line "fa_dun 1.0". Then you can run repacking with that weights file. (e.g. use fixbb https://www.rosettacommons.org/docs/latest/fixbb.html with an all-NATAA resfile and the option "-score:weights fa_dun.wts", where "fa_dun.wts" is the text file with just the "fa_dun 1.0" line.) You can play around with other inaccurate energy functions if you want: just fa_atr (LJ attractive), just hydrogen bonding terms, etc. The output of those packing runs would be mis-packed, and you could then use the regular Rosetta energy functions to repack/reoptimize them.
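For anyone wanting to script the setup described above, here is a rough Python sketch that writes the one-line weights file and an all-NATAA resfile, then assembles a fixbb command line. The executable name "fixbb.linuxgccrelease", the input name "input.pdb", the resfile name "nataa.resfile" and the helper function names are assumptions for illustration only; the executable suffix depends on your platform and build, and you may need further options (such as the database path) for your install.

```python
from pathlib import Path


def write_mispacking_inputs(workdir):
    """Write the rotamer-only weights file and an all-NATAA resfile."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)

    # Weights file containing only the Dunbrack rotamer term.
    weights = workdir / "fa_dun.wts"
    weights.write_text("fa_dun 1.0\n")

    # Resfile: NATAA as the default for every residue (repack, no design),
    # followed by an empty body after the "start" line.
    resfile = workdir / "nataa.resfile"  # hypothetical file name
    resfile.write_text("NATAA\nstart\n")
    return weights, resfile


def fixbb_command(pdb, weights, resfile, executable="fixbb.linuxgccrelease"):
    """Build (but do not run) a fixbb command line.

    The executable name is an assumption; adjust it for your build.
    """
    return [
        executable,
        "-s", str(pdb),
        "-resfile", str(resfile),
        "-score:weights", str(weights),
    ]


if __name__ == "__main__":
    wts, res = write_mispacking_inputs("mispack_run")
    print(" ".join(fixbb_command("input.pdb", wts, res)))
```

The script only prints the command so you can check it before running it yourself. Repacking the output again with the regular weights then gives you the "recover from a mis-packed start" exercise.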
Ah right, all understood now. Good tip on the repacking, thanks for that, will give it a go. Useful link there too.
Out of interest, do you know if there has been much work on alternative packing methodologies other than simulated annealing? I believe there is a genetic algorithm used in the DNAinterface module for example.
Thx again for the help
SCWRL4 from the Dunbrack Lab uses a tree-decomposition algorithm during packing, using an energy function that is completely optimized for rotamer repacking. It is exceptionally fast as well. http://dunbrack.fccc.edu/scwrl4/ http://dunbrack.fccc.edu/scwrl4/SCWRL4Paper.pdf
Both SCWRL4 and the default talaris2013 energy function in Rosetta use the newer 2010 rotamer library that is discussed in that paper. Recovery is a bit better in SCWRL4, but comparable to Rosetta.
Great stuff, thanks for that, will have a read
Came across this video https://www.youtube.com/watch?v=fvtnEv4x6sQ It's part of a very good lecture series by Jeffrey Gray and colleagues, and gives a solid grounding in rotamer packing. It mentions a dead-end elimination technique that removes suboptimal rotamers from the search space to narrow down the possibilities (for a given backbone conformation). Is this technique used in Rosetta? If so, could someone point me in the right direction?
Thx again for all your help
Rosetta doesn't have dead-end elimination, although it's a common technique in some other programs (EGAD, Dezymer, ORBIT, OSPREY). There's been some work on adding a DEE module to Rosetta, but it really didn't go anywhere. The standard Rosetta packing algorithm tends to be "good enough" for most use cases (and tends to work better when you're rapidly changing backbone conformations).
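If it helps to see what dead-end elimination does, here is a toy Python sketch of the classic single-rotamer elimination criterion: rotamer r at a position can be discarded if some other rotamer t at that position has a worst-case energy that is still lower than r's best-case energy. This is not Rosetta code and not taken from any of the programs named above; the data layout (a list of one-body energies per position and a dictionary of pairwise tables) and the function names are made up for illustration.

```python
def pair_energy(pair_e, i, r, j, s):
    """Pairwise energy of rotamer r at position i with rotamer s at position j."""
    if (i, j) in pair_e:
        return pair_e[(i, j)][r][s]
    if (j, i) in pair_e:
        return pair_e[(j, i)][s][r]
    return 0.0  # positions with no table are treated as non-interacting


def bound(self_e, pair_e, alive, i, r, pick):
    """Best-case (pick=min) or worst-case (pick=max) energy of rotamer r at i."""
    total = self_e[i][r]
    for j in range(len(self_e)):
        if j != i:
            total += pick(pair_energy(pair_e, i, r, j, s) for s in alive[j])
    return total


def dee_prune(self_e, pair_e):
    """Repeatedly remove rotamers that can never be in the global minimum."""
    alive = [set(range(len(rots))) for rots in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(len(self_e)):
            for r in sorted(alive[i]):
                # r is a dead end if any rival t beats it even when r gets
                # its most favourable neighbours and t its least favourable.
                best_r = bound(self_e, pair_e, alive, i, r, min)
                if any(
                    best_r > bound(self_e, pair_e, alive, i, t, max)
                    for t in alive[i] if t != r
                ):
                    alive[i].discard(r)
                    changed = True
    return alive


if __name__ == "__main__":
    self_e = [[0.0, 10.0], [0.0, 1.0]]
    pair_e = {(0, 1): [[0.0, 1.0], [0.0, 1.0]]}
    print(dee_prune(self_e, pair_e))
```

The surviving rotamers still have to be searched by some other method; DEE only shrinks the search space, and on weakly separated energies it may eliminate nothing at all.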
Another technique besides packing that *is* used in Rosetta occasionally is called "rotamer trials". This is basically the hill-climbing version of packing. Instead of doing simulated annealing to do simultaneous optimization at all positions, you randomly pick a single position and then exhaustively score all the rotameric possibilities for that position in the (fixed) current context of all other residue positions. You substitute the lowest-energy rotamer, and then move on to the next position, optimizing in the updated context. It doesn't really consider coordinated changes like regular packing can, but it's more exhaustive in its sampling than standard packing.
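The rotamer trials idea can be sketched in a few lines of Python on a toy energy model. This is only an illustration of the procedure described above, not Rosetta's implementation; the energy tables, the single-pass random visiting order and all function names are assumptions.

```python
import random


def total_energy(self_e, pair_e, assignment):
    """Sum of one-body and pairwise energies for a full rotamer assignment."""
    energy = sum(self_e[i][r] for i, r in enumerate(assignment))
    for (i, j), table in pair_e.items():
        energy += table[assignment[i]][assignment[j]]
    return energy


def position_energy(self_e, pair_e, assignment, pos, rot):
    """Energy of rotamer rot at pos, with every other position held fixed."""
    energy = self_e[pos][rot]
    for (i, j), table in pair_e.items():
        if i == pos:
            energy += table[rot][assignment[j]]
        elif j == pos:
            energy += table[assignment[i]][rot]
    return energy


def rotamer_trials(self_e, pair_e, assignment, rng=None):
    """One pass of greedy, one-position-at-a-time rotamer optimization."""
    rng = rng or random.Random(0)
    current = list(assignment)
    order = list(range(len(current)))
    rng.shuffle(order)  # visit positions in random order
    for pos in order:
        # Exhaustively score every rotamer at this position in the
        # current context and keep the lowest-energy one.
        current[pos] = min(
            range(len(self_e[pos])),
            key=lambda rot: position_energy(self_e, pair_e, current, pos, rot),
        )
    return current


if __name__ == "__main__":
    self_e = [[0.0, 1.0], [0.0, 1.0]]
    pair_e = {(0, 1): [[5.0, 0.0], [0.0, 0.0]]}  # rotamers 0 and 0 clash
    start = [0, 0]
    result = rotamer_trials(self_e, pair_e, start)
    print(start, total_energy(self_e, pair_e, start))
    print(result, total_energy(self_e, pair_e, result))
```

Because each step only ever accepts the best rotamer for one position, the total energy can never go up, but the result depends on the visiting order and can get stuck where a coordinated change at two positions would be needed.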