I want to use a two-step design procedure in which a fixed backbone design is combined with a backbone relaxation.
The two steps are supposed to be repeated alternately as long as they bring improvement in the energy.
(something like this http://www.ncbi.nlm.nih.gov/pubmed/22632833)
Which fixbb design parameters are the most important? I use -packing:ex1, -packing:ex2. But perhaps there is something else I should consider?
Next, I want to relax the resulting designed structure. However, I want to avoid big changes in the backbone. Which parameters can I use to make the relaxation less 'aggressive'? Will -relax:fast be enough? Does the relax protocol support symmetry?
The relaxation step is supposed to allow for small backbone movements that resolve potential steric clashes (sophisticated alternative for -score:weights soft_rep_design).
Such a relaxed structure will be an input for another round of fixbb design. The question is whether to use the relaxed backbone or the full structure (bb+sequence) for design? In the latter case, how to tell fixbb design to optimize the input sequence instead of starting from a random one? Do I need to perform an idealization step at some point?
Finally, when should the procedure be stopped? I am considering using the RMSD to the initial backbone structure (which should not be too big) together with the change in energy.
I would appreciate your comments and hints, especially on the technical aspects.
Packing: -ex1 -ex2 are the general suggestion for packing. "-extrachi_cutoff 0" is useful if you want extra rotamers for nonburied residues for whatever reason (the integer argument determines how many neighbors a residue must have to get extra rotamers). -use_input_sc is useful if you are doing sidechain minimization at any point, or want to allow retention of input (crystal) rotamers. If you want packing to converge more tightly, "-ndruns 10" will help (the integer is how many times to try repacking each time packing is called), or "-multi_cool_annealer 10" to get a similar super-convergence effect.
Relax: Relaxation is generally not too aggressive in my experience unless there are bad clashes in the input, in which case it can make a structure explode. -fast is superior to the classic mode. I'm reasonably certain it supports symmetry but I haven't tried it personally.
"The question is whether to use the relaxed backbone or the full structure (bb+sequence) for design? In the latter case, how to tell fixbb design to optimize the input sequence instead of starting from a random one? Do I need to perform an idealization step at some point?"
If you want to do what Grant did, then you use the output of one step as the input for the next (assuming I remember it correctly from group meetings). You don't need to tell fixbb anything special. If you are going to idealize at all, you only need to do it once, at the very beginning. Generally idealization is relevant only if you want to use fragments.
The procedure should be stopped whenever it stops improving structure energy. It will bottom out after 4 or 5 cycles.
I brought this to Grant's attention for further details...he can hopefully post exact command lines for you.
One way to make relax less aggressive is to use constraints. You can turn on backbone constraints with the -relax:constrain_relax_to_start_coords command line flag. By default this will put on constraints early in the relax, but gradually taper them off as the relax progresses. This should keep you relatively close to the starting structure. You can tailor how aggressive the constraints are by varying the -relax:coord_cst_stdev command line parameter. A smaller number means tighter constraints and less backbone movement (default is 0.5).
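To get a feel for how -relax:coord_cst_stdev controls tightness, here is a minimal sketch. It assumes the coordinate constraint behaves like a harmonic penalty of the form (d / stdev)^2, where d is how far an atom has moved from its starting position. The function name and the harmonic form are assumptions made for illustration, not something taken from the Rosetta source.

```python
def harmonic_penalty(displacement, stdev=0.5):
    """Penalty for an atom that moved `displacement` angstroms from its start.

    Hypothetical helper: assumes a harmonic form (d / stdev) ** 2.
    """
    if stdev <= 0:
        raise ValueError("stdev must be positive")
    return (displacement / stdev) ** 2


if __name__ == "__main__":
    # Same 1 A displacement, scored with progressively looser constraints.
    for stdev in (0.25, 0.5, 1.0):
        print(stdev, harmonic_penalty(1.0, stdev))
```

Under this assumption, halving the stdev quadruples the penalty for the same displacement, which is why a smaller value keeps the backbone closer to the input.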
RE :> I want to use a two-step design procedure in which a fixed backbone design is combined with a backbone relaxation.
This can be done in two ways - one is to simply do fixbb followed by relax and check the average or best energy decrease at each step. A second method is to write a small rosetta protocol that couples fixbb->relax as was done in the publication you mentioned. The code for the publication you mentioned has not been distributed with the current rosetta release - my apologies.
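The first option (run fixbb, then relax, then feed the output into the next round) could be scripted roughly as below. This is only an outline under assumptions: the executable names fixbb.linuxgccrelease and relax.linuxgccrelease depend on your build and platform, the helper names are hypothetical, and apart from the standard -s input flag only flags mentioned in this thread are used. The functions just assemble argument lists; running them (for example with subprocess.run) and locating the output PDB of each step is left to you.

```python
def fixbb_command(pdb, exe="fixbb.linuxgccrelease"):
    """Argument list for one fixed-backbone design step.

    The executable name is an assumption; adjust it to your Rosetta build.
    """
    return [exe, "-s", pdb, "-ex1", "-ex2", "-use_input_sc"]


def relax_command(pdb, exe="relax.linuxgccrelease", cst_stdev=0.5):
    """Argument list for one fast relax step restrained to the input coordinates.

    The executable name is an assumption; adjust it to your Rosetta build.
    """
    return [
        exe, "-s", pdb,
        "-relax:fast",
        "-relax:constrain_relax_to_start_coords",
        "-relax:coord_cst_stdev", str(cst_stdev),
    ]


if __name__ == "__main__":
    # "start.pdb" and "designed.pdb" are placeholder file names.
    print(" ".join(fixbb_command("start.pdb")))
    print(" ".join(relax_command("designed.pdb", cst_stdev=0.3)))
```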
RE :> Which fixbb design parameters are the most important?
This really depends on how aggressively you can sample - in general -packing:ex1 2, -packing:ex2 2 are sufficient, especially considering that the next step is relax.
RE :> Next, I want to relax the resulting designed structure. However, I want to avoid big changes in the backbone. Which parameters can I use to make the relaxation less 'aggressive'? Will -relax:fast be enough? Does the relax protocol support symmetry?
Fast relax should be enough - cycles of fast relax have been shown to reduce the energy more rapidly and deeply than multiple cycles of classic relax. You can compare the rmsd of the input to the output to evaluate if the structure is changing too much or too little. You can use -relax:constrain_relax_to_start_coords -native native.pdb to remain near the native structure - you can also turn up the weight on this constraint or use other distance/dihedral/coordinate constraints if you want a particular part of the structure to remain the same.
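For the RMSD check, a minimal sketch is given below. It assumes you have already extracted matching backbone atom coordinates (same atoms, same order) from the input and output structures, and that the two structures are in the same frame, since no superposition is done here. The function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    No superposition is performed; the structures are assumed to be aligned.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

If the relaxed structure has drifted as a rigid body, superimpose it on the input first (for example in a molecular viewer); otherwise this number overestimates the backbone change.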
RE :> Does the relax protocol support symmetry? Yes - it should work fine.
RE :> The relaxation step is supposed to allow for small backbone movements that resolve potential steric clashes (sophisticated alternative for -score:weights soft_rep_design).
This is a bad idea - the soft_rep_design weights and other soft_rep weights should not be used with relax - you will get strange artifacts such as helical un/overwrapping, atoms getting too close, and less favorable rama values.
RE :> Finally, when should the procedure be stopped? I am considering using the RMSD to the initial backbone structure (which should not be too big) together with the change in energy.
The average energy for a natural protein in Rosetta is -2.5 Rosetta Energy Units and the average energy for a Rosetta designed protein is -2.8. Once the energy change between rounds of design and relax is ~1.0 energy unit or less, the sequence and structure will not improve without significant backbone remodeling - such as loop remodel or refolding in Rosetta Ab Initio.
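The stopping rule can be sketched as follows. The function takes the total_score recorded after each design+relax round and reports after how many rounds the improvement first drops below the threshold. The function name and the example scores are hypothetical, made up only to show the logic.

```python
def rounds_to_run(total_scores, threshold=1.0):
    """Number of rounds completed when the energy gain first falls below threshold.

    total_scores: total_score after each design+relax round, in order.
    Returns len(total_scores) if the gain never falls below the threshold.
    """
    for i in range(1, len(total_scores)):
        improvement = total_scores[i - 1] - total_scores[i]  # positive = energy dropped
        if improvement < threshold:
            return i + 1
    return len(total_scores)


if __name__ == "__main__":
    # Made-up scores: gains of 12.0, 4.5, then about 0.6 REU.
    print(rounds_to_run([-250.0, -262.0, -266.5, -267.1, -267.2]))
```

A round that makes the energy worse also counts as "no improvement" here, since its gain is negative and therefore below the threshold.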
Hopefully this helps. Reply back if you have further questions.
Quick clarification. -relax:constrain_relax_to_start_coords will constrain the backbone heavy atoms to the coordinates in the input structure. It's -relax:constrain_relax_to_native_coords -native native.pdb to constrain the backbone heavy atoms to those in the native.pdb file.
Thank you for your comments and explanations. They are all very useful. I would like, however, to clarify a few things:
1) What does "10" in "-multi_cool_annealer 10" mean? Is there any general rule whether to use ndruns or multi_cool_annealer in order to achieve the "super-convergence effect"?
2) Do I understand correctly that fixbb with default parameters will ignore side-chains present in the input file? However, adding "-use_input_sc" will force fixbb to start the simulation from the sequence of the input file?
3) Does relax always act on a full-atom representation, or is it possible to relax a structure that comprises only the backbone?
4) Does energy (Rosetta Energy Units) correspond to the total energy ("total_score") divided by the number of residues?
1) Here's the documentation for multi_cool_annealer (from -help):
Alternate annealer for packing. Runs multiple quench cycles in a first cooling stage, and tracks the N best network states it observes. It then runs low-temperature rotamer substitutions with repeated quenching starting from each of these N best network states. 10 is recommended.
Basically "10" is a measure of how many different states it will juggle; 10 is an empirical optimum between time and power. Both ndruns and multi_cool_annealer should be used when you value convergence over time spent. For most cases, they are not necessary.
2) -use_input_sc has nothing to do with _sequence_, only with _rotamers_. If Rosetta's packing (fixbb) modifies a residue at all, it totally replaces the sidechain with a perfect rotamer from the Dunbrack library. It certainly doesn't ignore the identity of the sidechain, but the crystalline bond lengths and angles will get replaced, and if it's a rare rotamer then Rosetta may not be able to replace it with a similar ideal rotamer. -use_input_sc causes Rosetta to leave the crystalline rotamer as an allowed rotamer during packing, instead of only allowing freshly generated perfect rotamers.
The flag -packing:repack_only and the resfile commands NATRO/NATAA are used to prevent a sequence from changing; without one of those, Rosetta's packing will by default change the sequence.
3) It is not possible to relax a backbone-only-structure: no such structures exist from nature, so we don't have a scorefunction parameterized to do that. Centroid relax is possible (which uses a reduced sidechain representation).
4) Grant often talks about scores normalized by protein length (which makes lots of sense), but REU does not imply that the value has already been normalized. Rosetta's total_score is reported in REU and is never normalized by the number of residues internally.
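If you want the per-residue number yourself, you have to do the division after the fact. A trivial sketch (the function name and the example values are hypothetical):

```python
def score_per_residue(total_score, n_residues):
    """Length-normalized score: total_score (in REU) divided by residue count."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return total_score / n_residues


if __name__ == "__main__":
    # A made-up 100-residue protein with total_score -250.0 REU.
    print(score_per_residue(-250.0, 100))
```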
Thank you for the explanations.
In Grant's paper the simulation is stopped when the energy difference between design/relax rounds is lower than 1.0 REU.
Is this difference calculated from the non-normalized total_score of the relaxed structures from consecutive rounds?
I am almost certain it's the non-normalized total_score, because 1 REU / number of residues is actually quite a large energy gap. If Grant doesn't contradict me, consider that the answer.
Steven is correct - the 1 REU is not normalized - it is the difference in total score.