You are here

Cartesian-space refinement

6 posts / 0 new
Last post
Cartesian-space refinement
#1

Dear all,

I did the relax with a model from Robetta Server using below command: 
mpiexec -np 88 relax.mpi.linuxgccrelease \
-in:file:s robetta_model_1.pdb \
-in:ignore_unrecognized_res \
-packing:use_input_sc \
-relax:constrain_relax_to_start_coords \
-relax:coord_constrain_sidechains \
-relax:ramp_constraints false \
-relax:cartesian \
-relax:min_type lbfgs_armijo_nonmonotone \
-out:nstruct 86 \
-out:file:silent silent.out \
-out:file:silent_struct_type binary \
-beta_nov15_cart

But, each output model was relaxed into the same scores or geometries.
I am not sure if I used the correct command or not.

Thank you so much for your help and advice,

SJ

Post Situation: 
Wed, 2017-05-31 19:34
sjyang7

Usually if all the jobs come back with the same score, it's because there was a problem with the random number seeds, and all the processors accidentally did the exact same trajectory.  This is often the case when a non-mpi executable is run in MPI mode, but you appear to have used the right executable.

I assume you either don't have the log files or they are mpi-garbled by multiplexing.  I'd like to check that you got different random number seeds for each processor like you're supposed to.  The seed is reported on the first few lines of output.  (Unfortunately it might be before the mpi tracer splitting activates, so this might be for naught).  You can try re-doing a small run (like 5 structures) with the flag "-mpi_tracer_to_file proc", and it will dump log files separately to proc_0, proc_1, etc, which will let us compare how identical the logs really are.

Are the structures all NUMBERED separately in the silent file?  Are there the correct number of them?

Thu, 2017-06-01 07:28
smlewis

The reason you're getting all the same structures is because you're doing relax with relatively heavy constraints on all atoms: "-relax:constrain_relax_to_start_coords -relax:coord_constrain_sidechain -relax:ramp_constraints false". Throw on the "-relax:cartesian", and you've set up the energy function and sampling strategy such that there's a very clear global minimum, and the relax protocol can very efficiently find it. Generally speaking, if you're using the always-on all-heavy-atom constraint protocols ("-relax:constrain_relax_to_start_coords -relax:coord_constrain_sidechain -relax:ramp_constraints false"), you only really need a single output structure - there may be some variation in the result, but it's normally pretty consistent.

That's probably not what you want to do, anyway. The always-on all-heavyatom constrained relax protocol is really good for taking a structure that hasn't been optimized in Rosetta's energy function (e.g. direct from the PDB, or from something like a molecular dynamics run) and removing the small errors that would blow up the value of the Rosetta energy function without too much movement of the atoms.  You're already starting with a model that has been optimized in Rosetta (the output of Robetta), so using  this protocol probably won't gain you much.

You don't mention the reason for doing the relax, but I might suggest just doing an unconstrained relax (that is, just remove the "-relax:constrain_relax_to_start_coords -relax:coord_constrain_sidechain -relax:ramp_constraints false" flags).  If you get issues with the protein blowing up/falling apart, you might want to consider adding "-relax:constrain_relax_to_start_coords", but it's probably not worth adding the "-relax:coord_constrain_sidechain" or the "-relax:ramp_constraints false" flags if you're after model diversity.

On the other hand, if you just want to pre-relax the structure from whatever Robetta is using (which is probably not beta_nov15_cart) into the beta_nov15_cart energy function, I'd suggest forgoing the multiple output structures, and just using an -nstruct of 1 with the always-on all-heavyatom constrainted relax flags. It's hella convergent, so you don't actually need the oversampling like you do in most Rosetta runs.

Thu, 2017-06-01 08:39
rmoretti

Thanks for all your valuable comments!

Let me mention the reason for doing this relax.  
I'd like to calculate ∆∆G of mutation to predict candidate regions for improving the thermal stability of my protein.
To do this, I'm trying to run "cartesian_ddg application" using a method described in the recently published paper
(2017. Alford et al. J Chem Theory Comput. http://pubs.acs.org/doi/abs/10.1021/acs.jctc.7b00125?journalCode=jctcce).

As first step, my robetta model was subjected to the Cartesian-space refinement according to below their methods.   
relax.linuxgccrelease \
-s xxx.pdb \
-use_input_sc \
-constrain_relax_to_start_coords \
-ignore_unrecognized_res \
-nstruct 1000 \
-relax:coord_constrain_sidechains \ 
-relax:ramp_constraints false \
-relax:cartesian \ 
-relax:min_type lbfgs_armijo_nonmonotone

To apply the energy function described in the paper (Rosetta Energy Function 2015, REF15), 
I used "-beta_nov15_cart" or "-beta_nov16_cart" flag according to Gray's comments 
(http://biorxiv.org/content/early/2017/02/07/106054).  

# smlewi, I checked if my rosetta's mpi runs have any problems. 
No problems. I got different random number seeds for each processor used.
 
# rmoretti, I tried to run the relax without "-relax:cartesian" flag.
But, It generated output models with the exact same scores or trajectories.
There was no some variation in output models.

SJ

Thu, 2017-06-01 21:08
sjyang7

Yeah, an -nstruct of 1000 on those runs is overkill. I contacted the author, and there is a notation that when they did the runs they actually only saw a handful of distinct structures in those 1000 output structures. (That is, what they saw was consistent with what you're seeing. Most of the structures which come out of the protocol are all the same.) They certainly could have reduced that number, though they had enough CPU time that they didn't need to.

For what you're doing, you should be okay. It's a preparation step, and you don't actually need any diversity in the step. (You're going to just pick the lowest energy one, anyway.)  If you're interested in getting the absolute lowest energy structure you can keep the -nstruct 1000, in the odd case that one time out of 100 you happen to randomly find a lower energy path, but from my personal perspective that's probably not worth it, and if you want to reduce the computational load you can just knock that down to an -nstruct 1.

P.S. I'm not surprised that removing the  "-relax:cartesian" doesn't do much - the key thing that's limiting the output structures is having constraints on all atoms and not ramping them. The sampling method (cartesian versus torsional) doesn't contribute all that much, though doing cartesian minimization gives Rosetta the abiltity to sample more degrees of freedom, so it's less likely to be caught in local minima. But that's a minor issue - the main thing limiting the diversity of the output structures are the *heavy restraints* you're putting on the atom coordinates.

P.P.S -beta_nov15 (or -beta_nov15_cart) is giving you REF2015. -beta_nov16 (or -beta_nov16_cart)  is giving you a different energy function, which, while likely to be the successor to REF2015, is currently still experimental and is unnamed outside the "beta_nov16" designation. 

Conclusion Having all the outputs be the same is expected in this case, and not a problem. Pick one of the the lowest energy model and continue on with the protocol -- everything should be fine.

Fri, 2017-06-02 12:39
rmoretti

rmoretti,

Many thanks for all your helpful comments!

Sat, 2017-06-03 06:09
sjyang7