Modeling more than one protein loop at the same time

Hello everyone! I'm new to the Rosetta software and I apologize in advance if someone has already asked this question before (I looked through the forum but couldn't find such a topic).

I'm trying to refine a membrane protein structure. I obtained the crude structure from homology modeling with Modeller and then ran the mp_relax protocol of Rosetta Membrane (I used the available mp_relax files from the protocol capture folder of Rosetta 3.8, a distribution that was installed some time last August). Now I want to further refine the loop portions of my protein. For that I tried to use the files from the loop_modelling tutorial that came with this distribution, but I encountered a potential problem:

If I refine a single loop, the program seems to run fine and produces an output where the geometry of the loop has changed with respect to the initial structure. However, if I ask for more than one loop to be refined at the same time, the code seems to start working on one of the loops (see below the first loop that gets optimized)

      kinematic initial perturb with start_res: 299  middle res: 313  end_res: 328

and I start getting lists of warnings of the kind:

      core.kinematics.AtomTree: No proper DoF can be found for these four atoms: 44-1, 44-2, 44-3, 45-1!

where the four atoms belong to loops that are NOT undergoing the optimization at the moment. Once the code finishes with the first loop, it moves to the second one and then I get new warning messages for missing DoFs. However, this time, I don't have DoF messages for atoms from the very first loop AND from the second loop that is being currently optimized. This goes on and on, until I no longer get DoF messages during optimization of the last loop of the set.  Then I get another iteration of loop refinement, as ROSETTA is building a second model (I've asked for nstruct 5) starting from the very first loop, and the DoF messages appear again.

So, my question is: Can I trust the structures obtained through this multiple loop refinement modeling or are the DoF warnings an indicator that something went very wrong during the simulation? I couldn't find much information about this on the internet - I guess it implies a problem with the Atom Tree, but I have no idea how it can be fixed.

When I looked at the 5 generated structures I discovered that all of the loops have undergone change with respect to the initial structure. However, initially I had some highly structured loops (from the .pdb files on which I based my un-refined homology model), whose structure has been destroyed during this refining protocol and they have become highly disordered. I don't think this is realistic (these structured loops appear in the crystal structures of a number of bacterial homologues of this protein, so they are obviously stable enough to get resolved by X-ray diffraction). I would love to hear some opinions on this from users who are more experienced with ROSETTA's code and performance. Are there some options that I can explore in this case that would preserve the structure of the highly structured loops? Any keywords that can be of help?

Please let me know if there is any other information that you may need! Below are the contents of my rosetta flag file and the .loops file which has the list of loops that need to be optimized. I've used the loopmodel.linuxgccrelease executable on my office machine and I've also run a serial ROSETTA calculation (with nstruct 2000) with the loopmodel.mpi.linuxiccrelease (of ROSETTA 3.8) on one of the clusters my group is using. The serial calculation behaved in the same way.


My flag file looks like this:

-in:file:s c.1.0_z.pdb
-in:file:fullatom

-nstruct 5

-loops:loop_file NIS_refine_loop.loops
-loops:fast
-loops:max_kic_build_attempts 250
-loops:remodel perturb_kic
-loops:refine refine_kic

-ex1
-ex2
-use_input_sc

-out:overwrite
-out:file:fullatom
-out:path:all output_files
-out:file:scorefile NIS_vSGLT_42_loop_relax_scores_new.sc

 

And my NIS_refine_loop.loops file looks like this:

#subtract 10 since first 10 residues are missing
LOOP 24 45 0 0 0
LOOP 102 116 0 0 0
LOOP 150 153 0 0 0
LOOP 168 177 0 0 0
LOOP 299 328 0 0 0
LOOP 362 369 0 0 0
LOOP 392 405 0 0 0
LOOP 427 431 0 0 0

Mon, 2018-01-15 14:03
hzhekova

Rosetta loop remodeling using the loop_modelling application isn't able to do simultaneous loop optimization. What it does instead, as you've seen, is to randomly pick a loop, model that and then move on to the next (randomly selected) loop. All loops get modeled, just one after another rather than all at once.

The reason you get the DOF messages is likely due to the fact that you have (backbone) atoms which all share the same (all-zero?) coordinates in the unmodeled loops. Rosetta can't properly calculate phi/psi/etc. information for those residues, so it raises an error. This shouldn't affect the modeling of the loops which are being rebuilt, so you should be able to ignore those messages.
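One way to test that guess is to scan the input PDB for backbone atoms that sit on exactly the same coordinates. The following is only a minimal sketch, not part of Rosetta: the function name is made up for illustration, it assumes standard fixed-column ATOM records, and it only looks at the N, CA, C and O atoms.

```python
import sys
from collections import defaultdict

BACKBONE = {"N", "CA", "C", "O"}


def find_coincident_backbone_atoms(pdb_lines):
    """Group backbone atoms by coordinate and return only the groups in
    which more than one atom shares exactly the same (x, y, z)."""
    by_coord = defaultdict(list)
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        atom_name = line[12:16].strip()
        if atom_name not in BACKBONE:
            continue
        chain = line[21]
        resnum = int(line[22:26])
        # Fixed-width PDB coordinate columns: x 31-38, y 39-46, z 47-54.
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        by_coord[xyz].append((chain, resnum, atom_name))
    return {xyz: atoms for xyz, atoms in by_coord.items() if len(atoms) > 1}


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        clashes = find_coincident_backbone_atoms(handle)
    for xyz, atoms in sorted(clashes.items()):
        print(xyz, atoms)
```

Saved under any name you like and run with the input structure (c.1.0_z.pdb in the flag file above) as its only argument, it prints each shared coordinate together with the atoms found there. If the residues it lists fall in the loops named in the warnings, that would support the explanation above; if it prints nothing, the cause is probably something else.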

Regarding the structure in the loops, when you ask Rosetta to remodel a region of the protein, it will remodel it, more-or-less ignoring whatever structure already exists there. "Loop modeling" in Rosetta is something of a misnomer. The remodeling isn't restricted to true loop regions - anything you specify as a "loop" in the loop file is considered to be a loop to Rosetta, and will get remodeled as appropriate.
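To see how much of the protein that covers, it can help to tabulate the loop file before running anything. The sketch below is a hypothetical helper, not a Rosetta tool: it reads only the start and end columns of each LOOP line, treats everything after a "#" as a comment, and leaves the remaining columns (cutpoint, skip rate, extend) uninterpreted.

```python
from typing import List, NamedTuple


class Loop(NamedTuple):
    start: int
    end: int

    @property
    def length(self) -> int:
        # Both endpoints are included in the remodeled segment.
        return self.end - self.start + 1


def parse_loops(text: str) -> List[Loop]:
    """Read 'LOOP start end ...' lines; skip comments and blank lines."""
    loops = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] != "LOOP" or len(fields) < 3:
            continue
        loops.append(Loop(int(fields[1]), int(fields[2])))
    return loops


LOOPS_TEXT = """\
#subtract 10 since first 10 residues are missing
LOOP 24 45 0 0 0
LOOP 102 116 0 0 0
LOOP 150 153 0 0 0
LOOP 168 177 0 0 0
LOOP 299 328 0 0 0
LOOP 362 369 0 0 0
LOOP 392 405 0 0 0
LOOP 427 431 0 0 0
"""

if __name__ == "__main__":
    loops = parse_loops(LOOPS_TEXT)
    for loop in loops:
        print(f"{loop.start:4d}-{loop.end:<4d} {loop.length:3d} residues")
    print("total residues open to remodeling:", sum(l.length for l in loops))
```

Any segment in that list that you would rather keep as it is in the starting model is a candidate for dropping from the loop file, since every listed segment will be rebuilt.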

 

If you do have pre-existing structure in the loops you wish to preserve, my suggestion is to not use the loop_modeling application, but to look at the RosettaCM application instead. While originally intended for comparative modeling tasks, the RosettaCM protocol works very well for loop modeling applications as well -- you simply provide the starting structure as your "template" homolog. The RosettaCM protocol should work well in your situation, as it's able to grab structural information from your pre-existing loops, as well as rebuilding things where they don't match. -- And it shouldn't make a difference that your template "homologs" are from another homology modeling run -- in fact, there's been some success in iterating RosettaCM, feeding the output of one homology modeling run back in as "templates" for a second round of modeling.

There should be a demo for RosettaCM in the Rosetta/demos/tutorials/rosetta_cm/ directory.

Mon, 2018-01-15 15:50
rmoretti

Thank you so much for your reply and suggestions! I'll give the RosettaCM protocol a try.

Tue, 2018-01-16 10:31
hzhekova