modeling loops between domains


Hi,
I'd like to try to reproduce the interdomain linker prediction (Prot Sci 2007 16:165-175) on a two-domain protein we're working on. We don't have the full structure (yet) to back-check but do have NMR data to cross-validate any results. Regarding the published method, I wasn't sure how to model the linker while keeping the individual domains internally rigid but allowed to translate/rotate.
Using rosetta3, if I model the linker as a loop, the two domains stay fixed in space, greatly restricting the linker folding. If I use the ab initio routine, how do I enter the protein domains?
Thanks,
Dave

Thu, 2009-03-05 14:48
dhorita

Do you have a case where both domains are self-contained with a flexible linker between them, or a case where it's (Nterm.domainA-linker-domainB-Cterm.domainA)? I don't know of any way to do this from the command line, but I know places where you can hack up the code to get it to work, if you're comfortable with the C++.

Fri, 2009-03-06 09:14
smlewis

This is an N-terminal domain A - linker - C-terminal domain B case (so the former: two self-contained domains connected by a linker).
(I'm not much of a C++ programmer, but I'm OK with C and might be able to figure things out as long as it's not buried too deep in the OO parts.)

Dave

Fri, 2009-03-06 14:41
dhorita

The shortest path to getting the modeling you want should be in relax mode. What you want to do is relax the protein, without changing anything inside your domains. So, we'll tweak relax mode to ignore your domains and only relax your linker.

The executable we'll be working with is src/apps/public/relaxation/relax.cc. It offers two modes, Classic and FastRelax. I'll describe how to hack up Classic, but the same process would hold for FastRelax (just different lines to change). Your linker is presumably short, so you won't need the speed (Classic scales poorly with the number of residues).

ClassicRelax is a Mover; it lives in src/protocols/relax_protocols.* We need to modify the mover's private function set_default_move_map, which is implemented at src/protocols/relax_protocols.cc:139

The MoveMap is the data structure that controls what parts of the protein can move. So, we'll just tell it not to move your domains, but leave the middle intact.
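To make the idea concrete, here is a minimal sketch of what a per-residue move map does. ToyMoveMap and make_linker_only_map are hypothetical names made up for illustration; this is not the Rosetta MoveMap class, just a stand-in with the same set_bb/set_chi flavor, assuming 1-indexed residues and an inclusive linker range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a move map (NOT the Rosetta class): one backbone flag and
// one side-chain flag per residue, all switched off until explicitly enabled.
class ToyMoveMap {
public:
    // Slot 0 is unused so that residues can be addressed from 1.
    explicit ToyMoveMap(std::size_t nres)
        : bb_(nres + 1, false), chi_(nres + 1, false) {}

    void set_bb(std::size_t resid, bool setting) { bb_.at(resid) = setting; }
    void set_chi(std::size_t resid, bool setting) { chi_.at(resid) = setting; }
    bool get_bb(std::size_t resid) const { return bb_.at(resid); }
    bool get_chi(std::size_t resid) const { return chi_.at(resid); }

private:
    std::vector<bool> bb_;   // backbone torsions allowed to move?
    std::vector<bool> chi_;  // side-chain torsions allowed to move?
};

// Enable movement only for residues first..last (inclusive).
// Everything outside that range, i.e. both domains, keeps its flags off.
ToyMoveMap make_linker_only_map(std::size_t nres, std::size_t first, std::size_t last) {
    ToyMoveMap mm(nres);
    for (std::size_t i = first; i <= last; ++i) {
        mm.set_bb(i, true);
        mm.set_chi(i, true);
    }
    return mm;
}
```

For a 10-residue toy protein with a linker at residues 4 to 6, make_linker_only_map(10, 4, 6) leaves residues 1-3 and 7-10 frozen and frees only 4-6.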

set_default_move_map creates the default map (which has all freedoms turned off by default), then activates freedoms according to some options in the option system. We want to control freedoms based on linker position instead, so remove the three lines in the function that set things in the movemap (these three):

(I can't get this to post code correctly so this will be ugly, sorry).
~pp~
movemap_->set_jump( core::options::option[ core::options::OptionKeys::relax::jump_move ]() );
movemap_->set_bb( core::options::option[ core::options::OptionKeys::relax::bb_move ]() );
movemap_->set_chi( core::options::option[ core::options::OptionKeys::relax::chi_move ]() );

Replace it with these lines:

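for (int i = start_of_linker; i <= end_of_linker; ++i) {
    // free backbone and side-chain torsions for the linker residues only
    movemap_->set_bb(i, true);
    movemap_->set_chi(i, true);
}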
~/pp~
Where start_of_linker and end_of_linker are the resids of the first and last linker residues in your protein. If you have one case, just hard-code the numbers and don't worry about it; if you want this to work on multiple starting structures without recompiling, you can get them passed in as options. Note that you need Rosetta resids (indexed from 1 at the start of the protein, with no gaps), not the PDB residue numbers.
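If your PDB numbering has an offset or gaps, the conversion to sequential numbering can be sketched like this. sequential_index is a hypothetical helper written for this post, not a Rosetta function, and it assumes a single chain with no insertion codes and the residue numbers listed in file order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Rosetta): given the PDB residue numbers in
// file order, return the 1-based sequential index of pdb_resnum, or 0 if that
// residue number does not occur. Assumes one chain and no insertion codes.
std::size_t sequential_index(const std::vector<int>& pdb_resnums, int pdb_resnum) {
    for (std::size_t i = 0; i < pdb_resnums.size(); ++i) {
        if (pdb_resnums[i] == pdb_resnum) {
            return i + 1;  // sequential numbering starts at 1, not 0
        }
    }
    return 0;  // not found
}
```

For example, in a file whose residues are numbered 5, 6, 7, 10, 11 (a gap after 7), PDB residue 10 is sequential residue 4, so 4 is the number to hard-code.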

Disclaimers: A) This is off the top of my head, I haven't tested it. B) This is probably not the protocol from the paper you mention, but it should provide linker modeling. C) I'm not sure there isn't a commandline way to do it - I just don't know of one.

Mon, 2009-03-09 07:54
smlewis

Thanks! Absolutely beautiful.

I assume the lines to be added were along:

for (int i = start_of_linker; i less_than end_of_linker; i++) {
movemap_->set_bb(i, true);
movemap_->set_chi(i, true);

}

(this thing certainly doesn't like less_thans)

I do get this:

end: Not in while/foreach.

when it stops, although a molecule has been calculated.

If it's not too difficult, how do I go about passing start/end residues numbers as a flag (or for that matter whether to do limited relax or not so I don't need two versions...)?

Thanks,
Dave

Tue, 2009-03-10 11:55
dhorita

"end: not in while/foreach"

This sounds like an error from tcsh or a shell script; it's certainly not from Rosetta. Are you running from inside a script?

Hacking in the options system:

We should probably add this to the "how to write a protocol" spiel in the manual. This will be a little vague, but I can be more specific if you have trouble. Anyway, the option system lives in src/core/options/options_rosetta.py. Read that file's header material and options.py in the same folder.

You can add globally accessible options by adding entries within options_rosetta.py. I'd suggest either sticking a new one into an obvious option group like "run", or making a new group named dhorita or whatever (this is slightly harder). Make three options: a boolean that switches the limited relax on or off, one for the start of the flexible region, and one for its end. After adding new things to options_rosetta.py, run options.py in the same folder. options.py is a script that generates the C++ code for the option system from the giant-huge data structure in options_rosetta.py.

Now that your options exist, you need to shoehorn them into relax in the movemap function. Go find a line of code that looks like ~pp~ core::options::option[ core::options::OptionKeys::somenamespaces::anoption ].value() ~/pp~ This is how the global options object is accessed. I'd suggest just searching relax_protocols.cc for the word option; one is bound to be there somewhere. Use lines like this within the movemap-creation function to query the command line for what options were passed in. You know the decision logic already...
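In case it helps, the decision logic could look roughly like the sketch below. It is a standalone toy, not Rosetta code: LinkerOptions and movable_residues are hypothetical names, the three fields stand in for whatever values you read out of the option system, and the "switch off" branch simply frees every residue where the real function would fall back to its existing option-driven behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the three values read from the option system.
struct LinkerOptions {
    bool limit_to_linker;  // boolean switch: limited relax or the usual relax
    std::size_t start;     // first linker residue (sequential numbering)
    std::size_t end;       // last linker residue (sequential numbering)
};

// Return per-residue "may move" flags; index 0 is unused and always false.
std::vector<bool> movable_residues(std::size_t nres, const LinkerOptions& opts) {
    if (!opts.limit_to_linker) {
        // Switch off: simplified here as "every residue may move".
        std::vector<bool> all(nres + 1, true);
        all[0] = false;
        return all;
    }
    // Switch on: start with everything frozen, then free the linker only.
    std::vector<bool> movable(nres + 1, false);
    const std::size_t first = std::max<std::size_t>(opts.start, 1);
    for (std::size_t i = first; i <= opts.end && i <= nres; ++i) {
        movable[i] = true;
    }
    return movable;
}
```

With one boolean deciding between the two branches, a single executable covers both the limited and the usual relax, so there is no need to keep two versions around.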

If you don't want to add new options, there are guaranteed to be "unused" options ripe for the picking already in the options system that you could just access (skipping the first part). This leads to a maintenance nightmare trying to figure out what the repurposed options do, but it's faster if you're uncomfortable messing with options_rosetta.py. For example, the entire AnchoredDesign option group is unused in the rosetta3.0 release (the code that uses it is unreleased), so you could just "abuse" its pre-existing options by checking their values from relax mode. At that point the hassle is remembering that "AnchoredDesign::refine_only" secretly means "relax executable limited movemap"...

Wed, 2009-03-11 14:31
smlewis

hi, all,
this topic sounds interesting.
I also need to model a linker between two domains. The linker is a poly-Gly stretch about 20 aa long, of the kind usually used between the light-chain and heavy-chain domains of a single-chain antibody. I'd like to know whether smlewis's idea would also work for my case.

Sat, 2010-03-20 23:20
jarod