Ligand conformations file in docking

All,

I have a multi-ligand file, NBU_confs.pdb, of ligand conformers (291 total) and the line "PDB_ROTAMERS NBU_confs.pdb" at the end of the params file, and the pdb (coordinates) of one of the ligand conformers at the end of the input receptor file for docking. My questions:

1. Does Rosetta use the other conformers in NBU_confs.pdb? If so, why does it need the pdb of just one conformer at the end of the input receptor file?
2. If I run on several CPUs, each with -nstruct 1000, using the same input receptor file, params, options etc., would the runs on the second, third, fourth etc. processors just repeat the run on the first one, or is it a random process so that the decoys from all the runs will be different? Would it be better to split NBU_confs.pdb into smaller files and use each one individually for each run?
3. Is it advisable to dock to a rigid receptor with Rosetta? Could you point me to options or XML files for this?

Thanks.

Fri, 2013-07-12 09:21
rosa

1. When repacking the structure, Rosetta will consider the other ligand conformations in the PDB_ROTAMERS file. It might not end up using any of them, but it will consider them. These ligand conformers are used for their *internal* coordinates only; their global translation and rotation are ignored. Rosetta therefore needs a starting position for the ligand, both to tell it that the ligand should be included in the structure and to tell it where in the protein the ligand is located. (The ligand conformers are aligned on the "neighbor" atom of the ligand and the atoms around it. Usually this is an atom near the center of the ligand, but you can control this by passing parameters to molfile_to_params.)

2. Each separate process should produce independent output structures, even when called with the same flags - assuming you haven't used the -constant_seed flag. By default the random number generator is initialized to a different start position (from the OS entropy pool, not the start time), so multiple runs should be independent. Some have reported seed collisions, though, so if you're concerned, you can explicitly set the seed with "-constant_seed -jran {seed}" where {seed} is a different integer for each process.
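If you do want explicit seeds, a small helper can build one command line per process. This is only a sketch: the executable name and the "@flags" argument in the usage lines are placeholders for whatever you already run, and the starting seed value is arbitrary. The only Rosetta flags it adds are the -constant_seed and -jran flags mentioned above.

```python
def seeded_commands(base_args, n_processes, first_seed=1000001):
    """Return one argument list per process, each with a distinct -jran seed.

    base_args is the command you already use (executable plus flags);
    first_seed is an arbitrary starting integer chosen for illustration.
    """
    commands = []
    for i in range(n_processes):
        seed = first_seed + i  # any scheme works as long as the seeds differ
        commands.append(list(base_args) + ["-constant_seed", "-jran", str(seed)])
    return commands


if __name__ == "__main__":
    # Hypothetical executable name and flags file, for illustration only.
    base = ["ligand_dock_executable", "@flags", "-nstruct", "1000"]
    for cmd in seeded_commands(base, 4):
        print(" ".join(cmd))
```

The sketch does not deal with output naming, so each process should still write to its own directory or output file to avoid overwriting the others.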

*Don't* separate out the PDB_ROTAMERS file. These conformations aren't handled serially; they're all considered as a group during packing. 29100 runs with a single 291-conformer PDB_ROTAMERS file will be of substantially different (and better) character than 100 runs with each of 291 different 1-conformer PDB_ROTAMERS files.

3. I'm not quite understanding what you mean by rigid receptor. The ligand docking application (https://www.rosettacommons.org/manuals/archive/rosetta3.5_user_guide/d4/...) has options to control how much freedom you give to the protein, most notably -docking:ligand:minimize_backbone which controls whether you allow the backbone to minimize in context of the ligand. RosettaScripts docking allows for much finer control, but how you would change the script depends on what you're hoping to achieve.

Fri, 2013-07-12 11:27
rmoretti

Hi rmoretti,

Thanks for the explanation.

1. hasn't sunk in yet
2. very clear
3. By rigid I mean no movement on the protein backbone or side chains: no packing or repacking or minimizing. Only the ligand is allowed to be flexible. Would it work to simply set the option to 0 Angstrom?

-docking:ligand:minimize_backbone 0.0

Thanks.

Fri, 2013-07-12 15:48
rosa

-docking:ligand:minimize_backbone is a Boolean option. You'd want to use "-docking:ligand:minimize_backbone false". (Though false is the default, so just omitting the flag would work.) That would only prevent backbone movement. Sidechain movement would still be permitted. The way to stop that is to use a resfile (https://www.rosettacommons.org/manuals/archive/rosetta3.5_user_guide/d1/...) to prevent repacking (NATRO) on the protein chain.
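As a rough illustration of the resfile idea, the sketch below writes one NATRO line per protein residue after the "start" line. The residue numbers and chain ID are made-up inputs, and the helper name is hypothetical. It assumes the common "<residue number> <chain> <command>" body format; check the resfile documentation for your Rosetta version, and check how the docking application combines the resfile with its own packing setup, before relying on it.

```python
def natro_resfile(residue_numbers, chain="A"):
    """Build resfile text that marks each listed residue as NATRO.

    residue_numbers: PDB residue numbers of the protein chain (illustrative input).
    chain: PDB chain ID of the protein (assumed to be a single chain here).
    """
    lines = ["start"]  # the per-residue body follows the 'start' line
    lines += [f"{num} {chain} NATRO" for num in residue_numbers]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: a hypothetical protein chain A with residues 1 through 5.
    print(natro_resfile(range(1, 6), "A"), end="")
```

Only the protein residues are listed, so the ligand residue is left out of the NATRO lines on purpose.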

The resfile approach applies only if you're using the standalone ligand docking application. If you're doing ligand docking through RosettaScripts the control is different. For that you'd have to edit the XML file. For example, if you're using the XML file rosetta_demos/public/dock_ligand_and_proteins/Part2/ligand_dock.xml as an example, I believe it would be sufficient to reduce the size of the cutoffs in the LIGAND_AREAS section to something small which wouldn't pick up any additional residues (e.g. something like 0.5 Å). There are probably better ways to do it if you want to rewrite things totally, but that's a quick way to prohibit all protein movement.

Regarding 1., it might help to have an understanding of the packer. The way Rosetta does sidechain rotamer search is with Monte Carlo simulated annealing. Oversimplifying, what happens is that the packer randomly picks a position, and then randomly picks a rotamer that's possible at that position. It then looks at the energy, and based on how much better/worse the rotamer makes the energy (the "Metropolis criterion") it accepts or rejects the rotamer. Over time the criterion for accepting rotamers gets more stringent, so you quench/anneal the protein into a low energy state - hopefully the global minimum. (There are more complications, but that's the 30,000 ft. overview.)

The ligand is treated exactly the same as protein residues in this respect. As the packer randomly goes through the positions, it has a chance to pick the ligand position. If it does pick the ligand, it then randomly picks one of the PDB_ROTAMERS. If there's only one rotamer, there isn't much of a choice, and you'll always get that ligand rotamer. If there are multiple rotamers available, the Metropolis criterion means that over the course of the packing run, the packer will converge on one of the best ones.
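To make the overview concrete, here is a toy version of that loop. It is not Rosetta's packer: each position just has a list of made-up rotamer energies with no interactions between positions, and the cooling schedule and step count are arbitrary choices. The ligand would simply be one more position whose list holds the energies of its conformers.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng):
    """Always accept downhill moves; accept uphill moves with probability exp(-dE/T)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def toy_pack(energies_by_position, n_steps=5000, t_start=10.0, t_end=0.1, seed=0):
    """Anneal a rotamer choice per position; returns the chosen rotamer indices.

    energies_by_position[p][r] is a made-up energy for rotamer r at position p.
    """
    rng = random.Random(seed)
    current = [rng.randrange(len(rots)) for rots in energies_by_position]
    for step in range(n_steps):
        # Geometric cooling from t_start to t_end: acceptance gets stricter over time.
        frac = step / max(n_steps - 1, 1)
        temperature = t_start * (t_end / t_start) ** frac
        pos = rng.randrange(len(energies_by_position))          # pick a random position
        new_rot = rng.randrange(len(energies_by_position[pos]))  # pick a random rotamer there
        delta = energies_by_position[pos][new_rot] - energies_by_position[pos][current[pos]]
        if metropolis_accept(delta, temperature, rng):
            current[pos] = new_rot
    return current


if __name__ == "__main__":
    # Two "protein" positions plus a last position standing in for the ligand conformers.
    energies = [[50.0, 0.0, 80.0], [0.0, 60.0], [90.0, 70.0, 0.0, 55.0]]
    print(toy_pack(energies))
```

With energy gaps this large the run settles on the lowest-energy rotamer at every position. In the real packer the energies depend on the neighboring rotamers as well, which is why the search is stochastic in the first place.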

So suppose you have a ligand rotamer set where 80+% of the ligand rotamers wouldn't work in the context of the protein. If you split the rotamers out into separate runs, those 80+% of runs will give you bad results. In contrast, having a single run with all the rotamers means that the packer will choose the rotamers which work in context, and you'll effectively "ignore" the 80+% of the rotamers which wouldn't work. (Because it will try them, see that they're bad, reject them and go on to the next position/rotamer.)
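The same argument in numbers, as a deliberately crude sketch: the conformer energies are invented (roughly 80% "bad" at +100, the rest "good" at -5), and the pooled run is modeled as a simple quench that keeps the lowest-energy conformer it happens to propose, which is a simplification of the annealing described above.

```python
import random


def split_vs_pooled(conformer_energies, good_cutoff=0.0, n_trials=2000, seed=0):
    """Return (fraction of good split runs, 1.0 or 0.0 for the pooled run).

    Split: each run has exactly one conformer, so it is stuck with that energy.
    Pooled: the run proposes random conformers and keeps the lowest energy seen.
    """
    rng = random.Random(seed)
    n = len(conformer_energies)
    split_good = sum(e <= good_cutoff for e in conformer_energies) / n
    best = conformer_energies[rng.randrange(n)]
    for _ in range(n_trials):
        trial = conformer_energies[rng.randrange(n)]
        if trial < best:  # zero-temperature acceptance: only keep improvements
            best = trial
    pooled_good = 1.0 if best <= good_cutoff else 0.0
    return split_good, pooled_good


if __name__ == "__main__":
    # Invented energies for 291 conformers: 233 clash with the protein, 58 fit.
    energies = [100.0] * 233 + [-5.0] * 58
    split_good, pooled_good = split_vs_pooled(energies)
    print(f"split runs ending on a good conformer: {split_good:.0%}")
    print(f"pooled run ending on a good conformer: {pooled_good:.0%}")
```

In this toy setup only about one split run in five can ever end well, while the pooled run finds a workable conformer.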

Mon, 2013-07-15 11:54
rmoretti

Hi rmoretti,

That makes a lot of sense. Thanks for the detailed explanation.

Thank you.

Tue, 2013-07-16 16:27
rosa