
Output different favored solutions with PackRotamersMover design

Hi,

I'm trying to use PackRotamersMover to mutate a series of mutable residues to a predefined set of residue types. Here is the core part of the code:

task = TaskFactory.create_packer_task(mutant_pose)
for i in range(1, mutant_pose.total_residue() + 1):
    if i in mutation_positions:
        # only perform design for the mutable residues
        # only mutate to specific amino acid types, like ["A", "K", "F", "S", "T", "K", "I"]
        # index 0 is unused so the list can be read with 1-based indices below
        aa_allow = [False] * 21
        for name in ["A", "K", "F", "S", "T", "K", "I"]:
            aa = aa_from_oneletter_code(name)
            aa_allow[aa] = True
        aa_bool = rosetta.utility.vector1_bool()
        for j in range(1, 21):
            aa_bool.append(aa_allow[j])
        # set packing options
        task.nonconst_residue_task(i).or_ex1(True)
        task.nonconst_residue_task(i).or_ex1_sample_level(ExtraRotSample.EX_FOUR_HALF_STEP_STDDEVS)
        task.nonconst_residue_task(i).or_ex2(True)
        task.nonconst_residue_task(i).or_ex2_sample_level(ExtraRotSample.EX_TWO_HALF_STEP_STDDEVS)
        task.nonconst_residue_task(i).or_ex3(True)
        task.nonconst_residue_task(i).or_ex3_sample_level(ExtraRotSample.EX_TWO_FULL_STEP_STDDEVS)
        task.nonconst_residue_task(i).or_ex4(True)
        task.nonconst_residue_task(i).or_ex4_sample_level(ExtraRotSample.EX_TWO_FULL_STEP_STDDEVS)
        task.nonconst_residue_task(i).or_include_current(True)
        task.nonconst_residue_task(i).restrict_absent_canonical_aas(aa_bool)
    else:
        # freeze the other residues
        task.nonconst_residue_task(i).prevent_repacking()
# start to design
pack_mover = PackRotamersMover(scorefxn, task, 10)
pack_mover.apply(mutant_pose)

My problems are:

1. I have run the design several times with the same set of allowed amino acid types, e.g. ["A", "K", "F", "S", "T", "K", "I"]. I expected multiple structures with different mutated sequences, but in ten runs the output had the same sequence composition every time. That does not seem reasonable to me: all of the mutable residues were selected from experimental single-mutation results, so more than one combination should be energetically favorable. Since PackRotamersMover is a stochastic method based on simulated annealing (SA), why does it always give the same result for a given input?

2. Compared with other rotamer assembly methods such as dead-end elimination (DEE), which is claimed to find the global energy minimum, how accurate is Rosetta? Can it find the global energy minimum as well?

3. What should I do if I want PackRotamersMover to output more design solutions that are energetically favorable?

These may be quite basic questions, but I would appreciate any suggestions.

xfliu

Fri, 2012-07-13 06:23
xfliu

Sorry, the indentation of the Python code was eaten by the forum...

Fri, 2012-07-13 06:25
xfliu

You can attach it as a text file to preserve whitespace.

Fri, 2012-07-13 06:32
smlewis

Thank you, Lewis. I know you are quite familiar with the Rosetta side-chain rotamer packing mechanism. Any suggestions? Is there any similar problem with the C++ codes? I have not used fixbb, but the manual says you can specify more than one solutions to be output. I looked through the fixbb design code but found nothing new, except an outer loop around the PackRotamersMover... I did a similar thing, but the same solution came out every time...

Fri, 2012-07-13 06:42
xfliu

"Is there any similar problem with the C++ codes?"

Yes - we usually fix it by redefining the problem in some way, by enumerating (as in my post below), or by changing the temperature scheduler (again, as in the post below).

"I have not used fixbb, but the manual says you can specify more than one solutions to be output."

Can you point me to the specific section of the manual you are referring to? This almost certainly refers to -nstruct, which causes multiple outputs but has no influence on whether those outputs are distinct.

Fri, 2012-07-13 06:45
smlewis

Your PackerTask appears to be set up correctly. It should not repack most positions, and will design the mutable positions to any of 7 residues. That gives 7^(# residues) possible sequences. Look for a PackerTask output function (there's << in C++; I don't know the Python equivalent, but you may be able to do "print task") and look at the results for debugging help (or paste them for me here).

It is not surprising to me that it converges. 7^(#res) may not be very many sequences, so it is exhaustively searchable. Monte Carlo is not guaranteed to exhaustively search, and it is not guaranteed to find a true global minimum like DEE, but in general it does a very good job. For this scale of problem I think Rosetta is finding the true minimum.
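To put a number on "not very many sequences", here is a small arithmetic sketch. The position counts are made-up examples, not values from the original post. Note that the list as posted contains "K" twice, so it holds six distinct residue types, and for that exact list the count would be 6^(#res) rather than 7^(#res).

```python
# Allowed residue types exactly as listed in the original post ("K" appears twice).
allowed = ["A", "K", "F", "S", "T", "K", "I"]
distinct = sorted(set(allowed))  # duplicates do not add new sequences


def sequence_space_size(n_positions, n_types=len(distinct)):
    """Number of sequences when each position independently takes one of n_types."""
    return n_types ** n_positions


# Hypothetical numbers of designable positions, for illustration only.
for n in (2, 4, 6):
    print(n, "positions ->", sequence_space_size(n), "sequences")
```

Even at six designable positions this is tens of thousands of sequences, which is small enough that a well-behaved annealer landing on the same minimum every time is plausible.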

This is especially true if your mutable positions are not neighbors. You _prevent repacking_ at all non-design positions, freezing their sidechains. This means that isolated designable positions will have fixed best residues, regardless of the other mutations that might get made; this hugely reduces the complexity of the problem and leads to faster convergence. So, the first thing to do is replace "prevent repacking" with "restrict to repacking" - which allows sidechain movement but not design.
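A minimal sketch of that change, written as a helper so it does not depend on how the task was built. The function name is made up for illustration; `nonconst_residue_task(i)` is the call already used in the original snippet, and `restrict_to_repacking()` is assumed to be the residue-level method corresponding to "restrict to repacking". It is meant to take the place of the `prevent_repacking()` branch, on a task where those positions have not already been frozen.

```python
def allow_repacking_at_fixed_positions(task, n_residues, mutation_positions):
    """Let every non-design position repack (new rotamers, same amino acid)."""
    design = set(mutation_positions)
    for i in range(1, n_residues + 1):  # Rosetta residue numbering starts at 1
        if i not in design:
            # used instead of prevent_repacking(): side chain may move, identity may not
            task.nonconst_residue_task(i).restrict_to_repacking()
    return task
```

With the names from the original post, the call would look like `allow_repacking_at_fixed_positions(task, mutant_pose.total_residue(), mutation_positions)`. Expect packing to take longer, since every residue now contributes rotamers.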

To get more variety out of the packer, there are a few choices. If your number of mutable positions is small, I would just enumerate the sequences and pack all of them to get all their energies. It will be seconds per sequence, so you can afford thousands of sequences easily. Your second option is to increase the Monte Carlo temperature used in packing. See if you can find a ModifyAnnealer TaskOperation, which lets you modify the simulated annealer temperature scheduler.
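The enumeration option can be sketched in plain Python as below. This is only an outline under stated assumptions: `score_sequence` is a hypothetical callback that you would write yourself (for example: copy the pose, allow exactly one residue type at each design position, pack, and return the score), and `toy_score` and the positions 12 and 45 are placeholders so the sketch runs without Rosetta.

```python
from itertools import product


def enumerate_and_rank(positions, allowed, score_sequence):
    """Score every combination of allowed residue types and sort by score, lowest first."""
    types = sorted(set(allowed))  # drop duplicates such as the repeated "K"
    results = []
    for combo in product(types, repeat=len(positions)):
        mutations = dict(zip(positions, combo))  # e.g. {12: "A", 45: "K"}
        results.append((score_sequence(mutations), mutations))
    results.sort(key=lambda item: item[0])
    return results


def toy_score(mutations):
    # Placeholder only: stands in for "pack this sequence and return its energy".
    return sum(ord(aa) for aa in mutations.values())


# Hypothetical design positions; the residue list is the one from the original post.
ranked = enumerate_and_rank([12, 45], ["A", "K", "F", "S", "T", "K", "I"], toy_score)
for score, mutations in ranked[:5]:
    print(score, mutations)
```

Because every sequence gets its own score, the full ranked list shows directly how close the second- and third-best combinations are to the one the packer keeps returning.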

Fri, 2012-07-13 06:41
smlewis