Loop modeling script - unwanted change of amino acid identities

When running the Loop Modeling script on my protein of interest, the identity of the amino acids in the loop changes when it generates the full atom model from the centroid model. In other words, in the centroid model the amino acids are equivalent to the protein sequence from the original PDB, but they are "mutated" in the FA model (often to lots of G's). Is there a way to prevent this from happening?

Fri, 2014-11-21 08:42

To prevent it from happening, you need to ensure that your repacking steps do not also include design. By default, the Rosetta packer is allowed to design (change residue identities) whenever it repacks; this is for code consistency reasons, although it seems strange from the outside. Find where your script specifies the repacking to be done, and make sure that the PackerTask or TaskFactory is restricted to repacking only.

The default script already does this:

task_pack = TaskFactory.create_packer_task(starting_p)
task_pack.restrict_to_repacking() # prevents design, packing only

Fri, 2014-11-21 09:06

Thank you. I am using the default script with those steps included, yet it still happens on the centroid to full atom switch. Are there additional lines that I can add to the default script to prevent this?

Tue, 2014-12-02 15:29

I can't find anything wrong with the script that would cause this. It may be something else entirely. Can you post your input PDB as a txt and the command-line you used to run the script?

Wed, 2014-12-03 08:56

Basically, the crystal structure of my protein is missing a loop. I used the build function of PyMOL to string together the appropriate amino acids. Then, I used PyRosetta to close the loop. I managed to get this to work, but noticed the high frequency of mutations that prompted this thread.

The starting PDB and two output PDBs are too big to upload. Is there another way to get them to you? The command line was: --pdb_filename 4cwu.allresi.pdb --loop_begin 130 --loop_end 162 --frag_filename 4CWUtrunc.frag9 --frag_length 9 --jobs 20 --job_output 9fragfromextended --outer_cycles_low 10 --inner_cycles_low 20 --init_temp_low 2. --final_temp_low .8 --outer_cycles_high 10 --inner_cycles_high 20 --init_temp_high 2. --final_temp_high .8 --loop_cutpoint 160

I have also done this with the fragment length 3 file with similar results.

Thu, 2014-12-04 14:32

You mentioned that this happened on the switch from centroid to fullatom. Is there something in particular which led you to conclude that?

One way to help debug it is to simplify the protocol and figure out which step is giving you the mutational issue. Looking at the script, I'd first comment out the 15 lines or so from "to_fullatom.apply(p)" through pymov.send_energy(p). Run with the simplified protocol, and see if you still get the issue. If not, you can add the commented-out lines back one at a time. The critical ones are likely going to be to_fullatom.apply(p), recover_sidechains.apply(p), pack.apply(p), and loop_refine.apply(p). (Unless something is really wrong, the pymov.apply(p) calls and the other lines are highly unlikely to change the pose.) Figuring out which line is introducing the mutations will go a long way toward debugging the issue.
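As an alternative to commenting lines in and out by hand, the same bisection can be sketched as a small helper that applies the steps one at a time and stops at the first one that changes the sequence. This is only a sketch: first_sequence_change is a hypothetical name, and it assumes that each step object has an apply(pose) method and that the pose exposes its one-letter sequence through sequence(), as a PyRosetta Pose does.

```python
def first_sequence_change(pose, steps, get_sequence=lambda p: p.sequence()):
    """Apply (label, mover) pairs in order and report the first step that
    changes the pose sequence.

    Returns (label, diffs), where diffs is a list of
    (position, old_letter, new_letter) with 1-based positions,
    or None if no step changed the sequence.
    """
    before = get_sequence(pose)
    for label, mover in steps:
        mover.apply(pose)
        after = get_sequence(pose)
        if after != before:
            # zip() stops at the shorter string, so a length change is
            # reported only through the positions both sequences share.
            diffs = [(i + 1, old, new)
                     for i, (old, new) in enumerate(zip(before, after))
                     if old != new]
            return label, diffs
        before = after
    return None
```

In the loop modeling script, the steps list would presumably look like [("to_fullatom", to_fullatom), ("recover_sidechains", recover_sidechains), ("pack", pack), ("loop_refine", loop_refine)], called with p right after the centroid stage finishes.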

Final note: when you say that the residues are mutated to glycine, that means they're actually annotated as "GLY" in the output PDB, right? Or is it rather the case that the coordinates for the sidechain atoms are missing for some reason, but the three letter code is still what it's supposed to be?
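To tell those two cases apart without opening the files by eye, something like the following could be run on the input and output PDBs. It is a minimal sketch that uses only the fixed-column layout of ATOM records; read_residues and check_loop are hypothetical helper names, and it assumes both files use the same chain IDs and residue numbering for the loop.

```python
BACKBONE = {"N", "CA", "C", "O", "OXT"}

def is_hydrogen(atom_name):
    # Covers both "HB2" and old-style "2HB" hydrogen names.
    return atom_name.lstrip("0123456789").startswith("H")

def read_residues(pdb_text):
    """Map (chain, resseq, icode) -> (resname, set of atom names)."""
    residues = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        key = (line[21], int(line[22:26]), line[26:27].strip())
        resname = line[17:20].strip()
        atom_name = line[12:16].strip()
        residues.setdefault(key, (resname, set()))[1].add(atom_name)
    return residues

def check_loop(ref_text, out_text, start, end):
    """Report loop residues that were renamed or lost their side chain."""
    ref = read_residues(ref_text)
    out = read_residues(out_text)
    report = {}
    for key in sorted(ref):
        if not start <= key[1] <= end:
            continue
        ref_name = ref[key][0]
        if key not in out:
            report[key] = "missing from output"
            continue
        out_name, atoms = out[key]
        side_chain = {a for a in atoms
                      if a not in BACKBONE and not is_hydrogen(a)}
        if out_name != ref_name:
            report[key] = "renamed %s->%s" % (ref_name, out_name)
        elif ref_name != "GLY" and not side_chain:
            report[key] = "side chain atoms missing"
    return report
```

For the run described above, ref_text would be the contents of 4cwu.allresi.pdb, out_text the contents of one of the output PDBs, and start and end would be 130 and 162, provided the numbers in the PDB files match the --loop_begin and --loop_end values.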

Fri, 2014-12-05 00:47