I want to model some loops and the N-terminus of a protein.
The N-terminus is 45 residues long and I know the positions of residues 14-20.
So I have created a homology model of my protein with these residues in the correct position, and now I want to use Rosetta to improve the loops and the parts of the N-terminus before and after these 6 residues that I know.
But when I run the simulation, exactly these 6 residues are missing and I can't explain why. I have tried many things, in particular every possibility with the loop file, but nothing changed. And yes, the residues are really there in the pdb file.
What have I done wrong, and how can I solve this problem?
I also have an additional question:
What is the meaning of the third number in the loop file? The manual says it is the "Cut point residue number". Could someone explain to me what is meant by this? I couldn't find any further information.
Thank you very much for your answers!!!
My input file and my loop file are like this:
-loops:input_pdb mc4.pdb \
-loops:loop_file mc4.loop_file \
-loops:frag_sizes 9 3 1 \
-loops:frag_files mc4_09 mc4_03 none \
-loops:remodel quick_ccd \
-loops:refine refine_kic \
-loops:relax fastrelax \
-nstruct 10 \
-out:prefix mc4 \

LOOP 1 13 13 0 0
LOOP 20 44 0 0 0
LOOP 104 118 0 0 0
LOOP 187 190 0 0 0
LOOP 262 274 0 0 0
"And yes, the residues are really there in the pdb file. "
Did you check that the occupancy column is nonzero, and that all residues have all four backbone heavy atoms (N, CA, C, O) defined? What happens if you run it through score_jd2 with -out:pdb active (does the result PDB have all the residues or not)? Post the PDB if none of those help fix it and I'll take a look.
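If it helps, the two checks above can be scripted. This is only a rough sketch in Python, not a Rosetta tool: the function name `check_residues` is made up for illustration, and it assumes standard fixed-width ATOM records (atom name in columns 13-16, chain in column 22, residue number in columns 23-26, occupancy in columns 55-60).

```python
from collections import defaultdict

BACKBONE = {"N", "CA", "C", "O"}

def check_residues(pdb_lines):
    """Return (missing_backbone, zero_occupancy) for the ATOM records given.

    missing_backbone maps (chain, residue number) to the backbone atom
    names that were not found; zero_occupancy lists the residues that
    contain at least one atom with occupancy 0.00.
    """
    atoms = defaultdict(set)
    zero_occ = set()
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        key = (line[21], int(line[22:26]))       # chain ID, residue number
        atoms[key].add(line[12:16].strip())      # atom name
        if float(line[54:60]) == 0.0:            # occupancy field
            zero_occ.add(key)
    missing = {key: sorted(BACKBONE - names)
               for key, names in atoms.items() if BACKBONE - names}
    return missing, sorted(zero_occ)
```

You could call it as `check_residues(open("mc4.pdb"))` and look at which residues come back in either result.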
"Cut point residue number"
For KIC-style loop modeling, this is irrelevant and does nothing. For CCD-style, it's where the loop will break during the pre-re-closure diversification stage. Have you read the papers on how CCD works? (CCD in general, http://www.ncbi.nlm.nih.gov/pubmed/12717019, CCD in Rosetta, http://www.ncbi.nlm.nih.gov/pubmed/17825317)
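To make the column order concrete, here is a small Python sketch that names the fields of a loop-file line. It is not part of Rosetta; `Loop` and `parse_loop_line` are hypothetical names. The field order (start, end, cut point, skip rate, extend flag) is my reading of the loop-file documentation, and the handling of the extend flag (any value other than 0 counts as true) is an assumption.

```python
from typing import NamedTuple

class Loop(NamedTuple):
    start: int        # first residue of the loop
    end: int          # last residue of the loop
    cut: int          # cut point residue number (the "third number")
    skip_rate: float  # how often this loop is skipped
    extended: bool    # whether the starting conformation is discarded

def parse_loop_line(line):
    """Parse one 'LOOP start end cut skip_rate extend' line."""
    fields = line.split()
    if len(fields) < 3 or fields[0] != "LOOP":
        raise ValueError(f"not a LOOP line: {line!r}")
    start, end = int(fields[1]), int(fields[2])
    cut = int(fields[3]) if len(fields) > 3 else 0
    skip_rate = float(fields[4]) if len(fields) > 4 else 0.0
    extended = len(fields) > 5 and fields[5] != "0"   # assumption, see above
    return Loop(start, end, cut, skip_rate, extended)

print(parse_loop_line("LOOP 1 13 13 0 0"))
print(parse_loop_line("LOOP 20 44 0 0 0"))
```

In your file, only the first loop sets an explicit cut point (13); the others give 0, which as far as I know leaves the choice to Rosetta.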
Also, I'm not really sure loop modeling does termini - I think quick_ccd will for your perturbation stage, but I know KIC cannot for the refinement.
I still have to check this: "What happens if you run it through score_jd2 with -out:pdb active (does the result PDB have all the residues or not)? "
But yes, the occupancy column is nonzero, and this is what I found in the manual:
"quick_ccd can also remodel termini"
I have uploaded the pdb as a txt file because I saw that pdb files can't be uploaded. strange?
"I have uploaded the pdb as a txt file because I saw that pdb files can't be uploaded. strange?"
This is off-the-shelf forum software; it blocks all extensions it doesn't recognize. Renaming to .txt is correct.
Residues 14-19 have occupancies of zero. Change the 0.00 in the occupancy field (columns 55-60 of each ATOM record, immediately before the B-factor) to a nonzero value such as 1.00.
Oh, yeah, there's a flag for it too: "-ignore_zero_occupancy false" should have the same effect as fixing the input PDB.
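If you would rather fix the input PDB than use the flag, something like the following Python sketch should do it. It is an illustration only: `fix_zero_occupancy` is a made-up helper, it assumes standard fixed-width records with the occupancy in columns 55-60, and it will raise an error on a record whose occupancy field is blank.

```python
def fix_zero_occupancy(pdb_lines, new_occupancy=1.0):
    """Yield PDB lines with zero occupancies replaced by new_occupancy.

    Only the six characters of the occupancy field (columns 55-60) are
    rewritten, so the B-factor that follows it is left untouched.
    """
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 60:
            if float(line[54:60]) == 0.0:
                line = line[:54] + f"{new_occupancy:6.2f}" + line[60:]
        yield line
```

For example, `open("mc4_fixed.pdb", "w").writelines(fix_zero_occupancy(open("mc4.pdb")))` would write a corrected copy; the output file name is just a placeholder.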
Thank you very much! I thought the value was 0.00xx, so not zero. Apparently I was mistaken.
I'm now running the calculation and will see what happens!
What has happened is that you have triple-digit B-factors in the next column. PDBs are column-indexed, not whitespace-delimited. This means that the absence of a space between two columns does not mean that there isn't a column break. So, for atom #101, the string "0.00146.25" means "occupancy of 0.00, B-factor of 146.25". Most PDBs have B-factors below 100 at most positions, so there's usually a visible space marking the column break there.
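A short Python sketch shows the difference between splitting on whitespace and reading by column. The ATOM line is built by a made-up helper (`make_atom_line`) with placeholder coordinates and residue name, so only the occupancy and B-factor values correspond to your file; the column positions assume the standard fixed-width layout (occupancy in columns 55-60, B-factor in columns 61-66).

```python
def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z, occ, bfac):
    """Format a fixed-width ATOM record (hypothetical helper for this demo)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{bfac:6.2f}")

def occupancy_and_bfactor(line):
    """Read occupancy and B-factor by column position, not by whitespace."""
    return float(line[54:60]), float(line[60:66])

# Placeholder atom with occupancy 0.00 and a triple-digit B-factor.
line = make_atom_line(101, "CA", "ALA", "A", 14, 1.0, 2.0, 3.0, 0.00, 146.25)

print(line.split()[-1])             # 0.00146.25  (two fields fused into one token)
print(occupancy_and_bfactor(line))  # (0.0, 146.25)
```

Any script or spreadsheet that splits PDB lines on whitespace will misread such records in the same way, which is why reading by column is the safer habit.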