Rosetta_cm Partial Threading Removing Parts of Sequence


Hello,

I am currently trying to build a homology model of the PAR1 receptor using the P2Y12 receptor as my template structure, but parts of my sequence, namely the loop regions, appear to be altered or removed.

For example, the original PAR1 sequence is as follows:

>sp|P25116|PAR1_HUMAN Proteinase-activated receptor 1 OS=Homo sapiens OX=9606 GN=F2R PE=1 SV=2
MGPRRLLLVAACFSLCGPLLSARTRARRPESKATNATLDPRSFLLRNPNDKYEPFWEDEE
KNESGLTEYRLVSINKSSPLQKQLPAFISEDASGYLTSSWLTLFVPSVYTGVFVVSLPLN
IMAIVVFILKMKVKKPAVVYMLHLATADVLFVSVLPFKISYYFSGSDWQFGSELCRFVTA
AFYCNMYASILLMTVISIDRFLAVVYPMQSLSWRTLGRASFTCLAIWALAIAGVVPLLLK
EQTIQVPGLNITTCHDVLNETLLEGYYAYYFSAFSAVFFFVPLIISTVCYVSIIRCLSSS
AVANRSKKSRALFLSAAVFCIFIICFGPTNVLLIAHYSFLSHTSTTEAAYFAYLLCVCVS
SISCCIDPLIYYYASSECQRYVYSILCCKESSDPSSYNSSGQLMASKMDTCSSNLNNSIY
KKLLT

Post-threading, my sequence is this:

>PAR1_on_P2Y12.A
ASGYLTSSWLTLFVPSVYTGVFVVSLPLNIMAIVVFILKMKKKPAVVYMLHLATADVLFV
SVLPFKISYYFSDWQFGSELCRFVTAAFYCNMYASILLMTVISIDRFLAVVYPRTLGRAS
FTCLAIWALAIAGVVPLLLKEQTIQVGLNITTVLNETLLEGYYAYYFSAFSAVFFFVPL
IISTVCYVSIIRCLSSSAVANRSKKSRALFLSAAVFCIFIICFGPTNVLLIAHYSFLSHT
STTEAAYFAYLLCVCVSSISCCIDPLIYYYASSEC

I first noticed this in the ECL2 region, where TCHDV in the original sequence becomes TV after threading. I was trying to incorporate the disulfide bond (as I have in three previous successful modeling runs of other receptors) when I noticed that the loop cysteine wasn't present in the post-threaded model. I haven't run into this issue before, and I haven't changed any settings from the previous successful runs except for PDB file names where needed.

Attachments:
P2Y12_cleaned.pdb (182.02 KB)
post-threaded model (380.31 KB)
grishin alignment (952 bytes)
Thu, 2019-06-20 12:30
gszwabowski

The threading application assumes that all the relevant residues are present in your grishin alignment. That is, the template sequence should correspond to exactly those residues which are present in the template PDB structure. (This may differ from the sequence that the wwPDB entry lists for the PDB.)
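One way to check this before threading is to read the residues straight from the template PDB and compare them with the template row of the alignment. The following is only a minimal sketch: `sequence_from_pdb` and `template_row_matches_pdb` are hypothetical helper names (not part of Rosetta), and it assumes standard fixed-column ATOM records, one CA atom per residue, and only the 20 standard amino acids.

```python
# Hypothetical helpers for illustration; not part of Rosetta.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_from_pdb(pdb_lines, chain="A"):
    """Return the one-letter sequence of residues that have a CA atom."""
    seq = []
    seen = set()
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        # Fixed PDB columns: atom name 13-16, residue name 18-20,
        # chain ID 22, residue number 23-26, insertion code 27.
        if line[21:22] != chain or line[12:16].strip() != "CA":
            continue
        key = line[22:27]  # residue number plus insertion code
        if key in seen:
            continue  # skip alternate locations of the same residue
        seen.add(key)
        seq.append(THREE_TO_ONE.get(line[17:20], "X"))  # X = unrecognised
    return "".join(seq)


def template_row_matches_pdb(template_row, pdb_lines, chain="A"):
    """True if the ungapped template row equals the sequence in the PDB."""
    return template_row.replace("-", "") == sequence_from_pdb(pdb_lines, chain)
```

If the check returns False, the template row in the alignment probably still contains residues that are missing from the structure (or the other way around), and it should be edited to match the PDB.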

Likewise, the target sequence in your alignment should be the full target sequence you want to model, including those residues which should be present in your desired output model but which don't correspond to anything in the template structure. (Simply de-align those residues, i.e. align them against '-' gaps.) If you want the sequence TCHDV to be present in your output structure, those residues need to be present in the alignment (even if they're only aligned against gaps).

However, just because they're present in the target sequence doesn't mean they'll be present in the threaded structure. RosettaCM threading will only output the aligned positions. The unaligned positions will only be there implicitly, as a gap in numbering. The RosettaCM Hybridize mover then uses these fragmentary templates to construct a full-length model, filling in missing residues with loop remodeling.

Given your alignment, you do have the CHD residues present, but they're aligned against gaps. They won't be present as coordinates in the threaded file, but they should be implicitly present as the three-residue gap in numbering between T253 and V257. But note that because the residues flanking the gap are aligned to adjacent residues in the template file, the threaded structure will show T253 and V257 as being right next to each other. It's only after the Hybridize mover that the three missing residues will be inserted at that location.
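To see in advance which target residues will be absent from the threaded file, you can walk the two aligned rows and collect the target positions that sit against '-' in the template row. This is a rough sketch only: `unaligned_target_residues` is a hypothetical helper, and the template row in the usage example is a placeholder (all 'A'), not the real P2Y12 sequence.

```python
def unaligned_target_residues(target_row, template_row, first_resnum=1):
    """List (residue number, residue) for target residues aligned to gaps.

    Both rows must be the gapped strings from the alignment and have the
    same length. first_resnum is the target numbering of the first
    non-gap character in target_row.
    """
    if len(target_row) != len(template_row):
        raise ValueError("aligned rows must have the same length")
    missing = []
    resnum = first_resnum - 1
    for target_res, template_res in zip(target_row, template_row):
        if target_res == "-":
            continue  # gap in the target: no target residue at this column
        resnum += 1
        if template_res == "-":
            # No template coordinates to copy, so threading skips it.
            missing.append((resnum, target_res))
    return missing


# Fragment of PAR1 starting at N250; the template row is a placeholder.
print(unaligned_target_residues("NITTCHDVLNETLL", "AAAA---AAAAAAA", 250))
```

For this fragment the result is C254, H255 and D256, i.e. the residues that only reappear once Hybridize rebuilds the loop.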

(Note, though, that because the Hybridize mover will be inserting three residues there, you want to set up your alignment file such that the inserted residues are in a location, e.g. a loop, which will accommodate the insertion. Having an insertion in something like a secondary structure element you want to preserve isn't great, and it's often better to shift your alignment such that the insertion (or deletion) happens elsewhere, even if this theoretically worsens a purely sequence-based alignment score.)
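A quick way to spot problematic insertions is to compare the gap positions in the template row against a per-residue secondary-structure string for the template. The sketch below is an illustration under assumptions: `insertions_in_secondary_structure` is a hypothetical helper, and the secondary-structure string (one character per template residue, 'H' for helix, 'E' for strand, anything else for loop) is something you would have to supply yourself, e.g. from a secondary-structure assignment program.

```python
def insertions_in_secondary_structure(template_row, template_ss):
    """Flag insertions whose flanking template residues share H or E.

    template_row: gapped template row from the alignment.
    template_ss:  one character per ungapped template residue.
    Returns a list of (alignment column, insertion length, ss type).
    """
    ungapped_len = sum(c != "-" for c in template_row)
    if ungapped_len != len(template_ss):
        raise ValueError("template_ss must cover every template residue")
    flagged = []
    consumed = 0  # template residues seen so far
    col = 0
    while col < len(template_row):
        if template_row[col] != "-":
            consumed += 1
            col += 1
            continue
        start = col
        while col < len(template_row) and template_row[col] == "-":
            col += 1
        # Only internal insertions have a template residue on both sides.
        if 0 < consumed < len(template_ss):
            before = template_ss[consumed - 1]
            after = template_ss[consumed]
            if before == after and before in "HE":
                flagged.append((start, col - start, before))
    return flagged
```

An empty list means every insertion has at least one loop residue next to it; any flagged entry is a candidate for shifting the gap toward the nearest loop.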

Mon, 2019-06-24 13:43
rmoretti