Rebuilding the structure from unknown residues

Hi,

I am trying to rebuild a structure that contains unknown residues (UNK). I have the corresponding EM map and the sequence, but the PDB file is missing some residues and the rest are labeled "UNK". I am trying to build an all-atom model using Rosetta, but Rosetta does not accept undefined residues. Any suggestions are welcome.

Thanks

Sushree

Category:

Structure prediction

Post Situation:

Unsolved

Mon, 2016-06-27 09:32

sushreet

If you know the sequence, replace the UNKs in the PDB file with the correct residue names. If you don't know the sequence, I guess you could make it all alanine or something. I don't fully understand the question.

Mon, 2016-06-27 09:35

smlewis

Rosetta needs to model *something*. There isn't really an "unknown" residue type, because any sort of modeling you do will make assumptions about what the residue properties are.

As Steven says, if you know what the sequence should be (e.g. from other sequencing results), you should put that in. Rosetta electron density fitting takes something of a "best fit" approach, and will try to match up the experimental density with the actual sequence. Unlike some other density fitting programs, Rosetta isn't really befuddled by having extra atoms in the structure. You're not going to mess up your fit by having atoms present which aren't well represented by the density. Rosetta is built to accommodate the presence of atoms, residues and even loops which are poorly represented in the density data. It will fit the density the best it can, and if there are atoms for which there isn't density, it will model them with the standard Rosetta structure prediction energy function.

If you don't know what the sequence is for some reason, you can just pick a semi-arbitrary amino acid to use at that position. Alanine is a fine choice, as it's somewhat neutral. Valine is also a good one for positions you know are hydrophobic, as it's a good middle-sized hydrophobic. (Poly-valine is typically used in Rosetta de novo structure design projects for this reason.) For a hydrophilic/surface position, something like serine might be a good choice. Basically, you're trying to do a best-guess matchup between the amino acid used and the likely properties of the real amino acid at that position.
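
As a rough illustration of that rule of thumb, here is a minimal Python sketch. The environment labels ("buried", "surface") and the function name are made up for this example and are not anything Rosetta defines; only the residue choices come from the paragraph above.

```python
# Hypothetical environment labels; the residue choices follow the rule of
# thumb above (valine for hydrophobic/buried, serine for surface).
PLACEHOLDER_BY_ENVIRONMENT = {
    "buried": "VAL",
    "surface": "SER",
}

# Alanine is the fallback when nothing is known about a position.
DEFAULT_PLACEHOLDER = "ALA"


def placeholder_residues(environments):
    """Return one three-letter placeholder residue name per position."""
    return [
        PLACEHOLDER_BY_ENVIRONMENT.get(env, DEFAULT_PLACEHOLDER)
        for env in environments
    ]
```

Any position whose label is missing or unrecognized falls back to alanine, so the list can be refined as you learn more about each position.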

You also might want to do something like an iterative approach. Do an initial run with a simple sidechain (poly-ALA/poly-VAL), then look at the density/surface exposure at each UNK position and adjust the identity of the amino acid to match, refining the fit with the new amino acid identities. Do this a couple of times based on your best guess at the identities at each position.

If you're looking at an unknown sequence length in addition to unknown identity, things get harder. But you can do multiple runs, each with a different loop length, and pick the best. You can also take an iterative approach, starting with a mid-length loop, and then doing density-guided loop remodeling to extend/shorten the loop, as appropriate.

Mon, 2016-06-27 10:25

rmoretti

Thank you for your suggestions. To clarify my question, here is residue 14 from the PDB file as an example:

ATOM  52801  N   UNK Y  14     104.651   2.058 -96.784  1.00 30.00           N
ATOM  52802  CA  UNK Y  14     105.993   2.712 -96.646  1.00 30.00           C
ATOM  52803  C   UNK Y  14     105.883   4.238 -96.676  1.00 30.00           C
ATOM  52804  O   UNK Y  14     105.099   4.790 -97.453  1.00 30.00           O

and as per the sequence, it should be Valine.

So, as I understand it, simply replacing each UNK in the PDB file with the corresponding residue from the sequence file should work.

Thanks again.

Mon, 2016-06-27 10:51

sushreet

(Reply to #5)

Right. Take your favorite text editor and change the "UNK" to "VAL". As long as the backbone heavy atoms are present, Rosetta should be able to take it from there.
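
If there are too many UNK records to fix by hand, the same edit can be scripted. The following is a minimal Python sketch, not a Rosetta tool. It assumes a standard fixed-column PDB file (residue name in columns 18-20, chain ID in column 22, residue number in columns 23-26) and assumes the PDB residue numbering lines up with your sequence, so that the first letter of the sequence corresponds to residue number `first_resnum`. The function name and its parameters are hypothetical.

```python
# One-letter to three-letter codes for the 20 standard amino acids.
ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}


def rename_unk(pdb_lines, chain_id, sequence, first_resnum=1):
    """Return a copy of pdb_lines with UNK on one chain renamed from sequence.

    Assumption: sequence[0] corresponds to residue number first_resnum.
    Lines that are not UNK atom records on chain_id are returned unchanged.
    """
    renamed = []
    for line in pdb_lines:
        is_atom_record = line.startswith(("ATOM", "HETATM")) and len(line) >= 26
        if is_atom_record and line[17:20] == "UNK" and line[21] == chain_id:
            index = int(line[22:26]) - first_resnum
            # Only rename residues that fall inside the supplied sequence.
            if 0 <= index < len(sequence):
                line = line[:17] + ONE_TO_THREE[sequence[index]] + line[20:]
        renamed.append(line)
    return renamed
```

For the residue above you would call something like `rename_unk(lines, "Y", chain_y_sequence)`, where `chain_y_sequence` is the one-letter sequence of chain Y with valine at position 14. Check the numbering against your sequence before trusting the output, since the file is missing some residues and an offset would silently assign the wrong identities.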

Tue, 2016-07-12 09:35

rmoretti
