first try at relaxing a structure

I've just set up Rosetta and am trying to run it for the first time. I get a lot of warnings that look like this:
core.io.pdb.file_data: [ WARNING ] discarding 1 atoms at position 1 in file 8cp0 apo.pdb. Best match rsd_type: LEU_p:NtermProteinFull
core.io.pdb.file_data: [ WARNING ] discarding 1 atoms at position 2 in file 8cp0 apo.pdb. Best match rsd_type: TYR
core.io.pdb.file_data: [ WARNING ] discarding 1 atoms at position 3 in file 8cp0 apo.pdb. Best match rsd_type: GLY
core.io.pdb.file_data: [ WARNING ] discarding 1 atoms at position 4 in file 8cp0 apo.pdb. Best match rsd_type: ILE
core.io.pdb.file_data: [ WARNING ] discarding 1 atoms at position 5 in file 8cp0 apo.pdb. Best match rsd_type: LEU
core.io.pdb.file_data: [ WARNING ] discarding 2 atoms at position 7 in file 8cp0 apo.pdb. Best match rsd_type: GLN
core.io.pdb.file_data: [ WARNING ] discarding 1 atoms at position 8 in file 8cp0 apo.pdb. Best match rsd_type: GLY

When I view the structure with the viewer, I can't find anything wrong with these residues. I've only done a spot check, and they seem to be recognized correctly, but it does concern me. In a search, I found someone with similar warnings in a loop refinement, who was told his output looked OK. What do these warnings mean?

How long might a relaxation take? I've got a homodimeric protein, and I have to refine the units together because there's a very flexible loop at the dimeric interface which, in the absence of the other unit, can assume conformations that prevent dimerization. Each chain is 258 residues long, & I have a linux workstation with an Intel(R) Xeon(R) CPU E5410 @ 2.33GHz. I stopped the job after 12-15 hours yesterday. It had iterated through Stage 1 multiple times & the best energy seemed to be oscillating considerably. Might this take days? Weeks?

Thanks!

Wed, 2011-01-05 11:09
einew

Unrecognized atom warnings are common and usually unimportant. They mean there are atoms in the PDB file that do not match the expected atoms defined in the residue parameter files. You can compare the atom names in the PDB file to the atom names in the minirosetta_database/chemical/residue_type_sets/fa_standard/residue_types/l_caa/*.params to see where there is an atom name mismatch (spaces count). I would guess it's either the backbone alpha hydrogen or a virtual atom. You will also get these warnings when reading fullatom PDBs into a centroid mode pose and vice versa.
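If checking by eye gets tedious, something like the following Python sketch might help with the comparison described above. It only relies on the fixed-column PDB layout (atom name in columns 13-16, residue name in 18-20, chain in 22, residue number in 23-26) and keeps the spaces in each atom name, since spaces count. The function names and the `expected` dictionary are my own hypothetical helpers, not part of Rosetta; you would have to fill `expected` yourself from the atom names in the .params files.

```python
from collections import defaultdict


def atom_names_by_residue(pdb_text):
    """Map (chain, res_seq, res_name) -> list of 4-character atom names."""
    residues = defaultdict(list)
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            name = line[12:16]          # columns 13-16, spaces preserved
            res_name = line[17:20]      # columns 18-20
            chain = line[21]            # column 22
            res_seq = int(line[22:26])  # columns 23-26
            residues[(chain, res_seq, res_name)].append(name)
    return dict(residues)


def unexpected_atoms(residues, expected):
    """Return atoms whose names are absent from expected[res_name].

    `expected` is a hypothetical, user-supplied dict mapping a residue
    name to the set of 4-character atom names allowed for it. Residues
    with no entry in `expected` are skipped rather than reported.
    """
    report = {}
    for key, names in residues.items():
        allowed = expected.get(key[2])
        if allowed is None:
            continue
        extra = [n for n in names if n not in allowed]
        if extra:
            report[key] = extra
    return report
```

To use it, read the PDB file into a string, pass it to `atom_names_by_residue`, and hand the result to `unexpected_atoms` together with your own `expected` sets. Any residue that shows up in the report has atom names that differ from what you listed, which should line up with the positions in the warnings if the mismatch really is a naming issue.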

I don't know the exact numbers (I never relax things) but I think relax mode is not suggested for proteins larger than 100 residues. I know its runtime scales poorly with protein length, so 200+ is going to be very slow. This is the sort of thing someone really ought to put into the relax documentation...

Wed, 2011-01-05 11:27
smlewis

You can also try -relax:fast. I think they're making it default in 3.2...

Wed, 2011-01-05 12:40
smlewis

I forgot to mention that I'm using -relax:fast. It's good to know Rosetta isn't a particularly viable option for this protein. I've been using Schrodinger Prime, but selecting a region around each residue & tweaking the endpoints until the stresses relax out isn't fun. It takes about a week of careful tending to do it. And although there are scripts to identify stresses, they don't correlate all that well with MolProbity criteria.

Thanks!

Wed, 2011-01-05 12:55
einew

After the previous reply, I asked on the mailing list for Modeller9v8 about refining Modeller-built structures. Two members replied that they'd had good luck with Rosetta relax on proteins even larger than mine. So I tried it with -relax:fast. I started the job on a 2.33GHz linux box just after lunch on a Friday and it finished sometime Saturday night while I was asleep. The output has a very good MolProbity score of 1.93. I'm happy.

Sun, 2011-01-09 10:10
einew