Problem with HETATM entries in PDB file

Hello

I am using the FARNA module for RNA de novo structure prediction. Some of the PDB files contain HETATM entries like this:

HETATM 191 P CH A 10 6.803 -7.485 -7.145 1.00 0.00 P
HETATM 192 O1P CH A 10 7.379 -8.656 -7.843 1.00 0.00 O
HETATM 193 O2P CH A 10 6.942 -6.135 -7.736 1.00 0.00 O
HETATM 194 O5* CH A 10 5.232 -7.767 -6.901 1.00 0.00 O
HETATM 195 C5* CH A 10 4.807 -8.876 -6.102 1.00 0.00 C
HETATM 196 C4* CH A 10 3.369 -8.743 -5.629 1.00 0.00 C
HETATM 197 O4* CH A 10 3.063 -9.889 -4.848 1.00 0.00 O
HETATM 198 C3* CH A 10 2.991 -7.513 -4.743 1.00 0.00 C
HETATM 199 O3* CH A 10 2.701 -6.346 -5.598 1.00 0.00 O
HETATM 200 C2* CH A 10 1.810 -8.141 -4.023 1.00 0.00 C
HETATM 201 O2* CH A 10 0.787 -8.181 -4.959 1.00 0.00 O
HETATM 202 C1* CH A 10 2.369 -9.432 -3.706 1.00 0.00 C
HETATM 203 N1 CH A 10 3.191 -9.513 -2.509 1.00 0.00 N
HETATM 204 C2 CH A 10 2.537 -9.246 -1.312 1.00 0.00 C
HETATM 205 O2 CH A 10 1.442 -8.667 -1.287 1.00 0.00

Running the executables on such a file gives an error message like:
ERROR: unrecognized aa CH
ERROR:: Exit from: src/core/io/pdb/file_data.cc line: 476

Do I have to remove the HETATM entries from the files? In case the format above is not clear, I have attached the corresponding PDB file.

Attachment: 1kpy.txt (1.66 KB)
Fri, 2011-05-20 10:17
asmi

Rosetta's actual complaint is about the residue code field (CH), which it does not recognize, not about HETATM versus ATOM. As a general rule, you should convert HETATM to ATOM anyway.

What is a CH supposed to be? I assume it's a modification of one of the four standard bases? If you remove the CH lines, it won't fail on them, but you'll have gaps. If you convert it to the closest residue type, then it may have an atom or two wrong but will be mostly correct. If you really want CH, whatever it is, you can create a parameters file for it.

Is 1KPY from the documentation/demo, or the one from the PDB you want to use? Nucleic acid formatting in the PDB is poorly standardized; Rosetta only takes the input in one format, so you may have to use scripts to tweak things (especially the residue code fields) to get Rosetta to recognize it.
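
For what it's worth, here is a minimal sketch of the kind of script I mean, in Python. It assumes the real file uses the standard fixed-column PDB layout (record name in columns 1-6, residue name in columns 18-20), which the pasted text above does not preserve. The function names and the rename mapping are made up for illustration; the replacement name has to be whatever your Rosetta version actually expects, so check the sample files rather than trusting my placeholder.

```python
def normalize_pdb_line(line, rename=None):
    """Turn a HETATM record into ATOM and optionally rename its residue.

    Assumes fixed-column PDB layout: record name in columns 1-6 and
    residue name in columns 18-20. Other record types pass through.
    Replacement names are assumed to be at most 3 characters long.
    """
    rename = rename or {}
    if not line.startswith(("ATOM  ", "HETATM")):
        return line
    body = line.rstrip("\r\n")
    ending = line[len(body):]          # keep the original line ending
    body = body.ljust(20)              # guard against truncated lines
    resname = body[17:20].strip()
    new_resname = rename.get(resname, resname).rjust(3)
    return "ATOM  " + body[6:17] + new_resname + body[20:] + ending


def normalize_pdb_file(src_path, dst_path, rename=None):
    """Apply normalize_pdb_line to every line of src_path, writing dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(normalize_pdb_line(line, rename))
```

You would call it as something like normalize_pdb_file("1kpy.txt", "1kpy_fixed.pdb", {"CH": "C"}), where "C" is only a placeholder for the closest standard residue name. This only relabels the lines; any atoms that the target residue type does not have (for example the extra proton) would still need to be handled separately.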

Fri, 2011-05-20 10:49
smlewis

Yes, CH represents a protonated cytosine base, and the PDB file given is the one which I intend to use. It seems to me that for RNA, one has to do mass modifications in the PDB file to make them compatible according to Rosetta.

Fri, 2011-05-20 11:16
asmi

"It seems to me that for RNA, one has to do mass modifications in the PDB file to make them compatible according to Rosetta."

Pretty much, yes. I can help you troubleshoot, but as a general rule you'll need to make the names in your PDB match those in the sample files (in the demo, or whatever) and in the database parameter files (rosetta_database/chemical/residue_type_sets/rna/residue_types).
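
A quick way to see what needs renaming before Rosetta trips over it is to list every residue name in the file that is not in the set you expect. This is only a sketch in Python: the function name is hypothetical, the allowed set is something you fill in yourself from the sample files and parameter files, and it again assumes fixed-column PDB lines with the residue name in columns 18-20.

```python
from collections import Counter


def unknown_residue_names(lines, allowed):
    """Count ATOM/HETATM lines whose residue name is not in `allowed`.

    `lines` is any iterable of PDB lines; `allowed` is a set of residue
    names you have checked against your own Rosetta installation.
    """
    counts = Counter()
    for line in lines:
        if line.startswith(("ATOM  ", "HETATM")):
            name = line[17:20].strip()   # residue name, columns 18-20
            if name not in allowed:
                counts[name] += 1
    return dict(counts)
```

Passing it the lines of your file (for example via open(path)) and your allowed set gives a dictionary mapping each unexpected residue name to the number of atom lines that use it.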

Fri, 2011-05-20 11:23
smlewis