
ERROR: Unable to fill in missing atoms.


Could you please help with the following issue (I am using rosetta3.12/rosetta_bin_linux_2020.08.61146_bundle):

core.conformation.Residue: [ ERROR ] Cannot build coordinates for residue TGT at position 111: missing too many atoms.
core.conformation.Residue: [ ERROR ] Missing atoms are: C11 N1 C20 C9 C10 C17 N3 N2 C13 C2 N4 C19 S1 C14 C3 C15 C4 C18 C16 C7 C5 C12 C1 C6 C8 H1 H9 H16 H17 H18 H19 H20 H6 H7 H8 H10 H11 H12 H13 H14 H15 H4 H2 H3 H5

File: src/core/conformation/Residue.cc:1365
[ ERROR ] UtilityExitException
ERROR: Unable to fill in missing atoms.

Steps to reproduce:

 

1) I am using a PDB structure containing two protein chains and two warheads. Small molecule residues are named TGT and LIG.

 

2) I wanted to follow the docking protocol from the Bai et al. paper (doi: 10.1101/2020.05.27.119347). My PDB contains two proteins oriented towards each other (a medium-quality pose when compared with the bound complex).

 

3) I started by saving the small molecules as mol2 files using Maestro and then ran molfile_to_params.py (as described at https://www.rosettacommons.org/demos/latest/tutorials/prepare_ligand/prepare_ligand_tutorial):

/rosetta3.12/rosetta_bin_linux_2020.08.61146_bundle/main/source/scripts/python/public/molfile_to_params.py -n TGT -p TGT --conformers-in-one-file TGT.mol2

/rosetta3.12/rosetta_bin_linux_2020.08.61146_bundle/main/source/scripts/python/public/molfile_to_params.py -n LIG -p LIG --conformers-in-one-file LIG.mol2

This step has already generated some warnings like:

WARNING: fragment 1 has 65 total atoms including H; protein residues have 7 - 24 (DNA: 33)
WARNING: fragment 1 has 33 non-H atoms; protein residues have 4 - 14 (DNA: 22)
WARNING: fragment 1 has 11 rotatable bonds; protein residues have 0 - 4
Average 65.0 atoms (33.0 non-H atoms) per fragment
(Proteins average 15.5 atoms (7.8 non-H atoms) per residue)
WARNING: no root atom specified, using NBR atom instead.

 

4) Then I ran the commands as in Bai et al. (I changed the -partners parameter to A_B):

/rosetta3.12/rosetta_bin_linux_2020.08.61146_bundle/main/source/bin/docking_prepack_protocol.static.linuxgccrelease -s rosetta_input.pdb -use_input_sc -extra_res_fa TGT.params LIG.params

/rosetta3.12/rosetta_bin_linux_2020.08.61146_bundle/main/source/bin/docking_protocol.static.linuxgccrelease -s rosetta_input_0001.pdb -nstruct 100 -use_input_sc -spin -dock_pert 5 20 -partners A_B -ex1 -ex2aro -extra_res_fa TGT.params LIG.params -out:file:scorefile score.sc -score:docking_interface_score 1

The prepack step already shows many warnings like:

core.io.pose_from_sfr.PoseFromSFRBuilder: [ WARNING ] discarding 1 atoms at position 1 in file rosetta_input.pdb. Best match rsd_type: LYS:NtermProteinFull

and docking_protocol, as mentioned above, shows the following for every pose it tries (apart from many "Rigid Body Perturbation Rejected" messages):

core.conformation.Residue: [ ERROR ] Cannot build coordinates for residue TGT at position 111: missing too many atoms.
core.conformation.Residue: [ ERROR ] Missing atoms are: C11 N1 C20 C9 C10 C17 N3 N2 C13 C2 N4 C19 S1 C14 C3 C15 C4 C18 C16 C7 C5 C12 C1 C6 C8 H1 H9 H16 H17 H18 H19 H20 H6 H7 H8 H10 H11 H12 H13 H14 H15 H4 H2 H3 H5

ERROR: Unable to fill in missing atoms.
ERROR:: Exit from: src/core/conformation/Residue.cc line: 1365
protocols.jd2.JobDistributor: [ ERROR ]

[ERROR] Exception caught by JobDistributor for job rosetta_input_0001_0002

[ ERROR ]: Caught exception:

File: src/core/conformation/Residue.cc:1365
[ ERROR ] UtilityExitException
ERROR: Unable to fill in missing atoms.

Tue, 2020-10-20 09:43
mieczyslaw

The warnings from molfile_to_params.py are typical for ligands. As for the PoseFromSFRBuilder warning you quote, having one or two such lines per residue isn't a big deal (Rosetta will just rebuild the missing atoms), but if you're getting a large number of them for your TGT residue, then that's the likely problem.

The issue is that the atom names in the params file and the input PDB have to match. If they don't, then Rosetta can't match things up properly, and thinks atoms are missing. molfile_to_params.py has a tendency to rename atoms. This is needed for SDF files (which don't contain atom name information), but mol2 files should still have it. Check the generated params files and make sure that the atom names in the params file match up with the mol2 file you passed to molfile_to_params.py. If not, there's a --keep-names option you can try to encourage it to refrain from renaming atoms.
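The name check described above can be automated. The following is a minimal sketch, not part of Rosetta: it assumes the params file lists one atom per line starting with the keyword ATOM followed by the atom name, and that the PDB file uses the standard fixed columns (atom name in columns 13-16, residue name in columns 18-20). The function names are hypothetical, and names are compared with whitespace stripped.

```python
def params_atom_names(params_text):
    """Collect atom names from the ATOM lines of a params file (assumed layout)."""
    names = set()
    for line in params_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ATOM":
            names.add(fields[1])
    return names


def pdb_atom_names(pdb_text, resname):
    """Collect atom names of one residue type from ATOM/HETATM records."""
    names = set()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and line[17:20].strip() == resname:
            names.add(line[12:16].strip())
    return names


def compare_names(params_text, pdb_text, resname):
    """Return (names only in params, names only in PDB), each sorted."""
    in_params = params_atom_names(params_text)
    in_pdb = pdb_atom_names(pdb_text, resname)
    return sorted(in_params - in_pdb), sorted(in_pdb - in_params)
```

For the files in this thread, one would read TGT.params and the input PDB into strings and call compare_names(params_text, pdb_text, "TGT"). Two empty lists mean the stripped names agree; anything in the first list is an atom Rosetta would report as missing.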

P.S. If you're not already aware of it, https://pubmed.ncbi.nlm.nih.gov/32976709/ might also be of interest if you're doing PROTAC work.

Tue, 2020-10-20 12:37
rmoretti

Thanks for your quick reply.

Regarding the PoseFromSFRBuilder warning, I am getting one warning line per residue; it always discards one atom (I'm not sure which one). Is that expected?

Regarding my issue, you're right, this is about atom names. Somehow molfile_to_params.py doesn't handle the chlorine atom (CL1); see the attached mol2 and params files (I added a *.txt extension for uploading). An extra carbon appeared in its place.

When using --keep-names, the chlorine name is kept, but it is parametrized as CH1, which doesn't seem right. All atom names match between the mol2 and params files. I suppose molfile_to_params.py may need some debugging?

Unfortunately, the docking protocol still shows the same error (the only improvement is that it now sees CL1):

 

core.conformation.Residue: [ ERROR ] Cannot build coordinates for residue TGT at position 111: missing too many atoms.
core.conformation.Residue: [ ERROR ] Missing atoms are:  C10   N1    C19   C8    C9    C16   N3    N2    C12   C1    N4    C18   S1    C13   C2    C14   C3    C17   C15   C6    C4    C11   CL1   C5    C7    HN1   H19   H81   H82   H91   H92   H93   H11   H12   H13   H21   H22   H23   H31   H32   H33   H6    H4    H5    H7

ERROR: Unable to fill in missing atoms.
ERROR:: Exit from: src/core/conformation/Residue.cc line: 1365
protocols.jd2.JobDistributor: [ ERROR ]

[ERROR] Exception caught by JobDistributor for job rosetta_input_0001_0002

[ ERROR ]: Caught exception:


File: src/core/conformation/Residue.cc:1365
[ ERROR ] UtilityExitException
ERROR: Unable to fill in missing atoms.
 

Wed, 2020-10-21 06:41
mieczyslaw

The atom discarding is not unexpected. Rosetta uses the old PDB atom naming conventions (or rather, it uses the PDB atom naming conventions in place at the time when it was originally programmed). This is pretty much the same as the new PDB naming conventions, except for hydrogens. You can also get differences in atom naming conventions from one program to the other (Some programs use different conventions on beta-branched amino acids.) -- Normally this doesn't make much of a difference in practice. You only typically need to be concerned if you have a large number of (non-hydrogen) atoms discarded.
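To see at a glance whether any residue loses more than the odd atom, the discard warnings can be tallied from the log. This is a rough sketch that assumes the warning text has exactly the form quoted earlier in the thread ("discarding N atoms at position P in file F. Best match rsd_type: T"). The function names and the threshold of 3 are arbitrary illustrative choices, not Rosetta settings.

```python
import re
from collections import Counter

# Pattern modelled on the PoseFromSFRBuilder warning quoted in this thread.
DISCARD = re.compile(
    r"discarding (\d+) atoms at position (\d+) in file (\S+)\. "
    r"Best match rsd_type:\s*(\S+)"
)


def discarded_atoms(log_text):
    """Map (position, residue type) to the total number of discarded atoms."""
    totals = Counter()
    for match in DISCARD.finditer(log_text):
        count, position, _filename, rsd_type = match.groups()
        totals[(int(position), rsd_type)] += int(count)
    return totals


def suspicious(totals, threshold=3):
    """Keep only entries with at least `threshold` discarded atoms (assumed cutoff)."""
    return {key: n for key, n in totals.items() if n >= threshold}
```

Note that the log only reports how many atoms were discarded, not whether they were hydrogens, so a flagged residue still needs a manual look.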

Your CL1 atom is typed as a C.3 atom by the Mol2 file -- as far as the mol2 file is concerned, it's a carbon atom with a funky name.
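A quick way to catch this kind of mismatch is to compare each atom name in the mol2 file with its SYBYL atom type. The sketch below assumes the usual @<TRIPOS>ATOM record layout (atom id, atom name, x, y, z, atom type, ...) and uses a deliberately simple heuristic: an atom whose name starts with CL or BR is expected to have type Cl or Br. The function names are hypothetical, and a carbon that is genuinely named "CL..." would be a false positive.

```python
TWO_LETTER_HALOGENS = ("CL", "BR")


def mol2_atoms(mol2_text):
    """Yield (atom_name, sybyl_type) pairs from the @<TRIPOS>ATOM section."""
    in_atoms = False
    for line in mol2_text.splitlines():
        if line.startswith("@<TRIPOS>"):
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
            continue
        fields = line.split()
        if in_atoms and len(fields) >= 6:
            yield fields[1], fields[5]


def mistyped_halogens(mol2_text):
    """List atoms whose name suggests Cl/Br but whose type is another element."""
    flagged = []
    for name, sybyl_type in mol2_atoms(mol2_text):
        element = sybyl_type.split(".")[0].upper()
        prefix = name.upper()[:2]
        if prefix in TWO_LETTER_HALOGENS and element != prefix:
            flagged.append((name, sybyl_type))
    return flagged
```

An atom named CL1 with type C.3, as in this case, would be reported as ("CL1", "C.3").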

I'd double check your input PDB file -- does the TGT residue have the same naming conventions as your Mol2 file/your params file? If there's a mismatch between the PDB naming and the params/mol2 naming, then you'll run into issues with not being able to recognize the atoms.

Wed, 2020-10-21 07:20
rmoretti

You're right, Maestro made a mistake when converting to mol2. I corrected that in the mol2 file, and molfile_to_params.py now generates a params file with a Cl atom type. However, I am still getting the error. Yes, I have checked the PDB file, atom by atom, and the names are the same.

Would it be OK if I attach my input PDB here?

Wed, 2020-10-21 08:30
mieczyslaw

Dear Rocco, would you be able to have a look at my input PDB file if I attach it here or send it to you via email? Still no luck getting Rosetta to work on this simple case. Thanks.

Thu, 2020-10-29 07:52
mieczyslaw

Hi,

I am a new user of Rosetta and am currently trying to run protein-protein docking in the presence of ligands using rosetta_bin_linux_2021.16.61629_bundle. The goal is to design a PROTAC. I am facing exactly the same "Unable to fill in missing atoms" problem. In dock.log, the error is:

core.conformation.Residue: [ ERROR ] Cannot build coordinates for residue LG1 at position 304: missing too many atoms.
core.conformation.Residue: [ ERROR ] Missing atoms are:  C9    N3    C10   O2    C11   C12   N4    C13   O1    C14   C15   C16   F1    F2    F3    C8    N2    C4    C2    N1    C1    C5    C6    C7    C3    C27   C26   C18   C17   C19   C20   C21   C22   C23   C24   C25   F4    H2    H5    H3    H6    H23   H24   H4    H1    H14   H15   H16   H9    H25   H26   H27   H28   H29   H20   H21   H22   H13   H12   H7    H8    H17   H18   H19   H10   H11

But all these atoms are present in the pdb file and the param files.

I am using SDF files to create the ligand parameters. I checked the atom names in the ligand params files and in the PDB file used for docking. The atom names match, and the ligand names are correct.

Is there any other possible source for this same problem?

Thu, 2021-08-26 14:25
soumyo88

In my experience, this problem occurs because Rosetta isn't using your ligand params file to identify the ligand in the input structure. Instead, it looks up the ligand's three-letter code in the PDB database (https://www.rcsb.org/ligand/TGT), which may not be your intended structure. To fix this, add this option to your flags file: -load_PDB_components false. This turns off the PDB lookup so that the params file you prepared is used. Does this make sense?
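As a small illustration of where that option would go, the sketch below assembles a docking command line with -load_PDB_components false appended. The helper name docking_command is hypothetical; the binary, file names, and other flags are taken from the first post, and whether this resolves the error for a given input is untested here.

```python
import shlex


def docking_command(binary, pdb, params_files, extra_flags=()):
    """Assemble a docking command that disables the PDB components lookup."""
    cmd = [binary, "-s", pdb, "-extra_res_fa", *params_files]
    cmd += list(extra_flags)
    # Option suggested in this post: rely on the supplied params files rather
    # than resolving the three-letter code against the PDB.
    cmd += ["-load_PDB_components", "false"]
    return cmd


cmd = docking_command(
    "docking_protocol.static.linuxgccrelease",
    "rosetta_input_0001.pdb",
    ["TGT.params", "LIG.params"],
    extra_flags=["-partners", "A_B", "-nstruct", "100"],
)
print(shlex.join(cmd))
```

The same option can be added to the docking_prepack_protocol call, since the input structure is read there as well.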

Thu, 2021-11-18 10:13
rayyoung