I have a protein that contains Zn and Ca ions bonded to it.
The last few rows of the PDB file are as follows:
ATOM 3336 C ASP A 434 36.621 7.866 23.815 1.00 0.00 C
ATOM 3337 O ASP A 434 36.955 6.680 23.907 1.00 0.00 O
HETATM 3338 ZN ZN A 435 34.098 17.250 32.934 1.00 22.09 ZN
HETATM 3339 CA CA A 436 14.268 26.405 54.375 1.00 35.23 CA
LINK NE2 HIS A 166 ZN ZN A 435 1555 1555 2.13
LINK NE2 HIS A 170 ZN ZN A 435 1555 1555 2.06
LINK NE2 HIS A 176 ZN ZN A 435 1555 1555 2.14
LINK O ILE A 242 CA CA A 436 1555 1555 2.47
LINK OD1 ASN A 245 CA CA A 436 1555 1555 2.38
LINK O MET A 247 CA CA A 436 1555 1555 3.05
LINK OE1 GLU A 252 CA CA A 436 1555 1555 3.00
LINK OE2 GLU A 252 CA CA A 436 1555 1555 2.34
LINK OD2 ASP A 255 CA CA A 436 1555 1555 2.42
When I try to run an mp_domain_assembly protocol with this PDB file, it throws this error:
[ ERROR ] UtilityExitException
ERROR: The sequence in one of the PDBs doesn't match the sequence in the fasta file! There could either be a mutation or missing density. Please close loops within PDB files before running this protocol (i.e. connecting the pieces with linkers)!
It doesn't matter whether I add the -in:auto_setup_metals flag to the SLURM script or not; I get exactly the same error. When I delete the HETATM and LINK rows, it works fine, but I need to keep them in my PDB file to calculate the exact effect of the ions. Is this possible? If so, how can I retain them?
The issue you're having isn't that Rosetta can't read the metals. The issue is that you're not including the metals in your FASTA file. The mp_domain_assembly application takes in a FASTA file with the intended sequence. If you're including metal ions in your structure, those metal ions also need to be present in the sequence you use.
How to include metal ions varies a bit with the application, but for the mp_domain_assembly application, you simply need to add the appropriate one-letter code for each metal ion at the place in the sequence corresponding to where the metal ion is in the input PDB (which is usually the end). Rosetta generally uses either Z or X as the one-letter code for residues that are not protein or nucleic acid. Both Zn and Ca use Z. Just add them to the end of the FASTA sequence without any other separator, and I think that should work.
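As a rough illustration of the FASTA edit described above, here is a minimal Python sketch. It assumes a standard fixed-column PDB file, a FASTA file with a single record, and metal ions listed as HETATM records after the protein chain, as in the file posted here. The function names are made up for this example and are not part of Rosetta. It only addresses the sequence mismatch check, nothing else.

```python
METALS = ("ZN", "CA")  # residue names of the ions in the posted PDB file


def count_metal_ions(pdb_text, metals=METALS):
    """Count HETATM records whose residue name (PDB columns 18-20) is a metal."""
    count = 0
    for line in pdb_text.splitlines():
        if line.startswith("HETATM") and line[17:20].strip() in metals:
            count += 1
    return count


def append_metal_codes(fasta_text, n_metals, code="Z"):
    """Append one `code` letter per metal ion to a single-record FASTA sequence."""
    lines = [ln.strip() for ln in fasta_text.splitlines() if ln.strip()]
    headers = [ln for ln in lines if ln.startswith(">")]
    sequence = "".join(ln for ln in lines if not ln.startswith(">"))
    # The sequence is written on one line, with the metal codes at the very end.
    return "\n".join(headers + [sequence + code * n_metals]) + "\n"
```

You would read the PDB and FASTA files into strings, call count_metal_ions on the PDB text, and write the result of append_metal_codes to a new FASTA file.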
Thank you very much, rmoretti! I added ZZ to the corresponding place in the FASTA file and it seems to be almost working. Now it throws this error:
[ ERROR ] UtilityExitException
ERROR: The residue CA could not be generated. Has a suitable params file been loaded? (Note that custom params files not in the Rosetta database can be loaded with the -extra_res or -extra_res_fa command-line flags.)
I checked the database/chemical/residue_type_sets/fa_standard/residue_types.txt file. Everything looks fine (CA ions are included and uncommented). Then I tried passing this path (and after that database/chemical/residue_type_sets/fa_standard/residue_types/metal_ions/CA.params) with the -in:file:extra_res_path or -in:file:extra_res_fa flags. I got errors on each try.
I can't figure out what I did wrong. Could you please tell me where I can find params for CA, or whether I should generate them myself (and if so, how)?
The parameters for calcium ions should be in the database and should be enabled by default (as you've noticed and mentioned).
I'm not sure why you're getting that message. Do you have the full error traceback (from the ROSETTA_CRASH.log file)? That might help debug what is going on. (Alternatively, if you can post all the input files, I can potentially try debugging to see what might be going on.)
OK, I am attaching the crash files. I renamed them 1 and 2.
1 is the normal crash file. (Meanwhile, when I remove CA, the same error appears for the ZN ion.)
For 2, I added -in:file:extra_res_fa=./CA.params, and this time it says "Attempting to add a new residue type, but residue type 'CA' already exists in the cache".
Please let me know if there are any updates.
Looking into it further, it looks like I misled you originally. While adding the Z letters to the FASTA file does allow the mp_domain_assembly application to read it, there's a bigger underlying issue in the mp_domain_assembly application: it attempts to polymerically connect all the residues in the input structures.
Because metal ions can't be polymerically connected, even if you fix up other issues, you'll run into problems attempting to merge everything together.
My recommendation would be to use mp_domain_assembly on the metal-free proteins, then attempt to add back the metal ions to the output structure of mp_domain_assembly.
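For the first half of that workflow, a small Python sketch along these lines could separate the metal records from the protein so that they can be kept for later. It assumes a standard fixed-column PDB file and only ZN and CA ions, and the function name is invented for this example. The saved records keep their original coordinates, so they would still need to be placed relative to the assembled model (for example, by superimposing on the coordinating residues), which this sketch does not attempt.

```python
METALS = ("ZN", "CA")  # residue names of the ions in the posted PDB file


def split_metal_records(pdb_text, metals=METALS):
    """Split PDB text into (metal-free text, metal HETATM/LINK text)."""
    protein_lines, metal_lines = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        res1 = line[17:20].strip()  # residue name, PDB columns 18-20
        res2 = line[47:50].strip()  # second residue name in LINK, columns 48-50
        if record == "HETATM" and res1 in metals:
            metal_lines.append(line)
        elif record == "LINK" and (res1 in metals or res2 in metals):
            metal_lines.append(line)
        else:
            # ATOM records are always kept, even for atoms named CA (alpha carbons).
            protein_lines.append(line)
    return "\n".join(protein_lines) + "\n", "\n".join(metal_lines) + "\n"
```

The first returned string is what you would feed to mp_domain_assembly; the second holds the HETATM and LINK lines to work from when adding the ions back.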