I've been trying to readapt the ligand_docking tutorial (provided by the Meiler lab's Rosetta Virtual Workshop 2020 youtube series) to dock a Heme porphyrin to protein 4K8F. In short, I am having issues with the dock.xml file, specifically with the <InterfaceScoreCalculator> mover and its -native flag for generating alignment results to a given protein-ligand complex. When including <InterfaceScoreCalculator native="crystal_complex.pdb">, my ligand_docking run produces a ROSETTA_CRASH.log error, "Residue out of res_map range"". I believe the issue is my "crystal_complex.pdb" file for 4K8F bound to heme.
I'm curious if anyone knows whether the "crystal_complex.pdb" file needs to be prepped in a particular manner in order to be used with <InterfaceScoreCalculator> and its -native flag???
The attached ligand_docking tutorial states that the provided crystal_complex.pdb file "was taken directly from the RCSB PDB [for D3R and its bound ligand eticlopride]", but says little else... So, to produe an analagous file for 4K8F and HEME, I performed the following steps:
(1) From the RCSB PDB I downloaded biological-assembly-1 for 4K8F. This file contained a homodimer made from two 4K8F-heme complexes.
(2) I deleted chain-b and the second heme-ligand information from the pdb file, so that only one monomer-ligand complex remained within the pdb file.
(3) I changed all HETATM records to indicate chain X instead of A for HEME ligand.
(4) I renamed this file "crystal_complex_4K8F.pdb"
Finally, after making crystal_complex_4K8F.pdb, I ran the ligand_docking application using ROSETTA binaries via comamnd line:
~/rosetta/main/source/bin/rosetta_scripts @options.txt -nstruct 500 > ligand_docking.log
This is where I am currenly stuck ... Any tips or suggestions would be greatly appreciated. Please let me know if I need to reclarify.
P.S. I can confirm that when calling my dock.xml and options.txt files, without the -native flag in mover <InterfaceScoreCalculator>, the ligand_docking application works correctly and produced good results - i.e. all of my docking outputs were RMSD 0.00 when aligned to the native crystal structure in PyMOL (as expected). Unfortunately, the heme ligand was not included in these PyMOL structural alignments, hence why I am now trying to run ligand_docking again with <InterfaceScoreCalculator native="crystal_complex.pdb">.
I forgot to add my .log file, and the tutorial I used (ligand_docking_tutorial.pdf) can be found via this link.
You shouldn't need to do extensive prerelaxation of the native structure. However, you will need to prepare it such that Rosetta can read it in. A warning message you're getting:
indicates that there might be something off about your crystal_complex_4K8F.pdb file. In particular, it looks like the atom names for your HEM residue aren't matching the ones which are specified in the params file for your HEM residue. You may need to rename all the atoms such that Rosetta can correctly match up the atoms between the structure being docked and the reference structure.
I'm not entirely sure that it would fix the crash you're seeing, but it's a possibility, and something you'll want to do anyway. (Otherwise, Rosetta is rebuilding the "missing" atoms on your reference ligand, which may affect your rmsd calculations.
Thanks @rmoretti for the help!
I have a new problem but not sure where to start, im wondering if you could advise me on how best to proceed?
As of now, my ligand_docking models (protein-heme) have only one out of two coordinating residues (beacause they are monomers). For the native dimer complex in 4K8F, you'll note that the distal coordinating ligand on the HEME comes from the opposing monomer's N-termini; this is my issue...
I am interested in mesuring the change in energy (ideally delta_interface) between the 5 coordinate and 6 coordiante heme states (i.e. for the dimer bounded complex).
Should I first dimerize my unbound monomer, and then do ligand docking for two binding sites (one for each monomer). OR should I dock my heme ligand to the monomer (see attached png), proceed to dimerize the "bounded" monomer, followed by some sort of repacking/relax of each N-termini?
Currently Im thinking about dimerizing the unbound-heme models using SymDock, and then performing ligand_docking on the two binding sites (but so far I've only successfully dimerized the bounded monomer via this method). My hopes are that the N-termini for the unbound dimer are in close enough proximity to the heme binidng site on each monomer to produce 6 coordinate heme with the use of a constraint file.
Another idea is to keep the cleaned, native ligand-dimer pdb, rigid while trying to use the floppy tail application to fold one of the N termini (that ive forcefully linearized with PyRosetta); but i dont think the app was written for this purpose and im having issues with the jump tree. Although the N terminus of each monomer is long enough to reach the other subunit of the dimer, it has some secondary structure, so not quite the same as the disordered region that the publication used FloppyTail for...
Any tips would be appreciated. Ive attached a pdb of the native dimer complex "4k8f_AB.cleaned.pdb" (which im striving to reproduce), and a png of the two best ouputs from my ligand_dock run (the cartoon in green is the native wildtype complex, whereas blue and orange are the theoretical docks)
Yes, residue ID "misnaming/misnumbering" seemed to be my problem w/ respect to my original post. I was able to fix the error by doing the following: