We are trying to analyze a PDB file with InterfaceAnalyzer, and we get this error:
core.conformation.Conformation: [ WARNING ] missing heavyatom: C7 on residue pdb_HSD 152
core.conformation.Conformation: [ WARNING ] missing heavyatom: O4 on residue pdb_HSD 152
Error: [ ERROR ] ERROR: Exception caught by JobDistributor while trying to get pose from job 'complex_0_0001'
Error: [ ERROR ]
[ ERROR ] UtilityExitException
ERROR: too many tries in fill_missing_atoms!
Our PDB file is attached. Because it is over 512K in size, we compressed it with WinRAR.
It seems that there is a problem with our PDB file format. What should I do with it?
"Too many tries in fill_missing_atoms" is not a failure of your whole PDB; it's usually a failure of some particular residue. Generally you have something chemically unusual (or with an unusual chemical connection to the protein) that Rosetta is failing to understand. You can try running the file through clean_pdb.py in the tools distribution to clean up things Rosetta might not like.
This is not an InterfaceAnalyzer failure; it's a PDB loading failure, and it will probably occur no matter which Rosetta application you use to load this PDB.
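One way to find the "particular residue" mentioned above is to list every residue whose name is not one of the 20 canonical amino acids. The following is a minimal sketch, not part of Rosetta: the function name `unusual_residues` and the sample lines are made up for illustration, and it assumes the file follows the standard fixed-column PDB layout (residue name in columns 18-20, chain ID in column 22, residue number in columns 23-26).

```python
# The 20 canonical amino acid residue names.
STANDARD = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def unusual_residues(pdb_lines):
    """Return sorted (chain, resseq, resname) tuples for non-canonical residues."""
    found = set()
    for line in pdb_lines:
        # Only coordinate records carry residue information we care about.
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 26:
            resname = line[17:20].strip()
            if resname not in STANDARD:
                found.add((line[21], line[22:26].strip(), resname))
    return sorted(found)


# Hypothetical two-atom sample; real input would come from open("complex_0.pdb").
sample = [
    "ATOM      1  CA  ALA E   1       0.000   0.000   0.000  1.00  0.00",
    "ATOM      2  CA  HSD E 152       0.000   0.000   0.000  1.00  0.00",
]
for chain, resseq, resname in unusual_residues(sample):
    print(chain, resseq, resname)
```

Anything this prints (waters, ligands, nonstandard residue names such as HSD) is a candidate for the residue Rosetta cannot rebuild.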
Thank you for your reply. I cleaned my PDB file with "python /home/mxp/apps/rosetta/tools/protein_tools/scripts/clean_pdb.py complex_0.pdb ignorechain" and analyzed it again with "InterfaceAnalyzer.default.linuxgccrelease -s complex_0_ignorechain.pdb -packstat 1 -fixedchains E F @pack_input_options.txt -overwrite", but a different error occurred:
core.pack.dunbrack.RotamerLibrary: Dunbrack 2010 library took 0.29 seconds to load from binary
core.pack.pack_rotamers: built 39 rotamers at 16 positions.
core.pack.interaction_graph.interaction_graph_factory: Instantiating DensePDInteractionGraph
core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015
apps.public.analysis.InterfaceAnalyzer: [ ERROR ] pose has only one chain, skipping
protocols.jd2.JobDistributor: [ WARNING ] complex_0_ignorechain_0001 reported that its input was bad and will not retry
There are four chains in our PDB file: H, L, E, and F. We treat chains E and F as one monomer and chains H and L as the other. The error log file is attached.
Did you look at the PDB after cleaning? Are there still 4 chains in it? Are they the chains you expect?
"core.pack.pack_rotamers: built 39 rotamers at 16 positions." makes me suspicious that much of your pose got deleted during cleanup, although it might just be the log line from packing the missing sidechains, which would only involve a few positions.
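The chain check asked for above can also be done without a viewer by counting residues per chain ID directly from the coordinate records. This is a rough sketch under the assumption of standard fixed-column PDB formatting; `chain_summary` and the sample lines are hypothetical names and data for illustration only.

```python
def chain_summary(pdb_lines):
    """Map each chain ID to its residues, in file order, as (resid, resname)."""
    chains = {}
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 27:
            chain = line[21]
            # Residue number plus insertion code (columns 23-27), then name.
            key = (line[22:27].strip(), line[17:20])
            residues = chains.setdefault(chain, [])
            # Consecutive atoms of the same residue are counted once.
            if not residues or residues[-1] != key:
                residues.append(key)
    return chains


# Hypothetical sample; real input would come from the cleaned PDB file.
sample = [
    "ATOM      1  N   GLU H   1       0.000   0.000   0.000  1.00  0.00",
    "ATOM      2  CA  GLU H   1       0.000   0.000   0.000  1.00  0.00",
    "ATOM      3  N   VAL H   2       0.000   0.000   0.000  1.00  0.00",
    "ATOM      4  N   ASP L   1       0.000   0.000   0.000  1.00  0.00",
]
for chain, residues in chain_summary(sample).items():
    print(repr(chain), len(residues), "residues; first:", residues[:5], "last:", residues[-5:])
```

If the cleaned file reports a single chain ID (or a blank one) where the original had H, L, E, and F, that would be consistent with the "pose has only one chain" message.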
I opened this file with Discovery Studio and it looks fine; all chains are there. A screenshot is attached.
This PDB file also exceeds 512K, so I compressed it with WinRAR and added ".pdb" to the end of the file name. You can uncompress it and check.
I can take a look at .gz or .zip in a pinch but my machine does not natively have a handler for .rar and I'm not going to install one.
You can also just split the file into a couple of chunks or just post the first 5 and last 5 residues of each chain, that would be enough to diagnose any obvious flaws.
You should also try round-tripping through a different Rosetta application; that will distinguish PDB loading problems from InterfaceAnalyzer-specific ones. Just load it through score_jd2 and then examine the output to see which residues Rosetta "kept".
I think we've set the behavior of Rosetta w/r/t occupancy to read in zero occupancy residues, but you might want to force all occupancies to 1. I once saw someone encode partial charges in the occupancy column, which leads to NEGATIVE occupancy, which might also be bad times.
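Forcing all occupancies to 1 amounts to overwriting columns 55-60 of every ATOM and HETATM record. The sketch below is one possible way to do that, assuming standard fixed-column PDB formatting; `set_full_occupancy` is a made-up helper name, and the sample line with a negative occupancy is invented to mirror the partial-charge case described above.

```python
def set_full_occupancy(pdb_lines):
    """Return a copy of the lines with occupancy set to 1.00 on coordinate records."""
    out = []
    for line in pdb_lines:
        # Skip records too short to contain an occupancy field.
        if line.startswith(("ATOM", "HETATM")) and len(line.rstrip("\r\n")) >= 60:
            # Occupancy is a 6-character field in columns 55-60.
            line = line[:54] + "  1.00" + line[60:]
        out.append(line)
    return out


# Hypothetical sample with a negative value in the occupancy column.
sample = [
    "ATOM      1  CA  ALA E   1       0.000   0.000   0.000 -0.35  0.00",
]
print(set_full_occupancy(sample)[0])
```

All other columns, including the B-factor, are left untouched.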
I split the file into 2 parts and put them in the attachment. I will try to open this PDB file with another Rosetta application.
I get funny symbols when I load this in a text editor. You have DOS line endings. Rosetta requires UNIX style line endings.
If you are actually on a *nix system, this snippet will fix it.
If you are actually on Windows, Google will tell you how to fix it, but basically you have to get rid of the carriage return and keep the linefeed.
I should have suspected the problem as soon as you posted a screen grab from Windows but...I'm slow.
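As a rough illustration of the conversion described above (this is not the snippet originally posted), the following Python sketch drops the carriage return from each CRLF pair and keeps the linefeed. It works the same on Windows and on *nix because the file is read and written in binary mode. The function names and the output file name in the usage note are placeholders.

```python
def dos_to_unix(data: bytes) -> bytes:
    """Replace DOS (CRLF) line endings with UNIX (LF) line endings."""
    return data.replace(b"\r\n", b"\n")


def convert_file(src: str, dst: str) -> None:
    """Read src, convert its line endings, and write the result to dst."""
    # Binary mode prevents Python from translating newlines on its own.
    with open(src, "rb") as fh:
        data = fh.read()
    with open(dst, "wb") as fh:
        fh.write(dos_to_unix(data))


print(dos_to_unix(b"ATOM\r\nEND\r\n"))
```

For a real file this would be called as, for example, `convert_file("complex_0.pdb", "complex_0_unix.pdb")`, writing to a new file so the original stays available for comparison.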
Oh, I think this is because I used WordPad to edit and save the files.
But I don't think this is the cause of my problem, because the problematic file, complex_0_ignorechain.pdb, was never edited with WordPad.
I used Discovery Studio to rename the HSD residue in the PDB to HIS and saved the file, and the problem has not appeared again.
DS seems to be more robust in identifying PDB file formats; I hope that Rosetta can also handle this problem.
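The same rename can be scripted rather than done in Discovery Studio. As background not stated in the thread: HSD is the CHARMM-style name for histidine protonated on ND1, and the warnings about missing atoms C7 and O4 suggest Rosetta was apparently matching the name to some other chemical component. The sketch below is a minimal example, assuming standard fixed-column PDB formatting; `rename_residue` and the sample lines are hypothetical.

```python
# Record types that carry a residue name in columns 18-20.
RECORDS_WITH_RESNAME = ("ATOM", "HETATM", "ANISOU", "TER")


def rename_residue(pdb_lines, old="HSD", new="HIS"):
    """Return a copy of the lines with residue name `old` replaced by `new`."""
    if len(old) != 3 or len(new) != 3:
        raise ValueError("residue names must be exactly 3 characters")
    out = []
    for line in pdb_lines:
        if line.startswith(RECORDS_WITH_RESNAME) and line[17:20] == old:
            # Swap only the residue-name field; every other column is kept.
            line = line[:17] + new + line[20:]
        out.append(line)
    return out


# Hypothetical sample; real input would come from the original PDB file.
sample = [
    "ATOM      1  CA  HSD E 152       0.000   0.000   0.000  1.00  0.00",
    "ATOM      2  CA  ALA E 153       0.000   0.000   0.000  1.00  0.00",
]
for line in rename_residue(sample):
    print(line)
```

This only changes the residue name. Hydrogen atom names written by other programs may still differ from what Rosetta expects, so it is worth checking the log for new warnings after the rename.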