I have successfully run match, but enzdes remains elusive. I ran the matcher with output_matchres_only, so my match PDB files contain only the ligand and catalytic residues. How do I load both the PDB file containing the scaffold and the PDB files containing the matched sites into enzyme design?
Have you looked at the enzyme design demo yet (rosetta_demos/enzyme_design)? I think step 3 answers your question.
I just did. I didn't realize the examples were here, since the manual says they are in tests/integration.
Still, step 3 doesn't help. The tutorial assumes that you output the scaffold into the PDB for each match:
2.2 -- Outputs generated
1) A .pdb file for each "match", containing the three-dimensional coordinates of both the scaffold protein and the theozyme, including the ligand; these files will be used as input in step 3. The number of matches found in the scaffold depends on the complexity of the theozyme, and can be anywhere between 0 and hundreds.
3.3 -- Performing design
/bin/enzyme_design. -database @rosetta_inputs/general_design.flags -s rosetta_inputs/UM_1_D41H116K189_1tml_11_mocktim_1.pdb -out:file:o scorefile.txt
As you can see, the tutorial only loads one PDB file, which contains both the scaffold and the matched sites. I am wondering how to load the scaffold if my match PDB files don't contain it, i.e. if I used the output_matchres_only option in match.
Does the Baker lab really output the scaffold in each match? That seems like an unnecessary waste of disk space.
Can you splice the PDB and match together (in emacs or whatever)? I'm not sure how they're formatted but that may be straightforward. I've asked a Baker lab person to ask around.
Thanks for asking around.
For now that's essentially what I'm doing, but it seems like there should be an easier way. Let me know if you find out anything more.
In general we do output the full protein matches. Limited disk space is not really an issue in the Baker Lab. Using CloudPDB output cuts down on the amount of output anyway, and if you're going to run enzyme design on the matches, storing them takes only a small portion of the total disk space. One of the other reasons for doing so is that it allows for rapid examination of the matches in PyMOL. Also, if you have multiple scaffolds, you can put them all together in one enzyme design run without the difficulty of pairing up each match with its respective scaffold.
That said, back in the Rosetta++(2.0) days, outputting matchres only was the default, and the Rosetta++ enzyme design protocol could take them directly as input. Asking around, it appears that that functionality was never ported to Rosetta3, as by that time people were typically doing the full PDB output.
There are several scripts which automate putting the matched residues into a scaffold. I've attached one such Perl script. Be advised that it hasn't been used much recently, and it might overwrite your current files. (So to be on the safe side, you might want to operate on a copy of the matcher output.)
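For anyone who wants to see the general idea of such a splice, here is a minimal Python sketch. It is not the attached Perl script and has not been run against real matcher output. It assumes that the matchres-only file holds its header REMARK lines first, the catalytic residues as ATOM records with the same chain ID and residue number as in the scaffold, and the ligand as HETATM records. The function names residue_key and splice_match_into_scaffold are made up for this illustration. HETATM records in the scaffold (waters and so on) are dropped.

```python
def residue_key(line):
    """Return (chain ID, residue number, insertion code) from a PDB ATOM/HETATM line."""
    return (line[21], line[22:26].strip(), line[26])


def splice_match_into_scaffold(scaffold_lines, match_lines):
    """Combine a scaffold PDB and a matchres-only PDB into one list of lines.

    Assumption: matched residues are numbered exactly as in the scaffold.
    """
    scaffold_lines = [l.rstrip("\n") for l in scaffold_lines]
    match_lines = [l.rstrip("\n") for l in match_lines]

    # Header remarks from the match file are copied to the top of the output.
    remarks = [l for l in match_lines if l.startswith("REMARK")]

    # Group the matched catalytic residue atoms by residue; collect the ligand.
    match_atoms = {}
    ligand = []
    for l in match_lines:
        if l.startswith("ATOM"):
            match_atoms.setdefault(residue_key(l), []).append(l)
        elif l.startswith("HETATM"):
            ligand.append(l)

    out = list(remarks)
    emitted = set()
    for l in scaffold_lines:
        if l.startswith("ATOM"):
            key = residue_key(l)
            if key in match_atoms:
                # Replace the scaffold residue with the matched one, once.
                if key not in emitted:
                    out.extend(match_atoms[key])
                    emitted.add(key)
                continue
            out.append(l)
        elif l.startswith("TER"):
            out.append(l)

    # Refuse to write a file if a matched residue has no scaffold position.
    missing = set(match_atoms) - emitted
    if missing:
        raise ValueError("matched residues not found in scaffold: %s" % sorted(missing))

    out.extend(ligand)  # ligand goes after the protein
    out.append("END")
    return out
```

To use it, read both files with readlines(), call splice_match_into_scaffold(scaffold, match), and write the result to a new file name so that nothing in the matcher output directory is overwritten. Atom serial numbers are not renumbered here.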
Thanks! I'll keep this in mind for future runs.