Below are some ideas for improving symmetric_docking jobs:
(1) If you choose dihedral instead of cyclical symmetry, it seems like you also need an even number of subunits. Could you have the input page block dihedral jobs with odd numbers of subunits?
(2) Most dihedral hexamer jobs I have run have worked fine, but jobs 16910 & 16922 gave Isc values like -120,737,256 or -120,776,632 instead of more typical Isc values like -46.692 or -40.261 (from jobs 16908 & 16909). Using the same input pdb file from 16910 & 16922 for cyclical hexamer jobs (16029 & 16573) gave more normal Isc values (-53.826 & -44.244) as well.
(3) I like how the output pdb files from docking2 jobs include lines like below:
# All scores below are weighted scores, not raw scores.
label fa_atr fa_rep fa_sol fa_elec fa_pair hbond_sr_bb hbond_lr_bb hbond_bb_sc hbond_sc dslf_ss_dst dslf_cs_ang dslf_ss_dih dslf_ca_dih fa_dun total
weights 0.338 0.044 0.242 0.026 0.164 0.245 0.245 0.245 0.245 0.5 2 5 5 0.036 NA
pose -248.547 6.51102 106.119 -2.62375 -3.06939 -1.86796 -13.0332 -4.76678 -2.81325 0 0 0 0 4.8183 -159.273
These help identify the pdb file even after you have copied it out of its tar archive. These lines also list the total_score (-159.273) and I_sc (-2.46827) values for the pdb file, which helps keep records straight. Could such lines be included in symmetric_docking job output files as well?
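For record keeping, here is a minimal sketch of how the label / weights / pose lines quoted above could be read back out of a pdb file. The function name parse_score_lines and the SAMPLE string are my own, not part of ROSIE or Rosetta, and the sketch assumes the three lines are whitespace-separated with the same number of columns, as in the example above.

```python
# Hypothetical helper: pair each score term with its weight and weighted score.
# Assumes the "label", "weights" and "pose" lines look like the ones quoted above.
SAMPLE = """\
# All scores below are weighted scores, not raw scores.
label fa_atr fa_rep fa_sol fa_elec fa_pair hbond_sr_bb hbond_lr_bb hbond_bb_sc hbond_sc dslf_ss_dst dslf_cs_ang dslf_ss_dih dslf_ca_dih fa_dun total
weights 0.338 0.044 0.242 0.026 0.164 0.245 0.245 0.245 0.245 0.5 2 5 5 0.036 NA
pose -248.547 6.51102 106.119 -2.62375 -3.06939 -1.86796 -13.0332 -4.76678 -2.81325 0 0 0 0 4.8183 -159.273
"""


def parse_score_lines(pdb_text):
    """Return {term: (weight or None, weighted_score)} from the score lines."""
    rows = {}
    for line in pdb_text.splitlines():
        fields = line.split()
        if fields and fields[0] in ("label", "weights", "pose"):
            rows[fields[0]] = fields[1:]
    if set(rows) != {"label", "weights", "pose"}:
        raise ValueError("label/weights/pose lines not all present")
    if not len(rows["label"]) == len(rows["weights"]) == len(rows["pose"]):
        raise ValueError("score lines have different column counts")
    table = {}
    for term, weight, score in zip(rows["label"], rows["weights"], rows["pose"]):
        # The "total" column has no weight and is written as NA.
        table[term] = (None if weight == "NA" else float(weight), float(score))
    return table


table = parse_score_lines(SAMPLE)
print(table["total"])  # (None, -159.273)
```

If symmetric_docking output files carried the same lines, the same reader would work on them unchanged.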
(4) Often on the server web page for a particular symmetric_docking job, the first column in the spreadsheet begins too narrow for all of the decoy names. Could the first column be made wider?
(5) If you click on a point in the graph made by a symmetric_docking job, a window appears listing the pdb file name and various values for that point. Often though, the cursor blocks part of the file name in the top line of this window. Is it possible to add a blank line to the top of each window so the cursor won't block anything important? docking2 job graphs have the same problem.
(6) Include an I_sc / RMSD plot on each symmetric_docking job web page like the ones already appearing on docking2 job web pages. Please make sure that the windows that appear when you click on a point give the correct filenames and data for each point. See https://www.rosettacommons.org/node/9404 and https://www.rosettacommons.org/node/3911 for details.
(7) In each symmetric_docking score.sf file and spreadsheet, please list I_sc & rms closer to the left end, like for docking2 jobs. It might also help to list score & I_sc for each monomer as well as total score & total I_sc for all monomers. I think for dihedral symmetry, one could list 2 different I_sc values per monomer. It seems like dihedral symmetry has 2 binding faces (call them x and y) for each monomer. I think the binding goes like yax-xby-ycx-xdy-yex-xfy-yax for subunits a-f in a dihedral hexamer. Thus, x faces bind x faces, and y faces bind y faces. Thus, I think you could have I_sc_x and I_sc_y values for each hexamer. A final request would be to have the score.sf file and spreadsheet list whatever data it can for the input proteins.pdb file for a symmetric_docking job. This data could include the score due to an isolated monomer. Such data would help when using a pdb file output from one ROSIE job as the input pdb file for another ROSIE job. One could check that the desired pdb file was used and also see how running the symmetric_docking job alters the score for an isolated monomer.
(8) It would be great if symmetric_docking could take a whole oligomer as input and then refine it. Right now, it seems like symmetric_docking only wants monomers as input, and if the monomer comes from the best-scoring oligomer in a previous symmetric_docking job, it ignores how the monomers are arranged with respect to each other and basically starts the docking procedure from scratch.
Perhaps you could let the user input how many monomers are in the input file and how many monomers should be in the output file. The user could put TER lines between each monomer in the input file to delimit them. Then the symmetric_docking job could check that all monomers have the same sequence, align each monomer with the first one input, and then average all the aligned monomers to get a generic monomer. Next, the symmetric_docking job could find the closest cyclical or dihedral inter-monomer transformation rule to copy each monomer from the generic monomer. It could list cyclical or dihedral symmetry somewhere in its output. Once the idealized input oligomer pdb file is made, it could be scored like any of the output pdb files would be, and its scores could be included in the graph(s), spreadsheet, and score.sf file made for the symmetric_docking run. rms or RMSD values could be calculated from the idealized input oligomer instead of from one of the job's output files.
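To make the first steps of this idea concrete, here is a rough sketch of splitting an input file into monomers at TER lines and checking that all monomers have the same sequence. The function names are my own and nothing here is taken from the ROSIE or Rosetta code. It assumes standard fixed-column pdb ATOM records and does not attempt the superposition, averaging, or symmetry fitting steps.

```python
# Hypothetical sketch: split a pdb file into monomers at TER records and
# compare their residue sequences. Column positions follow the standard
# fixed-width pdb ATOM record layout.

def split_monomers(pdb_text):
    """Return a list of monomers, each a list of ATOM/HETATM lines."""
    monomers, current = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            current.append(line)
        elif record == "TER" and current:
            monomers.append(current)
            current = []
    if current:  # last monomer may lack a trailing TER
        monomers.append(current)
    return monomers


def residue_sequence(atom_lines):
    """Return the three-letter residue names of one monomer, in file order."""
    names, previous = [], None
    for line in atom_lines:
        key = (line[21], line[22:27])  # chain ID, residue number + insertion code
        if key != previous:
            names.append(line[17:20].strip())
            previous = key
    return names


def same_sequence(monomers):
    """True if every monomer has the same residue sequence as the first."""
    sequences = [residue_sequence(m) for m in monomers]
    return all(seq == sequences[0] for seq in sequences)
```

A job could refuse to start, or warn the user, when same_sequence returns False or when the number of monomers found does not match the number the user entered.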
Continuing (8) above, suppose you had two pdb output files (proteins_0500.pdb and proteins_0799.pdb) from one docking2 job and used these two files as the input pdb files for two symmetric_docking jobs. Say the initial docking2 job began with two identical (but rotated and translated) monomers as its docking partners. I think it would help if each symmetric_docking job would determine and report the symmetry closest to its pdb input file, saying, for example, that proteins_0500.pdb is closest to being 2 subunits of a cyclical pentamer while proteins_0799.pdb is closest to being 2 subunits of a dihedral hexamer. This information could be used to determine the oligomers formed from various types of monomers.
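As a rough illustration of the "closest symmetry" report, here is a sketch for the cyclical case only. It assumes you already have the 3x3 rotation matrix that superimposes one monomer onto the other (for example from a structural alignment program), that the two monomers are neighbors in the ring, and that there is no significant translation along the rotation axis. The function names and the max_order limit are my own choices. Telling dihedral from cyclical arrangements would need an extra check for 2-fold axes perpendicular to the main axis, which is not shown.

```python
import math


def rotation_angle_deg(rotation):
    """Rotation angle in degrees of a 3x3 rotation matrix (nested lists)."""
    trace = rotation[0][0] + rotation[1][1] + rotation[2][2]
    # Clamp against rounding error before taking the arc cosine.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def nearest_cyclic_order(angle_deg, max_order=12):
    """Return (n, deviation in degrees) for the C_n whose neighbor-to-neighbor
    rotation of 360/n degrees is closest to the observed angle."""
    best = min(range(2, max_order + 1),
               key=lambda n: abs(360.0 / n - angle_deg))
    return best, abs(360.0 / best - angle_deg)


# Example: two monomers related by a 72 degree rotation about the z axis.
theta = math.radians(72.0)
rot_z = [[math.cos(theta), -math.sin(theta), 0.0],
         [math.sin(theta),  math.cos(theta), 0.0],
         [0.0, 0.0, 1.0]]
order, deviation = nearest_cyclic_order(rotation_angle_deg(rot_z))
print(order)  # 5, i.e. closest to 2 subunits of a cyclical pentamer
```

The deviation value gives a rough idea of how far the input pair is from the ideal arrangement, which could be listed next to the reported symmetry.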
These would be interesting new functionality. However, more complex modeling tasks involving symmetric docking currently have to be carried out with command-line Rosetta.
Thanks for your suggestions; they are very welcome. I agree that your ideas would improve the usability of the symmetric docking server, and they are something we can add the next time the server is updated.