I've done a structure refinement of a protein with Rosetta 3.8.
However, I'm not sure whether what I've done is correct.
The protocol I followed is listed below:
First, I submitted my protein sequence (.fasta file), torsion angles (predicted by TALOS+), and RDC values (experimental data) to the Robetta fragment server (http://robetta.bakerlab.org/fragmentsubmit.jsp) to generate fragments.
Second, I ran Rosetta with the command "minirosetta.default.linuxgccrelease @HGDC_broker_cst.options".
The file "HGDC_broker_cst.options" is shown below:
#make sure all variable names have been replaced with absolute path and that no line begins with a $ or ~s
-native HGDC.pdb # native PDB file (optional)
-fasta HGDC.fasta # protein sequence in fasta format
-frag3 aat000_03_06.200_v1_3 # protein 3-residue fragments file
-frag9 aat000_09_06.200_v1_3 # protein 9-residue fragments file
-increase_cycles 10 # Increase the number of cycles at each stage in AbinitioRelax by this factor
-rg_reweight 0.5 # Reweight contribution of radius of gyration to total score by this scale factor
-rsd_wt_helix 0.5 # Reweight env, pair, and cb scores for helix residues by this factor
-rsd_wt_loop 0.5 # Reweight env, pair, and cb scores for loop residues by this factor
-reinitialize_mover_for_each_job # jd generate fresh copy of its mover before each apply (once per job)
-find_neighbors_3dgrid # Use a 3D lookup table for doing neighbor calculations. For spherical, well-distributed conformations
-nstruct 1000 # how many structures do you want to generate? Usually want to fold at least 1,000.
-silent HGDC_broker_cst.out # full path to silent file output
-silent_struct_type binary # we want binary silent files
-overwrite # overwrite any existing output with the same name you may have generated
-force_minimize # minimize the structure after making a move, even if no restraints given
Finally, I've got a score.fsc file like this:
SCORE: score fa_atr fa_rep fa_sol fa_intra_rep fa_elec pro_close hbond_sr_bb hbond_lr_bb hbond_bb_sc hbond_sc dslf_fa13 rama omega fa_dun p_aa_pp yhh_planarity ref Filter_Stage2_aBefore Filter_Stage2_bQuarter Filter_Stage2_cHalf Filter_Stage2_dEnd co rms maxsub clashes_total clashes_bb time user_tag description
SCORE: -227.165 -772.819 83.957 444.286 2.071 -84.559 0.478 -18.327 -42.232 -18.666 -13.370 0.000 -8.885 17.856 234.566 -30.453 0.395 -21.463 0.000 0.000 0.000 0.000 27.286 19.088 54.000 0.000 0.000 719.000 002 S_002_00000001
SCORE: -247.325 -770.395 78.062 439.372 2.099 -90.080 0.374 -20.420 -38.220 -12.950 -20.084 0.000 -13.605 15.676 235.575 -31.396 0.130 -21.463 0.000 0.000 0.000 0.000 20.812 15.127 68.000 0.000 0.000 687.000 002 S_002_00000002
My questions are:
1. I don't know what the flags -run:reinitialize_mover_for_each_job, -score:find_neighbors_3dgrid, and -out:file:silent_struct_type each mean.
Should I use them for structure refinement?
Or where can I find a detailed explanation of them?
(I found this page (https://www.rosettacommons.org/docs/latest/full-options-list) that describes them, but it isn't very clear.)
2. In my score.fsc file, I don't understand what fa_atr, fa_rep, ... stand for.
Besides, most tutorials suggest comparing the score term. But what about the other terms? Are they all equally important?
3. If I want to obtain the RMSD of each structure relative to the lowest-energy structure, what flags should I use?
(just like the CS-Rosetta online server does)
4. Can I refine an inexact structure by inputting a PDB file and other constraints, rather than rebuilding a structure from fragment picking?
I'd really appreciate your advice and suggestions.
1) The page you found is probably the best documentation of options. Depending on the option, there may be more explanation elsewhere in the documentation.
-run:reinitialize_mover_for_each_job is just a safety measure. The "mover" is what actually acts on the structure. Some movers have internal state which can carry over from one output structure to the next (and in fact, some protocols rely on this). The -run:reinitialize_mover_for_each_job option says that each output structure (each job) should get a fresh copy of the mover at the start of the protocol, so you don't inadvertently carry over state between output structures. I'd keep this on; it won't appreciably slow your protocol.
-out:file:silent_struct_type sets the type of silent file output. Silent files are an efficient but Rosetta-only file format. There are two main silent file formats: protein and binary. Protein silent files are only for proteins, and require the protein to have ideal bond lengths and angles. Binary silent files can handle more residue types and arbitrary atom positions. There is some autodetection logic in Rosetta to choose the best format, but there's nothing wrong with telling Rosetta to always use binary silent files, since it's the more general format. (And contrary to the name, it's still an ASCII-only text file.)
-score:find_neighbors_3dgrid -- I haven't encountered this before, but it's apparently a change in how the scoring function finds neighbors. From what I can determine, it should speed up runs for very large protein/protein complexes. It would have minimal or even negative impact for smaller protein systems, which is why it isn't turned on by default.
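To make the point about -run:reinitialize_mover_for_each_job a bit more concrete, here is a toy Python sketch of a mover with internal state. Everything in it (ToyMover, run_jobs, the numeric "pose") is hypothetical and only illustrates the idea; it is not Rosetta code or the Rosetta API.

```python
import copy


class ToyMover:
    """Hypothetical stand-in for a stateful mover (not a Rosetta class)."""

    def __init__(self, step=1.0):
        self.step = step  # internal state that apply() modifies

    def apply(self, pose):
        # Move the toy "pose" (just a number here) by the current step,
        # then halve the step so the next call behaves differently.
        moved = pose + self.step
        self.step *= 0.5
        return moved


def run_jobs(prototype, n_jobs, reinitialize):
    """Run n_jobs jobs, each starting from the same input pose (0.0)."""
    results = []
    mover = copy.deepcopy(prototype)
    for _ in range(n_jobs):
        if reinitialize:
            mover = copy.deepcopy(prototype)  # fresh copy for every job
        results.append(mover.apply(0.0))
    return results


shared = run_jobs(ToyMover(), 3, reinitialize=False)
fresh = run_jobs(ToyMover(), 3, reinitialize=True)
print("mover reused:       ", shared)
print("mover reinitialized:", fresh)
```

With the mover reused, each job sees the step size left behind by the previous job, so identical inputs give different outputs. With a fresh copy per job, every job starts from the same mover state.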
2) For a good rundown of all of Rosetta's energy terms, see Alford et al. (https://doi.org/10.1021/acs.jctc.7b00125). See also https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/score-types and the links in the "See Also" section of that page.
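On the question of how the individual terms relate to the total: in the two lines you posted, the per-term columns add up to the score column. Below is a small Python sketch that checks this on your data. It assumes (based only on the header you posted) that the energy terms are the columns after "score" and before the first "Filter_" column; the function names are my own, not part of Rosetta.

```python
# The header and two rows are copied from the score.fsc excerpt in the question.
SCORE_LINES = [
    "SCORE: score fa_atr fa_rep fa_sol fa_intra_rep fa_elec pro_close "
    "hbond_sr_bb hbond_lr_bb hbond_bb_sc hbond_sc dslf_fa13 rama omega "
    "fa_dun p_aa_pp yhh_planarity ref Filter_Stage2_aBefore "
    "Filter_Stage2_bQuarter Filter_Stage2_cHalf Filter_Stage2_dEnd co rms "
    "maxsub clashes_total clashes_bb time user_tag description",
    "SCORE: -227.165 -772.819 83.957 444.286 2.071 -84.559 0.478 -18.327 "
    "-42.232 -18.666 -13.370 0.000 -8.885 17.856 234.566 -30.453 0.395 "
    "-21.463 0.000 0.000 0.000 0.000 27.286 19.088 54.000 0.000 0.000 "
    "719.000 002 S_002_00000001",
    "SCORE: -247.325 -770.395 78.062 439.372 2.099 -90.080 0.374 -20.420 "
    "-38.220 -12.950 -20.084 0.000 -13.605 15.676 235.575 -31.396 0.130 "
    "-21.463 0.000 0.000 0.000 0.000 20.812 15.127 68.000 0.000 0.000 "
    "687.000 002 S_002_00000002",
]

TEXT_COLUMNS = {"user_tag", "description"}  # kept as strings, not numbers


def parse_scorefile(lines):
    """Return (header, records); the first SCORE: line is taken as the header."""
    header, records = None, []
    for line in lines:
        if not line.startswith("SCORE:"):
            continue
        fields = line.split()[1:]
        if header is None:
            header = fields
            continue
        records.append({
            name: (value if name in TEXT_COLUMNS else float(value))
            for name, value in zip(header, fields)
        })
    return header, records


def energy_term_names(header):
    """Assumption: terms sit between 'score' and the first 'Filter_' column."""
    names = []
    for name in header[1:]:
        if name.startswith("Filter_"):
            break
        names.append(name)
    return names


header, records = parse_scorefile(SCORE_LINES)
terms = energy_term_names(header)
for rec in records:
    total = sum(rec[t] for t in terms)
    print(rec["description"], "score =", rec["score"],
          "sum of terms =", round(total, 3))
```

So, at least for these two structures, the score column is the sum of the term columns, which is why tutorials focus on it. The individual terms are mostly useful for seeing where a particular structure gains or loses energy.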
3) Most Rosetta protocols are once-through, so to compute an RMSD to one of the output structures, you'll need to do a separate Rosetta run. If all you want is the RMSD, this is relatively straightforward. Simply do something like: `score.linuxgccrelease -in:file:silent HGDC_broker_cst.out -in:file:native lowest_energy_structure.pdb -out:file:scorefile rmsd.sc -score:weights empty`. Unfortunately, you will need to extract the lowest energy structure as a PDB before running it. (The score application can't handle silent file-based "natives".)
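For picking out which structure to extract, a short Python sketch like the one below can find the lowest-scoring tag in the score file. The sample lines are an abbreviated version of your score.fsc (only the score, rms, and description columns, with values taken from your post), and the function name is my own. The printed extract_pdbs command is how I would expect to pull a single tag out of a silent file, but treat it as an assumption and check it against your Rosetta build.

```python
# Abbreviated sample: same values as the posted score.fsc, fewer columns.
SCORE_LINES = [
    "SCORE: score rms description",
    "SCORE: -227.165 19.088 S_002_00000001",
    "SCORE: -247.325 15.127 S_002_00000002",
]


def lowest_energy_tag(lines, score_column="score", tag_column="description"):
    """Return (tag, score) for the SCORE: line with the lowest total score."""
    header = None
    best_tag, best_score = None, float("inf")
    for line in lines:
        if not line.startswith("SCORE:"):
            continue
        fields = line.split()[1:]
        if header is None:
            # First SCORE: line is the header; look columns up by name.
            header = fields
            score_idx = header.index(score_column)
            tag_idx = header.index(tag_column)
            continue
        score = float(fields[score_idx])
        if score < best_score:
            best_tag, best_score = fields[tag_idx], score
    return best_tag, best_score


tag, score = lowest_energy_tag(SCORE_LINES)

# Assumed extraction command; verify the application and flags locally.
extract_cmd = (
    "extract_pdbs.linuxgccrelease "
    "-in:file:silent HGDC_broker_cst.out "
    "-in:file:tags " + tag
)
print("lowest-energy structure:", tag, "score:", score)
print(extract_cmd)
```

To use it on the real file, pass `open("score.fsc")` instead of SCORE_LINES. The extracted PDB is what you would then give to -in:file:native in the score command above.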
4) I'm not entirely sure what you mean by this, but if you're talking about starting from an existing structure, rather than a fasta sequence file, the current protocol you use cannot do so (it always starts from an extended sequence). However, if you look into "comparative modeling" protocols, you should be able to find something related that will likely work. Comparative modeling also works if you use a (partial) structure of the current protein as your "homolog". (This approach has had some success with refinement into electron density information.)