I read more about Rosetta and came up with this command line for de novo structure prediction of a 382-residue protein.
I ran it in silent mode:
mpiexec -np 16 ../../main/source/bin/AbinitioRelax.mpi.linuxgccrelease -database ../../rosetta/rosetta_src_2019.31.60840_bundle/main/database/ @options -mpi_tracer_to_file log1 &

options:

-in
    -file
        -fasta sequence.fasta             # protein sequence in fasta format
        -frag3 t001_.200.3mers            # protein 3-residue fragments file
        -frag9 t001_.200.9mers            # protein 9-residue fragments file
-abinitio
    -relax
    -increase_cycles 10                   # Increase the number of cycles at each stage in AbinitioRelax by this factor
    -rg_reweight 0.5                      # Reweight contribution of radius of gyration to total score by this scale factor
    -rsd_wt_helix 0.5                     # Reweight env, pair, and cb scores for helix residues by this factor
    -rsd_wt_loop 0.5                      # Reweight env, pair, and cb scores for loop residues by this factor
-relax
    -fast                                 # At the end of the de novo protein_folding, do a relax step of type "FastRelax". This has been shown to be the best deal for speed and robustness.
-out
    -nstruct 50000                        # how many structures do you want to generate? Usually want to fold at least 1,000.
    -file
        -silent abrelax.out               # full path to silent file output
        -silent_struct_type binary        # we want binary silent files
        -scorefile score.sc
-overwrite                                # overwrite any existing output with the same name you may have generated
-nstruct 50000
The program is running on 16 cores and has generated log files like this: log1_0, log1_1, ... log1_15.
Can you please let me know if this is correct? And, once the program finishes then how to analyse the silent files since there are multiple files like this, abrelax_1, abrelax_15 etc to find best 3D structure? I am following this tutorial for abinitio structure prediction:
"Can you please let me know if this is correct?"
It depends too tightly on your actual purpose. It isn't wildly incorrect in an obvious way.
"And, once the program finishes then how to analyse the silent files since there are multiple files like this, abrelax_1, abrelax_15 etc to find best 3D structure? "
It depends on your actual purpose. "best" is vague. I would look at the score files first, identifying the lowest-energy structures of interest, then I would extract them from the silent files to look at more closely. There is a tool called combine_silent if you want to multiplex your 16 silent files into one; I suspect it will renumber for you. You can also just hack up your score files to annotate which silent file any given line is paired with and then do the extractions manually.
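If you go the manual route, the score-annotation step can be sketched in Python along these lines. This is only a rough sketch under some assumptions: that each per-process silent file (or score file) contains plain-text lines starting with "SCORE:", that the header line ends in a "description" column, and that the total score column is called "score" or "total_score". The function names read_scores and rank are made up for illustration and are not part of Rosetta.

```python
def read_scores(path):
    """Collect the SCORE: rows of one silent/score file, tagged with that file's name."""
    rows, header = [], None
    with open(path, errors="replace") as fh:
        for line in fh:
            if not line.startswith("SCORE:"):
                continue  # skip sequence, remark and coordinate lines
            fields = line.split()[1:]
            if not fields:
                continue
            # Assumption: the header row is the one whose last column is "description".
            if fields[-1] == "description":
                header = fields
                continue
            if header is None or len(fields) != len(header):
                continue  # malformed or truncated row
            row = dict(zip(header, fields))
            # Assumption: the total score lives in "score" or "total_score".
            key = "score" if "score" in row else "total_score"
            try:
                row["score"] = float(row[key])
            except (KeyError, ValueError):
                continue
            row["source"] = str(path)  # remember which file holds this decoy
            rows.append(row)
    return rows


def rank(paths, top=10):
    """Return the `top` lowest-scoring rows across all the given files."""
    rows = [r for p in paths for r in read_scores(p)]
    rows.sort(key=lambda r: r["score"])
    return rows[:top]
```

Calling rank() on the list of your abrelax_* files gives back rows whose "description" entry is the decoy tag and whose "source" entry is the silent file to extract it from. Keeping the file name next to each row matters because the tags are probably not unique across the 16 files.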
Clustering is an option, but it will not perform well on 10K structures and isn't smart enough anyway if your templates were good. If you had good templates, you will probably need something bespoke and more sensitive than standalone clustering tools.