I am running several applications with MPI (docking, pepspec, simple_cycpep_predict), and I get repeated entries of the same result with different scores.
Is this a bug? How should I process this output?
If you're using the same output silent file for multiple simultaneous runs, then it's not unexpected to get duplicated entries. There's some "have we already created this output structure" detection code, but if multiple simultaneous runs are writing to the same file, there's a fair probability that they'll both write a structure with the same name. (The duplicate detection is done before a structure's job is launched, rather than just before it is written, so two runs can each pick the same name before either has written anything.)
This isn't great, but it often isn't a big deal. For most protocols in Rosetta, each output structure is independent of the others, and the numbering is arbitrary. An output structure is an output structure, and it doesn't matter if it's labeled _0001 versus _0013. The silent file reading code has special casing to detect duplicates and do some renumbering on them, so you might not even notice this if you use the silent file in a next step and don't care about numbering.
In your case, though, it sounds like you have multiple different types of runs writing to the same silent file. If you were going to just mix all the results together for your next step you're probably fine, but if you were going to do something different with the pepspec output versus the simple_cycpep_predict output, then you're out of luck -- there's probably not a good way to see which output came from which run. Your best bet is probably to re-run things, this time giving a different silent file output name to each run.
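To check how many names are actually duplicated in an existing silent file, something like the following rough sketch may help. It assumes that each structure has a line starting with `SCORE:` whose last column is the structure's tag, and that the column-header line ends in `description`. The function names and the example lines are made up for illustration; this is not part of Rosetta.

```python
from collections import Counter


def score_line_tags(lines):
    """Yield the tag (last column) of every per-structure SCORE: line."""
    for line in lines:
        if not line.startswith("SCORE:"):
            continue
        fields = line.split()
        # Assumption: the column-header line ends in "description"; skip it.
        if len(fields) < 2 or fields[-1] == "description":
            continue
        yield fields[-1]


def duplicated_tags(lines):
    """Return {tag: count} for every tag that appears more than once."""
    counts = Counter(score_line_tags(lines))
    return {tag: n for tag, n in counts.items() if n > 1}


# Made-up lines standing in for a silent file (coordinates omitted).
example = [
    "SEQUENCE: CAAAC",
    "SCORE: score description",
    "SCORE: -10.5 result_0001",
    "SCORE: -9.8 result_0002",
    "SCORE: -7.1 result_0001",
]
print(duplicated_tags(example))
```

For a real file you could pass the open file object instead of the list, e.g. `duplicated_tags(open("out.silent"))`, where `out.silent` is a placeholder name.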
Thank you for the reply.
In my case, I always launch a single run with MPI using several processes (usually 32). In most cases, I do sequence screenings (pepspec) or docking to get the top-scored solutions. How should I run the apps under MPI to avoid these repetitions? If the MPI processes write different results for the same solution, what's the point of the parallelization then?
This is an example of simple_cycpep_predict where I get duplicated entries in the silent file:
mpirun -np 4 simple_cycpep_predict.linuxgccrelease -cyclic_peptide:sequence_file $sequence \
-cyclic_peptide:cyclization_type terminal_disulfide \
-cyclic_peptide:require_disulfides true \
-cyclic_peptide:disulf_cutoff_prerelax 100 \
-cyclic_peptide:disulf_cutoff_postrelax 1 \
-cyclic_peptide:genkic_closure_attempts 100000 \
-cyclic_peptide:genkic_min_solution_count 1000000 \
-cyclic_peptide:min_genkic_hbonds 0 \
-mute all -unmute protocols.cyclic_peptide_predict.SimpleCycpepPredictApplication \
-out:nstruct 100 \
I solved the problem by renaming the entries with the silent_tools package (https://github.com/bcov77/silent_tools).
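For anyone who wants to see the idea behind renaming, here is a minimal sketch. It is not how silent_tools does it, just an illustration under two assumptions: each structure starts at a `SCORE:` line whose last column is the tag, and the lines belonging to that structure end with the same tag. The function name and the `_dupN` suffix are hypothetical.

```python
def rename_duplicate_tags(lines):
    """Return lines with repeated tags made unique by appending _dupN."""
    seen = {}  # tag -> number of structures already seen with that tag
    out = []
    old_tag = new_tag = None
    for line in lines:
        fields = line.split()
        is_struct_start = (
            line.startswith("SCORE:")
            and len(fields) >= 2
            and fields[-1] != "description"  # assumed column-header marker
        )
        if is_struct_start:
            old_tag = fields[-1]
            n = seen.get(old_tag, 0)
            seen[old_tag] = n + 1
            # First occurrence keeps its name; later ones get a suffix.
            new_tag = old_tag if n == 0 else f"{old_tag}_dup{n}"
        # Only touch lines whose last column is the tag being renamed.
        if old_tag is not None and new_tag != old_tag and fields and fields[-1] == old_tag:
            body = line.rstrip()
            line = body[: len(body) - len(old_tag)] + new_tag + line[len(body):]
        out.append(line)
    return out


# Made-up lines standing in for two structures that share one tag.
example = [
    "SCORE: score description",
    "SCORE: -1.0 a_0001",
    "ANNOTATED_SEQUENCE: CAAC a_0001",
    "SCORE: -2.0 a_0001",
    "ANNOTATED_SEQUENCE: CAAC a_0001",
]
for renamed in rename_duplicate_tags(example):
    print(renamed)
```

The sketch does not check whether a generated name such as `a_0001_dup1` already exists in the file, so it is worth re-checking for duplicates afterwards.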