Repeated entries in silent file

Hello,

I am running several applications with MPI (docking, pepspec, simple_cycpep_predict), and the output silent file contains repeated entries: the same result name appears several times with different scores.

score ... description
2.211 ... result_0001
1.441 ... result_0001
6.063 ... result_0001
2.267 ... result_0001
14.112 ... result_0002
1.446 ... result_0003
5.847 ... result_0004
1.723 ... result_0005
4.195 ... result_0002

Is this a bug? How should I process this output?

Best

Fri, 2022-08-05 02:52
almeida85

If you're using the same output silent file for multiple simultaneous runs, then duplicated entries are to be expected. There is some "have we already created this output structure" detection code, but if multiple simultaneous runs are writing to the same file, there's a fair probability that two of them will write a structure with the same name. (The duplicate detection is done before a structure's job is launched, rather than before writing.)

This isn't great, but it often isn't a big deal. For most protocols in Rosetta, the output structures are independent of one another, and the numbering is arbitrary: an output structure is an output structure, and it doesn't matter if it's labeled _0001 or _0013. The silent file reading code has special casing to detect duplicates and renumber them, so you might not even notice this if you use the silent file in a next step and don't care about numbering.

In your case, though, it sounds like you have multiple different types of runs writing to the same silent file. If you were going to just mix all the results together for your next step you're probably fine, but if you were going to do something different with the pepspec output versus the simple_cycpep_predict output, then you're out of luck -- there's probably not a good way to see which output came from which run. Your best bet is probably to re-run things, this time giving a different silent file output name to each run.

Fri, 2022-08-05 07:01
rmoretti
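
To see how many names are affected before deciding what to do, the duplicated tags can be counted from the score lines. This is only a minimal sketch: it assumes score lines start with "SCORE:" and end with the structure tag (the "description" column), and the function name duplicated_tags is made up for illustration. The example rows are the scores from the first post.

```python
from collections import Counter


def duplicated_tags(lines):
    """Return {tag: count} for tags that appear on more than one score line."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Assumption: score lines start with "SCORE:" and end with the tag;
        # the header row ends with the literal word "description".
        if len(fields) < 2 or fields[0] != "SCORE:" or fields[-1] == "description":
            continue
        counts[fields[-1]] += 1
    return {tag: n for tag, n in counts.items() if n > 1}


example = [
    "SCORE: score description",
    "SCORE: 2.211 result_0001",
    "SCORE: 1.441 result_0001",
    "SCORE: 6.063 result_0001",
    "SCORE: 2.267 result_0001",
    "SCORE: 14.112 result_0002",
    "SCORE: 1.446 result_0003",
    "SCORE: 5.847 result_0004",
    "SCORE: 1.723 result_0005",
    "SCORE: 4.195 result_0002",
]
print(duplicated_tags(example))  # {'result_0001': 4, 'result_0002': 2}
```

For a real file, the same function can be fed an open file handle, since it only iterates over lines.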

Thank you for the reply.

In my case, I always do a single run with MPI using several processes (usually 32). In most cases, I do sequence screening (pepspec) or docking to get the top-scoring solutions. How should I run the apps under MPI to avoid these repetitions? If the MPI processes write different results for the same solution, what is the point of the parallelization?

Fri, 2022-08-05 09:04
almeida85

This is an example of simple_cycpep_predict where I get duplicated entries in the silent file:

mpirun -np 4 simple_cycpep_predict.linuxgccrelease -cyclic_peptide:sequence_file $sequence \
                      -cyclic_peptide:cyclization_type terminal_disulfide \
                      -cyclic_peptide:require_disulfides true \
                      -cyclic_peptide:disulf_cutoff_prerelax 100 \
                      -cyclic_peptide:disulf_cutoff_postrelax 1 \
                      -cyclic_peptide:genkic_closure_attempts 100000 \
                      -cyclic_peptide:genkic_min_solution_count 1000000 \
                      -cyclic_peptide:min_genkic_hbonds 0 \
                      -cyclic_peptide:MPI_auto_2level_distribution \
                      -mute all -unmute protocols.cyclic_peptide_predict.SimpleCycpepPredictApplication \
                      -out:nstruct 100 \
                      -out:file:silent output_mpi_$filename.silent

 

Mon, 2022-08-08 00:32
almeida85

I solved the problem by renaming the entries with the silent_tools package (https://github.com/bcov77/silent_tools).

Thu, 2022-08-11 00:29
almeida85
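
For readers who want to see what "renaming the entries" amounts to, here is a rough sketch of the idea. It is not the silent_tools implementation. It assumes that each structure block begins with a "SCORE:" line and that the lines belonging to a structure end with its tag; the function name uniquify_tags, the "_dup" suffix, and the "DATA" lines in the example are all hypothetical stand-ins.

```python
def uniquify_tags(lines):
    """Yield lines with repeated structure tags made unique by a suffix."""
    seen = {}         # original tag -> number of times seen so far
    old = new = None  # tag of the structure block currently being read
    for line in lines:
        fields = line.split()
        if fields and fields[0] == "SCORE:" and fields[-1] != "description":
            old = fields[-1]
            seen[old] = seen.get(old, 0) + 1
            # The first occurrence keeps its name; later ones get _dup2, _dup3, ...
            new = old if seen[old] == 1 else f"{old}_dup{seen[old]}"
        if old is not None and new != old and fields and fields[-1] == old:
            # Replace only the trailing tag token, keeping any line ending.
            stripped = line.rstrip()
            line = stripped[: -len(old)] + new + line[len(stripped):]
        yield line


demo = [
    "SCORE: score description",
    "SCORE: 2.211 result_0001",
    "DATA a result_0001",  # hypothetical stand-in for a structure line
    "SCORE: 1.441 result_0001",
    "DATA b result_0001",
    "SCORE: 14.112 result_0002",
    "DATA c result_0002",
]
for out in uniquify_tags(demo):
    print(out)
```

Here the second result_0001 block comes out as result_0001_dup2 on both its score line and its structure line, while everything else is left untouched. A suffix could in principle collide with a tag that already exists in the file, so the result is worth checking before using it downstream.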