How do I implement more CPU cores in the execution of FlexPepDock?

Hello all. I am a university student currently working on my thesis. I don't know whether this topic has been addressed before, but I couldn't find anything matching my question.

For my thesis I need to use the FlexPepDock program of Rosetta on a local Linux machine. As the subject says, I want to use more cores when processing my samples, because a single job generally takes a very long time while using only one CPU core. The lab provided me with a machine that has 8 cores I can dedicate to my analysis (without crashing the system), and I'd like to know whether I can speed things up by giving a single docking run more processing power.

I am aware that I can run multiple FlexPepDock jobs locally on these cores, but as I said, I want to dedicate more cores to a single job.

I am no Linux pro or anything like that, so if anyone has an idea and can give me a step-by-step list of how to make this possible (if it is), I would be very grateful :)

Thanks in advance for all your answers.

Post Situation: 
Thu, 2024-05-02 08:55

Most Rosetta protocols (including FlexPepDock) are "trivially parallelizable". We run multiple simulations that produce multiple output structures, and typically each output structure is independent of every other. So it really doesn't matter whether you launch one command with -nstruct 10000 or launch it one hundred times with -nstruct 100: you'll get the same scientific results.

The one wrinkle is that Rosetta has restart behavior: it won't write new output if the old output is already present. So running 100 jobs in one directory doesn't quite work as-is. (They'll also fight each other by trying to write to the same file names.) But you can either run them in separate directories or use the `-out:suffix` option. If each command gets a different `-out:suffix` (e.g. `-out:suffix _1`, `-out:suffix _2`, `-out:suffix _3`, ...), the output names will differ and the multiple runs can coexist in the same directory.
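As a rough illustration of the suffix approach, here is a minimal Python sketch that splits a total structure count across several processes and gives each one its own `-out:suffix`. The helper names (`build_commands`, `run_all`) and the placeholder base command are hypothetical. Only `-nstruct` and `-out:suffix` come from the explanation above, so substitute the FlexPepDock command line you already use.

```python
import subprocess


def build_commands(base_cmd, total_nstruct, n_procs):
    """Split total_nstruct across up to n_procs commands, each with a unique suffix."""
    per_proc, extra = divmod(total_nstruct, n_procs)
    commands = []
    for i in range(n_procs):
        # The first `extra` processes take one additional structure each.
        n = per_proc + (1 if i < extra else 0)
        if n == 0:
            continue  # more processes than structures requested
        commands.append(
            list(base_cmd) + ["-nstruct", str(n), "-out:suffix", f"_{i + 1}"]
        )
    return commands


def run_all(commands):
    """Start every command at once, wait for all of them, return the exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]


if __name__ == "__main__":
    # Hypothetical placeholder: replace with your existing FlexPepDock command,
    # minus any -nstruct and -out:suffix options.
    base_cmd = ["echo", "your_flexpepdock_command_here"]
    for cmd in build_commands(base_cmd, total_nstruct=1000, n_procs=8):
        print(" ".join(cmd))
    # To actually launch the runs:
    # exit_codes = run_all(build_commands(base_cmd, 1000, 8))
```

With 8 processes and 1000 structures in total, each process is asked for 125 structures, and the suffixes `_1` through `_8` keep the output names from colliding in a shared directory.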

Alternatively, there are ways to run with MPI so that the multiple processes coordinate with each other, but that's typically more effort than it's worth unless you're already running on a cluster where MPI is the expectation.


P.S. We're updating the site, and as part of that we're moving the forums to GitHub Discussions.

If you have additional questions or are still having issues, please feel free to open up a thread over there.

Thu, 2024-06-20 13:34