Multistate design on Rosetta - no generations produced

Hi,

I am running a multistate design protocol (mpi_msd.linuxgccrelease) for bispecific antibody design. When I launch the protocol with my options, it starts but then appears "stuck" at this point for a long time:

core.pack.pack_rotamers: built 39485 rotamers at 58 positions.
core.pack.interaction_graph.interaction_graph_factory: Instantiating DoubleLazyInteractionGraph
core.pack.interaction_graph.interaction_graph_factory: IG: 22325380 bytes

I understand from prior trials that the program should start producing generations after this point, but it stays stuck here. Rosetta is using 20 GB of RAM at this stage, yet no generations have been produced.

I am running Ubuntu 18.04, with Rosetta 3.8.

The options I have run here are as follows:

#begin options
-entity_resfile entity.resfile
-fitness_file fitness.daf
-ms::pop_size 100
-ms::generations 240
-ms::numresults 10
-use_input_sc
-ms::fraction_by_recombination 0.025
-database /home/labadmin/Downloads/Rosetta/main/database
-options:user
-run:version
-mpi_tracer_to_file proc
-chemical:exclude_patches LowerDNA UpperDNA Cterm_amidation SpecialRotamer protein_cutpoint_upper protein_cutpoint_lower VirtualBB ShoveBB VirtualDNAPhosphate VirtualNTerm CTermConnect sc_orbitals pro_hydroxylated_case1 pro_hydroxylated_case2 ser_phosphorylated thr_phosphorylated tyr_phosphorylated tyr_sulfated lys_dimethylated lys_monomethylated lys_trimethylated lys_acetylated glu_carboxylated cys_acetylated tyr_diiodinated N_acetylated C_methylamidated MethylatedProteinCterm
-corrections::score::score12prime
-no_his_his_pairE
#end options

I would be grateful if someone could help me understand why the problem persists.

Wed, 2018-06-13 00:36
tong

How much RAM do you have in total? If it's around the 20 GB that Rosetta is using, your computer may be "thrashing": coping with an over-allocation of memory by (temporarily) moving some of it to disk. This slows things down greatly, as writing data to disk is much, much slower than accessing it from memory. The issue is compounded if the run later needs the data that was moved to disk. It then has to move other data out of memory to make room before moving that data back into RAM. If this happens often enough, your computer can grind practically to a halt and make it look like the run isn't making any progress.

An additional question: how long have you waited for the program to make progress? (5 minutes? 5 hours?)
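
One rough way to check for the thrashing described above on Linux is to look at how much swap is in use while the job runs. The following is a minimal sketch, assuming a Linux kernel recent enough to report `MemAvailable` in `/proc/meminfo` (values there are in kB). The helper names `parse_meminfo`, `swap_report` and `read_report` are made up for this illustration and are not part of Rosetta.

```python
def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into a dict of integer values.

    Most fields are reported in kB; plain counters (no unit) are kept too.
    """
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def swap_report(fields):
    """Summarise available memory and swap in use, both in GiB."""
    kb_per_gib = 1024 ** 2
    swap_used_kb = fields["SwapTotal"] - fields["SwapFree"]
    return {
        "mem_available_gib": fields["MemAvailable"] / kb_per_gib,
        "swap_used_gib": swap_used_kb / kb_per_gib,
    }


def read_report(path="/proc/meminfo"):
    """Read the live values from the running system (Linux only)."""
    with open(path) as handle:
        return swap_report(parse_meminfo(handle.read()))
```

Calling `read_report()` a few times while the job is running gives a first impression: swap usage that keeps growing while available memory stays near zero would be consistent with thrashing, although it is not proof of it.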

Wed, 2018-06-13 14:17
rmoretti

I have 64 GB of RAM in my computer. Anyway, the program did start producing its first generation after a few hours of running on a single core. An MPI run would take up too much RAM (around 50+ GB). Is this normal? Also, any suggestions on how I could reduce the RAM usage for an MPI run?

Wed, 2018-06-13 18:23
tong

Ah, you're running it single core.  I guess I should have seen that from the executable name.  It has "MPI" literally in its name - it's not designed for use on one core because it's too slow.  I don't believe I ever tried running it single-core for comparison.

There's no standalone way to improve the MPI memory situation: each process will load the database in separately and do its calculations separately.  In another year or so we might have a proper multithreading model that would alleviate at least the multiple-database problem.  This is a hungry protocol.
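
Since each process carries its own copy of the database, a back-of-the-envelope way to pick the number of MPI processes is to divide the RAM you can spare by the per-process footprint. The sketch below assumes a roughly constant footprint per process, which you would have to measure yourself (for example with `top`); it may well differ from the single-process figure if the work is split across processes. The function name `max_ranks` and the default headroom are invented for this example.

```python
import math


def max_ranks(total_ram_gb, per_rank_gb, headroom_gb=4.0):
    """Estimate how many MPI processes fit in RAM without swapping.

    total_ram_gb: physical RAM in the machine.
    per_rank_gb:  measured memory use of one process (an assumption here).
    headroom_gb:  RAM left for the OS and other programs (arbitrary default).
    """
    if per_rank_gb <= 0:
        raise ValueError("per_rank_gb must be positive")
    usable_gb = total_ram_gb - headroom_gb
    return max(0, math.floor(usable_gb / per_rank_gb))


# Example: 64 GB machine, and a hypothetical 20 GB per process.
print(max_ranks(64, 20))
```

With those illustrative numbers the estimate is 3 processes; anything beyond that would likely push the machine into swap.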

Wed, 2018-06-13 19:06
smlewis

Yep, I wanted to run it on multiple cores, but the RAM usage was way too high. How much RAM do you typically consume when running a multistate protocol?

Thank you for your responses, greatly appreciate it!

Wed, 2018-06-13 19:23
tong

I haven't used it since 2012 or so, so I don't remember. Except for small test jobs, I only ever ran it on a big university cluster where RAM was never an issue.

Thu, 2018-06-14 10:26
smlewis

How big is your sequence space? I would say run what you have but with a much smaller sequence space (only a few mutations allowed) to see if it works or not. I don't remember if that rotamer count is overlarge, but it's definitely not small.

I think the DoubleLazyIG will deal with a lot of the memory issues - it's supposed to trade speed for lower memory use, since the memory demands of multiple backbones' rotamer sets would otherwise be overwhelming. Then again, if that is where it is sitting in the log, it's definitely an indicator of using too much memory.

Wed, 2018-06-13 14:36
smlewis

2 chains of 200+ amino acids each. 10 mutations are allowed (5 on each chain). Is that too much?

Wed, 2018-06-13 18:24
tong

That should be a reasonable space!

Wed, 2018-06-13 19:03
smlewis