Fixbb Multiple Resfiles

I was wondering if it is possible to specify multiple resfiles in a single fixbb job submission. For instance, if I have 10 resfiles and 10 PDBs I would like to have 100 output files, 10 for each resfile. If not, is there a simple way to modify the code to do this? I tried writing a wrapper program for the main function in fixbb.cc that cycles resfiles in and out of the flags file and calls the main function, passing that in as argv, but the second call to the main function (second resfile) crashes with an error message saying:

ERROR: fresh_instance has been called on a Mover which has not overridden the base class implementation. Probably you tried to pass a Mover to the job distributor which does not have fresh_instance implemented. Implement the function and try again. (Mover.cc line 161).

I'm not really sure how to fix this error. Are there any suggestions on how to best tackle this problem? We're testing a design method that requires a lot of resfiles and we have access to a Bluegene cluster that could be used to easily farm this job out, but starting and stopping takes too long on a normal cluster and you have to wait in a queue for each Bluegene submission. Thanks in advance for your assistance.

Mon, 2012-10-15 10:34
protos_heis

There is no way to do this in stock Rosetta. The Resource Manager has been recently written in developer trunk for purposes such as these, but it hasn't been released yet. I don't think it's feasible to forward-port ResourceManager to 3.4.

How comfortable with C++ are you? I can point out a handful of different ways to make Q&D changes to 3.4 that will let you supply some sort of "equivalence" where certain jobs get certain resfiles.

I think the way you were trying to do it won't (ever) work, because you only get to pass ONE setup into the job distributor, and control flow doesn't return until after the jobs are done. We need to insert your resfile-iterating bit somewhere after the job distributor has started running so that it runs on a per-job basis.

You may also be able to write a trivially-MPI 'wrapper' script that just runs N independent runs of Rosetta, each with independent arguments, on your N threads (be careful to mangle output PDB names uniquely...)
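A rough sketch of the work-splitting half of that wrapper idea, in Python rather than a real MPI launcher: every rank builds the same full job list and keeps only its own share. Here `rank` and `size` are plain integers; in a real run they would come from the MPI library (for example mpi4py's `MPI.COMM_WORLD.Get_rank()` and `Get_size()`). The function names and the output tag format are made up for illustration.

```python
from itertools import product
from pathlib import Path


def all_jobs(pdbs, resfiles):
    """Every (pdb, resfile) pair, in a fixed order that all ranks agree on."""
    return list(product(sorted(pdbs), sorted(resfiles)))


def jobs_for_rank(jobs, rank, size):
    """Round-robin share of the job list for one rank (hypothetical helper)."""
    return jobs[rank::size]


def output_tag(pdb, resfile):
    """Output name built from both file stems so different resfiles don't collide."""
    return f"{Path(pdb).stem}__{Path(resfile).stem}"


if __name__ == "__main__":
    pdbs = [f"input_{i}.pdb" for i in range(10)]
    resfiles = [f"design_{j}.resfile" for j in range(10)]
    jobs = all_jobs(pdbs, resfiles)
    # Pretend we are rank 3 of 16; a real wrapper would ask MPI for these.
    for pdb, resfile in jobs_for_rank(jobs, rank=3, size=16):
        print(output_tag(pdb, resfile), pdb, resfile)
```

The tags are only unique if the PDB and resfile stems are themselves unique, so files with the same name in different directories would still need extra handling.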

Mon, 2012-10-15 12:07
smlewis

Yeah, I was coming to the conclusion that it wasn't possible this way. It was just a preliminary naive attempt that I thought might work with minimal code modification. I'm pretty familiar with C++, but haven't been writing code in C++ in a while. However, I'm very familiar with OOP and can pick up the loose ends quickly. Is there a specific JobDistributor module I should look at if I end up needing to write internal code to distribute the resfiles? The bottleneck for me is digging through the code to find specific files to modify.

Tue, 2012-10-16 05:53
protos_heis

The thing you'll need to modify to fix this is the ReadResfile TaskOperation in src/core/pack/task/operation/TaskOperations.cc, line 536. This is the line of code that actually picks which resfile gets read in the case of the fixbb application. I guess you could supply Rosetta with some resfile-to-job matching file, and implement the matching here, so that the resfile used would depend on which job was being run.

The job distributor hookup is not tricky, but it will require moving code; the ReadResfile operation would have to be moved into the protocols library to get it to compile after linking it against the job distributor code. If you want to write some sort of job-to-resfile hookup, I can provide you with a patch file that moves the ReadResfile operation into protocols and has a hookup that provides a string name of which job is being run (to match against).

If you can use the parallelization wrapper script that Rocco described below (I don't know how to use it), it will be easier in the long run, I think.

Tue, 2012-10-16 07:13
smlewis

Okay, I figured out how to use the parallelization wrapper but I don't know that this will work on the BlueGene. I'll try it though and it definitely works on our in-house cluster. If it's not too much trouble, could you pass along the patch file? And thank you for pointing out the ReadResfile operation. You saved me hours if not days of time.

Tue, 2012-10-16 09:44
protos_heis

It's nontrivial; let's see if your parallelization wrapper works first.

Tue, 2012-10-16 10:08
smlewis

Okay, I'll have to get in touch with the collaborator who's usually very busy, so it might take some time. I appreciate your assistance in this matter and I'll post back when I have an outcome.

Wed, 2012-10-17 09:10
protos_heis

(Did you see my other post about ReadResfileFromDB at the bottom?)

Wed, 2012-10-17 11:38
smlewis

You might be able to use the parse_resfile command (pack::task::parse_resfile) that is talked about in the PyRosetta tutorial here: https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxwe...

But, my guess is that there is probably already a better way to do this. Have you looked into rosetta scripts?

Mon, 2012-10-15 12:10
jadolfbr

The problem is that resfile takes only one option (actually, it IS a FileVector and takes multiple, but the ReadResfileTaskOperation only looks at the first member of the FileVectorOption). Rosetta scripts won't let you change the resfile path on a per-job basis. You could write self-modifying code where the code rewrote its own XML between runs...but it'd be easier to hack ReadResfileTaskOperation to read the job ID and check against an externally-supplied job-to-resfile table.
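To make the job-to-resfile table idea concrete, here is a minimal sketch of just the matching logic, written in Python for brevity; the real change would be C++ inside the task operation. The two-column, whitespace-separated file format and the function names are assumptions for illustration, not an existing Rosetta format.

```python
def parse_job_table(text):
    """Parse lines of '<job_tag> <resfile_path>' into a dict.

    Blank lines and lines starting with '#' are skipped.
    """
    table = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected '<job_tag> <resfile_path>'")
        tag, resfile = parts
        if tag in table:
            raise ValueError(f"line {lineno}: duplicate job tag {tag!r}")
        table[tag] = resfile.strip()
    return table


def resfile_for_job(table, job_tag, default=None):
    """Return the resfile for this job, falling back to a default if given."""
    if job_tag in table:
        return table[job_tag]
    if default is not None:
        return default
    raise KeyError(f"no resfile listed for job {job_tag!r}")


if __name__ == "__main__":
    example = """
    # job_tag        resfile
    1abc_0001        resfiles/core_only.resfile
    2xyz_0001        resfiles/surface.resfile
    """
    table = parse_job_table(example)
    print(resfile_for_job(table, "2xyz_0001"))
```

Failing loudly on a missing or duplicate tag is a deliberate choice here: silently falling back to the wrong resfile would be hard to spot in a large batch.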

Mon, 2012-10-15 13:00
smlewis

Do you need to have a single Rosetta process do all the separate resfiles? The way I would typically approach something like this is to use a bash script to launch multiple instances of Rosetta, each with their own resfile. You have a single program (bash) which dispatches out the work to separate sub-processes (Rosetta runs), starting the next as soon as the previous one finishes. If you use something like GNU parallel, or the parallel.py script in rosetta_source/src/python/apps/public/ you can even dynamically batch things out to multiple concurrent CPUs. -- If you're doing things like that and using RosettaScripts to compose your runs, you may want to look into the "script vars" feature of RosettaScripts ( http://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/Rose... ), although if you're just substituting resfiles, it may not really be necessary.
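As a sketch of that dispatch pattern (shown in Python instead of bash, and not a substitute for GNU parallel or parallel.py), the script below builds one fixbb command per PDB/resfile pair and runs a few at a time. The executable path and `common.flags` are placeholders, and the `-out:suffix` flag is written from memory, so check it against the option list of your Rosetta version before relying on it.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

FIXBB = "fixbb.linuxgccrelease"  # placeholder: path to your fixbb build


def build_command(pdb, resfile, flags_file="common.flags"):
    """One fixbb command line; the suffix keeps outputs from colliding."""
    suffix = "_" + Path(resfile).stem
    return [
        FIXBB,
        f"@{flags_file}",          # shared options (hypothetical file name)
        "-s", str(pdb),
        "-resfile", str(resfile),
        "-out:suffix", suffix,     # assumed flag; verify against your version
    ]


def run_all(commands, workers=4, dry_run=True):
    """Run commands a few at a time; with dry_run, only return them as strings."""
    if dry_run:
        return [" ".join(cmd) for cmd in commands]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each thread just waits on its own Rosetta subprocess.
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))


if __name__ == "__main__":
    pdbs = [f"input_{i}.pdb" for i in range(10)]
    resfiles = [f"design_{j}.resfile" for j in range(10)]
    commands = [build_command(p, r) for p in pdbs for r in resfiles]
    for line in run_all(commands, dry_run=True):
        print(line)
```

The dry-run default prints the command lines without launching anything, which makes it easy to inspect all 100 before committing cluster time.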

If it is necessary to have a single Rosetta instance that can read in multiple resfiles, your best bet would probably be adjusting the resfile reader like Steven suggests. Although I might suggest enforcing a simple external naming convention for resfiles (e.g. the resfile for "path/to/my_structure.pdb" is "path/to/my_structure.pdb.resfile") to avoid the hassle of loading in an external table.
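The naming convention amounts to a one-line path rule. A small Python sketch of it follows; the function name and the optional existence check are hypothetical, and the actual hook would be C++ in the resfile reader.

```python
from pathlib import Path


def resfile_for_pdb(pdb_path, must_exist=False):
    """Apply the '<pdb path>.resfile' convention to an input path."""
    resfile = Path(str(pdb_path) + ".resfile")
    if must_exist and not resfile.is_file():
        raise FileNotFoundError(f"expected resfile at {resfile}")
    return resfile


if __name__ == "__main__":
    print(resfile_for_pdb("path/to/my_structure.pdb"))
```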

Mon, 2012-10-15 17:11
rmoretti

This will 100% work. The tricky thing here is that with 10 resfiles per input, he'll have 10 first_structure_0001.pdb results, so they'll need to be tagged with the resfile name or dumped into different directories. There may be an output flag for this (I'll look it up if this is the way the user wants to go).

Mon, 2012-10-15 17:58
smlewis

Thanks for the help. Is parallel.py an ssh-based script? I've tried writing a script that uses ssh and it will work on our cluster but I don't think it will work on the BlueGene because we cannot ssh to specific backend nodes. It might be possible though if we get certain privileges and one of our collaborators could probably look into this method. Can you specify the same hostfile with the "--host" option as you would with an mpirun?

Tue, 2012-10-16 06:49
protos_heis

I've only ever used GNU parallel and parallel.py to start multiple jobs on the same machine. From the documentation, you can also use them to run on multiple machines, but only if you can run programs with ssh. If you want a single program which will launch jobs on multiple separate machines and doesn't use ssh, I don't think they can do that.

From taking a quick look at what's written about BlueGene online, it looks like the only way to submit jobs is as a single application which batches out work via MPI, so shell scripting tricks and the like will probably not work. (Although I should note that's only from a quick look - you might want to check your full documentation to see if there's an easy way to run such trivially parallel (inter-node communication independent) jobs). It looks like you're going to need to hack the code.

Tue, 2012-10-16 11:14
rmoretti

Yeah, that seems to be the case, but I'll have our collaborator who's more well versed with the BlueGene look into it to make sure. Thanks for your help.

Thu, 2012-10-18 10:02
protos_heis

In making a patch to move ReadResfile for you, I found this gem in 3.4: ReadResfileFromDB TaskOperation, in protocols/toolbox/task_operations/ReadResfileFromDB.cc

If you know anything about databases, it will work out-of-the-box, if you either replace the ReadResfile operation in fixbb.cc with this one (don't forget the #include) or write a RosettaScript that does the same thing.

If you don't know anything about databases (I know less than I'd like), this code is a good choice to modify into what you need - just gut the apply() function (leaving the first line in place - that's your job tag) and replace it with your own code that matches job tags against desired resfile paths. The function to apply the resfile is the one in the ReadResfile operation in core/pack/task/operation/TaskOperations.cc that we already discussed. Does that make sense?

FYI, here's the documentation on the ReadResfileFromDB if you want to try to use the DB hookup as-is:

ReadResfileFromDB

Lookup the resfile in the supplied relational database. This is useful for processing different structures with different resfiles in the same protocol. The database db should have a table table_name with the following schema:

CREATE TABLE table_name (
tag TEXT,
resfile TEXT,
PRIMARY KEY(tag));

When this task operation is applied, it tries to look up the resfile string associated with the tag defined by

JobDistributor::get_instance()->current_job()->input_tag()

This task operation takes the following parameters:

database_connection_options: Options to connect to the relational database
table=("resfiles" &string)

There is a bunch more database documentation, but it's in our unreleased developers' wiki; if you are going to try it unmodified I will provide static HTML of those pages.
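For anyone wanting to try the schema before wiring it into Rosetta, here is a minimal sketch that builds and queries such a table with Python's standard sqlite3 module. It assumes an SQLite backend and the default table name "resfiles", and it assumes the resfile column holds the resfile text itself (as "resfile string" suggests); if the operation turns out to expect something else, store that instead. The helper names and the example tags are made up.

```python
import sqlite3


def _check_table_name(table):
    """Table names can't be bound as SQL parameters, so validate them."""
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")


def create_resfile_db(db_path, entries, table="resfiles"):
    """Create the tag -> resfile table and fill it from a dict."""
    _check_table_name(table)
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} ("
            "tag TEXT, resfile TEXT, PRIMARY KEY(tag))"
        )
        conn.executemany(
            f"INSERT OR REPLACE INTO {table} (tag, resfile) VALUES (?, ?)",
            entries.items(),
        )
    return conn


def lookup_resfile(conn, tag, table="resfiles"):
    """Return the resfile string stored for an input tag, or None."""
    _check_table_name(table)
    row = conn.execute(
        f"SELECT resfile FROM {table} WHERE tag = ?", (tag,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    entries = {
        "1abc": "NATAA\nstart\n10 A ALLAA\n",
        "2xyz": "NATRO\nstart\n25 A PIKAA AV\n",
    }
    conn = create_resfile_db(":memory:", entries)
    print(lookup_resfile(conn, "1abc"))
    conn.close()
```

Because tag is the primary key, each input tag maps to exactly one resfile, so running the same PDB against ten resfiles would need ten distinct input tags.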

Tue, 2012-10-16 12:30
smlewis

Oh wow, that's a handy function. I do know some database programming so I might be able to figure this one out using the ReadResfileFromDB function. If not I'll try modifying the apply function to get this working as you suggested. This problem seems exponentially simpler now. Thanks a lot for your help! Just one more question: if I supply multiple resfiles in the flags file does fixbb read all of those into the database and then only perform design on the first one and ignore the others, or do I have to modify some code to get it to read in more than one file?

Thu, 2012-10-18 09:59
protos_heis

"if I supply multiple resfiles in the flags file does fixbb read all of those into the database and then only perform design on the first one and ignore the others, or do I have to modify some code to get it to read in more than one file?"

I am concerned by your use of "database" in this question - this answer pertains to Rosetta in general, not to the ReadResfileFromDB operation. The resfile option is a vector, which means that Rosetta will remember all file names passed as resfiles. However, all instances of the use of the resfile option (save one that I am aware of [and responsible for]) check only the first value of this vector. So, the code could use the other resfiles, but in almost every case it does not.

Since you know something about relational databases, I am attaching two more of our documentation pages. They are from our internal wiki. Unfortunately I can't give you access to the wiki, and I can't post HTML here, so it's as wiki markup. I can also hook you up with the guy that wrote that code (but he hasn't been in recently so I don't know if he's available).

Thu, 2012-10-18 10:44
smlewis

Okay, I see where I was confused: the database isn't normally used unless you tell Rosetta to use one. These files are actually very helpful. I'll try going down this road and see if I can get this working and I'll post back if I get stuck at an unresolvable issue. Thanks again for your help.

Thu, 2012-10-18 12:32
protos_heis