Rosetta
2015.31
|
#include <MPIWorkPartitionJobDistributor.hh>
Public Member Functions | |
virtual | ~MPIWorkPartitionJobDistributor () |
dtor. WARNING: singletons' destructors are never called in Mini! Do not try to put things in this function. Here's a nice link explaining why: http://www.research.ibm.com/designpatterns/pubs/ph-jun96.txt More... | |
virtual void | go (protocols::moves::MoverOP mover) |
This may be overridden by derived classes. Default implementation invokes go_main. More... | |
virtual core::Size | get_new_job_id () |
virtual void | mark_current_job_id_for_repetition () |
this function is called whenever a job "soft-fails" and needs to be retried. Generally it should ensure that the subsequent call to obtain_new_job returns this job over again. More... | |
virtual void | remove_bad_inputs_from_job_list () |
Public Member Functions inherited from protocols::jd2::JobDistributor | |
virtual | ~JobDistributor () |
void | go (protocols::moves::MoverOP mover, JobOutputterOP jo) |
invokes go, after setting JobOutputter More... | |
virtual JobOP | current_job () const |
Movers may ask their controlling job distributor for information about the current job. They may also write information to this job for later output, though this use is now discouraged as the addition of the MultiplePoseMover now means that a single job may include several separate trajectories. More... | |
virtual std::string | current_output_name () const |
Movers may ask their controlling job distributor for the output name as defined by the Job and JobOutputter. More... | |
JobOutputterOP | job_outputter () const |
Movers (or derived classes) may ask for the JobOutputter. More... | |
void | set_job_outputter (const JobOutputterOP &new_job_outputter) |
Movers (or derived classes) may set the JobOutputter. More... | |
JobInputterOP | job_inputter () const |
JobInputter access. More... | |
virtual void | mpi_finalize (bool finalize) |
Should the go() function call MPI_finalize()? It probably should; this is true by default. More... | |
JobInputterInputSource::Enum | job_inputter_input_source () const |
The input source for the current JobInputter. More... | |
virtual void | restart () |
core::Size | total_nr_jobs () const |
core::Size | current_job_id () const |
integer access - which job are we on? More... | |
std::string | get_current_batch () const |
what is the current batch ? — name refers to the flag-file used for this batch More... | |
virtual void | add_batch (std::string const &, core::Size id=0) |
add a new batch ( name will be interpreted as flag_file ) More... | |
core::Size | current_batch_id () const |
what is the current batch number ? — refers to position in batches_ More... | |
Protected Member Functions | |
MPIWorkPartitionJobDistributor () | |
ctor is protected; singleton pattern More... | |
virtual void | handle_interrupt () |
This function is called when a job is not yet finished and is terminated abnormally (ctrl-c, kill, etc.). When implementing it in subclasses, make sure to delete all in-progress data that your job spawns. More... | |
Protected Member Functions inherited from protocols::jd2::JobDistributor | |
JobDistributor () | |
Singleton instantiation pattern; Derived classes will call default ctor, but their ctors, too must be protected (and the JDFactory must be their friend.) More... | |
JobDistributor (bool empty) | |
MPIArchiveJobDistributor starts with an empty job-list... More... | |
void | go_main (protocols::moves::MoverOP mover) |
Non-virtual get-job, run it, & output loop. This function is pretty generic and your subclass may be able to use it. It is NOT virtual - this implementation can be shared by (at least) the simple FileSystemJobDistributor, the MPIWorkPoolJobDistributor, and the MPIWorkPartitionJobDistributor. Do not feel that you need to use it as-is in your class - but DO plan on implementing all its functionality! More... | |
JobsContainer const & | get_jobs () const |
Read access to private data for derived classes. More... | |
JobsContainer & | get_jobs_nonconst () |
Jobs is the container of Job objects. More... | |
void | mark_job_as_completed (core::Size job_id, core::Real run_time) |
Jobs is the container of Job objects; non-const access is needed to mark Jobs as completed on the master in the MPI-JobDistributor. More... | |
void | mark_job_as_bad (core::Size job_id) |
ParserOP | parser () const |
Parser access. More... | |
void | begin_critical_section () |
void | end_critical_section () |
void | set_current_job_by_index (core::Size curr_job_index) |
For derived classes that wish to invoke JobDistributor functions which use the current_job_ and current_job_id_ member variables. Note that until those functions complete, it would be a bad idea for another thread to change current_job_. More... | |
bool | obtain_new_job (bool re_consider_current_job=false) |
this function updates the current_job_id_ and current_job_ fields. The boolean return states whether or not a new job was obtained (if false, quit distributing!) More... | |
virtual void | job_succeeded (core::pose::Pose &pose, core::Real run_time, std::string const &tag) |
This function is called upon a successful job completion; it has been virtualized so BOINC and MPI can delay/protect output. The base implementation is just a call to the job outputter. More... | |
virtual void | job_succeeded_additional_output (core::pose::Pose &pose, std::string const &tag) |
This function is called upon a successful job completion if there are additional poses generated by the mover. The base implementation is just a call to the job outputter. More... | |
virtual void | job_failed (core::pose::Pose &, bool will_retry) |
This function is called when we give up on the job; it has been virtualized so BOINC and MPI can delay/protect output. More... | |
virtual void | current_job_finished () |
Derived classes are allowed to clean up any temporary files or data relating to the current job after the current job has completed. Called inside go_main loop. Default implementation is a no-op. More... | |
virtual void | note_all_jobs_finished () |
Derived classes are allowed to perform some kind of action when the job distributor runs out of jobs to execute. Called inside go_main. Default implementation is a no-op. More... | |
void | clear_current_job_output () |
void | check_for_parser_in_go_main () |
Send a message to the screen indicating that the parser is in use and that the mover that's been input to go_main will not be used, but instead will be replaced by the Mover created by the parser. More... | |
bool | using_parser () const |
Is the parser in use? More... | |
bool | run_one_job (protocols::moves::MoverOP &mover, time_t allstarttime, std::string &last_inner_job_tag, std::string &last_output_tag, core::Size &last_batch_id, core::Size &retries_this_job, bool first_job) |
void | setup_pymol_observer (core::pose::Pose &pose) |
After the construction of the pose for this job, check the command line to determine if the pymol observer should be attached to it. More... | |
void | write_output_from_job (core::pose::Pose &pose, protocols::moves::MoverOP mover_copy, protocols::moves::MoverStatus status, core::Size jobtime, core::Size &retries_this_job) |
After a job has finished running, figure out from the MoverStatus whether the pose should be written to disk (or wherever) along with any other poses that the mover might have generated along the way. More... | |
void | increment_failed_jobs () |
Increment the number of failed jobs. More... | |
core::Size | get_job_time_estimate () const |
Get an estimate of the time to run an additional job. If it can't be estimated, return a time of zero. More... | |
void | set_batch_id (core::Size setting) |
set current_batch_id — eg for slave nodes in MPI framework More... | |
virtual bool | next_batch () |
switch current_batch_id_ to next batch More... | |
virtual void | batch_underflow () |
if end of batches_ reached via next_batch or set_batch_id ... More... | |
virtual void | load_new_batch () |
called by next_batch() or set_batch_id() to switch-over and restart JobDistributor on new batch More... | |
core::Size | nr_batches () const |
how many batches are in our list ... this can change dynamically More... | |
std::string const & | batch (core::Size batch_id) |
give name of batch with given id More... | |
Private Member Functions | |
void | determine_job_ids_to_run () |
ctor helper function splits up job list More... | |
Private Attributes | |
core::Size | npes_ |
total number of processing elements More... | |
core::Size | rank_ |
rank of the "local" instance More... | |
core::Size | job_id_start_ |
core::Size | job_id_end_ |
core::Size | next_job_to_try_assigning_ |
Friends | |
class | JobDistributorFactory |
Additional Inherited Members | |
Static Public Member Functions inherited from protocols::jd2::JobDistributor | |
static JobDistributor * | get_instance () |
static function to get the instance of ( pointer to) this singleton class More... | |
Static Protected Member Functions inherited from protocols::jd2::JobDistributor | |
static void | setup_system_signal_handler (void(*prev_fn)(int)=jd2_signal_handler) |
Sets up the callback function that will be called when our process is about to terminate. More... | |
static void | remove_system_signal_handler () |
Set signal handler back to default state. More... | |
static void | jd2_signal_handler (int Signal) |
Default callback function for signal handling. More... | |
This job distributor is meant for running jobs where the number of jobs is equal to the number of processors (or, similarly, where the jobs % processors remainder is very close to the number of processors and NOT a small number). It blindly divides the jobs across processors and then starts running them; it will NOT attempt load balancing by giving more jobs to processors that finish their original jobs early. It is intended for use on smaller numbers of processors and/or where the jobs are known to have equal runtimes. (The WorkPool implementation is meant for cases where runtimes are uncertain or you have a very large number of processors.) It does not "waste" a processor as a master node; instead, all processors run jobs.
|
protected |
ctor is protected; singleton pattern
Constructor. Note that it calls the parent class constructor. It also sets up some internal variables that identify which processor this instance is under MPI (later used in job determination). All processors have the same internal Jobs object (set by the parent class); this class merely iterates over it differently.
References determine_job_ids_to_run(), protocols::jd2::JobDistributor::get_jobs(), job_id_end_, job_id_start_, next_job_to_try_assigning_, npes_, rank_, protocols::jd2::JobsContainer::size(), and protocols::jd2::TR().
|
virtual |
dtor. WARNING: singletons' destructors are never called in Mini! Do not try to put things in this function. Here's a nice link explaining why: http://www.research.ibm.com/designpatterns/pubs/ph-jun96.txt
|
private |
ctor helper function splits up job list
All processors will get the same Jobs object; this function determines which slice belongs to a particular processor. The slice is determined solely by the processor's MPI rank and the number of processors; no communication is needed.
EXAMPLE CASE: 18 jobs, 4 processors
processor rank | number of jobs assigned | range in Jobs vector |
0 | 5 | 1-5 |
1 | 5 | 6-10 |
2 | 4 | 11-14 |
3 | 4 | 15-18 |
References protocols::jd2::JobDistributor::get_jobs(), job_id_end_, job_id_start_, npes_, rank_, and protocols::jd2::JobsContainer::size().
Referenced by MPIWorkPartitionJobDistributor().
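The slicing described above can be sketched as follows. This is a minimal illustration that reproduces the 18-job, 4-processor example, not the actual Rosetta implementation; the function name job_slice and its signature are hypothetical, and giving the leftover jobs to the lowest ranks is inferred from the example table.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper (not Rosetta code): returns the inclusive, 1-based
// [start, end] range of job ids owned by `rank` out of `npes` processors.
// An empty slice is reported as start > end.
std::pair<std::size_t, std::size_t>
job_slice(std::size_t num_jobs, std::size_t npes, std::size_t rank) {
    assert(npes > 0 && rank < npes);
    std::size_t const base = num_jobs / npes;   // jobs that every rank receives
    std::size_t const extra = num_jobs % npes;  // leftovers, one each for the lowest ranks
    // Number of jobs owned by all ranks below this one.
    std::size_t const preceding = rank * base + (rank < extra ? rank : extra);
    std::size_t const count = base + (rank < extra ? 1 : 0);
    return {preceding + 1, preceding + count};
}
```

Because the result depends only on the job count, the rank, and the number of processors, every process can compute its own slice locally, which is why no communication is needed.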
|
virtual |
determine which job to assign next: increment until we run out of available jobs
Implements protocols::jd2::JobDistributor.
References protocols::jd2::JobDistributor::get_jobs(), job_id_end_, protocols::jd2::JobDistributor::job_outputter(), and next_job_to_try_assigning_.
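As a rough sketch of the "increment until we run out" behavior: the struct below is a hypothetical stand-in, not the Rosetta class. The member names mirror the documented attributes, but the `completed` vector (standing in for a check against the job outputter) and the use of 0 as a "no more jobs" value are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the distributor's bookkeeping.
struct SliceCursor {
    std::size_t next_job_to_try_assigning_;
    std::size_t job_id_end_;
    std::vector<bool> completed;  // indexed by 1-based job id; slot 0 unused

    // Walks forward through this rank's slice and returns the first job id
    // that is not already completed, or 0 (assumed sentinel) when the slice
    // is exhausted.
    std::size_t get_new_job_id() {
        while (next_job_to_try_assigning_ <= job_id_end_) {
            std::size_t const candidate = next_job_to_try_assigning_++;
            if (!completed[candidate]) return candidate;
        }
        return 0;
    }
};
```

The cursor only ever moves within the local slice, so a rank never picks up work that belongs to another rank.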
|
virtual |
This may be overridden by derived classes. Default implementation invokes go_main.
Reimplemented from protocols::jd2::JobDistributor.
References protocols::jd2::JobDistributor::go_main().
|
inline protected virtual |
This function is called when a job is not yet finished and is terminated abnormally (ctrl-c, kill, etc.). When implementing it in subclasses, make sure to delete all in-progress data that your job spawns.
Implements protocols::jd2::JobDistributor.
|
virtual |
this function is called whenever a job "soft-fails" and needs to be retried. Generally it should ensure that the subsequent call to obtain_new_job returns this job over again.
Implements protocols::jd2::JobDistributor.
References protocols::jd2::JobDistributor::clear_current_job_output(), protocols::jd2::JobDistributor::current_job_id(), and next_job_to_try_assigning_.
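One way to make the next request return the same job is to rewind the assignment cursor to the current job id. The sketch below assumes that approach, which is suggested by the references to current_job_id() and next_job_to_try_assigning_; the RetryCursor type and its obtain() method are hypothetical, and clearing the job's output is omitted.

```cpp
#include <cstddef>

// Hypothetical bookkeeping for a single rank's slice of 1-based job ids.
struct RetryCursor {
    std::size_t current_job_id_;
    std::size_t next_job_to_try_assigning_;
    std::size_t job_id_end_;

    RetryCursor(std::size_t start, std::size_t end)
        : current_job_id_(0), next_job_to_try_assigning_(start), job_id_end_(end) {}

    // Hands out the next job id, or 0 (assumed sentinel) when none remain.
    std::size_t obtain() {
        if (next_job_to_try_assigning_ > job_id_end_) return 0;
        current_job_id_ = next_job_to_try_assigning_++;
        return current_job_id_;
    }

    // Soft failure: rewind so the next obtain() returns the same job again.
    void mark_current_job_id_for_repetition() {
        next_job_to_try_assigning_ = current_job_id_;
    }
};
```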
|
virtual |
this function handles the FAIL_BAD_INPUT mover status by removing other jobs with the same input from consideration. This function DOES NOT percolate across processors - so if multiple processors have jobs starting with the same bad input, you will get multiple hits through this function. This is less efficient than it theoretically could be (but it's good enough).
Reimplemented from protocols::jd2::JobDistributor.
References protocols::jd2::JobDistributor::current_job(), protocols::jd2::JobDistributor::get_jobs(), job_id_end_, protocols::jd2::JobDistributor::job_outputter(), next_job_to_try_assigning_, and protocols::jd2::TR().
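The local-only skipping can be illustrated with a small sketch. It assumes that jobs sharing an input sit next to each other in the job list and that each job carries an input tag string; the function name, the tag vector, and the parameters are all hypothetical stand-ins for the class's internal state.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: input_tags is indexed by 1-based job id (slot 0 unused).
// Returns the new cursor position after skipping the queued jobs in this
// rank's slice that share the failed job's input.
std::size_t skip_jobs_with_bad_input(std::vector<std::string> const& input_tags,
                                     std::size_t current_job_id,
                                     std::size_t next_job_to_try_assigning,
                                     std::size_t job_id_end) {
    std::string const& bad_tag = input_tags[current_job_id];
    // Stop at the end of the local slice: jobs owned by other ranks are
    // never touched, so they will hit the bad input themselves.
    while (next_job_to_try_assigning <= job_id_end &&
           input_tags[next_job_to_try_assigning] == bad_tag) {
        ++next_job_to_try_assigning;
    }
    return next_job_to_try_assigning;
}
```

If a rank's slice ends in the middle of a run of jobs with the same input, the remaining jobs in that run belong to the next rank and are not skipped there until that rank fails on the input itself.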
|
friend |
|
private |
|
private |
Referenced by determine_job_ids_to_run(), and MPIWorkPartitionJobDistributor().
|
private |
|
private |
total number of processing elements
Referenced by determine_job_ids_to_run(), and MPIWorkPartitionJobDistributor().
|
private |
rank of the "local" instance
Referenced by determine_job_ids_to_run(), and MPIWorkPartitionJobDistributor().