Computer shuts down during autobuilding with Rosetta

Hi,
I tried to run Rosetta on a Unix machine (Ubuntu 9.04) with my data. Everything goes fine up to the MR step, but during the rebuilding step, after 20 cycles, the computer shuts down.
I have attached the log file for the autobuilding run of Rosetta. I am using Phenix version 1.7.1-743.

I would be grateful if someone could help me out.

The error message is as follows:

Parameters taken from: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/PARAMS_1.eff

# mr_rosetta
#
# Run automr/autobuild/rosetta together

# Type phenix.doc for help
Values of all params:
mr_rosetta {
input_files {
seq_file = "/home/khkim/kjcho/ros_st4/ha_st4.fasta"
hhr_files = None
alignment_files = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_PLACE_MODEL_1/RUN_1/edited_align.ali"
model_info_file = None
data = "/home/khkim/kjcho/ros_st4/757p212121.mtz"
labin = None
search_models = None
copies_in_search_models = None
mr_rosetta_solutions = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/rosetta_rebuild_result.pkl"
ids_to_load = None
map_coeffs = ""
labin_map_coeffs = None
map = ""
display_solutions = False
fragment_files = "/home/khkim/kjcho/ros_st4/stem4_frag09.txt"
fragment_files = "/home/khkim/kjcho/ros_st4/stem4_frag03.txt"
use_dummy_fragment_files = False
sort_fragment_files = True
}
output_files {
log = "mr_rosetta.log"
params_out = "mr_rosetta_params.eff"
}
directories {
temp_dir = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1"
workdir = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1"
output_dir = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1"
top_output_dir = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2"
rosetta_path = "/usr/local/rosetta"
rosetta_binary_dir = "rosetta_source/bin"
rosetta_binary_name = "mr_protocols.default.linuxgccrelease"
rosetta_script_dir = "rosetta_source/src/apps/public/electron_density"
rosetta_pilot_script_dir = "rosetta_source/src/apps/pilot/frank/"
rosetta_database_dir = "rosetta_database"
}
read_hhpred {
number_of_models = 5
number_of_models_to_skip = 0
copies_to_extract = None
only_extract_proper_symmetry = False
}
place_model {
run_place_model = False
prerefine {
run_prerefine = False
number_of_prerefine_models = 1000
number_of_models_in_ensemble = 1
}
model_already_placed = False
model_already_aligned = False
number_of_output_models = 5
align_with_sculptor = True
identity = None
identity_for_scoring_only = 25
use_all_plausible_sg = False
overlap_allowed = 300
selection_criteria_rot_value = 75
fast_search_mode = True
search_down_percent = 25
mr_resolution = 3
refine_after_mr = True
denmod_after_refine = True
find_ncs_after_mr = True
min_length_ncs = 10
fixed_ensembles {
fixed_ensembleID_list = None
fixed_euler_list = 0 0 0
fixed_frac_list = 0 0 0
fixed_frac_list_is_fractional = True
}
copies_of_search_model_to_place = None
}
rescore_mr {
run_rescore_mr = False
nstruct = 5
relax = False
include_unrelaxed_in_scoring = False
align = True
edit_model = False
stage_to_rescore = "mr_solution"
}
rosetta_rebuild {
run_rosetta_rebuild = True
stage_to_rebuild = "rescored_mr_solution"
max_solutions_to_rebuild = 5
min_solutions_to_rebuild = 1
llg_percent_of_max_to_keep = 50
rosetta_models = 20
chunk_size = 1
edit_model = True
superpose_model = False
}
rosetta_rescore {
run_rosetta_rescore = False
percentage_to_rescore = 20
min_solutions_to_rescore = 2
}
similarity {
run_similarity = False
required_cc = 0.2
number_of_required_cc = 5
}
refine_top_models {
run_refine_top_models = False
stage_to_refine = None
sort_score_type = None
percent_to_refine = 20
denmod_after_refine = True
}
average_density_top_models {
run_average_density_top_models = False
percent_to_average = 100
}
relax_top_models {
run_relax_top_models = False
stage_to_relax = None
number_to_relax = 2
nstruct = 5
}
autobuild_top_models {
run_autobuild_top_models = False
number_to_autobuild = 2
quick = False
phase_and_build = False
macro_cycles = None
morph = False
edit_model = True
use_map_coeffs = True
}
setup_repeat_mr_rosetta {
run_setup_repeat_mr_rosetta = False
repeats = 1
template_repeats = 0
morph_repeats = 0
number_to_repeat = 1
acceptable_r = 0.25
minimum_delta_r = None
}
repeat_mr_rosetta {
run_repeat_mr_rosetta = False
copies_in_new_search_group = 1
update_map_coeffs_with_autobuild = True
}
rosetta_modeling {
map_resolution = 3
map_grid_spacing = 1.5
map_weight = 1
map_window = 5
include_solvation_energy = True
weights_file = ""
}
crystal_info {
resolution = 0
space_group = "p212121"
chain_type = *PROTEIN DNA RNA
ncs_copies = 2
}
control {
verbose = False
debug = False
raise_sorry = False
dry_run = False
nproc = 100
group_run_command = "sh"
condor = None
single_run_command = "sh "
background = True
ignore_errors_in_subprocess = True
check_run_command = False
max_wait_time = 100
wait_between_submit_time = 1
n_dir_max = 100000
number_to_print = 5
write_run_directory_to_file = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/INFO_FILE_1"
resolve_command_list = None
start_point = place_model rescore_mr rosetta_rebuild rosetta_rescore \
similarity refine_top_models average_density_top_models \
relax_top_models autobuild_top_models \
setup_repeat_mr_rosetta repeat_mr_rosetta
stop_point = place_model rescore_mr rosetta_rebuild rosetta_rescore \
similarity refine_top_models average_density_top_models \
relax_top_models autobuild_top_models \
setup_repeat_mr_rosetta repeat_mr_rosetta
}
non_user_params {
file_base = None
print_citations = False
highest_id = 15
is_sub_process = True
dummy_autobuild = False
dummy_refinement = False
dummy_rosetta = False
skip_clash_guard = True
correct_special_position_tolerance = None
comparison_mtz = ""
labin_comparison_mtz = None
write_local_files = False
rosetta_fixed_seed = None
}
}

Starting mr_rosetta
Date: Tue Jul 19 13:40:23 2011
Directory: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1

Changing to work directory: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1

Log file will be /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/mr_rosetta.log
Splitting output to /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/mr_rosetta.log
Checking rosetta paths:
rosetta binary: /usr/local/rosetta/rosetta_source/bin/mr_protocols.default.linuxgccrelease
database_dir: /usr/local/rosetta/rosetta_database
script_dir: /usr/local/rosetta/rosetta_source/src/apps/public/electron_density
pilot_script_dir: /usr/local/rosetta/rosetta_source/src/apps/pilot/frank/

================================================================================
Setting up reflection file and labels
================================================================================
FP=F_757p212121 SIGFP=SIGF_757p212121 FreeR_flag=FreeR_flag:

================================================================================
LOADING EXISTING SOLUTIONS
================================================================================
Loading solutions from /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/rosetta_rebuild_result.pkl
RESULTS:
ID: 7 Model: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_PLACE_MODEL_1/RUN_1/AutoMR_run_1_/ha1.p_mr.1.pdb Single chain: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_PLACE_MODEL_1/RUN_1/AutoMR_run_1_/ha1.p_mr.1_one.pdb
Stage: rescored_mr_solution MR_LLG: 741.77 Target NCS copies: 2
NCS copies found: 2
Group: None
CRYST1 73.527 90.035 237.980 90.00 90.00 90.00 P 21 21 21
map_coeffs: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_PLACE_MODEL_1/RUN_1/AutoBuild_run_1_/cycle_best_2.mtz labin:FP=FP PHIB=PHIM FOM=FOMM
map: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_PLACE_MODEL_1/RUN_1/AutoBuild_run_1_/cycle_best_2_nf.map
Placed model: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_PLACE_MODEL_1/RUN_1/AutoMR_run_1_/ha1.p_mr.1.pdb
component_solutions: None

Loaded 1 previous solutions:
(list is in /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/solutions_loaded.log)
SET CRYSTAL SYMMETRY FROM INPUT SOLUTION: CRYST1 73.527 90.035 237.980 90.00 90.00 90.00 P 21 21 21

Methods to be run:

place_model : False
rescore_mr : False
rosetta_rebuild : True
rosetta_rescore : False
similarity : False
refine_top_models : False
average_density_top_models : False
relax_top_models : False
autobuild_top_models : False
setup_repeat_mr_rosetta : False
repeat_mr_rosetta : False

================================================================================
REBUILDING BEST MR SOLUTIONS WITH ROSETTA

================================================================================

Choosing 1 MR solutions for Rosetta rebuilding

Rebuilding MR model (ID:7) by generating 20 rosetta models and choosing
the best using rosetta scoring

# mr_rosetta_rebuild
#
# Run automr mr_rosetta_rebuild with rosetta

# Type phenix.doc for help
Values of all params:
mr_rosetta_rebuild {
input_files {
model = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_PLACE_MODEL_1/RUN_1/AutoMR_run_1_/ha1.p_mr.1_one.pdb"
map = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_PLACE_MODEL_1/RUN_1/AutoBuild_run_1_/cycle_best_2_nf.map"
seq_file = "/home/khkim/kjcho/ros_st4/ha_st4.fasta"
hhr_files = None
alignment_files = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_PLACE_MODEL_1/RUN_1/edited_align.ali"
model_info_file = ""
fragment_files = "/home/khkim/kjcho/ros_st4/stem4_frag09.txt"
fragment_files = "/home/khkim/kjcho/ros_st4/stem4_frag03.txt"
use_dummy_fragment_files = False
sort_fragment_files = True
}
output_files {
log = "mr_rosetta_rebuild.log"
params_out = "mr_rosetta_rebuild_params.eff"
}
directories {
temp_dir = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1"
workdir = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1"
output_dir = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1"
top_output_dir = "/home/khkim/kjcho/ros_st4/MR_ROSETTA_2"
rosetta_path = "/usr/local/rosetta"
rosetta_binary_dir = "rosetta_source/bin"
rosetta_binary_name = "mr_protocols.default.linuxgccrelease"
rosetta_script_dir = "rosetta_source/src/apps/public/electron_density"
rosetta_pilot_script_dir = "rosetta_source/src/apps/pilot/frank/"
rosetta_database_dir = "rosetta_database"
}
rosetta_rebuild {
run_rosetta_rebuild = True
stage_to_rebuild = "rescored_mr_solution"
max_solutions_to_rebuild = 5
min_solutions_to_rebuild = 1
llg_percent_of_max_to_keep = 50
rosetta_models = 20
chunk_size = 1
edit_model = True
superpose_model = False
}
rosetta_modeling {
map_resolution = 3
map_grid_spacing = 1.5
map_weight = 1
map_window = 5
include_solvation_energy = True
weights_file = ""
}
crystal_info {
resolution = 0
space_group = "P 21 21 21 "
chain_type = *PROTEIN DNA RNA
ncs_copies = 2
}
control {
verbose = False
debug = False
raise_sorry = False
dry_run = False
nproc = 100
group_run_command = "sh"
condor = None
single_run_command = "sh "
background = True
ignore_errors_in_subprocess = True
check_run_command = False
max_wait_time = 100
wait_between_submit_time = 1
n_dir_max = 100000
number_to_print = 5
write_run_directory_to_file = None
resolve_command_list = None
start_point = place_model rescore_mr rosetta_rebuild rosetta_rescore \
similarity refine_top_models average_density_top_models \
relax_top_models autobuild_top_models \
setup_repeat_mr_rosetta repeat_mr_rosetta
stop_point = place_model rescore_mr rosetta_rebuild rosetta_rescore \
similarity refine_top_models average_density_top_models \
relax_top_models autobuild_top_models \
setup_repeat_mr_rosetta repeat_mr_rosetta
}
non_user_params {
file_base = "ha1.p_mr.1"
print_citations = False
highest_id = 15
is_sub_process = True
dummy_autobuild = False
dummy_refinement = False
dummy_rosetta = False
skip_clash_guard = True
correct_special_position_tolerance = None
comparison_mtz = ""
labin_comparison_mtz = None
write_local_files = False
rosetta_fixed_seed = None
}
}

Starting mr_rosetta_rebuild
Date: Tue Jul 19 13:40:24 2011
Directory: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1

USER = root
USERNAME = khkim
PID = 16462
Checking rosetta paths:
rosetta binary: /usr/local/rosetta/rosetta_source/bin/mr_protocols.default.linuxgccrelease
database_dir: /usr/local/rosetta/rosetta_database
script_dir: /usr/local/rosetta/rosetta_source/src/apps/public/electron_density
pilot_script_dir: /usr/local/rosetta/rosetta_source/src/apps/pilot/frank/

Sequence rewritten to /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/EDITED_ha_st4.fasta :
>ha1
CKLRGVAPLHLGKCNIAGWILGNPECESLSTASSWSYIVETSSSDNGTCY
PGDFIDYEELREQLSSVSSFERFEIFPKTSSWPNHDSNKGVTAACPHAGA
KSFYKNLIWLVKKGNSYPKLSKSYINDKGKEVLVLWGIHHPSTSADQQSL
YQNADAYVFVGSSRYSKKFKPEIAIRPKVRDQEGRMNYYWTLVEPGDKIT
FEATGNLVVPRYAFAMERNAGKVDDGFLDIWTYNAELLVLLENERTLDYHDSNV

Changing to work directory: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1

Rebuilding the model /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_PLACE_MODEL_1/RUN_1/AutoMR_run_1_/ha1.p_mr.1_one.pdb
with rosetta
================================================================================

Splitting generation of rebuilt models into 20 jobs
with 1 structures generated per job

================================================================================
==============================================================================
Starting sub-processes Rebuild in sets...
==============================================================================

Splitting work into 20 jobs and running with 100 processors using sh
background=True in /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1

Starting job 1...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_1.log
Starting job 2...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_2.log
Starting job 3...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_3.log
Starting job 4...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_4.log
Starting job 5...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_5.log
Starting job 6...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_6.log
Starting job 7...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_7.log
Starting job 8...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_8.log
Starting job 9...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_9.log
Starting job 10...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_10.log
Starting job 11...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_11.log
Starting job 12...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_12.log
Starting job 13...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_13.log
Starting job 14...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_14.log
Starting job 15...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_15.log
Starting job 16...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_16.log
Starting job 17...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_17.log
Starting job 18...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_18.log
Starting job 19...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_19.log
Starting job 20...Log will be: /home/khkim/kjcho/ros_st4/MR_ROSETTA_2/GROUP_OF_ROSETTA_REBUILD_1/RUN_1/REBUILD_IN_SETS_1/RUN_FILE_20.log

Tue, 2011-07-19 01:07
intekhab

Do you mean the computer literally turns itself off and/or reboots? Rosetta shouldn't have a way to do that; I've certainly never heard of it. The only way I've ever seen Rosetta affect the stability of the computer overall is when Rosetta is trying to use more RAM than is available; even then it should just get sluggish until it throws a bad allocation error.

Is it possible it is time-related (it's shutting down overnight after a certain amount of idle time?) or heat-related (faulty heatsink, Rosetta heats up the processor till it fries and shuts down?)?

Tue, 2011-07-19 07:30
smlewis

Yes, the computer hangs, the process stops, and there is no response overnight. We were using 2 GB of memory, which we increased to 3 GB.
The specifications of our system are as follows:
Ubuntu 9.04
GCC 4.2
RAM 3GB
Rosetta 3.2
Phenix 1.7.1
The command-line test "phenix_regression.wizards.test_command_line_rosetta_quick_tests test_rosetta_rebuild" works well.

Wed, 2011-07-20 18:18
intekhab

3 GB sounds like an awful lot for Rosetta to be using. Is there any tool you can use to track how much memory it is using? You could set up a cron job to save "ps aux" every 5 minutes or something to see if it's using a ton of memory before it crashes. There are probably good tools out there for this that I'm not aware of.
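
As a rough illustration of the periodic "ps aux" logging idea, here is a minimal Python sketch. It is not part of Rosetta or Phenix; the function names, the log file name, and the 5-minute interval are all made up for this example, and it assumes the usual Linux "ps aux" column layout (RSS in the sixth column). Each sample is flushed to disk right away so the last entries survive if the machine hangs.

```python
import os
import subprocess
import time
from datetime import datetime


def top_memory_processes(ps_output, limit=5):
    """Return (rss_kb, pid, command) for the processes with the largest RSS.

    Assumes Linux "ps aux" columns:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        try:
            rows.append((int(fields[5]), int(fields[1]), fields[10]))
        except ValueError:
            continue  # ignore lines that do not parse as expected
    rows.sort(reverse=True)
    return rows[:limit]


def log_memory(logfile="memory_usage.log", interval_seconds=300, samples=None):
    """Append the top memory users to logfile every interval_seconds.

    samples=None means run until interrupted (hypothetical helper).
    """
    taken = 0
    while samples is None or taken < samples:
        output = subprocess.check_output(["ps", "aux"]).decode("utf-8", "replace")
        with open(logfile, "a") as handle:
            handle.write("=== %s ===\n" % datetime.now().isoformat())
            for rss_kb, pid, command in top_memory_processes(output):
                handle.write("%10d kB  pid %d  %s\n" % (rss_kb, pid, command))
            handle.flush()
            os.fsync(handle.fileno())  # force the sample to disk before a possible hang
        taken += 1
        if samples is None or taken < samples:
            time.sleep(interval_seconds)
```

Starting `log_memory()` in a separate terminal before launching the run, then reading the end of the log after a reboot, should show whether the Rosetta processes were the ones eating the memory.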

I don't know if the MR protocol does much repacking. The normal way for Rosetta to gobble up TONS of memory is during protein design; repacking alone generally can't generate that many rotamers. How many residues are in the problem you're trying to solve? (I mean how many are in the Rosetta pose, not necessarily how many are in the biological protein). If that number is more than a thousand then maybe we should consider ways to pare down the rotamer memory usage.

Thu, 2011-07-21 08:13
smlewis

Splitting work into 20 jobs and running with 100 processors using sh

I'm not familiar with the MR protocol, but this line concerns me. Is this on a single box or a cluster? If it's a single 8-core, shared memory machine, trying to launch 100 copies of Rosetta or Phenix is very likely to hang/kill your box as they all compete for the same 8 cores and the same memory.

Try changing the "nproc = 100" line to something more reasonable for your system and see if that helps.
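
One way to pick a more reasonable value is to cap it by the number of jobs, the number of cores, and the RAM available. The sketch below is only an illustration: `suggest_nproc` is a made-up helper, not a Phenix function, and the 1 GB per job figure is a placeholder assumption rather than a measured Rosetta requirement, so substitute whatever you actually observe.

```python
import os


def suggest_nproc(n_jobs, cpu_count=None, total_ram_gb=None, ram_per_job_gb=1.0):
    """Suggest an nproc value bounded by job count, CPU cores, and RAM.

    ram_per_job_gb is an assumed placeholder, not a measured value.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1  # fall back to 1 if undetectable
    limit = min(n_jobs, cpu_count)
    if total_ram_gb is not None:
        # Do not start more jobs than fit in memory at once.
        limit = min(limit, int(total_ram_gb // ram_per_job_gb))
    return max(1, limit)  # always allow at least one process


# 20 rebuild jobs on a hypothetical 8-core box with 3 GB of RAM
print("nproc = %d" % suggest_nproc(20, cpu_count=8, total_ram_gb=3))  # nproc = 3
```

With the 3 GB of RAM reported earlier in the thread, memory rather than core count is the tighter limit under this assumption, so the suggested value is far below 100.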

Thu, 2011-07-21 09:21
rmoretti