multigraft design

#1

Can someone tell me whether there are demos or tests for running multigraft design? It would be a great help.

Wed, 2012-01-11 07:57
spraha

From one of the authors (Bruno Correia):

Yeah, unfortunately this is still in many and the code porting is not complete yet. If anything, the best place to get command lines and file examples is the supplemental material of the paper (Science. 2011 Oct 21;334(6054):373-6). Other examples are supplied in a README file together with the executable that was made available.

[When he says "still in many" he is referring to the ongoing process of porting multigraft from Rosetta++ to Rosetta3.]

Thu, 2012-01-12 12:20
smlewis

Thank you very much for the help.
While using multigraft match I am encountering the following error:
Increase MAX_SEGMENTS in param.cc
ERROR:: Exit from: input_pdb.cc line: 337
I ran this before and it generated results, but now it gives this error.
What do I do next? I have no clue. I also tried modifying param.cc as the error suggests, but it did not help. It is very important that my results are reproducible. I would appreciate your suggestions.

Wed, 2012-01-18 21:20
spraha

What modification did you make? How much did you increase MAX_SEGMENTS by? Did you recompile after doing it? (I don't have a copy of ++ anymore so I can't go look at the code...)

Thu, 2012-01-19 06:48
smlewis

Thanks smlewis, the problem is sorted now.
But I am encountering a new problem; please help me with this. While running multigraft after match, my run terminates and the following error is shown on the monitor:
terminate called after throwing an instance of 'std::bad_alloc'
what(): St9bad_alloc
Aborted
What can be done to resolve it? I hope this has nothing to do with the input files; if it does, what can be done?
Also, the following warnings appear while running:
trouble finding /root/rosetta_database/energy_quantile__atre__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__atre__aa_ss_sf_nb.data
WARNING: can't find data file energy_quantile__atre__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__repe__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__repe__aa_ss_sf_nb.data
WARNING: can't find data file energy_quantile__repe__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__tlje__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__tlje__aa_ss_sf_nb.data
WARNING: can't find data file energy_quantile__tlje__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__sole__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__sole__aa_ss_sf_nb.data
WARNING: can't find data file energy_quantile__sole__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__hbe__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__hbe__aa_ss_sf_nb.data
WARNING: can't find data file energy_quantile__hbe__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__intrae__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__intrae__aa_ss_sf_nb.data
WARNING: can't find data file energy_quantile__intrae__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__paire__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__paire__aa_ss_sf_nb.data
WARNING: can't find data file energy_quantile__paire__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__probe__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__probe__aa_ss_sf_nb.data
WARNING: can't find data file energy_quantile__probe__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__spk__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__spk__aa_ss_sf_nb.data
WARNING: can't find data file energy_quantile__spk__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__dune__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__dune__aa_ss_sf_nb.data
WARNING: can't find data file energy_quantile__dune__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__rese__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/energy_quantile__rese__aa_ss_sf_nb.data
WARNING: can't find data file energy_quantile__rese__aa_ss_sf_nb.data
trouble finding /root/rosetta_database/unsatisfied_buried_polar__pdb__aa_at_ss.data
trouble finding /root/rosetta_database/unsatisfied_buried_polar__pdb__aa_at_ss.data
WARNING: can't find data file unsatisfied_buried_polar__pdb__aa_at_ss.data
Where can I get these data files from?

The command I am using is:
rosetta.release -epi_graft -multigraft -nres_Ab 432 -native_complex Native.pdb -loop_ranges LOOP_RANGES.TXT -input_file MATCH111.txt -use_non_monotone_line_search -ex1 -ex1aro -ex2 -extrachi_cutoff 0 -atom_vdw_set highres -close_as_ALA -grow_as_ALA -graft_with_Ab -repack_Ab -build_loops -refine_loops -design_after_closure -refine_with_Ab_after_design -closure_attempts 2 -store_n_best_closures 1 -design_attempts 2 -store_n_best_designs 1 -max_chainbreak_score 0.008 -max_local_rama 5.0 -design_after_closure -Ab_epitope_optimize -repack_epitope -paths paths.txt -vall filtered.vall.dat.2006-05-05

I look forward to your kind support.

Mon, 2012-01-23 02:40
spraha

bad_alloc generally means the program ran out of memory. How much memory do you have, and how much is it using? You can try watching it with top or something similar, if it doesn't take too long to crash.

I don't know where the file is; maybe Bruno will.
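
As a rough sketch of how memory use could be logged instead of watched by eye: on Linux, the resident memory of a running process is reported in /proc/<pid>/status. The helper names below are made up for illustration, and the approach assumes a Linux /proc filesystem.

```python
import os
import re

def parse_vm_rss_kib(status_text):
    """Return the VmRSS value in KiB from the text of /proc/<pid>/status, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None

def read_rss_kib(pid):
    """Read the resident memory of a process (assumes Linux with /proc mounted)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vm_rss_kib(handle.read())

if __name__ == "__main__":
    # Demo on the current process; substitute the PID of the rosetta run.
    if os.path.exists("/proc/self/status"):
        print(read_rss_kib(os.getpid()), "KiB resident")
```

Calling read_rss_kib in a loop with a sleep between calls would show whether memory grows steadily before the bad_alloc.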

Mon, 2012-01-23 06:37
smlewis

I am running it on an Intel Xeon L5520 (2.27 GHz, 8 MB cache) with 8 GB of memory. It is using 100% of the CPU and 93.5% of memory. The program has now been running for 24 hours and has not generated any output file. The monitor shows only the following message:
load_epitope_ranges:read 2 loops as follows:
I ran this before on the same machine, and after one full day it produced the same bad_alloc error.
How much more memory will it require? I have been stuck on this problem for two weeks.

Mon, 2012-01-23 20:35
spraha

Considering your suggestion, I thought of reducing my input, so I tried running multigraft design for a single scaffold. I get the following in my output file:
##### 2Bww.pdb match 1
##### 2Bww.pdb 2 S 135 140 365 370 - - - - - - - - -0.61014628 0.79228836 -0.00078884 -0.72458076 -0.55840629 -0.40393704 -0.32047522 -0.24588908 0.91478646 -35.73805618 7.00858116 94.01690674 inf 4.504 3.337 inf 1343.670 67355.977 65.574 30.971 -1 0
##### * 1 - 78 79 277 282 - - - - - - - - - - - - - - - - - - - - inf 3.168 4.965 inf 6629.669 - 111.348 92.493 -1 0
#### 2Bww.pdb 78 79 graft_bb graft_sc
#### 2Bww.pdb 135 140 graft_bb graft_sc
# Grouped Statistics:
# Build --
# successes/attempts = 0/3
#
# Individual Statistics:
# Build --
# [75, 78] cut = 77 successes/attempts = 0/3 min/max_break = 36.81085/40.57056 break associated total min/max_local_rama = 0.00000/0.00000 total min/max_local_rama = 0.00000/0.00000
# [83, 86] cut = 83 successes/attempts = 0/3 min/max_break = 44.25626/44.28807 break associated total min/max_local_rama = 0.00000/0.00000 total min/max_local_rama = 0.00000/0.00000
# [136, 139] cut = 138 successes/attempts = 0/3 min/max_break = 27.07369/27.07369 break associated total min/max_local_rama = 0.00000/0.00000 total min/max_local_rama = 0.00000/0.00000
# [144, 147] cut = 144 successes/attempts = 0/3 min/max_break = 30.46113/30.46113 break associated total min/max_local_rama = 0.00000/0.00000 total min/max_local_rama = 0.00000/0.00000
Why is the success count zero? What could be the way out?
This is the command I am using:
rosetta.release -epi_graft -multigraft -nres_Ab 432 -native_complex /root/rosetta_source/src/rosetta++/src/epigraft/native.pdb -loop_ranges LOOP_RANGES.TXT -input_file 2bww_match.out -use_non_monotone_line_search -ex1 -ex1aro -ex2 -extrachi_cutoff 0 -atom_vdw_set highres -close_as_ALA -grow_as_ALA -graft_with_Ab -repack_Ab -build_loops -refine_loops -refine_with_constraints -design_after_closure -refine_with_Ab_after_design -closure_attempts 3 -store_n_best_closures 1 -design_attempts 3 -store_n_best_designs 1 -max_chainbreak_score 0.008 -max_local_rama 0.008 -design_after_closure -Ab_epitope_optimize -repack_epitope -paths paths.txt -output_file 2b3w_multi.txt -vall filtered.vall.dat.2006-05-05
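
With a command line this long it is easy to pass the same flag twice. The sketch below only spots repeated flags in a command string; it makes no claim about whether a repeated flag is harmful in Rosetta++. The function names are illustrative, and the example command is a shortened stand-in for the full one.

```python
import shlex
from collections import Counter

def is_number(token):
    """True for tokens such as 0.008 or -1 that are option values, not flags."""
    try:
        float(token)
        return True
    except ValueError:
        return False

def repeated_flags(command):
    """Return the flags (tokens starting with '-') that occur more than once."""
    tokens = shlex.split(command)
    flags = [t for t in tokens[1:] if t.startswith("-") and not is_number(t)]
    counts = Counter(flags)
    return sorted(flag for flag, n in counts.items() if n > 1)

# Shortened stand-in for the command above.
command = (
    "rosetta.release -epi_graft -multigraft -design_after_closure "
    "-closure_attempts 3 -max_chainbreak_score 0.008 -design_after_closure"
)
print(repeated_flags(command))
```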

I look forward to your kind response.

Fri, 2012-02-10 02:16
spraha

I have tried to send this along to someone who might know but they haven't replied yet...

Tue, 2012-02-14 13:11
smlewis

OK. Can you tell me whether the fragment files for my problem have to be prepared for the scaffold or for the native complex (antibody-epitope complex)?
I look forward to your response.

Thu, 2012-02-23 01:56
spraha

I can't get anyone who knows this code to look at the question. I guess you'll have to take the option of last resort and email Bill Schief (the corresponding author) directly...

Mon, 2012-02-27 06:42
smlewis

Here we go, this is from Sergey Menis:

1. For post #4 (http://www.rosettacommons.org/content/multigraft-design#comment-3861): I think the issue is simply a database mismatch. As you can see, he gets a number of warnings for standard files which are in the rosetta++ database. His/her database is either corrupt, or he is trying to use the mini database with Rosetta++. Alternatively, Ron Jacak has identified a memory leak if Epigraft is compiled with the latest gcc. Ron, could you please elaborate on this item?

2. For post #7 (http://www.rosettacommons.org/content/multigraft-design#comment-3947): To answer his/her question directly, to get more successes he needs to increase -max_chainbreak_score. Multigraft is not able to find any reasonable solutions, which is apparent from the fact that the min chainbreak never goes below 35 angstroms (in the Individual Statistics section). I would examine whether the match is actually what he wants and possibly adjust the build instructions.

In general, it is useful to post the output from the run itself and any input files.
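
For reference, a small sketch of how the "Individual Statistics" lines could be pulled apart and compared against the cutoff. The regular expression is an assumption based only on the lines posted in this thread (shortened here to the part up to min/max_break), not on a documented output format.

```python
import re

# Pattern inferred from the posted output; the real format may vary.
STAT_LINE = re.compile(
    r"\[(\d+),\s*(\d+)\]\s+cut\s*=\s*(\d+)\s+"
    r"successes/attempts\s*=\s*(\d+)/(\d+)\s+"
    r"min/max_break\s*=\s*([\d.]+)/([\d.]+)"
)

def parse_break_stats(text):
    """Return one dict per loop-break line found in the text."""
    rows = []
    for line in text.splitlines():
        m = STAT_LINE.search(line)
        if m:
            rows.append({
                "loop": (int(m.group(1)), int(m.group(2))),
                "cut": int(m.group(3)),
                "successes": int(m.group(4)),
                "attempts": int(m.group(5)),
                "min_break": float(m.group(6)),
                "max_break": float(m.group(7)),
            })
    return rows

SAMPLE = """\
# [75, 78] cut = 77 successes/attempts = 0/3 min/max_break = 36.81085/40.57056
# [83, 86] cut = 83 successes/attempts = 0/3 min/max_break = 44.25626/44.28807
# [136, 139] cut = 138 successes/attempts = 0/3 min/max_break = 27.07369/27.07369
# [144, 147] cut = 144 successes/attempts = 0/3 min/max_break = 30.46113/30.46113
"""

MAX_CHAINBREAK_SCORE = 0.008  # value of -max_chainbreak_score in the posted command

for row in parse_break_stats(SAMPLE):
    status = "ok" if row["min_break"] <= MAX_CHAINBREAK_SCORE else "not closed"
    print(row["loop"], row["min_break"], status)
```

Every break in the posted run is orders of magnitude above the 0.008 cutoff, which is why no attempt counts as a success.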

Cheers,

--
Sergey Menis

Laboratory of Bill Schief

Wed, 2012-02-29 06:07
smlewis

The problem Sergey is referring to is when epigraft/multigraft is compiled with gcc version 4.6. Depending on what's in the input file, an infinite loop can result. The process will be using 100% cpu but won't be generating any output. It doesn't sound like this is the problem the user is experiencing, though. I think Sergey's suggestion of a database mismatch could explain the warnings. Are the warnings still appearing? Because from later posts, it looks as though the program is giving the proper output.
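
To check for a database mismatch, one option is to collect the file names from the warnings and test whether they exist in the database directory the run points at. This is only a sketch: the warning pattern is copied from the log posted above, and the function names are made up.

```python
import os
import re

# Pattern taken from the warning lines in the posted log.
WARNING = re.compile(r"WARNING: can't find data file (\S+)")

def missing_files_from_log(log_text):
    """Return the unique data file names mentioned in the warnings, in log order."""
    seen = []
    for name in WARNING.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen

def still_missing(database_dir, names):
    """Return the names that do not exist as files in database_dir."""
    return [n for n in names if not os.path.isfile(os.path.join(database_dir, n))]

log = (
    "trouble finding /root/rosetta_database/energy_quantile__atre__aa_ss_sf_nb.data\n"
    "WARNING: can't find data file energy_quantile__atre__aa_ss_sf_nb.data\n"
    "WARNING: can't find data file energy_quantile__repe__aa_ss_sf_nb.data\n"
)
print(missing_files_from_log(log))
```

Running still_missing against both the Rosetta++ database and the mini database would show which one actually contains the files.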

Wed, 2012-02-29 21:31
ronj

Multigraft has an internal fragment picker guided by secondary structure assignment based on a mix of prediction algorithms. You do not need to generate your own fragments. The protocol will pick fragments it needs to close the breaks automatically.

Wed, 2012-02-29 13:15
Sergey Menis

As Sergey pointed out, it appears that Rosetta is unable to close the loop using the build instructions you have specified in the 2bww_match.out file. If you add the flag "-dump_all_closure_attempt_structures", you can look at the structures Rosetta is trying to close the loops for. Given the chain break scores above, you probably need to allow more flexibility in the scaffold and/or build more linker residues between the match and the scaffold to close these loops successfully.

Wed, 2012-02-29 21:41
ronj

Thank you, everyone, for the kind support.
I modified the input file by adding instructions for building linkers, and using "-dump_all_closure_attempt_structures" I got closures that are very near to what I want. One of them is a success, but the refine closure success count is still zero. Do I need to change my chainbreak thresholds further?
Can you also elaborate on the vall database? Which vall should I use for my problem? Do I have to make my own vall database?
I am using the old_graft_info_format; is it fine to use this format?

Sun, 2012-03-04 22:59
spraha

If one of the closures is a success, you should examine your clash cutoffs; possibly you need to increase those. Still, try to understand why it is so difficult to close the break. Either 1) too few residues are allowed to move, or 2) the chainbreak score cutoff is very strict.

Do the same exercise for refine failures. Is it clashing heavily with some part of the protein? Look at the closure attempt structures which were dumped to get more ideas.

I am not an expert on the vall database and I will not comment on that. Still, you should just use the database which you download with the Rosetta distribution. For Multigraft, you do not need to generate your own fragment files.

If the old_graft_info_format works for you it is just fine. If you have additional documentation which amply describes the new format you can try to use it. However, their functionalities overlap.

Good luck!

Mon, 2012-03-05 03:24
Sergey Menis

Yes, exactly: the one successful closure shows heavy clashing in a particular region.
And what is the difference between closures, attempted closures and partial closures?
I would appreciate your suggestions.
Thanks in advance!

Tue, 2012-03-06 03:18
spraha

Closures are designs where the loop closure algorithm is able to find a solution which satisfies your chain break cutoff. Attempted closures are solutions which did not satisfy your chainbreak and/or clash cutoffs. Partial closures are designs which satisfy your cutoffs for some but not all of the loop breaks.
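
Read literally, those three definitions amount to the following classification. This is one interpretation of the wording above, not Multigraft's internal bookkeeping, and the function name is made up.

```python
def classify_closure(break_passes):
    """Classify a design from per-break results.

    break_passes holds one boolean per loop break: True if that break
    satisfied the chainbreak (and clash) cutoffs, False otherwise.
    """
    if not break_passes:
        raise ValueError("need at least one loop break")
    if all(break_passes):
        return "closure"
    if any(break_passes):
        return "partial closure"
    return "attempted closure"

print(classify_closure([True, False]))
```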

Tue, 2012-03-06 03:23
Sergey Menis

What is the difference between overall_rms, n_terminal_rms, c_terminal_rms and rms_over_length in the match result file? Which of these should be used for filtering or selecting the match results?
Which clash checks (inter or intra) should be used for filtering the primary and secondary loops? Should I filter by both clash checks for both loops, or would one for each be fine?

Thu, 2012-03-15 00:34
spraha

All of the rms values describe the matches based on the alignment system used for that match:
overall_rms -> superposition and endpoint
n_terminal_rms -> C2N
c_terminal_rms -> N2C
rms_over_length -> makes most sense for superposition matches; the overall_rms is divided by the number of residues matched. This is a measure of how well each residue was matched.

Pick the alignment system you are interested in and then sort on the particular rms column.

As far as clash goes, ideally you should have neither. Inter clash refers to the 'antibody to scaffold' clash and intra clash is the 'epitope to scaffold' clash. In my experience, I try to sort on both. However, you may see by eye that a certain clash is easily resolvable.
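
The mapping from alignment system to rms column can be written down as a lookup, with the sort done on the chosen column. This sketch assumes the match results have already been parsed into dictionaries keyed by column name; the parsing step, the "scaffold" key and the numbers in the example are hypothetical.

```python
# Alignment system -> rms column, as listed above.
RMS_COLUMN = {
    "superposition": "overall_rms",
    "endpoint": "overall_rms",
    "C2N": "n_terminal_rms",
    "N2C": "c_terminal_rms",
}

def rms_over_length(overall_rms, n_matched_residues):
    """overall_rms divided by the number of residues matched."""
    return overall_rms / n_matched_residues

def sort_matches(matches, alignment_system):
    """Sort parsed match records by the rms column for the given alignment system."""
    column = RMS_COLUMN[alignment_system]
    return sorted(matches, key=lambda match: match[column])

# Made-up records for illustration only.
matches = [
    {"scaffold": "A", "overall_rms": 2.4, "n_terminal_rms": 1.1, "c_terminal_rms": 3.0},
    {"scaffold": "B", "overall_rms": 1.2, "n_terminal_rms": 2.5, "c_terminal_rms": 0.9},
]
print([m["scaffold"] for m in sort_matches(matches, "C2N")])
```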

Thu, 2012-03-15 11:24
Sergey Menis

Are the matches always made on the surface? And how do I make sure that the loops are in exactly the same relative conformation after adding linkers?

Fri, 2012-03-16 00:36
spraha

Matches can be anywhere on the protein. However, if the match is buried in the core the inter clash score will be very large and will not pass the cutoffs.

Epigraft/Multigraft holds the relative orientation between the loops fixed; this is a major part of the protocol. You do not need to do anything.

Fri, 2012-03-16 07:21
Sergey Menis

How do I decide on my inter and intra clash cutoffs? At present I go by the default values, but are there any other criteria to consider before running the match and design stages?
And how should I set my max chainbreak? Should the max value be equal to the max_rough_match_rms value?

Thu, 2012-03-22 01:09
spraha

In my experience < 50-100 clash units is essentially clash free. Anything above that has clash. During the matching stage it is best to set your clash cutoffs quite high to see all of the potential matches. However, a biologically relevant design will have no clash or very little. Very often you will find that you can easily resolve the clash manually (deleting residues from N- or C-termini, trimming loops).
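
The two-stage idea (loose cutoffs while matching, strict ones when picking designs) could be sketched like this. It assumes match records parsed into dictionaries; the "inter_clash" and "intra_clash" keys and all the numbers are hypothetical, with 50 taken from the lower end of the rule of thumb above.

```python
def filter_by_clash(matches, max_inter, max_intra):
    """Keep the matches whose inter and intra clash scores are within the cutoffs."""
    return [
        match for match in matches
        if match["inter_clash"] <= max_inter and match["intra_clash"] <= max_intra
    ]

# Made-up records for illustration only.
matches = [
    {"scaffold": "A", "inter_clash": 12.0, "intra_clash": 40.0},
    {"scaffold": "B", "inter_clash": 900.0, "intra_clash": 30.0},
    {"scaffold": "C", "inter_clash": 20000.0, "intra_clash": 5.0},
]

# Loose cutoffs to see the candidates, strict ones for the final pick.
candidates = filter_by_clash(matches, max_inter=1000.0, max_intra=1000.0)
clash_free = filter_by_clash(candidates, max_inter=50.0, max_intra=50.0)
print([m["scaffold"] for m in candidates], [m["scaffold"] for m in clash_free])
```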

You can try to include the following flags during the design stage:
'-fluidize_takeoff' : move two dihedrals (phi/psi) at +/-1 residue to takeoff for N2C and C2N matches
'-fluidize_landing' : move four dihedrals (phi/psi/phi/psi) at +/-1 and +/-2 residue for broken endpoints (landing) of all matches
'-rb_move' : attempt rigid body movement during match

-use_non_monotone_line_search : uses alternative refinement procedure
-ex1 -ex1aro -ex2 : enables additional rotamer sets
-scan_randomize_cutpoints : randomizes cutpoint selection during closure

A chainbreak of 0.09 or less is a properly closed chainbreak. That is what you should aim for. However, if you plan on taking the design into some other procedure like Loop Model you could accept a poorly closed decoy.

Please don't forget that Rosetta++ version of Multigraft has not been developed for some time. We are porting to Rosetta3 right now and it will be able to take advantage of all Rosetta improvements. In short, perform your matching and rough closure in Match and Multigraft and refine your matches with a current version of RosettaRelax, LoopModel or FlexBackbone Design.

Good luck!

Thu, 2012-03-22 10:04
Sergey Menis

Thank you for the suggestion!
But I have some confusion regarding the number of tethers and linkers. How can I decide on the number of tethers and linkers, besides the restrictions mentioned? And why do I observe some scaffold gaps that are exactly the length of the epitope while others are either much larger or much smaller? Does the scaffold gap influence the choice of the number of tethers and linkers?

Mon, 2012-04-02 21:33
spraha

The scaffold gaps matching the length of the epitope exactly are called superposition matches. They were envisioned to identify exact backbones where one could just transfer the amino acids without modifying the backbone. The varied length gaps are from the other alignment systems.

With regard to the first part of the question, here is a cut and paste from the documentation:
#######
The possible operations are then via tether and linker residues. Tether residues describe usage of
existing scaffold and epitope residues, while linker residues describe usage of completely new, grown
residues or deletion of residues. So, the ssbm section can conceptually be split into the following
parts where "s" refers to scaffold, "e" refers to epitope, "l" refers to linker and "t" refers to
tether:

s_left_t s_left_l | e_left_l e_left_t e_fixed e_right_t e_right_l | s_right_l s_right_t

Epitope linker residues can only describe growth, not deletion, therefore the 'x' character
described below may not be used when dealing with the epitope. The breakpoints govern the type of
loop closures for the epitope.
#######
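
To make the naming scheme in that excerpt concrete, here is a sketch that decodes each field name into owner, side and residue type. It relies only on the layout line quoted above; the function names are made up, and nothing here reads or writes a real graft info file.

```python
# Layout line copied from the documentation excerpt above.
LAYOUT = "s_left_t s_left_l | e_left_l e_left_t e_fixed e_right_t e_right_l | s_right_l s_right_t"

OWNER = {"s": "scaffold", "e": "epitope"}
KIND = {"t": "tether", "l": "linker"}

def decode_field(name):
    """Split a field such as 's_left_t' into (owner, side, kind); kind is None for e_fixed."""
    parts = name.split("_")
    owner = OWNER[parts[0]]
    side = parts[1]
    kind = KIND[parts[2]] if len(parts) == 3 else None
    return owner, side, kind

def decode_layout(layout=LAYOUT):
    """Return the decoded fields grouped by the '|' separators in the layout line."""
    return [[decode_field(field) for field in group.split()] for group in layout.split("|")]

for group in decode_layout():
    print(group)
```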

Tue, 2012-04-03 09:28
Sergey Menis

thanks again for the help!
I am getting some structures, but when I align them with my actual complex, one of the loops aligns exactly with the actual one while the other does not, although it is at almost the same position. What could be the reason? The structures are generated by E alignment.
Can you suggest some measures that could help me improve them? Or does the module always generate structures this way?

Mon, 2012-04-09 04:08
spraha

It is difficult to tell without looking at the structures, but the relative orientation between the two loops must stay exactly the same as in the input loops. There may be some changes at the edges of the loops due to closure and design, but the cores should stay fixed relative to each other.

Tue, 2012-04-10 10:46
Sergey Menis