Rosetta Antibody Prepack - Problem HL_A vs. LH_A

Dear Rosetta Users,

I am having a problem with the antibody prepacking protocol (docking_prepack_protocol). I am again following the Nature Protocols article “Modeling and docking of antibody structures with Rosetta” (doi:10.1038/nprot.2016.180).

Short Story:

I get the following error when I run the prepacking protocol on my antibody, but only when I switch the heavy and light chains from HL to LH:

ERROR: (end_res_ - start_res_ + 1) == conf_size_

Long Story:

I have successfully built my homology models and refined their H3 loops, and now I am at the antibody-antigen docking phase.

As per the tutorial, I have generated a list of my top antibody models:

sort -nk2 H3_modeling_scores.fasc | head -n10 | awk '{print $NF}' > antibody_ensemble.list

I have prepared, cleaned and relaxed my antigen models and generated the ensemble list:

wget http://www.rcsb.org/pdb/files/1TFH.pdb.gz

gunzip 1TFH.pdb.gz

$ROSETTA/tools/protein_tools/scripts/clean_pdb.py 1TFH.pdb A

mkdir relax_antigen

relax.linuxgccrelease -s 1TFH_A.pdb -ex1 -ex2 -use_input_sc -flip_HNQ -no_optH false -nstruct 10 -out:path:pdb relax_antigen -multiple_processes_writing_to_one_directory > relax.log 2>&1 &

ls relax_antigen/*.pdb > antigen_ensemble.list

 

I then generate the antibody_antigen_start.pdb in PyMOL by loading an antibody model, an antigen model, aligning them with 1JPS and outputting the aligned pdb.

Prior to prepacking, the tutorial states …we must ensure the chain order in the PDB is light, heavy, then antigen. In this example, the heavy chain comes before the light chain, so we edit the file in Vim ($ vim antibody_antigen_start.pdb). Using visual mode, we select the heavy chain, cut, and paste it after the light chain but before the antigen (v 1754j 27l d 1668j o Esc p; space indicates separate commands). This process will have to be repeated for every member of the ensemble, such that chain order is matching.

My first question is, why do the chains need to be in the order LH_A and not the default HL_A? This is mentioned in both the SI tutorial and in the main text, but never explained. To me this seems unnecessary since the antibody builder always puts the heavy chain before the light; thus you would need to perform this switching for all of the models you generate each time you want to perform docking, no?

Anyway, I did this for the antibody_antigen_start.pdb and all of the antibody models, as the tutorial dictates. Instead of doing this manually in vim, I used a script (see extract-swap-HL.sh in the linked folder) which extracts the heavy chain and the light chain coordinates from each pdb and reassembles them in the LH order using cat.

But when I run the antibody prepacking protocol, I get the following error message:

docking_prepack_protocol.linuxgccrelease -in:file:s antibody_antigen_start.pdb -ex1 -ex2 -partners LH_A -ensemble1 antibody_ensemble.list -ensemble2 antigen_ensemble.list -docking:dock_rtmin -out:level 500

ERROR: (end_res_ - start_res_ + 1) == conf_size_

The strange thing is, this does not happen when I use the unmodified files with the order heavy-light. Will something terrible happen if I use HL_A? My output seems fine.

Can anyone suggest what the problem is and how I might fix it? I’m fairly new to Rosetta, but to me this seems like an inconsistency between the antibody_antigen_start.pdb and the modified LH antibody models, but I cannot see any discrepancies in my files.

Also, could anyone clarify and explain the apparent need for this ordering of the light and heavy chains?

 

Please find the files that I’m using (including the verbose -out:level:500 log file), under the following link:

https://www.dropbox.com/sh/c340z3dvg82v4ag/AADHUipl0DAXnHQcqp1NHhtCa?dl=0

 

Thank you,

Dan

 

p.s. I am using the latest Rosetta release, 2017.08

Fri, 2017-03-03 02:41
cannond

Update: I now see that the file needs to be in the order LH_A for the snugdock protocol.

[ERROR] Exception caught by JobDistributor for job antibody_antigen_start.prepack_0001 Chains are not named correctly or are not in the expected order

With this new information, I would like to retract my question about why I must use LH_A ordering and ask some others instead:

Is there any way to override the default chain ordering for snugdock? (-partners HL_A does not work)

If not, is there a way to make the earlier homology modeling protocols print out in the 'correct' order?

Does anyone have a script, or a less manual method than the vim editing, for switching chains that will work for an example such as this? Mine apparently does not.

 

Best Regards,

Dan

 

Fri, 2017-03-03 03:38
cannond

Snugdock uses a lot of hardcoded fold tree tricks to "know" what to do with antibodies; that's why it requires the chain ordering it does.

Scripts exist to extract chains; you can extract L, H, and A separately and then reconcatenate them into the correctly ordered file.  See if the badly-placed ./rna_tools/bin/extract_chain.py will work for you.
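As an illustration of the extract-and-reconcatenate idea above, here is a minimal sketch in plain Python. It is not extract_chain.py or any Rosetta tool; the function name `reorder_chains` and the default order L, H, A are assumptions made for this example. It keeps only ATOM/HETATM records, drops any chain not named in `order`, and leaves atom serial numbers unchanged.

```python
def reorder_chains(pdb_text, order=("L", "H", "A")):
    """Return PDB text with coordinate records grouped by chain in `order`.

    Hypothetical helper for illustration only: header records and chains
    not listed in `order` are dropped, and serial numbers are not renumbered.
    """
    by_chain = {}
    for line in pdb_text.splitlines():
        # In the PDB format the chain ID sits in column 22 (index 21).
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            by_chain.setdefault(line[21], []).append(line)

    missing = [c for c in order if c not in by_chain]
    if missing:
        raise ValueError(f"chains not found in input: {missing}")

    out = []
    for chain in order:
        out.extend(by_chain[chain])
        out.append("TER")  # close each chain before the next one starts
    out.append("END")
    return "\n".join(out) + "\n"
```

To use it on a file, something like `open("reordered.pdb", "w").write(reorder_chains(open("antibody_antigen_start.pdb").read()))` should do; the output filename is arbitrary. The same call would have to be repeated for every antibody model in the ensemble so that the chain order matches throughout.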

Fri, 2017-03-03 10:16
smlewis

Yes, the order is there due to the docking foldtree that is used. A foldtree signifies how movements propagate through the structure. It is extremely annoying, but it is that way for a reason. There is some work being done to allow any order of the chains, but currently, this is what will need to be done when working with antibodies. Here is a script that I use to do it (it uses biopython, and the Jade repo): https://github.com/SchiefLab/Jade/blob/master/apps/order_ab_chains.py

That said, I'm not sure why docking_prepack fails with a different chain ordering.  Steven emailed this to the whole antibody group, so we should get a response to that soon. 

Fri, 2017-03-03 10:25
jadolfbr

The error is not informative, but it stems from the comparison of the number of residues in the first member of your ensemble, model-6.relaxed_0002.pdb, and the number of residues in the first member of your input pdb (antibody_antigen_start.pdb). Looking at the PDB, I can see that the N and CA atoms are missing from the L chain of the relaxed model (and two other models: model-6.relaxed_0001.pdb and model-9.relaxed_0002.pdb -- probably model-9.relaxed_0001.pdb will also be problematic). So, the comparison to antibody_antigen_start.pdb fails. The fix is to include these atoms, because Rosetta cannot construct a residue when these mainchain atoms are missing.

Unfortunately, this problem arises due to the deeper issue that the LH_A arrangement is currently mandatory. I'm really sorry about that inconvenience.

To answer another question, the reason you cannot prepack without altering the chains is that prepacking serves as a check for docking. Prepacking output is used for docking and it is generated based on your input, so if your input does not have the correct chain order, the output will not have the correct chain order, and docking will fail (and you will have wasted time prepacking and you will have to do it all over again).
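For anyone wanting to screen an ensemble for the missing-atom problem described above before prepacking, here is a rough sketch in plain Python. It is not part of Rosetta; the names `residues_by_chain` and `missing_backbone` are made up for this example, and checking for N, CA and C is an assumption about which mainchain atoms matter.

```python
BACKBONE = ("N", "CA", "C")  # assumed set of mainchain atoms to check


def residues_by_chain(pdb_text):
    """Map chain ID -> {(resSeq, iCode): set of atom names} for ATOM records."""
    chains = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        chain = line[21]                                    # column 22
        res_id = (line[22:26].strip(), line[26:27].strip()) # columns 23-27
        atom = line[12:16].strip()                          # columns 13-16
        chains.setdefault(chain, {}).setdefault(res_id, set()).add(atom)
    return chains


def missing_backbone(pdb_text):
    """List (chain, res_id, missing_atoms) for residues lacking backbone atoms."""
    problems = []
    for chain, residues in residues_by_chain(pdb_text).items():
        for res_id, atoms in residues.items():
            missing = [a for a in BACKBONE if a not in atoms]
            if missing:
                problems.append((chain, res_id, missing))
    return problems
```

Running `missing_backbone(open(path).read())` over each file in antibody_ensemble.list, and comparing `len(residues_by_chain(...)[chain])` per chain against antibody_antigen_start.pdb, should flag models whose residue counts would disagree with the starting structure.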

Fri, 2017-03-03 13:59
jeliazkov

Thank you for your replies and insight on the fold tree algorithm.

I used the extract_chain.py script and recombined with cat and this worked! I will use this from now on.

Thanks again,

Dan

Wed, 2017-03-08 03:00
cannond