LHOC script problem in Rosetta Antibody

Hi all,

I'm getting a problem running the LHOC angle script in the antibody modeling protocol. To reproduce:

python -m pdb ../rosetta/main/source/scripts/python/public/plot_VL_VH_orientational_coordinates/plot_LHOC.py -h3_fasc H3_modeling_scores.fasc -graft_dir grafting -output_dir vhvl

I have uploaded the H3_modeling_scores.fasc file and the contents of the grafting directory. Here is the error message:

Traceback (most recent call last):
  File "../rosetta/main/source/scripts/python/public/plot_VL_VH_orientational_coordinates/plot_LHOC.py", line 120, in <module>
    score_file = ScoreFile(i, infiles, outpath, names).plot_for_all_coordinates(tempfiles, angles_file)
  File "/cluster/ziheng/rosetta/main/source/scripts/python/public/plot_VL_VH_orientational_coordinates/ScoreFile.py", line 68, in plot_for_all_coordinates
    self.plot_hist_and_top_x(coord, tempfiles, angles_file)
  File "/cluster/ziheng/rosetta/main/source/scripts/python/public/plot_VL_VH_orientational_coordinates/ScoreFile.py", line 129, in plot_hist_and_top_x
    color = color_dict[decoy.template_no]
KeyError: 48

The relevant files are on here:

https://drive.google.com/open?id=1-HxJrvePkJU19p70RPem72LvhVndPE57

When I ran it through the debugger, it seems that all the decoys have template_no 48, whereas the color_dict's keys are only 0-12 or something.

Thanks for the help,

Ziheng Wang

Thu, 2019-04-25 14:19
ziheng@mit.edu

Solved this problem. You just need to convert the numpy byte string to a Python string first.
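
For anyone hitting the same KeyError, here is a minimal sketch of what seems to be going on. It assumes the decoy names are read in as NumPy byte strings, which behave like Python bytes. Under Python 3, indexing a bytes object returns the integer character code, so the character '0' comes back as 48. The decoy name is taken from the score file; the variable names are illustrative and not from ScoreFile.py.

```python
# Assumption: the decoy name arrives as a byte string, as it would when
# NumPy reads the score file with a bytes ("S") dtype under Python 3.
raw_name = b"model-0.relaxed_0001"

# Indexing bytes yields an int (the ASCII code), not a one-character string.
as_int = raw_name[6]          # 48, i.e. ord("0")

# Decoding first gives a normal Python str, so indexing yields "0".
name = raw_name.decode("utf-8")
as_char = name[6]             # "0"

print(as_int, repr(as_char))  # prints: 48 '0'
```

If that assumption holds, decoding the name before it is indexed gives a single-character string such as '0' rather than an integer like 48.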

Thu, 2019-04-25 14:31
ziheng@mit.edu

I have the same problem. How do you convert the byte string to a Python string? Could you provide more details? Thank you!

Mon, 2022-08-15 02:04
xcrui

@Ziheng, the templates shouldn't all be the same unless all of your templates were really bad except one. The reason you are seeing 48 is probably an error/bug (Rosetta team, please comment here) in the output of H3_modeling_scores.fasc, which I was able to see with the first snippet below. You would have seen this if you took a few minutes to inspect the output file. As you can see in the description column, numbers prefix some of the description fields but not all of them.

In the next snippet you can see that the color_dict in constants.py only goes up to '9' for the 10th color, because there are only 10 models. So no, you don't convert to a numpy byte array. You file a bug report with Rosetta about "model" in the description column being prefixed with numbers (I'm not sure whether this is an MPI issue).

Why? Because inside ScoreFile.py (third snippet) the template number is defined as the 7th character, decoy_array[name][6], i.e. the single digit after "model-", which can therefore only be 0-9. So if, say, row 6 in the first snippet were among the 10 lowest-scoring models, LHOC would bomb because the 7th character is an "l", the character before the hyphen, since the description is prefixed with '48'. I found this problem because my lowest-scoring models actually had single digits prefixing "model" (fourth snippet), so it bombs because the 7th character is the hyphen, which is not a key of the color_dict.

So let's file a bug report for the description column having number prefixes before "model" in the H3_modeling_scores.fasc file.

$ tail -n+2 H3_modeling_scores.fasc | awk '{print $2,$46}' OFS='\t' | head | column -t
total_score  description
-555.649     model-0.relaxed_0001
-593.614     model-0.relaxed_0002
-573.011     model-0.relaxed_0003
-575.792     model-0.relaxed_0004
-619.581     model-0.relaxed_0005
-518.167     48model-0.relaxed_0001
-589.982     5model-3.relaxed_0001
-495.965     49model-0.relaxed_0001
-602.857     4model-1.relaxed_0001

 

# color codes for models (What template do they come from?)
color_dict = {}
color_dict['0'] = (141, 211, 199)
color_dict['1'] = (255, 255, 179)
color_dict['2'] = (190, 186, 218)
color_dict['3'] = (251, 128, 114)
color_dict['4'] = (128, 177, 211)
color_dict['5'] = (253, 180, 98)
color_dict['6'] = (179, 222, 105)
color_dict['7'] = (252, 205, 229)
color_dict['8'] = (187, 187, 187)
color_dict['9'] = (187, 128, 189)

 

class LHOCDecoys(object):

    def __init__(self, decoy_array):
        self.decoy_array = decoy_array
        self.template_no = decoy_array[name][6]

 

$ tail -n+3 H3_modeling_scores.fasc | awk '{print $2,$46}' OFS='\t' | sort -k1,1n | head
-642.431	9model-9.relaxed_0020
-642.389	8model-8.relaxed_0006
-641.523	1model-8.relaxed_0005
-640.971	3model-8.relaxed_0009
-640.271	2model-9.relaxed_0005
-640.189	2model-8.relaxed_0005
-640.180	7model-8.relaxed_0006
-639.822	1model-8.relaxed_0013
-638.650	2model-8.relaxed_0013
-638.557	8model-2.relaxed_0001

 

## should only see these model-* numbers 0-9 (NOT 0-12 as indicated in the post):

$ tail -n+3 H3_modeling_scores.fasc | awk '{print $46}' | cut -d- -f2 | cut -d. -f1 | sort | uniq -c
    963 0
    195 1
    199 2
    196 3
    193 4
    200 5
    195 6
    198 7
    200 8
    200 9
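
As a stopgap until the prefixes are sorted out, the template number could be pulled from the description by pattern rather than by fixed position. This is only a sketch: template_number is a hypothetical helper, not part of Rosetta or ScoreFile.py, and it assumes that the digits right after "model-" are the template number and that color_dict is keyed by strings as in constants.py above.

```python
import re

# Hypothetical helper, not part of Rosetta: find the digits after "model-"
# regardless of any numeric prefix on the description.
_MODEL_RE = re.compile(r"model-(\d+)")

def template_number(description):
    """Return the digits following 'model-' as a str, or None if absent."""
    if isinstance(description, bytes):   # also covers NumPy byte strings
        description = description.decode("utf-8")
    match = _MODEL_RE.search(description)
    return match.group(1) if match else None

# Descriptions taken from the score file output shown above.
for desc in ["model-0.relaxed_0001", "48model-0.relaxed_0001",
             b"5model-3.relaxed_0001", "9model-9.relaxed_0020"]:
    print(desc, "->", template_number(desc))
```

The returned string could then be used as the color_dict key in place of decoy_array[name][6], with a None result treated as a malformed description.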

 

Sat, 2023-02-04 23:26
Brian Wiley