# Errors in scorefiles after running Design

Hi,

I've successfully run Rosetta Design using the April 14th, 2014 weekly release (this is the latest weekly release that I can faithfully use, given the age of our cluster) on a large number of scaffolds. I've attached the options file (femoco2.txt) as a reference.

However, after generating scorefiles for my designs, I've run into a problem: some rows have negative values in the SR_3_hbond_pm column, and everything in those rows is misaligned. Because the number of columns is no longer the same in every row, I can't run DesignSelect.py to rank or sort my results. I've attached a .txt file containing the offending rows (in this case, rows 86-88).

Is this an issue with my Design run, or just with those specific input PDBs? Is there a way to remove these rows automatically? There are quite a few in my full run, so doing it manually would be arduous.

Thanks!

Tue, 2016-03-29 10:25
Jhreed

So three lines have three extra columns. Weird. This makes me think the scorefunction wasn't the same on those runs. The column labels are only reliable if the scorefunction was exactly the same for every output, so unfortunately we don't know what the extra columns are in those rows. Can you try reproducing just those three to see what labels it gives you? I'm not familiar enough with this code to know whether that's easy.

Tue, 2016-03-29 10:52
smlewis

I reran Design using the same input file and iterated it 30 times, but was unable to reproduce the error. I've attached that score file.

Tue, 2016-03-29 17:04
Jhreed

I guess there are two ways to go here:

1) We can figure out an awk command to sanitize your data (there will be some easy way to tell awk to trim out the lines with too many columns; it's not really Rosetta but I can figure it out if you have trouble)

2) You can do a bunch of debugging runs to force this error to occur in isolation, so that we get proper column headers in the scorefile and can see what the extra columns are (to help debug). Actually, this reminds me: do you have full PDBs for each line of these scorefiles, and if so, do they have their full score tables printed at the end of the files? Do the score table headers match for the two types of results? That may tell us where the extra columns are coming from.

Wed, 2016-03-30 08:01
smlewis
awk "NF < 47" scoreerror.txt > scorefixed.txt

is the awk command, BTW. There are 46 columns in the "right" lines, so this keeps every line with at most 46 fields and drops the longer ones.
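If you'd rather not hard-code the 46, here is a rough Python sketch of the same idea. It is only an illustration, not part of Rosetta or DesignSelect.py: the function names `split_by_column_count` and `filter_scorefile` are made up for this post, and it assumes the scorefile is plain whitespace-delimited text in which the most common field count is the correct one.

```python
from collections import Counter


def split_by_column_count(lines, expected=None):
    """Split whitespace-delimited rows into (kept, rejected) by field count.

    Blank lines are ignored. If `expected` is None, the most common field
    count is ASSUMED to be the correct one (an assumption of this sketch).
    Returns (expected, kept, rejected); kept and rejected are lists of
    (line_number, line) pairs, with line numbers starting at 1.
    """
    numbered = [(i, ln) for i, ln in enumerate(lines, start=1) if ln.strip()]
    if not numbered:
        return expected, [], []
    if expected is None:
        counts = Counter(len(ln.split()) for _, ln in numbered)
        expected = counts.most_common(1)[0][0]
    kept = [(i, ln) for i, ln in numbered if len(ln.split()) == expected]
    rejected = [(i, ln) for i, ln in numbered if len(ln.split()) != expected]
    return expected, kept, rejected


def filter_scorefile(src, dst, expected=None):
    """Write rows of `src` with the expected field count to `dst`.

    Returns the rejected (line_number, line) pairs so they can be inspected.
    """
    with open(src) as fh:
        expected, kept, rejected = split_by_column_count(fh, expected)
    with open(dst, "w") as out:
        out.writelines(ln for _, ln in kept)
    return rejected
```

Calling `filter_scorefile("scoreerror.txt", "scorefixed.txt")` would write the cleaned file and return the dropped rows with their line numbers, which is handy for checking which designs they came from. Unlike the awk one-liner, this keeps only rows whose field count matches exactly, so rows that are too short get dropped as well.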

Wed, 2016-03-30 09:52
smlewis