Problem with clustering 1000 centroid pdbs

I am trying to use the Rosetta 3.1 cluster application to cluster 1000 PDBs of a homodimer that were created by Symmdock in Rosetta 3.2 and are still in centroid mode. Although I do get the PDB outputs for the clusters, the output txt file crashes at line 2,707,992 and doesn't give me the clustering information. The problem is that there are so many warning messages of the following types for each atom:
core.conformation.Conformation: [ WARNING ] missing heavyatom... [ WARNING ] can't find atom for res 84 atom CEN... [ WARNING ] discarding 1 atoms at position 517 in file...
that the output file crashes before it gets to the clustering information.
I know that my command line works, because when I try it on, say, 20 structures it outputs both the warnings and the clustering information just fine.
I thought it might help to silence all the warnings so that they aren't printed to the output file, leaving room for the clustering information. Is that even possible? If so, how?
If not, can anybody tell me how to fix this problem so that the clustering information is printed to the txt output?

Thank you very much,

Sat, 2012-04-28 10:34

It's unlikely that Rosetta is crashing just because the output file is too long, unless you are running out of disk space. If you want to silence the warnings, you can try "-mute", but I don't think that's addressing the real problem.

One question is, why is Rosetta spitting out so many warnings? I suspect the clusterer is expecting fullatom PDBs and is getting centroid PDBs. Are you passing any fullatom or centroid flags specifically? Try -in:file:centroid, maybe?

Neither file size nor a centroid/fullatom mismatch ought to cause a hard crash, but I don't have enough data to suggest what is causing the crash yet, so try those things and we'll see...
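
In the meantime, the warnings can also be stripped from the saved log after the run rather than muted during it. This is only a minimal sketch: the file names demo_log.txt and filtered.txt are made up for illustration, the first two sample lines are the warnings quoted in the original post, and the third is a placeholder, not real clusterer output. It assumes every warning line carries the "[ WARNING ]" tag.

```shell
# Build a tiny stand-in log; for a real run, point LOG at the saved output file.
LOG=demo_log.txt   # hypothetical file name
printf '%s\n' \
  "core.conformation.Conformation: [ WARNING ] missing heavyatom..." \
  "core.conformation.Conformation: [ WARNING ] can't find atom for res 84 atom CEN..." \
  "placeholder line standing in for clustering output" > "$LOG"

# Count the warning lines (-F treats the pattern as a fixed string, not a regex).
n_warnings=$(grep -c -F '[ WARNING ]' "$LOG")

# Keep every line except the warnings.
grep -v -F '[ WARNING ]' "$LOG" > filtered.txt

echo "warnings: $n_warnings"
cat filtered.txt
```

This only tidies the log. It does nothing about whatever is producing the warnings in the first place.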

Sat, 2012-04-28 12:11

Thank you very much for your reply.
I tried adding -in:file:centroid but I got the following error message:
ERROR: Option matching -in:file:centroid not found in command line top-level context
and the command wasn't executed.
Maybe I am adding it to the command line incorrectly. I used the following command line:
cluster.linuxgccrelease -database /share/apps/rosetta/rosetta_database/ -in:file:centroid:s *.pdb -cluster:radius 3
Is this the correct syntax for this flag?
Thank you again,

Sat, 2012-04-28 13:48

A) It's not combined with s like "-in:file:centroid:s *.pdb", you'd use

-s *pdb

B) Give -in:file:centroid_input a shot if -in:file:centroid doesn't work.

Sat, 2012-04-28 19:05

Thank you again for your help.
Since -in:file:centroid always gives me
ERROR: Option matching -in:file:centroid not found in command line top-level context
I tried using -in:file:centroid_input in the following command line:
cluster.linuxgccrelease -database /share/apps/rosetta/rosetta_database/ -in:file:s *.pdb -in:file:centroid_input -cluster:radius 3
and I got the following errors:
ERROR: Illegal attempt to score with non-identical atom set between pose and etable
ERROR:: Exit from: src/core/scoring/etable/ line: 72
Do you have any more suggestions?
Thank you,

Sun, 2012-04-29 00:59

Try adding "-score:weights cen_std" to your command line to specify a centroid-specific score function to use. (Cluster is apparently defaulting to a fullatom score function.)
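
Combined with the flags from the earlier posts, the full command would look roughly like the sketch below. The database path and cluster radius are copied from the commands above, so adjust them to your installation. The output file name cluster_output.txt is an assumption. The wrapper runs the clusterer only if the binary is on the PATH and otherwise just prints the command.

```shell
# Flags collected from this thread; the database path is the one used above.
DB=/share/apps/rosetta/rosetta_database/

# set -- stores the command as a positional-argument list (portable to plain sh).
set -- cluster.linuxgccrelease \
  -database "$DB" \
  -in:file:s *.pdb \
  -in:file:centroid_input \
  -score:weights cen_std \
  -cluster:radius 3

# Keep a printable copy of the assembled command.
cmd="$*"

# Run only if the Rosetta binary is on PATH; otherwise just show the command.
if command -v "$1" >/dev/null 2>&1; then
  "$@" > cluster_output.txt   # hypothetical output file name
else
  echo "would run: $cmd"
fi
```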

Sun, 2012-04-29 10:52

Thank you very very much!!!
Runs like a charm!!!

Sun, 2012-04-29 11:40