Rosetta score vs Experiment

Hi all,

I understand that the best way to interpret Rosetta scores is "lower=better". However, many papers use Rosetta to design or enhance a protein and then experimentally verify its structure/stability in the lab. Since the "lower=better" metric is so general, how can one be confident that the Rosetta-predicted structure/stability will match experimental values?

====
Ok, after some thought and reading, I think I'm finally starting to see the idea behind "lower=better". Here's my interpretation: generally, we want to minimize the Rosetta score of a protein. If all goes well, the resulting structure is the most likely conformation that we will find in the lab, even though the experimental kcal/mol may not match the predicted REU. Is this the right idea? If not, please enlighten me.

Thank you very much for all your help.
(Sorry if my questions are a bit fundamental. My research group is just starting to use PyRosetta and I'm the only one spearheading this effort, so the RosettaCommons forums have been the only way for me to ask for help.)

-thorx020

Tue, 2012-09-04 12:38
thorx020

"even though the experimental kcal/mol may not match the predicted REU."

Unless you are doing certain very specific experiments, Rosetta is not attempting to predict kcal/mol. It's basically producing a list of what conformations it thinks are most likely (based on their score), and attaching the scores it is using to make those judgments. The mode that will attempt to produce direct measurements in kcal/mol is Liz Kellogg's DDG of mutation work. This is fairly limited in scope. There are folks in the community concerned with the 'thermodynamic correctness' of Rosetta with respect to the ensemble of structures it generates - I think that Colin Smith's Backrub code uses this and I know Oliver Lange's current project does - but this only ensures that Rosetta's results are correctly distributed relative to the ensemble, it won't link to experimental energy measurements.

Here's an example of using the lower=better metric and how it can't be (and doesn't need to be) interpreted as kcal/mol. I've worked on a project where part of the goal is ensuring that a certain protein-protein interface will not form, while other interfaces will form. As part of the calculations, I have Rosetta attempt to dock both the desired and undesired interfaces and calculate binding energies. The worst possible scores I've found for the undesired complexes were in the range of -12 REU of binding energy, compared to maybe -30 to -40 REU for the original complex and designed desired partners. I know that the best designs are those that will maximize the gap between the good complexes and the bad complexes, so those were the ones we picked, even though Rosetta was assigning nonzero binding energies to the undesired interfaces. Experimentally, it turns out that the -12 REU complex does not form (at the limit of measurement), whereas the others do bind. So, in reality, that undesired complex has a binding energy of effectively zero in kcal/mol. Rosetta stubbornly reports some negative energy because the scorefunction is parameterized on folded/collapsed proteins, and will find something to like even about interfaces that won't bind, because it can't conceive of a docking problem with no solutions. I can't equate -12 REU to 0 kcal/mol, because Rosetta is never even considering the 0 kcal/mol noninteracting state as a valid solution to the docking problem.
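
The "maximize the gap" selection described above can be sketched in a few lines of plain Python. This is only an illustration, not the protocol actually used in that project: the function name `specificity_gap`, the design names, and the scores for `design_B` are hypothetical, and the `design_A` numbers simply echo the approximate REU values quoted above. The scores are assumed to come from whatever docking/binding-energy calculation you already run.

```python
def specificity_gap(desired_scores, undesired_scores):
    """Return the REU gap between the weakest desired interface and the
    strongest undesired interface. Larger is better; lower REU = better binding."""
    weakest_desired = max(desired_scores)        # least negative desired score
    strongest_undesired = min(undesired_scores)  # most negative undesired score
    return strongest_undesired - weakest_desired


# Hypothetical binding scores in REU (not kcal/mol).
designs = {
    "design_A": {"desired": [-30.0, -40.0], "undesired": [-12.0]},
    "design_B": {"desired": [-25.0], "undesired": [-20.0]},
}

# Rank designs by gap, largest first. Only the relative ordering is used;
# the absolute REU values are never interpreted as physical energies.
ranked = sorted(
    designs,
    key=lambda name: specificity_gap(
        designs[name]["desired"], designs[name]["undesired"]
    ),
    reverse=True,
)

for name in ranked:
    gap = specificity_gap(designs[name]["desired"], designs[name]["undesired"])
    print(f"{name}: gap = {gap:.1f} REU")
```

Note that the undesired interface still gets a negative score here; the selection only relies on it being much worse than the desired ones.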

What are you trying to do that requires kcal/mol? I'm not convinced Rosetta is the right tool for the job.

Tue, 2012-09-04 13:01
smlewis

Thanks, that example was very helpful. So it does seem that for most apps, you'd want to minimize the REU and not necessarily look for some correlation between REU and physical units. We currently want to use Rosetta to predict the stabilities of small proteins after making mutations. The eventual goal is to find mutations that stabilize these proteins the most. Then in the near future, we'd like to predict mutations that can greatly enhance these proteins' binding affinity to various targets. Once we have more experience in Rosetta, we also want to be able to design novel protein structures based purely on residue sequences.

Tue, 2012-09-04 13:31
thorx020

With the narrow exception of predicting stabilities in kcal/mol, these are all things we do in Rosetta on a regular basis. You just need to score your starting structure (or at least your starting sequence subjected to the same modeling) with the same kind of protocol you use for your designs, so that you have a comparison energy for the original state. Then you'll know if Rosetta thinks a mutation is relatively stabilizing or destabilizing, and you'll be able to rank by magnitude.
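
As a rough sketch of that comparison (plain Python, no Rosetta calls): subtract the wild-type score from each mutant score and sort. The function name `rank_mutations`, the mutation labels, and all score values below are made up for illustration; the sketch assumes every score, wild type included, came out of the same modeling protocol.

```python
def rank_mutations(wild_type_score, mutant_scores):
    """Return (mutation, delta) pairs sorted from most stabilizing to most
    destabilizing, where delta = mutant score - wild-type score in REU."""
    deltas = {
        name: score - wild_type_score for name, score in mutant_scores.items()
    }
    return sorted(deltas.items(), key=lambda item: item[1])


# Hypothetical total scores in REU, all from the same (assumed) protocol.
wild_type = -250.0
mutants = {
    "A10V": -253.5,
    "G25P": -247.0,
    "L40F": -250.5,
}

for name, delta in rank_mutations(wild_type, mutants):
    label = "stabilizing" if delta < 0 else "destabilizing"
    print(f"{name}: {delta:+.1f} REU ({label} relative to wild type)")
```

The deltas are still in REU, so they give a ranking and a relative magnitude, not a predicted change in kcal/mol.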

What you are doing is within the range of what Liz's ddg stuff does (http://www.ncbi.nlm.nih.gov/pubmed/21287615) if you want kcal/mol comparisons.

Tue, 2012-09-04 13:57
smlewis