According to the Rosetta density tutorial:
"Generally, we select the best 20% of models by geometry, and selecting the best overall by free FSC. The top 5 models should be inspected for model convergence as well as visually inspected for density map agreement"
I would appreciate it if someone could explain why we choose the lowest FSC score, and what it means to have a negative FSC score. Also, according to the paper quoted below, shouldn't we select for the highest free FSC and not the lowest?
"this integrated FSC on an independent map (or free FSC) correlates with model accuracy quite well, particularly at high resolution. Furthermore, the real space correlation between models and the independent testing map over segments of the chain correlates with the local accuracies of models. In high-resolution maps, as the local correlation decreases, the fraction of incorrectly modeled residues increases".
You're correct that with FSC (being a correlation) higher scores are better. But note that the tutorial passage you quote never says to take the lowest FSC. It says "the best overall by free FSC", and "best" here means the models with a higher FSC.
A caveat, though, is that higher=better applies only to a true FSC. Since Rosetta tends to work in the more-negative=better regime of energies, these sorts of values are often manipulated to become more-negative=better. This might be what you're seeing with the negative FSC: it's not a true FSC but an FSC-based energy that Rosetta is calculating.
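To make the sign convention concrete, here is a minimal Python sketch of the selection the tutorial describes (best 20% by geometry, then best by free FSC, then inspect the top 5). It is not Rosetta code: the function name, the dictionary keys, and the flag are all made up for illustration, and I'm assuming the geometry score is one where lower is better. Which value to pass for the flag depends on whether your column is a true FSC or a negated, energy-like one, which you need to check for your own output.

```python
import math

def select_models(models, geometry_fraction=0.2, top_n=5, fsc_higher_is_better=True):
    """Pick candidate models for inspection.

    models: list of dicts with hypothetical keys 'name', 'geometry', 'fsc'.
    'geometry' is assumed to be lower = better.
    """
    # Step 1: keep the best fraction of models by geometry.
    keep = max(1, math.ceil(len(models) * geometry_fraction))
    by_geometry = sorted(models, key=lambda m: m["geometry"])[:keep]

    # Step 2: rank the survivors by free FSC.
    # True FSC: higher is better, so sort descending.
    # Negated, energy-like value: more negative is better, so sort ascending.
    ranked = sorted(by_geometry, key=lambda m: m["fsc"], reverse=fsc_higher_is_better)

    # Step 3: return the top few for convergence and visual inspection.
    return ranked[:top_n]
```

If the negative numbers really are just a negated FSC, then calling this with fsc_higher_is_better=False on those values picks the same models as calling it with fsc_higher_is_better=True on the true FSC, so "lowest" and "highest" end up meaning the same thing.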
It's often better not to rely on Rosetta's version of these sorts of values (it sometimes makes approximations in the calculations to speed up evaluation during long simulations), and instead to calculate them with the standard program one would typically use even in the absence of in silico predictions. This has the added benefit of acting as a check for potential bugs/pathologies in Rosetta's implementation.
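One quick way to do that cross-check, sketched in plain Python: take the value Rosetta reports for each model and the FSC you computed externally for the same models, and look at the rank correlation between the two lists. The function names here are my own, and the ranking does not handle ties (it breaks them by input order), so treat it as a rough sanity check rather than a proper statistic.

```python
def ranks(values):
    """Rank of each value (0 = smallest). Ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = float(rank)
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Made-up toy numbers, only to show how the check would be used.
rosetta_reported = [-0.61, -0.55, -0.70, -0.48]
external_fsc = [0.60, 0.56, 0.69, 0.50]
rho = spearman(rosetta_reported, external_fsc)
```

A value of rho close to -1 would suggest the Rosetta number is a sign-flipped FSC (more negative = higher correlation), close to +1 would suggest it uses the same convention as the true FSC, and anything near 0 would be a reason to look more closely at how either number was computed.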
I'm still confused. Can you confirm that I should select for the lowest FSC score, i.e. the most negative one, and that this value is calculated so that it actually reflects a high correlation with the test map, i.e. a high true FSC?
Originally I thought we select for the lowest FSC to rule out overfitting, but based on what you told me, we select for the lowest FSC because it reflects high correlation and therefore favors model accuracy, correct?