Structure refinement guided by Cryo-EM density maps

I have a few questions regarding this tutorial here:


1) What exactly does ReportFSC do? Does it generate a density map from the atomic model and correlate it with the test map?

2) I've been extracting two different repeating segments of the map to use as the refinement map and the test map. However, when I inspect the header of the PDB, REMARK 1 shows only one number, as shown below. When I use the exact same map as both the refinement map and the test map, I do get FSC scores for both maps, and they are identical (which makes sense; not shown below).

REMARK   1 FSC[mask=5.80704](10:4.3) = 0.525372   (my structure)

REMARK 1 FSC[mask=6.75237](10:3) = 0.250239 / 0.239448 (from the tutorial)

3) The tutorial says I should generate about 100 structures. However, I'm not sure what the purpose of that is or what the differences between the structures are. Is there a way to average these structures? Should I just pick the one with the highest score?

4) The tutorial suggests using parallel to utilize all my CPUs to generate those 100 structures. First, I want to verify that I understand the syntax correctly, because I'm not familiar with parallel. If the syntax parallel -j8 ./ {} ::: {1..100} does indeed use 8 cores to generate 100 structures by running the same script a hundred times (divided among 8 cores), then my question is: what changes between runs, and how would the same script output a different structure without any parameters changing?


Thu, 2018-03-08 08:22

1) Yes, ReportFSC reports the correlation between a simulated map and the testing map.

2) I'm not sure why you are getting different results. You could try just running the ReportFSC mover on the two maps of interest to get both numbers.

3) A number of steps in this refinement protocol are stochastic, including the repacking algorithm in relax, which uses a Monte Carlo sampling method, and the Cartesian sampler, which uses a random subset of fragments in its sampling. The end result is that all the models produced will be slightly different. Lower is better for Rosetta energy, so you would want to look at the lowest-energy models; you may also want to look at the models with the lowest density energy.

4) Your assumption about parallel syntax is correct. Differences in the models are for the reasons described in 3.
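To make the parallel behaviour concrete: in GNU parallel, {} is replaced by each argument listed after :::, and -j8 keeps at most 8 jobs running at a time. The same dispatch pattern can be sketched in Python as below. This is only an illustration; run_job, run_all and the stand-in command are hypothetical names, and the real refinement script from the tutorial would take the place of the demo command.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_job(index, command):
    """Run one copy of the command, passing the job index as its last argument."""
    result = subprocess.run(command + [str(index)], capture_output=True, text=True)
    return index, result.returncode, result.stdout.strip()


def run_all(command, n_jobs=100, workers=8):
    """Run n_jobs copies of the command, with at most `workers` running at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in submission order (1, 2, 3, ...),
        # even though the jobs themselves may finish in any order.
        return list(pool.map(lambda i: run_job(i, command), range(1, n_jobs + 1)))


if __name__ == "__main__":
    # Stand-in command (an assumption for this sketch): it only echoes its job index.
    demo = [sys.executable, "-c", "import sys; print(sys.argv[1])"]
    for index, code, out in run_all(demo, n_jobs=4, workers=2):
        print(index, code, out)
```

Nothing in the command changes between jobs except the index; any difference between the resulting models comes from the random sampling described in 3.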

Thu, 2018-03-08 09:43

Thanks. So to confirm, the number that gets appended at the end of each structure is random? Meaning, structure 001 is not the lowest energy structure already? If I write a script to read the energies from the score files, sort them and output the lowest one, would that be sufficient to get a good structure?  Is there an alternative way to sample the structures? 



Thu, 2018-03-08 10:25

Normally the number appended just reflects the order in which the jobs completed. I can't remember if this protocol does anything different, but you can check the score files. We typically do something like take the best 20 models by Rosetta energy and then take the best of that set by density score (again, lower is better). Be sure to look at a handful of the best models to check for convergence and to ensure Rosetta is producing reasonable results.

Thu, 2018-03-08 11:06

Thanks. I will try that. I do have one more question. We use two maps for the refinement, and in the tutorial the "names" of the maps used were half1 and half2, suggesting that they are half maps. My question is: do we have to use half maps, or can we use a combined (post-processed) map for the refinement and a half map for the test map?

I also wonder how I can change the names of the outputted files, particularly the score files. 

Wed, 2018-04-11 08:14

You want to use one half for refinement and the other half for validation. I'm not sure what you mean by changing the names of the files. Do you mean telling Rosetta to output a different file name for the score files? I'm not sure if that is possible. You can use the flag -out::suffix to add a suffix to your output PDB files. I'm not sure if that affects score files.

Thu, 2018-04-12 11:51

Thanks. May I ask why we use the two half maps, instead of a combined map and a half map?

-out::suffix doesn't affect the score file names; I tried it. If I'm running a batch process, the different structures will have different names according to -out::suffix, but the score information is appended to the score files already written by the previous run. Is there a way to change the output directory of a particular run?

Speaking of scores, you mentioned taking the best 20 structures based on the Rosetta energy. However, there are many energy terms in the score file, so which energy should I rank by? Also, to confirm, the density score is the FSC score, correct?


Fri, 2018-04-13 09:15

Also, if I were to use the half map for the refinement, I should supply the half map's resolution to the script, correct?

Tue, 2018-05-01 09:04