I'm kind of a newbie to Rosetta. My protein has two structured domains connected by a short, non-floppy loop (6 residues). The structure of one domain can be solved accurately by comparative modelling. The other has no related templates available, but I do have co-evolutionary information that should let me obtain a good structure via ab initio modelling. I am looking for the best way to combine these in a hybrid approach to solve the structure of the whole protein. I am considering these options:
1. Model the two domains separately, then dock them and build the short loop.
2. Model the first domain with the comparative method. Perform "ab initio" modeling of the whole sequence, but use the structure of the first domain as a constraint (is that possible?). This way seems more interesting because the presence of one domain would affect the folding of the other...
I believe this is a common situation... can anyone help me figure out how this is usually done?
I really appreciate any comment on that!
I'm not sure if there's a "standard" way of doing this.
I'm guessing that the correct approach to take depends highly on how tightly you think the two domains bind to each other, how independently they fold, and how important the structure of the linker is for your downstream applications.
If you think the two domains are relatively tightly bound, fold independently, and the linker structure isn't too important, I'd probably go with the docking approach. The independent folding of the domains means that you're likely to get an accurate structure of the unknown domain. The strong inter-domain interaction should provide a clear signal for the relative orientation of the two domains, and you don't necessarily have to be too exhaustive about the linker modeling.
On the other hand, if you think the domain without a template is highly dependent on the known domain for structural organization (for example, the domain is unstructured on its own and only folds in the presence of its partner), then the partial ab initio protocol, where you fold the protein in the presence of the known domain, is the way to go, as you might not get a decent monomer structure to take into the docking step. Then again, strong enough coevolutionary information might be enough to define the structure of the template-free domain, even in the absence of its binding partner.
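As a rough illustration of how coevolutionary information might be fed into a run, here is a minimal sketch that turns a list of predicted contacts into AtomPair lines for a Rosetta constraint file. The helper name, the probability and sequence-separation cutoffs, and the choice of a FLAT_HARMONIC function with no penalty below about 8 Å are all my assumptions, not something established in this thread, so check them against the constraint file documentation before using them.

```python
# Hypothetical helper (not part of Rosetta): convert predicted contacts
# into AtomPair constraint lines. Cutoffs and function parameters are
# illustrative assumptions.

def contacts_to_constraints(sequence, contacts, min_prob=0.7, min_seq_sep=6,
                            x0=4.0, sd=1.0, tol=4.0):
    """sequence: one-letter amino acid string (residue 1 is sequence[0]).
    contacts: iterable of (i, j, probability) with 1-based residue numbers.
    Returns a list of constraint lines, one per accepted contact."""
    lines = []
    for i, j, prob in contacts:
        # Skip weak predictions and pairs that are close in sequence.
        if prob < min_prob or abs(i - j) < min_seq_sep:
            continue
        # Glycine has no CB atom, so fall back to CA for it.
        atom_i = "CA" if sequence[i - 1] == "G" else "CB"
        atom_j = "CA" if sequence[j - 1] == "G" else "CB"
        # FLAT_HARMONIC x0 sd tol: flat (zero) between x0 - tol and x0 + tol,
        # i.e. 0 to 8 Angstrom with the defaults above.
        lines.append(
            f"AtomPair {atom_i} {i} {atom_j} {j} "
            f"FLAT_HARMONIC {x0:.1f} {sd:.1f} {tol:.1f}"
        )
    return lines


if __name__ == "__main__":
    seq = "AGAAAAAAAAAA"  # made-up sequence, for illustration only
    predicted = [(1, 10, 0.9), (2, 12, 0.8), (3, 5, 0.95), (4, 11, 0.5)]
    for line in contacts_to_constraints(seq, predicted):
        print(line)
```

The resulting lines could be written to a file and supplied to whichever protocol you end up running; how strongly to weight them relative to the rest of the score function is something you would have to tune.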
I'd also take a moment to consider the role of the linker between the two domains. If this is a rather large, unstructured linker, it might be difficult to do a partial ab initio model with it, as it will add a lot of degrees of freedom to the system, making it hard for the protocol to accurately locate the relative orientations of the structures; you may be better off with the separate modeling plus docking approach. In contrast, a short linker, or one with a relatively confined orientation, means that the two domains will be well placed with respect to each other, making it easier for the partial ab initio protocol to make a combined structure. (Though you can also use such information to better constrain the docking protocol in the separate-modeling approach.)
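To make the point about the linker constraining docking concrete, here is a small sketch that checks whether a docked pose leaves the two domain termini close enough for a short linker to bridge them. It assumes about 3.8 Å between consecutive CA atoms (an approximate value for trans peptide bonds), and the function names and example coordinates are hypothetical.

```python
import math

# Approximate CA-CA distance between consecutive residues, in Angstrom.
CA_CA_SPACING = 3.8


def max_linker_span(n_linker_residues, spacing=CA_CA_SPACING):
    """Upper bound on the CA-CA distance between the two anchor residues.
    n linker residues between the anchors give n + 1 CA-CA steps."""
    return (n_linker_residues + 1) * spacing


def linker_can_close(anchor_a, anchor_b, n_linker_residues):
    """anchor_a: CA coordinates of the last residue of the first domain.
    anchor_b: CA coordinates of the first residue of the second domain.
    Returns True if a fully extended linker could span the gap."""
    return math.dist(anchor_a, anchor_b) <= max_linker_span(n_linker_residues)


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    print(max_linker_span(6))
    print(linker_can_close((0.0, 0.0, 0.0), (20.0, 0.0, 0.0), 6))
    print(linker_can_close((0.0, 0.0, 0.0), (30.0, 0.0, 0.0), 6))
```

This is only a necessary condition: a real linker is rarely fully extended, so a tighter cutoff may be more realistic. The same upper bound could also be expressed as a distance constraint during docking instead of as a filter afterwards.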
To be honest, I might suggest trying both approaches. You can do a run of "standard" ab initio on the non-templated domain (as it's more straightforward) and see if that gives you a well-defined structure. If not, or if you have trouble docking it to the homology-modeled domain, you can try a partial ab initio protocol and see if that helps.