In Rosetta there are two ways to do antibody-antigen docking: (1) run the snugdock.linuxgcc command locally; (2) use the ROSIE server. Both of these ways require a native antibody-antigen complex structure, and we need to align the antibody and antigen structures to the native complex structure to get a starting structure for docking. However, if we don't have a native antibody-antigen complex structure but know the possible interaction site of the antibody on the antigen, what is the general protocol to predict the antibody-antigen complex structure?
Thank you and best regards.
"Both of these ways require a native antibody-antigen complex structure"
Both of these ACCEPT a native structure for benchmarking purposes, so that you can see whether the protocol is capable of getting the right answer (and under what conditions). Neither REQUIRES a native structure.
I think you encode the "possible interacting site" via constraints. There's probably a fancy way to do it somewhere in the documentation but I've done it with an AmbiguousConstraint wrapping the possible CDR-epitope pairs.
Thank you. But could you tell me where I can find a tutorial or example showing how to wrap constraints in an AmbiguousConstraint?
I can only find examples of the format for AtomPair, Angle and SiteConstraint in the Rosetta documentation, but I can't find any example for AmbiguousConstraint.
I know the antibody interacts with a specific domain (about 100 residues) of the antigen. What is the correct syntax to define the constraint?
See my comment below. I have added code that sets up all the constraints automatically, which can help. Otherwise, the documentation covers ambiguous site constraints.
See https://www.rosettacommons.org/docs/latest/rosetta_basics/file_types/constraint-file#constraint-types_nested-constraints for an example of an AmbiguousConstraint. (Note that where it says "Constraint_Type1 Constraint_Def1" etc., that's referring to any of the other constraints which are documented on that page.)
I still don't get it.
If I want to constrain residues 50-100 in chain A of the antigen as the potential docking site, I should express this requirement as an ambiguous constraint in the cst file, am I correct? But what is the format for writing this ambiguous constraint in the cst file?
By the way, in the example flag file in the SnugDock documentation (https://www.rosettacommons.org/docs/latest/application_documentation/antibody/snugdock), there is a "low resolution constraint" file (cst_file kink.cst) and a "high resolution constraint" file (cst_fa_file kink.cst). What is the difference between these two files?
Here's a brief sample AmbiguousConstraint: https://www.rosettacommons.org/content/using-degenerate-protons-rosetta3x#comment-2985
"Low resolution" in Rosetta usually (but not always) means centroid mode, whereas "high resolution" typically means full atom. In centroid mode sidechains are represented as a single "superatom", rather than as individual atoms.
This is important for constraints, as if you have constraints which are based around the sidechain atoms, they won't work in centroid mode, as the atoms aren't present. Often this means you need two constraint files, one with the fullatom constraints you use during fullatom, and another with centroid constraints, where you approximate the fullatom constraints by looser constraints to the atoms which are present. (Normally CA-CA constraints for sidechain-sidechain constraints.)
The other reason you may want different constraints during the two stages is due to the different energy functions used, and the different scale of sampling. For example, you might want to have more constraints on during the low resolution mode, as the centroid scorefunction is less exact, and the sampling is greater. You then might decide to turn down/off the constraints in fullatom mode, and rely more on the higher resolution scorefunction to get the interactions correct.
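To make the "approximate the fullatom constraints by looser CA-CA constraints" idea concrete, here is a minimal Python sketch that derives a centroid-mode line from a fullatom AtomPair HARMONIC line. The function name, the set of atoms treated as backbone, and the loosening amounts (2 Å added to the distance, standard deviation doubled) are my own assumptions for illustration, not Rosetta defaults; pick values that suit your system.

```python
# Atoms assumed to exist in both fullatom and centroid representations
# (illustrative assumption; adjust if you constrain other atoms).
BACKBONE = {"N", "CA", "C", "O"}


def to_centroid_atompair(line, pad=2.0, sd_scale=2.0):
    """Approximate a fullatom 'AtomPair ... HARMONIC x0 sd' line for centroid mode.

    Sidechain atoms are swapped for CA, and the restraint is loosened by
    adding `pad` to x0 and multiplying sd by `sd_scale` (hypothetical values).
    Lines that are not AtomPair/HARMONIC, or that already use only backbone
    atoms, are returned unchanged.
    """
    fields = line.split()
    if len(fields) != 8 or fields[0] != "AtomPair" or fields[5] != "HARMONIC":
        return line
    atom1, res1, atom2, res2 = fields[1:5]
    x0, sd = float(fields[6]), float(fields[7])
    if atom1 in BACKBONE and atom2 in BACKBONE:
        return line
    new1 = atom1 if atom1 in BACKBONE else "CA"
    new2 = atom2 if atom2 in BACKBONE else "CA"
    return (
        f"AtomPair {new1} {res1} {new2} {res2} "
        f"HARMONIC {x0 + pad:.1f} {sd * sd_scale:.1f}"
    )


# Hypothetical fullatom constraint between two sidechain atoms.
fullatom_line = "AtomPair NZ 52H OE1 45A HARMONIC 3.0 0.5"
print(to_centroid_atompair(fullatom_line))
```

You would then pass the original file to the fullatom stage and the converted one to the centroid stage.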
Adding to smlewis's response above, I want to make it clear that SnugDock is a local docking method, so the input structure should have the antigen epitope region positioned approximately at the antibody CDRs. There are several ways to generate the input structure. If you have no prior knowledge about the epitope, then you can use a global docking approach (I prefer ClusPro with antibody mode turned on in the advanced options) to give you an initial structure that can be refined with SnugDock. If you have some knowledge of the epitope, you can either manually combine the antibody and antigen into a single PDB or align them to a homologous structure.
Do you have the structure of the antibody?
Personally, I would first get the antibody structure using ROSIE.
Second, make a RosettaScript with Docking and add my ParatopeEpitopeSiteConstraints to it (they are used for Antibody Design). These SiteConstraints will keep the low/high res docking at the interaction sites. This is much easier than making SiteConstraints by hand, but they essentially do the same thing. For the SiteConstraints to work, you will need to renumber your antibody. If you use RosettaAntibody on ROSIE first, it will be in the Chothia scheme. Otherwise, you can use PyIgClassify to renumber it to AHo. (Use the options -input_ab_scheme and -output_ab_scheme to specify which numbering scheme you are using.)
Finally, I would use SnugDock to further resolve the local interactions. SnugDock is not currently available in RosettaScripts, so you would need to do this as a two-stage protocol. SnugDock obeys the input and output ab scheme options.
Hello smlewis, rmoretti and jadolfbr,
Thank you for your help. After reading your posts, I am still not sure whether I have understood how to use constraints in my situation. I now have the antibody structures, and I know the antigen epitope might lie within residues 40-120 in chain A of the antigen, so I want to constrain these residues in SnugDock. How should I write the cst file? I guess I should apply a SiteConstraint to the CA atom of each of residues 40-120 in chain A of the antigen. Since I have no idea whether chain H or L would dock onto these antigen residues, I wrote a SiteConstraint from each of these residues to both the H and L antibody chains, as follows, in the .cst file:
SiteConstraint CA 40A L SIGMOID 5.0 2.0
SiteConstraint CA 41A L SIGMOID 5.0 2.0
SiteConstraint CA 42A L SIGMOID 5.0 2.0
SiteConstraint CA 43A L SIGMOID 5.0 2.0
SiteConstraint CA 44A L SIGMOID 5.0 2.0
SiteConstraint CA 45A L SIGMOID 5.0 2.0
SiteConstraint CA 46A L SIGMOID 5.0 2.0
SiteConstraint CA 47A L SIGMOID 5.0 2.0
SiteConstraint CA 48A L SIGMOID 5.0 2.0
SiteConstraint CA 49A L SIGMOID 5.0 2.0
SiteConstraint CA 50A L SIGMOID 5.0 2.0
SiteConstraint CA 51A L SIGMOID 5.0 2.0
SiteConstraint CA 52A L SIGMOID 5.0 2.0
SiteConstraint CA 53A L SIGMOID 5.0 2.0
SiteConstraint CA 54A L SIGMOID 5.0 2.0
SiteConstraint CA 55A L SIGMOID 5.0 2.0
SiteConstraint CA 56A L SIGMOID 5.0 2.0
......... ........... ........... ..............
SiteConstraint CA 117A L SIGMOID 5.0 2.0
SiteConstraint CA 118A L SIGMOID 5.0 2.0
SiteConstraint CA 119A L SIGMOID 5.0 2.0
SiteConstraint CA 120A L SIGMOID 5.0 2.0
SiteConstraint CA 40A H SIGMOID 5.0 2.0
SiteConstraint CA 41A H SIGMOID 5.0 2.0
SiteConstraint CA 42A H SIGMOID 5.0 2.0
SiteConstraint CA 43A H SIGMOID 5.0 2.0
SiteConstraint CA 44A H SIGMOID 5.0 2.0
SiteConstraint CA 45A H SIGMOID 5.0 2.0
SiteConstraint CA 46A H SIGMOID 5.0 2.0
SiteConstraint CA 47A H SIGMOID 5.0 2.0
SiteConstraint CA 48A H SIGMOID 5.0 2.0
SiteConstraint CA 49A H SIGMOID 5.0 2.0
......... .................. ............
SiteConstraint CA 117A H SIGMOID 5.0 2.0
SiteConstraint CA 118A H SIGMOID 5.0 2.0
SiteConstraint CA 119A H SIGMOID 5.0 2.0
SiteConstraint CA 120A H SIGMOID 5.0 2.0
I set all the constraint distances to 5 Å, but obviously it is impossible for all these antigen residues to make contact with both the antibody H and L chains. Could you tell me whether this setting is correct for my purpose? Or what is the correct setting?
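For a range this long, the lines can be generated rather than typed by hand. The following is a minimal Python sketch that reproduces the same SiteConstraint lines for residues 40-120 of chain A against chains L and H; the function name and its parameters are hypothetical helpers introduced here, and the SIGMOID values are simply the ones used in the post above.

```python
def site_constraints(first, last, antigen_chain, antibody_chains, x0=5.0, m=2.0):
    """Return one SiteConstraint line per antigen residue per antibody chain.

    Each line has the form used in the post:
    SiteConstraint CA <resnum><antigen_chain> <antibody_chain> SIGMOID <x0> <m>
    """
    lines = []
    for ab_chain in antibody_chains:
        for resnum in range(first, last + 1):  # inclusive range
            lines.append(
                f"SiteConstraint CA {resnum}{antigen_chain} {ab_chain} "
                f"SIGMOID {x0} {m}"
            )
    return lines


# Residues 40-120 of antigen chain A, against light chain first, then heavy.
lines = site_constraints(40, 120, "A", ["L", "H"])
print(len(lines))   # number of constraint lines generated
print(lines[0])
print(lines[-1])
```

Writing the result to a file is then a one-liner, for example `open("epitope.cst", "w").write("\n".join(lines) + "\n")`, where the file name is only a placeholder.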
I believe your constraint file is written properly for your statement of your data.
Your data are pretty vague - "I think a residue in the range 40-120 makes contact" - so this constraint will not be very strong. It will evaluate all 162 constraints (81 residues against each of the two chains) and only apply the best satisfied. You might get better performance here out of KofN constraints (see the same documentation page) - it evaluates the lowest K of N constraints; Ambiguous is effectively 1 of N in the same terms.
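As a rough illustration of the difference, here is a Python sketch that wraps a list of constraint lines in either a KofNConstraint or an AmbiguousConstraint block. The layout (a header line, the member constraints, then END) follows the nested-constraint section of the constraint-file documentation as I understand it, so please check it against that page before relying on it; the helper names and the choice of K are hypothetical.

```python
def kofn_block(k, constraint_lines):
    """Wrap constraint lines so only the K best-satisfied ones are scored."""
    if not 1 <= k <= len(constraint_lines):
        raise ValueError("k must be between 1 and the number of constraints")
    return [f"KofNConstraint {k}", *constraint_lines, "END"]


def ambiguous_block(constraint_lines):
    """Wrap constraint lines so only the single best-satisfied one is scored."""
    return ["AmbiguousConstraint", *constraint_lines, "END"]


# A short stand-in for the full 40-120 list, to keep the example readable.
site_lines = [
    f"SiteConstraint CA {resnum}A H SIGMOID 5.0 2.0" for resnum in (40, 41, 42)
]

# Ask for the 2 best-satisfied of the 3 constraints (K chosen arbitrarily).
print("\n".join(kofn_block(2, site_lines)))
print()
print("\n".join(ambiguous_block(site_lines)))
```

A larger K expresses "several residues in this range should be in contact", which is a stronger statement than the 1-of-N that Ambiguous gives you.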