I have some questions about using the enzyme design application for my system.
1. I have not used RosettaMatch, but I have a rough structure of the enzyme in complex with the ligand. Is it fine to use that, or do I need to run RosettaMatch?
2. To generate the cst file for the enzyme design application, I have to use the secondary match algorithm, since I do not have the angle and dihedral information. I read the Match documentation, but it is not clear to me when to use UPSTREAM and when to use DOWNSTREAM, or what the number written after UPSTREAM means. I know UPSTREAM relates to protein residues and DOWNSTREAM to the ligand. Could you clarify this with an example, or point me to documentation that has one?
3. In my system, besides the enzyme (chain A) and the ligand (chain X), there is another protein (chain D) bound to the ligand. In other words, my substrate is composed of chain D and chain X, but only chain X sits in the binding pocket. I am thinking of defining the covalent bond between chain D and chain X as one of the constraints in the cst file, even though it does not involve a catalytic residue. Is it correct to do that?
4. In this system, how can I make sure the binding energy is computed only between my enzyme (chain A) and my ligand (chain X)? I do not want chain D to be included in the binding energy. The enzyme design application documentation says: "If there are multiple protein chains, these figures may not accurately represent the total figures for binding."
RosettaMatch is only used to find the templates/positions of the catalytic residues on backbones which may support the desired geometry. If you already have a template and residue positions, there is no reason to use RosettaMatch. You may need to do some manual edits to the input PDB to add the REMARK lines which Match would have added for you, but that should hopefully be straightforward.
As you mentioned, DOWNSTREAM is the ligand (or toward the ligand). UPSTREAM is away from the ligand. For the Secondary matcher, you can write constraints against previously built protein residues in addition to the ligand residue. (Primary matches are always against the ligand residue.) If your secondary constraint is building a new protein residue against the ligand, it would be `SECONDARY_MATCH: DOWNSTREAM`. If your secondary constraint is between two protein residues, you have to specify which previously built protein residue is one partner in the constraint. Thus you'd use `SECONDARY_MATCH: UPSTREAM_CST N`, where N is the number of the constraint block which builds the previously built protein residue. For example, if you build a SER off of the ligand in the first constraint block, and then build a HIS off of that SER in another constraint block, you'd use `SECONDARY_MATCH: UPSTREAM_CST 1` to specify that the already existing partner (the SER) was the one built in the first constraint block. What documentation we have is at https://www.rosettacommons.org/docs/latest/rosetta_basics/file_types/match-cstfile-format
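To make the rule concrete, here is a small Python sketch of the decision. It is not part of Rosetta: the function names `secondary_match_line` and `algorithm_info_block` are made up for illustration, and the `ALGORITHM_INFO:: match` wrapper is written as I understand it from the cst file documentation linked above, so check it against that page before relying on it.

```python
from typing import Optional


def secondary_match_line(partner_block: Optional[int] = None) -> str:
    """Return the SECONDARY_MATCH line for one constraint block.

    partner_block=None -> the new residue is built against the ligand.
    partner_block=n    -> the new residue is built against the protein
                          residue that constraint block n built earlier.
    """
    if partner_block is None:
        return "SECONDARY_MATCH: DOWNSTREAM"
    if partner_block < 1:
        # The SER/HIS example refers to the first block as block 1.
        raise ValueError("constraint blocks are numbered starting from 1")
    return f"SECONDARY_MATCH: UPSTREAM_CST {partner_block}"


def algorithm_info_block(partner_block: Optional[int] = None) -> str:
    """Wrap the line in an ALGORITHM_INFO section (wrapper syntax is an
    assumption taken from the cst file docs; verify before use)."""
    return "\n".join([
        "ALGORITHM_INFO:: match",
        "  " + secondary_match_line(partner_block),
        "ALGORITHM_INFO::END",
    ])


# HIS block of the SER/HIS example: the HIS is built off the SER from block 1.
print(algorithm_info_block(partner_block=1))
```

For the HIS block this prints an `ALGORITHM_INFO` section containing `SECONDARY_MATCH: UPSTREAM_CST 1`. Calling it with no argument gives the `SECONDARY_MATCH: DOWNSTREAM` form used when the new residue is built against the ligand.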
There shouldn't be an issue with multiple chains. Also, "catalytic" in this context is mostly convention. There really isn't any biological role that the residues listed in the constraints need to fulfill. The main reason to list residues in the constraint file is that there's some geometric relationship you want to preserve between the ligand and those residues (or between those residues and other protein residues already listed in the constraint file).
Regarding binding energy, the binding energy is really only used as a final reporting metric. The actual optimization is always against the full complex energy. (Which is probably what you want: if you just optimize binding energy, you tend to end up with conformations with odd internal clashes.) The default binding energy is going to be the ligand against all of the protein residues. To look at just the binding energy involving chain A, you may want to use something like the InterfaceAnalyzer (https://www.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/Movers/movers_pages/analysis/InterfaceAnalyzerMover) to look at the A versus X-D interface. If I'm guessing correctly, that's going to be reasonably representative of the actual binding event of interest. Even if it's not, you could also do separate runs with the X ligand removed, and then compare the A to X-D energy with the A to D energy (no X in the system) to get a better sense of how much X contributes to the interface.
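The comparison in that last step is just a difference of two interface energies. A minimal Python sketch, with a made-up function name and invented numbers (they are not results from any real run; substitute the interface energies your own analysis reports):

```python
def ligand_contribution(e_a_vs_xd: float, e_a_vs_d: float) -> float:
    """Interface energy of A versus X-D minus interface energy of A versus D.

    Both inputs are interface energies in the same units, taken from two
    separate analyses: one with X present and one with X removed.
    A negative result means the interface is more favorable with X present.
    """
    return e_a_vs_xd - e_a_vs_d


# Hypothetical values for illustration only.
with_x = -42.0      # A versus X-D
without_x = -30.0   # A versus D, X removed from the system
print(ligand_contribution(with_x, without_x))  # -12.0
```

Treat the difference as a rough guide only, since removing X can also change how A and D pack against each other.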