You are here

covalent ligand docking in Rosetta

4 posts / 0 new
Last post
covalent ligand docking in Rosetta


I am modeling an enzyme and a substrate to study the enzyme specificity. From literature, I learned that the specificity is determined by the second step of the mechanism in which the ligand is covalently bound to D281 of an enzyme (see the attached picture showing the mechanism). Is Rosetta able to do covalent ligand docking? If so, how can I parameterize the ligand (which is covalently bound to the protein)? If not,  can I do covalent docking with other software(??) and then score it with Rosetta?

Any suggestions/papers in this regard is really appreciated.


mechanism.png87.92 KB
Post Situation: 
Mon, 2021-06-07 06:39

Yes. I did it en mass and it is doable. However, I do not recall how one makes a covalent with the official topology generation script (, so cannot give advice there.
I got fustrated by, so I wrote my own — rdkit_to_params module for Python3 (pip released, Rdkit-to-params repo). I also made a web interface, which is rather rudimentary (one weekend I'll fix it). With this, if one gives it a SMILES with a star/dummy atom/R-atom, e.g. `*C(=O)CCC` for butanoyl it will make a covalent params file. It also allows the atom names to be assigned or copied from a partial match, so one can be consistent or do a ligand switcheroo. I have another module, Fragmenstein, which can be used for a ligand switcheroo (example), but if it is a one off, it's easier to remove an atom or two in PyMol.

There are three types of connections in a topology file in Rosetta. UPPER, LOWER and CONN. The first two are for polymers and the last for ligands/sidechains. So you strictly have to use CONN1 here.

For a PDB file with a covalent ligand the important thing is that there is a LINK record. PyMol strips these, but are easy to write manually.

Rosetta after subversion 10 IIRC has been good with crosslinks to aspartate —for earlier versions you need to have a special ASX residue.

One thing to mind is the fact that the x-link bond topology may be off, FastRelax with cartesian settings deals somewhat with it, but having assigned constraints is better, especially if your bond is planar as a CONNECT entry is a single bond with no resonance —not sure yours is.
The bridging oxygen, I believe needs to be part of the ligand (`*OC(=O)CCCCCC`). Note that the OG1 oxygen will become OG.

Do note the figure is not 100% correct as it depects technically a radical reaction as opposed to a bog-standard hydrolysis:

  • the water needs to be deprotonated to be a nucleophile
  • the arrow is a radical arrow
  • the water attacks the molecule via the carbon, not oxygen —it may attack either carboxy, but the terminal aspartate looks more stabilised.

If it were a radical reaction you would need HOOH or O=O and not standard water.
So you actually have two steps, one is the stable crosslinked intermediate and the other is the hydrolysis transition step after the addition of the water nucleophile and before the elimination of one of the carboxys —this is chiral, but the hemiacetal can be modelled.

Tue, 2021-06-08 04:35


Thank you for your response.

I am not completely sure, but I believe the second step is hydrolysis and the water molecule should be oriented correctly (by hydrogen bonding with His) to attack the anhydride intermediate from nucleophilic attack from D281 in the first step based on the mechanism of similar enzymes.

I have some questions:

1. you mentioned that "So you actually have two steps, one is the stable crosslinked intermediate and the other is the hydrolysis transition step after the addition of the water nucleophile and before the elimination of one of the carboxys —this is chiral, but the hemiacetal can be modelled."

I hypothesize it is just sufficient to model the crosslinked intermediate for studying specificity, and modeling the hydrolysis transition might not be necessary even though I am not sure. Do you have any suggestion about this?

Also, I think the crosslinked intermediate (anhydride) is chiral, but the hydrolysis transition step after the addition of water nucleophicl is achiral. Is it right?

2. I ran rdkit_to_params  for `*OC(=O)CCCCCC` and obtained the parameter file attached to this forum. How I should modify this parameter file to ensure the O1 of the ligand (OG1 of D281 of enzyme) is connected to CG of D281 of the enzyme. I see there is CONN1 in the parameter file. Do I need to change that? Moreover, I am confused how the OG1 of the enzyme can be used as an atom for the ligand.

3. My plan is to get the binding score (interface score) between the ligand and different mutants of the enzyme. So I need to run ligand docking or rosetta enzyme design application. For that, is it necessary to define the link record in PDB, or defining the covalent bond as a constraint with AddorRemoveMatchCsts would be enough? any suggestion for this work is really appreciated.


File attachments: 
Wed, 2021-06-09 10:17

Phew! The figure is misleading then. There are cases where molecular oxygen breaks up a molecule by a radical SAM enzyme, such as a hydrocarbon chain by BioI in B. subtilis and it is very nasty/ugly stuff. Your step is a classic addition/elimination affair.

1. If you are altering the specificity without changing the mechanism, e.g. different length substrate, that is totally fine —and the data will make more sense. If you are changing the reaction from hydrolysis to transacetylation or improving k_cat as opposed to k_on/k_off, then you may need to consider it.

Chirality. An anhydride is near planar and has no hydrogens.

2. The caproyl is a Ligand and not a polymer, so CONN1 is correct. I honestly have not delt with a crosslinked aspartate where the linking atom is an oxygen, it generally is a nitrogen and there there is no OG2. The crosslinking patch of ASP in the database might support OG1, but it would strike me as odd.
So removing the line in the PDB manually is fine.
The LINK record I mentioned is a PDB format feature you add to the top of your PDB file, which automatically gets parsed by the PDB reading mover.

LINK         CG ASP A 281                 O1 LIG B   1     1555   1555  1.56

This adds a connection type bond with is different that a constraint as it does not result in a horrendous LJ term.
The constraint file would be something like

AtomPair O1 281A CG 1B HARMONIC 1.35 0.2
Angle C1 281A O1 281A CG 1B HARMONIC 2.04 0.35
Angle O1 281A CG 1B  OG1 1B HARMONIC 2.27 0.35
Dihedral C1 281A O1 281A CG 1B OG1 1B CIRCULARHARMONIC 0.00 0.35
Thu, 2021-06-10 10:51