# Using D-amino acids in Rosetta docking

Hi,

I have tried to use the docking protocol of Rosetta 3.2.1 to dock a well-structured peptide to a protein receptor. The peptide contains D-amino acids (D-Pro, D-Arg). The corresponding "params" files are located in "residue_types/d-caa".

First I changed the related lines in "residue_types.txt" to activate the params files. The residue types seem to be recognized by Rosetta, but it fails with the following error message:

ERROR: unknown atom_name: DPR H
ERROR:: Exit from: src/core/chemical/ResidueType.cc line: 1454

The input PDB file contains no hydrogen atoms. Which file could be wrong, or has to be changed, in the Rosetta database?
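For reference, the edit to "residue_types.txt" described above can be sketched as a small script. This is only an illustration: it assumes the D-amino-acid entries are present in the file but commented out with a leading "#", and the file names DPRO.params and DARG.params and the example listing are assumptions, not copied from the actual database.

```python
def activate_params(listing: str, names: set[str]) -> str:
    """Uncomment the lines of a residue_types.txt-style listing that name the given params files."""
    out = []
    for line in listing.splitlines():
        stripped = line.lstrip("#").strip()
        # Only touch commented-out lines whose file name is one we asked for.
        if line.startswith("#") and stripped.rsplit("/", 1)[-1] in names:
            out.append(stripped)
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical excerpt; the real file layout may differ.
example = """\
residue_types/l-caa/PRO.params
#residue_types/d-caa/DPRO.params
#residue_types/d-caa/DARG.params
#residue_types/d-caa/DALA.params"""

print(activate_params(example, {"DPRO.params", "DARG.params"}))
```

Only the two requested entries are uncommented; every other line is passed through unchanged.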

To check whether D-Arg would be accepted, I removed the D-Pro residue from the peptide sequence.

The program ends with a segmentation fault. In the log file, the last statement was

This file is lacking in the rosetta_database. The link to get the ncaa_rotamer_libraries in the README does not work anymore.

Can anyone help me to fix these problems?

Post Situation:
Wed, 2011-06-01 00:49
moehle

The unknown atom name error is not occurring when Rosetta is reading in your PDB file; it is occurring when other code in the program is trying to access the H atom on that residue. Proline doesn't have an amide hydrogen, so trying to look it up is a pretty legitimate failure.

I think in both cases these failures are likely caused somewhere in the scorefunction. The standard scorefunctions make many assumptions about having normal protein; the scorefunctions used for noncanonicals (and Ds) do not. I've emailed the noncanonicals guy for advice on which scorefunction to use.

Wed, 2011-06-01 09:13
smlewis

The NCAA and D-CAA code was added by me. Hopefully I will be able to answer all your questions.

• D-Proline Problem:
The problem with the D-proline occurs when the base residue type parameter files are patched to create the different variants, such as the N-terminus and C-terminus. For example, when the N-term patch (rosetta_database/chemical/residue_type_sets/fa_standard/patches/NtermProteinFull.txt) is applied to the set of residue types, it removes the hydrogen (H) attached to the backbone nitrogen (N), then adds three hydrogens (1H, 2H, 3H) and sets up the bonds, internal coordinates, and properties for the new atoms and new residue type (each applied patch creates a new residue type). If you look at the above file you will see that there is a special case for glycine and proline. Since the base proline doesn't have the H to remove, its case starts by adding two hydrogens (1H, 2H) and then setting up their bonds, internal coordinates, and properties. The segfault is occurring because D-PRO also lacks that H, but since there is no special case for D-PRO, the general case is being used instead. When Rosetta tries to delete the nonexistent H from D-PRO it crashes because it can't find it. There is a similar problem with N_acetylated.txt, ShoveBB.txt, VirtualBB.txt, VirtualNterm.txt, and protein_cutpoint_upper.txt.

Modified patch files that include a special case for DPRO can be found at the links below:
http://carl.bio.nyu.edu/~renfrew/for_moehle/NtermProteinFull.txt
http://carl.bio.nyu.edu/~renfrew/for_moehle/N_acetylated.txt
http://carl.bio.nyu.edu/~renfrew/for_moehle/ShoveBB.txt
http://carl.bio.nyu.edu/~renfrew/for_moehle/VirtualBB.txt
http://carl.bio.nyu.edu/~renfrew/for_moehle/VirtualNterm.txt
http://carl.bio.nyu.edu/~renfrew/for_moehle/protein_cutpoint_upper.txt
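
The patching logic described above can be illustrated with a toy model. This is not Rosetta code and not the patch file syntax: a residue type is reduced to a set of atom names, and the function name, the `no_amide_h` parameter, and the atom lists are all made up for illustration. It only shows why a residue without a backbone H fails in the general case and passes once it is listed as a special case.

```python
def apply_nterm_patch(name: str, atoms: set[str], no_amide_h: set[str]) -> set[str]:
    """Toy version of an N-terminus patch acting on a residue type's atom names."""
    patched = set(atoms)
    if name in no_amide_h:
        # Special case: there is no backbone H to delete, so only add 1H and 2H.
        patched |= {"1H", "2H"}
    else:
        # General case: delete the backbone H, then add 1H, 2H and 3H.
        if "H" not in patched:
            raise ValueError(f"unknown atom_name: {name} H")
        patched.remove("H")
        patched |= {"1H", "2H", "3H"}
    return patched


# Heavy atoms only; D-proline has no H on the backbone nitrogen.
dpro = {"N", "CA", "C", "O", "CB", "CG", "CD"}

try:
    # Stock behaviour in this toy: only PRO is treated as a special case.
    apply_nterm_patch("DPR", dpro, no_amide_h={"PRO"})
except ValueError as err:
    print("general case fails:", err)

# With DPR added to the special cases, the patch goes through.
print(sorted(apply_nterm_patch("DPR", dpro, no_amide_h={"PRO", "DPR"})))
```

The modified patch files play the role of the second call: they add DPRO to the residues handled by the proline-style case.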

• D-Arginine Problem (really two problems but you had not encountered the second yet):
• The noncanonical amino acid (NCAA) rotamer libraries are not in the release because they are rarely used and quite large (~450MB). The DARG and DPRO rotlibs can be found at the links below:
http://carl.bio.nyu.edu/~renfrew/for_moehle/dpro.rotlib
http://carl.bio.nyu.edu/~renfrew/for_moehle/darg.rotlib
• Steven correctly anticipated that there would be a scoring issue when using the standard Rosetta scoring functions with the D-amino acids. If you would like to use the D-amino acids you will need to use the MM_STD scoring function by adding "-score::weights mm_std". This energy function has had the knowledge-based terms removed and replaced with equivalent physically-based molecular-mechanics potentials. There may still be problems if certain Rosetta protocols ignore that command line option and hard-code which energy methods they use. There is also no centroid scoring function for the NCAAs, so all of your docking will have to be done in fullatom mode.
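
As a rough sketch of where that option goes, the snippet below assembles a docking command line. Only "-score::weights mm_std" comes from this post; the executable name, the database path, and the input file name are placeholders, and no option for restricting docking to fullatom mode is shown because none is named here.

```python
import shlex


def docking_command(executable: str, database: str, pdb: str, weights: str = "mm_std") -> list[str]:
    """Build an argument list that selects the given weights file for scoring."""
    return [
        executable,
        "-database", database,      # path to rosetta_database (placeholder)
        "-s", pdb,                  # input complex (placeholder)
        "-score::weights", weights, # option quoted in the post above
    ]


cmd = docking_command(
    "docking_protocol.linuxgccrelease",  # assumed executable name; depends on your build
    "/path/to/rosetta_database",
    "complex.pdb",
)
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd)` once the placeholders point at real files.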
Fri, 2011-06-03 11:51
renfrew

I copied the additional rotlib files (darg.rotlib, dpro.rotlib) into the ncaa_rotlibs dir, but the following segfault is still occurring.

The path and file names seem to be correct. Any further suggestions? Thanks a lot.
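
A quick way to double-check the files is a small script like the one below. It is only a sketch: the directory path is a placeholder for wherever ncaa_rotlibs lives in your database, and the check covers presence and non-zero size only, not whether Rosetta can parse the files.

```python
from pathlib import Path


def missing_rotlibs(rotlib_dir: str, names: list[str]) -> list[str]:
    """Return the names that are absent from rotlib_dir or are empty files."""
    root = Path(rotlib_dir)
    return [
        name for name in names
        if not (root / name).is_file() or (root / name).stat().st_size == 0
    ]


# Placeholder path; adjust to your own rosetta_database layout.
print(missing_rotlibs("/path/to/rosetta_database/ncaa_rotlibs", ["darg.rotlib", "dpro.rotlib"]))
```

An empty list means both files were found and are not empty, for example after an interrupted download.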

Mon, 2011-06-06 06:07
moehle

It may be useful to have more information. Can you do a debug build (mode=debug when you compile, instead of mode=release) and run the same command with the debug executable? It will be slower (plus a long recompile), but it may give us more information. (It won't overwrite the existing release compile you already have.)

Wed, 2011-06-08 08:52
smlewis

Hi smlewis,

Attached is the log file from the debug docking run.
Thank you again.

Thu, 2011-06-09 06:12
moehle

It tells us we've got a vector overrun, but not too much more. I'm wondering if it's crashing just after dpro, or crashing during darg, or something else. Try moving darg away and seeing if it crashes in the same way, then do the same for dpro?

Thu, 2011-06-09 07:21
smlewis

The calculation crashed after dpro as well as after darg. Without D-Pro, or with it renamed to Pro, the docking protocol seems to get one step further, with the following additional lines in the log file, but in the end it crashes too.