Design with non-canonical amino acids (NCAA)


Hello everyone. When creating a params file in Rosetta for a non-standard amino acid that has a methyl group on the backbone, how should I add the following lines?

M ROOT 13
M POLY_N_BB 13
M POLY_CA_BB 14
M POLY_C_BB 15
M POLY_O_BB 16
M POLY_IGNORE 2 3 4 5 6 8 9 10 11 12
M POLY_UPPER 7
M POLY_LOWER 1
M POLY_CHG 1
M POLY_PROPERTIES PROTEIN POLAR CHARGED
M END

When I add the parameters according to the tutorial and run the script, I get the following error:

python /home/tianflame/rosetta_bin_linux_2019.07.60616_bundle/demos/protocol_capture/using_ncaas_protein_peptide_interface_design/HowToMakeResidueTypeParamFiles/scripts/molfile_to_params_polymer.py --clobber --polymer --no-pdb --name X11 -k X11.kin X11.mol

Traceback (most recent call last):
  File "/home/tianflame/rosetta_bin_linux_2019.07.60616_bundle/demos/protocol_capture/using_ncaas_protein_peptide_interface_design/HowToMakeResidueTypeParamFiles/scripts/molfile_to_params_polymer.py", line 1995, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/home/tianflame/rosetta_bin_linux_2019.07.60616_bundle/demos/protocol_capture/using_ncaas_protein_peptide_interface_design/HowToMakeResidueTypeParamFiles/scripts/molfile_to_params_polymer.py", line 1953, in main
    polymer_assign_pdb_like_atom_names_to_sidechain( m.atoms, m.bonds, options.peptoid )
  File "/home/tianflame/rosetta_bin_linux_2019.07.60616_bundle/demos/protocol_capture/using_ncaas_protein_peptide_interface_design/HowToMakeResidueTypeParamFiles/scripts/molfile_to_params_polymer.py", line 1697, in polymer_assign_pdb_like_atom_names_to_sidechain
    a.pdb_greek_dist = greek_alphabet[all_all_dist[ca_index][i]]
TypeError: list indices must be integers, not float

Attachments:
MEASP structure file (101.61 KB)
MEASP mol file (2.79 KB)
Mon, 2019-08-05 00:34
yinasun

Are you using PyRosetta on Python 2, a 2to3-converted version of the script with PyRosetta on Python 3, or the standalone mol-to-params script from http://www.pyrosetta.org/scripts?
IIRC, I had some odd issues with the former (on a system with PyRosetta for Python 3).

Your error, though, seems to be that one of all_all_dist[ca_index][i], ca_index or i is a float, so simply changing the offending line to greek_alphabet[int(all_all_dist[int(ca_index)][int(i)])] might fix it. That said, I have always altered UPPER_CONNECT and LOWER_CONNECT manually, or semi-manually with Python + RDKit, because there are a few things you have to change that aren't documented: for example, AA ASP will fail, but AA UNK won't.
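The cast suggested above can be illustrated with a minimal sketch. The data here are made up for the demonstration (the real script's `greek_alphabet` and distance matrix will differ), and `greek_label` is a hypothetical helper, not a function from molfile_to_params_polymer.py:

```python
# Hypothetical stand-ins for the script's variables; real contents may differ.
greek_alphabet = ["A", "B", "G", "D", "E", "Z"]
all_all_dist = [
    [0.0, 1.0, 2.0],  # bond-count distances stored as floats (assumed cause)
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
]
ca_index = 0


def greek_label(dist_matrix, ca_index, i):
    """Look up the Greek-letter label, casting the float distance to int."""
    return greek_alphabet[int(dist_matrix[int(ca_index)][int(i)])]


# Reproduce the original failure: a float cannot be used as a list index.
float_index_error = None
try:
    greek_alphabet[all_all_dist[ca_index][2]]
except TypeError as err:
    float_index_error = str(err)

# With the cast, the lookup succeeds.
labels = [greek_label(all_all_dist, ca_index, i) for i in range(3)]
print(float_index_error)
print(labels)
```

Note that int() truncates, so if a distance were ever a non-integral value the cast would hide that rather than fix it; it is worth checking where the floats come from.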

Wed, 2019-08-21 12:11
matteoferla

I had to parametrise something myself, so I checked your molecule while I was at it, and I did not get your error.
I have just uploaded my 2to3 port of the standalone version from http://www.pyrosetta.org/scripts (old), along with the version shipped with Rosetta, here: https://github.com/matteoferla/mol_to_params.py, just on the off chance it was a version thing...

The `polymer` flag you are using does not exist in my version(s).
There is an amino acid flag that does the same thing, but from the code I cannot see it parsing M lines. Instead, it requires a Sybyl mol2 file rather than an SDF (MDL mol) file.

>>> babel -i mol LIG.mol -o mol2 LIG.mol2

The output will contain an atom block such as:

@<TRIPOS>ATOM
      1 C          -0.3738   -1.9100    0.2369 C.2     1  UNL1        0.2129
      2 O          -0.9668   -1.8046    1.3145 O.2     1  UNL1       -0.2759

The atom names are the ones in the `C` column, not the `C.2` (atom type) column, so manually rename the backbone atoms to C, CA, CB, HA, O, N.
However, also do a find-and-replace of all the remaining generic `C `, `N `, `H ` names with `CX` etc. The reason is that if more than one atom is called C, the first will be renamed C1 during name fixing. `CX1` etc. doesn't harm anyone, but having C-prime named C1 does.
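The renaming described above could be scripted along these lines. This is only a sketch under stated assumptions: `rename_mol2_atoms` is a hypothetical helper, the mapping of atom ids to backbone names must be chosen by hand for your molecule, and the third atom in the sample is invented for the demonstration. It relies only on mol2 atom records being whitespace-separated with the atom name in the second field.

```python
def rename_mol2_atoms(mol2_text, backbone_names, generic=("C", "N", "O", "H")):
    """Rename atoms inside the @<TRIPOS>ATOM section of a mol2 file.

    backbone_names maps atom id -> desired name (e.g. {1: "C", 5: "CA"}).
    Any other atom whose name is a bare element symbol gets an "X" suffix.
    """
    out = []
    in_atoms = False
    for line in mol2_text.splitlines():
        if line.startswith("@<TRIPOS>"):
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
            out.append(line)
            continue
        fields = line.split()
        if in_atoms and len(fields) >= 6:
            atom_id = int(fields[0])
            if atom_id in backbone_names:
                fields[1] = backbone_names[atom_id]
            elif fields[1] in generic:
                fields[1] += "X"
            # mol2 is free-format, so single-space separation is acceptable.
            line = f"{fields[0]:>7} {fields[1]:<4} " + " ".join(fields[2:])
        out.append(line)
    return "\n".join(out)


# Sample: the first two atoms are from the Babel output; the third is made up.
sample = """@<TRIPOS>MOLECULE
LIG
@<TRIPOS>ATOM
      1 C          -0.3738   -1.9100    0.2369 C.2     1  UNL1        0.2129
      2 O          -0.9668   -1.8046    1.3145 O.2     1  UNL1       -0.2759
      3 N           0.1000    0.2000    0.3000 N.am    1  UNL1       -0.3000
@<TRIPOS>BOND
     1     1     2    2
"""

# Assumption for the demo: atoms 1 and 2 are the backbone carbonyl C and O.
renamed = rename_mol2_atoms(sample, {1: "C", 2: "O"})
print(renamed)
```

Atom types (`C.2`, `O.2`) and the bond section are left untouched; only the name field changes.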

You have a methylated backbone amine. Name the methyl carbon there H, so that it takes the place of the backbone amide hydrogen.

Note that MDL mol files do not carry partial charges, so Babel predicted them (Gasteiger), which is a bit wasteful given that you used Gaussian. In any case, I think this Python script ignores them and assigns its own from a dictionary.

This is what I got from your file:

NAME LIG
IO_STRING LIG X
AA UNK
TYPE POLYMER
LOWER_CONNECT N
UPPER_CONNECT C
PROPERTIES PROTEIN
FIRST_SIDECHAIN_ATOM CB
ACT_COORD_ATOMS CB END
ATOM  N   Nbb  NH1  -0.29
ATOM  CA  CAbb CT1  0.11
ATOM  C   CObb C    0.23
ATOM  O   OCbb O    -0.27
ATOM  NX  Ntrp  X   -0.32
ATOM  C4  CH3   X   0.00
ATOM  H10 Hapo  X   0.04
ATOM  H11 Hapo  X   0.04
ATOM  H12 Hapo  X   0.04
ATOM  H13 Hpol  X   0.15
ATOM  CB  CH2   X   0.07
ATOM  C3  COO   X   0.31
ATOM  O3  OOC   X   -0.25
ATOM  O2  OH    X   -0.48
ATOM  H4  Hpol  X   0.30
ATOM  H6  Hapo  X   0.04
ATOM  H5  Hapo  X   0.04
ATOM  HA  Hapo HB   0.06
ATOM  C1  CNH2  X   0.21
ATOM  O1  ONH2  X   -0.28
ATOM  C2  CH3   X   0.01
ATOM  H2  Hapo  X   0.03
ATOM  H1  Hapo  X   0.03
ATOM  H3  Hapo  X   0.03
ATOM  H   HNbb H    0.00
ATOM  H7  Hapo  X   0.04
ATOM  H8  Hapo  X   0.04
ATOM  H9  Hapo  X   0.04
BOND_TYPE  C1   O1  2   
BOND_TYPE  C2   C1  1   
BOND_TYPE  C2   H2  1   
BOND_TYPE  H1   C2  1   
BOND_TYPE  H3   C2  1   
BOND_TYPE  N    C1  4   
BOND_TYPE  N    CA  1   
BOND_TYPE  CA   HA  1   
BOND_TYPE  C    CA  1   
BOND_TYPE  C    NX  4   
BOND_TYPE  O    C   2   
BOND_TYPE  CB   CA  1   
BOND_TYPE  CB   C3  1   
BOND_TYPE  CB   H6  1   
BOND_TYPE  C3   O3  2   
BOND_TYPE  O2   H4  1   
BOND_TYPE  O2   C3  1   
BOND_TYPE  H5   CB  1   
BOND_TYPE  H    N   1   
BOND_TYPE  H7   H   1   
BOND_TYPE  H8   H   1   
BOND_TYPE  H9   H   1   
BOND_TYPE  NX   H13 1   
BOND_TYPE  H10  C4  1   
BOND_TYPE  C4   NX  1   
BOND_TYPE  C4   H11 1   
BOND_TYPE  H12  C4  1   
CHI 1  CB   C3   O2   H4 
PROTON_CHI 1 SAMPLES 2 0 180 EXTRA 1 20
CHI 2  C1   N    CA   C  
CHI 3  N    CA   C    O  
CHI 4  N    CA   CB   C3 
CHI 5  CA   CB   C3   O3 
NBR_ATOM  CB 
NBR_RADIUS 6.354337
ICOOR_INTERNAL    N      0.000000    0.000000    0.000000   N     CA    C  
ICOOR_INTERNAL    CA     0.000000  180.000000    1.477375   N     CA    C  
ICOOR_INTERNAL    C      0.000001   70.336982    1.544964   CA    N     C  
ICOOR_INTERNAL    O    -99.553091   58.049369    1.229364   C     CA    N  
ICOOR_INTERNAL    NX   178.859637   66.807105    1.358183   C     CA    O  
ICOOR_INTERNAL    C4  -174.611607   57.490680    1.451782   NX    C     CA 
ICOOR_INTERNAL    H10  106.039241   68.351999    1.097768   C4    NX    C  
ICOOR_INTERNAL    H11  120.966193   69.536981    1.095154   C4    NX    H10
ICOOR_INTERNAL    H12  119.491437   72.127202    1.090960   C4    NX    H11
ICOOR_INTERNAL    H13  167.594191   63.545121    1.017289   NX    C     C4 
ICOOR_INTERNAL    CB  -123.561118   67.526237    1.531952   CA    N     C  
ICOOR_INTERNAL    C3   -62.881890   68.540652    1.514280   CB    CA    N  
ICOOR_INTERNAL    O3   -47.605281   54.556833    1.210454   C3    CB    CA 
ICOOR_INTERNAL    O2  -179.610207   67.911044    1.358074   C3    CB    O3 
ICOOR_INTERNAL    H4   178.474384   74.036671    0.976702   O2    C3    CB 
ICOOR_INTERNAL    H6  -117.506019   72.499339    1.096353   CB    CA    C3 
ICOOR_INTERNAL    H5  -116.769586   67.692882    1.089965   CB    CA    H6 
ICOOR_INTERNAL    HA  -118.410645   74.450962    1.090922   CA    N     CB 
ICOOR_INTERNAL    C1   -89.927468   62.855377    1.372862   N     CA    C  
ICOOR_INTERNAL    O1     3.708502   58.433969    1.234496   C1    N     CA 
ICOOR_INTERNAL    C2  -179.454890   61.886703    1.520880   C1    N     O1 
ICOOR_INTERNAL    H2   164.738725   72.785187    1.090741   C2    C1    N  
ICOOR_INTERNAL    H1   117.899082   68.962311    1.096833   C2    C1    H2 
ICOOR_INTERNAL    H3   121.464318   67.239591    1.094155   C2    C1    H1 
ICOOR_INTERNAL    H    168.282202   62.279099    1.463264   N     CA    C1 
ICOOR_INTERNAL    H7    77.658951   68.248292    1.096047   H     N     CA 
ICOOR_INTERNAL    H8   120.876588   69.664890    1.089070   H     N     H7 
ICOOR_INTERNAL    H9   118.760654   70.561942    1.095440   H     N     H8 
# These lines stolen from ALA.params
# I *believe* they will override any duplicate definitions above
ICOOR_INTERNAL  UPPER  149.999985   63.800007    1.328685   C     CA    N  
ICOOR_INTERNAL    O   -180.000000   59.200005    1.231015   C     CA  UPPER
ICOOR_INTERNAL  LOWER -150.000000   58.300003    1.328685   N     CA    C  
ICOOR_INTERNAL    H   -180.000000   60.849998    1.010000   N     CA  LOWER

 

Note the comedy lines, where hydrogens are bonded to the atom named H (which is really the methyl carbon):

BOND_TYPE H7 H 1
BOND_TYPE H8 H 1
BOND_TYPE H9 H 1
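Bonds like these can be spotted automatically. The following is a minimal sketch, assuming only the ATOM and BOND_TYPE line layout shown in the params file above; `hydrogen_hydrogen_bonds` is a hypothetical helper, and treating every Rosetta atom type that starts with "H" as a hydrogen is a simplifying assumption.

```python
def hydrogen_hydrogen_bonds(params_text):
    """Return bonds in which both atoms have a hydrogen-like Rosetta atom type."""
    atom_types = {}
    bonds = []
    for line in params_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "ATOM" and len(fields) >= 3:
            # ATOM <name> <rosetta type> <mm type> <charge>
            atom_types[fields[1]] = fields[2]
        elif fields[0] in ("BOND", "BOND_TYPE") and len(fields) >= 3:
            bonds.append((fields[1], fields[2]))
    return [
        (a, b)
        for a, b in bonds
        if atom_types.get(a, "").startswith("H")
        and atom_types.get(b, "").startswith("H")
    ]


# Excerpt taken from the params file in this post.
excerpt = """ATOM  N   Nbb  NH1  -0.29
ATOM  H   HNbb H    0.00
ATOM  H7  Hapo  X   0.04
ATOM  H8  Hapo  X   0.04
BOND_TYPE  H    N   1
BOND_TYPE  H7   H   1
BOND_TYPE  H8   H   1
"""

suspicious = hydrogen_hydrogen_bonds(excerpt)
print(suspicious)
```

A non-empty result flags atoms whose name or type was carried over from the renaming trick and may need manual correction.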

I have not tested whether this file works, but it should. You might want to revert a few changes I made, such as using LIG as the name.
Also, is your aspartate protonated? My guess is no.

 

However, 

 

Thu, 2019-08-22 05:39
matteoferla