# Problem with undetected lower and upper terminus variants for new polymer


Hi, everyone,

I am struggling to generate params files that represent a new linear polymer in Rosetta. I am following the general approach used for the polymers already implemented (such as CAAs, NCAAs or nucleic acids), in which a params file declares the lower and upper connect atoms. I have also set up the internal coordinate entries for the upper and lower connections, which are needed to generate meaningful connections between the polymer's residues. Additionally, I have created upper and lower terminus variants as separate params files, changing their atoms and charges and removing the lower or upper connection as appropriate.

My problem arises when I try to read a PDB file containing a ten-residue version of the polymer. I am reading all three params files with the -extra_res_fa option; however, Rosetta complains:

ERROR: Unable to find desired residue 'PET' with variant 'UPPER_TERMINUS_VARIANT'. Attempted to add target variant(s) to ResidueType using both ResidueType base name 'PET' and base ResidueType. Was attempting to add new variant type 'UPPER_TERMINUS_VARIANT'

I tried to find how Rosetta interprets the upper and lower terminus variants, but I could not find much information about this. I looked at the database and found the param files for protein caps (i.e., ACE and NME residues) and, copying that idea, I set up my upper and lower variants as different cap residues; however, the error continued to be the same.

Is there any way I can make Rosetta aware of the polymer residue's upper and lower terminus variants? Or does my problem lie somewhere else entirely?

I am attaching all my params files.

(I apologise for putting this in the Non-Canonical Peptides category; I could not find anything more adequate).

Sat, 2022-06-11 06:51
Martin Floor

The problem is that you are using the same residue name three times, for three different residue types, whereas the system expects a single base residue with patch definitions for its variants.

Like terminal residues, patches are bizarre until you have made one; then they make sense. I wrote about them here, but you do not need to read that. In summary, unlike residue types, patches stack (within a category), so you can redeclare the upper and lower patches specifically for your residue, and the copy-paste text mangling is straightforward.

Wed, 2022-06-15 02:03
matteoferla

OK, I think that was a big step toward adding the variants. I have created the two patch files needed (one each for the upper and lower variants) and loaded them with the -extra_patch_fa option, which recognised them as patches. However, when trying to read a ten-residue version of the polymer, I get a segmentation fault:

...
core.chemical.GlobalResidueTypeSet: Finished initializing fa_standard residue type set.  Created 986 residue types
core.chemical.GlobalResidueTypeSet: Total time to initialize 0.999578 seconds.
core.import_pose.import_pose: File 'pet_10.pdb' automatically determined to be of type PDB
core.io.pose_from_sfr.PoseFromSFRBuilder: Adding undetected lower terminus type to residue 1,    1
core.io.pose_from_sfr.PoseFromSFRBuilder: Adding undetected upper terminus type to residue 10,   10
Segmentation fault (core dumped)

I then built Rosetta in debug mode to see if I could get more information about where the error originates:

/usr/include/c++/9/bits/shared_ptr_base.h:1007: std::__shared_ptr_access<_Tp, _Lp, <anonymous>, <anonymous> >::element_type& std::__shared_ptr_access<_Tp, _Lp, <anonymous>, <anonymous> >::operator*() const [with _Tp = const core::chemical::MutableICoorRecord; __gnu_cxx::_Lock_policy _Lp = __gnu_cxx::_S_atomic; bool <anonymous> = false; bool <anonymous> = false; std::__shared_ptr_access<_Tp, _Lp, <anonymous>, <anonymous> >::element_type = const core::chemical::MutableICoorRecord]: Assertion '_M_get() != nullptr' failed.
Aborted (core dumped)


It says something about the internal coordinates, but I am not sure what exactly could be wrong. Maybe I am missing something obvious in the patch files?

I am putting my patches file here, as well as the typical ROSETTA_CRASH.log file.
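
For anyone reproducing this from PyRosetta rather than the command line, here is a minimal sketch of how the two options mentioned above could be combined into one option string. The helper name `build_extra_options` and the file names are hypothetical; only the flags -extra_res_fa and -extra_patch_fa come from this thread.

```python
from typing import Iterable


def build_extra_options(params_files: Iterable[str],
                        patch_files: Iterable[str] = ()) -> str:
    """Assemble a Rosetta option string for extra params and patch files.

    Hypothetical helper: the base params file goes under -extra_res_fa,
    the terminus patches go under -extra_patch_fa.
    """
    parts = []
    params_files = list(params_files)
    patch_files = list(patch_files)
    if params_files:
        parts.append("-extra_res_fa " + " ".join(params_files))
    if patch_files:
        parts.append("-extra_patch_fa " + " ".join(patch_files))
    return " ".join(parts)


# Hypothetical file names, for illustration only.
options = build_extra_options(["PET.params"],
                              ["PET_lower.txt", "PET_upper.txt"])
print(options)
```

The resulting string could then be handed to `pyrosetta.init(extra_options=options)`, or the same flags given directly on the command line.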

Wed, 2022-06-29 05:32
Martin Floor

Okay. I gave your snippets a spin in PyRosetta (same as Scripts; the code is pasted below just for posterity), but I was confused by the fact that with the autotermini off (same as the command line argument -use_truncated_termini false) and relying on the original, the output is a mess even when the bonds are optimised.

I suspect the connection might be declared back to front or something peculiar. So that is a way bigger issue than bad termini.

Namely, your PDB may read in fine, albeit with odd termini, but the structure will blow up.

Also, for the sake of formality, I would expect the oxygen linking the two residues not to be part of the terephthalate but of the ethylene glycol, as that is where it came from.

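# ## Copypasted funs
import pyrosetta
import requests

pyrosetta.init()

def get_block_from_forum(url: str) -> str:
    """Download the text of a file attached to a forum post."""
    response = requests.get(url)
    response.raise_for_status()
    return response.text

# The `def` line and the calls after `stream = ...` were lost when this was
# pasted; the function name `add_params_block` is a reconstruction.
def add_params_block(params_block: str,
                     name: str,
                     rts: pyrosetta.rosetta.core.chemical.ResidueTypeSet):
    """
    Add a residue type via a params file block.
    Cannibalised from rdkit_to_params.
    Remember to run pose.conformation().reset_residue_type_set_for_conf(rts)
    """
    buffer = pyrosetta.rosetta.std.stringbuf(params_block)
    stream = pyrosetta.rosetta.std.istream(buffer)
    new = pyrosetta.rosetta.core.chemical.read_topology_file(stream,
                                                             name,
                                                             rts)
    rts.add_base_residue_type(new)

# ## Get blocks

pose = pyrosetta.Pose()
rts = pose.conformation()\
    .modifiable_residue_type_set_for_conf(pyrosetta.rosetta.core.chemical.FULL_ATOM_t)
# The attachment URLs were not preserved in this post. Each params block would be
# fetched with get_block_from_forum(url) and registered with
# add_params_block(block, 'PET', rts) at this point.
pose.conformation().reset_residue_type_set_for_conf(rts)
print(rts.get_base_types_name3('PET'))

# ## create pose
pyrosetta.io.make_pose_from_sequence(pose, 'X[PET]X[PET]X[PET]',
                                     rts, True)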

# ## minimise
scorefxn = pyrosetta.create_score_function('ref2015_cart')
cycles = 15
relax = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn, cycles)
movemap = pyrosetta.MoveMap()
movemap.set_bb(True)
movemap.set_chi(True)
movemap.set_jump(True)
relax.set_movemap(movemap)
relax.minimize_bond_angles(True)
relax.minimize_bond_lengths(True)
relax.apply(pose)

# ## Check
import nglview as nv
nv.show_rosetta(pose)

Wed, 2022-06-29 07:52
matteoferla

So, would you say this is an issue with the definition of the internal coordinates? It seems that inter-monomer connections are the only thing distorted.
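
One way to put a number on "distorted" is to measure the inter-monomer bond in a dumped PDB. Below is a minimal sketch that reads plain PDB text and reports the distance between the upper connect atom of each residue and the lower connect atom of the next. It assumes a single chain with sequential residue numbers, and the function name `connection_lengths` is made up for illustration; the default atom names C10 and O1 are the connect atoms in my params file.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


def connection_lengths(pdb_text: str,
                       upper_atom: str = "C10",
                       lower_atom: str = "O1") -> List[Tuple[int, int, float]]:
    """Return (residue i, residue i+1, distance in Angstrom) per connection.

    Assumes one chain and standard fixed-column PDB ATOM/HETATM records.
    """
    coords: Dict[Tuple[int, str], Coord] = {}
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 54:
            resi = int(line[22:26])          # residue number, columns 23-26
            atom = line[12:16].strip()       # atom name, columns 13-16
            coords[(resi, atom)] = (float(line[30:38]),
                                    float(line[38:46]),
                                    float(line[46:54]))
    residues = sorted({resi for resi, _ in coords})
    lengths = []
    for prev, nxt in zip(residues, residues[1:]):
        a = coords.get((prev, upper_atom))
        b = coords.get((nxt, lower_atom))
        if a is not None and b is not None:
            lengths.append((prev, nxt, math.dist(a, b)))
    return lengths
```

If only these distances come out far from the roughly 1.5 Å given in the ICOOR_INTERNAL lines while the intra-residue bonds look normal, that would point at the connection definitions rather than at the rest of the params file.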

Thu, 2022-06-30 10:05
Martin Floor

Yes. Setting -use_truncated_termini to true circumvents the terminus issue, which is a good way to verify the main params file. Even if the bad connection does not affect each residue on its own, it will blow up the structure as soon as the backbone is altered. Feel free to try it.

I looked at your file and it does indeed seem like your connects are wired backwards:

LOWER_CONNECT O1
UPPER_CONNECT C10
ICOOR_INTERNAL   UPPER -16.465872   60.478963    1.511195   O1    C1    O2
ICOOR_INTERNAL   LOWER 112.041935   66.821965    1.508362   C10   C9    O3
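
In the lines above, UPPER is built off O1, which is the LOWER_CONNECT atom, and LOWER is built off C10, the UPPER_CONNECT atom. In the canonical amino acid params files it is the other way round: UPPER hangs off the UPPER_CONNECT atom (C) and LOWER off the LOWER_CONNECT atom (N). Here is a minimal sketch of a sanity check for this, assuming the usual ICOOR_INTERNAL column order (child, phi, theta, distance, parent, angle atom, torsion atom); the function name `check_connect_icoor` is made up for illustration.

```python
from typing import Dict, List


def check_connect_icoor(params_text: str) -> List[str]:
    """Report LOWER/UPPER ICOOR entries whose parent atom is not the
    atom declared in the matching LOWER_CONNECT/UPPER_CONNECT line."""
    connect: Dict[str, str] = {}       # "LOWER"/"UPPER" -> declared connect atom
    icoor_parent: Dict[str, str] = {}  # "LOWER"/"UPPER" -> parent atom in ICOOR
    for line in params_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("LOWER_CONNECT", "UPPER_CONNECT") and len(fields) >= 2:
            connect[fields[0].split("_")[0]] = fields[1]
        elif (fields[0] == "ICOOR_INTERNAL" and len(fields) >= 8
              and fields[1] in ("LOWER", "UPPER")):
            icoor_parent[fields[1]] = fields[5]
    problems = []
    for end in ("LOWER", "UPPER"):
        if end in connect and end in icoor_parent and connect[end] != icoor_parent[end]:
            problems.append(
                f"{end}: {end}_CONNECT is {connect[end]} "
                f"but ICOOR_INTERNAL {end} is built from {icoor_parent[end]}"
            )
    return problems


snippet = """\
LOWER_CONNECT O1
UPPER_CONNECT C10
ICOOR_INTERNAL   UPPER -16.465872   60.478963    1.511195   O1    C1    O2
ICOOR_INTERNAL   LOWER 112.041935   66.821965    1.508362   C10   C9    O3
"""
for problem in check_connect_icoor(snippet):
    print(problem)
```

On the four lines quoted above this flags both LOWER and UPPER. It only checks the parent atom, not whether the angles and distances are sensible.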


There may be other issues, so it might be worth checking in a simpler system whether it behaves as you are hoping.

Fri, 2022-07-01 13:49
matteoferla