# selecting pivot_residue in Backrub for ensemble generation

7 posts / 0 new
selecting pivot_residue in Backrub for ensemble generation
#1

Hello,

I need to generate an ensemble of proteins to capture backbone flexibility in docking. I ran relax with thorough flag, but the resultants decoys are not that much different (RMSD ranges fro 0.2 to 0.7). Then I ran Backrub without defining pivot_residue and without resfile. The resultant decoys of Backrub are also similar. Does anyone have suggestions to define pivot_residue? or how can I sample conformational space enough to consider backbone flexibility?

Thanks

Category:
Post Situation:
Fri, 2020-08-07 15:41
rohi

Without specifying a resfile, the sidechains aren't repacked, "NATRO", whereas you'd probably want them to change in conformation ("NATAA") if you want to assess your protein's plasticity.
You might want to try pumping the kT up. The default is 0.6 kcal/mol (which what you'd get at 25°C in a real system). Say, 1 kcal/mol (37°C) or higher. It is a statistical potential value so the temperature is far from real, so you can go much higher.

Backrub is not really an MD run, it is meant for protein design with flexible backbone and the end result will be a protein in an energy minima. A better alternative is the cartesianMD mover: https://www.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/Movers/movers_pages/CartesianMD

• It's a scripts/pyrosetta only, but has decent documention.
• It does what you'd expect: it rattles your protein at a given kT energy making it drift like a normal MD simulation —except in implicit solvent.
• It is okay with ligands
• It is cartesian and not internal coordinate mode though

For pyrosetta here is an example:

pose = pyrosetta.Pose()
params_paths = pyrosetta.rosetta.utility.vector1_string()
params_paths.extend([params_filename])
pyrosetta.generate_nonstandard_residue_set(pose, params_paths)
pyrosetta.rosetta.core.import_pose.pose_from_file(pose, pdb_filename)
md = pyrosetta.rosetta.protocols.md.CartesianMD(pose, pyrosetta.create_score_function('ref2015_cart'))
md.set_ncyc_premin(0)
md.set_ncyc_postmin(0)
md.set_nstep(500)
md.set_temperature(310)
md.use_rattle(True)
md.set_store_trj(True)
md.apply(pose)
ps = md.dump_poses(pose)
pyrosetta.rosetta.core.scoring.CA_rmsd(ps[1], ps[3])
print({'poses': len(md.dump_poses(pose)),
'kinetic_energy': md.kinetic_energy(),
'potential_energy': md.potential_energy(),
'score': pyrosetta.create_score_function('ref2015_cart')(pose),
'dofs': md.n_dof(),
'RMSD': pyrosetta.rosetta.core.scoring.CA_rmsd(ps[1], ps[3])}
)

Sat, 2020-08-08 03:47
matteoferla

Dear matteoferla,

Thank you for your help. Just one question:

I was trying to run the code above, but I got error saying "params_filename" is undefined. I was wondering what is "params_filename"?

Thanks

Sun, 2020-08-09 19:05
rohi

params_filename is just a string of a path of a params file  (topology file, "ligand.params") required to load the pose, if you have multiple add multiple to that list if you have none skip the second, third and four lines, it is equivalent to "-extra_res_fa". Likewise, the string pdb_filename is your filename for your pdb, equivalent to -s.

The code was meant as an example snippet and not meant to be complete. It lacks the import and init of pyrosetta

import pyrosetta
pyrosetta.init(extra_options='-no_optH false -mute all -ignore_unrecognized_res true -load_PDB_components false')

And after the run you may want to dump any pose to PDB

pose.dump_pdb('whatever.pdb')

Additionally you may want to change the setting say:

md.set_ncyc_premin(200)
md.set_ncyc_postmin(0)
md.set_nstep(50_000)

In this example, as each step is 2 fs, that would be 100 ps. The number of poses in the list ps is one every 100 I think, so you'd have 500 poses, but worth checking that detail with a short run.

Mon, 2020-08-10 03:31
matteoferla

Dear Matteoferla,

I appreciate your help. Based on your response, I provided the code below:

import pyrosetta
pyrosetta.init(extra_options='-no_optH false -mute all -ignore_unrecognized_res true -load_PDB_components false')
pose = pyrosetta.Pose()
pyrosetta.rosetta.core.import_pose.pose_from_file(pose, 'input.pdb')
md = pyrosetta.rosetta.protocols.md.CartesianMD(pose, pyrosetta.create_score_function('ref2015_cart'))
md.set_ncyc_premin(200)
md.set_ncyc_postmin(0)
md.set_nstep(200)
md.set_temperature(310)
md.use_rattle(True)
md.set_store_trj(True)
md.apply(pose)
ps = md.dump_poses(pose)
pyrosetta.rosetta.core.scoring.CA_rmsd(ps[1], ps[3])
print({'poses': len(md.dump_poses(pose)),
'kinetic_energy': md.kinetic_energy(),
'potential_energy': md.potential_energy(),
'score': pyrosetta.create_score_function('ref2015_cart')(pose),
'dofs': md.n_dof(),
'RMSD': pyrosetta.rosetta.core.scoring.CA_rmsd(ps[1], ps[3])}
)
pose.dump_pdb('cartesianMD_output.pdb')

I have some other questions:

1. I got an error with this line "pyrosetta.rosetta.core.scoring.CA_rmsd(ps[1],ps[3])" saying "IndexError". Could you talk me through this line? I know Python, but I am a little bit familiar with  PyRosetta.

2. About dumping any pose to PDB, is it fine to add pose.dump_pdb('cartesianMD_output.pdb') at the end of the code?

3. Except for the link you provided above, is there any paper that explains more cartesianMD?

Thank you

Mon, 2020-08-10 14:20
rohi

The author as stated with the code is Hahnbeom Park and I believe https://www.pnas.org/content/115/12/3054 is the reference —an MD run is done for homology modelling, so not a typical MD run you'd do with Gromacs. But it basically does an MD trajectory as you'd get from Gromacs, but with implicit solvent.

Sorry for being cryptic ps = md.dump_poses(pose), should have been defined as poses =. It is a list-like of poses. list(poses) will give a normal list. Pyrosetta objects use "vectors", most notably these start from 1. Each of the entries is a pose, which can be saved with the .dump_pdb code. So pose.dump_pdb('cartesianMD_output.pdb') will save the last, while poses[1].dump_pdb('cartesianMD_output.pdb') will save the first say.

The RMSD was to see how much did it deviate, so you can do whatever combination. The code I copied from must have returned 3 poses, while in your code there where less. Say you could do this (or analyse the trajectory however you fancy):

import pandas as pd

pd.DataFrame({f'#{i}': {f'#{j}': pyrosetta.rosetta.core.scoring.CA_rmsd(poses[i], poses[j]) for j in range(1, 1+len(poses))} for i in range(1, 1+len(poses))})


If you are in need of a good control, I'd recommend 1L2Y, the Trp-cage miniprotein which is 20 amino acids, but is an NMR ensemble, which needs to be read in a weird way:

original = pyrosetta.toolbox.rcsb.pose_from_rcsb('1L2Y') # this is a mess of 720...
# usable test bed
pose = pyrosetta.rosetta.protocols.grafting.return_region(original, 1,20)
# ensemble
poses = pyrosetta.rosetta.utility.vector1_core_pose_Pose()
n = [pyrosetta.rosetta.protocols.grafting.return_region(original, 1+i,20+i) for i in range(0, original.total_residue(),20)]
poses.extend(n)`

Tue, 2020-08-11 10:00
matteoferla

Hello,

I also tried to use Backrub with high mc_kt (mc_kt=1) and NATAA for all residues (to repack all residues), but the resultant decoys are still too similar (RMSD=0).