
PyRosetta: score protein structures with missing atoms in sidechain

I have a question regarding PyRosetta. I want to use PyRosetta to score some protein structures that have missing sidechain atoms. My structures contain all backbone atoms, but each residue is missing one or two sidechain atoms. I used pose_from_pdb to read each structure (PDB file), but it seems PyRosetta fills in the missing atoms according to the rotamer library. Is there a way to prevent this, so that I can score my structures with the sidechain atoms still missing?

Thanks a lot for the help,

Fri, 2021-04-09 13:33
xuezhi

Those sidechains interact with neighbouring sidechains. Therefore, if you were to replace them with alanines, say, the solvation terms of the nearby atoms might get worse. There is also the problem that the ref term differs between residue types.

# r is the pose index (1-based) of the residue to mutate
MutateResidue = pyrosetta.rosetta.protocols.simple_moves.MutateResidue
MutateResidue(target=r, new_res='ALA').apply(pose)

An alternative is to make the sidechain atoms virtual. A virtual atom does not count in the scoring.

This is my test pose

import pyrosetta
pyrosetta.init()  # must be called once before building or scoring poses

pose = pyrosetta.pose_from_sequence('ELVISISALIVE')
scorefxn = pyrosetta.get_fa_scorefxn()
relax = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn, 15)
relax.apply(pose)
scorefxn(pose)  # 4.690357769636089
pose2pandas(pose, scorefxn)  # personal method.
# Leucine and valines score badly, at +1.5 kcal/mol
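
`pose2pandas` above is a personal helper and is not shown. A minimal stand-in, assuming all that is needed is one row of score terms per residue, could look like the sketch below. The name `per_residue_terms` and its arguments are hypothetical; the only PyRosetta calls it relies on are `pose.total_residue()`, `pose.residue(i).name3()` and `pose.energies().residue_total_energies(i)` indexed by a score type.

```python
def per_residue_terms(pose, scorefxn, score_types):
    """Hypothetical stand-in for pose2pandas: one dict of score terms per residue.

    score_types maps a column label to the key used to index the
    per-residue energy map, e.g. {'fa_atr': ScoreType.fa_atr}.
    """
    scorefxn(pose)  # scoring fills pose.energies()
    rows = []
    for i in range(1, pose.total_residue() + 1):  # pose numbering is 1-based
        emap = pose.energies().residue_total_energies(i)
        row = {'residue': i, 'name3': pose.residue(i).name3()}
        for label, key in score_types.items():
            row[label] = emap[key]
        rows.append(row)
    return rows
```

With PyRosetta loaded, something like `pd.DataFrame(per_residue_terms(pose, scorefxn, {'fa_atr': pyrosetta.rosetta.core.scoring.ScoreType.fa_atr}))` should give a table similar in spirit to the one used here. As far as I know these per-residue values are unweighted, so they may need multiplying by the scorefunction weights before comparing with weighted totals.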

Adding the variant type `pyrosetta.rosetta.core.chemical.VariantType.VIRTUAL_SIDE_CHAIN` does not make the atoms virtual per se, but it does the job.

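import pandas as pd

# work on a copy so the original pose stays available for comparison
ghosted = pose.clone()
# assumption: vt_mover is a ModifyVariantTypeMover (its construction is not shown in the post)
vt_mover = pyrosetta.rosetta.protocols.simple_moves.ModifyVariantTypeMover()
vt_mover.set_additional_type_to_add('VIRTUAL_SIDE_CHAIN')
leu_sele = pyrosetta.rosetta.core.select.residue_selector.ResidueNameSelector('LEU')
vt_mover.set_residue_selector(leu_sele)
vt_mover.apply(ghosted)  # applied to the copy, which is what gets inspected below
# did it work?
res = ghosted.residue(2)
print(res.name3(), res.has_variant_type(pyrosetta.rosetta.core.chemical.VIRTUAL_SIDE_CHAIN))
# yeay!
# what is the score?
pd.DataFrame([pose2pandas(pose, scorefxn).iloc[1],
              pose2pandas(ghosted, scorefxn).iloc[1]]).transpose()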

The `fa_atr` term goes from -2.15142 to -0.897523 (less negative), so the leucine scores worse. The `ref` score is unchanged, as hoped (ref is a fudge factor for each residue type to give near-empirical ∆∆G scores). Do note that `ResidueNameSelector` weirdly will no longer recognise the residue as having that name, even though its three-letter name is unchanged.
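
To put a number on the change in `fa_atr` quoted above, the per-term difference can be computed directly. This is only a small sketch: the helper name `term_deltas` is hypothetical, and the two `fa_atr` values are the only figures taken from the post.

```python
def term_deltas(before, after):
    """Return after - before for every score term present in both dicts."""
    return {term: after[term] - before[term] for term in before if term in after}

intact_scores = {'fa_atr': -2.15142}     # leucine 2, intact sidechain
virtual_scores = {'fa_atr': -0.897523}   # leucine 2, virtual sidechain
deltas = term_deltas(intact_scores, virtual_scores)
print({term: round(value, 6) for term, value in deltas.items()})
```

A positive delta (roughly +1.25 here) means the attractive term became less favourable once the sidechain was virtualised.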

One thing to note is that as soon as a mover touches the residue, the virtualisation goes away.

relax = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn, 15)
relax.apply(ghosted)
res = ghosted.residue(2)
print(res.name3(), res.has_variant_type(pyrosetta.rosetta.core.chemical.VIRTUAL_SIDE_CHAIN))
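
To see at a glance which residues still carry the variant after a mover has run, the single-residue check above can be looped over the whole pose. A minimal sketch: the function name `residues_with_variant` is hypothetical, and it assumes only `pose.total_residue()` and `residue(i).has_variant_type(...)`.

```python
def residues_with_variant(pose, variant_type):
    """Return the 1-based pose indices of residues that carry variant_type."""
    return [i for i in range(1, pose.total_residue() + 1)
            if pose.residue(i).has_variant_type(variant_type)]
```

Calling `residues_with_variant(ghosted, pyrosetta.rosetta.core.chemical.VIRTUAL_SIDE_CHAIN)` before and after the relax step should show which residues lost the virtualisation.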

 

Sat, 2021-04-10 07:18
matteoferla