# Including waters in RosettaCM


I want to include several binding-site waters in my homology model, since they seem to have an important impact on the configuration of the cofactor (GTP & Mg2+). When I build the homology model without the waters, the position of the cofactor differs significantly from its position in the crystal structure template, despite the surrounding residues being completely conserved: the cofactor occupies a space that is normally occupied by several structurally conserved water molecules. However, when I try to include the waters I get the error message:

Error from core::conformation::Residue.functions.cc.Could not place the following hydrogens:  H1  with atom stubs:  O  ,  H1 , &  H2
H2  with atom stubs:  O  ,  H1 , &  H2

ERROR: Failed to place ideal hydrogen positions
ERROR:: Exit from: src/core/conformation/Residue.functions.cc line: 83
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libcore.3.so(print_backtrace(char const*)+0x2b) [0x7f6496406a4b]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libutility.so(utility::exit(std::string const&, int, std::string const&, int)+0x15c) [0x7f6494a1575c]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libcore.2.so(core::conformation::idealize_hydrogens(core::conformation::Residue&, core::conformation::Conformation const&)+0x420) [0x7f649568cb60]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f6493caef45]
caught exception

[ERROR] EXCN_utility_exit has been thrown from: src/core/conformation/Residue.functions.cc line: 83
ERROR: Failed to place ideal hydrogen positions

I tried removing the hydrogens from the waters in the PDB, but it gives the exact same error message. Is there a way to solve this or work around it?

Thu, 2017-04-27 11:59
ajkal

The issue here is that while the PDB-loading code has facilities to load in hydrogens for waters, RosettaCM invokes a function which strips off the hydrogens and then tries to rebuild them. The method RosettaCM uses can't handle a water that is missing both hydrogens. (To place one missing hydrogen, it needs the position of the other hydrogen, but that one can't be built either, as it depends on the position of the first ...)

The stop-gap solution is to use the "TP5" water type, rather than the "HOH" type. This adds additional "virtual" atoms which will not be removed by the function RosettaCM is calling, meaning that when the hydrogens are removed, their location should be able to be rebuilt based on the positions of the virtual atoms.
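
For anyone wanting to script the rename, here is a minimal sketch, assuming a standard fixed-width PDB file in which the residue name sits in columns 18-20 of each ATOM/HETATM record. The function name `rename_waters` is made up for illustration and is not part of Rosetta.

```python
def rename_waters(pdb_lines, old="HOH", new="TP5"):
    """Return a copy of the PDB lines with water residue names changed.

    Assumes fixed-width PDB records: the residue name occupies
    columns 18-20, i.e. the 0-based slice [17:20].
    """
    renamed = []
    for line in pdb_lines:
        is_coord = line.startswith(("ATOM", "HETATM"))
        if is_coord and line[17:20] == old:
            # Swap only the three residue-name characters; leave the rest intact.
            line = line[:17] + new + line[20:]
        renamed.append(line)
    return renamed
```

This only changes the residue name. It does not add the TP5 virtual atoms itself; per the explanation above, those are expected to be filled in by Rosetta when the residue is loaded.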

Sat, 2017-04-29 11:39
rmoretti

I changed the residue name to TP5, but I still get the same error. The output does suggest that the hydrogens are being built (see below), but then it crashes with the same error message as before.

core.conformation.Conformation: [ WARNING ] Building missing atom ( H1 ) at root of residue tree, using stubs:  O    EP1  EP2
This probably means that a torsion angle is being taken from the ideal residue and
should be further optimized...
core.conformation.Conformation: [ WARNING ] Building missing atom ( H2 ) at root of residue tree, using stubs:  O    EP1  H1
This probably means that a torsion angle is being taken from the ideal residue and
should be further optimized...

The error message (shown here) occurs after the hydrogens have supposedly been placed:

Error from core::conformation::Residue.functions.cc.Could not place the following hydrogens:  H1  with atom stubs:  O  ,  H1 , &  H2
H2  with atom stubs:  O  ,  H1 , &  H2

ERROR: Failed to place ideal hydrogen positions
ERROR:: Exit from: src/core/conformation/Residue.functions.cc line: 83
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libcore.3.so(print_backtrace(char const*)+0x2b) [0x7f282c01ea4b]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libutility.so(utility::exit(std::string const&, int, std::string const&, int)+0x15c) [0x7f282a62d75c]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libcore.2.so(core::conformation::idealize_hydrogens(core::conformation::Residue&, core::conformation::Conformation const&)+0x420) [0x7f282b2a4b60]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f28298c6f45]
caught exception

[ERROR] EXCN_utility_exit has been thrown from: src/core/conformation/Residue.functions.cc line: 83
ERROR: Failed to place ideal hydrogen positions

Is Rosetta not recognizing the newly placed hydrogens? Is this a bug in the source code, or is there another workaround I can use?

Mon, 2017-05-01 11:56
ajkal

I talked with an expert user of RosettaCM, and he's done RosettaCM runs with included waters without any significant problems.

However, he doesn't include the waters during the partial threading stage. Do the partial threading with just the protein portions of the template, and then manually add the waters to the template afterwards. (This also means that you need to update the sequences/alignments so that you thread without the waters, but include the waters in the sequences passed to the RosettaScripts stage.) You'll also want to put the add_hetatm="1" option into the XML for the Hybridize mover.
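
The "add the waters afterwards" step could be scripted roughly as follows. This is only a sketch under assumptions: the threaded model is assumed to still be in the same coordinate frame as the crystal template, the waters are assumed to be HETATM records named HOH in the template, and `add_template_waters` is a hypothetical helper name. It does not renumber residues or check for chain/number clashes.

```python
def add_template_waters(threaded_lines, template_lines, new_name="TP5"):
    """Copy water records from the template into the threaded model.

    Waters are renamed to `new_name` and placed directly after the last
    coordinate (ATOM/HETATM/TER) record of the threaded model, so any
    trailing non-coordinate lines (e.g. a score table) stay at the end.
    """
    waters = [
        line[:17] + new_name + line[20:]
        for line in template_lines
        if line.startswith("HETATM") and line[17:20] == "HOH"
    ]
    last_coord = -1
    for i, line in enumerate(threaded_lines):
        if line.startswith(("ATOM", "HETATM", "TER")):
            last_coord = i
    return (
        threaded_lines[: last_coord + 1]
        + waters
        + threaded_lines[last_coord + 1 :]
    )
```

The FASTA passed to the RosettaScripts stage would still need to be updated by hand to include the waters, as described above.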

Another recommendation he has is to get the most recent weekly release and use the -beta flag. The most recent ("experimental") scorefunction, beta_nov16, has *much* improved support for water hydrogen bonding geometry.

Mon, 2017-05-01 13:06
rmoretti

Thank you for your help. I went ahead with your instructions but a few new problems have arisen. As you suggested, I downloaded the newest weekly version and threaded my sequence to the template while ignoring the waters. I then manually added the waters to the threaded pdb and gave them the residue name "TP5".  I then ran the following command with the attached files:

"~/Rosetta/main/source/bin/rosetta_scripts.linuxgccrelease -database ~/Rosetta/main/database -in:file:fasta LeishTubulin1_sitewats.fasta -parser:protocol hybridize.xml -default_max_cycles 200 -dualspace -in:file:extra_res_fa GTP/GTP.fa.params GDP/GDP.fa.params -in:file:extra_res_cen GTP/GTP.cen.params GDP/GDP.cen.params -score:extra_improper_file tors/NT.fa.tors -detect_disulf false -overwrite"

I get the following errors:

core.chemical.GlobalResidueTypeSet: For ResidueTypeSet centroid there is no shadow_list.txt file to list known PDB ids.
core.chemical.GlobalResidueTypeSet: Finished initializing centroid residue type set.  Created 64 residue types
core.chemical.GlobalResidueTypeSet: Total time to initialize 0.023672 seconds.

ERROR: The residue TP5 could not be generated.  Has a suitable params file been loaded?  (Note that custom params files not in the Rosetta database can be loaded with the -extra_res or -extra_res_fa command-line flags.)
ERROR:: Exit from: src/core/chemical/ResidueTypeSet.cc line: 115
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libcore.3.so(print_backtrace(char const*)+0x2b) [0x7fd25f73f65b]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libutility.so(utility::exit(std::string const&, int, std::string const&, int)+0x15c) [0x7fd25e5d4c6c]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libcore.2.so(core::chemical::ResidueTypeSet::name_map(std::string const&) const+0xb3) [0x7fd259c80883]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libcore.3.so(core::pose::residue_types_from_sequence(std::string const&, core::chemical::ResidueTypeSet const&, bool)+0x456) [0x7fd25f801136]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libcore.3.so(core::pose::make_pose_from_sequence(core::pose::Pose&, std::string const&, core::chemical::ResidueTypeSet const&, bool)+0x1d) [0x7fd25f80171d]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libcore.3.so(core::pose::make_pose_from_sequence(core::pose::Pose&, std::string const&, std::string const&, bool)+0x4e) [0x7fd25f801b8e]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libprotocols.3.so(protocols::simple_moves::ExtendedPoseMover::apply(core::pose::Pose&)+0x41) [0x7fd25bd1e2d1]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libprotocols_g.4.so(protocols::comparative_modeling::GenericJobInputter::pose_from_job(core::pose::Pose&, std::shared_ptr<protocols::jd2::Job>)+0x2e9) [0x7fd25c6061c9]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libprotocols.1.so(protocols::jd2::JobDistributor::run_one_job(std::shared_ptr<protocols::moves::Mover>&, long, std::string&, std::string&, unsigned long&, unsigned long&, bool)+0x6ad) [0x7fd262025a1d]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libprotocols.1.so(protocols::jd2::JobDistributor::go_main(std::shared_ptr<protocols::moves::Mover>)+0xc9) [0x7fd262027b59]
/home/ajkal/Rosetta/main/source/build/src/release/linux/3.13/64/x86/gcc/4.8/default/libprotocols.1.so(protocols::jd2::FileSystemJobDistributor::go(std::shared_ptr<protocols::moves::Mover>)+0x4a) [0x7fd2620005aa]
/home/ajkal/Rosetta/main/source/bin/rosetta_scripts.linuxgccrelease() [0x405f68]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fd25da7bf45]
/home/ajkal/Rosetta/main/source/bin/rosetta_scripts.linuxgccrelease() [0x4060e2]
Error: ERROR: Exception caught by JobDistributor while trying to get pose from job 'S_0001'
Error:

[ERROR] EXCN_utility_exit has been thrown from: src/core/chemical/ResidueTypeSet.cc line: 115
ERROR: The residue TP5 could not be generated.  Has a suitable params file been loaded?  (Note that custom params files not in the Rosetta database can be loaded with the -extra_res or -extra_res_fa command-line flags.)

Error: Treating failure as bad input; canceling similar jobs
protocols.jd2.FileSystemJobDistributor: job failed, reporting bad input; other jobs of same input will be canceled: S_0001
protocols.jd2.JobDistributor: no more batches to process...
protocols.jd2.JobDistributor: 1 jobs considered, 1 jobs attempted in 0 seconds
Error: ERROR: Exception caught by rosetta_scripts application:1 jobs failed; check output for error messages
Error:

The main problem seems to be that it can't find the TP5 residue type even though the file "TP5.params" is clearly there in the "~/Rosetta/main/database/chemical/residue_type_sets/fa_standard/residue_types/water/" directory. I even tried listing the TP5.params file explicitly, under the -in:file:extra_res_fa flag, but I got the same error message. Is it a problem that the pose energies table (at the end of the threaded pdb file) doesn't include the waters (since I left them out of the threading process)?

I also tried the beta_nov16 scoring function with the -beta flag, but I kept getting errors saying my XML script was invalid. I used the rewrite_rosetta_script.py script to generate the attached XML input file, but Rosetta still says the file is invalid. What changes do I need to make to my current script to use the beta_nov16 scoring function properly? (I know I need to change talaris2014 to beta_nov16, but other than that I'm not sure how to implement the new scoring function. FWIW, my current XML script is simply adapted from the RosettaCM tutorial.)

Mon, 2017-05-01 19:54
ajkal

The issue with TP5 is that the centroid-mode ResidueTypeSet doesn't have that residue type (the file you're looking at is the full-atom parameter file). Luckily, you should be able to use the full-atom params file as the centroid one. Just add -extra_res_cen ~/Rosetta/main/database/chemical/residue_type_sets/fa_standard/residue_types/water/TP5.params to your command line.  (BTW, Rosetta doesn't use the pose energies table on input: it writes the table, but ignores it completely when reading a PDB.)
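
If you launch the run from a script, the extra flag could be assembled like this. It is a sketch only: every flag and relative path is copied from the command posted earlier in this thread, `hybridize_command` is a hypothetical helper, and the Rosetta install location is passed in as an assumption. It addresses only the missing centroid residue type, nothing else.

```python
import os


def hybridize_command(rosetta_root, fasta, xml):
    """Build the rosetta_scripts argument list with TP5 added for centroid mode."""
    main = os.path.join(rosetta_root, "main")
    # Full-atom TP5 params file, reused here as the centroid params file.
    tp5_params = os.path.join(
        main, "database", "chemical", "residue_type_sets",
        "fa_standard", "residue_types", "water", "TP5.params",
    )
    return [
        os.path.join(main, "source", "bin", "rosetta_scripts.linuxgccrelease"),
        "-database", os.path.join(main, "database"),
        "-in:file:fasta", fasta,
        "-parser:protocol", xml,
        "-default_max_cycles", "200",
        "-dualspace",
        "-in:file:extra_res_fa", "GTP/GTP.fa.params", "GDP/GDP.fa.params",
        "-in:file:extra_res_cen", "GTP/GTP.cen.params", "GDP/GDP.cen.params",
        "-extra_res_cen", tp5_params,
        "-score:extra_improper_file", "tors/NT.fa.tors",
        "-detect_disulf", "false",
        "-overwrite",
    ]
```

The returned list can be handed to `subprocess.run(...)`; checking `os.path.isfile` on the TP5 path first is a cheap way to catch a wrong install location.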

On the XML being invalid, I'm not sure what the issue is. The script you posted works fine for me. Do you have the exact error message you got? To switch to the new scorefunction, you need to replace all instances of "talaris2014" in the XML with "beta_nov16", and then you also have to pass the "-beta_nov16" option on the commandline.
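
The scorefunction swap in the XML is a plain text substitution, so it can be sketched as below. `switch_scorefunction` is a hypothetical helper, not a Rosetta tool; it returns the number of replacements so you can confirm that something actually changed.

```python
def switch_scorefunction(xml_text, old="talaris2014", new="beta_nov16"):
    """Replace every occurrence of `old` with `new` in the XML text.

    Returns (updated_text, number_of_replacements).
    """
    count = xml_text.count(old)
    return xml_text.replace(old, new), count
```

A count of zero would mean the script never referred to talaris2014 in the first place. The command-line option mentioned above still has to be passed separately.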

Tue, 2017-05-02 08:13
rmoretti

Adding "-extra_res_cen ~/Rosetta/main/database/chemical/residue_type_sets/fa_standard/residue_types/water/TP5.params" works (Rosetta is able to initialize the residue centroid set), but then I just get a Segmentation fault. Is the syntax for my "LeishTubulin_sitewats.fasta" and the first part of my PDB file correct? Getting rid of the ligand residues in these files (i.e. Z[GTP], Z[GDP], w[TP5]) gets rid of the segmentation fault (which is why I ask about them), but that obviously excludes these residues from the homology model...

The entire command/output is:

[ajkal@sackettlab]$ /home/ajkal/Rosetta/main/source/bin/rosetta_scripts.linuxgccrelease -database /home/ajkal/Rosetta/main/database -in:file:fasta LeishTubulin1_sitewats.fasta -parser:protocol hybridize_new.xml -default_max_cycles 200 -dualspace -in:file:extra_res_fa GTP/GTP.fa.params GDP/GDP.fa.params -in:file:extra_res_cen GTP/GTP.cen.params GDP/GDP.cen.params -extra_res_cen ~/Rosetta/main/database/chemical/residue_type_sets/fa_standard/residue_types/water/TP5.params -score:extra_improper_file tors/NT.fa.tors -detect_disulf false -overwrite
core.init: Rosetta version unknown:exported  from http://www.rosettacommons.org
core.init: command: /home/ajkal/Rosetta/main/source/bin/rosetta_scripts.linuxgccrelease -database /home/ajkal/Rosetta/main/database -in:file:fasta LeishTubulin1_sitewats.fasta -parser:protocol hybridize_new.xml -default_max_cycles 200 -dualspace -in:file:extra_res_fa GTP/GTP.fa.params GDP/GDP.fa.params -in:file:extra_res_cen GTP/GTP.cen.params GDP/GDP.cen.params -extra_res_cen /home/ajkal/Rosetta/main/database/chemical/residue_type_sets/fa_standard/residue_types/water/TP5.params -score:extra_improper_file tors/NT.fa.tors -detect_disulf false -overwrite
core.init: 'RNG device' seed mode, using '/dev/urandom', seed=853192164 seed_offset=0 real_seed=853192164
core.init.random: RandomGenerator:init: Normal mode, seed=853192164 RG_type=mt19937
protocols.evaluation.ChiWellRmsdEvaluatorCreator: Evaluation Creator active ...
protocols.jd2.JobDistributor: Parser is present.  Input mover will be overwritten with whatever the parser creates.
core.chemical.GlobalResidueTypeSet: For ResidueTypeSet centroid there is no shadow_list.txt file to list known PDB ids.
core.chemical.GlobalResidueTypeSet: Finished initializing centroid residue type set.  Created 65 residue types
core.chemical.GlobalResidueTypeSet: Total time to initialize 0.02006 seconds.
Segmentation fault

Tue, 2017-05-02 09:56
ajkal

I have still not been able to work out this issue. Is there any chance you could put me in touch with the RosettaCM expert you spoke to earlier, who you said has managed to include waters when constructing homology models? If there are any other files you need for troubleshooting, please let me know.

Mon, 2017-05-08 12:07
ajkal

Is there any hope of resolving this issue? I am currently at an impasse. I will be changing jobs soon so I am running out of time on this project. If you can put me in touch with the expert on Rosetta CM you mentioned, I would greatly appreciate it.

Tue, 2017-05-16 14:38
ajkal

Hi. I have run hybridize with explicit water molecules in the past and can try to get your modeling jobs working. Could you supply your full inputs? I'm only seeing the first 50 lines of your input pdb file.

Tue, 2017-06-06 11:45
rpavlov