Rosetta Commons Research Experience for Undergraduates
A Cyberlinked Program in Computational Biomolecular Structure & Design
Interns in this geographically distributed REU program have the opportunity to participate in research using the Rosetta Commons software. The Rosetta Commons software suite includes algorithms for computational modeling and analysis of protein structures. It has enabled notable scientific advances in computational biology, including de novo protein design, enzyme design, ligand docking, and structure prediction of biological macromolecules and macromolecular complexes.
We expect most internships will be in-person, but some institutions or applicants may require a virtual format.
We will adjust the program to conditions as needed.
The summer 2023 application opens on November 1.
- One week of Rosetta Code School (June 5 through June 9) where you will learn the inner details of the RosettaPython code and community coding environment, so you are fully prepared for the summer!
- 8 weeks of hands-on research in a molecular modeling and design laboratory, developing new algorithms and discovering new science.
- The summer will finish with a trip to the Rosetta Conference in the gorgeous Cascade Mountains of Washington State, where you will present your research in a poster and connect with Rosetta developers from around the world. The conference will be held from August 7 through August 10.
- This program is supported by NSF (Award 1950697). Interns will receive housing, paid travel expenses, and a $6,000 stipend.
Include the following in the application:
- Personal statement (up to 500 words, not characters) covering why this internship interests you, a brief summary of your research and computing experience, and why you are an appropriate candidate for the internship
- Two references (complete the reference forms, in the application, with contact information)
- Select top three labs and projects of interest from the list below.
- Deadline for receipt of applications is February 1, 2023.
- Deadline for receipt of recommendation letters is February 4, 2023.
- Program contact: firstname.lastname@example.org
- U.S. citizens, permanent residents, U.S. nationals, AND international students are eligible
- College Sophomores or Juniors preferred
- Major in computer science, engineering, mathematics, chemistry, biology, and/or biophysics
- Available for at least 10 weeks during the summer of 2023
- Interest in graduate school
- While not required, we seek candidates with some combination of experiences in scientific or academic research, C++/Python/*nix/databases, software engineering, object-oriented programming, and/or collaborative development (git)
- Students graduating before the start of the program are not eligible for the REU and are encouraged to apply to our RaMP Program.
Available projects and locations:
Baker Lab @ University of Washington in Seattle, WA
"Protein design using generative models"
Students will learn cutting-edge deep learning methods for protein design and apply them to current design challenges. Areas of focus include de novo enzyme design and de novo binder design.
Cheng Group @ Merck & Co. in San Francisco, CA
"Predictive models for binding and developability of antibodies"
In antibody drug discovery, two important goals are to improve antigen binding while reducing antibody self-interactions, and modeling is useful in prioritizing engineering efforts. We have generated large datasets along with homology models and conformational ensembles for each antibody in the dataset. The successful student will leverage Rosetta to generate structure-based descriptors and use them in building predictive machine learning models. The student will work with Merck & Co. scientists to assess the advantages of derived predictions alone and in combination with state-of-the-art predictive approaches.
Cooper Lab @ Northeastern University in Boston, MA
“Crowdsourcing protein folding and design”
We are exploring how citizen science and crowdsourcing through video games can help biochemists with their work. To do this, we have developed the game Foldit, a multiplayer online game that allows players without previous experience in biochemistry to work on protein folding and design problems. This project will focus on development of game-related aspects to understand and improve the player experience. Potential projects include virtual reality, procedural content generation, and dynamic difficulty adjustment.
Das Lab @ Stanford University in Stanford, CA
“Modeling and designing RNA at high resolution”
The project seeks to understand a big question related to the most fundamental, but also most mysterious, machine in living systems: the protein-making ribosome. The participant will predict the effects of mutations in the ribosome active site and compare them to experimental measurements of ribosome structure and activity made by our lab and collaborators using high-throughput biochemistry. The REU participant will learn RNA structure, design, and experimental measurements.
Glasgow Lab @ Columbia University in New York, NY
"Computational design of allosteric protein therapeutics"
Perturbations like mutations, binding interactions, and post-translational modifications (PTMs) can change the structural dynamics of a protein, which affect its biomolecular interactions, stability, other PTMs, and catalytic activity. Such structural changes affect protein function, which manifests as aberrant metabolism and disease progression. Understanding how perturbations drive conformational changes in proteins is necessary to characterize dysregulation in disease, but these changes are very difficult to observe, and there are no methods to predict them. Accurate predictions of perturbation-driven conformational changes in proteins would enable the discovery of currently invisible disease mechanisms and the design of highly specific therapeutics. We are developing a method to predict how mutations and ligand binding impact protein conformational ensembles towards uncovering the missing link between mutations and disease phenotypes. We first computationally design a library of barcoded and mutagenized proteins, and then measure the conformational dynamics of library members using a high-throughput hydrogen-deuterium exchange with mass spectrometry (HDX/MS) strategy that we are developing. Using the resulting dataset, we will build a machine learning model that can predict the effects of mutations on the conformational states of proteins.
Gray Lab @ Johns Hopkins University in Baltimore, MD
“Antibody engineering by deep learning”
Antibodies are an excellent model system for loop structure prediction and design, which remain difficult problems. Deep learning has improved high-resolution loop modeling but antigen docking remains challenging, likely due to the lack of multiple-sequence alignments. In this project, the student will combine emerging deep learning models and create and test deep generative and variational models toward designing developable, high-affinity binders for specific epitopes. The REU participant will learn antibody engineering, homology modeling and docking, and machine learning.
Hosseinzadeh Lab @ University of Oregon in Eugene, OR
"Designing protein/peptide binders for therapeutic/biosensing"
Our lab works on a number of projects focused on designing new proteins and peptides that can bind to and interact with other molecules. Some of these projects include designing peptides that bind to antiviral targets, designing proteins that bind to small molecules, and designing protein heterodimers in a high throughput manner.
Karanicolas Lab @ Fox Chase Cancer Center in Philadelphia, PA
“Designing targeted protein degraders”
PROTACs (PROteolysis TArgeting Chimeras) are a new approach to eliminate activity of a given protein in cells. Rather than inhibiting the protein of interest, PROTACs completely eliminate the target protein by inducing its degradation. PROTACs are bi-functional small molecules that use a chemical linker to join a “warhead” directed against some target protein with a moiety that recruits an E3 ubiquitin ligase. In this project, the student will apply docking to build models of several PROTACs in complex with their target proteins and E3 ubiquitin ligases, then will use these as input to develop a machine learning approach for predicting the efficacy of a given PROTAC or for designing new ones. Through this project, the REU student will learn homology modeling and docking, machine learning, and computational chemical biology.
Khare Lab @ Rutgers University in New Brunswick, NJ
"Designing stimulus-responsive enzymes for targeted chemotherapy"
The ability to design stimulus-responsiveness into any enzyme of choice would aid in our ability to interrogate and intervene in biological processes with exquisitely high spatial and temporal precision. This project focusses on developing, testing and improving pro-drug activating enzymes that can be used to better target therapeutic delivery. The participant will use computational design, high-throughput experiments and machine learning-based approaches to obtain enzymes that can be activated by external stimuli such as light, or by the presence of tissue-specific molecules.
Khmelinskaia Lab @ University of Bonn in Bonn, Germany
"Expanding the structural and functional space of de novo designed protein assemblies"
Computational methods have been recently developed for designing novel protein assemblies with atomic-level accuracy, yet several aspects of current methods limit the structural and functional space that can be explored. We aim to expand the range of protein assemblies available for application by introducing and controlling new structural properties (e.g. flexibility, structural switches) and functional moieties (e.g. sequence recognition elements, surface binding). Students will combine computational protein modelling and design methods with in vitro biophysical characterization techniques of the designed protein materials, having the chance to learn both computational and wet-lab skills.
King Lab @ University of Washington in Seattle, WA
“De novo design of nanoparticle vaccine scaffolds tailored to the display of specific antigens”
Computationally designed nanoparticles have proven to be a robust and versatile platform for multivalent antigen display, a strategy that can improve the potency and breadth of vaccines. However, suitable nanoparticle scaffolds are not available for all types of antigens. In this project, the REU participant will use cutting-edge design methods to generate nanoparticle scaffolds to display several classes of viral glycoprotein antigens. The designed nanoparticles will be tested in the lab to determine if they assemble to the target architecture and can display the antigens as intended.
Kortemme Lab @ University of California, San Francisco, in San Francisco, CA
“Computational design of de novo proteins to control biological signaling”
We are working towards engineering synthetic signaling systems built from de novo designed protein components that can recognize inputs, transduce signals, and control programmable outputs. We have a range of projects to create proteins with custom-designed shapes to recognize specific signals, and to engineer switchable protein structures. We integrate computational design, including recent advances from deep learning, and experimental characterization in vitro and in cellular systems.
Kuhlman Lab @ University of North Carolina, Chapel Hill in Chapel Hill, NC
“Design of protein switches and complexes”
Enzymes can dramatically increase the rate of chemical reactions because they bind with high affinity to the reaction transition state. The goal of this project is to develop new machine learning methods for designing small molecule binding sites in proteins. These methods will be useful for creating novel enzymes and biosensors.
Lindert Lab @ Ohio State University in Columbus, OH
"Structure Modeling using Mass Spec Data"
Knowledge of protein structure is paramount to understanding biological function and to developing new therapeutics. Recently developed mass spectrometry experiments provide some structural information, but not enough to unambiguously assign atomic positions. These methods offer sparse experimental data, which can also be noisy and inaccurate in some instances. We are developing integrative modeling techniques that combine computational modeling with mass spec data to enable prediction of protein complex structure from the experimental data.
Meiler Lab @ Vanderbilt University in Nashville, TN
"Integrating Artificial Intelligence and Protein Structure for Drug Discovery"
The focus of this project is the development of new computer algorithms that integrate ligand-based methods (i.e., AI-driven drug discovery) with structure-based methods (i.e., docking within RosettaLigand). The student will be trained in both types of methods and afterwards develop and integrate an AI model into RosettaLigand for this task. Several drug discovery application projects in cancer, neuroscience, and metabolic diseases are running in the laboratory to test the new method in a realistic, practical setting.
Merck Protein Engineering Lab in Rahway, NJ
“Design and engineering of novel enzymes”
Enzymes catalyze a diverse set of chemical transformations with significant rate enhancements and with excellent chemo-, stereo-, and regiospecificity. These features, combined with the fact that enzymes operate in aqueous solution and are typically more environmentally friendly than synthetic catalysts, have led to the broad adoption of enzymes in the chemical industries. While enzymes are amazing catalysts, they have evolved to solve the challenges faced by Mother Nature and not the challenges we face today. We use computational protein design and evolution-based methods to engineer and invent new protein functions. This project will leverage our high-throughput automation capabilities with structure-based design and machine learning to engineer enzymes with novel properties. Students will gain experience in computational protein design, machine learning, and wet-lab methods for engineering proteins.
Mills Lab @ Arizona State University in Tempe, AZ
"Computational design of proteins containing functional non-canonical amino acids"
Despite the amazing functions proteins achieve with only 20 standard building blocks, the ability to add new chemistries to the genetic codes of standard organisms could allow for new functions. Over the last two decades, more than 150 "non-canonical amino acids" (NCAAs) have been added to the genomes of organisms from E. coli to mice. In the Mills lab, we use Rosetta to design proteins that take advantage of the novel chemical functionalities contained in some of these NCAAs. Current efforts are focused on the development of rapid diagnostics (e.g. for COVID-19) and new metalloproteins. Interns in our group will have the opportunity to learn how to both design and experimentally characterize new proteins containing NCAAs.
Rocklin Lab @ Northwestern University in Chicago, IL
"Applying high-throughput experimental data to guide computational protein design"
Today, most computational protein design tools like Rosetta use the features of natural protein structures (which amino acids like to be near each other, what types of structures are very common, etc.) to guide the design of new proteins. However, for many applications, we want to design proteins with properties far beyond what already exists in nature. To achieve this, we need new sources of data - not just natural protein structures - that can guide design into new territory. Our lab develops new experimental methods to measure properties like folding stability, binding affinity, and dynamics for tens to hundreds of thousands of designed or natural proteins at the same time. We then use these new large datasets to guide protein design. We have a range of different projects focused on basic science, therapeutic development, and tools for synthetic biology. Each person's project is described on our website (www.rocklinlab.org). We will work with an intern or post-bac to find which project in our lab is best for their interests.
Schoeder Lab @ University of Leipzig in Leipzig, Germany
“Designing the next generation of gene and cell protein therapeutics”
Computer-assisted protein design has emerged in recent years as a leading technology for designing tailored therapeutics and vaccines. In the Schoeder Lab we leverage structure-based methods and machine learning to design novel therapeutics including antibodies, adeno-associated virus vectors for gene therapy, and chimeric antigen receptors for immunotherapy. We combine these computational approaches with experimental validation and biophysical studies. Students will have the chance to gain experience in both computational and wet-lab methods for engineering and characterizing protein therapeutics.
Schueler-Furman Lab @ Hebrew University in Jerusalem, Israel
“How do post-translational modifications change the communication of a protein with its partners?”
Would you like to learn more about how interactions that are mediated by short peptide motifs regulate cellular behavior? Join our lab for the summer to work on a project that will involve different deep learning techniques and modeling using Rosetta to characterize motifs in flexible regions of a protein and their interactions with different partners, and to design specific inhibitors for these interactions.
Siegel Lab @ University of California, Davis in Davis, CA
"Computational enzyme design and modeling"
In the Siegel Lab undergraduate research project, students investigate structure-function relationships in enzymes and collect relevant data for the computational protein modeling and design community. Students use computational modeling tools to design novel protein variants, build their variant genes with site-directed mutagenesis followed by sequence verification, learn to express and purify their variant enzymes, and finally characterize them biophysically with colorimetric kinetic and thermal stability assays. The data generated by the students will be used to train biomolecular modeling software to more accurately predict enzyme function, which remains a holy grail in the field. Improvements in in silico model accuracy will translate to huge gains in wet-lab efficiency when engineering proteins to tackle today’s grand challenges.
Slusky Lab @ University of Kansas in Lawrence, KS
“Design of biosensors using machine learning”
The most time-consuming step of enzyme design (especially in the case of de novo design or design on previously non-catalytic scaffolds) is experimentally screening dozens or even hundreds of proteins to find the ones that function as intended. We are creating and testing machine learning classifiers to accurately determine which designs will succeed. A successful classifier would dramatically accelerate progress in computational enzyme design and be a significant advance over the state of the art. Designed enzymes could be used for environmental remediation, for example to break down oil spills.
Whitehead Lab @ University of Colorado, Boulder in Boulder, CO
"Designing ligand-activatable proteins"
The intern will design de novo allosteric effector sites into proteins by designing a disruptive, cavity-forming residue mutation or deletion. It has been shown that these structural disruptions can have a significant impact on protein function through various mechanisms, including local unfolding or perturbation of catalytically important residues (Deckert et al. 2012). This computational approach will be tested against a range of biotechnologically-relevant proteins, including polymerases and gene editing ribonucleoproteins (e.g. CRISPR systems).
Yarov-Yarovoy Lab @ University of California, Davis in Davis, CA
"Design of macrocycles, peptides, and antibodies targeting ion channels"
This project aims to design potent and selective macrocycles, peptides, and antibodies as modulators of ion channels and as molecular probes to visualize ion channel activity in live cells. Three recent breakthroughs have together set the stage for design of macrocycles, peptides, and antibodies targeting ion channels: (1) high-resolution cryoEM and x-ray structures of ion channels, (2) Rosetta protein design, and (3) AlphaFold protein structure prediction. Rosetta Interns will work with an interdisciplinary and collaborative research team and learn how to use Rosetta and AlphaFold to design prototypes of macrocycles, peptides, and antibodies as modulators of ion channels.
"De novo design and characterization of miniprotein therapeutics for the treatment of cancer"
"Using the protein manifold sampler to discover new drugs"
"Vaccine design and antibody design"
"High-throughput prediction of antibody developability from sequence and structural features"
Companies may partner with us and sponsor an intern.
Intern Research Posters: