Rosetta Commons Research Experience for Undergraduates

A Cyberlinked Program in Computational Biomolecular Structure & Design

Interns in this geographically-distributed REU program have the opportunity to participate in research using the Rosetta Commons software. The Rosetta Commons software suite includes algorithms for computational modeling and analysis of protein structures. It has enabled notable scientific advances in computational biology, including de novo protein design, enzyme design, ligand docking, and structure prediction of biological macromolecules and macromolecular complexes. 

Due to the COVID-19 global pandemic, the summer 2020 program was administered in a virtual format. While we hope this is not the case, we can continue the virtual format for summer of 2021, if needed.

The summer 2022 application will open on November 1, 2021. 

The program:

  • One week of Rosetta Code School (June 7 through June 11), where you will learn the inner details of the RosettaPython code and community coding environment, so you are fully prepared for the summer!
  • 8 weeks of hands-on research in a molecular modeling and design laboratory, developing new algorithms and discovering new science.
  • The summer will finish with a trip to the Rosetta Conference in the gorgeous Cascade Mountains of Washington State, where you will present your research in a poster and connect with Rosetta developers from around the world. The conference will be held from August 10 through August 13. 
  • This program is supported by NSF (Award 1950697). Interns will receive housing, paid travel expenses, and a $6,000 stipend. 


To apply:

  • Include the following in the application:
    • Resume
    • Transcript
    • Personal statement (up to 500 words): why this internship interests you, a brief summary of your research and computing experience, and why you are an appropriate candidate for the internship
    • Two references (complete the reference forms in the application with contact information)
    • Your top five labs and projects of interest, selected from the list below
  • Deadline for receipt of applications is February 1, 2022.
  • Deadline for receipt of recommendation letters is February 3, 2022. 
  • Program contact: .



Eligibility:

  • U.S. citizens, permanent residents, U.S. nationals, AND international students are eligible
  • College Sophomores or Juniors preferred
  • Major in computer science, engineering, mathematics, chemistry, biology, and/or biophysics
  • Available for at least 10 weeks during the summer of 2022
  • Interest in graduate school  
  • While not required, we seek candidates with some combination of experiences in scientific or academic research, C++/Python/*nix/databases, software engineering, object-oriented programming, and/or collaborative development (git)
  • Students graduating before the start of the program are not eligible for the REU and are encouraged to apply to our post-bac program.


Available projects and locations:



Cheng Group @ Merck & Co. in San Francisco, CA

"Predictive models for binding and developability of antibodies"

In antibody drug discovery, two important goals are to improve antigen binding while reducing antibody self-interactions, and modeling is useful in prioritizing engineering efforts. We have generated large datasets along with homology models and conformational ensembles for each antibody in the dataset. The successful student will leverage Rosetta to generate structure-based descriptors and use them in building predictive machine learning models. The student will work with Merck & Co. scientists to assess the advantages of derived predictions alone and in combination with state-of-the-art predictive approaches.


Cooper Lab @ Northeastern University in Boston, MA

"Crowdsourcing protein folding and design"

We are exploring how citizen science and crowdsourcing through video games can help biochemists with their work. To do this, we have developed the game Foldit, a multiplayer online game that allows players without previous experience in biochemistry to work on protein folding and design problems. This project will focus on development of game-related aspects to understand and improve the player experience. Potential projects include virtual reality, procedural content generation, and dynamic difficulty adjustment. 


Correia Lab @ École Polytechnique Fédérale de Lausanne in Lausanne, Switzerland  

"Deciphering protein surface features for the prediction and design of function"

Proteins are engraved with surface patterns that determine their function. We are creating a computational framework to identify and design such functional fingerprints. Several projects are possible; this is just one of the themes we are currently exploring.


Gray Lab @ Johns Hopkins University in Baltimore, MD

“Antibody engineering by deep learning”

Antibodies are critically important for our immune systems and as drugs with high target specificity. Our lab has developed new deep learning-based approaches to predict the structure of antibody loops, and we are now poised to use these networks to design new antibody molecules. For example, we would like to be able to design antibodies quickly for emerging pandemics like coronavirus, even as structural information on a new pandemic may be limited. Good designs will require high-resolution models of the antigen and the antibody loops and an effective design algorithm. In this project, you will learn about deep learning methods (including generative models), antibody structure and therapeutics, and scientific model validation.


Gront Lab @ University of Warsaw in Warsaw, Poland

"Rosetta on the Web: Homology Modelling and other applications"

The use of Rosetta software can be intimidating for inexperienced users due to the size of the package and its level of complexity. A possible way to reach a wider audience is to provide an interactive web-based user interface. In this project you will turn one of the Rosetta protocols into a fully working Python web application. You will then apply it to analyse and visualise your research results! This will be a learning experience in Rosetta modelling and will also help you improve your Python skills at any level.


Huang Lab @ Stanford University in Stanford, CA

"Protein design for immunological intervention"

We actively develop ML-based protein design tools, as well as wet-lab driven molecular platforms for intervention with the immune system. We recently developed a new monobody engineering software pipeline and a molecular platform that can specifically target MHC antigens. We hope to expand these two areas of research. We combine Rosetta, neural networks and yeast display to achieve these goals.


Jha Lab @ Los Alamos National Laboratory in Los Alamos, NM

"Enzyme design for novel hydrolase activities"

This project applies the Design-Build-Test-Learn (DBTL) cycle to enzyme engineering. Rosetta will inform the design of libraries, and high-throughput assays (Jha et al, ACS Syn Biol, 2020) will be used to test them. Information from good and bad performers will feed into the next DBTL cycle.


Karanicolas Lab @ Fox Chase Cancer Center in Philadelphia, PA

“Designing targeted protein degraders (PROTACs)”

The Karanicolas lab is developing methods for computationally designing "molecular glues": small molecules that induce two proteins to come together in cells, as a means to rationally tune the proteins' activities. We are specifically interested in targeting a few key proteins that drive cancer. Through this project you will learn how to use Rosetta for modeling protein-protein complexes, how to do virtual screening using large libraries of small molecules, and how to deploy machine learning to pick the best compounds for testing in cells.


Khare Lab @ Rutgers University in New Brunswick, NJ

"Designing stimulus-responsive enzymes for targeted chemotherapy"

Traditional chemotherapy has limited efficacy because chemotherapeutics are toxic to all dividing cells, which limits the dose that can be safely administered. One approach to increase selectivity, called directed enzyme prodrug therapy (DEPT), involves prodrugs that are site-specifically activated by exogenously delivered enzymes. The prodrug activation reaction is intended to be orthogonal to the human enzymatic repertoire to minimize side-effects. DEPT’s therapeutic benefit in the clinic stands to improve by using a new generation of prodrug-activating enzymes that we are developing using computational design approaches to be “smart”: they can sense and respond to the tumor microenvironment or an external cue (e.g. tissue-penetrant light) in a controllable manner to maximize their site selectivity, and can avoid triggering a strong immune reaction. These developments will enable potent and safer chemotherapy regimens as well as general design methodology to build novel therapeutic switches and biological circuits for a broad range of applications.


Khmelinskaia Lab @ University of Bonn in Bonn, Germany

"Expanding the structural and functional space of de novo designed protein assemblies"

Computational methods have been recently developed for designing novel protein assemblies with atomic-level accuracy, yet several aspects of current methods limit the structural and functional space that can be explored. We aim to expand the range of protein assemblies available for application by introducing and controlling new structural properties (e.g. flexibility, structural switches) and functional moieties (e.g. sequence recognition elements, surface binding). Students will combine computational protein modelling and design methods with in vitro biophysical characterization techniques of the designed protein materials, having the chance to learn both computational and wet-lab skills.


Kortemme Lab @ University of California, San Francisco, in San Francisco, CA

"Computational design of de novo proteins to control biological signaling"

We are working towards engineering synthetic signaling systems built from de novo designed protein components that can recognize inputs, transduce signals, and control programmable outputs. We have a range of projects to create proteins with custom-designed shapes to recognize specific signals, and to engineer switchable protein structures. We integrate computational design and experimental characterization in vitro and in cellular systems, and are exploring new opportunities through advances in deep learning.


Kuhlman Lab @ University of North Carolina, Chapel Hill in Chapel Hill, NC

"Applying machine learning to protein design"

Advances in machine learning are revolutionizing the fields of protein structure prediction and design. You will help create and test protocols that make use of Rosetta in combination with machine learning to design new protein structures and complexes.


Lindert Lab @ Ohio State University in Columbus, OH

"Structure Modeling using Mass Spec Data"

Knowledge of protein structure is paramount to the understanding of biological function and for developing new therapeutics. Mass spectrometry experiments that provide some structural information, but not enough to unambiguously assign atomic positions, have been developed recently. These methods offer sparse experimental data, which can also be noisy and inaccurate in some instances. We are developing integrative modeling techniques that combine computational modeling with mass spec data to enable prediction of protein complex structure from the experimental data.


Merck Protein Engineering Lab in Rahway, NJ

“Design and engineering of novel enzymes”

Enzymes catalyze a diverse set of chemical transformations with significant rate enhancements and with excellent chemo-, stereo-, and regiospecificity. These features, combined with the fact that enzymes operate in aqueous solution and are typically more environmentally friendly than synthetic catalysts, have led to the broad adoption of enzymes in the chemical industries. While enzymes are amazing catalysts, they have evolved to solve the challenges faced by Mother Nature and not the challenges we face today. We use computational protein design and evolution-based methods to engineer and invent new protein functions. This project will leverage our high-throughput automation capabilities with structure-based design and machine learning to engineer enzymes with novel properties. Students will gain experience in computational protein design, machine learning, and wet-lab methods for engineering proteins.


Mills Lab @ Arizona State University in Tempe, AZ

"Computational design of proteins containing functional non-canonical amino acids"

Despite the amazing functions proteins achieve with only 20 standard building blocks, the ability to add new chemistries to the genetic codes of standard organisms could allow for new functions. Over the last two decades, more than 150 "non-canonical amino acids" (NCAAs) have been added to the genomes of organisms from E. coli to mice. In the Mills lab, we use Rosetta to design proteins that take advantage of the novel chemical functionalities contained in some of these NCAAs. Current efforts are focused on the development of rapid diagnostics (e.g. for COVID-19) and new metalloproteins. Interns in our group will have the opportunity to learn how to both design and experimentally characterize new proteins containing NCAAs.


Rocklin Lab @ Northwestern University in Chicago, IL 

"Applying high-throughput experimental data to guide computational protein design" 

Today, most computational protein design tools like Rosetta use the features of natural protein structures (which amino acids like to be near each other, what types of structures are very common, etc.) to guide the design of new proteins. However, for many applications, we want to design proteins with properties far beyond what already exists in nature. To achieve this, we need new sources of data, not just natural protein structures, that can guide design into new territory. Our lab develops new experimental methods to measure properties like folding stability, binding affinity, and dynamics for tens to hundreds of thousands of designed or natural proteins at the same time. We then use these new large datasets to guide protein design. We have a range of different projects focused on basic science, therapeutic development, and tools for synthetic biology. Each person's project is described on our website. We will work with an intern or post-bac to find which project in our lab is best for their interests.


Siegel Lab @ University of California, Davis in Davis, CA

"Computational enzyme design and modeling"

The Siegel Lab engineers enzymes to address human-centered challenges in health, food, and environmental systems. The group is primarily focused on work with direct applications and frequently engages in close collaboration with biotech industry partners. Interns will use Rosetta to model and design enzymes that catalyze novel biochemical reactions in projects that align with the mission of the lab. Using insights from in silico experiments, interns will move on to characterize and evaluate their designs in the wet lab, learning both computational and benchwork skills that will propel them forward in their academic careers.


Smith Lab @ Wesleyan University in Middletown, CT

"Reshaping protein energy landscapes to optimize dynamics/function"

The Smith lab is currently developing and applying methods for combining Rosetta design calculations with molecular dynamics simulations to reshape the energy landscapes of proteins. We are applying these techniques to several unique systems: Rosetta-designed mini fluorescence activating proteins (mFAPs), a natural protein that undergoes pressure-induced conformational changes, and Rosetta-designed miniproteins that exhibit undesired conformational heterogeneity. You will learn to use Rosetta with free energy data analysis techniques, optimize algorithm parameters with existing datasets, and apply these techniques to improve protein function.


Vorobieva Lab @ Vrije Universiteit in Brussels, Belgium

"De novo design of membrane proteins for bottom-up understanding of their biogenesis pathway"

Transmembrane beta-barrels (TMBs) fold in the outer membrane of Gram-negative bacteria, where they play major roles in pathogenicity and multi-drug resistance. We aim to use synthetic (de novo designed) TMBs to understand the basis of TMB folding. Naturally-occurring TMBs have co-evolved with periplasmic transporters and chaperones to enable correct outer membrane targeting. However, these interactions likely involve complex 3D properties of the unfolded outer membrane proteins (OMPs), which evade classic single-mutation studies and are clouded by evolutionary “noise”. By contrast, de novo designed TMBs have no evolutionary history, and can fold in synthetic lipid membranes but NOT in cellulo. They can therefore be used as a “blank slate” to build in cellulo folding TMBs from the bottom up. The intern will modify existing de novo design methods to incorporate new properties in the generated TMB designs, following a hypothesise/build/test approach.


Whitehead Lab @ University of Colorado, Boulder in Boulder, CO

"Designing ligand-activatable proteins"

The intern will design de novo allosteric effector sites into proteins by designing a disruptive, cavity-forming residue mutation or deletion. It has been shown that these structural disruptions can have a significant impact on protein function through various mechanisms, including local unfolding or perturbation of catalytically important residues (Deckert et al. 2012). This computational approach will be tested against a range of biotechnologically-relevant proteins, including polymerases and gene editing ribonucleoproteins (e.g. CRISPR systems). 


Companies may partner with us and sponsor an intern.


Intern Research Posters:

Award Number: 1659649


