Computational Structure Prediction and Design of Biomolecular Structures

Scientists in this geographically-distributed post-baccalaureate program have the opportunity to participate in research using and developing the Rosetta Commons software. The Rosetta Commons software suite includes algorithms for computational modeling and design of proteins and other biomolecules. It has enabled notable scientific advances in computational biology, including de novo protein design, enzyme design, ligand docking, and structure prediction of biological macromolecules and macromolecular complexes.  

This one-year post-baccalaureate program aims to prepare underrepresented minority and/or disadvantaged students to succeed in PhD programs.



  • One week of Rosetta Code School (June 6 through June 10), where you will learn the inner details of the Rosetta Python code and community coding environment so that you are fully prepared to conduct research using the software.
  • Assignment to a Rosetta lab where you will be mentored by a graduate student and faculty member who will guide and foster your research.
  • Participation in the Summer Rosetta Conference in the gorgeous Cascade Mountains of Washington State (August 10 through August 13) and the Winter Rosetta Conference (location TBD in February 2023), where you will connect with Rosetta developers from around the world.
  • Salary, health benefits, and funding for conference travel are included.
  • Integration into the host institution’s NIH PREP program.



PREP Programs Provide:

  • Research experience: Scholars conduct hypothesis-driven research in their Mentor’s lab, with day-to-day guidance from an experienced PhD student or postdoc. Scholars participate fully in weekly lab meetings, attend weekly research seminars in their department, and attend a vibrant PhD program retreat and a national conference of their choice.
  • Community: Scholars come together each month for two-hour ‘Journal Club’ events to present and discuss their research with Peer-Mentors (PhD students, postdocs) and faculty. These meetings include professional development mini-lessons on topics like the NSF-GRFP, graduate school applications, research posters, and more.
  • Project (‘mini-thesis’) meetings: Scholars gain confidence by organizing, preparing for, and convening three one-hour ‘mini-thesis’ meetings with two subject-expert faculty, plus their research mentor and the PREP Director. Scholars benefit both scientifically and professionally by building strong working relationships with multiple faculty members at Johns Hopkins who are experts in their field of interest.
  • Professional training and custom mentoring: Scholars participate in workshops designed to improve their scientific writing skills and their understanding of ethics in science, and can choose from many other workshops, including communication and improvisation. Each Scholar charts an individual development plan with the PREP Director, with custom mentoring both formal (monthly one-hour meetings) and informal (as needed).
  • Preparation for GRE or MCAT exam, graduate school applications and interviews.
  • Annual salary plus health, tuition and other benefits.



  • Individuals from racial and ethnic groups that have been shown by government studies to be underrepresented in health-related sciences on a national basis.
  • Individuals with disabilities, who are defined as those with a physical or mental impairment that substantially limits one or more major life activities, as described in the Americans with Disabilities Act of 1990, as amended.
  • Individuals from disadvantaged backgrounds.
  • U.S. citizens, permanent residents, and U.S. nationals are eligible.
  • Undergraduate major in computer science, engineering, mathematics, chemistry, biology, and/or biophysics.
  • Individuals must plan to graduate with their bachelor's degree before the start of the program and must not be admitted into an advanced degree program.
  • While not required, we seek candidates with some combination of experiences in scientific or academic research, C++/Python/*nix/databases, software engineering, object-oriented programming, and/or collaborative development.




  • Resume
  • Unofficial transcript
  • Personal statement that summarizes why you are an appropriate candidate (up to 2000 characters) including:
    • Why this program interests you
    • Brief summary of research and computing experience
    • Research career goals
  • Two recommendation letters. Completed recommendations can be sent to
  • Select top three labs and projects of interest from the list below.
  • Deadline for receipt of applications is February 1, 2022.
  • Deadline for receipt of recommendation letters is February 5, 2022.
  • Program contact:



Correia Lab @ École Polytechnique Fédérale de Lausanne in Lausanne, Switzerland  

"Deciphering protein surface features for the prediction and design of function"

Proteins are engraved with surface patterns that determine their function. We are creating a computational framework to identify and design such functional fingerprints. Several projects are possible; this is just one of the themes we are currently exploring.


Gray Lab @ Johns Hopkins University in Baltimore, MD
"Antibody engineering by deep learning"
Antibodies are an excellent model system for loop structure prediction and design, a difficult problem in the field. High-resolution models of the loop structure are necessary for successful docking to antigens or for design for improved affinities, yet traditional loop prediction methods have been frustrated by antibody loops because of their extreme variability. In this project, the student will apply deep learning methods, including transfer learning and attention gating, to leverage data from a large set of protein structures and focus predictions on the key loop. The PREP trainee will learn antibody engineering, homology modeling and docking, and machine learning.


Huang Lab @ Stanford University in Stanford, CA
"Protein design for immunological intervention"
We actively develop ML-based protein design tools, as well as wet-lab-driven molecular platforms for intervention with the immune system. We recently developed a new monobody engineering software pipeline and a molecular platform that can specifically target MHC antigens. We hope to expand these two areas of research. We combine Rosetta, neural networks, and yeast display to achieve these goals.


Jha Lab @ Los Alamos National Laboratory in Los Alamos, NM

"Enzyme design for novel hydrolase activities"

This project applies the Design-Build-Test-Learn (DBTL) cycle to enzyme engineering. Rosetta will inform the design of libraries, and high-throughput assays (Jha et al, ACS Syn Biol, 2020) will be used to test them. Information from good and bad performers will feed into the next DBTL cycle.


Karanicolas Lab @ Fox Chase Cancer Center in Philadelphia, PA

"Designing targeted protein degraders (PROTACs)"

The Karanicolas lab is developing methods for computationally designing "molecular glues": small molecules that induce two proteins to come together in cells, as a means to rationally tune the proteins' activities. We are specifically interested in targeting a few key proteins that drive cancer. Through this project you will learn how to use Rosetta for modeling protein-protein complexes, how to do virtual screening using large libraries of small molecules, and how to deploy machine learning to pick the best compounds for testing in cells.


Kellogg Lab @ Cornell University in Ithaca, NY

"Engineering CRISPR-associated transposases for genome-editing"

The Kellogg lab seeks to harness the power of transposases to engineer novel genome-editing tools. Recent developments in CRISPR systems have revolutionized molecular genetics: this powerful gene-editing system can be directed to disable any target gene in a programmable fashion. However, CRISPR systems rely on introducing DNA double-strand breaks (DSBs), which have the potential to disrupt genomic integrity. Transposons, or “jumping genes,” are autonomous DNA elements that insert new sequences while bypassing DSBs. Recently discovered CRISPR-associated transposases represent a promising way to insert new DNA sequences at desired chromosomal locations, since they are capable of programmed DNA transposition. However, these systems are not yet ready for genome-editing applications due to their low efficiency and off-site targeting. Furthermore, the mechanisms these systems use to recognize and insert DNA into their target sites are largely unknown. Using techniques in structural biology, genetics, and protein design, our lab will explore the mechanisms these CRISPR-associated transposases use to identify and integrate into their target sites. Based on our mechanistic models, we will re-engineer these systems using computational protein design to develop new genome-editing tools that are adaptable, efficient, and precise. This work could produce a sophisticated tool for genome engineering or for gene therapies that could be used to treat a variety of human disorders.

Khare Lab @ Rutgers University in New Brunswick, NJ
"Designing stimulus-responsive enzymes for targeted chemotherapy"
Traditional chemotherapy has limited efficacy because chemotherapeutics are toxic to all dividing cells, which limits the dose that can be safely administered. One approach to increasing selectivity, called directed enzyme prodrug therapy (DEPT), involves prodrugs that are site-specifically activated by exogenously delivered enzymes. The prodrug activation reaction is intended to be orthogonal to the human enzymatic repertoire to minimize side effects. DEPT’s therapeutic benefit in the clinic stands to improve with a new generation of prodrug-activating enzymes that we are developing using computational design approaches to be “smart”: they can sense and respond to the tumor microenvironment or an external cue (e.g. tissue-penetrant light) in a controllable manner to maximize their site selectivity, and can avoid triggering a strong immune reaction. These developments will enable potent and safer chemotherapy regimens, as well as a general design methodology for building novel therapeutic switches and biological circuits for a broad range of applications.


Kortemme Lab @ University of California, San Francisco in San Francisco, CA
"Computational design of de novo proteins to control biological signaling"

We are working towards engineering synthetic signaling systems built from de novo designed protein components that can recognize inputs, transduce signals, and control programmable outputs. We have a range of projects to create proteins with custom-designed shapes to recognize specific signals, and to engineer switchable protein structures. We integrate computational design and experimental characterization in vitro and in cellular systems, and are exploring new opportunities through advances in deep learning.  


Kuhlman Lab @ University of North Carolina, Chapel Hill in Chapel Hill, NC

"Applying machine learning to protein design"

Advances in machine learning are revolutionizing the fields of protein structure prediction and design. You will help create and test protocols that make use of Rosetta in combination with machine learning to design new protein structures and complexes.


Lindert Lab @ Ohio State University in Columbus, OH
"Structure Modeling using Mass Spec Data"
Knowledge of protein structure is paramount to understanding biological function and to developing new therapeutics. Mass spectrometry experiments have recently been developed that provide some structural information, but not enough to unambiguously assign atomic positions. These methods offer sparse experimental data, which can also be noisy and inaccurate in some instances. We are developing integrative modeling techniques, which combine computational modeling with mass spec data, to enable prediction of protein complex structure from the experimental data.


Rocklin Lab @ Northwestern University in Chicago, IL
"Applying high-throughput experimental data to guide computational protein design"
Today, most computational protein design tools like Rosetta use the features of natural protein structures (which amino acids like to be near each other, what types of structures are very common, etc.) to guide the design of new proteins. However, for many applications, we want to design proteins with properties far beyond what already exists in nature. To achieve this, we need new sources of data, not just natural protein structures, that can guide design into new territory. Our lab develops new experimental methods to measure properties like folding stability, binding affinity, and dynamics for tens to hundreds of thousands of designed or natural proteins at the same time. We then use these new large datasets to guide protein design. We have a range of different projects focused on basic science, therapeutic development, and tools for synthetic biology. Each person's project is described on our website. We will work with an intern or post-bac to find which project in our lab best fits their interests.


Siegel Lab @ University of California, Davis in Davis, CA

"Computational enzyme design and modeling"

The Siegel Lab engineers enzymes to address human-centered challenges in health, food, and environmental systems. The group is primarily focused on work with direct applications and frequently collaborates closely with biotech industry partners. Interns will use Rosetta to model and design enzymes that catalyze novel biochemical reactions in projects that align with the mission of the lab. Using insights from in silico experiments, interns will move on to characterize and evaluate their designs in the wet lab, learning both computational and benchwork skills that will propel them forward in their academic careers.


Smith Lab @ Wesleyan University in Middletown, CT

"Reshaping protein energy landscapes to optimize dynamics/function"

The Smith lab is currently developing and applying methods for combining Rosetta design calculations with molecular dynamics simulations to reshape the energy landscapes of proteins. We are applying these techniques to several unique systems: Rosetta-designed mini fluorescence activating proteins (mFAPs), a natural protein that undergoes pressure-induced conformational changes, and Rosetta-designed miniproteins that exhibit undesired conformational heterogeneity. You will learn to use Rosetta with free energy data analysis techniques, optimize algorithm parameters with existing datasets, and apply these techniques to improve protein function.


Vorobieva Lab @ Vrije Universiteit in Brussels, Belgium

"De novo design of membrane proteins for bottom-up understanding of their biogenesis pathway"

Transmembrane beta-barrels (TMBs) fold in the outer membrane of Gram-negative bacteria, where they play major roles in pathogenicity and multi-drug resistance. We aim to use synthetic (de novo designed) TMBs to understand the basis of TMB folding. Naturally occurring TMBs have co-evolved with periplasmic transporters and chaperones to enable correct outer membrane targeting. However, these interactions likely involve complex 3D properties of the unfolded outer membrane proteins (OMPs), which evade classic single-mutation studies and are clouded by evolutionary “noise”. By contrast, de novo designed TMBs have no evolutionary history and can fold in synthetic lipid membranes but not in cellulo. They can therefore be used as a “blank slate” to build in cellulo folding TMBs from the bottom up. The intern will modify existing de novo design methods to incorporate new properties in the generated TMB designs, following a hypothesise/build/test approach.


Whitehead Lab @ University of Colorado, Boulder in Boulder, CO

"Designing ligand-activatable proteins"

The intern will design de novo allosteric effector sites into proteins by designing a disruptive, cavity-forming residue mutation or deletion. It has been shown that these structural disruptions can have a significant impact on protein function through various mechanisms, including local unfolding or perturbation of catalytically important residues (Deckert et al. 2012). This computational approach will be tested against a range of biotechnologically-relevant proteins, including polymerases and gene editing ribonucleoproteins (e.g. CRISPR systems). 

PH Lab @ University of Oregon in Eugene, OR

"Designing protein/peptide binders for therapeutic/biosensing"

Our lab works on a number of projects focused on designing new proteins and peptides that can bind to and interact with other molecules. Some of these projects include designing peptides that bind to antiviral targets, designing proteins that bind to small molecules, and designing protein heterodimers in a high-throughput manner. Check our website for more information. We will work with the interested postbac or intern to find the project that best matches their interests.

Khmelinskaia Lab @ University of Bonn in Bonn, Germany

"Expanding the structural and functional space of de novo designed protein assemblies"

Computational methods have recently been developed for designing novel protein assemblies with atomic-level accuracy, yet several aspects of current methods limit the structural and functional space that can be explored. We aim to expand the range of protein assemblies available for applications by introducing and controlling new structural properties (e.g. flexibility, structural switches) and functional moieties (e.g. sequence recognition elements, surface binding). Students will combine computational protein modelling and design methods with in vitro biophysical characterization of the designed protein materials, giving them the chance to learn both computational and wet-lab skills.