Deep Learning’s Impact on de novo Protein Design
Key Takeaways:
- Advances in AI have shifted protein engineering from template-based approaches to de novo design, using methods that include seed-based design, deep generative models, and binder hallucination.
- Designing for specific protein functions is the current frontier. It factors in neosurfaces, conditional binding in response to biological stimuli, and dynamic modeling that links deep learning with molecular dynamics.
- Targeting hard-to-match structures, including highly polar surfaces, and encoding controlled conformational flexibility into molecules are still emerging areas of research.
Advances in Protein Binder Design Target Therapeutic and Bioengineering Challenges
Advances in de novo protein design have furthered progress on many therapeutic and bioengineering challenges. Current methods have produced de novo binders for the SARS-CoV-2 spike, PD-1, PD-L1, and CTLA-4 proteins, as well as for the multidomain Cas9 protein and common allergens. As the field looks beyond structure-oriented design toward function-oriented design, controllable pathways for therapeutic and sustainability applications, as well as tissue engineering, are on the horizon.
A Move to De Novo Approaches
Historically, protein design relied on template-based approaches. While successful, these methods are limited by the geometries of known structures. De novo design removes these constraints by using only the target protein’s surface information to engineer entirely new binders. Advances in artificial intelligence have produced accurate structure prediction tools, such as AlphaFold, which, combined with deep generative models for backbone and sequence generation, have made protein design more robust. De novo binder design now follows three main approaches: seed-based design with tools like MaSIF; deep generative models like RFdiffusion, paired with sequence design tools like ProteinMPNN, for backbone and sequence matching; and binder hallucination tools like BindCraft, which use structure prediction models like AlphaFold to find binders through iterative feedback.
The rapid progress in these approaches is largely attributed to advances in AI-based structure prediction. Tools such as AlphaFold2, RoseTTAFold, ESMFold, and OpenFold have revolutionized the design pipeline by providing accurate in silico filtering that identifies the most promising candidates. This reduces the need for costly, large-scale experimental screening and makes hallucination methods possible.
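To make the idea of in silico filtering concrete, the following is a minimal sketch of how a set of designed binders might be shortlisted from structure-prediction confidence scores before any experimental work. It is not taken from the paper. The `Candidate` class, the `filter_candidates` function, the thresholds, and the example values are all hypothetical, chosen only for illustration; real pipelines pick their own metrics and cutoffs.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One designed binder with confidence scores from a structure predictor.

    All field names are hypothetical; they stand in for whatever a given
    prediction tool reports.
    """
    name: str
    plddt: float          # mean per-residue confidence, 0-100 (higher is better)
    interface_pae: float  # mean predicted aligned error across the interface,
                          # in angstroms (lower is better)


def filter_candidates(candidates, min_plddt=80.0, max_interface_pae=10.0, top_n=None):
    """Keep candidates that pass both cutoffs, best interface error first.

    The default cutoffs are illustrative assumptions, not recommended values.
    """
    passing = [
        c for c in candidates
        if c.plddt >= min_plddt and c.interface_pae <= max_interface_pae
    ]
    # Rank by interface error (ascending), then by confidence (descending).
    passing.sort(key=lambda c: (c.interface_pae, -c.plddt))
    return passing if top_n is None else passing[:top_n]


# Made-up placeholder scores, not real prediction results.
designs = [
    Candidate("design_001", plddt=91.2, interface_pae=6.4),
    Candidate("design_002", plddt=72.5, interface_pae=5.1),   # fails confidence cutoff
    Candidate("design_003", plddt=88.0, interface_pae=14.3),  # fails interface cutoff
    Candidate("design_004", plddt=85.7, interface_pae=8.9),
]

shortlist = filter_candidates(designs)
print([c.name for c in shortlist])  # ['design_001', 'design_004']
```

Only the shortlisted designs would move on to experimental testing, which is where the savings over large-scale screening would come from.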
A Focus on Function-Oriented Design Has Emerged
While structure-oriented design has seen considerable success, the current frontier is function-oriented design. Researchers are now developing computational tools and methods for small molecules that can act as triggers to disrupt or enable interactions, or that can address neosurfaces. Another approach is to design systems that change conformation in response to biological stimuli (e.g., peptide binding), which allows the creation of controllable pathways for cell engineering and therapeutics.
Despite these advances, several challenges remain. Some binding sites are still difficult to target because of their geometry or high polarity. More research is also needed to understand protein dynamics in biological processes, so that conformational flexibility can be encoded into molecules in a controlled manner. However, with structure prediction models being retrained and deep generative models being trained on molecular dynamics trajectories, function-oriented design is expected to improve.
The paper by the Correia lab was published online on February 26, 2026, and appears in the April 2026 issue of Current Opinion in Structural Biology.
