Next Generation Generative Model Unlocks de novo Designs at Scale

Key Takeaways:

  • NVIDIA’s new Proteina-Complexa model combines generative AI with inference-time search algorithms to operate 30-60x faster than RFdiffusion when designing custom proteins that target specific diseases.
  • In a test of over 1 million protein designs against 127 different targets, the model found binders for 86 of the targets (about 68%).
  • The model could accelerate early-stage therapeutic protein discovery, with improved performance on polar protein interfaces, flexible peptides, carbohydrates, and recessed/occluded epitopes.

 

New Model Can Speed Up Matching de novo Protein Design Candidates to Targets

Traditionally, finding a binder for a new disease target could take years of trial and error in a lab. NVIDIA’s new Proteina-Complexa model can speed up the generation of high-quality binder candidates for a given target. By combining generative AI with search algorithms, it can design custom proteins that stick to specific disease targets with improved accuracy. The improvement extends to polar protein interfaces, flexible peptides, carbohydrates, and recessed/occluded epitopes.

New Approach Combines Generative Models with AI Optimization

Until now, scientists have generally used one of two methods for designing de novo proteins: generative models or AI optimization. In the first approach, AI creates a structure in one shot, which is fast but often inaccurate. In the second approach, AI iteratively tweaks a sequence until it fits, which is highly accurate but slow. The Proteina-Complexa model bridges this gap by allowing reward-guided inference-time search in a continuous latent space.

The model uses a framework called partially latent flow matching. It generates sequence and structure jointly: the backbone Cα coordinates are explicit, while amino-acid identity and the remaining atomic coordinates are compressed into latent variables. This allows the AI to navigate through millions of potential designs much more efficiently than if it were moving every atom individually. In addition, as an inference-time search feature, it uses algorithms such as beam search to evaluate many generative trajectories at once while sampling a protein binder, then chooses the best one.
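To make the idea of beam search over generative trajectories concrete, here is a minimal sketch in Python. It is not the model's actual sampler: the state is a plain list of numbers standing in for a partially generated design, and `toy_step`, `toy_reward`, and `TARGET` are hypothetical placeholders for a stochastic generation step and a binding-related score. The sketch only illustrates the general pattern of branching each surviving trajectory, scoring the continuations, and keeping the best few at every step.

```python
import random
from typing import Callable, List

State = List[float]  # hypothetical stand-in for a partially generated design


def beam_search(
    init: State,
    step: Callable[[State, int, random.Random], State],
    reward: Callable[[State], float],
    n_steps: int,
    beam_width: int,
    branch: int,
    rng: random.Random,
) -> State:
    """Keep the `beam_width` highest-reward trajectories at every step."""
    beam = [init]
    for t in range(n_steps):
        # Expand each surviving trajectory into `branch` stochastic continuations.
        candidates = [step(s, t, rng) for s in beam for _ in range(branch)]
        # Score all continuations and keep only the most promising ones.
        candidates.sort(key=reward, reverse=True)
        beam = candidates[:beam_width]
    # Return the single best design among the survivors.
    return max(beam, key=reward)


# Toy placeholders (assumptions for illustration, not the model's components).
TARGET = [1.0, -2.0, 0.5]


def toy_step(state: State, t: int, rng: random.Random) -> State:
    # A random perturbation standing in for one stochastic sampling step.
    return [x + rng.gauss(0.0, 0.3) for x in state]


def toy_reward(state: State) -> float:
    # Higher is better: negative squared distance to a fixed target point.
    return -sum((x - y) ** 2 for x, y in zip(state, TARGET))


best = beam_search(
    [0.0, 0.0, 0.0], toy_step, toy_reward,
    n_steps=20, beam_width=4, branch=8, rng=random.Random(0),
)
print(best, toy_reward(best))
```

Setting `beam_width=1` and `branch=1` reduces this to a single one-shot trajectory, while larger values spend more compute at inference time in exchange for a better chance of ending on a high-reward design.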

Results Highlight Model’s Skill and Versatility for Broader Use

In collaboration with Manifold Bio, the researchers tested over 1 million protein designs against 127 different targets. The model found binders for 86 of the 127 targets tested (about 68%). Beyond binding proteins, the model showed it could also design enzymes and binders for small molecules. And because the model produces full atomic detail, including side chains, there is no need for post hoc sequence redesign. The model could accelerate early-stage therapeutic protein discovery, and it highlights the progress of generative models in this area.

The paper was posted by NVIDIA on March 16, 2026, and includes several Rosetta Commons members as co-authors.
