Stable de novo protein design via joint conformational landscape and sequence optimization

Why this research matters

Protein design only works if a designed sequence reliably folds into the intended structure. In practice, many designs that look promising on a computer fail when tested in the lab. This study takes a direct, experimental approach to a simple question: which current protein design strategies actually produce stable proteins?

Background: why stability is difficult to design

Many protein design methods focus on only part of the folding problem. Some methods design sequences to fit a chosen structure. Others predict what structure a sequence might adopt and use that prediction as a quality check.

The authors point out a key limitation of these approaches. Designing a sequence for one target structure does not guarantee that this structure is the most stable option for that sequence. If alternative folds are never considered during design, proteins may fold incorrectly even if they appear well designed.

Design strategies compared

To explore this systematically, the authors compared four design strategies using the same set of protein folds.

TrRosetta is used to generate protein structures starting from random sequences. It guides designs toward well-defined folds but does not directly evaluate whether other competing structures might be equally favorable.

TrMRF designs amino acid sequences to fit a fixed protein structure using a Markov-Random-Field model. Because the structure is held constant, the method focuses on sequence compatibility rather than whether the structure itself is the most stable choice.

Joint TrRosetta + TrMRF design combines both approaches into a single process. Instead of designing sequence and structure separately, the joint method optimizes them together. The goal is to favor sequences that prefer the intended fold while avoiding alternative structures.

ProteinMPNN was included as a widely used comparison method.

Experimental testing at scale

The authors designed thousands of mini-proteins shorter than 80 amino acids, excluding cysteine to avoid stabilization by disulfide bonds. Each target fold was designed using all four methods, allowing direct comparisons.

In total, 20,668 designed proteins were tested experimentally. After removing designs likely to form multimers, 13,442 proteins remained. A key analysis focused on 5,708 proteins where all four methods designed a sequence for the same fold.

Key results: joint design performs best

When comparing designs targeting the same structure, the joint TrRosetta + TrMRF method produced the most stable proteins most often:

  • More stable than TrRosetta designs in 80.5% of comparisons
  • More stable than TrMRF designs in 74.4% of comparisons
  • More stable than ProteinMPNN designs in 84.7% of comparisons

These differences were based on experimental stability measurements, not computational confidence scores.
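To make the "percentage of comparisons" figures concrete, here is a minimal sketch of how a head-to-head win rate between two methods could be computed when both designed a sequence for the same fold. The function name, the record layout, and all numbers below are made up for illustration; they are not the authors' code or data, and the study's actual analysis may differ in its details.

```python
from collections import defaultdict


def pairwise_win_rate(records, method_a, method_b):
    """Fraction of shared folds where method_a's design is more stable.

    `records` is an iterable of (fold_id, method, stability) tuples.
    This layout is an assumption for illustration, not the study's format.
    """
    # Group measured stabilities by target fold, then by design method.
    by_fold = defaultdict(dict)
    for fold_id, method, stability in records:
        by_fold[fold_id][method] = stability

    wins = 0
    total = 0
    for scores in by_fold.values():
        # Only compare folds that both methods designed a sequence for.
        if method_a in scores and method_b in scores:
            total += 1
            if scores[method_a] > scores[method_b]:
                wins += 1

    if total == 0:
        raise ValueError("no folds were designed by both methods")
    return wins / total


# Toy stability values (arbitrary units), invented purely for this example.
records = [
    ("fold1", "joint", 3.0), ("fold1", "mpnn", 1.0),
    ("fold2", "joint", 2.5), ("fold2", "mpnn", 2.0),
    ("fold3", "joint", 1.0), ("fold3", "mpnn", 1.8),
    ("fold4", "joint", 2.2), ("fold4", "mpnn", 0.5),
    ("fold5", "joint", 2.0),  # no "mpnn" design, so this fold is skipped
]

rate = pairwise_win_rate(records, "joint", "mpnn")
print(f"joint beats mpnn in {rate:.1%} of shared folds")
```

Restricting the comparison to folds that both methods designed mirrors the matched-fold setup described above, so differences between targets do not get mistaken for differences between methods.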

Common filters miss many stable proteins

The authors also examined widely used computational filters, such as AlphaFold confidence thresholds. Among experimentally stable proteins, only 21.7% passed typical filtering criteria.

Applying strict filters increased average stability but discarded most designs. For joint designs, only about 12% passed a high-confidence filter, even though many excluded proteins were experimentally stable.
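The two percentages above answer different questions: how many designs survive a filter, and how many of the experimentally stable designs the filter keeps. The sketch below separates the two on toy data. The field names, the cutoff values, and every number are hypothetical and chosen only to illustrate the arithmetic; they do not reproduce the study's filters or results.

```python
def filter_summary(designs, stability_cutoff, confidence_cutoff):
    """Summarize how a confidence filter treats a set of designs.

    Each design is a dict with "stability" (measured) and "confidence"
    (predicted) keys. Both keys and both cutoffs are assumptions made
    for this example.
    """
    if not designs:
        raise ValueError("no designs to summarize")

    # Designs that are stable according to the experiment.
    stable = [d for d in designs if d["stability"] >= stability_cutoff]
    # Designs that the computational filter would keep.
    passed = [d for d in designs if d["confidence"] >= confidence_cutoff]
    # Stable designs that also survive the filter.
    stable_and_passed = [
        d for d in stable if d["confidence"] >= confidence_cutoff
    ]

    return {
        # Share of all designs that pass the filter.
        "pass_rate": len(passed) / len(designs),
        # Share of experimentally stable designs that pass the filter.
        "recall_of_stable": (
            len(stable_and_passed) / len(stable) if stable else None
        ),
    }


# Toy measurements, invented purely for this example.
designs = [
    {"stability": 2.5, "confidence": 92},
    {"stability": 2.1, "confidence": 70},
    {"stability": 1.8, "confidence": 65},
    {"stability": 0.4, "confidence": 88},
    {"stability": 0.2, "confidence": 50},
]

summary = filter_summary(designs, stability_cutoff=1.0, confidence_cutoff=85)
print(summary)
```

In this toy set, three designs are stable but only one of them clears the confidence cutoff, which is the kind of gap the authors describe: a filter can look selective while still discarding most of the proteins that actually fold.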

How stability was measured

All designs were tested using a high-throughput protease resistance assay. Proteins that fold stably resist digestion by enzymes, allowing folding stability to be measured directly for thousands of designs in parallel.

Broader significance and limits

The authors conclude that designing sequence and structure together improves protein stability for small, single-domain proteins. They note that the study focuses on mini-proteins, and that further work will be needed to test whether these results extend to larger proteins.

This work was recently published by Rocklin, Ovchinnikov, and colleagues in Nature Communications.
