Reverse-translating proteins into DNA sheds light on scalable proteomics

A new Nature Biotechnology paper points to a compelling shift in how protein sequencing might finally scale: stop treating proteins purely as a proteomics problem, and start treating them as an information-conversion problem.

The core idea is elegant. Instead of directly reading peptide sequences with a bespoke protein sensor, the authors use a modified Edman degradation workflow to iteratively remove N-terminal amino acids, label them with peptide-specific DNA barcodes, and convert each sequencing step into a PCR-amplifiable DNA signal. That DNA is then read out by standard next-generation sequencing. In effect, peptide sequence information is “reverse translated” into a digital nucleic acid format.
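The bookkeeping behind that conversion can be illustrated with a toy sketch. It is not the authors' chemistry or data format: the tuple layout, the function names `reverse_translate` and `reconstruct`, and the barcode labels `BC01` and `BC02` are all hypothetical, and plain tuples stand in for DNA reads. It only shows why tagging each removed residue with a peptide barcode and a cycle number is enough to rebuild the sequence from unordered reads.

```python
import random
from collections import defaultdict


def reverse_translate(peptides):
    """Toy encoding: emit one record per (peptide, degradation cycle).

    Assumption: each record carries the peptide barcode, the cycle
    number, and the residue removed in that cycle.
    """
    records = []
    for peptide_id, sequence in peptides.items():
        # Cycle 1 removes the N-terminal residue, cycle 2 the next one, etc.
        for cycle, residue in enumerate(sequence, start=1):
            records.append((peptide_id, cycle, residue))
    return records


def reconstruct(records):
    """Group records by peptide barcode and order residues by cycle."""
    by_peptide = defaultdict(dict)
    for peptide_id, cycle, residue in records:
        by_peptide[peptide_id][cycle] = residue
    return {
        peptide_id: "".join(residues[c] for c in sorted(residues))
        for peptide_id, residues in by_peptide.items()
    }


# Hypothetical input: two barcoded peptides.
peptides = {"BC01": "ACDK", "BC02": "GLSR"}
reads = reverse_translate(peptides)

# A sequencer returns reads in no particular order; simulate that.
random.Random(0).shuffle(reads)

recovered = reconstruct(reads)
```

Because every record is self-describing, read order does not matter, which is what lets a pooled sequencing run stand in for a per-molecule protein reader in this simplified picture.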

This matters because DNA sequencing already has what proteomics still lacks at scale: massive throughput, mature workflows, amplification, and a highly optimized readout ecosystem. By converting peptide identity and position into DNA, the platform leverages decades of infrastructure built for genomics rather than requiring single-molecule proteomics to mature entirely on its own.

The paper’s headline claim is strong: true single-molecule peptide sequencing with single-amino-acid resolution, including full sequence coverage across millions of reads and accurate discrimination between native and post-translationally modified peptides. If that performance proves robust in broader settings, this would represent a meaningful step toward de novo, high-throughput protein sequencing.

The larger strategic insight is that the future of proteomics may belong not only to better protein readers, but also to better molecular transducers. Sometimes the winning move is not to solve the hardest sensing problem directly, but to convert it into a format that the existing technology stack already knows how to scale. This study suggests that, for proteins, DNA may be that format.

This is best viewed as a foundational technology paper, not yet proof of routine proteome-wide clinical deployment. But as a platform concept, it is one of the clearest recent examples of how proteomics could begin to inherit the economics and scalability of sequencing.

Reference

https://www.nature.com/articles/s41587-026-03061-z
