AWS Quantum Technologies Blog
Decoding Realistic Quantum Error Syndromes with Quantum Elements Digital Twins
Fault-tolerant quantum computing requires quantum error correction (QEC): encoding one logical qubit into many physical qubits so that, below a threshold error rate, the logical error rate falls rapidly as the code grows. The practical engineering question is how large the code must be, and how good the hardware must be, to reach a useful logical qubit. Credible answers require models that capture a device’s real error mechanisms, including coherent and correlated effects, yet run fast enough to support iterative design. These requirements motivate the design of hardware‑calibrated digital twins for QEC.
As a step toward this goal, we report on results from a collaboration involving researchers from Quantum Elements Inc., the University of Southern California (USC), Harvard University and Amazon Web Services (AWS), to speed up hardware-faithful QEC simulations with classical compute resources. Building on a real-time quantum Monte Carlo (QMC) algorithm developed at USC [1], we used Amazon Elastic Compute Cloud (Amazon EC2) Hpc7a instances orchestrated by AWS ParallelCluster to run quantum master‑equation simulations of a distance‑7 rotated surface code with 97 physical qubits (49 data qubits + 48 measurement qubits) on par with the state-of-the-art surface-code memory demonstrations [2]. A full open-system simulation of a 97-qubit distance-7 surface code round without approximations would require tracking a density matrix with 4⁹⁷ entries, far beyond the capabilities of classical computers. Our approach runs this simulation in about an hour on a single compute node, while faithfully capturing coherent and correlated noise that simpler models miss.
In this post, we present a foundational demonstration of scalable, hardware-faithful QEC simulations at experiment-relevant scale. In future posts, we will incorporate richer error models into the digital twin, use the resulting syndrome data to develop and evaluate more expressive error-correction software, and connect those improvements to measurable performance gains on hardware.
QEC: A software and hardware challenge
Operationally, QEC is a measurement-and-decoding cycle with classical feedforward: each round measures stabilizers (syndromes), and a classical decoder infers likely errors and carries the result forward as a Pauli-frame update (or, when required, applies it as a physical recovery). The cycle then repeats.
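The cycle above can be sketched with a toy example. This is a minimal sketch, not the surface-code workflow of this post: a 3-qubit bit-flip repetition code with a hypothetical lookup decoder, decoding detection events (changes between consecutive syndromes) into a software-only Pauli-frame update.

```python
# Toy QEC cycle: 3-qubit bit-flip repetition code, lookup decoder, Pauli frame.
# All names and numbers are illustrative, not taken from the post's experiment.

def measure_syndrome(data):
    # The two ZZ parity checks of the repetition code.
    return (data[0] ^ data[1], data[1] ^ data[2])

# Detection events -> most likely single-qubit error (None = no new error).
DECODER = {(0, 0): None, (1, 0): 0, (1, 1): 1, (0, 1): 2}

data = [0, 0, 0]          # logical |0> = 000
pauli_frame = [0, 0, 0]   # classical record of corrections, applied lazily
prev = (0, 0)

for rnd in range(5):      # repeated measurement-and-decoding cycles
    if rnd == 2:
        data[1] ^= 1      # inject one X error on qubit 1 during round 2
    syn = measure_syndrome(data)
    events = (syn[0] ^ prev[0], syn[1] ^ prev[1])  # change vs. previous round
    prev = syn
    flagged = DECODER[events]
    if flagged is not None:
        pauli_frame[flagged] ^= 1  # feedforward: update the frame, not the qubits

# Read out: apply the accumulated frame in software.
corrected = [d ^ f for d, f in zip(data, pauli_frame)]
print(corrected)  # the frame has undone the injected error
```

Note the design choice of decoding event differences rather than raw syndromes: because the frame is never physically applied, a persistent error would otherwise re-trigger the decoder every round.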
As codes grow and cycles accumulate, QEC performance depends not only on underlying hardware error rates, but also on how effectively the decoder extracts error information from syndrome data. Improved decoding can reduce the logical error per cycle or reach a target logical error rate at smaller code distance, as highlighted by surface-code experiments where logical error suppression improves with increasing distance [2,3]. Decoders infer from patterns in syndrome measurements, and those patterns are set by a device’s actual noise. When teams tune decoders using simplified noise assumptions (for example, weakly correlated stochastic Pauli flips), they can miss coherent and correlated mechanisms that reshape syndrome statistics, producing systematically mismatched recovery decisions that degrade logical performance.
Keeping the hardware-software loop aligned requires calibrated modeling, which is precisely what device digital twins provide. A digital twin’s simulated syndrome statistics match those of the real hardware under the same circuits. By enabling the extraction of not only the syndromes but also the true errors on the data qubits, realistic digital-twin studies can:
- Produce realistic syndrome statistics for developing, training, and evaluating more expressive decoders, including neural-network-based decoders.
- Stress-test decoding protocols against error mechanisms hard to isolate cleanly in experiments.
- Guide QEC co-design of circuits, calibration, layout, and architecture, based on predictions grounded in calibrated device noise.
The challenge is that realism is expensive. The most faithful approaches rely on open-system quantum master-equation simulation, which captures both coherent and incoherent error sources. But dense master-equation simulation becomes prohibitive beyond ~15-20 qubits because the density matrix scales as O(4ⁿ) for n qubits.
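The O(4ⁿ) scaling cited above can be made concrete with back-of-envelope arithmetic, assuming 16 bytes per complex double-precision entry of the density matrix:

```python
# Memory footprint of a dense density matrix: 4^n complex entries, 16 bytes each.
def dense_rho_bytes(n_qubits):
    return 4 ** n_qubits * 16

for n in (10, 15, 20, 97):
    gib = dense_rho_bytes(n) / 2**30
    print(f"{n:3d} qubits: {gib:.3e} GiB")
```

Around 15-20 qubits the matrix alone exceeds the memory of a large compute node, and at 97 qubits the required storage (roughly 10⁵⁹ bytes) is far beyond any classical machine, matching the statement above.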
Faster alternatives exist, but they get their speed by narrowing what they represent exactly. Clifford (stabilizer) simulators are exceptionally efficient for QEC workflows but usually require noise to be expressed in a compatible form, most commonly as stochastic Pauli channels (often via Pauli twirling), which can miss coherent and phase-sensitive effects. Tensor-network (TN) methods can go beyond Clifford and capture more general dynamics, but in two dimensions their efficiency depends on the circuit remaining compressible; for open-system simulation this is especially challenging because one is effectively evolving a density matrix, and real-time evolution can quickly build long-range correlations across the lattice. Practical TN simulations rely on approximate contraction and truncation, so cost and fidelity can vary sharply by regime. When truncations become aggressive, subtle correlated and phase-sensitive noise signatures (the very effects a hardware-calibrated digital twin is meant to preserve) can be blurred.
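A small, exact calculation shows why twirling can understate coherent errors. Consider n repetitions of a tiny coherent Z over-rotation acting on |+⟩, measured in the X basis: the coherent rotations compose as amplitudes (error grows quadratically in n), while the Pauli-twirled version composes as independent phase-flip probabilities (error grows linearly). The numbers here are illustrative, not from the post's noise model.

```python
import math

theta, n = 0.01, 100          # per-gate over-rotation (rad) and repetition count

# Coherent: n rotations compose into a single rotation by n*theta.
coherent_error = math.sin(n * theta / 2) ** 2

# Twirled: each gate becomes an independent phase flip with p = sin^2(theta/2);
# independent flips compose as probabilities, not amplitudes.
p = math.sin(theta / 2) ** 2
twirled_error = (1 - (1 - 2 * p) ** n) / 2

print(f"coherent: {coherent_error:.4f}, twirled: {twirled_error:.6f}")
```

After 100 gates the coherent error is roughly two orders of magnitude larger than the twirled prediction, the kind of gap a stabilizer-only noise model cannot see.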
Next, we describe how we retain master-equation realism while making the computation scalable enough for routine use.
Making digital twins scale
A quantum master equation evolves a circuit’s density matrix, capturing coherent effects (for example, detuning and crosstalk) and incoherent processes (for example, dephasing and amplitude damping). But the density matrix has 4ⁿ elements, so dense simulation quickly becomes intractable.
Our approach builds on a real-time QMC method that stochastically compresses and evolves the density matrix using a population of walkers, enabling master-equation-faithful simulation at much larger scales [1]. As with any Monte Carlo method, accuracy is controlled by sampling (walker population and averaging), trading deterministic cost for statistical error bars, while retaining noise features that are lost in purely Pauli/stabilizer models.
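The core idea that Monte Carlo sampling reproduces master-equation averages, with error bars set by sample count, can be illustrated with a deliberately simple stand-in: a plain stochastic unraveling of single-qubit amplitude damping. This is not the sign-problem-suppressed QMC algorithm of [1], just the generic walker-averaging principle it shares.

```python
import math, random

# One qubit prepared in |1> relaxes with time constant T1. The master equation
# predicts P(1) = exp(-t/T1); each "walker" samples one decay/no-decay history.
T1, t, n_walkers = 200e-6, 100e-6, 20_000
p_decay = 1 - math.exp(-t / T1)

rng = random.Random(0)
survivors = sum(1 for _ in range(n_walkers) if rng.random() >= p_decay)
mc_estimate = survivors / n_walkers           # statistical error ~ 1/sqrt(N)
exact = math.exp(-t / T1)

print(f"exact P(1) = {exact:.4f}, Monte Carlo = {mc_estimate:.4f}")
```

With 20,000 walkers the statistical error is a few parts in a thousand; doubling accuracy costs four times the samples, which is the deterministic-cost-for-error-bars trade described above.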
QMC is also naturally parallel: walker updates and reductions distribute cleanly across compute resources, making it a strong match for large-scale runs on Amazon EC2 HPC instances.
Experiment-scale digital twin for surface code on AWS HPC
As an experiment-scale proof of concept, we simulated a single syndrome-extraction round of a distance‑7 rotated surface code (97 physical qubits) using a QMC‑accelerated, master‑equation digital twin. This is at a scale comparable to recent distance‑7 surface‑code memory demonstrations [2]. We ran the simulation on Amazon EC2 Hpc7a instances with AWS ParallelCluster. We averaged results over five independent QMC runs, each taking roughly 75 minutes using 96 vCPUs on a single instance.
The circuit contains 228 single‑qubit Hadamard gates and 168 entangling two‑qubit gates arranged in eight layers that couple data and ancilla qubits. We evaluate it under a hardware‑motivated transmon noise model. Each qubit is assigned relaxation and dephasing times, with T1 and T2 drawn uniformly from 150–300 μs. Each physical coupling includes residual ZZ crosstalk drawn uniformly from 20–100 kHz. These specific noise parameters are illustrative and not a parameter fit to any single device.
At the pulse level, single‑qubit gates use 25 ns Gaussian pulses plus virtual‑Z frame updates, while two‑qubit gates run as Rzz rotations using 50 ns unipolar sigmoid pulses. Both gate types include a uniform 0.1% under‑rotation to model miscalibrations.
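The noise parameters above can be drawn as follows; a minimal sketch with uniform draws (not fit to any device), plus a rough estimate of the stray conditional phase that residual ZZ coupling accrues. The ~1 μs round duration used here is an assumption for illustration, not a figure from the post.

```python
import math, random

rng = random.Random(1)

# Per-qubit coherence times and per-coupling residual ZZ, as described above.
T1_us = rng.uniform(150, 300)        # relaxation time, microseconds
T2_us = rng.uniform(150, 300)        # dephasing time, microseconds
zz_hz = rng.uniform(20e3, 100e3)     # residual ZZ crosstalk, Hz

# Assumed ~1 us syndrome-extraction round (illustrative value only).
round_s = 1e-6
zz_phase = 2 * math.pi * zz_hz * round_s   # stray conditional phase per round

print(f"ZZ = {zz_hz/1e3:.0f} kHz -> ~{zz_phase:.2f} rad of stray phase per round")
```

Even at the low end of the range, tens of kilohertz of always-on ZZ accumulates a non-negligible coherent phase within a single round, which is why such terms must appear in the model rather than be absorbed into a Pauli error rate.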
For the i‑th ancilla, we compute two complementary quantities at the end of the circuit:
- Ancilla readout: We measure the ancilla in the computational (Z) basis over many repetitions and map the outcomes to ±1; the average defines the ancilla readout signal mᵢ, the quantity directly accessible in an experiment.
- Stabilizer value: In simulation, we can also evaluate the corresponding stabilizer operator on the neighboring data qubits (for example, ZZZZ or XXXX in the bulk, and the weight‑2 checks ZZ or XX on boundaries). Mapping that stabilizer outcome to ±1, the average defines the stabilizer value sᵢ.
In an ideal scenario (where no additional errors happen during syndrome extraction) these agree; discrepancies reflect imperfections in syndrome extraction (gate errors, crosstalk, decoherence, and measurement errors) that distort how data‑qubit information is mapped onto the ancilla.
We summarize this with an average syndrome-extraction bias

Δᵢ = ⟨mᵢ⟩ − ⟨sᵢ⟩,

where ⟨·⟩ denotes an average over QMC runs of the same circuit with the same noise model, and we plot Δᵢ across the lattice. For comparison, we compute the same quantity using Stim, a standard Clifford simulator for syndrome-decoding workflows, after Pauli-twirling the noise into the stochastic Pauli channels that Stim supports. The results are plotted in Figure 1.
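The bias described above reduces to simple averaging: for one ancilla, map the 0/1 outcomes to ±1, average the ancilla readouts and the stabilizer values separately, and take the difference. A minimal sketch with made-up outcome data:

```python
# Syndrome-extraction bias for a single ancilla, from repeated-run outcomes.
# The bit lists below are fabricated purely for illustration.

def pm1(bit):
    # Map a 0/1 measurement outcome to +1/-1.
    return 1 - 2 * bit

ancilla_bits    = [0, 0, 1, 0, 0, 0, 1, 0]  # what an experiment can see
stabilizer_bits = [0, 0, 0, 0, 0, 0, 1, 0]  # accessible only in simulation

m = sum(pm1(b) for b in ancilla_bits) / len(ancilla_bits)
s = sum(pm1(b) for b in stabilizer_bits) / len(stabilizer_bits)
bias = m - s   # zero for ideal syndrome extraction

print(f"readout = {m:+.3f}, stabilizer = {s:+.3f}, bias = {bias:+.3f}")
```

A nonzero bias flags rounds where the ancilla reports something other than the true stabilizer eigenvalue, which is exactly the information an experiment alone cannot separate.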
Figure 1. Grey nodes are data qubits; labeled circles are ancillas measuring X- or Z-type checks on neighboring data qubits. Edge colors encode the residual ZZ crosstalk strength (darker edges indicate stronger crosstalk). The error bars are negligible at the figure resolution.
To probe how coherent control errors manifest in syndrome extraction, we sweep the gate drive frequency (detuning) away from the qubit frequency. Under this sweep, the QMC digital twin reveals a frequency-dependent, spatially structured syndrome-extraction bias, directly exposing sensitivity to control-parameter miscalibration, as shown in the left plot of Figure 1. In contrast, the Pauli‑twirled Stim model predicts a largely uniform response and misses the structured, phase‑sensitive bias patterns. Because the digital twin explicitly tracks open-system dynamics, it enables richer diagnostics and reveals correlated spatiotemporal signatures that can inform the design and evaluation of more expressive decoders.
Conclusion
These results make hardware-faithful noisy circuit simulation practical at experiment-relevant scales that were previously out of reach, enabling routine digital-twin studies of QEC on classical cloud infrastructure. Because the workflow is efficient enough to run at volume, it can also generate realistic syndrome datasets for developing and validating more expressive decoders, an important lever for stronger logical-error suppression and improved QEC code performance.
At Quantum Elements, this capability forms the foundation of our hardware-calibrated digital twin platform for quantum error correction and system co-design. To learn more or request access, contact Quantum Elements at info@quantumelements.ai or visit www.quantumelements.ai.
References
- Shen, Tong, and Daniel A. Lidar. “Real-time Sign-Problem-Suppressed Quantum Monte Carlo Algorithm for Noisy Quantum Circuit Simulations.” arXiv:2502.18929 (2025).
- Google Quantum AI and Collaborators. “Quantum error correction below the surface code threshold.” Nature 638, 920–926 (2025).
- Google Quantum AI. “Suppressing quantum errors by scaling a surface code logical qubit.” Nature 614, 676–681 (2023).