Artificial neural network syndrome decoding on IBM quantum processors

Introduction. The development of quantum processors has made remarkable progress over the past few years, with quantum devices consisting of more than 100 qubits currently accessible from multiple developers [1–3]. In principle, 100 qubits could allow computations intractable on classical supercomputers, yet the computational capabilities of the current generation of quantum processors are limited by high levels of physical noise [4]. Several studies have implemented and tested error mitigation strategies to suppress the detrimental impact of noise, with varying levels of success [5–8]. Ultimately, the full power of quantum computers can only be realized when Quantum Error Correction (QEC) techniques are implemented. These will allow efficient and scalable detection and correction of errors in quantum circuits, leading to fault-tolerant quantum computation [9–12]. Over recent decades, QEC codes have been developed theoretically to suppress errors on logical information by encoding it in a larger Hilbert space [12–15]. One of the leading QEC codes is the surface code, which offers a high logical error rate threshold based on nearest-neighbor interactions between qubits on a two-dimensional lattice [10,16]. The implementation of surface code–based QEC requires the classical processing of syndrome data—related to the physical error locations—to find appropriate corrections for physical qubits. However, this step, known as decoding, is a computationally intensive task. Recent work has theoretically shown that Artificial Neural Network (ANN)-based decoders can facilitate fast and scalable decoding [17–24], which is crucial to prevent the accumulation of errors during any quantum computation. The next major milestone is to implement an ANN-based syndrome decoder on quantum processors to directly benchmark their performance. To date, this has been reported by three papers, which are based on experimental data from devices developed by Google [25–27].

In this work, we develop an ANN-based syndrome decoder and demonstrate its implementation on IBM quantum processors. Further, we assess its performance through comparison against the well-established graph-based Minimum Weight Perfect Matching (MWPM) technique, using PyMatching [28]. Our work shows that, in principle, ANN-based syndrome decoders can efficiently process syndrome measurement data from IBM devices and suggest appropriate corrections—achieving a crucial step in the pipeline of QEC on quantum computational devices.

Historically, the development of surface code literature has been primarily based on the square lattice arrangement of qubits [10,16,29], yet the architecture of IBM quantum processors is built on a heavy-hexagonal (HH) arrangement of qubits, as shown in Fig. 1(a). The motivation for such a qubit layout was to reduce the local connectivity of qubits. This addressed the physical difficulty of controlling many connections to each qubit and aimed to reduce cross-talk noise [30]. However, the HH format required the modification of the traditional square surface code construction to a hexagonal architecture, with ancillary qubits—changing the underlying circuit structures for syndrome measurement. In 2020, Chamberland et al. laid out the foundational framework for QEC on HH and heavy-square lattices of low-degree locally connected qubits [30], introducing the HH QEC code. This original HH code was optimized to minimize the number of required physical qubits by removing some ancillary qubits on the boundaries of the hexagonal lattice and maintaining a lattice connectivity of, at most, 3 [30]. However, IBM has developed increasingly large devices on HH lattices, without the original optimization of boundaries [31], shown in Fig. 1(a), as the original code layout was incompatible with being realized in the bulk of a HH lattice.
This created a discrepancy between the HH code proposed in Ref. [30] and the HH layout of physical qubits in IBM devices. To address this disparity, we have modified the existing HH code by adjusting the original prescription's boundaries to fit with the bulk (see the Methods section for details on the adjustment made). This conforms with the IBM quantum processor layout, which is a crucial step in the direct implementation and benchmarking of our ANN decoder on IBM devices. A recent work by Sundaresan et al. has also looked at the modified HH code for distance 3 measurements [32]. However, our work is distinct, as we investigate HH code threshold plots and compare distance 3 and 5 implementations based on direct measurements on IBM devices.



FIG. 1.
Neural network decoder framework. (a) The lattice connectivity of qubits of a 127-qubit device developed by IBM, with the color range denoting error probabilities associated with single- and two-qubit gates. The shaded section represents a subsection of this device where the average error rate is lowest, in a region which supports a 𝑑=3 HH error correction code. Dotted outlines indicate some other possible subgraph locations. (b) The qubits of a HH code, with orange circles representing the data qubits and light/dark gray circles representing the ancillary flag and measurement qubits, respectively. Connecting lines represent the connectivity of two-qubit gates within the lattice. (c) Multiple cycles of the HH error syndrome measurement in the presence of circuit noise. (d) The circuits for 𝑋 and 𝑍 gauge operator measurement of the HH code. (e) An ANN-based syndrome decoder as developed in this work. A large input layer takes the measurements over 𝑑 cycles, and the layer size decreases linearly over four layers to an output layer matching the number of data qubits. (f) A possible correction being sampled from the prediction given by the ANN-based syndrome decoder. The appropriate correction is then applied to the IBM device.

HH adjustment. Across the structure of the HH code, qubits are labeled as either data, flag, or measurement qubits. These different qubit types facilitate the locating of errors in the HH code and form the basis of the stabilizer formalism for QEC codes. Although IBM quantum processing devices have been developed for some years, the HH code which directly corresponds to their physical layout has not been discussed often, with only a few current works directly implementing the adjusted HH structure on superconducting transmon qubits [32,33]. As stated in the main text, the HH boundary optimization was not included when IBM physically realized their quantum devices.
In Supplemental Material Fig. S1 [34], the boundary optimization shown on the left is removed on the right. The structure shown on the right-hand side is physically implementable on IBM devices. Within the adjusted HH lattice on the right of Supplemental Material Fig. S1 [34], there are three types of stabilizer generator: the 𝑋-type Bacon-Shor style operators,

$$S_X = \prod_n X_{n,j}\, X_{n,j+1},$$

the weight-four $Z$-type plaquette operators, found in the bulk,

$$S_Z = Z_{i,j}\, Z_{i+1,j}\, Z_{i,j+1}\, Z_{i+1,j+1},$$

and the weight-two $Z$-type edge operators,

$$S_Z = Z_{2m,1}\, Z_{2m+1,1}, \qquad Z_{2m-1,d}\, Z_{2m,d},$$

where $i, j \in \mathbb{N}_{\le d-1}$, $m \in \mathbb{N}_{\le (d-1)/2}$, and $n \in \mathbb{N}_{\le d}$, with $i + j$ even in the second set. Here, $i, j$ refer to the lattice of data qubits, with $i$ indexing rows and $j$ indexing columns. The stabilizer group, as used in QEC codes, is fully specified by these generators, whose products form the entire group. Given the boundary conditions of the device, the edge operators are found along the top and bottom of the lattice when arranged in the alignment of Supplemental Material Fig. S1 [34]. This ensures that operators do not act on nonexistent qubits. The outcome of measuring all stabilizers across the lattice is the syndrome measurement. These generators mutually commute, allowing them to be measured simultaneously. Given that there are many ancillary qubits on the lattice, gauge operators are defined on localized areas to measure local parities, and the stabilizers of each kind are obtained as the parity of the gauge operators of that kind. The gauge operators are defined as

$$G_X = X_{i,j}\, X_{i+1,j}\, X_{i,j+1}\, X_{i+1,j+1}, \qquad X_{1,2m-1}\, X_{1,2m}, \qquad X_{d,2m}\, X_{d,2m+1}$$

and

$$G_Z = Z_{i,j}\, Z_{i+1,j}$$

for $X$ and $Z$ gauge operators, respectively, where $i, j \in \mathbb{N}_{\le d}$ and $m \in \mathbb{N}_{\le (d-1)/2}$. A constraint of $i + j$ odd applies to the first term in the $X$ gauge operator set. The measurements of these gauge operators, and hence the stabilizers, can be facilitated by the gauge operator circuit diagrams illustrated in Supplemental Material Fig. S2 [34]. Supplemental Material Fig. S4 [34] illustrates the overall layout of a $d=5$ adjusted HH code with some data errors and corresponding stabilizer measurements.
In Supplemental Material Fig. S4 [34], the yellow and green stabilizers are shown for illustrative purposes. Examples of data qubit errors are shown and the corresponding stabilizers are ‘lit up’ from dull to bright, via the measurement of the gauge operators. The eigenvalues associated with the stabilizer operators in Supplemental Material Fig. S4 [34] are

$$[-1, +1, -1, +1, -1, +1, +1, +1, -1, -1, +1, -1], \qquad [-1, -1, +1, +1] \tag{1}$$
corresponding to the 𝑍 and 𝑋 operator, respectively. These are simplified to

$$[1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1], \qquad [1, 1, 0, 0] \tag{2}$$
for ease of ANN training. In Eq. (2), a 0 is recorded where no change has occurred and a 1 where the stabilizer eigenvalue has flipped to −1. When multiple errors occur within the same parity measurement of a single stabilizer, its eigenvalue may be inverted twice, returning it to its original value. Therefore, only the stabilizers at the ends of error chains have their values changed, as illustrated in Supplemental Material Fig. S4 [34]. This is less obvious for the 𝑍 errors, as the nature of the Bacon-Shor stabilizer allows chains to be continued anywhere across entire columns.
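As an illustrative aside, the index rules above can be enumerated directly. The short Python sketch below is not the code used in this work; it builds the generator lists for a given distance and checks that the 𝑑 = 5 counts match the twelve 𝑍-type and four 𝑋-type entries of Eqs. (1) and (2). The function names and the 1-indexed (row, column) labeling are assumptions made purely for illustration.

```python
# Illustrative sketch (not the paper's code): enumerate the adjusted-HH
# generators from the index rules above. Data qubits are labeled (row i,
# column j), 1-indexed; all names here are hypothetical.

def hh_stabilizers(d):
    x_stabs, z_stabs = [], []
    # X-type Bacon-Shor style operators: all rows of columns j and j + 1.
    for j in range(1, d):
        x_stabs.append([(n, j) for n in range(1, d + 1)]
                       + [(n, j + 1) for n in range(1, d + 1)])
    # Weight-four Z plaquettes in the bulk, with i + j even.
    for i in range(1, d):
        for j in range(1, d):
            if (i + j) % 2 == 0:
                z_stabs.append([(i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)])
    # Weight-two Z edge operators on the lattice boundary.
    for m in range(1, (d - 1) // 2 + 1):
        z_stabs.append([(2 * m, 1), (2 * m + 1, 1)])
        z_stabs.append([(2 * m - 1, d), (2 * m, d)])
    return x_stabs, z_stabs


def to_bits(eigenvalues):
    """Map stabilizer eigenvalues to syndrome bits: -1 -> 1 (flipped), +1 -> 0."""
    return [(1 - e) // 2 for e in eigenvalues]


x_stabs, z_stabs = hh_stabilizers(5)
assert (len(z_stabs), len(x_stabs)) == (12, 4)    # matches Eqs. (1) and (2)
assert to_bits([-1, -1, +1, +1]) == [1, 1, 0, 0]  # the X syndrome of Eq. (2)
```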

Errors across the lattice which are of the same form as the stabilizer elements, generators or otherwise, commute with all stabilizer generators and hence do not change the underlying information in the lattice. This means that the encoded state of the lattice may only be affected by a global phase and encoded information is unaltered. Given that the state is unaltered, gates of the same kind can be applied to the lattice wherever required, to correct for errors. This can be used to create sets of equivalent error chains from the same start and end points on the lattice.

ANN construction and training. In the case of square surface code lattices, it has been shown that ANN syndrome decoders can offer highly promising performance when suggesting suitable corrections [17,19,35–38], including testing on experimental data [25]. The low-level decoders developed in Refs. [17,35,36] were built in a similar manner to this work. They each show the ability of an ANN to learn the relationship between syndrome data and corrections after being given multiple training instances. Many ANN varieties have been developed for square surface codes, including dense Feed Forward Neural Networks (FFNN), Long Short Term Memory (LSTM) networks, and Convolutional Neural Networks (CNN). Varsamopoulos et al. showed that although slower than the FFNN, the LSTM was more accurate at decoding on average, and both were faster and more accurate than the MWPM baseline [19]. Meinerz et al. and Gicev et al. have independently shown that implementing convolutional layers allows an ANN decoder to be compatible with larger code distances unseen in training [18,20]. These results show that an ANN syndrome decoder can be adapted to square surface QEC codes of any size.

The ANN developed for this work was built with dense layers, meaning each neuron within a layer is connected to every neuron in the previous layer. The choice of the number of hidden layers is based on decoder performance; limited overfitting of the training data occurred when two hidden layers were included. Using exclusively dense layers is the simplest layer structure for a neural network and requires no additional pruning or alterations [17]. This methodology allows for the quick proof-of-concept construction of an ANN syndrome decoder for physical devices and can give suitable corrections with minimal pre/postprocessing. Given that the input layer takes the entirety of the syndrome measurement at once, there is no need to explicitly distinguish between bulk stabilizers and boundary stabilizers when training the network. The network is able to learn the direct relationship between observed syndrome patterns and appropriate corrections without needing to perform auxiliary tasks after corrections are applied, similar to the MWPM algorithm. The MWPM algorithm can provide exact corrections by pairing −1 eigenvalue stabilizers without the need for any pre/postprocessing, but it is slower at suggesting corrections, especially as the distance of the code increases.

At the smallest distance, 3, the sizes of the input and output layers are the same, yet the input layer grows significantly faster than the output layer as the code distance increases. Each entry in the input corresponds to a single stabilizer measurement, with the total equaling the number of stabilizers per cycle multiplied by the number of cycles, $d$, giving $\frac{d}{2}(d^2 + 2d - 3)$. Similarly, each output pair corresponds to a single data qubit requiring 𝑋 and 𝑍 correction, respectively. The total output size is $2d^2$.
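As a quick sanity check of these sizes (an illustrative sketch, not the paper's code), the input and output widths can be computed directly from the expressions above:

```python
# Layer sizes implied by the expressions above: one bit per stabilizer per
# cycle for the input, and an X plus a Z correction bit per data qubit for
# the output. The function name is illustrative.
def layer_sizes(d):
    stabilizers_per_cycle = (d**2 + 2 * d - 3) // 2
    return d * stabilizers_per_cycle, 2 * d**2

print(layer_sizes(3))  # (18, 18): input and output match at distance 3
print(layer_sizes(5))  # (80, 50): the input now outgrows the output
```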
Each layer is activated with the ReLU activation function, except the final layer, which uses a Sigmoid function to return values between 0 and 1. The binary cross-entropy loss function and the Adam optimizer were used; combined with the Sigmoid output, this allows the network output to be interpreted as the probability that an error is present at each qubit. During training and testing, 96 Intel Xeon Platinum 8274 CPU cores and four NVIDIA V100-SXM2-32GB GPUs were used. Each value in the output of the final layer lies between 0 and 1 and is processed in two ways. First, the values are truncated such that each value in the correction suggestion is exactly 0 or 1, which corresponds to a given correction being not required or required, respectively. If this correction is consistent with the final syndrome measurement cycle, the truncated prediction is kept. If not, the predictions are sampled using Bernoulli trials, repeated until an appropriate correction is found [36]. Sampling a prediction could take many retries if the network is uncertain in its prediction. Therefore, a cut-off point is used: after 𝑛 re-samples, if no appropriate correction has been found, re-sampling is stopped and it is assumed that a logical error has occurred in that instance [36]. Although there is theoretically a 50% chance that a logical error has occurred, for benchmarking purposes the occurrence of a logical error is assumed, and the additional logical errors are reflected in Fig. 3. Re-sampling can be a major computational overhead, furthering the need to cut off early, before qubits in the structure decohere. Given that this work only considered small distances of the HH QEC code, truncation of predictions often produced an appropriate correction without requiring re-sampling. The coherence time of current qubits is on the order of microseconds, and this work's re-sample time is also on the order of microseconds, so re-sampling must be avoided as much as possible [39]. This dense ANN methodology is fast enough to produce corrections within the coherence time of physical qubits in the lattice for these small distances [39]. The average decode time per instance is approximately 1 ms for MWPM compared to 0.3 ms for the ANN in this work. The time taken to find corrections increases with code distance and may not be appropriate for large distance codes; instead, CNN techniques can be employed for decoding large distance codes [18,20,40,41].
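The sketch below illustrates this construction and the re-sampling step in TensorFlow/Keras. It is a minimal, hedged example: the hidden-layer widths are assumed to interpolate linearly between input and output, and the helper names (build_decoder, sample_correction, is_consistent) are illustrative rather than taken from the paper.

```python
import numpy as np
import tensorflow as tf

# Minimal sketch of the dense decoder described above. The hidden-layer
# widths are an assumption (linear interpolation between input and output);
# the exact widths are not given in the text.
def build_decoder(n_in, n_out):
    h1 = n_in - (n_in - n_out) // 3
    h2 = n_in - 2 * (n_in - n_out) // 3
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_in,)),
        tf.keras.layers.Dense(h1, activation="relu"),
        tf.keras.layers.Dense(h2, activation="relu"),
        tf.keras.layers.Dense(n_out, activation="sigmoid"),  # per-qubit error probabilities
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model

# Truncation followed by Bernoulli re-sampling of the network output [36].
# `probs` is a numpy array of per-bit error probabilities; `is_consistent`
# is a placeholder for the check against the final syndrome cycle.
def sample_correction(probs, is_consistent, max_resamples=100, rng=None):
    rng = rng or np.random.default_rng()
    correction = (probs > 0.5).astype(int)      # truncate each value to 0 or 1
    for _ in range(max_resamples):
        if is_consistent(correction):
            return correction
        correction = rng.binomial(1, probs)     # Bernoulli trial on every output bit
    return None                                  # cut-off reached: count a logical error
```

In practice, is_consistent would recompute the final-cycle syndrome implied by a candidate correction and compare it with the measured one.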

Results and discussion. Figure 1(b) schematically illustrates a distance 3 patch of the adjusted HH code as described within our work, where data qubits (orange) store useful information and ancilla qubits (gray) are used to facilitate syndrome measurements. These measurements are used to locate errors on physical qubits within the HH lattice. Typically, syndrome measurements are collected over multiple rounds before they are decoded to find appropriate corrections for errors on data qubits as well as errors occurring in the syndrome measurement process itself. Figure 1(c) schematically shows many cycles of the HH code being executed and the corresponding syndromes measured for each cycle. The circuits used to measure the syndrome in both the 𝑋 and 𝑍 basis are shown in Fig. 1(d), with the physical qubits numbered in Fig. 1(b) illustrated within a dashed box.

The data collected from syndrome measurement over several cycles are processed by a classical syndrome decoding method, which prescribes adequate corrections to fix physical errors in data qubits and restore the logical state of the lattice. The construction of an efficient and scalable syndrome decoder is a challenging computational problem and has recently been the focus of intensive research [42,43]. One of the leading syndrome decoder algorithms, MWPM, calculates corrections by matching pairs of changed stabilizers. It has received extensive development in many square lattice surface code studies [16,28,44–47]. Chamberland et al. applied the MWPM algorithm to the original HH layout to compute logical error rate curves for both 𝑋 and 𝑍 logical errors [30].

We benchmarked the adjusted HH code using the MWPM decoder from the Python package PyMatching [28] and compared it to the work of Chamberland et al. [30]. In Fig. 2, odd code distances 𝑑 between 3 and 11 are tested, and the lowest clear crossover point can be seen at approximately 0.0007 in the 𝑋 logical error plot on the left. This serves as the benchmark threshold for the adjusted HH QEC code, as only distances 3 and 5 are tested by demonstration on IBM devices. Note that the threshold computed from increasing code distance would be slightly higher (∼0.001). We used MWPM as implemented by PyMatching to confirm the 𝑋 logical error threshold of 0.0045 reported by Chamberland et al. [30], and a very similar threshold of 0.005 was found. Details can be found in Supplemental Material Sec. S1 [34]. The addition of 2𝑑−2 extra ancilla qubits and 2𝑑−2 extra CNOTs has lowered the threshold physical error probability further by a small amount.
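For reference, the PyMatching interface used for such benchmarking accepts a binary parity-check matrix and returns a matched correction for a given syndrome. The toy example below is our own illustration using a 5-qubit repetition code rather than the adjusted-HH check matrix used in this work:

```python
import numpy as np
import pymatching

# Toy repetition-code example (not the adjusted-HH check matrix): each row of
# H is a parity check, each column a data qubit.
H = np.array([[1, 1, 0, 0, 0],
              [0, 1, 1, 0, 0],
              [0, 0, 1, 1, 0],
              [0, 0, 0, 1, 1]])
matching = pymatching.Matching(H)

syndrome = np.array([0, 1, 1, 0])        # checks 1 and 2 flipped
correction = matching.decode(syndrome)   # predicted error on each data qubit
print(correction)                        # e.g. [0 0 1 0 0]
```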


FIG. 2.
Benchmarking of the adjusted HH code with MWPM. Both the threshold and pseudothreshold for 𝑋 logical errors (left) and 𝑍 logical errors (right) for the adjusted HH code are shown, decoded by MWPM as implemented by PyMatching. The vertical dashed line indicates the crossover point for the 𝑑 = 3 and 5 curves.

Despite promising performance, it has been regularly discussed that the MWPM algorithm may not be fast enough for quantum state coherence times on current devices [29,35,48,49]. Even the best adaptations of this algorithm are slow in the large distance regime of QEC codes. The development of fast and scalable syndrome decoders has been a topic of significant research, with proposals attempting to address the real-time decoding challenge [43]. Machine Learning (ML)-based syndrome decoder construction has gained significant momentum in recent years, with some studies indicating that a faster and scalable syndrome decoding method may be possible by leveraging the computational efficiency and flexibility of ANN algorithms. For the HH architecture, there has been no demonstration to date on measured syndrome data from IBM devices. Some theoretical studies have explored the implementation of dense ANN- and CNN-based decoders for the original HH code proposed by Chamberland et al., yet that work is not directly applicable to IBM hardware due to the aforementioned boundary adjustment required [50,51]. Our work is the first to implement and benchmark an ANN decoder on the adjusted HH code, through both theoretical simulations and cloud-based demonstrations on IBM devices.
The ANN decoder developed in this work was constructed using the Python package TensorFlow [52]. The decoder consists of an input layer, two hidden layers, and an output layer. More details about the construction of the network are provided in the Methods section. In Fig. 1(e), a dense ANN is illustrated, where the number of neurons in each layer decreases linearly from the input layer to the output layer. The size of the input layer is adjusted to feed in each 𝑋 and 𝑍 stabilizer measurement separately, for each measurement cycle. The size of the output layer allows a value for both 𝑋 and 𝑍 errors for each physical qubit.

Our ANN decoder was rigorously trained on tens of millions of simulated noise patterns, using uniform depolarizing Pauli channels. The uniform depolarizing noise model was simulated with an equal probability, 𝑝/3, of each of the three Pauli gate errors 𝑋, 𝑌, and 𝑍. Each qubit can experience each of these errors, and each CNOT on the lattice can experience a tensor product of two single-qubit Pauli errors and the identity, excluding 𝐼⊗𝐼. No bias or other error factors were included in this training. During training, circuits were modeled such that when Pauli errors occur on a state |𝜓⟩, the resulting state may be denoted 𝐸|𝜓⟩, where 𝐸 is the combination of errors on a single qubit. The goal of error correction is to detect and apply the appropriate correction to |𝜓⟩ to turn the string of errors 𝐸 into the identity, 𝐼, or to return the lattice to an equivalent logical state. We compute a correction 𝐸𝑐 such that the correction succeeds if 𝐸𝑐𝐸∈𝐺, where 𝐺 is the corresponding gauge group. This is simulated within this work by tracking each error which occurs on every qubit and multiplying the Pauli gate errors, where two identical Paulis give the identity ($X^2 = Y^2 = Z^2 = I$) and $XZ = ZX = Y$ up to a global phase.
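A minimal sketch of this noise model and error-tracking convention is given below. It is an illustration only: the helper names are ours, and the uniform p/15 split over the 15 non-identity two-qubit Paulis is an assumption (the common circuit-level depolarizing convention), not a detail stated in the text. Errors are tracked per qubit as symplectic bit pairs, so that Pauli multiplication up to a global phase reduces to XOR.

```python
import numpy as np

# Pauli encoding as (x, z) bits: I=(0,0), X=(1,0), Y=(1,1), Z=(0,1).
PAULI_BITS = {"I": (0, 0), "X": (1, 0), "Y": (1, 1), "Z": (0, 1)}

def compose(a, b):
    """Multiply two Paulis up to global phase: X*X = I, X*Z = Y, etc."""
    return (a[0] ^ b[0], a[1] ^ b[1])

def single_qubit_error(p, rng):
    """Uniform depolarizing channel: X, Y or Z each with probability p/3."""
    if rng.random() < p:
        return PAULI_BITS[rng.choice(["X", "Y", "Z"])]
    return PAULI_BITS["I"]

def two_qubit_error(p, rng):
    """Depolarizing error after a CNOT: one of the 15 non-identity Pauli
    pairs (assumed equally likely here), excluding I (x) I."""
    if rng.random() < p:
        pairs = [(a, b) for a in "IXYZ" for b in "IXYZ" if (a, b) != ("I", "I")]
        a, b = pairs[rng.integers(len(pairs))]
        return PAULI_BITS[a], PAULI_BITS[b]
    return PAULI_BITS["I"], PAULI_BITS["I"]

rng = np.random.default_rng(0)
accumulated = compose(single_qubit_error(0.01, rng), PAULI_BITS["X"])  # track a qubit's error
```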
The ANN decoder developed in our work provides appropriate corrections based on the syndrome measurements over 𝑑 cycles of the adjusted HH code. The ANN is able to functionally learn how stabilizer inversions are related to error chains within the lattice, including on the boundaries of the lattice where chains end abruptly. This work explicitly shows that a dense ANN syndrome decoder can take an exact stabilizer syndrome measurement as input and return a prediction of an appropriate correction on par with, or better than, the suggestions from the MWPM algorithm.

The model was first evaluated with an error model similar to that used in training: uniform depolarizing noise. The underlying physical error rate 𝑝 was varied to test the performance at different rates. The decoders were then tested on device error models imported from the IBM Quantum Experience; the reported physical error rate of each qubit and two-qubit gate was used as its underlying error probability 𝑝, applied to each individual qubit and CNOT rather than uniformly across the lattice. Finally, the circuits defined in Fig. 1(d) were constructed for distance 3 and 5 HH QEC codes and executed on multiple IBM devices. Figure 3 displays IBM demonstration and theoretical results from our ANN syndrome decoder. The decoder is tested on a simulated lattice of qubits in the form of IBM devices subject to uniform circuit-based depolarizing noise (blue and orange line plots) and also on device noise models derived from error rates provided by five of the IBM quantum processors (marked with open circle points). In these plots, similar crossover behavior is observed, and thus it can be inferred that the ANN syndrome decoder is able to decode the HH QEC code with the same overall properties as the MWPM algorithm. Note that the threshold for the ANN syndrome decoder is approximately 0.0005 for 𝑋 logical errors, and hence reduced by a small amount compared to the MWPM threshold of 0.0007 from Fig. 2(a). In the future, more sophisticated ML-based syndrome decoders, such as CNN decoders, could be designed to improve the threshold and scale to larger distances [20,38,40,51].


FIG. 3.
Neural network decoder implementation on the adjusted HH code. Threshold plot for the adjusted HH code decoded by an ANN, showing error rates of the 𝑋 logical operator (a) and 𝑍 logical operator (b). Each point refers to an error model derived for an IBM device. The horizontal value of each point is the overall error rate of the specific subgraph location chosen, and the horizontal uncertainty shows the range of overall error rates over all possible subgraph locations on each device, with the point placed at the median heuristic subgraph score. (c), (d) The HH QEC code IBM measurement circuit plots, in which the top right-hand corner of (a) and (b) is enlarged and the points corresponding to the circuits run on the IBM devices are also marked. Unfilled circles refer to the simulated noise model corrections, and filled circles refer to the transpiled circuits run on devices.

In Figs. 3(a) and 3(b), the blue circle markings correspond to distance 3 subgraphs and the orange markings to distance 5. Each data point has an alphabetical label giving the name of the IBM device: b = ibm_brisbane, c = ibm_cusco, n = ibm_nazca, s = ibm_sherbrooke, and se = ibm_seattle. The horizontal uncertainty for each marking corresponds to the possible values of the average physical error for each available subgraph location, chosen with a heuristic described in Supplemental Material Sec. S2 [34], with the marking placed at the median location. Interestingly, the markings lie in the approximate region of the simulated noise curves. This suggests that the ANN syndrome decoder is likely to be able to decode actual noise approximately as well as simulated noise. Because the error rates of current physical machines lie above the threshold, distances above 5 were not tested, since this would only increase the logical error rate and may not provide additional insight.

Figures 3(c) and 3(d) plot results based on direct measurements from the IBM quantum processors. The plots show both device noise simulations (open circles) from Figs. 3(a) and 3(b) and IBM demonstration points (colored circles) for a direct comparison. For the data points based on IBM measurements, the adjusted HH QEC code syndrome measurement circuits were created and run on physically realized IBM devices. Each circuit was initialized twice, once for 𝑋 measurements and once for 𝑍 measurements, and 10,000 shots were run for each case. The number of logical errors which occurred after the pass through the ANN syndrome decoder was lower on average than for the simulated noise models of the same devices for distance 3, and roughly similar for distance 5. Given that the points all lie within the same region or lower, it follows that if the devices' error rates were below the threshold of approximately 0.0005, then increasing the distance of the code, together with a suitable ANN syndrome decoder, would facilitate fault-tolerant quantum computation [45].
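For readers unfamiliar with the execution pattern, the generic Qiskit workflow for such runs is sketched below. This is our own illustration, not the paper's circuits: AerSimulator stands in for an IBM backend (which would instead be obtained through an IBM Quantum account), and the two-qubit circuit is a hypothetical placeholder for a transpiled HH syndrome measurement circuit.

```python
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator

backend = AerSimulator()              # placeholder for an IBM device backend

# Hypothetical stand-in for a HH syndrome measurement circuit.
circuit = QuantumCircuit(2, 1)
circuit.h(0)
circuit.cx(0, 1)
circuit.measure(1, 0)

# Transpile to the backend's native gates and run 10,000 shots, as done for
# each X- and Z-initialized circuit in this work.
transpiled = transpile(circuit, backend)
counts = backend.run(transpiled, shots=10_000).result().get_counts()
print(counts)
```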

Note that in Fig. 3(b), the device-derived error models seem to consistently yield lower logical error rates than the equivalent uniform error models. This suggests that some intricate phenomenon is occurring, which may be related to the choice of subgraph location. Compared to what is expected under the uniform noise model, this results in a reduction of the rate of 𝑍 logical errors, which corrupt 𝑋 logical operator values. This is not observed in Fig. 3(d), however, as the measured data from IBM devices are not lower than the simulated uniform noise curve on average. Crosstalk and relaxation errors are absent from the simulated noise model but are possibly present on the physically realized devices, perhaps leading to this variation between IBM devices and simulation [33,53].

Conclusions. Despite the expeditious advances in quantum hardware, fault-tolerant quantum computation still requires significant research in the coming years to achieve scalable practical applications to real-world problems. However, this work has, to the best of our knowledge for the first time, shown that the adjusted HH code matching the IBM quantum machine structure can be decoded by both the MWPM algorithm and an ANN syndrome decoder. A dense ANN was shown to be compatible with the adjusted HH code and to perform in accordance with the error rates present on the devices. The IBM demonstration results in this work showed that stabilizer circuit decoding approximately followed the trend of the theoretical curves. It is therefore likely that lowering the physical error rate below the threshold will allow for arbitrary suppression of logical errors with increasing code distance. This work's dense-style ANN lays the foundation for ANN decoding on physically realized IBM quantum machines.

In the future, our work could be extended with the benchmarking of larger distance code implementations on IBM devices to demonstrate the expected drop in logical error rates with respect to code distance. However, this would require larger physical devices and devices with error rates below the code threshold. A second line of study could be to implement and test more sophisticated ML-based decoders—such as CNN syndrome decoders—on quantum devices. In summary, our work has opened new avenues for experimentally realized, ML-based syndrome decoder implementation on quantum processors. This will be instrumental in realizing fault-tolerant quantum computing in the near future, where larger size and lower error rate devices are anticipated to be available.
