Abstract. Cycle benchmarking is a scalable protocol used to measure the error rate of a specific cycle of quantum gates implemented in parallel. We introduce the cycle benchmarking protocol, explain how it works, and numerically illustrate its effectiveness. We demonstrate that relaxing the assumptions required to accurately estimate the process fidelity can lead to inaccurate results; in particular, allowing gate-dependent noise on the twirling gates can cause a misestimation of the process fidelity. However, we also show that highly gate-dependent errors can still yield accurate results in certain regimes. Finally, when the gate-independence condition is relaxed, we show that it is possible to reconcile the estimated and theoretical fidelities by measuring in a different basis via a unitary transformation which depends on both the chosen error model on the twirling gates and the targeted gateset.
1. Introduction
Randomized benchmarking (RB) is arguably the most influential experimental protocol for quantifying the performance of a quantum information device. In essence, RB-like protocols aim to measure a set of parameters $\{p_\lambda\}$ that characterize the quality of the implemented gates from a target gateset $\mathbb{G}$. In the standard RB protocol, the chosen gateset is the $n$-qubit Clifford group, for which there is a single quality parameter $p$. Magesan et al. showed that, under the assumption of gate-independent noise on the gateset (i.e., $\tilde{G} = G \circ \Phi$ for all $G$ in the $n$-qubit Clifford group $\mathbb{G}$), the parameter $p$ coincides with the depolarizing probability of the depolarizing channel $\Phi_{\text{dep}}$. Using the fact that the average gate fidelity is invariant under twirling, and that the $n$-qubit Clifford group forms a unitary 2-design, one can then estimate the average gate infidelity (or average error rate) of the noise model $\Phi$:
\begin{equation} r_{\text{avg}}(\Phi, \mathcal{I}) = 1 - \underbrace{\int d\ket{\psi}\, \mel{\psi}{\Phi(\ketbra{\psi})}{\psi}}_{\text{average gate fidelity}} = 1 - \int d\ket{\psi}\, \mel{\psi}{\Phi_{\text{dep}}(\ketbra{\psi})}{\psi} = \frac{(1-p)(2^n - 1)}{2^n}. \label{eq:avg_infidelity} \end{equation}
Wallman showed that standard RB remains well behaved in the presence of gate-dependent noise ($\tilde{G} = G \circ \Phi_G$): the extracted decay parameter $p$ still reliably quantifies the average gate (or gateset circuit) fidelity, as in the ideal gate-independent case. However, RB-like protocols primarily assess the quality of an entire gateset, rather than the process fidelity of a single gate or a specific layer of gates. Cycle benchmarking addresses this limitation by providing a scalable, SPAM-robust protocol for estimating the process fidelity of a fixed cycle of gates implemented in parallel.
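As a quick numerical illustration of the average gate infidelity in $\eqref{eq:avg_infidelity}$, the following sketch assumes the standard parametrization $\Phi_{\text{dep}}(\rho) = p\rho + (1-p)\,\mathbb{1}/2^n$. The function names and the values of $n$ and $p$ are illustrative choices, not taken from the text. Because the fidelity of a depolarizing channel is the same for every pure input state, a single random state suffices to check the closed form. The corresponding process infidelity, which carries $4^n$ in place of $2^n$, is included for comparison.

```python
import numpy as np

def depolarize(rho, p):
    """Apply the depolarizing channel rho -> p*rho + (1-p)*Tr(rho)*I/d."""
    d = rho.shape[0]
    return p * rho + (1 - p) * np.trace(rho) * np.eye(d) / d

def average_infidelity(p, n):
    """Closed-form average gate infidelity of the n-qubit depolarizing channel."""
    d = 2 ** n
    return (1 - p) * (d - 1) / d

def process_infidelity(p, n):
    """Closed-form process (entanglement) infidelity of the same channel."""
    d = 2 ** n
    return (1 - p) * (d ** 2 - 1) / d ** 2

# Illustrative values (assumptions, not from the text).
n, p = 2, 0.98
d = 2 ** n

# Draw one random pure state; the depolarizing fidelity is state-independent.
rng = np.random.default_rng(0)
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi /= np.linalg.norm(psi)
rho = np.outer(psi, psi.conj())

state_infidelity = 1 - np.real(psi.conj() @ depolarize(rho, p) @ psi)
print(state_infidelity, average_infidelity(p, n), process_infidelity(p, n))
```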
2. Randomized Compiling (RC)
Randomized compiling is an error-suppression protocol that works by tailoring the noise acting on different cycles into Pauli channels via Pauli twirling. Pauli channels are particularly convenient to analyze because they are stochastic channels, and therefore their diamond distance from the identity (the worst-case error rate) is tightly related to the process infidelity: in general, the process infidelity is a lower bound on the diamond distance for arbitrary channels, but for any stochastic channel this bound is saturated, so the process infidelity equals the diamond distance.
Moreover, a Pauli channel $\Phi_{\mathbb{P}}$ with Kraus operators $\{\sqrt{p(\mathcal{P})}\,\mathcal{P}\}$ can be expressed in the natural representation $K$ in the Pauli basis (also known as the Pauli transfer matrix) as $K_\mathbb{P}(\Phi_\mathbb{P}) = \operatorname{diag}(\vec{\lambda}_\mathbb{P})$, which will be useful for the cycle benchmarking analysis. The components $\lambda_\mathcal{P}$ are linear combinations of the Pauli probabilities $\{p(\mathcal{P})\}$:
\begin{equation} \lambda_\mathcal{P} = \sum_{\mathcal{Q} \in \mathbb{P}^{\otimes n}} (-1)^{\omega(\mathcal{P},\mathcal{Q})}\, p(\mathcal{Q}), \qquad \omega(\mathcal{P}, \mathcal{P}') = \begin{cases} 0 & \text{if } \mathcal{P} \text{ and } \mathcal{P}' \text{ commute,} \\ 1 & \text{if } \mathcal{P} \text{ and } \mathcal{P}' \text{ anti-commute.} \end{cases} \label{eq:pauli_eigenvalues} \end{equation}
3. Cycle Benchmarking (CB)
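Since the Pauli eigenvalues of $\eqref{eq:pauli_eigenvalues}$ are the basic quantities CB estimates, a small numerical illustration may help before turning to the protocol. The sketch below represents $n$-qubit Paulis as strings over `IXYZ` and ignores phases. The helper names and the two-qubit error probabilities are hypothetical. It also checks that the mean of all eigenvalues equals $p(\mathcal{I})$, the probability of no error.

```python
from itertools import product

def omega(p, q):
    """Return 0 if Pauli strings p and q commute, 1 if they anti-commute."""
    # Two strings anti-commute iff they clash on an odd number of qubits,
    # where a clash means two different non-identity single-qubit Paulis.
    clashes = sum(a != "I" and b != "I" and a != b for a, b in zip(p, q))
    return clashes % 2

def pauli_eigenvalues(probs, n):
    """Map Pauli probabilities {Q: p(Q)} to eigenvalues {P: lambda_P}."""
    labels = ["".join(s) for s in product("IXYZ", repeat=n)]
    return {
        P: sum((-1) ** omega(P, Q) * probs.get(Q, 0.0) for Q in labels)
        for P in labels
    }

# Hypothetical two-qubit Pauli channel (probabilities sum to one).
probs = {"II": 0.95, "XI": 0.02, "ZZ": 0.03}
lam = pauli_eigenvalues(probs, 2)

# The average of all Pauli eigenvalues recovers the identity probability.
mean_eigenvalue = sum(lam.values()) / len(lam)
print(lam["XI"], mean_eigenvalue)
```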
Let $\mathcal{G}$ be a cycle of interest on $n$ qubits, composed of one or more Clifford gates. Assume the noise is Markovian, the local Pauli gates are affected by a gate-independent noise model such that $\tilde{\mathcal{P}} = \mathcal{E} \circ \mathcal{P}$ for all $\mathcal{P} \in \mathbb{P}^{\otimes n}$, and $\tilde{\mathcal{G}} = \mathcal{G} \circ \Lambda$. The quantity estimated by CB is
\begin{equation} F_{\text{CB}}(\tilde{\mathcal{G}}, \mathcal{G}) = \frac{1}{|\mathbb{P}^{\otimes n}|} \sum_{\mathcal{P} \in \mathbb{P}^{\otimes n}} \mathcal{F}(\tilde{\mathcal{G}} \circ \tilde{\mathcal{P}},\, \mathcal{G} \circ \mathcal{P}) = \frac{1}{|\mathbb{P}^{\otimes n}|} \sum_{\mathcal{P} \in \mathbb{P}^{\otimes n}} \mathcal{F}(\Lambda \circ \mathcal{E},\, \mathcal{I}) = \frac{1}{|\mathbb{P}^{\otimes n}|} \sum_{\mathcal{P} \in \mathbb{P}^{\otimes n}} \lambda_\mathcal{P}, \label{eq:F_CB} \end{equation}where $\mathbb{P}^{\otimes n}$ is the $n$-qubit Pauli group, $\lambda_\mathcal{P}$ are the Pauli eigenvalues (defined in $\eqref{eq:pauli_eigenvalues}$) of the Pauli twirled noise channel $\mathcal{T}_\mathbb{P}(\Lambda \circ \mathcal{E})$, and $\mathcal{F}$ is the process fidelity (also known as entanglement fidelity), represented in the natural representation picture $K$ as
\begin{equation} \mathcal{F}(\Phi, \Xi) = \frac{1}{4^n} \langle K(\Phi), K(\Xi) \rangle, \qquad \mathcal{F}(\Psi, \mathcal{I}) = \frac{1}{4^n} \langle K(\Psi), K(\mathcal{I}) \rangle = \frac{1}{4^n} \Tr(K(\Psi)). \label{eq:process_fidelity} \end{equation}The final equality in $\eqref{eq:F_CB}$ follows from the fact that the trace is preserved under a unitary change of basis, and that Pauli twirling preserves the diagonal elements of the Pauli transfer matrix. Therefore,
\begin{equation} \Tr(K(\Lambda \circ \mathcal{E})) = \Tr(K_\mathbb{P}(\Lambda \circ \mathcal{E})) = \Tr(K_\mathbb{P}(\mathcal{T}_\mathbb{P}(\Lambda \circ \mathcal{E}))) = \Tr(\operatorname{diag}(\vec{\lambda}_\mathbb{P})) = \sum_{\mathcal{P} \in \mathbb{P}^{\otimes n}} \lambda_\mathcal{P}. \end{equation}
We now briefly summarize how the cycle benchmarking protocol works:
- Choose two distinct sequence lengths $m_1 < m_2$ such that $\mathcal{G}^{m_1} = \mathcal{G}^{m_2} = \mathcal{I}$, and a subset $\mathbb{P}'^{\otimes n} \subseteq \mathbb{P}^{\otimes n}$ of Paulis to sample.
- Pick a Pauli $\mathcal{P} \in \mathbb{P}'^{\otimes n}$ and prepare the $+1$ eigenstate of $\mathcal{P}$, denoted $\ket{\psi_\mathcal{P}}$, such that $\mathcal{P}\ket{\psi_\mathcal{P}} = \ket{\psi_\mathcal{P}}$.
- For each $m \in \{m_1, m_2\}$ and $l \in \{1, \ldots, L\}$, where $L$ is the number of randomizations:
  - Select a vector $\vec{\mathcal{R}}(m, l) = (\mathcal{R}_0(l), \mathcal{R}_1(l), \ldots, \mathcal{R}_m(l))$ of random $n$-qubit Paulis and run the cycle benchmarking circuit \begin{equation} \mathcal{C}(\mathcal{P}, \vec{\mathcal{R}}(m,l))\,\ket{\psi_\mathcal{P}} = \mathcal{R}_m(l)\,\mathcal{G}\,\mathcal{R}_{m-1}(l)\,\mathcal{G}\,\cdots\, \mathcal{R}_1(l)\,\mathcal{G}\,\mathcal{R}_0(l)\,\ket{\psi_\mathcal{P}}. \end{equation}
  - Apply an appropriate change of basis to measure in the computational basis.
  - Measure in the computational basis to estimate the expectation value $f_{\mathcal{P},m,l}$ for all randomizing circuits $\{\mathcal{C}(\mathcal{P}, \vec{\mathcal{R}}(m,l))\}_{l=1}^L$.
- Repeat steps 2 and 3 for all $\mathcal{P} \in \mathbb{P}'^{\otimes n}$.
- The estimator of $F_{\text{CB}}$ is given by \begin{equation} \hat{F}_{\text{CB}} = \frac{1}{|\mathbb{P}'^{\otimes n}|} \sum_{\mathcal{P} \in \mathbb{P}'^{\otimes n}} \left(\frac{\sum_{l=1}^L f_{\mathcal{P},m_2,l}}{\sum_{l=1}^L f_{\mathcal{P},m_1,l}}\right)^{\!\frac{1}{m_2 - m_1}} = \frac{1}{|\mathbb{P}'^{\otimes n}|} \sum_{\mathcal{P} \in \mathbb{P}'^{\otimes n}} \hat{\lambda}_\mathcal{P}. \label{eq:F_hat_CB} \end{equation}
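The final step of the protocol can be sketched in a few lines. The example below does not simulate any circuits. Instead it assumes synthetic expectation values of the form $f_{\mathcal{P},m,l} = A_\mathcal{P}\,\lambda_\mathcal{P}^m + \text{noise}$, where $A_\mathcal{P}$ stands for a SPAM factor, and applies the ratio estimator of $\eqref{eq:F_hat_CB}$. The function name, the eigenvalues, the SPAM factors, and the noise scale are all illustrative assumptions.

```python
import numpy as np

def cb_estimator(f_m1, f_m2, m1, m2):
    """Ratio estimator: f_m1 and f_m2 have shape (num_paulis, L)."""
    # Sum over the L randomizations, then take the ratio of the two lengths.
    ratios = f_m2.sum(axis=1) / f_m1.sum(axis=1)
    lam_hat = ratios ** (1.0 / (m2 - m1))
    # Average the per-Pauli decay estimates over the sampled Paulis.
    return lam_hat.mean(), lam_hat

# Hypothetical ground truth for three sampled Paulis.
true_lam = np.array([0.99, 0.98, 0.97])   # Pauli eigenvalues
spam = np.array([0.90, 0.85, 0.95])       # SPAM factors A_P
m1, m2, L = 4, 16, 50

rng = np.random.default_rng(1)

def synthetic_expectations(m):
    """Generate f_{P,m,l} = A_P * lambda_P**m plus small Gaussian noise."""
    clean = (spam * true_lam ** m)[:, None]
    return clean + rng.normal(scale=1e-3, size=(len(true_lam), L))

F_hat, lam_hat = cb_estimator(
    synthetic_expectations(m1), synthetic_expectations(m2), m1, m2
)
print(lam_hat, F_hat)
```

Because the SPAM factor $A_\mathcal{P}$ multiplies both the numerator and the denominator, it cancels in the ratio, which is what makes the estimate SPAM-robust in this toy model.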
Key properties of the CB protocol
We highlight several important aspects of CB:
- As the number of qubits increases, the required values of $m_1$ and $m_2$ generally become larger, increasing the computational resources needed.
- Ideally, we would choose $\mathbb{P}'^{\otimes n} = \mathbb{P}^{\otimes n}$, but this is infeasible because the number of Paulis grows exponentially with qubit number. Erhard et al. showed that the fidelity can instead be accurately estimated using a fixed number of Paulis, independent of system size. In practice, choosing $\min(40, 4^n - 1)$ Paulis is typically sufficient.
- The insertion of random Paulis around the target cycle $\mathcal{G}$ effectively implements Pauli twirling of the noise between gates, mimicking the effect of the randomized compiling protocol. In the limit as $L \to \infty$, the resulting noise channels converge to fully stochastic Pauli channels.
- When $\mathcal{P}$ commutes with $\mathcal{G}$, we have $\hat{\lambda}_\mathcal{P} = \lambda_\mathcal{P}$. Otherwise,
\begin{equation} \hat{\lambda}_\mathcal{P} = \left(\prod_{\mathcal{Q} \in \mathcal{P}^{\circlearrowright \mathcal{G}}} \lambda_\mathcal{Q}\right)^{1/\left|\mathcal{P}^{\circlearrowright \mathcal{G}}\right|}, \end{equation}where $\mathcal{P}^{\circlearrowright \mathcal{G}}$ is a $\mathcal{G}$-orbit, defined as
\begin{equation} \mathcal{P}^{\circlearrowright \mathcal{G}} = \bigl\{\mathcal{G}^j \mathcal{P} \mathcal{G}^{-j} \mid j \in \mathbb{N}\bigr\}. \end{equation}By the AM–GM inequality,
\begin{equation} \hat{\lambda}_\mathcal{P} = \left(\prod_{\mathcal{Q} \in \mathcal{P}^{\circlearrowright \mathcal{G}}} \lambda_\mathcal{Q}\right)^{1/\left|\mathcal{P}^{\circlearrowright \mathcal{G}}\right|} \leq \frac{1}{\left|\mathcal{P}^{\circlearrowright \mathcal{G}}\right|} \sum_{\mathcal{Q} \in \mathcal{P}^{\circlearrowright \mathcal{G}}} \lambda_\mathcal{Q}. \end{equation}The set of $\mathcal{G}$-orbits $(\mathbb{P}^{\otimes n})^{\circlearrowright \mathcal{G}}$ forms a partition over $\mathbb{P}^{\otimes n}$. Therefore, when $\mathbb{P}'^{\otimes n} = \mathbb{P}^{\otimes n}$,
\begin{equation} \hat{F}_{\text{CB}} \leq \frac{1}{|\mathbb{P}^{\otimes n}|} \sum_{\mathcal{P} \in \mathbb{P}^{\otimes n}} \frac{1}{\left|\mathcal{P}^{\circlearrowright \mathcal{G}}\right|} \sum_{\mathcal{Q} \in \mathcal{P}^{\circlearrowright \mathcal{G}}} \lambda_\mathcal{Q} = \frac{1}{|\mathbb{P}^{\otimes n}|} \sum_{\mathcal{P} \in \mathbb{P}^{\otimes n}} \lambda_\mathcal{P} = F_{\text{CB}}. \label{eq:cb_inequality} \end{equation}Thus, in the limit of infinite resources, $\hat{F}_{\text{CB}} \leq F_{\text{CB}}$ (Theorem 3 of). Furthermore, $\hat{F}_{\text{CB}} - F_{\text{CB}} = \mathcal{O}\!\left([1 - F_{\text{CB}}]^2\right)$, so CB is highly accurate at low infidelity rates.
- Since $\hat{\lambda}_\mathcal{P}$ is not necessarily equal to $\lambda_\mathcal{P}$ for all $\mathcal{P} \in \mathbb{P}^{\otimes n}$, it is not possible to "learn" the contribution of each individual Pauli eigenvalue to the process fidelity.
- It is possible to benchmark the $T$ gate by twirling over the dihedral group of order 8, $D_8 \cong \langle X, \sqrt{Z} \rangle$, on qubits that have $T$ gates, instead of twirling over the Pauli group.
- There are other CB-like protocols, such as character cycle benchmarking (CCB), where one may choose any arbitrary $\mathcal{G}$ and $m$-values rather than restricting to Clifford cycles satisfying $\mathcal{G}^m = \mathcal{I}$, and character average benchmarking (CAB), which reduces resource requirements by partially depolarizing the noise via local Clifford twirling.
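The orbit-averaging property and the resulting bound $\hat{F}_{\text{CB}} \leq F_{\text{CB}}$ can be checked on a toy example. The sketch below takes the single-qubit cycle $\mathcal{G} = H$, whose orbits are $\{I\}$, $\{X, Z\}$, and $\{Y\}$, and works directly with Pauli transfer matrices. It assumes the noise is already a Pauli channel (the $L \to \infty$ limit of twirling), omits the random Paulis and their sign tracking, and uses made-up eigenvalues.

```python
import numpy as np

# Pauli transfer matrix of H in the basis (I, X, Y, Z): X <-> Z, Y -> -Y.
H_ptm = np.array([
    [1, 0,  0, 0],
    [0, 0,  0, 1],
    [0, 0, -1, 0],
    [0, 1,  0, 0],
], dtype=float)

# Hypothetical Pauli eigenvalues (lambda_I, lambda_X, lambda_Y, lambda_Z).
lam = np.array([1.0, 0.99, 0.97, 0.95])

# Noisy cycle G o Lambda: the Pauli noise acts first, then the ideal gate.
noisy_cycle = H_ptm @ np.diag(lam)

def decay(k, m):
    """Diagonal PTM entry of m noisy cycles for Pauli index k."""
    return np.linalg.matrix_power(noisy_cycle, m)[k, k]

# Both lengths are even, so that H**m is the identity.
m1, m2 = 2, 6
lam_hat = np.array([
    (decay(k, m2) / decay(k, m1)) ** (1.0 / (m2 - m1)) for k in range(4)
])

F_hat = lam_hat.mean()   # what the ratio estimator converges to
F_true = lam.mean()      # process fidelity of the Pauli noise channel
print(lam_hat, F_hat, F_true)
```

In this model the estimates for $X$ and $Z$ both equal the geometric mean $\sqrt{\lambda_X \lambda_Z}$, while $Y$, which $H$ maps to $-Y$, is recovered exactly. The gap between `F_hat` and `F_true` is correspondingly small.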
4. Results
We ran cycle benchmarking in a simulator for three noisy gates of interest ($H$, $T$, and CNOT) and an idling cycle, under three different noise models on the twirling gates:
- a small $X$-rotation (gate-independent),
- a gate-dependent over-rotation error model ($\tilde{U} = U^{1+\epsilon}$),
- a model where the twirling gates are decomposed into $ZX$ form and a small $X$-rotation is applied to the $\sqrt{X}$ gates.
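To make the gate dependence of the over-rotation model $\tilde{U} = U^{1+\epsilon}$ concrete, the sketch below applies it to the single-qubit Paulis and computes the process fidelity of each noisy gate against its ideal version. The fractional power is taken through the eigendecomposition with principal-branch eigenphases, which is an assumption: the result depends on the global phase convention chosen for $U$. The function names and the value of $\epsilon$ are illustrative and are not the settings used for the results reported here.

```python
import numpy as np

def over_rotate(U, eps):
    """Return U**(1 + eps) using principal-branch eigenphases (assumption)."""
    evals, V = np.linalg.eig(U)
    phases = np.angle(evals)
    scaled = np.diag(np.exp(1j * (1 + eps) * phases))
    return V @ scaled @ np.linalg.inv(V)

def process_fidelity(U_noisy, U_ideal):
    """Process fidelity between two unitaries: |Tr(U_ideal^dag U_noisy)|^2 / d^2."""
    d = U_ideal.shape[0]
    return abs(np.trace(U_ideal.conj().T @ U_noisy)) ** 2 / d ** 2

paulis = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

eps = 0.05  # illustrative over-rotation strength
fidelities = {
    name: process_fidelity(over_rotate(U, eps), U) for name, U in paulis.items()
}
print(fidelities)
```

Under this convention the identity is implemented perfectly, while $X$, $Y$, and $Z$ each have fidelity $\cos^2(\pi\epsilon/2)$, so the error on a twirling gate depends on which Pauli is applied.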
We observed that, up to numerical precision, the inequality in $\eqref{eq:cb_inequality}$ was satisfied for each error model. However, the gap between $\hat{F}_{\text{CB}}$ and $F_{\text{CB}}$ was extremely large for the gate-dependent over-rotation error model, while it remained very tight for both the gate-independent and $ZX$-decomposition error models. Interestingly, the $ZX$-decomposition error on the twirling gates yielded the highest variance in infidelity (i.e., was highly gate-dependent), yet the estimation of $F_{\text{CB}}$ was still very accurate.
We were able to reconcile the underestimation of $F_{\text{CB}}$ for the Clifford gates under the over-rotation error model by comparing the estimator $\hat{F}_{\text{CB}}$ to a different figure of merit, namely $F_{\text{CB}}^{\mathcal{U}}$, defined as
\begin{equation} F_{\text{CB}}^{\mathcal{U}} = \frac{1}{|\mathbb{P}^{\otimes n}|} \sum_{\mathcal{P} \in \mathbb{P}^{\otimes n}} \mathcal{F}\!\left(\tilde{\mathcal{G}} \circ \tilde{\mathcal{P}},\; \mathcal{U}(\mathcal{G} \circ \mathcal{P})\mathcal{U}^\dagger\right), \end{equation}where $\mathcal{U}$ can be numerically obtained via Equation 17 of and depends on the chosen noise model. Note that if $\mathcal{U} = \mathcal{I}$, which holds when the twirling gates are subject to gate-independent noise, then $F_{\text{CB}}^{\mathcal{U}} = F_{\text{CB}}$. In the case of the $T$ gate, we find that $\hat{F}_{\text{CB}} - F_{\text{CB}}^{\mathcal{U}} < \hat{F}_{\text{CB}} - F_{\text{CB}}$, although the gap remains noticeable.