Add theory, intro and use case pages of LRE user guide (#2522)
* draft 1
* vincent + nate feedback
* remove : for warning block
* Apply suggestions from code review
  Co-authored-by: nate stemen <[email protected]>
* nate's feedback round 2
* Apply suggestions from code review
  Co-authored-by: nate stemen <[email protected]>
* clarify theory sections
* gen monomial terms
* Add intro section to LRE docs (#2535)
  * add intro and use case pages
    Co-Authored-By: Purva Thakre <[email protected]>
  * clean up intro/use case
  * clarify depth comment
  * wordsmithing
  ---------
  Co-authored-by: Purva Thakre <[email protected]>
* change wording of Bi matrix
* cleanup first section
* fix l/L typo
---------
Co-authored-by: nate stemen <[email protected]>
Co-authored-by: Purva Thakre <[email protected]>
1 parent bc8fdf9, commit 7e4fb92
Showing 5 changed files with 312 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,152 @@ | ||
--- | ||
jupytext: | ||
  text_representation: | ||
    extension: .md | ||
    format_name: myst | ||
    format_version: 0.13 | ||
    jupytext_version: 1.11.1 | ||
kernelspec: | ||
  display_name: Python 3 | ||
  language: python | ||
  name: python3 | ||
--- | ||
|
||
# How do I use LRE? | ||
|
||
LRE works in two main stages: generate noise-scaled circuits via layerwise scaling, and apply inference to resulting measurements post-execution. | ||
|
||
This workflow can be executed by a single call to {func}`.execute_with_lre`. | ||
If more control is needed over the protocol, Mitiq provides {func}`.multivariate_layer_scaling` and {func}`.multivariate_richardson_coefficients` to handle the first and second steps respectively. | ||
|
||
```{danger} | ||
LRE is currently only compatible with quantum programs written using `cirq`. | ||
Work on making this technique compatible with other frontends is ongoing. 🚧 | ||
``` | ||
|
||
## Problem Setup | ||
|
||
To demonstrate the use of LRE, we first define a quantum circuit and a method of executing circuits. | ||
|
||
For simplicity, we define a circuit whose unitary compiles to the identity operation. | ||
Here we will use a randomized benchmarking circuit on a single qubit, visualized below. | ||
|
||
```{code-cell} ipython3 | ||
from mitiq import benchmarks | ||
circuit = benchmarks.generate_rb_circuits(n_qubits=1, num_cliffords=3)[0] | ||
print(circuit) | ||
``` | ||
|
||
We define an [executor](executors.md) which simulates the input circuit subjected to depolarizing noise, and returns the probability of measuring the ground state. | ||
By altering the value for `noise_level`, ideal and noisy expectation values can be obtained. | ||
|
||
```{code-cell} ipython3 | ||
from cirq import DensityMatrixSimulator, depolarize | ||
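def execute(circuit, noise_level=0.025): | ||
    # Apply single-qubit depolarizing noise after every operation. | ||
    noisy_circuit = circuit.with_noise(depolarize(p=noise_level)) | ||
    rho = DensityMatrixSimulator().simulate(noisy_circuit).final_density_matrix | ||
    # Probability of measuring the ground state. | ||
    return rho[0, 0].real | ||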
``` | ||
|
||
Compare the noisy and ideal expectation values: | ||
|
||
```{code-cell} ipython3 | ||
noisy = execute(circuit) | ||
ideal = execute(circuit, noise_level=0.0) | ||
print(f"Error without mitigation: {abs(ideal - noisy):.5f}") | ||
``` | ||
|
||
## Apply LRE directly | ||
|
||
With the circuit and executor defined, we just need to choose the polynomial extrapolation degree as well as the fold multiplier. | ||
|
||
```{code-cell} ipython3 | ||
from mitiq.lre import execute_with_lre | ||
degree = 2 | ||
fold_multiplier = 3 | ||
mitigated = execute_with_lre( | ||
    circuit, | ||
    execute, | ||
    degree=degree, | ||
    fold_multiplier=fold_multiplier, | ||
) | ||
print(f"Error with mitigation (LRE): {abs(ideal - mitigated):.5f}") | ||
``` | ||
|
||
As you can see, the technique is extremely simple to apply, and no knowledge of the hardware/simulator noise is required. | ||
|
||
## Step by step application of LRE | ||
|
||
In this section we demonstrate the use of {func}`.multivariate_layer_scaling` and {func}`.multivariate_richardson_coefficients` for those who might want to inspect the intermediary circuits, and have more control over the protocol. | ||
|
||
### Create noise-scaled circuits | ||
|
||
We start by creating a number of noise-scaled circuits which we will pass to the executor. | ||
|
||
```{code-cell} ipython3 | ||
from mitiq.lre import multivariate_layer_scaling | ||
noise_scaled_circuits = multivariate_layer_scaling(circuit, degree, fold_multiplier) | ||
num_scaled_circuits = len(noise_scaled_circuits) | ||
print(f"total number of noise-scaled circuits for LRE = {num_scaled_circuits}") | ||
print( | ||
    f"Average circuit depth = {sum(len(circuit) for circuit in noise_scaled_circuits) / num_scaled_circuits}" | ||
) | ||
``` | ||
|
||
As you can see, the noise-scaled circuits are on average much longer than the original circuit. | ||
An example noise-scaled circuit is shown below. | ||
|
||
```{code-cell} ipython3 | ||
noise_scaled_circuits[3] | ||
``` | ||
|
||
With the many noise-scaled circuits in hand, we can run them through our executor to obtain the expectation values. | ||
|
||
```{code-cell} ipython3 | ||
noise_scaled_exp_values = [ | ||
    execute(circuit) for circuit in noise_scaled_circuits | ||
] | ||
``` | ||
|
||
### Classical inference | ||
|
||
The penultimate step here is to fetch the coefficients we'll use to combine the noisy data we obtained above. | ||
This step takes the same `degree` and `fold_multiplier` values that were used to create the noise-scaled circuits. | ||
|
||
```{code-cell} ipython3 | ||
from mitiq.lre import multivariate_richardson_coefficients | ||
coefficients = multivariate_richardson_coefficients( | ||
    circuit, | ||
    fold_multiplier=fold_multiplier, | ||
    degree=degree, | ||
) | ||
``` | ||
|
||
Each noise-scaled circuit has a linear-combination coefficient and a noisy expectation value associated with it. | ||
|
||
### Combine the results | ||
|
||
```{code-cell} ipython3 | ||
mitigated = sum( | ||
    exp_val * coeff | ||
    for exp_val, coeff in zip(noise_scaled_exp_values, coefficients) | ||
) | ||
print( | ||
    f"Error with mitigation (LRE): {abs(ideal - mitigated):.5f}" | ||
) | ||
``` | ||
|
||
Using the two-stage application of LRE, we again see a clear improvement in accuracy. |
@@ -0,0 +1,35 @@ | ||
--- | ||
jupytext: | ||
  text_representation: | ||
    extension: .md | ||
    format_name: myst | ||
    format_version: 0.13 | ||
    jupytext_version: 1.10.3 | ||
kernelspec: | ||
  display_name: Python 3 (ipykernel) | ||
  language: python | ||
  name: python3 | ||
--- | ||
|
||
# When should I use LRE? | ||
|
||
## Advantages | ||
|
||
Just as in ZNE, LRE can be applied without detailed knowledge of the underlying noise model, because the effectiveness of the technique depends on the choice of scale factors. | ||
Thus, LRE is useful in scenarios where tomography is impractical. | ||
|
||
The sampling overhead is flexible: the cost can be reduced by using larger values for the fold multiplier (used to create the noise-scaled circuits) or by chunking a larger circuit so that groups of layers are folded together instead of each layer individually. | ||
|
||
## Disadvantages | ||
|
||
For a large circuit, the number of noise-scaled circuits grows polynomially with the number of layers, and the execution time rises accordingly. This growth follows from the requirement that the sample matrix be square (more details in the [theory](lre-5-theory.md) section). | ||
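
As a rough illustration of this growth, the sketch below counts the monomial terms of a polynomial in `num_layers` variables with total degree at most `degree`, which is the number of noise-scaled circuits a square sample matrix requires. The helper name `num_noise_scaled_circuits` is hypothetical and not part of Mitiq; passing a number of chunks instead of a number of layers gives the corresponding count for a chunked circuit.

```python
from math import comb


def num_noise_scaled_circuits(num_layers, degree):
    """Count monomials in num_layers variables with total degree <= degree.

    Illustrative helper (not a Mitiq function): a square sample matrix
    needs one noise-scaled circuit per monomial term.
    """
    return comb(degree + num_layers, degree)


for num_layers in (5, 10, 20, 50):
    counts = [num_noise_scaled_circuits(num_layers, d) for d in (1, 2, 3)]
    print(f"{num_layers:>3} layers -> circuits for degree 1, 2, 3: {counts}")
```

For degree 2, folding a 50-layer circuit layer by layer needs 1326 circuits under this count, while grouping it into 5 chunks needs 21.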
|
||
When reducing the sampling cost by using a larger fold multiplier, the bias for polynomial extrapolation increases as one moves farther away from the zero-noise limit. | ||
|
||
Chunking a large circuit into a small number of chunks reduces the sampling cost but can also reduce the performance of LRE. | ||
In ZNE parlance, using more chunks is closer to local folding and using fewer chunks is closer to global folding, and in LRE the former fares better. | ||
|
||
```{attention} | ||
We are currently investigating the issue related to the performance of chunking large circuits. | ||
``` |
@@ -0,0 +1,95 @@ | ||
--- | ||
jupytext: | ||
  text_representation: | ||
    extension: .md | ||
    format_name: myst | ||
    format_version: 0.13 | ||
    jupytext_version: 1.11.4 | ||
kernelspec: | ||
  display_name: Python 3 | ||
  language: python | ||
  name: python3 | ||
--- | ||
|
||
# What is the theory behind LRE? | ||
|
||
Similar to [ZNE](zne.md), LRE works in two steps: | ||
|
||
- **Step 1:** Intentionally create multiple noise-scaled but logically equivalent circuits by scaling each layer or chunk of the input circuit through unitary folding. | ||
|
||
- **Step 2:** Extrapolate to the noiseless limit using multivariate Richardson extrapolation. | ||
|
||
In ZNE, the user chooses which layers of the input circuit to fold, whereas in LRE | ||
each noise-scaled circuit scales the layers in the input circuit in a specific pattern. | ||
LRE leverages the flexible configuration space of layerwise unitary folding, allowing for a more nuanced mitigation of errors by treating the noise level of each layer of the quantum circuit as an independent variable. | ||
|
||
## Step 1: Create noise-scaled circuits | ||
|
||
The goal is to create noise-scaled circuits of different depths where the layers in each circuit are scaled in a specific pattern as a result of [unitary folding](zne-5-theory.md). | ||
This pattern is described by a collection of scale factor vectors, which are generated once the fold multiplier and the degree for multivariate Richardson extrapolation are chosen. | ||
|
||
Suppose we're interested in the value of some observable of a circuit $C$ that has $l$ layers. | ||
For each layer $0 \leq L \leq l - 1$ we can choose a scale factor for how much to scale that particular layer. | ||
Thus a vector $\lambda \in \mathbb{R}^l_+$ corresponds to a folding configuration where $\lambda_0$ is the scale factor for the first layer and $\lambda_{l - 1}$ is the scale factor for the circuit's final layer. | ||
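
To make the link between a scale factor vector and a folded circuit concrete, here is a toy sketch that represents each layer as a string label and applies unitary folding $G \to G (G^\dagger G)^n$ layer by layer. The helper `fold_layers` and the string representation are illustrative assumptions, not Mitiq's implementation, and the sketch assumes every scale factor is an odd positive integer $2n + 1$.

```python
def fold_layers(layers, scale_factors):
    """Toy layerwise unitary folding on a list of layer labels.

    Hypothetical helper for illustration only (not part of Mitiq).
    Each scale factor must be an odd positive integer 2n + 1, so that
    layer G becomes G (G^dag G)^n, which is logically equivalent to G.
    """
    if len(layers) != len(scale_factors):
        raise ValueError("need exactly one scale factor per layer")
    folded = []
    for layer, scale in zip(layers, scale_factors):
        if scale < 1 or scale % 2 == 0:
            raise ValueError("scale factors must be odd positive integers")
        num_folds = (scale - 1) // 2
        folded.append(layer)
        # Each fold appends the inverse of the layer followed by the layer.
        for _ in range(num_folds):
            folded.extend([layer + "^dag", layer])
    return folded


layers = ["G0", "G1"]
folded = fold_layers(layers, (3, 1))
print(folded)
```

With the scale factor vector `(3, 1)`, only the first layer is folded, so the two-layer circuit grows to four layers while implementing the same unitary.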
|
||
Fix the number of noise-scaled circuits we wish to generate at $M\in\mathbb{N}$. | ||
Define $\Lambda = (λ_1, λ_2, \ldots, λ_M)^T$ to be the collection of scale factors and let $(C_{λ_1}, C_{λ_2}, \ldots, C_{λ_M})^T$ denote the noise-scaled circuits corresponding to each scale factor. | ||
|
||
After $d$ is fixed as the degree of the multivariate polynomial, we define $M_j(λ_i, d)$ to be the $j$-th monomial term of the polynomial, evaluated at the scale factor vector $λ_i$, with the terms arranged in increasing order of total degree. | ||
In general, the number of monomial terms with $l$ variables up to degree $d$ can be determined | ||
through the [stars and bars method](https://en.wikipedia.org/wiki/Stars_and_bars_%28combinatorics%29). | ||
|
||
For example, if $C$ has 2 layers and the degree of the extrapolating polynomial is 2, the basis of monomials contains 6 terms: $\{1, λ_1, λ_2, {λ_1}^2, λ_1 \cdot λ_2, {λ_2}^2 \}$, where $λ_1$ and $λ_2$ here denote the scale factors of the first and second layer. | ||
|
||
$$ | ||
\text{total number of terms in the monomial basis with max degree } d = \binom{d + l}{d} | ||
$$ | ||
|
||
This count can also be broken down by degree. The number of terms with total degree exactly $d$ is given by the following formula: | ||
|
||
$$ | ||
\text{number of terms in the monomial basis with total degree } d = \binom{d + l - 1}{d} | ||
$$ | ||
|
||
There are 3 terms with total degree 2, as calculated by $\binom{2 + 2 -1}{2} = 3$, and they correspond to $\{{λ_1}^2, λ_1 \cdot λ_2, {λ_2}^2 \}$. | ||
|
||
Similarly, the number of terms with total degree 1 and 0 can be calculated as $\binom{1 + 2 -1}{1} = 2:\{λ_1, λ_2\}$ and $\binom{0 + 2 -1}{0}= 1: \{1\}$ respectively. | ||
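
The counts above can be checked by brute force. The following sketch enumerates the exponent tuples of all monomials for the two-layer, degree-2 example and compares the totals with the binomial formulas; the variable names are chosen here for illustration only.

```python
from itertools import product
from math import comb

num_layers = 2  # l in the text
degree = 2      # d in the text

# Exponent tuples (a_1, ..., a_l) with a_1 + ... + a_l <= d,
# one per monomial, sorted by total degree.
exponents = sorted(
    (e for e in product(range(degree + 1), repeat=num_layers) if sum(e) <= degree),
    key=sum,
)

print(f"basis size: {len(exponents)} (formula: {comb(degree + num_layers, degree)})")
for k in range(degree + 1):
    terms = [e for e in exponents if sum(e) == k]
    formula = comb(k + num_layers - 1, k)
    print(f"total degree {k}: {len(terms)} terms (formula: {formula}), exponents {terms}")
```

An exponent tuple such as `(1, 1)` stands for the monomial $λ_1 \cdot λ_2$, and `(0, 0)` stands for the constant term $1$.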
|
||
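These terms in the monomial basis define the rows of the square sample matrix shown below, where $N$ denotes the number of monomial terms. Because the matrix is square, the number of noise-scaled circuits $M$ equals $N$: | ||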
|
||
$$ | ||
\mathbf{A}(\Lambda, d) = | ||
\begin{bmatrix} | ||
M_1(λ_1, d) & M_2(λ_1, d) & \cdots & M_N(λ_1, d) \\ | ||
M_1(λ_2, d) & M_2(λ_2, d) & \cdots & M_N(λ_2, d) \\ | ||
\vdots & \vdots & \ddots & \vdots \\ | ||
M_1(λ_N, d) & M_2(λ_N, d) & \cdots & M_N(λ_N, d) | ||
\end{bmatrix} | ||
$$ | ||
|
||
For our example circuit of $l=2$ and $d=2$, each row defined by the generic monomial terms $\{M_1(λ_i, d), M_2(λ_i, d), \ldots, M_N(λ_i, d)\}$ in the sample matrix $\mathbf{A}$ will instead be replaced by $\{1, λ_1, λ_2, {λ_1}^2, λ_1 \cdot λ_2, {λ_2}^2 \}$. | ||
|
||
Here, each monomial term in the sample matrix $\mathbf{A}$ is then evaluated using the values in the scale factor vectors. In Step 2, this sample matrix will be utilized to obtain our mitigated expectation value. | ||
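
A minimal sketch of this evaluation for the $l=2$, $d=2$ example follows. The folding pattern used to pick the scale factor vectors (one vector per monomial, with each exponent multiplied by the fold multiplier to give a number of folds $n$ and a scale factor $2n + 1$) is an assumption made for illustration, and both helper functions are hypothetical rather than Mitiq functions.

```python
from itertools import combinations_with_replacement


def scale_factor_vectors(num_layers, degree, fold_multiplier):
    """Assumed folding pattern: one vector per monomial of degree <= d.

    Illustrative helper, not Mitiq's API. Each monomial's exponent tuple,
    times the fold multiplier, gives the number of folds per layer, and
    n folds of a layer correspond to a scale factor of 2n + 1.
    """
    vectors = []
    for total in range(degree + 1):
        for picks in combinations_with_replacement(range(num_layers), total):
            folds = [fold_multiplier * picks.count(i) for i in range(num_layers)]
            vectors.append(tuple(2 * n + 1 for n in folds))
    return vectors


def monomial_row(lam):
    """Row {1, x, y, x^2, x*y, y^2} for a two-layer vector lam = (x, y)."""
    x, y = lam
    return [1, x, y, x**2, x * y, y**2]


vectors = scale_factor_vectors(num_layers=2, degree=2, fold_multiplier=1)
sample_matrix = [monomial_row(lam) for lam in vectors]
for lam, row in zip(vectors, sample_matrix):
    print(lam, row)
```

Each printed row is one row of $\mathbf{A}$, so six scale factor vectors give a $6 \times 6$ matrix for this example.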
|
||
## Step 2: Extrapolate to the noiseless limit | ||
|
||
Each noise-scaled circuit $C_{λ_i}$ has an expectation value $\langle O(λ_i) \rangle$ associated with it such that we can define a vector of the noisy expectation values $z = (\langle O(λ_1) \rangle, \langle O(λ_2) \rangle, \ldots, \langle O(λ_M)\rangle)^T$. | ||
These values can then be combined via a linear combination to estimate the ideal value $O_{\mathrm{LRE}}$. | ||
|
||
$$ | ||
O_{\mathrm{LRE}} = \sum_{i=1}^{M} \eta_i \langle O(λ_i) \rangle. | ||
$$ | ||
|
||
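Finding the coefficients in the linear combination reduces to solving a system of linear equations. The coefficients $c$ of the extrapolating polynomial satisfy $\mathbf{A} c = z$, where $z$ is the vector of the noisy expectation values and $\mathbf{A}$ is the sample matrix evaluated using the values in the scale factor vectors. The zero-noise estimate is the constant term of this polynomial, which is a linear combination of the entries of $z$ with weights $(\eta_1, \eta_2, \ldots, \eta_M)^T$ that depend only on $\mathbf{A}$. | ||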
|
||
The [general multivariate Lagrange interpolation polynomial](https://www.siam.org/media/wkvnvame/a_simple_expression_for_multivariate.pdf) is defined by a new matrix $\mathbf{B}_i$ obtained by replacing the $i$-th row of the sample matrix $\mathbf{A}$ with monomial terms evaluated using the generic variable λ. Thus, matrix $\mathbf{B}_i$ represents an interpolating polynomial in variable λ of degree $d$. As we only need to find the noiseless expectation value, we can skip calculating the full vector of linear combination coefficients if we use the [Lagrange interpolation formula](https://files.eric.ed.gov/fulltext/EJ1231189.pdf) evaluated at $λ = 0$ i.e. the zero-noise limit. | ||
|
||
To get the matrix $\mathbf{B}_i(\mathbf{0})$, replace the $i$-th row of the sample matrix $\mathbf{A}$ by $\mathbf{e}_1=(1, 0, \ldots, 0)$, because at $λ=0$ all monomial terms vanish except $M_1(0, d) = 1$. | ||
|
||
$$ | ||
O_{\rm LRE} = \sum_{i=1}^M \langle O (\boldsymbol{\lambda}_i)\rangle \frac{\det \left(\mathbf{B}_i (\boldsymbol{0}) \right)}{\det \left(\mathbf{A}\right)} | ||
$$ | ||
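
The determinant formula can be tried out on a small case with exact rational arithmetic. The sketch below assumes the six scale factor vectors listed in the code for a two-layer circuit with $d=2$, and it uses a made-up degree-2 polynomial (`toy_expectation`) as a stand-in for measured expectation values; none of the helper names come from Mitiq.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Laplace expansion along the first row (fine for 6x6)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for col, entry in enumerate(matrix[0]):
        if entry == 0:
            continue
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def monomial_row(lam):
    """Monomials {1, x, y, x^2, x*y, y^2} evaluated at lam = (x, y)."""
    x, y = lam
    return [Fraction(v) for v in (1, x, y, x**2, x * y, y**2)]


# Assumed scale factor vectors for l = 2, d = 2 (illustrative choice).
vectors = [(1, 1), (3, 1), (1, 3), (5, 1), (3, 3), (1, 5)]
A = [monomial_row(lam) for lam in vectors]
e_1 = [Fraction(1)] + [Fraction(0)] * 5

# eta_i = det(B_i(0)) / det(A), with B_i(0) equal to A with row i set to e_1.
det_A = det(A)
coefficients = []
for i in range(len(A)):
    B_i = A[:i] + [e_1] + A[i + 1:]
    coefficients.append(det(B_i) / det_A)


def toy_expectation(lam):
    """Made-up degree-2 polynomial standing in for noisy measurements."""
    x, y = lam
    return Fraction(9, 10) - Fraction(1, 20) * x - Fraction(1, 25) * y + Fraction(1, 500) * x * y


z = [toy_expectation(lam) for lam in vectors]
estimate = sum(eta * value for eta, value in zip(coefficients, z))
print([str(eta) for eta in coefficients])
print(f"zero-noise estimate: {estimate}, exact value: {toy_expectation((0, 0))}")
```

Because the toy data is itself a polynomial of degree 2, the extrapolation recovers its value at $λ = 0$ exactly; with real noisy data the result is only an estimate. The coefficients also sum to one, as expected for weights that reproduce a constant function.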
|
||
To summarize, the user chooses the degree of the extrapolating polynomial for some circuit. Noise-scaled circuits are then created in a specific pattern, and their expectation values are combined using multivariate Lagrange interpolation of the sample matrix, evaluated using the scale factor vectors, to find the error-mitigated expectation value. | ||
|
||
Additional details on the LRE functionality are available in the [API-doc](https://mitiq.readthedocs.io/en/stable/apidoc.html#module-mitiq.lre.multivariate_scaling.layerwise_folding). |
@@ -0,0 +1,28 @@ | ||
```{warning} | ||
The user guide for LRE in Mitiq is currently under construction. | ||
``` | ||
|
||
# Layerwise Richardson Extrapolation | ||
|
||
Layerwise Richardson Extrapolation (LRE) is an error mitigation technique introduced in | ||
{cite}`Russo_2024_LRE`. It extends the ideas found in ZNE by creating multiple noise-scaled variations of the input | ||
circuit such that the noiseless expectation value is extrapolated from the execution of each | ||
noisy circuit (see the section [What is the theory behind LRE?](lre-5-theory.md)). Compared to | ||
Zero-Noise Extrapolation, this technique treats the noise in each layer of the circuit | ||
as an independent variable to be scaled and then extrapolated independently. | ||
|
||
You can get started with LRE in Mitiq with the following sections of the user guide: | ||
|
||
```{toctree} | ||
--- | ||
maxdepth: 1 | ||
--- | ||
lre-1-intro.md | ||
lre-2-use-case.md | ||
lre-5-theory.md | ||
``` |