Algorithmic benchmarking results

Evaluating Fire Opal performance improvement across multiple benchmark algorithms


Benchmarks are a common industry practice to determine the performance of quantum computing systems. The usefulness of a quantum computer is ultimately determined by its ability to successfully implement meaningful and relevant quantum algorithms in reasonable times and with reasonable resources.

Fire Opal boosts algorithmic success on quantum computers, and makes it possible to run algorithms on NISQ devices. This article provides an overview of the benchmarked improvements of Fire Opal run on IBM backends, compared to bare execution without Fire Opal.

At Q-CTRL, benchmarking is performed continuously to test the performance of new error suppression methods. This document reveals some of the latest benchmarking results achieved across different algorithms, which are adapted from the QED-C Application-Oriented Performance Benchmarks for Quantum Computing. These benchmarking results can help to set expectations for what sorts of improvements you will see with Fire Opal, but it is still highly encouraged to try it out yourself as Fire Opal's error suppression capabilities are constantly improving.

Benchmarks were carried out across a set of algorithms with widely varying characteristics (connectivity, circuit depth, etc): Bernstein–Vazirani (BV), quantum Fourier transform (QFT), Greenberger–Horne–Zeilinger (GHZ) state, and quantum approximate optimization algorithm (QAOA). Certain algorithms are inherently better suited for certain devices, due to device connectivity and layouts, but Fire Opal generally boosts algorithmic success probability across devices and algorithms.

For a detailed look at our methodologies and previous results, you can reference our technical manuscript.

Success metrics

  1. Success probability: The probability that a single execution (shot) of the circuit will yield the correct solution. The ideal value is 1.
  2. Selectivity: A way to measure the signal-to-noise ratio or how much the right answer stands out relative to the other (wrong) choices. More precisely, selectivity is the logarithm of the ratio of the success probability to the probability of the most likely incorrect bitstring. Greater than 1 is ideal.
  3. Confidence level: The odds of running the circuit N times (shots) and picking a correct solution instead of the incorrect alternatives. The ideal value is 100%.
  4. Hellinger fidelity: A function of the Hellinger distance, which is used to quantify the similarity between two probability distributions. The ideal value is 1.

Multiple success metrics are used because together they combine to tell a complete story. As problem sizes get larger, the number of possible outcomes increases vastly, meaning that the success probability naturally decreases. Selectivity provides an indication that even a low success probability can be meaningful.

A selectivity greater than one implies that the solution bitstring is much more probable than any other, and the signal can be amplified by simply performing further executions (averaging). As you repeat the experiment, you will keep getting the correct solution as the most probable answer, and this boosts the confidence that the algorithm is working as expected, as opposed to obtaining this result by chance.

With several measures of success probability and selectivity greater than one, the confidence level can be measured and shown to converge to 100%. Together, these three metrics provide a more complete view of how most algorithms are performing. Some algorithms also have specific metrics that can be used to quantify performance.

For algorithms where the ideal distribution is known, Hellinger fidelity can be used to compare the generated and ideal distributions. In the example of QAOA, metrics specific to the problem are defined.

Configuration information

The results were collected using 127-qubit IBM devices: IBM Brisbane and IBM Sherbrooke. Daily benchmark tests are run and these represent mean values.

The "Fire Opal" represents results using Fire Opal's error suppression pipeline, whereas "Default" measurements were taken using the built-in error suppression and mitigation in Qiskit Runtime (optimization_level=1 and resilience_level=1).

Bernstein–Vazirani algorithm

Success probability

In the following figure, the success probability of a Bernstein–Vazirani algorithm is shown as a function of the number of qubits, ranging from 10 to 45. While the success probability without Fire Opal drops to zero at 20 qubits, Fire Opal is able to achieve the right answer up to and including 45 qubits.

Screenshot 2024-02-20 at 11.50.33 AM.png-1


The following figure shows the selectivity calculated from the same results plotted above. A selectivity greater than one indicates a strong signal of the correct answer within the probability distribution. For the Bernstein–Vazirani algorithm, Fire Opal achieves the correct answer with a strong signal up to and including 45 qubits.

Screenshot 2024-02-20 at 11.50.22 AM.png-2

Confidence level

Because the selectivity is greater than one, the confidence level can be increased by running more iterations (shots). In the plot of "Confidence level," it's clear that greater than 99% confidence can be reached across all three circuit sizes—arriving there just takes more shots and averaging as the number of qubits increases. Still, achieving 99% confidence only takes about 1000 shots with Fire Opal.

Achieving this level of confidence isn't possible above 10 qubits without Fire Opal. Because selectivity is negative, increasing the number of shots will continue to generate a random spread of results.

Screenshot 2024-02-08 at 2.26.19 PM.png-3

Quantum Fourier transform

The following set of figures show the success probability and selectivity of running the Quantum Fourier transform algorithm up to 20 qubits. Since this algorithm is more complex than the previous, scaling to higher qubit counts poses a greater challenge. However, Fire Opal still increases the size of the circuit you can run from 8 up to 20, while still achieving meaningful results.

Success probability

Without Fire Opal, the probability of getting the correct answer on even a single shot drops to zero at 12-qubit QFT. However, with Fire Opal, you're still able to get the correct answer up to 20 qubits.

Screenshot 2024-02-20 at 11.47.34 AM.png-4


Selectivity remains above 1 up to 14 qubits with Fire Opal, which means that signal is very strong. From 16 to 20 qubits, selectivity is still positive, which indicates that the correct answer is still the most probable answer.

Screenshot 2024-02-20 at 11.47.19 AM.png-5

Quantum phase estimation

The following figures show the success probability and selectivity for the quantum phase estimation algorithm. Using Fire Opal, you can achieve meaningful results running this algorithm up to 16 qubits, compared to only 8 qubits without.

Success probability

Similar to QFT, quantum phase estimation is a complex algorithm that quickly increases in depth (number of gates) as the number of qubits increases. Without using Fire Opal, the likelihood of getting the right answer in even a single shot drops to zero after 12 qubits. With Fire Opal's error suppression, it's possible to still land on correct answers.

Screenshot 2024-02-20 at 11.45.07 AM.png-6


The selectivity plot shows that even though there were still some correct answers being found without Fire Opal up to 12 qubits, the most common answer returned was the incorrect answer, thus the selectivity is negative. Fire Opal maintains a positive selectivity up to 16 qubits.

Screenshot 2024-02-20 at 11.45.17 AM.png-7

Greenberger–Horne–Zeilinger (GHZ) state

Entanglement is a resource used in quantum computing that links qubits together in a truly “non-classical” way. Generating a large entagled GHZ state is notoriously difficult.

It's possible to generate GHZ states up to about 60 qubits using Q-CTRL's error suppression.

Here, Hellinger fidelity is a metric that compares the expected and actual outputs of the machine following state generation—higher is better. The horizontal dashed line near 0.5 is a typical threshold used to identify whether a state passes a test as verifiably entangled. At 60 qubits, the value is 0.501.

Screenshot 2024-02-20 at 11.47.08 AM.png-8

The following table provides the numerical data for previous figure.

Number of qubitsQ-CTRL + IBM Hellinger fidelityHellinger fidelity without Q-CTRL

Quantum approximate optimization algorithm (QAOA)

The algorithmic performance of the quantum approximate optimization algorithm (QAOA) was benchmarked using randomly generated MaxCut problems of 3-regular graphs (where each node is connected to exactly three other nodes) with a unique solution. In the QAOA implementation benchmarked, p=1, where p is an integer parameter which dictates the depth of the ansatz, and thus affects the approximation quality.

Since QAOA is an iterative algorithm, the individual metrics generated after each circuit execution (the quantum piece of the quantum-classical hybrid algorithm) are not as useful as a measure of the overall performance of the algorithm. Rather, it's more indicative to look at the quality of the final solution.

The most important indicator is whether or not the actual "max cut", the optimal solution, was also found as the solution by the QAOA algorithm. This means a correct answer was given. Fire Opal enables you to get the correct max cut on problems up to 100 qubits.

Here, Fire Opal is benchmarked against a "brute-force" classical solution, which gives a random distribution of results at this scale. These random results are comparable to what you would achieve using the default hardware performance as well. Fire Opal's improvement is visually indicated by the shift of the distribution of results to the right, which indicates higher cut values, i.e. better solutions.

Note that these results are achieved using additional enhancements that are included in the Enterprise tier of Fire Opal and available via the QAOA Solver. Contact us to get access to Fire Opal Enterprise.

The following images show randomly generated MaxCut problems of 80-qubit, 100-qubit, and 120-qubit sizes. Q-CTRL found the correct answer each time.

Screenshot 2024-02-20 at 2.45.47 PM.png-9

Screenshot 2024-05-09 at 1.26.41 PM.png-10

Even at the scale of 120 qubits, Fire Opal is still able to determine the correct maximum cut value. The likelihood of finding this answer randomly is incredibly low.

Screenshot 2024-05-09 at 1.18.13 PM.png-11


This document provides an indication of the performance improvement you can expect to achieve across various algorithms using Fire Opal. The reported metrics represent current mean values across multiple benchmarking experiments, but they will continue to improve further over subsequent weeks and months as Q-CTRL methods are continuously evolving.

Validating and reproducing these metrics is welcomed and encouraged. Learn how to get started with Fire Opal, and run the Bernstein–Vazirani algorithm.

Was this useful?

cta background

New to Fire Opal?

Get access to everything you need to automate and optimize quantum hardware performance at scale.