# Characterizing your hardware using system identification in Boulder Opal

Build a system model using probe measurements and data fusion routines

## Introduction to system identification

As outlined in the Boulder Opal overview, system identification is often the first step in designing new quantum control strategies. In its most general sense, it refers to the process by which you probe your system in order to learn about its characteristics. These characteristics can vary widely, and include system responses in the time or frequency domain as well as the parameters or terms of a Hamiltonian model of the system. System identification complements first-principles analyses: its goal is to discover the model that actually describes your quantum system, imperfections and all.

## How does system identification work?

System identification is at its heart an input-output process; you subject your system to stimuli and measure the output response. The stimuli you apply must be closely linked to the characteristics you wish to probe.

For instance, if you wish to understand the frequency-domain effect of drives applied to a qubit, you can generate shaped probe pulses modulated with different center frequencies and examine the response of the system. Alternatively, if you're trying to characterize a coupling term between excited levels in your system, you could use pulses of different strengths or durations that excite those transitions.
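As a minimal sketch of the first kind of stimulus, the snippet below builds one Gaussian-envelope probe pulse per center frequency. The function name, the envelope shape, and all numerical values are hypothetical choices made for illustration (arbitrary units); they are not taken from Boulder Opal.

```python
import math

def gaussian_probe_pulse(center_frequency, duration=1.0, sample_count=200, width=0.15):
    """Return (times, samples) for a Gaussian envelope modulated at center_frequency.

    `width` is the envelope standard deviation as a fraction of `duration`.
    """
    times = [duration * (k + 0.5) / sample_count for k in range(sample_count)]
    middle = duration / 2
    samples = [
        math.exp(-((t - middle) ** 2) / (2 * (width * duration) ** 2))
        * math.cos(2 * math.pi * center_frequency * (t - middle))
        for t in times
    ]
    return times, samples

# One probe pulse per center frequency to scan (hypothetical values).
center_frequencies = [0.0, 5.0, 10.0, 20.0]
probe_pulses = {f: gaussian_probe_pulse(f) for f in center_frequencies}
```

Each pulse would then be applied to the system, and the measured response recorded against its center frequency.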

After collecting the input-output data, the core task of system identification is to model the process that transforms the input stimulus into the output measurement. In general, you can accomplish this by testing a variety of candidate models and choosing the one whose response is closest to that of the actual system.

## System identification as an optimization problem

You can cast the problem of model selection as an optimization problem, which is conveniently handled by the graph-based optimization engine in Boulder Opal.

In this approach, your aim is to find the model parameters $\{\lambda_j\}$ that minimize the distance between some measured data points $\{ P_i \}$ and those predicted by a candidate model $\{p_i(\{\lambda_j\})\}$ with the same inputs. This defines a simple cost function that links the measured and predicted data,

$$C(\{\lambda_j\}) = \sum_i \frac{[P_i-p_i(\{\lambda_j\})]^2}{2(\Delta P_i)^2} \, ,$$

where $\Delta P_i$ is the standard deviation of each of the measured points $P_i$.
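The cost function above can be sketched in a few lines of plain Python. The candidate model used here (a Rabi-type population $\sin^2(\omega t/2)$ with a single parameter $\omega$), the function names, and the synthetic data are assumptions made for illustration only.

```python
import math

def weighted_cost(measured, predicted, sigmas):
    """C = sum_i (P_i - p_i)^2 / (2 * dP_i^2)."""
    return sum(
        (P - p) ** 2 / (2 * dP ** 2)
        for P, p, dP in zip(measured, predicted, sigmas)
    )

def rabi_model(omega, durations):
    """Hypothetical candidate model: population sin^2(omega * t / 2)."""
    return [math.sin(omega * t / 2) ** 2 for t in durations]

durations = [0.5, 1.0, 1.5, 2.0]
measured = rabi_model(2.0, durations)  # noiseless synthetic "measurements"
sigmas = [0.05] * len(durations)       # assumed standard deviations

# The cost vanishes for the parameter that generated the data
# and grows for any other candidate value.
cost_true = weighted_cost(measured, rabi_model(2.0, durations), sigmas)
cost_wrong = weighted_cost(measured, rabi_model(2.5, durations), sigmas)
```

Dividing each squared residual by its variance means that precisely measured points constrain the fit more strongly than noisy ones.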

You can translate the problem to Boulder Opal by creating a graph that encodes the relationship between the parameters to be identified (defined as optimization variables) and the measured response. Then, you calculate the cost function by comparing each measured point to the corresponding simulated point in the graph. Running the optimization gives you the parameters that minimize the cost function, and thus the model that most faithfully describes the measured system dynamics.

Minimizing the cost function $C$ is equivalent to minimizing the negative log-likelihood that a certain set of parameters $\{\lambda_j\}$ could generate the points $P_i$, for the case where the probability distribution of the errors is a Gaussian. In this case, the probability of a certain curve $p_i(\{\lambda_j\})$ generating the points $P_i$ is the product of all the individual Gaussian probabilities:

$$P = \prod_i \frac{1}{\sqrt{2\pi (\Delta P_i)^2}} \exp \left\{ - \frac{[P_i - p_i(\{\lambda_j\})]^2}{2 (\Delta P_i)^2} \right\} \, .$$

Here you can see that the negative logarithm of $P$ is the cost function $C$ plus constants.
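Spelling this step out, taking the negative logarithm turns the product into a sum in which only the first term depends on the parameters:

$$-\ln P = \sum_i \frac{[P_i - p_i(\{\lambda_j\})]^2}{2 (\Delta P_i)^2} + \frac{1}{2} \sum_i \ln \left[ 2\pi (\Delta P_i)^2 \right] = C(\{\lambda_j\}) + \text{const} \, .$$

The second sum depends only on the measurement uncertainties, so it does not change the location of the minimum.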

You can also calculate the precision of the estimated parameters by using the Cramér–Rao bound, identifying the Hessian of the cost function at its minimum with the inverse of the covariance matrix of the estimated variables.
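The following sketch illustrates this relationship for a hypothetical two-parameter straight-line model $p_i = a x_i + b$. It estimates the Hessian of the cost with central finite differences and inverts it to obtain a covariance matrix. The model, the helper names, and the data are illustrative assumptions, and a finite-difference Hessian is only one of several ways to obtain the curvature.

```python
import math

def cost(params, xs, measured, sigmas):
    """Gaussian cost for a hypothetical straight-line model p_i = a * x_i + b."""
    a, b = params
    return sum(
        (P - (a * x + b)) ** 2 / (2 * s ** 2)
        for x, P, s in zip(xs, measured, sigmas)
    )

def hessian_2x2(f, params, step=1e-3):
    """Central finite-difference Hessian of f at a two-parameter point."""
    def shifted(i, j, sign_i, sign_j):
        p = list(params)
        p[i] += sign_i * step
        p[j] += sign_j * step
        return f(p)

    h = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            h[i][j] = (
                shifted(i, j, 1, 1) - shifted(i, j, 1, -1)
                - shifted(i, j, -1, 1) + shifted(i, j, -1, -1)
            ) / (4 * step ** 2)
    return h

def invert_2x2(h):
    """Inverse of a 2x2 matrix."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [[h[1][1] / det, -h[0][1] / det], [-h[1][0] / det, h[0][0] / det]]

xs = [0.0, 1.0, 2.0, 3.0]
measured = [1.0, 3.0, 5.0, 7.0]  # synthetic data generated with a = 2, b = 1
sigmas = [0.1] * len(xs)
best_fit = [2.0, 1.0]            # parameters that minimize the cost for this data

hessian = hessian_2x2(lambda p: cost(p, xs, measured, sigmas), best_fit)
covariance = invert_2x2(hessian)  # Hessian at the minimum ~ inverse covariance
uncertainties = [math.sqrt(covariance[k][k]) for k in range(2)]
```

The square roots of the diagonal entries give the standard deviations of the individual parameters, while the off-diagonal entries show how strongly their estimates are correlated.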

## Selecting a system identification strategy

The best system identification strategy depends on the constraints of your problem, and in turn determines which optimization method is best suited to your needs. Note that these are the same mathematical methods used for control design, but here employed for cost-function minimization in a different setting.

### System identification for simple systems with moderate amounts of measurement data

For most system identification problems, which involve a moderate amount of data and a simple physical system, a deterministic gradient-based optimizer (such as the one provided by `qctrl.functions.calculate_optimization`) is a fast and reliable way to find the minimum of the cost function.
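To give a rough idea of what a deterministic gradient-based fit does, the sketch below recovers a single parameter from a small synthetic dataset. It uses a plain-Python Gauss-Newton iteration as a stand-in and does not call the Boulder Opal optimizer. The model $\sin^2(\omega t/2)$, the function names, and all values are hypothetical.

```python
import math

def model(omega, t):
    """Hypothetical model: excited-state population after driving for time t."""
    return math.sin(omega * t / 2) ** 2

def model_gradient(omega, t):
    """Analytic derivative of the model with respect to omega."""
    return (t / 2) * math.sin(omega * t)

def fit_omega(durations, measured, sigmas, omega_guess, iterations=20):
    """Gauss-Newton minimization of the Gaussian cost over a single parameter."""
    omega = omega_guess
    for _ in range(iterations):
        # Gradient of the cost and Gauss-Newton curvature, both weighted by 1/sigma^2.
        gradient = sum(
            -(P - model(omega, t)) * model_gradient(omega, t) / s ** 2
            for t, P, s in zip(durations, measured, sigmas)
        )
        curvature = sum(
            model_gradient(omega, t) ** 2 / s ** 2
            for t, s in zip(durations, sigmas)
        )
        omega -= gradient / curvature
    return omega

durations = [0.3, 0.6, 0.9, 1.2, 1.5]
true_omega = 2.0
measured = [model(true_omega, t) for t in durations]  # noiseless synthetic data
sigmas = [0.05] * len(durations)

estimated_omega = fit_omega(durations, measured, sigmas, omega_guess=1.5)
```

Because every step uses the full dataset and the exact gradient, repeated runs from the same initial guess follow the same path. Like any local method, the result depends on starting close enough to the true value.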

See the user guides "How to perform Hamiltonian parameter estimation using a small amount of measured data" and "How to characterize the bandwidth of a transmission line using a qubit as a probe" to learn how to perform system identification using Boulder Opal.

### System identification for multiparameter systems with large amounts of measurement data

Optimizations on large systems or involving a large dataset can be computationally demanding: they are frequently slow or require large amounts of classical computing memory. In such cases it is generally advantageous to perform a stochastic optimization, which is available in Boulder Opal through `qctrl.functions.calculate_stochastic_optimization`.

With stochastic optimization, you can split the dataset into small batches and use a different one at each optimization step. This option is particularly useful for problems that involve a large dataset or a complex simulation due to, for example, a fine time discretization. Moreover, using a momentum-based stochastic optimization algorithm such as Adam, as employed by the stochastic optimization engine in Boulder Opal, can provide an advantage in finding local minima in the cost landscape when using noisy datasets.
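The batching idea can be sketched with a small plain-Python implementation of Adam that draws a fresh random mini-batch at every step. This is an illustrative stand-in rather than the Boulder Opal stochastic optimization engine. The model, the hyperparameters, and the synthetic noisy dataset are all assumptions.

```python
import math
import random

def model(omega, t):
    """Hypothetical model: population sin^2(omega * t / 2)."""
    return math.sin(omega * t / 2) ** 2

def batch_gradient(omega, batch, sigma):
    """Gradient of the Gaussian cost, averaged over one mini-batch."""
    return sum(
        -(P - model(omega, t)) * (t / 2) * math.sin(omega * t) / sigma ** 2
        for t, P in batch
    ) / len(batch)

def adam_fit(data, sigma, omega, steps=500, batch_size=50, learning_rate=0.01,
             beta1=0.9, beta2=0.999, epsilon=1e-8, seed=0):
    """Estimate omega with Adam, using a different random batch at each step."""
    rng = random.Random(seed)
    m = v = 0.0
    for step in range(1, steps + 1):
        batch = rng.sample(data, batch_size)
        g = batch_gradient(omega, batch, sigma)
        m = beta1 * m + (1 - beta1) * g        # running mean of the gradient
        v = beta2 * v + (1 - beta2) * g * g    # running mean of its square
        m_hat = m / (1 - beta1 ** step)        # bias corrections
        v_hat = v / (1 - beta2 ** step)
        omega -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return omega

# Synthetic noisy dataset: 2000 (duration, measured population) pairs.
rng = random.Random(1)
sigma = 0.05
true_omega = 2.0
data = []
for _ in range(2000):
    t = rng.uniform(0.05, 1.5)
    data.append((t, model(true_omega, t) + rng.gauss(0.0, sigma)))

estimated_omega = adam_fit(data, sigma, omega=1.5)
```

Each step evaluates the model on only 50 of the 2000 points, so the cost per step stays fixed as the dataset grows, at the price of a noisier gradient that the running averages in Adam help to smooth.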

See the user guide "How to perform Hamiltonian parameter estimation using a large amount of measured data" to learn how to use stochastic optimization in Boulder Opal to perform system identification.