choices

random.choices(data, sample_count, seed=None, *, name=None)

Create random samples from the data that you provide.

You can provide the data as a list, where each element of the list represents one component of the full data set. For example, for a single-variable linear regression problem described by an input x and an output y, the data you provide would be [x, y].

This node effectively chooses a random batch of sample_count indices {s_i}, and extracts the corresponding slices {c[s_i]} from each data component c.
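For illustration, drawing one such batch is conceptually similar to the following NumPy sketch (a simplified picture of the semantics, not the node's implementation; the names rng, data, and sample_count here are assumptions for this example):

import numpy as np

rng = np.random.default_rng(seed=1)
data = [np.arange(20).reshape((10, 2)), np.arange(10) * 0.2]
sample_count = 3

# Choose sample_count distinct indices along the shared first dimension.
indices = rng.choice(len(data[0]), size=sample_count, replace=False)

# Extract the corresponding slice from every data component.
batch = [component[indices] for component in data]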

If this node is evaluated multiple times (for example during an optimization), it samples indices without replacement until all indices have been seen, at which point it starts sampling from the full set of indices again. You can therefore use this node to create minibatches that iterate over your data set in a series of epochs.
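As a rough sketch of this epoch behavior (again purely illustrative, in plain NumPy; the helper minibatch_indices is hypothetical and not part of the library), repeated evaluations act as if the indices were drawn from a shuffled permutation that is replenished once exhausted:

import numpy as np

def minibatch_indices(data_size, sample_count, rng):
    # Yield batches of indices without replacement, reshuffling once
    # every index has been seen (that is, once per epoch).
    while True:
        permutation = rng.permutation(data_size)
        for start in range(0, data_size, sample_count):
            yield permutation[start : start + sample_count]
        # Note: how the node treats a final incomplete batch may differ;
        # this simplification just starts a fresh epoch afterwards.

rng = np.random.default_rng(seed=1)
batches = minibatch_indices(data_size=10, sample_count=3, rng=rng)
first_batch = next(batches)  # Three indices from the current epoch.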

Parameters

  • data (list [ np.ndarray or Tensor ]) – A list of data components. The first dimension of each element represents the total size of the data set, and therefore must be the same for all elements.
  • sample_count (int) – Number of samples in the returned batch.
  • seed (int or None , optional) – Seed for the random number generator. Defaults to None. If set, the random samples are generated in a reproducible sequence.
  • name (str or None , optional) – The name of the node.

Returns

A sequence representing a batch of random samples from data. You can access the elements of the sequence using integer indices. The number of elements in the sequence is the same as the length of data. Each element of the sequence has a length (along its first dimension) equal to sample_count.

Return type

Sequence[Tensor]

SEE ALSO

Graph.random.normal : Create a sample of normally distributed random numbers.

Graph.random.uniform : Create a sample of uniformly distributed random numbers.

boulderopal.run_stochastic_optimization : Function to find the minimum of generic stochastic functions.

Examples

>>> import numpy as np
>>> import boulderopal as bo
>>> graph = bo.Graph()
>>> x = np.arange(20).reshape((10, 2))
>>> y = np.arange(10) * 0.2
>>> sampled_x, sampled_y = graph.random.choices([x, y], 3, seed=1)
>>> sampled_x.name = "sampled_x"
>>> sampled_y.name = "sampled_y"
>>> result = bo.execute_graph(graph=graph, output_node_names=["sampled_x", "sampled_y"])
>>> result["output"]["sampled_x"]["value"]
array([[2, 3],
       [6, 7],
       [0, 1]])
>>> result["output"]["sampled_y"]["value"]
array([0.2, 0.6, 0. ])

See more examples in the "How to perform Hamiltonian parameter estimation using a large amount of measured data" user guide.
