choices

random.choices(data, sample_count, seed=None, *, name=None)

Create random samples from the data that you provide.

Provide the data as a list in which each element is one component of the full data set. For example, for a single-variable linear regression problem with input $x$ and output $y$ , the data you provide would be $[x, y]$ . The first dimension of each component is the number of data points, and must therefore be the same for all components. The remaining dimensions (the value shapes) can differ between components.

This node effectively chooses a random batch of sample_count indices $\{s_i\}$ , and extracts the corresponding slices $\{c[s_i]\}$ of each data component $c$ . For example, in the case of linear regression, you can use this node to extract a random subset of your full data set.
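The index selection described above can be sketched in plain NumPy. This is only an illustration of the idea, not the node's implementation: the helper name `choose_slices` is hypothetical, and the indices it draws for a given seed will generally differ from those the node produces.

```python
import numpy as np


def choose_slices(data, sample_count, seed=None):
    """Hypothetical helper: draw sample_count random indices and
    slice every data component with the same indices."""
    size = data[0].shape[0]
    # All components must share the same first dimension.
    if any(component.shape[0] != size for component in data):
        raise ValueError("All components must have the same first dimension.")
    rng = np.random.default_rng(seed)
    # Indices within one batch are distinct (sampling without replacement).
    indices = rng.choice(size, size=sample_count, replace=False)
    return [component[indices] for component in data]


x = np.arange(20).reshape((10, 2))  # component with value shape (2,)
y = np.arange(10) * 0.2             # component with scalar values
sampled_x, sampled_y = choose_slices([x, y], 3, seed=1)
```

Because the same indices are applied to every component, each row of `sampled_x` stays paired with the matching entry of `sampled_y`, even though the two components have different value shapes.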

If this node is evaluated multiple times (for example during an optimization), it samples indices without replacement until all indices have been seen, at which point it starts sampling from the full set of indices again. You can therefore use this node to create minibatches that iterate over your data set in a series of epochs.
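The epoch behavior can be pictured with a small NumPy sketch. The class `EpochSampler` is hypothetical and is not part of the library. In particular, how the node handles a batch that straddles two epochs is not stated here, so the top-up step below is an assumption made only for illustration.

```python
import numpy as np


class EpochSampler:
    """Hypothetical illustration of sampling indices without replacement
    across repeated calls, restarting once every index has been seen."""

    def __init__(self, size, sample_count, seed=None):
        if sample_count > size:
            raise ValueError("sample_count must not exceed the data size.")
        self.size = size
        self.sample_count = sample_count
        self.rng = np.random.default_rng(seed)
        # Shuffled pool of indices not yet seen in the current epoch.
        self.pool = self.rng.permutation(size)

    def next_indices(self):
        batch = self.pool[: self.sample_count]
        self.pool = self.pool[self.sample_count :]
        if len(batch) < self.sample_count:
            # Assumption: when the pool runs out, start a new epoch and
            # top up the batch from the fresh permutation.
            missing = self.sample_count - len(batch)
            self.pool = self.rng.permutation(self.size)
            batch = np.concatenate([batch, self.pool[:missing]])
            self.pool = self.pool[missing:]
        return batch


sampler = EpochSampler(size=10, sample_count=5, seed=0)
first = sampler.next_indices()
second = sampler.next_indices()
# Two batches of 5 cover all 10 indices exactly once: one full epoch.
epoch = np.sort(np.concatenate([first, second]))
third = sampler.next_indices()  # first batch of the next epoch
```

When sample_count divides the data size evenly, as in this usage, each epoch consists of a whole number of disjoint minibatches and the top-up branch is only used to start a new epoch.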

Parameters

data (list [ np.ndarray or Tensor ]) – A list of data components. The first dimension of each element is the total number of data points, and must therefore be the same for all elements.
sample_count (int) – Number of samples in the returned batch.
seed (int or None , optional) – Seed for the random number generator. Defaults to None. If set, the random samples are generated in a reproducible sequence.
name (str or None , optional) – The name of the node.

Returns

A sequence representing a batch of random samples from data. You can access the elements of the sequence using integer indices. The number of elements in the sequence is the same as the number of components in data. Each element of the sequence has length sample_count along its first dimension.

Return type

Sequence[Tensor]

Examples

>>> x = np.arange(20).reshape((10, 2))
>>> y = np.arange(10) * 0.2
>>> sampled_x, sampled_y = graph.random.choices([x, y], 3, seed=1)
>>> sampled_x.name = "sampled_x"
>>> sampled_y.name = "sampled_y"
>>> result = bo.execute_graph(graph=graph, output_node_names=["sampled_x", "sampled_y"])
>>> result["output"]["sampled_x"]["value"]
array([[2, 3], [6, 7], [0, 1]])
>>> result["output"]["sampled_y"]["value"]
array([0.2, 0.6, 0. ])

See more examples in the How to perform Hamiltonian parameter estimation using a large amount of measured data user guide.
