class Result(*, next_actions=None, state=None, metadata=None, action=None, errors=None)

Result from a reinforcement learning step.

  • next_actions (List[qctrl.dynamic.types.reinforcement_learning_step.Action], optional) – The batch of actions received from the agent’s policy. Omitted only when episode_done is True; otherwise it has the same length as environment_feedbacks, with items in the same order.

  • state (str, optional) – The encoded optimizer state to be passed back in the next step.

  • metadata (dict, optional) – Metadata about the agent at the current step. No guarantees are made about the contents of this metadata dictionary. The contained information is intended purely to help interpret the results of the agent on a one-off basis.

  • action (qctrl.dynamic.types.CoreAction, optional) – Metadata about the execution of the function. Note: You might be looking for next_actions instead.

  • errors (List[qctrl.dynamic.types.Error], optional) – Related validation errors.
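
The field descriptions above suggest a simple way to consume a step result: check errors first, treat a missing next_actions as the end of the episode, and otherwise confirm that the batch length matches the feedback sent in. The following is a minimal sketch of that pattern, not the library's own code. StepResult is a hypothetical stand-in that only mirrors the documented field names, unpack_step and n_feedbacks are illustrative names, and the plain strings stand in for real Action objects and encoded state.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class StepResult:
    """Hypothetical stand-in mirroring the documented fields of Result."""

    next_actions: Optional[List[Any]] = None
    state: Optional[str] = None
    metadata: Optional[Dict[str, Any]] = None
    action: Optional[Any] = None
    errors: Optional[List[Any]] = None


def unpack_step(result: StepResult, n_feedbacks: int) -> Tuple[List[Any], Optional[str]]:
    """Return (next_actions, state), raising if the result looks invalid."""
    # Validation errors take priority over every other field.
    if result.errors:
        raise RuntimeError(f"step reported validation errors: {result.errors}")

    # next_actions is documented as omitted only when the episode is done.
    if result.next_actions is None:
        return [], result.state

    # Otherwise the batch should line up one-to-one with the feedback sent.
    if len(result.next_actions) != n_feedbacks:
        raise ValueError(
            f"expected {n_feedbacks} actions, got {len(result.next_actions)}"
        )
    return list(result.next_actions), result.state


# Ongoing episode: two feedback items were sent, so two actions come back.
ongoing = StepResult(next_actions=["a0", "a1"], state="encoded-state-1")
actions, state = unpack_step(ongoing, n_feedbacks=2)

# Finished episode: next_actions is omitted, the state is still returned.
finished = StepResult(state="encoded-state-2")
final_actions, final_state = unpack_step(finished, n_feedbacks=2)
```

The returned state is the value to pass back in the next step. metadata is deliberately left out of the control flow, since no guarantees are made about its contents.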