Result

class qctrl.dynamic.types.reinforcement_learning_step.Result(*, next_actions=None, state=None, metadata=None, action=None, errors=None)

Result from a reinforcement learning step.

Variables
  • next_actions (List[qctrl.dynamic.types.reinforcement_learning_step.Action], optional) – The batch of actions received from the agent’s policy. Omitted only when episode_done == True; otherwise it has the same length as environmentFeedbacks, with matching item ordering.

  • state (str, optional) – The encoded optimizer state to be passed back in the next step.

  • metadata (dict, optional) – Metadata about the agent at the current step. No guarantees are made about the contents of this metadata dictionary. The contained information is intended purely to help interpret the results of the agent on a one-off basis.

  • action (qctrl.dynamic.types.CoreAction, optional) – Metadata about the execution of the function. Note: You might be looking for next_actions instead.

  • errors (List[qctrl.dynamic.types.Error], optional) – Related validation errors.
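
The fields above suggest a simple handling pattern for each step: check errors, treat a missing next_actions as the end of the episode, and keep state to pass back in the next step. The following is a minimal sketch of that pattern, not the library's own code. StubResult and unpack_step are hypothetical names introduced for illustration; StubResult only mirrors the documented field names so the example runs without the qctrl package, and the action values are placeholder strings rather than real Action objects.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class StubResult:
    """Hypothetical stand-in that mirrors the documented fields of Result."""

    next_actions: Optional[List[Any]] = None
    state: Optional[str] = None
    metadata: Optional[dict] = None
    action: Optional[Any] = None
    errors: Optional[List[Any]] = None


def unpack_step(result: StubResult) -> Tuple[List[Any], Optional[str]]:
    """Return (next_actions, state) from a step result.

    Raises RuntimeError if the result carries validation errors.
    An omitted next_actions (None) is returned as an empty list.
    """
    if result.errors:
        raise RuntimeError(f"step reported validation errors: {result.errors}")
    actions = result.next_actions if result.next_actions is not None else []
    return actions, result.state


# Mid-episode: the agent proposed two actions and returned an encoded state.
mid_episode = StubResult(next_actions=["a0", "a1"], state="encoded-1")
# Final step: next_actions is omitted (assumed here to mean the episode is done).
final_step = StubResult(state="encoded-2")

print(unpack_step(mid_episode))  # (['a0', 'a1'], 'encoded-1')
print(unpack_step(final_step))   # ([], 'encoded-2')
```

Returning an empty list for an omitted next_actions lets the calling loop stop naturally when there is nothing left to execute, while the returned state stays available to pass into a later step.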