class qctrl.dynamic.types.reinforcement_learning_step.Agent(*, policy_gradient_initializer=None, state=None)

The agent used in reinforcement learning. Exactly one field must be set (that is, not None). For the first step of reinforcement learning, pass an initializer, such as PolicyGradientInitializer. For each subsequent step, pass the state returned in the result of the previous step; that state already contains the agent’s configuration and other relevant information, so no initializer is needed.

  • policy_gradient_initializer (qctrl.dynamic.types.reinforcement_learning_step.PolicyGradientInitializer, optional) – Initializer for the policy-gradient-based learning agent. Set this field on the first step to use the policy gradient algorithm for reinforcement learning.

  • state (str, optional) – The state of the agent at the current step, as returned in the result of the previous step. In every step after the agent’s initialization, set this field to the most recently returned state.
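
The hand-off between steps can be illustrated with a minimal, self-contained sketch. It does not call the Q-CTRL package: AgentSketch, InitializerSketch, mock_step, and run_steps are hypothetical stand-ins invented for this example, and the "step:N" state format is an assumption (the real state is an opaque string). The sketch only shows the "exactly one field" rule and the pattern of passing the initializer on the first step and the returned state afterwards.

```python
class InitializerSketch:
    """Hypothetical stand-in for PolicyGradientInitializer (real fields omitted)."""


class AgentSketch:
    """Hypothetical stand-in for Agent that enforces the one-field rule."""

    def __init__(self, *, policy_gradient_initializer=None, state=None):
        provided = [
            field
            for field in (policy_gradient_initializer, state)
            if field is not None
        ]
        if len(provided) != 1:
            raise ValueError(
                "Exactly one of policy_gradient_initializer or state must be set."
            )
        self.policy_gradient_initializer = policy_gradient_initializer
        self.state = state


def mock_step(agent):
    """Hypothetical learning step: returns a result holding a new state string.

    The 'step:N' format is made up for this sketch; treat real states as opaque.
    """
    if agent.state is None:
        count = 1  # first step: the agent was built from an initializer
    else:
        count = int(agent.state.split(":")[1]) + 1
    return {"state": f"step:{count}"}


def run_steps(n_steps):
    """Run n_steps mock steps, rebuilding the agent from each returned state."""
    # First step: only the initializer is set.
    agent = AgentSketch(policy_gradient_initializer=InitializerSketch())
    states = []
    for _ in range(n_steps):
        result = mock_step(agent)
        states.append(result["state"])
        # Subsequent steps: only the state from the previous result is set.
        agent = AgentSketch(state=result["state"])
    return states
```

In this sketch the agent is rebuilt from the returned state after every step, so the initializer is never passed a second time; passing both fields, or neither, raises an error.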