EnvironmentFeedback

class qctrl.dynamic.types.reinforcement_learning_step.EnvironmentFeedback(*, observation, reward=0.0)

A response from the environment to the agent.

Variables
  • observation (List[float]) – The observation from the environment to be given to the agent.

  • reward (float, optional) – The reward for the agent due to its previous actions. Defaults to 0.0 if not provided.
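
The following is a minimal sketch of how the two variables above behave. It is not the library's implementation: it defines a local stand-in class, assumed here purely for illustration, that mirrors the documented keyword-only signature (`observation` required, `reward` defaulting to 0.0). The observation values are made up.

```python
from typing import List


class EnvironmentFeedback:
    """Local stand-in mirroring the documented signature (NOT the qctrl class)."""

    def __init__(self, *, observation: List[float], reward: float = 0.0):
        # Both arguments are keyword-only, as indicated by the `*` in the signature.
        self.observation = list(observation)
        self.reward = float(reward)


# Reward omitted: it falls back to the documented default of zero.
first = EnvironmentFeedback(observation=[0.12, 0.98])

# Reward supplied explicitly for the agent's previous actions.
second = EnvironmentFeedback(observation=[0.10, 0.99], reward=0.75)

print(first.reward, second.reward)
```

Because the signature starts with `*`, passing the observation positionally (for example `EnvironmentFeedback([0.12, 0.98])`) raises a `TypeError` in this sketch; both arguments must be named.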