Twitter-color

SARSA (State-Action-Reward-State-Action) is a reinforcement learning algorithm used to learn a policy that maximizes the expected return. It is an on-policy algorithm, meaning it evaluates and improves the policy that is used to make decisions. SARSA updates its action-value function based on the action taken in the current state and the next state, allowing for more exploratory behavior compared to off-policy methods like Q-learning. Common use cases include robotics, game playing, and any scenario where an agent learns to make decisions through trial and error in an environment.

AI Glosario

SARSA Algorithm

Términos relacionados

Saliency Maps

Scalable Oversight

Scaling Laws

Scatter Plot