From random actions to optimal decisions through continuous learning
Understanding the agent-environment loop, the fundamental cycle that powers reinforcement learning
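To make the cycle concrete, here is a minimal sketch of one episode of the loop. The `LineWorld` environment, its five positions, and the `run_episode` helper are hypothetical names invented for illustration; only the overall shape (the agent picks an action, the environment returns a new state and a reward) reflects the cycle described here.

```python
import random

class LineWorld:
    """Hypothetical toy environment: positions 0..4, with the goal at position 4."""

    def __init__(self):
        self.position = 0

    def reset(self):
        # Start every episode at the left end of the line.
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clamped to [0, 4].
        self.position = max(0, min(4, self.position + action))
        done = self.position == 4
        reward = 10 if done else 0  # assumed reward scheme for this sketch
        return self.position, reward, done

def run_episode(env, policy, max_steps=100):
    """Run one episode: observe state, act, receive reward, repeat."""
    state = env.reset()
    total_reward = 0
    for _ in range(max_steps):
        action = policy(state)                   # agent chooses an action
        state, reward, done = env.step(action)   # environment responds
        total_reward += reward
        if done:
            break
    return total_reward

def random_policy(state):
    # An untrained agent: ignores the state and acts at random.
    return random.choice([-1, 1])
```

Calling `run_episode(LineWorld(), random_policy)` runs the loop with an agent that has not learned anything yet; a learning agent would use the rewards it collects to change its policy between steps or episodes.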
Watch how an agent improves through episodes of trial and error
Random actions, many mistakes. The agent explores blindly.
Patterns emerge. The agent starts linking actions to rewards and repeats what worked.
Optimized strategy achieved. Performance plateaus.
Finding the balance between trying new things and using what works
At an exploration rate of 50%, the agent tries a random action half the time and exploits its best-known action the other half.
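One common way to implement this balance is epsilon-greedy action selection, sketched below. The function name `epsilon_greedy`, the `q_values` list of estimated action values, and the `rng` parameter are assumptions made for this example, not something the page itself specifies.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index.

    With probability epsilon, explore by picking a uniformly random action;
    otherwise exploit by picking the action with the highest estimated value.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Hypothetical value estimates for four actions; action 2 currently looks best.
estimates = [1.0, 0.5, 3.0, -1.0]
action = epsilon_greedy(estimates, epsilon=0.5)
```

Note that a random exploratory pick can still land on the best-known action, so with four actions and `epsilon=0.5` the best action is chosen somewhat more than half the time. Setting `epsilon=0.0` gives pure exploitation, and `epsilon=1.0` gives pure exploration.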
How positive and negative signals shape agent behavior
+10 Reward
Rewards for good actions encourage the agent to repeat them.
Example: Robot picks up the correct object
-5 Penalty
Penalties for bad actions teach the agent what to avoid.
Example: Robot bumps into wall
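As a rough sketch of how these signals shape behavior, the snippet below keeps a running value estimate per action and nudges it toward each reward received. The action names, the `update_value` helper, and the learning rate of 0.1 are illustrative assumptions; the +10 and -5 values come from the examples above.

```python
def update_value(value, reward, learning_rate=0.1):
    """Move the current estimate a small step toward the observed reward."""
    return value + learning_rate * (reward - value)

# Hypothetical actions for the robot example, both starting with no preference.
values = {"pick_correct_object": 0.0, "bump_into_wall": 0.0}

for _ in range(50):
    # The robot repeatedly experiences the +10 reward and the -5 penalty.
    values["pick_correct_object"] = update_value(values["pick_correct_object"], 10)
    values["bump_into_wall"] = update_value(values["bump_into_wall"], -5)

# The rewarded action ends up with the higher estimate, so a greedy agent prefers it.
preferred = max(values, key=values.get)
```

After enough repetitions the estimates approach +10 and -5 respectively, which is why an agent that picks the highest-valued action repeats rewarded actions and avoids penalized ones.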
Train an agent to find the best path across a grid! Place goals and traps to set the rewards, then watch it learn.
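The training behind a grid demo like this can be sketched with tabular Q-learning, one standard algorithm for such problems; the page does not say which algorithm it uses, so treat this as an illustration. The 3x3 layout, the goal and trap positions, the +10 and -5 rewards, and all hyperparameters below are assumptions chosen for the example.

```python
import random

# Hypothetical 3x3 grid: start top-left, goal bottom-right, one trap in the centre.
SIZE = 3
START, GOAL, TRAP = (0, 0), (2, 2), (1, 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Apply a move, staying inside the grid; return (next_state, reward, done)."""
    row = min(SIZE - 1, max(0, state[0] + action[0]))
    col = min(SIZE - 1, max(0, state[1] + action[1]))
    next_state = (row, col)
    if next_state == GOAL:
        return next_state, 10, True
    if next_state == TRAP:
        return next_state, -5, True
    return next_state, 0, False

def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        state = START
        for _ in range(50):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: reward plus discounted best value of the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q

def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from START and record the visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path
```

Running `greedy_path(train())` traces the route the trained agent would take. Changing `GOAL`, `TRAP`, or the reward values plays the same role as placing goals and traps on the grid.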
Reinforcement learning powers innovation across industries
The essential principles of Reinforcement Learning
Agents learn by interacting directly with their environment and receiving reward feedback on their actions.
Mistakes are valuable learning opportunities. Each penalty tells the agent what to avoid next time.
Performance improves gradually over repeated episodes, so training takes many iterations.
The reward signal is the compass. Well-designed rewards steer the agent toward the intended behavior; poorly designed ones can reward the wrong thing.
Reinforcement Learning is transforming how machines learn and make decisions. Start your journey into AI today!