This example shows how to train a model-based policy optimization (MBPO) agent to balance a cart-pole system modeled in MATLAB. For more information on MBPO agents, see Model-Based Policy Optimization Agents.
MBPO agents use an environment model to generate more experiences while training a base agent. In this example, the base agent is a soft actor-critic (SAC) agent.
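To make the idea of generating extra experiences from a model concrete, the following is a minimal conceptual sketch in base MATLAB. It is not the cart-pole environment and does not use the Reinforcement Learning Toolbox implementation. The toy 1-D linear system, the least-squares model, the random stand-in policy, and all variable names and values (`aTrue`, `bTrue`, `horizon`, `numRollouts`, and so on) are assumptions made for illustration only. Starting the model rollouts from states sampled out of the real data is one common way to do this and is also an assumption here.

```matlab
% Conceptual sketch only: a toy 1-D system, not the cart-pole environment
% and not the Reinforcement Learning Toolbox implementation.
rng(0);                                   % reproducible random numbers

% Hypothetical "true" dynamics, unknown to the learner: s' = a*s + b*u
aTrue = 0.9;
bTrue = 0.5;

% 1) Collect real experiences [s, u, s'] using a random policy.
numReal = 50;
s = randn(numReal,1);
u = randn(numReal,1);
sNext = aTrue*s + bTrue*u + 0.01*randn(numReal,1);
realBuffer = [s, u, sNext];

% 2) Fit a linear environment model by least squares.
theta = [s, u] \ sNext;                   % theta = [aHat; bHat]

% 3) Generate short model rollouts that start from sampled real states.
horizon = 3;                              % rollout length (assumed value)
numRollouts = 20;                         % number of rollouts (assumed value)
modelBuffer = zeros(numRollouts*horizon, 3);
row = 0;
for k = 1:numRollouts
    sk = realBuffer(randi(numReal), 1);   % branch from a real state
    for t = 1:horizon
        uk = randn;                       % stand-in for the base agent policy
        skNext = [sk, uk]*theta;          % next state predicted by the model
        row = row + 1;
        modelBuffer(row,:) = [sk, uk, skNext];
        sk = skNext;
    end
end

% 4) A base agent would then train on real and generated experiences.
trainingData = [realBuffer; modelBuffer];
fprintf("Real: %d, generated: %d, total: %d\n", ...
    size(realBuffer,1), size(modelBuffer,1), size(trainingData,1));
```

In this sketch, 50 real transitions are supplemented by 60 model-generated ones, so the base agent would see 110 samples without any extra interaction with the real system. Keeping `horizon` short limits how far model prediction errors can compound along a rollout.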