A pole is attached by an un-actuated joint to a cart, which moves along a frictionless track. The pendulum starts upright, and the goal is to prevent it from falling over by increasing and reducing the cart's velocity.
At every step, the agent observes only the cart position, cart velocity, pole angle, and pole velocity. It can take one of two actions: push left or push right.
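To make the observation and action spaces concrete, here is a minimal sketch of how the four observed values and the two actions could be represented. It assumes the ordering and action encoding of the standard Gym CartPole environment (0 = push left, 1 = push right); the names `CartPoleObservation`, `PUSH_LEFT`, `PUSH_RIGHT`, and `naive_policy` are hypothetical and do not come from `dqn.py`. The policy shown is a hand-written baseline, not the trained DQN.

```python
from typing import NamedTuple, Sequence


class CartPoleObservation(NamedTuple):
    """The four values the agent sees at each step (assumed Gym ordering)."""
    cart_position: float
    cart_velocity: float
    pole_angle: float
    pole_velocity: float


# Assumed action encoding, matching the standard Gym CartPole environment.
PUSH_LEFT = 0
PUSH_RIGHT = 1


def naive_policy(obs: Sequence[float]) -> int:
    """Hand-written baseline: push the cart toward the side the pole leans to.

    This is only an illustration of the observation -> action mapping;
    a DQN agent would instead pick the action with the highest predicted Q-value.
    """
    state = CartPoleObservation(*obs)
    return PUSH_RIGHT if state.pole_angle > 0 else PUSH_LEFT
```

For example, `naive_policy([0.0, 0.0, 0.05, 0.0])` selects `PUSH_RIGHT`, because a positive pole angle means the pole is leaning to the right under this convention.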
The following commands train and test the model:
To train the model:
python dqn.py train --itr 200 --capacity 10000 --batch 80 --save True --plot True
To run with pre-trained weights:
python dqn.py test
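The `--capacity` and `--batch` flags in the training command most likely set the size of the experience replay memory and the minibatch size sampled from it, as is usual for DQN. That reading is an assumption, and the sketch below is not the code in `dqn.py`; the class name `ReplayBuffer` and its methods are hypothetical. It shows one common way to implement a fixed-capacity buffer that discards the oldest transitions once full.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

# One stored experience: (state, action, reward, next_state, done).
Transition = Tuple[tuple, int, float, tuple, bool]


class ReplayBuffer:
    """Fixed-size experience replay memory (hypothetical sketch)."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen drops the oldest item when a new one is appended
        # to a full buffer.
        self.memory: Deque[Transition] = deque(maxlen=capacity)

    def push(self, state: tuple, action: int, reward: float,
             next_state: tuple, done: bool) -> None:
        """Store one transition."""
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int) -> List[Transition]:
        """Draw a uniformly random minibatch without replacement."""
        return random.sample(list(self.memory), batch_size)

    def __len__(self) -> int:
        return len(self.memory)


# Values taken from the training command: --capacity 10000 --batch 80
buffer = ReplayBuffer(capacity=10000)
for step in range(10050):
    # Dummy transitions for illustration only.
    buffer.push((float(step),), step % 2, 1.0, (float(step + 1),), False)

batch = buffer.sample(80)
```

Sampling uniformly from a large buffer breaks the correlation between consecutive steps, which is the usual reason DQN training uses a replay memory instead of learning from each transition as it arrives.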
Reward Plot:
The obtained result: