RL-tictactoe

I have implemented two algorithms for tic-tac-toe: TD(0) and Q-learning. For the final question, only TD(0) is needed, since it performs model-free prediction. Because the message in Slack arrived late, I had already implemented self-play, so the agent learns a final value for each state.
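As a rough illustration of how the two methods differ, the sketch below shows one update step for each. It is not taken from `TDAgent.py` or `QAgent.py`: the function names, the string state encoding, and the `alpha` and `gamma` defaults are all assumptions made for this example.

```python
from collections import defaultdict


def td0_update(values, state, next_state, reward,
               alpha=0.1, gamma=1.0, terminal=False):
    """One TD(0) prediction step: V(s) += alpha * (r + gamma * V(s') - V(s)).

    `values` maps a state key to its estimated value. When `next_state`
    is terminal, its value is taken to be 0, so the target is the reward.
    """
    target = reward if terminal else reward + gamma * values[next_state]
    values[state] += alpha * (target - values[state])
    return values[state]


def q_learning_update(q, state, action, reward, next_state, next_actions,
                      alpha=0.1, gamma=1.0):
    """One Q-learning control step using the greedy value of the next state.

    `q` maps (state, action) pairs to values. An empty `next_actions`
    means the next state is terminal, so the bootstrap term is 0.
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Hypothetical encoding: the board as a 9-character string, "." for empty.
values = defaultdict(float)
td0_update(values, "XX.OO....", "XXXOO....", reward=1.0, terminal=True)
```

TD(0) only estimates state values under whatever policy generated the moves, which is why it suffices for a prediction question; Q-learning additionally learns action values and so can be used to choose moves.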

Usage

See the help output for full usage details:

python3 main.py -h

Play games against the trained agent:

python3 main.py -p

Show the first three state values (used for the final question):

python3 main.py -o
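For readers unfamiliar with this style of command line, the sketch below shows one way the `-p` and `-o` flags could be wired up with `argparse`. It is an assumption about the general shape only, not the contents of `main.py`; the `build_parser` name and the help strings are made up for this example. The `-h` flag is provided by `argparse` automatically.

```python
import argparse


def build_parser():
    """Build a parser with the two boolean flags listed above (hypothetical wiring)."""
    parser = argparse.ArgumentParser(description="Tic-tac-toe RL agents")
    parser.add_argument("-p", action="store_true",
                        help="play games against the trained agent")
    parser.add_argument("-o", action="store_true",
                        help="show the first three state values")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["-p"])
if args.p:
    print("would start an interactive game here")
if args.o:
    print("would print the first three state values here")
```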
