Multi-armed bandit algorithms. Research done at Dr. Ji's SNAIL lab @ Virginia Tech.
Implements several bandit algorithms that address the exploration-vs-exploitation trade-off (see the sketch after the list):
- Epsilon-greedy: Explores a uniformly random arm with probability ε, otherwise exploits the arm with the highest estimated reward
- UCB: Selects the arm with the highest upper confidence bound on its estimated reward, giving rarely tried arms an exploration bonus
- Thompson Sampling: Bayesian method that samples from each arm's posterior reward distribution and plays the arm with the best sample
- Contextual bandits: Uses side information (context features) observed at each round to choose actions
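For orientation, here is a minimal, self-contained sketch of the selection rules for the first three algorithms on a simulated Bernoulli bandit. It is illustrative only; the function names, bandit setup, and hyperparameters (ε = 0.1, Beta(1, 1) priors) are assumptions and not the code in this repository.

```python
# Sketch of epsilon-greedy, UCB1, and Thompson sampling on a Bernoulli bandit.
# Names, setup, and hyperparameters are illustrative, not this repo's API.
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.7])   # hypothetical arm reward probabilities
n_arms = len(true_means)

def pull(arm):
    """Simulate one Bernoulli reward from the chosen arm."""
    return float(rng.random() < true_means[arm])

def run(select, n_steps=2000):
    """Run one strategy; `select` maps (counts, sums, t) -> arm index."""
    counts = np.zeros(n_arms)   # number of pulls per arm
    sums = np.zeros(n_arms)     # total reward per arm
    total = 0.0
    for t in range(1, n_steps + 1):
        arm = select(counts, sums, t)
        r = pull(arm)
        counts[arm] += 1
        sums[arm] += r
        total += r
    return total / n_steps

def eps_greedy(counts, sums, t, eps=0.1):
    # Explore a random arm with probability eps, otherwise exploit the best estimate.
    if rng.random() < eps or counts.min() == 0:
        return int(rng.integers(n_arms))
    return int(np.argmax(sums / counts))

def ucb1(counts, sums, t):
    # Pull each arm once, then pick the largest mean + confidence bonus.
    if counts.min() == 0:
        return int(np.argmin(counts))
    means = sums / counts
    bonus = np.sqrt(2 * np.log(t) / counts)
    return int(np.argmax(means + bonus))

def thompson(counts, sums, t):
    # Sample from each arm's Beta posterior and play the best sample.
    samples = rng.beta(1 + sums, 1 + counts - sums)
    return int(np.argmax(samples))

for name, strat in [("eps-greedy", eps_greedy), ("UCB1", ucb1), ("Thompson", thompson)]:
    print(f"{name}: average reward {run(strat):.3f}")
```

Each strategy should converge toward the best arm's mean reward (0.7 here); the repository's implementations also track regret and plot results with matplotlib.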
Dependencies: numpy, matplotlib, scipy