GitHub - BigR-Lab/sklearn-deeprl: Deep reinforcement learning. In scikit-learn. In less than 50 effective lines.

Deep reinforcement learning. In scikit-learn. In less than 50 effective lines.


Currently, both demos use the vanilla cross-entropy (CE) method with a policy approximated by a neural network. For RL, it boils down to repeating:

Generate N games
Take M best
Fit to those M best samples
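
The three steps above can be sketched in plain Python. This is a minimal illustration, not the code from the notebooks: it assumes a made-up corridor environment (start at cell 0, reach the last cell, each step costs 1) and swaps the neural network for a tabular policy so that it runs without dependencies. All names (`play_game`, `take_best`, `fit`, `cross_entropy_method`) are hypothetical.

```python
import random

N_STATES, N_ACTIONS, MAX_STEPS = 5, 2, 20  # toy corridor; actions: 0 = left, 1 = right


def play_game(policy, rng):
    """Play one game; return (visited states, actions taken, total reward)."""
    state, states, actions, total = 0, [], [], 0.0
    for _ in range(MAX_STEPS):
        action = rng.choices(range(N_ACTIONS), weights=policy[state])[0]
        states.append(state)
        actions.append(action)
        state = max(0, state - 1) if action == 0 else state + 1
        total -= 1.0  # every step costs 1, so shorter games score higher
        if state == N_STATES - 1:  # reached the goal cell
            break
    return states, actions, total


def take_best(games, m):
    """Keep the M games with the highest total reward."""
    return sorted(games, key=lambda game: game[2], reverse=True)[:m]


def fit(elite, smoothing=0.1):
    """Re-estimate action probabilities from the elite (state, action) pairs."""
    counts = [[smoothing] * N_ACTIONS for _ in range(N_STATES)]
    for states, actions, _ in elite:
        for s, a in zip(states, actions):
            counts[s][a] += 1
    return [[c / sum(row) for c in row] for row in counts]


def cross_entropy_method(n_games=100, m_best=20, n_iterations=20, seed=0):
    rng = random.Random(seed)
    # Start from a uniform random policy.
    policy = [[1.0 / N_ACTIONS] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(n_iterations):
        games = [play_game(policy, rng) for _ in range(n_games)]  # generate N games
        elite = take_best(games, m_best)                          # take M best
        policy = fit(elite)                                       # fit to those M
    return policy


policy = cross_entropy_method()
print([round(row[1], 3) for row in policy])  # probability of "right" per cell
```

The smoothing term keeps every action probability above zero, so the policy never stops exploring entirely; with a neural network policy, the `fit` step becomes a supervised training pass on the elite samples instead.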

The CE method is a very general approach to approximate estimation and maximization tasks; you can read about it here. For reinforcement learning, we use the optimization version, which basically tries to fit the agent to generate games where the reward is high. More on that here.
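
One common way to write the optimization version down is shown below. The notation is an assumption made for illustration and does not come from the README: $\pi_\theta$ is the policy, $\tau_i$ a sampled game, $R(\tau_i)$ its total reward, and $\gamma_t$ the reward threshold that separates the $M$ best games.

```latex
\begin{align}
\tau_1, \dots, \tau_N &\sim \pi_{\theta_t} \\
\gamma_t &= \text{$M$-th largest value among } R(\tau_1), \dots, R(\tau_N) \\
\theta_{t+1} &= \arg\max_{\theta} \sum_{i=1}^{N}
  \mathbb{1}\left[ R(\tau_i) \ge \gamma_t \right]
  \sum_{(s,a) \in \tau_i} \log \pi_\theta(a \mid s)
\end{align}
```

The last line is a maximum-likelihood fit to the elite games, which is the same as minimizing the cross-entropy between the actions taken in those games and the policy's predictions.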

While this approach falls flat in some cases, and it takes black magic to make it work with infinite MDPs or long sessions, it still works unreasonably well in most cases. Another awesome trait is that it extends effortlessly to policy approximation (e.g. deep RL), partially observable MDPs, and all kinds of weird stuff you see in the wild.
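
With policy approximation, the "fit" step becomes ordinary supervised learning: states are the inputs and the actions taken in the best games are the labels. Here is a minimal sketch, assuming scikit-learn's `MLPClassifier` as the policy and synthetic elite data. The arrays, the layer sizes, and the `sample_action` helper are made up for illustration and are not taken from the notebooks.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

N_ACTIONS = 2
rng = np.random.default_rng(0)

# Synthetic stand-in for elite samples: 2-D states, where the "good" action
# is 1 whenever the first state feature is positive (an assumption for the demo).
elite_states = rng.normal(size=(200, 2))
elite_actions = (elite_states[:, 0] > 0).astype(int)

# The policy is a small neural network classifier: state -> action probabilities.
agent = MLPClassifier(hidden_layer_sizes=(20, 20), random_state=0)

# partial_fit does one pass over the data per call; the full list of classes
# is required on the first call so that every action gets an output.
for _ in range(50):
    agent.partial_fit(elite_states, elite_actions, classes=np.arange(N_ACTIONS))


def sample_action(agent, state, rng):
    """Sample an action from the probabilities the policy predicts for a state."""
    probs = agent.predict_proba(state.reshape(1, -1))[0]
    return int(rng.choice(N_ACTIONS, p=probs))


action = sample_action(agent, elite_states[0], rng)
print(action)
```

Sampling from `predict_proba` rather than taking the most likely action keeps the generated games diverse, which the selection step relies on.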

If you want something heavier, take a look at agentnet.

Repository files:

Dockerfile
README.md
Sklearn_CEM_Cartpole.ipynb
Sklearn_CEM_LunarLander.ipynb
cartpole.mp4
lunarlander1.mp4
lunarlander2.mp4
xvfb
