Runners ======= A "runner" manages the interaction between the Environment and the Agent. TensorForce comes with ready-to-use runners. Of course, you can implement your own runners, too. If you are not using simulation environments, the runner is simply your application code using the Agent API. > Environment <-> Runner <-> Agent <-> Model Ready-to-use runners -------------------- We implemented a standard runner, a threaded runner (for real-time interaction e.g. with OpenAI Universe) and a distributed runner for A3C variants. ### Runner This is the standard runner. It requires an agent and an environment for initialization: ```python from tensorforce.execution import Runner runner = Runner( agent = agent, # Agent object environment = env # Environment object ) ``` A reinforcement learning agent observes states from the environment, selects actions and collect experience which is used to update its model and improve action selection. You can get information about our ready-to-use agents [here](agents_models.html). The environment object is either the "real" environment, or a proxy which fulfills the actions selected by the agent in the real world. You can find information about environments [here](environments.html). The runner is started with the `Runner.run(...)` method: ```python runner.run( episodes = int, # number of episodes to run max_timesteps = int, # maximum timesteps per episode episode_finished = object, # callback function called when episode is finished ) runner.close() ``` You can use the episode\_finished callback for printing performance feedback: ```python def episode_finished(r): if r.episode % 10 == 0: print("Finished episode {ep} after {ts} timesteps".format(ep=r.episode + 1, ts=r.timestep + 1)) print("Episode reward: {}".format(r.episode_rewards[-1])) print("Average of last 10 rewards: {}".format(np.mean(r.episode_rewards[-10:]))) return True ``` #### Using the Runner Here is some example code for using the runner (without preprocessing). ```python import logging from tensorforce.contrib.openai_gym import OpenAIGym from tensorforce.agents import DQNAgent from tensorforce.execution import Runner def main(): gym_id = 'CartPole-v0' max_episodes = 10000 max_timesteps = 1000 env = OpenAIGym(gym_id) network_spec = [ dict(type='dense', size=32, activation='tanh'), dict(type='dense', size=32, activation='tanh') ] agent = DQNAgent( states_spec=env.states, actions_spec=env.actions, network_spec=network_spec, batch_size=64 ) runner = Runner(agent, env) report_episodes = 10 def episode_finished(r): if r.episode % report_episodes == 0: logging.info("Finished episode {ep} after {ts} timesteps".format(ep=r.episode, ts=r.timestep)) logging.info("Episode reward: {}".format(r.episode_rewards[-1])) logging.info("Average of last 100 rewards: {}".format(sum(r.episode_rewards[-100:]) / 100)) return True print("Starting {agent} for Environment '{env}'".format(agent=agent, env=env)) runner.run(max_episodes, max_timesteps, episode_finished=episode_finished) runner.close() print("Learning finished. Total episodes: {ep}".format(ep=runner.episode)) if __name__ == '__main__': main() ``` Building your own runner ------------------------ There are three mandatory tasks any runner implements: Obtaining an action from the agent, passing it to the environment, and passing the resulting observation to the agent. ```python # Get action action = agent.act(state) # Execute action in the environment state, reward, terminal_state = environment.execute(action) # Pass observation to the agent agent.observe(state, action, reward, terminal_state) ``` The key idea here is the separation of concerns. External code should not need to manage batches or remember network features, this is that the agent is for. Conversely, an agent need not concern itself with how a model is implemented and the API should facilitate easy combination of different agents and models. If you would like to build your own runner, it is probably a good idea to take a look at the [source code of our Runner class](https://github.com/reinforceio/tensorforce/blob/master/tensorforce/execution/runner.py).