qualia2.rl package

Submodules

qualia2.rl.memory module

class qualia2.rl.memory.Experience(state, next, reward, action, done)

Bases: tuple

property action

Alias for field number 3

property done

Alias for field number 4

property next

Alias for field number 1

property reward

Alias for field number 2

property state

Alias for field number 0
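
Experience is a namedtuple over (state, next, reward, action, done), as the field aliases above indicate. A minimal sketch of packing and unpacking one transition (the values are placeholders):

    from qualia2.rl.memory import Experience

    # Field order follows the aliases above.
    exp = Experience(state=[0.0, 0.1], next=[0.1, 0.2],
                     reward=1.0, action=0, done=False)

    # As a tuple subclass, it also unpacks positionally.
    state, next_state, reward, action, done = exp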

class qualia2.rl.memory.PrioritizedMemory(maxlen, alpha=0.6, beta_start=0.4, beta_frames=100000)[source]

Bases: collections.deque

append(experience)[source]

Add an experience to the right side of the deque.

beta()[source]
sample(batch_size, steps=1)[source]
update_priorities(idx, priorities)[source]
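
The alpha/beta parameters suggest prioritized experience replay in the style of Schaul et al., where transition i is drawn with probability P(i) proportional to p_i^alpha and corrected by importance weights w_i = (N * P(i))^(-beta), with beta() annealing from beta_start toward 1 over beta_frames; that reading is an inference from the signatures, not a documented contract. A usage sketch (the return format of sample() is assumed):

    from qualia2.rl.memory import Experience, PrioritizedMemory

    memory = PrioritizedMemory(maxlen=10000, alpha=0.6,
                               beta_start=0.4, beta_frames=100000)
    memory.append(Experience([0.0], [0.1], 1.0, 0, False))

    # Assumption: sample() also yields the sampled indices (and weights)
    # so TD errors can be written back as new priorities:
    # batch, idx, weights = memory.sample(batch_size=32)
    # memory.update_priorities(idx, new_priorities)
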
class qualia2.rl.memory.ReplayMemory(maxlen)[source]

Bases: collections.deque

sample(batch_size, steps=1)[source]
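
Because ReplayMemory subclasses collections.deque, transitions are added with the inherited append() and the oldest entries are evicted once maxlen is reached. A minimal sketch (the exact return format of sample() is assumed):

    from qualia2.rl.memory import Experience, ReplayMemory

    memory = ReplayMemory(maxlen=10000)
    memory.append(Experience([0.0], [0.1], 1.0, 0, False))  # append() inherited from deque
    if len(memory) >= 32:
        batch = memory.sample(batch_size=32)  # random minibatch of Experiences (assumed)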

qualia2.rl.rl_core module

class qualia2.rl.rl_core.ActorCriticAgent(actor, critic)[source]

Bases: qualia2.rl.rl_core.BaseAgent

Base class for actor-critic agents. Some methods need to be overridden.

Args:

    actor (Module): actor network
    critic (Module): critic network

classmethod init(env, actor, critic)[source]
load(filename)[source]
load_actor(filename)[source]
policy(observation, *args, eps=None)[source]
save(filename)[source]
set_actor_optim(optim, **kwargs)[source]
set_critic_optim(optim, **kwargs)[source]
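
A wiring sketch; actor and critic are Module networks per the Args above, while the optimizer class and its keyword arguments (SomeOptim, lr) are illustrative placeholders rather than a confirmed qualia2 API:

    from qualia2.rl.rl_core import ActorCriticAgent, Env

    env = Env('Pendulum-v0')                           # task name, per Env's Args
    agent = ActorCriticAgent.init(env, actor, critic)  # actor/critic: Module networks

    agent.set_actor_optim(SomeOptim, lr=1e-4)   # optimizer and kwargs are placeholders
    agent.set_critic_optim(SomeOptim, lr=1e-3)

    agent.save('ac_agent')        # persist the agent (both networks, presumably)
    agent.load_actor('ac_agent')  # reload just the actor, e.g. for evaluation
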
class qualia2.rl.rl_core.BaseAgent(actions, model)[source]

Bases: object

Base class for agents. Some methods need to be overridden.

Args:

    actions (list): list of actions
    model (Module): model network

get_train_signal(experience, gamma=0.9)[source]
classmethod init(env, model)[source]
load(filename)[source]
play(env, render=True, filename=None)[source]
policy(observation, *args)[source]
save(filename)[source]
set_optim(optim, **kwargs)[source]
update(state_action_value, target_action_value, loss_func=<qualia2.functions.loss.MSELoss object>)[source]
update_target_model()[source]
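
Since some methods need to be overridden, a concrete agent typically fills in at least policy() and get_train_signal(). A skeleton (the return contract of get_train_signal() is inferred from the update() signature above):

    from qualia2.rl.rl_core import BaseAgent

    class MyAgent(BaseAgent):
        def policy(self, observation, *args):
            # map an observation to one of self.actions using self.model
            raise NotImplementedError

        def get_train_signal(self, experience, gamma=0.9):
            # compute (state_action_value, target_action_value) for update();
            # this pairing is an assumption based on update()'s parameters
            raise NotImplementedError
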
class qualia2.rl.rl_core.Env(env)[source]

Bases: object

Wrapper class of gym.Env for reinforcement learning.

Args:

    env (str): task name

property action_space
animate(frames, filename)[source]
close()[source]
property max_steps
property observation_space
render(**kwargs)[source]
reset()[source]
reward_transformer(reward)[source]
show(filename=None)[source]
state_transformer(state)[source]
step(action)[source]
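
Env is constructed from a task-name string. A rollout sketch, assuming step() mirrors the classic gym contract of (state, reward, done, info) after state_transformer/reward_transformer are applied, and that action_space exposes the wrapped gym space:

    from qualia2.rl.rl_core import Env

    env = Env('CartPole-v0')        # task name string
    state = env.reset()
    for _ in range(env.max_steps):
        action = env.action_space.sample()            # assumed gym space
        state, reward, done, info = env.step(action)  # gym-style return (assumed)
        if done:
            break
    env.close()
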
class qualia2.rl.rl_core.PolicyAgent(actions, model)[source]

Bases: qualia2.rl.rl_core.BaseAgent

Base class for policy-based agents. Some methods need to be overridden.

policy(observation, *args, eps=None)[source]
class qualia2.rl.rl_core.ValueAgent(actions, model)[source]

Bases: qualia2.rl.rl_core.BaseAgent

Base class for value-based agents. Some methods need to be overridden.

policy(observation, *args, eps=None)[source]
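
For both PolicyAgent and ValueAgent, policy() accepts an optional eps; for a value-based agent this most naturally reads as epsilon-greedy exploration, though that is an inference from the signature. A sketch (model is a placeholder Module):

    from qualia2.rl.rl_core import Env, ValueAgent

    env = Env('CartPole-v0')
    agent = ValueAgent.init(env, model)    # model: a qualia2 Module (placeholder)
    state = env.reset()
    action = agent.policy(state, eps=0.1)  # eps presumably sets the exploration rate
    greedy = agent.policy(state, eps=0.0)  # eps=0 -> purely greedy (assumed)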

qualia2.rl.rl_util module

class qualia2.rl.rl_util.Trainer(memory, batch, capacity, gamma)[source]

Bases: object

Trainer for RL agents.

Args:

    memory (deque): replay memory object
    batch (int): batch size for training
    capacity (int): capacity of the memory
    gamma (float): discount factor

after_episode(episode, steps, agent, loss, reward, filename=None)[source]
after_train()[source]
before_episode(env, agent)[source]
before_train(env, agent)[source]
property defaults
experience_replay(episode, step_count, agent)[source]
load_settings(defaults)[source]
plot(filename=None)[source]
train(env, agent, episodes=200, render=False, filename=None)[source]
train_routine(env, agent, episodes=200, render=False, filename=None)[source]
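
An end-to-end sketch. The Args above call memory a "replay memory object" yet also take a separate capacity, so this sketch assumes the Trainer instantiates the given memory class with that capacity; that is an inference, not a documented contract:

    from qualia2.rl.memory import ReplayMemory
    from qualia2.rl.rl_core import Env, ValueAgent
    from qualia2.rl.rl_util import Trainer

    env = Env('CartPole-v0')
    agent = ValueAgent.init(env, model)   # model: placeholder Module

    trainer = Trainer(memory=ReplayMemory, batch=32, capacity=10000, gamma=0.99)
    trainer.train(env, agent, episodes=200, render=False, filename='dqn_cartpole')
    trainer.plot('learning_curve')        # save the training curve (assumed)
    agent.play(env, filename='result')    # roll out the trained agent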

Module contents