Sequential trainer

Train agents sequentially (i.e., one after the other in each interaction with the environment).



Concept

[Figure: Sequential trainer — agents interact with the environment one after the other at each timestep]

Usage

from skrl.trainers.torch import SequentialTrainer

# assuming there is an environment called 'env'
# and an agent or a list of agents called 'agents'

# create a sequential trainer
cfg = {"timesteps": 50000, "headless": False}
trainer = SequentialTrainer(env=env, agents=agents, cfg=cfg)

# train the agent(s)
trainer.train()

# evaluate the agent(s)
trainer.eval()
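
When training multiple agents in the same vectorized environment, the agents_scope argument splits the parallel sub-environments among them. A minimal sketch, assuming 'agent_a' and 'agent_b' are instantiated agents and 'env' wraps 512 parallel sub-environments (hypothetical numbers):

from skrl.trainers.torch import SequentialTrainer

# hypothetical setup: 'env' wraps 512 parallel sub-environments
cfg = {"timesteps": 50000, "headless": False}
trainer = SequentialTrainer(env=env,
                            agents=[agent_a, agent_b],
                            agents_scope=[256, 256],  # sub-environments assigned to each agent
                            cfg=cfg)

# each agent is trained, one after the other per timestep, on its own scope
trainer.train()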

Configuration

SEQUENTIAL_TRAINER_DEFAULT_CONFIG = {
    "timesteps": 100000,            # number of timesteps to train for
    "headless": False,              # whether to use headless mode (no rendering)
    "disable_progressbar": False,   # whether to disable the progressbar. If None, disable on non-TTY
    "close_environment_at_exit": True,   # whether to close the environment on normal program termination
}
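
Only the keys to be overridden need to appear in the dictionary passed as cfg; unspecified keys keep the default values above. For example (assuming 'env' and 'agents' as in the usage snippet):

# override only selected keys; the trainer falls back to the defaults for the rest
cfg = {"timesteps": 20000, "disable_progressbar": True}
trainer = SequentialTrainer(env=env, agents=agents, cfg=cfg)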

API (PyTorch)

skrl.trainers.torch.sequential.SEQUENTIAL_TRAINER_DEFAULT_CONFIG

alias of {'close_environment_at_exit': True, 'disable_progressbar': False, 'headless': False, 'timesteps': 100000}

class skrl.trainers.torch.sequential.SequentialTrainer(env: Wrapper, agents: Agent | List[Agent], agents_scope: List[int] | None = None, cfg: dict | None = None)

Bases: Trainer

__init__(env: Wrapper, agents: Agent | List[Agent], agents_scope: List[int] | None = None, cfg: dict | None = None) → None

Sequential trainer

Train agents sequentially (i.e., one after the other in each interaction with the environment)

Parameters:
  • env (skrl.envs.wrappers.torch.Wrapper) – Environment to train on

  • agents (Union[Agent, List[Agent]]) – Agents to train

  • agents_scope (tuple or list of int, optional) – Number of environments for each agent to train on (default: None)

  • cfg (dict, optional) – Configuration dictionary (default: None). See SEQUENTIAL_TRAINER_DEFAULT_CONFIG for default values

eval() → None

Evaluate the agents sequentially

This method executes the following steps in a loop:

  • Compute actions (sequentially)

  • Interact with the environments

  • Render scene

  • Reset environments
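
A simplified single-agent sketch of this evaluation loop, assuming 'agent', 'env' and 'timesteps' are already defined (illustrative only, not the actual implementation):

import torch

states, infos = env.reset()
for timestep in range(timesteps):
    with torch.no_grad():
        # compute actions
        actions = agent.act(states, timestep=timestep, timesteps=timesteps)[0]
        # interact with the environments
        next_states, rewards, terminated, truncated, infos = env.step(actions)
        # render scene
        env.render()
    # reset environments (vectorized: reset when any sub-environment finishes)
    if terminated.any() or truncated.any():
        states, infos = env.reset()
    else:
        states = next_states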

multi_agent_eval() → None

Evaluate multi-agents

This method executes the following steps in a loop:

  • Compute actions (sequentially)

  • Interact with the environments

  • Render scene

  • Reset environments

multi_agent_train() → None

Train multi-agents

This method executes the following steps in a loop:

  • Pre-interaction

  • Compute actions

  • Interact with the environments

  • Render scene

  • Record transitions

  • Post-interaction

  • Reset environments

single_agent_eval() → None

Evaluate a single agent

This method executes the following steps in a loop:

  • Compute actions (sequentially)

  • Interact with the environments

  • Render scene

  • Reset environments

single_agent_train() → None

Train a single agent

This method executes the following steps in a loop:

  • Pre-interaction

  • Compute actions

  • Interact with the environments

  • Render scene

  • Record transitions

  • Post-interaction

  • Reset environments

train() → None

Train the agents sequentially

This method executes the following steps in a loop:

  • Pre-interaction (sequentially)

  • Compute actions (sequentially)

  • Interact with the environments

  • Render scene

  • Record transitions (sequentially)

  • Post-interaction (sequentially)

  • Reset environments
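
A simplified single-agent sketch of this training loop, assuming 'agent', 'env' and 'timesteps' are already defined (illustrative only, not the actual implementation):

import torch

states, infos = env.reset()
for timestep in range(timesteps):
    # pre-interaction (e.g., update schedules before acting)
    agent.pre_interaction(timestep=timestep, timesteps=timesteps)
    with torch.no_grad():
        # compute actions
        actions = agent.act(states, timestep=timestep, timesteps=timesteps)[0]
        # interact with the environments
        next_states, rewards, terminated, truncated, infos = env.step(actions)
        # render scene
        env.render()
        # record transitions in the agent's memory
        agent.record_transition(states=states, actions=actions, rewards=rewards,
                                next_states=next_states, terminated=terminated,
                                truncated=truncated, infos=infos,
                                timestep=timestep, timesteps=timesteps)
    # post-interaction (e.g., update the agent's policy)
    agent.post_interaction(timestep=timestep, timesteps=timesteps)
    # reset environments
    if terminated.any() or truncated.any():
        states, infos = env.reset()
    else:
        states = next_states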


API (JAX)

skrl.trainers.jax.sequential.SEQUENTIAL_TRAINER_DEFAULT_CONFIG

alias of {'close_environment_at_exit': True, 'disable_progressbar': False, 'headless': False, 'timesteps': 100000}

class skrl.trainers.jax.sequential.SequentialTrainer(env: Wrapper, agents: Agent | List[Agent], agents_scope: List[int] | None = None, cfg: dict | None = None)

Bases: Trainer

__init__(env: Wrapper, agents: Agent | List[Agent], agents_scope: List[int] | None = None, cfg: dict | None = None) → None

Sequential trainer

Train agents sequentially (i.e., one after the other in each interaction with the environment)

Parameters:
  • env (skrl.envs.wrappers.jax.Wrapper) – Environment to train on

  • agents (Union[Agent, List[Agent]]) – Agents to train

  • agents_scope (tuple or list of int, optional) – Number of environments for each agent to train on (default: None)

  • cfg (dict, optional) – Configuration dictionary (default: None). See SEQUENTIAL_TRAINER_DEFAULT_CONFIG for default values

eval() → None

Evaluate the agents sequentially

This method executes the following steps in a loop:

  • Compute actions (sequentially)

  • Interact with the environments

  • Render scene

  • Reset environments

multi_agent_eval() → None

Evaluate multi-agents

This method executes the following steps in a loop:

  • Compute actions (sequentially)

  • Interact with the environments

  • Render scene

  • Reset environments

multi_agent_train() → None

Train multi-agents

This method executes the following steps in a loop:

  • Pre-interaction

  • Compute actions

  • Interact with the environments

  • Render scene

  • Record transitions

  • Post-interaction

  • Reset environments

single_agent_eval() → None

Evaluate a single agent

This method executes the following steps in a loop:

  • Compute actions (sequentially)

  • Interact with the environments

  • Render scene

  • Reset environments

single_agent_train() → None

Train a single agent

This method executes the following steps in a loop:

  • Pre-interaction

  • Compute actions

  • Interact with the environments

  • Render scene

  • Record transitions

  • Post-interaction

  • Reset environments

train() → None

Train the agents sequentially

This method executes the following steps in a loop:

  • Pre-interaction (sequentially)

  • Compute actions (sequentially)

  • Interact with the environments

  • Render scene

  • Record transitions (sequentially)

  • Post-interaction (sequentially)

  • Reset environments
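
The JAX trainer mirrors the PyTorch API; only the import changes. A minimal sketch, assuming a JAX-wrapped environment 'env' and an agent (or list of agents) 'agents':

from skrl.trainers.jax import SequentialTrainer

cfg = {"timesteps": 50000, "headless": False}
trainer = SequentialTrainer(env=env, agents=agents, cfg=cfg)

trainer.train()   # or trainer.eval() to evaluate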