Parallel trainer¶
Train agents in parallel using multiple processes.
Concept¶
Each agent, together with the scope of environments assigned to it, runs in its own process, while the main process steps the environment and exchanges the corresponding observations and actions with the agent processes.
Usage¶
Note
Each process adds a GPU memory overhead (~1 GB, although it can be much higher) due to PyTorch’s CUDA kernels. See PyTorch Issue #12873 for more details.
Note
At the moment, only simultaneous training and evaluation of agents with local memory (no memory sharing) are implemented.
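In practice this means constructing one memory instance per agent rather than sharing a single one. A minimal sketch, assuming skrl's RandomMemory and an illustrative memory size:

from skrl.memories.torch import RandomMemory

# one memory per agent: instances must not be shared between agents
memory_a = RandomMemory(memory_size=1000, num_envs=env.num_envs, device=env.device)
memory_b = RandomMemory(memory_size=1000, num_envs=env.num_envs, device=env.device)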
from skrl.trainers.torch import ParallelTrainer
# assuming there is an environment called 'env'
# and an agent or a list of agents called 'agents'
# create a parallel trainer
cfg = {"timesteps": 50000, "headless": False}
trainer = ParallelTrainer(env=env, agents=agents, cfg=cfg)
# train the agent(s)
trainer.train()
# evaluate the agent(s)
trainer.eval()
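Because the parallel trainer spawns worker processes, scripts should typically protect their entry point so that child processes can re-import the module safely. This is a general requirement of Python's spawn start method rather than anything skrl-specific; a minimal sketch:

# guard the entry point when worker processes are spawned
if __name__ == "__main__":
    cfg = {"timesteps": 50000, "headless": False}
    trainer = ParallelTrainer(env=env, agents=agents, cfg=cfg)
    trainer.train()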
Configuration¶
PARALLEL_TRAINER_DEFAULT_CONFIG = {
"timesteps": 100000, # number of timesteps to train for
"headless": False, # whether to use headless mode (no rendering)
"disable_progressbar": False, # whether to disable the progressbar. If None, disable on non-TTY
"close_environment_at_exit": True, # whether to close the environment on normal program termination
"environment_info": "episode", # key used to get and log environment info
"stochastic_evaluation": False, # whether to use actions rather than (deterministic) mean actions during evaluation
}
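Keys omitted from the cfg dictionary passed to the trainer fall back to these defaults. To make the effective configuration explicit, one option (a sketch using the constant's module path from the API section below) is to copy the defaults and override only selected keys:

from copy import deepcopy
from skrl.trainers.torch.parallel import PARALLEL_TRAINER_DEFAULT_CONFIG

# start from the defaults and override only the desired keys
cfg = deepcopy(PARALLEL_TRAINER_DEFAULT_CONFIG)
cfg.update({"timesteps": 50000, "disable_progressbar": True})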
API (PyTorch)¶
- skrl.trainers.torch.parallel.PARALLEL_TRAINER_DEFAULT_CONFIG¶
alias of {'close_environment_at_exit': True, 'disable_progressbar': False, 'environment_info': 'episode', 'headless': False, 'stochastic_evaluation': False, 'timesteps': 100000}
- class skrl.trainers.torch.parallel.ParallelTrainer(env: Wrapper, agents: Agent | List[Agent], agents_scope: List[int] | None = None, cfg: dict | None = None)¶
Bases: Trainer
Parallel trainer
Train agents in parallel using multiple processes
- Parameters:
env (skrl.envs.wrappers.torch.Wrapper) – Environment to train on
agents (Agent or list of Agent) – Agents to train
agents_scope (tuple or list of int, optional) – Number of environments for each agent to train on (default: None)
cfg (dict, optional) – Configuration dictionary (default: None). See PARALLEL_TRAINER_DEFAULT_CONFIG for default values
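For illustration, a sketch of splitting a vectorized environment between two agents via agents_scope (agent_a, agent_b and the 512-sub-environment env are hypothetical stand-ins; each agent keeps its own local memory, per the note in the Usage section):

# hypothetical: 512 vectorized sub-environments split between two agents
trainer = ParallelTrainer(env=env,
                          agents=[agent_a, agent_b],
                          agents_scope=[256, 256],  # environments per agent
                          cfg={"timesteps": 100000})
trainer.train()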
- eval() → None¶
Evaluate the agents in parallel
This method executes the following steps in a loop:
- Compute actions (in parallel)
- Interact with the environments
- Render scene
- Reset environments
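By default, evaluation uses the deterministic mean actions; setting stochastic_evaluation in the configuration samples actions from the policy distribution instead. A minimal sketch, reusing env and agents from the usage example:

cfg = {"timesteps": 10000, "stochastic_evaluation": True}
trainer = ParallelTrainer(env=env, agents=agents, cfg=cfg)
trainer.eval()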
- multi_agent_eval() → None¶
Evaluate multi-agents
This method executes the following steps in a loop:
- Compute actions (sequentially)
- Interact with the environments
- Render scene
- Reset environments
- multi_agent_train() → None¶
Train multi-agents
This method executes the following steps in a loop:
- Pre-interaction
- Compute actions
- Interact with the environments
- Render scene
- Record transitions
- Post-interaction
- Reset environments
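The same steps can be written out for a single agent to make the loop concrete. The following is a schematic sketch only (the actual trainer adds inter-process coordination, progress reporting, and error handling); it assumes a skrl-wrapped env and a skrl agent exposing the standard pre_interaction/act/record_transition/post_interaction interface:

import torch

# schematic per-agent training loop mirroring the steps above
states, infos = env.reset()
for timestep in range(timesteps):
    agent.pre_interaction(timestep=timestep, timesteps=timesteps)
    with torch.no_grad():
        # compute actions and step the environments
        actions = agent.act(states, timestep=timestep, timesteps=timesteps)[0]
        next_states, rewards, terminated, truncated, infos = env.step(actions)
        env.render()
        # record the transition in the agent's (local) memory
        agent.record_transition(states=states, actions=actions, rewards=rewards,
                                next_states=next_states, terminated=terminated,
                                truncated=truncated, infos=infos,
                                timestep=timestep, timesteps=timesteps)
    agent.post_interaction(timestep=timestep, timesteps=timesteps)
    # reset finished environments, otherwise carry the states forward
    if terminated.any() or truncated.any():
        states, infos = env.reset()
    else:
        states = next_states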
- single_agent_eval() → None¶
Evaluate a single agent
This method executes the following steps in a loop:
- Compute actions (sequentially)
- Interact with the environments
- Render scene
- Reset environments
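Analogously, a schematic evaluation loop (a sketch, not the trainer's exact implementation) showing how the stochastic_evaluation flag would select between sampled and mean actions:

import torch

# schematic per-agent evaluation loop mirroring the steps above
stochastic_evaluation = False  # e.g. cfg["stochastic_evaluation"]
states, infos = env.reset()
for timestep in range(timesteps):
    with torch.no_grad():
        outputs = agent.act(states, timestep=timestep, timesteps=timesteps)
        # sampled actions if stochastic evaluation, else (deterministic) mean actions
        actions = outputs[0] if stochastic_evaluation else outputs[-1].get("mean_actions", outputs[0])
        next_states, rewards, terminated, truncated, infos = env.step(actions)
        env.render()
    # reset finished environments, otherwise carry the states forward
    if terminated.any() or truncated.any():
        states, infos = env.reset()
    else:
        states = next_states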