Sequential trainer¶
Train agents sequentially (i.e., one after the other in each interaction with the environment).
Concept¶
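The idea of "one after the other in each interaction" can be pictured with a small, self-contained sketch. It is an illustration only, not skrl's implementation: `DummyEnv`, `DummyAgent`, and `run_sequentially` are hypothetical stand-ins, and the sketch assumes each agent controls exactly one sub-environment.

```python
# Hypothetical stand-ins for illustration; none of these classes belong to skrl.
class DummyEnv:
    """Toy vectorized environment whose observation is the step counter."""

    def __init__(self, num_envs):
        self.num_envs = num_envs
        self.t = 0

    def reset(self):
        self.t = 0
        return [0] * self.num_envs

    def step(self, actions):
        self.t += 1
        observations = [self.t] * self.num_envs
        rewards = [float(a) for a in actions]
        return observations, rewards


class DummyAgent:
    """Toy agent that always picks action 1 and accumulates its rewards."""

    def __init__(self, name):
        self.name = name
        self.total_reward = 0.0

    def act(self, observation):
        return 1

    def record(self, reward):
        self.total_reward += reward


def run_sequentially(env, agents, timesteps):
    """Query and update the agents one after the other at every timestep."""
    log = []
    observations = env.reset()
    for t in range(timesteps):
        # 1) each agent, in turn, computes the action for its sub-environment
        actions = []
        for i, agent in enumerate(agents):
            actions.append(agent.act(observations[i]))
            log.append((t, agent.name, "act"))
        # 2) a single environment step uses the combined actions
        observations, rewards = env.step(actions)
        # 3) each agent, in turn, records its own part of the transition
        for i, agent in enumerate(agents):
            agent.record(rewards[i])
            log.append((t, agent.name, "record"))
    return log


agents = [DummyAgent("a"), DummyAgent("b")]
log = run_sequentially(DummyEnv(num_envs=2), agents, timesteps=2)
```

The returned `log` lists the order in which the agents were visited, which makes the sequential pattern easy to inspect.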
Usage¶
# PyTorch
from skrl.trainers.torch import SequentialTrainer
# assuming there is an environment called 'env'
# and an agent or a list of agents called 'agents'
# create a sequential trainer
cfg = {"timesteps": 50000, "headless": False}
trainer = SequentialTrainer(env=env, agents=agents, cfg=cfg)
# train the agent(s)
trainer.train()
# evaluate the agent(s)
trainer.eval()
# JAX
from skrl.trainers.jax import SequentialTrainer
# assuming there is an environment called 'env'
# and an agent or a list of agents called 'agents'
# create a sequential trainer
cfg = {"timesteps": 50000, "headless": False}
trainer = SequentialTrainer(env=env, agents=agents, cfg=cfg)
# train the agent(s)
trainer.train()
# evaluate the agent(s)
trainer.eval()
# Warp
from skrl.trainers.warp import SequentialTrainer
# assuming there is an environment called 'env'
# and an agent or a list of agents called 'agents'
# create a sequential trainer
cfg = {"timesteps": 50000, "headless": False}
trainer = SequentialTrainer(env=env, agents=agents, cfg=cfg)
# train the agent(s)
trainer.train()
# evaluate the agent(s)
trainer.eval()
Configuration¶
The trainer is configured through the SequentialTrainerCfg dataclass, which is documented for each backend in the API section.
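To make the relation between a plain configuration dictionary and the dataclass fields concrete, here is a minimal sketch. `TrainerCfgSketch` and `cfg_from_dict` are hypothetical names; the fields and defaults are copied from the `SequentialTrainerCfg` signature in the API section, but the conversion shown is an assumption and not necessarily how skrl handles a `dict`.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TrainerCfgSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    timesteps: int = 100000
    headless: bool = False
    render_interval: int = 1
    disable_progressbar: Optional[bool] = False
    close_environment_at_exit: bool = True
    environment_info: str = "episode"
    stochastic_evaluation: bool = False


def cfg_from_dict(overrides):
    """Build a configuration from a dict; unknown keys raise TypeError."""
    return TrainerCfgSketch(**overrides)


# same dictionary as in the usage snippet; unspecified fields keep their defaults
cfg = cfg_from_dict({"timesteps": 50000, "headless": False})
cfg_as_dict = asdict(cfg)
```

A side effect of going through a dataclass in this sketch is that a misspelled key fails immediately instead of being silently ignored.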
API¶
PyTorch¶
| SequentialTrainerCfg | Configuration for the sequential trainer. |
| SequentialTrainer | Sequential trainer. |
- class skrl.trainers.torch.sequential.SequentialTrainerCfg(*, timesteps: int = 100000, headless: bool = False, render_interval: int = 1, disable_progressbar: bool | None = False, close_environment_at_exit: bool = True, environment_info: str = 'episode', stochastic_evaluation: bool = False)¶
Bases: TrainerCfg
Configuration for the sequential trainer.
Attributes:
| close_environment_at_exit | Whether to close the environment on normal program termination. |
| disable_progressbar | Whether to disable the progressbar. |
| environment_info | Key used to get and log environment info. |
| headless | Whether to run in headless mode (do not call env.render()). |
| render_interval | Interval (in timesteps) for rendering the environments. |
| stochastic_evaluation | Whether to use sampled (stochastic) actions rather than (deterministic) mean actions during evaluation. |
| timesteps | Number of timesteps to train/evaluate for. |
- close_environment_at_exit: bool = True¶
Whether to close the environment on normal program termination.
- disable_progressbar: bool | None = False¶
Whether to disable the progressbar. If None, disable on non-TTY.
- render_interval: int = 1¶
Interval (in timesteps) for rendering the environments. Only effective if headless is False.
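The interaction between `headless`, `render_interval`, and `disable_progressbar` described above can be sketched with two small helpers. Both functions are hypothetical and written only for illustration; in particular, rendering on timesteps that are multiples of `render_interval` is an assumption about how the interval is applied.

```python
import sys


def should_render(timestep, headless, render_interval):
    """Return True if the environment would be rendered at this timestep.

    Assumption: rendering happens every `render_interval` timesteps,
    and never when running headless.
    """
    if headless:
        return False
    return timestep % render_interval == 0


def progressbar_disabled(disable_progressbar, is_tty=None):
    """Resolve the three-state flag: True, False, or None (disable on non-TTY)."""
    if disable_progressbar is None:
        if is_tty is None:
            # fall back to checking whether standard output is a terminal
            is_tty = sys.stdout.isatty()
        return not is_tty
    return bool(disable_progressbar)


# timesteps 0..9 with render_interval=5 and a visible window
rendered_steps = [t for t in range(10) if should_render(t, False, 5)]
```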
- class skrl.trainers.torch.sequential.SequentialTrainer(*, env: Wrapper | MultiAgentEnvWrapper, agents: Agent | MultiAgent | list[Agent] | list[MultiAgent], scopes: list[int] | None = None, cfg: SequentialTrainerCfg | dict = {})¶
Bases: Trainer
Sequential trainer.
Train agents sequentially, i.e., one after the other, in each interaction with the environment.
- Parameters:
env – Environment to train/evaluate on.
agents – Agent(s) to train/evaluate.
scopes – Number of environments for each simultaneous agent to train/evaluate on.
cfg – Configuration, given as a SequentialTrainerCfg instance or a dictionary.
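To illustrate what `scopes` expresses when several agents share one vectorized environment, the sketch below turns a list of per-agent environment counts into index ranges. `scopes_to_slices` is a hypothetical helper, and the fallback used when no scopes are given (an equal split with the remainder assigned to the last agent) is an assumption made for this example.

```python
from itertools import accumulate


def scopes_to_slices(scopes, num_envs, num_agents):
    """Map per-agent environment counts to (start, end) index ranges."""
    if not scopes:
        # assumption: split the environments equally, remainder to the last agent
        base = num_envs // num_agents
        scopes = [base] * num_agents
        scopes[-1] += num_envs - base * num_agents
    if len(scopes) != num_agents:
        raise ValueError("one scope is required per agent")
    if sum(scopes) != num_envs:
        raise ValueError("scopes must add up to the number of environments")
    ends = list(accumulate(scopes))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


# two agents sharing five environments: the first gets 2, the second gets 3
explicit = scopes_to_slices([2, 3], num_envs=5, num_agents=2)
default = scopes_to_slices(None, num_envs=5, num_agents=2)
```

Each agent would then act on, and record transitions from, only the environments inside its own range.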
JAX¶
| SequentialTrainerCfg | Configuration for the sequential trainer. |
| SequentialTrainer | Sequential trainer. |
- class skrl.trainers.jax.sequential.SequentialTrainerCfg(*, timesteps: int = 100000, headless: bool = False, render_interval: int = 1, disable_progressbar: bool | None = False, close_environment_at_exit: bool = True, environment_info: str = 'episode', stochastic_evaluation: bool = False)¶
Bases: TrainerCfg
Configuration for the sequential trainer.
Attributes:
| close_environment_at_exit | Whether to close the environment on normal program termination. |
| disable_progressbar | Whether to disable the progressbar. |
| environment_info | Key used to get and log environment info. |
| headless | Whether to run in headless mode (do not call env.render()). |
| render_interval | Interval (in timesteps) for rendering the environments. |
| stochastic_evaluation | Whether to use sampled (stochastic) actions rather than (deterministic) mean actions during evaluation. |
| timesteps | Number of timesteps to train/evaluate for. |
- close_environment_at_exit: bool = True¶
Whether to close the environment on normal program termination.
- disable_progressbar: bool | None = False¶
Whether to disable the progressbar. If None, disable on non-TTY.
- render_interval: int = 1¶
Interval (in timesteps) for rendering the environments. Only effective if headless is False.
- class skrl.trainers.jax.sequential.SequentialTrainer(*, env: Wrapper | MultiAgentEnvWrapper, agents: Agent | MultiAgent | list[Agent] | list[MultiAgent], scopes: list[int] | None = None, cfg: SequentialTrainerCfg | dict = {})¶
Bases: Trainer
Sequential trainer.
Train agents sequentially, i.e., one after the other, in each interaction with the environment.
- Parameters:
env – Environment to train/evaluate on.
agents – Agent(s) to train/evaluate.
scopes – Number of environments for each simultaneous agent to train/evaluate on.
cfg – Configuration, given as a SequentialTrainerCfg instance or a dictionary.
Warp¶
| SequentialTrainerCfg | Configuration for the sequential trainer. |
| SequentialTrainer | Sequential trainer. |