Model instantiators
Utilities for quickly creating model instances.
Models | PyTorch | JAX |
---|---|---|
Tabular model (discrete domain) | \(\square\) | \(\square\) |
Categorical model (discrete domain) | \(\blacksquare\) | \(\blacksquare\) |
Gaussian model (continuous domain) | \(\blacksquare\) | \(\blacksquare\) |
Multivariate Gaussian model (continuous domain) | \(\blacksquare\) | \(\square\) |
Deterministic model (continuous domain) | \(\blacksquare\) | \(\blacksquare\) |
Shared model | \(\blacksquare\) | \(\square\) |
API (PyTorch)
- class skrl.utils.model_instantiators.torch.Shape(value)
Enum to select the shape of the model’s inputs and outputs
- property ONE
Flag to indicate that the model’s input/output has shape (1,)
This flag is useful for the definition of critic models, where the critic’s output is a scalar
- property STATES
Flag to indicate that the model’s input/output is the state (observation) space of the environment. It is an alias for OBSERVATIONS
- property OBSERVATIONS
Flag to indicate that the model’s input/output is the observation space of the environment
- property ACTIONS
Flag to indicate that the model’s input/output is the action space of the environment
- property STATES_ACTIONS
Flag to indicate that the model’s input/output is the combination (concatenation) of the state (observation) and action spaces of the environment
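To make the flags concrete, the following minimal sketch shows how a flag could be resolved to a number of units given the sizes of the observation and action spaces. The `Shape` enum below is a stand-in redefined for illustration (its integer values are assumptions), and `resolve_size` is a hypothetical helper, not part of skrl.

```python
from enum import Enum


class Shape(Enum):
    """Illustrative stand-in for the Shape enum (member values are assumptions)."""
    ONE = 1
    STATES = 0
    OBSERVATIONS = 0  # same value as STATES, so Python treats it as an alias
    ACTIONS = -1
    STATES_ACTIONS = -2


def resolve_size(shape: Shape, num_observations: int, num_actions: int) -> int:
    """Hypothetical helper: map a Shape flag to a number of units."""
    if shape is Shape.ONE:
        return 1
    if shape is Shape.STATES:  # also matches Shape.OBSERVATIONS (alias)
        return num_observations
    if shape is Shape.ACTIONS:
        return num_actions
    if shape is Shape.STATES_ACTIONS:
        # Concatenation of the observation and action spaces
        return num_observations + num_actions
    raise ValueError(f"Unknown shape: {shape}")
```

Under these assumptions, a critic that scores state-action pairs would take `Shape.STATES_ACTIONS` as input and `Shape.ONE` as output.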
- skrl.utils.model_instantiators.torch.categorical_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | torch.device | None = None, unnormalized_log_prob: bool = True, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = None) -> Model
Instantiate a categorical model
- Parameters:
observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space
action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space
device (str or torch.device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
unnormalized_log_prob (bool, optional) – Flag indicating how the model’s output is interpreted (default: True). If True, the output is interpreted as unnormalized log probabilities (it can be any real number); otherwise, as normalized probabilities (the output must be non-negative, finite and have a non-zero sum)
input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)
hiddens (int or list of ints) – Number of hidden units in each hidden layer
hidden_activation (list of strings) – Activation function for each hidden layer (default: “relu”).
output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)
output_activation (str or None, optional) – Activation function for the output layer (default: None)
- Returns:
Categorical model instance
- Return type:
Model
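The two interpretations selected by `unnormalized_log_prob` can be illustrated with a small, framework-free sketch. `to_probabilities` is a hypothetical helper written for this example; it only mirrors the constraints stated above and is not how the library computes its distribution.

```python
import math


def to_probabilities(outputs, unnormalized_log_prob=True):
    """Turn raw model outputs into a categorical distribution (illustrative only)."""
    if unnormalized_log_prob:
        # Outputs are logits: any real number is valid.
        # Subtract the maximum before exponentiating for numerical stability.
        shift = max(outputs)
        exps = [math.exp(o - shift) for o in outputs]
        total = sum(exps)
        return [e / total for e in exps]
    # Outputs are probabilities: non-negative, finite, with a non-zero sum
    if any(o < 0 or not math.isfinite(o) for o in outputs):
        raise ValueError("probabilities must be non-negative and finite")
    total = sum(outputs)
    if total == 0:
        raise ValueError("probabilities must have a non-zero sum")
    return [o / total for o in outputs]
```

With the default (`True`), the output layer needs no constraint, which is consistent with `output_activation` defaulting to None for this model.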
- skrl.utils.model_instantiators.torch.deterministic_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | torch.device | None = None, clip_actions: bool = False, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = 'tanh', output_scale: float = 1.0) -> Model
Instantiate a deterministic model
- Parameters:
observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space
action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space
device (str or torch.device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
clip_actions (bool, optional) – Flag to indicate whether the actions should be clipped to the action space (default: False)
input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)
hiddens (int or list of ints) – Number of hidden units in each hidden layer
hidden_activation (list of strings) – Activation function for each hidden layer (default: “relu”).
output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)
output_activation (str or None, optional) – Activation function for the output layer (default: “tanh”)
output_scale (float, optional) – Scale of the output layer (default: 1.0). If None, the output layer will not be scaled
- Returns:
Deterministic model instance
- Return type:
Model
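How `output_activation`, `output_scale` and `clip_actions` interact can be sketched for a single output unit. The order used here (activation, then scaling, then clipping) and the bounds `low`/`high` are assumptions made for illustration; `deterministic_output` is a hypothetical helper, not a library function.

```python
import math


def deterministic_output(pre_activation, output_scale=1.0, clip_actions=False,
                         low=-1.0, high=1.0):
    """Illustrative post-processing of one output unit.

    Assumed order: tanh activation, then scaling, then optional clipping.
    """
    # output_activation="tanh" bounds the value to [-1, 1]
    value = math.tanh(pre_activation)
    # output_scale=None means the output is not scaled
    if output_scale is not None:
        value *= output_scale
    # clip_actions=True restricts the value to the (assumed) action bounds
    if clip_actions:
        value = min(max(value, low), high)
    return value
```

Under these assumptions, `output_scale=2.0` stretches the tanh range to [-2, 2], and enabling `clip_actions` with bounds of ±1 would cut it back to [-1, 1].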
- skrl.utils.model_instantiators.torch.gaussian_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | torch.device | None = None, clip_actions: bool = False, clip_log_std: bool = True, min_log_std: float = -20, max_log_std: float = 2, initial_log_std: float = 0, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = 'tanh', output_scale: float = 1.0) -> Model
Instantiate a Gaussian model
- Parameters:
observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space
action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space
device (str or torch.device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
clip_actions (bool, optional) – Flag to indicate whether the actions should be clipped (default: False)
clip_log_std (bool, optional) – Flag to indicate whether the log standard deviations should be clipped (default: True)
min_log_std (float, optional) – Minimum value of the log standard deviation (default: -20)
max_log_std (float, optional) – Maximum value of the log standard deviation (default: 2)
initial_log_std (float, optional) – Initial value for the log standard deviation (default: 0)
input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)
hiddens (int or list of ints) – Number of hidden units in each hidden layer
hidden_activation (list of strings) – Activation function for each hidden layer (default: “relu”).
output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)
output_activation (str or None, optional) – Activation function for the output layer (default: “tanh”)
output_scale (float, optional) – Scale of the output layer (default: 1.0). If None, the output layer will not be scaled
- Returns:
Gaussian model instance
- Return type:
Model
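The role of `clip_log_std`, `min_log_std`, `max_log_std` and `initial_log_std` can be illustrated with a one-dimensional sketch: the log standard deviation is clamped to the configured range, exponentiated, and used to perturb the mean. `sample_action` is a hypothetical helper written for this example and uses only the standard library; it is not the library's sampling code.

```python
import math
import random


def sample_action(mean, log_std=0.0, clip_log_std=True,
                  min_log_std=-20.0, max_log_std=2.0, rng=random):
    """Draw one sample from N(mean, exp(log_std)^2) and return (sample, std)."""
    if clip_log_std:
        # Keep the log standard deviation inside [min_log_std, max_log_std]
        log_std = min(max(log_std, min_log_std), max_log_std)
    std = math.exp(log_std)
    # Reparameterized sample: mean plus scaled standard normal noise
    sample = mean + std * rng.gauss(0.0, 1.0)
    return sample, std
```

With the defaults, the standard deviation stays between exp(-20) and exp(2), and an initial log standard deviation of 0 corresponds to a standard deviation of 1.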
- skrl.utils.model_instantiators.torch.multivariate_gaussian_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | torch.device | None = None, clip_actions: bool = False, clip_log_std: bool = True, min_log_std: float = -20, max_log_std: float = 2, initial_log_std: float = 0, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = 'tanh', output_scale: float = 1.0) -> Model
Instantiate a multivariate Gaussian model
- Parameters:
observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space
action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space
device (str or torch.device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
clip_actions (bool, optional) – Flag to indicate whether the actions should be clipped (default: False)
clip_log_std (bool, optional) – Flag to indicate whether the log standard deviations should be clipped (default: True)
min_log_std (float, optional) – Minimum value of the log standard deviation (default: -20)
max_log_std (float, optional) – Maximum value of the log standard deviation (default: 2)
initial_log_std (float, optional) – Initial value for the log standard deviation (default: 0)
input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)
hiddens (int or list of ints) – Number of hidden units in each hidden layer
hidden_activation (list of strings) – Activation function for each hidden layer (default: “relu”).
output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)
output_activation (str or None, optional) – Activation function for the output layer (default: “tanh”)
output_scale (float, optional) – Scale of the output layer (default: 1.0). If None, the output layer will not be scaled
- Returns:
Multivariate Gaussian model instance
- Return type:
Model
Instantiate a shared model
- Parameters:
observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space
action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space
device (str or torch.device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
structure (str, optional) – Shared model structure (default: ""). Note: this parameter is ignored for the moment
roles (sequence of strings, optional) – Organized list of model roles (default: [])
parameters (sequence of dict, optional) – Organized list of model instantiator parameters (default: [])
- Returns:
Shared model instance
- Return type:
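One plausible reading of the `roles` and `parameters` arguments is that they are parallel sequences, with the i-th dictionary configuring the i-th role. The sketch below illustrates that reading only: the role names, the dictionary keys (borrowed from the instantiators above), the string stand-ins for `Shape` flags, and the `pair_roles` helper are all assumptions, not documented behavior.

```python
# Hypothetical role names, listed in the order their parameters are given
roles = ["policy", "value"]

# One dictionary of instantiator parameters per role, in the same order.
# The strings "ACTIONS" and "ONE" stand in for Shape flags in this sketch.
parameters = [
    {"clip_actions": False, "output_shape": "ACTIONS", "output_activation": "tanh"},
    {"clip_actions": False, "output_shape": "ONE", "output_activation": None},
]


def pair_roles(roles, parameters):
    """Hypothetical helper: match each role with its parameter dictionary."""
    if len(roles) != len(parameters):
        raise ValueError("roles and parameters must have the same length")
    return dict(zip(roles, parameters))


config = pair_roles(roles, parameters)
```

If this reading holds, keeping both lists the same length and in the same order is what ties a role to its configuration.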
API (JAX)
- class skrl.utils.model_instantiators.jax.Shape(value)
Enum to select the shape of the model’s inputs and outputs
- property ONE
Flag to indicate that the model’s input/output has shape (1,)
This flag is useful for the definition of critic models, where the critic’s output is a scalar
- property STATES
Flag to indicate that the model’s input/output is the state (observation) space of the environment. It is an alias for OBSERVATIONS
- property OBSERVATIONS
Flag to indicate that the model’s input/output is the observation space of the environment
- property ACTIONS
Flag to indicate that the model’s input/output is the action space of the environment
- property STATES_ACTIONS
Flag to indicate that the model’s input/output is the combination (concatenation) of the state (observation) and action spaces of the environment
- skrl.utils.model_instantiators.jax.categorical_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | jax.Device | None = None, unnormalized_log_prob: bool = True, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = None) -> Model
Instantiate a categorical model
- Parameters:
observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space
action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space
device (str or jax.Device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
unnormalized_log_prob (bool, optional) – Flag indicating how the model’s output is interpreted (default: True). If True, the output is interpreted as unnormalized log probabilities (it can be any real number); otherwise, as normalized probabilities (the output must be non-negative, finite and have a non-zero sum)
input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)
hiddens (int or list of ints) – Number of hidden units in each hidden layer
hidden_activation (list of strings) – Activation function for each hidden layer (default: “relu”).
output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)
output_activation (str or None, optional) – Activation function for the output layer (default: None)
- Returns:
Categorical model instance
- Return type:
Model
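The relationship between `hiddens` and `hidden_activation` can be sketched as a layer plan: each entry of `hiddens` is paired with the activation at the same position, and the output layer is appended last. `layer_plan` is a hypothetical, framework-free helper; requiring both lists to have the same length is an assumption suggested by the matching defaults.

```python
def layer_plan(input_size, hiddens, hidden_activation, output_size,
               output_activation=None):
    """Return a list of (in_features, out_features, activation) tuples."""
    if isinstance(hiddens, int):
        # A single integer is treated as one hidden layer
        hiddens = [hiddens]
    if len(hiddens) != len(hidden_activation):
        raise ValueError("hiddens and hidden_activation must have the same length")
    plan = []
    in_features = input_size
    for units, activation in zip(hiddens, hidden_activation):
        plan.append((in_features, units, activation))
        in_features = units  # the next layer consumes this layer's output
    # Final layer maps the last hidden size to the output size
    plan.append((in_features, output_size, output_activation))
    return plan
```

For example, with 4 observations, 2 actions and the default hidden configuration, the plan has three layers: 4 to 256, 256 to 256, and 256 to 2.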
- skrl.utils.model_instantiators.jax.deterministic_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | jax.Device | None = None, clip_actions: bool = False, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = 'tanh', output_scale: float = 1.0) -> Model
Instantiate a deterministic model
- Parameters:
observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space
action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space
device (str or jax.Device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
clip_actions (bool, optional) – Flag to indicate whether the actions should be clipped to the action space (default: False)
input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)
hiddens (int or list of ints) – Number of hidden units in each hidden layer
hidden_activation (list of strings) – Activation function for each hidden layer (default: “relu”).
output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)
output_activation (str or None, optional) – Activation function for the output layer (default: “tanh”)
output_scale (float, optional) – Scale of the output layer (default: 1.0). If None, the output layer will not be scaled
- Returns:
Deterministic model instance
- Return type:
Model
- skrl.utils.model_instantiators.jax.gaussian_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | jax.Device | None = None, clip_actions: bool = False, clip_log_std: bool = True, min_log_std: float = -20, max_log_std: float = 2, initial_log_std: float = 0, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = 'tanh', output_scale: float = 1.0) -> Model
Instantiate a Gaussian model
- Parameters:
observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space
action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space
device (str or jax.Device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
clip_actions (bool, optional) – Flag to indicate whether the actions should be clipped (default: False)
clip_log_std (bool, optional) – Flag to indicate whether the log standard deviations should be clipped (default: True)
min_log_std (float, optional) – Minimum value of the log standard deviation (default: -20)
max_log_std (float, optional) – Maximum value of the log standard deviation (default: 2)
initial_log_std (float, optional) – Initial value for the log standard deviation (default: 0)
input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)
hiddens (int or list of ints) – Number of hidden units in each hidden layer
hidden_activation (list of strings) – Activation function for each hidden layer (default: “relu”).
output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)
output_activation (str or None, optional) – Activation function for the output layer (default: “tanh”)
output_scale (float, optional) – Scale of the output layer (default: 1.0). If None, the output layer will not be scaled
- Returns:
Gaussian model instance
- Return type:
Model