Model instantiators

Utilities for quickly creating model instances.



Models                                             pytorch            jax
Tabular model (discrete domain)                    \(\square\)        \(\square\)
Categorical model (discrete domain)                \(\blacksquare\)   \(\blacksquare\)
Gaussian model (continuous domain)                 \(\blacksquare\)   \(\blacksquare\)
Multivariate Gaussian model (continuous domain)    \(\blacksquare\)   \(\square\)
Deterministic model (continuous domain)            \(\blacksquare\)   \(\blacksquare\)
Shared model                                       \(\blacksquare\)   \(\square\)


API (PyTorch)

class skrl.utils.model_instantiators.torch.Shape(value)

Enum to select the shape of the model’s inputs and outputs

property ONE

Flag to indicate that the model’s input/output has shape (1,)

This flag is useful for the definition of critic models, where the critic’s output is a scalar

property STATES

Flag to indicate that the model’s input/output is the state (observation) space of the environment. It is an alias for OBSERVATIONS

property OBSERVATIONS

Flag to indicate that the model’s input/output is the observation space of the environment

property ACTIONS

Flag to indicate that the model’s input/output is the action space of the environment

property STATES_ACTIONS

Flag to indicate that the model’s input/output is the combination (concatenation) of the state (observation) and action spaces of the environment
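The flags above can be read as a lookup from a symbolic shape to a concrete size. The following is a minimal sketch, not skrl's implementation: the `Shape` enum is re-declared locally with arbitrary member values, and `resolve_size` is a hypothetical helper that assumes the flattened observation and action sizes are already known.

```python
from enum import Enum


class Shape(Enum):
    """Local stand-in for the documented flags (member values are arbitrary)."""
    ONE = 1
    STATES = 0
    OBSERVATIONS = 0  # same value as STATES, so Python treats it as an alias
    ACTIONS = -1
    STATES_ACTIONS = -2


def resolve_size(shape: Shape, num_observations: int, num_actions: int) -> int:
    """Hypothetical helper: map a Shape flag to a flat input/output size."""
    if shape is Shape.ONE:
        return 1
    if shape is Shape.STATES:  # also matches Shape.OBSERVATIONS (alias)
        return num_observations
    if shape is Shape.ACTIONS:
        return num_actions
    if shape is Shape.STATES_ACTIONS:
        # concatenation of observations and actions
        return num_observations + num_actions
    raise ValueError(f"Unknown shape: {shape}")


# A critic that takes (state, action) pairs and outputs a scalar value
critic_input = resolve_size(Shape.STATES_ACTIONS, num_observations=4, num_actions=2)
critic_output = resolve_size(Shape.ONE, num_observations=4, num_actions=2)
```

In this sketch, `Shape.STATES_ACTIONS` as input and `Shape.ONE` as output describe a Q-function style critic.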

skrl.utils.model_instantiators.torch.categorical_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | torch.device | None = None, unnormalized_log_prob: bool = True, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = None) → Model

Instantiate a categorical model

Parameters:
  • observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space

  • action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space

  • device (str or torch.device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, "cuda" is used if available, otherwise "cpu"

  • unnormalized_log_prob (bool, optional) – Flag to indicate how the model’s output is interpreted (default: True). If True, the output is interpreted as unnormalized log probabilities (any real numbers); otherwise, as normalized probabilities (the output must be non-negative, finite and have a non-zero sum)

  • input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)

  • hiddens (int or list of ints, optional) – Number of hidden units in each hidden layer (default: [256, 256])

  • hidden_activation (list of strings, optional) – Activation function for each hidden layer (default: ["relu", "relu"])

  • output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)

  • output_activation (str or None, optional) – Activation function for the output layer (default: None)

Returns:

Categorical model instance

Return type:

Model
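To make the two readings of `unnormalized_log_prob` concrete, the sketch below converts a raw output vector into action probabilities in both modes. It is a plain-Python illustration under stated assumptions, not the library's code: `to_probabilities` is a hypothetical helper, and the softmax used for the log-probability case is the standard way to normalize such values.

```python
import math
from typing import List


def to_probabilities(output: List[float], unnormalized_log_prob: bool = True) -> List[float]:
    """Hypothetical helper: turn a model output into a probability vector."""
    if unnormalized_log_prob:
        # Any real numbers are valid: apply a numerically stable softmax
        largest = max(output)
        exps = [math.exp(v - largest) for v in output]
        total = sum(exps)
        return [e / total for e in exps]

    # Otherwise the output must be non-negative, finite and have a non-zero sum
    if any(v < 0 or not math.isfinite(v) for v in output):
        raise ValueError("probabilities must be non-negative and finite")
    total = sum(output)
    if total == 0:
        raise ValueError("probabilities must have a non-zero sum")
    return [v / total for v in output]


from_logits = to_probabilities([0.0, 0.0])                              # equal scores
from_probs = to_probabilities([1.0, 3.0], unnormalized_log_prob=False)  # rescaled to sum to 1
```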

skrl.utils.model_instantiators.torch.deterministic_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | torch.device | None = None, clip_actions: bool = False, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = 'tanh', output_scale: float = 1.0) → Model

Instantiate a deterministic model

Parameters:
  • observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space

  • action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space

  • device (str or torch.device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, "cuda" is used if available, otherwise "cpu"

  • clip_actions (bool, optional) – Flag to indicate whether the actions should be clipped to the action space (default: False)

  • input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)

  • hiddens (int or list of ints, optional) – Number of hidden units in each hidden layer (default: [256, 256])

  • hidden_activation (list of strings, optional) – Activation function for each hidden layer (default: ["relu", "relu"])

  • output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)

  • output_activation (str or None, optional) – Activation function for the output layer (default: “tanh”)

  • output_scale (float, optional) – Scale of the output layer (default: 1.0). If None, the output layer will not be scaled

Returns:

Deterministic model instance

Return type:

Model
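The interaction between `output_activation`, `output_scale` and `clip_actions` can be sketched on a plain list of numbers. This is an illustration only: `shape_output` is a hypothetical helper, the order of operations (activation, then scaling, then clipping) is an assumption, and `low`/`high` stand in for the bounds of the action space.

```python
import math
from typing import List, Optional


def shape_output(
    raw: List[float],
    output_activation: Optional[str] = "tanh",
    output_scale: Optional[float] = 1.0,
    clip_actions: bool = False,
    low: float = -1.0,   # hypothetical lower bound of the action space
    high: float = 1.0,   # hypothetical upper bound of the action space
) -> List[float]:
    """Hypothetical helper: post-process the last layer's raw values."""
    if output_activation == "tanh":
        values = [math.tanh(v) for v in raw]
    elif output_activation is None:
        values = list(raw)
    else:
        raise ValueError(f"activation not covered by this sketch: {output_activation}")

    # If output_scale is None, the output is left unscaled
    if output_scale is not None:
        values = [output_scale * v for v in values]

    # Optionally clip the result to the action-space bounds
    if clip_actions:
        values = [min(max(v, low), high) for v in values]
    return values


scaled = shape_output([100.0], output_scale=2.0)                      # tanh saturates near 1, then scaled
clipped = shape_output([100.0], output_scale=2.0, clip_actions=True)  # limited to `high`
```

With `tanh` and a scale of 2.0 the output lies in [-2, 2], so clipping only matters when the action bounds are tighter than the scaled range.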

skrl.utils.model_instantiators.torch.gaussian_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | torch.device | None = None, clip_actions: bool = False, clip_log_std: bool = True, min_log_std: float = -20, max_log_std: float = 2, initial_log_std: float = 0, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = 'tanh', output_scale: float = 1.0) → Model

Instantiate a Gaussian model

Parameters:
  • observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space

  • action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space

  • device (str or torch.device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, "cuda" is used if available, otherwise "cpu"

  • clip_actions (bool, optional) – Flag to indicate whether the actions should be clipped to the action space (default: False)

  • clip_log_std (bool, optional) – Flag to indicate whether the log standard deviations should be clipped (default: True)

  • min_log_std (float, optional) – Minimum value of the log standard deviation (default: -20)

  • max_log_std (float, optional) – Maximum value of the log standard deviation (default: 2)

  • initial_log_std (float, optional) – Initial value for the log standard deviation (default: 0)

  • input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)

  • hiddens (int or list of ints, optional) – Number of hidden units in each hidden layer (default: [256, 256])

  • hidden_activation (list of strings, optional) – Activation function for each hidden layer (default: ["relu", "relu"])

  • output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)

  • output_activation (str or None, optional) – Activation function for the output layer (default: “tanh”)

  • output_scale (float, optional) – Scale of the output layer (default: 1.0). If None, the output layer will not be scaled

Returns:

Gaussian model instance

Return type:

Model
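The log standard deviation parameters above can be illustrated with a small sampling routine. This is a sketch under stated assumptions rather than the library's implementation: `gaussian_sample` is a hypothetical helper, and it assumes an independent (diagonal) Gaussian whose standard deviation is the exponential of the, optionally clipped, log standard deviation.

```python
import math
import random
from typing import List, Optional, Tuple


def gaussian_sample(
    mean: List[float],
    log_std: List[float],
    clip_log_std: bool = True,
    min_log_std: float = -20.0,
    max_log_std: float = 2.0,
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: return (sample, std) for a diagonal Gaussian."""
    if clip_log_std:
        # Keep the log standard deviation inside [min_log_std, max_log_std]
        log_std = [min(max(v, min_log_std), max_log_std) for v in log_std]
    std = [math.exp(v) for v in log_std]

    rng = rng or random.Random()
    sample = [rng.gauss(m, s) for m, s in zip(mean, std)]
    return sample, std


# initial_log_std = 0 corresponds to a unit standard deviation
_, unit_std = gaussian_sample(mean=[0.0], log_std=[0.0])
# A log standard deviation of 5 is clipped to max_log_std = 2 by default
_, clipped_std = gaussian_sample(mean=[0.0], log_std=[5.0])
```

Clipping bounds the standard deviation between exp(min_log_std) and exp(max_log_std), which keeps sampling numerically stable when the learned parameter drifts.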

skrl.utils.model_instantiators.torch.multivariate_gaussian_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | torch.device | None = None, clip_actions: bool = False, clip_log_std: bool = True, min_log_std: float = -20, max_log_std: float = 2, initial_log_std: float = 0, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = 'tanh', output_scale: float = 1.0) → Model

Instantiate a multivariate Gaussian model

Parameters:
  • observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space

  • action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space

  • device (str or torch.device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, "cuda" is used if available, otherwise "cpu"

  • clip_actions (bool, optional) – Flag to indicate whether the actions should be clipped to the action space (default: False)

  • clip_log_std (bool, optional) – Flag to indicate whether the log standard deviations should be clipped (default: True)

  • min_log_std (float, optional) – Minimum value of the log standard deviation (default: -20)

  • max_log_std (float, optional) – Maximum value of the log standard deviation (default: 2)

  • initial_log_std (float, optional) – Initial value for the log standard deviation (default: 0)

  • input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)

  • hiddens (int or list of ints, optional) – Number of hidden units in each hidden layer (default: [256, 256])

  • hidden_activation (list of strings, optional) – Activation function for each hidden layer (default: ["relu", "relu"])

  • output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)

  • output_activation (str or None, optional) – Activation function for the output layer (default: “tanh”)

  • output_scale (float, optional) – Scale of the output layer (default: 1.0). If None, the output layer will not be scaled

Returns:

Multivariate Gaussian model instance

Return type:

Model

skrl.utils.model_instantiators.torch.shared_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | torch.device | None = None, structure: str = '', roles: Sequence[str] = [], parameters: Sequence[Mapping[str, Any]] = []) → Model

Instantiate a shared model

Parameters:
  • observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space

  • action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space

  • device (str or torch.device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, "cuda" is used if available, otherwise "cpu"

  • structure (str, optional) – Shared model structure (default: ""). Note: this parameter is ignored for the moment

  • roles (sequence of strings, optional) – Organized list of model roles (default: [])

  • parameters (sequence of dict, optional) – Organized list of model instantiator parameters (default: [])

Returns:

Shared model instance

Return type:

Model
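A possible way to organize the `roles` and `parameters` arguments is shown below as plain data. Several things here are assumptions and not confirmed by this page: that `roles[i]` is configured by `parameters[i]`, that the role names are "policy" and "value", and that each dictionary uses the keyword arguments of the Gaussian and deterministic instantiators respectively. `pair_roles` is a hypothetical helper used only to check the pairing.

```python
from typing import Any, Dict, Mapping, Sequence

# Assumed role names, in the same order as the parameter dictionaries
roles = ["policy", "value"]

parameters = [
    {   # assumed: Gaussian-instantiator keyword arguments for the policy role
        "hiddens": [256, 256],
        "hidden_activation": ["relu", "relu"],
        "output_activation": "tanh",
        "output_scale": 1.0,
        "clip_actions": False,
        "clip_log_std": True,
        "min_log_std": -20,
        "max_log_std": 2,
        "initial_log_std": 0,
    },
    {   # assumed: deterministic-instantiator keyword arguments for the value role
        "hiddens": [256, 256],
        "hidden_activation": ["relu", "relu"],
        "output_activation": None,
        "output_scale": 1.0,
        "clip_actions": False,
    },
]


def pair_roles(
    roles: Sequence[str], parameters: Sequence[Mapping[str, Any]]
) -> Dict[str, Mapping[str, Any]]:
    """Hypothetical helper: check that every role has exactly one parameter set."""
    if len(roles) != len(parameters):
        raise ValueError("roles and parameters must have the same length")
    return dict(zip(roles, parameters))


by_role = pair_roles(roles, parameters)
```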


API (JAX)

class skrl.utils.model_instantiators.jax.Shape(value)

Enum to select the shape of the model’s inputs and outputs

property ONE

Flag to indicate that the model’s input/output has shape (1,)

This flag is useful for the definition of critic models, where the critic’s output is a scalar

property STATES

Flag to indicate that the model’s input/output is the state (observation) space of the environment. It is an alias for OBSERVATIONS

property OBSERVATIONS

Flag to indicate that the model’s input/output is the observation space of the environment

property ACTIONS

Flag to indicate that the model’s input/output is the action space of the environment

property STATES_ACTIONS

Flag to indicate that the model’s input/output is the combination (concatenation) of the state (observation) and action spaces of the environment

skrl.utils.model_instantiators.jax.categorical_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | jax.Device | None = None, unnormalized_log_prob: bool = True, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = None) → Model

Instantiate a categorical model

Parameters:
  • observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space

  • action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space

  • device (str or jax.Device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, "cuda" is used if available, otherwise "cpu"

  • unnormalized_log_prob (bool, optional) – Flag to indicate how the model’s output is interpreted (default: True). If True, the output is interpreted as unnormalized log probabilities (any real numbers); otherwise, as normalized probabilities (the output must be non-negative, finite and have a non-zero sum)

  • input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)

  • hiddens (int or list of ints, optional) – Number of hidden units in each hidden layer (default: [256, 256])

  • hidden_activation (list of strings, optional) – Activation function for each hidden layer (default: ["relu", "relu"])

  • output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)

  • output_activation (str or None, optional) – Activation function for the output layer (default: None)

Returns:

Categorical model instance

Return type:

Model

skrl.utils.model_instantiators.jax.deterministic_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | jax.Device | None = None, clip_actions: bool = False, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = 'tanh', output_scale: float = 1.0) → Model

Instantiate a deterministic model

Parameters:
  • observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space

  • action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space

  • device (str or jax.Device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, "cuda" is used if available, otherwise "cpu"

  • clip_actions (bool, optional) – Flag to indicate whether the actions should be clipped to the action space (default: False)

  • input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)

  • hiddens (int or list of ints, optional) – Number of hidden units in each hidden layer (default: [256, 256])

  • hidden_activation (list of strings, optional) – Activation function for each hidden layer (default: ["relu", "relu"])

  • output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)

  • output_activation (str or None, optional) – Activation function for the output layer (default: “tanh”)

  • output_scale (float, optional) – Scale of the output layer (default: 1.0). If None, the output layer will not be scaled

Returns:

Deterministic model instance

Return type:

Model
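The `hiddens` and `hidden_activation` lists, together with the input and output shapes, describe a stack of fully connected layers. The sketch below computes that stack as plain tuples, independent of any framework. It is an illustration only: `layer_plan` is a hypothetical helper, and the one-activation-per-hidden-layer pairing is inferred from the parameter descriptions above.

```python
from typing import List, Optional, Tuple

Layer = Tuple[int, int, Optional[str]]  # (in_features, out_features, activation)


def layer_plan(
    input_size: int,
    hiddens: List[int],
    hidden_activation: List[str],
    output_size: int,
    output_activation: Optional[str] = None,
) -> List[Layer]:
    """Hypothetical helper: list the dense layers implied by the arguments."""
    if len(hiddens) != len(hidden_activation):
        raise ValueError("one activation is expected per hidden layer")

    plan: List[Layer] = []
    previous = input_size
    for units, activation in zip(hiddens, hidden_activation):
        plan.append((previous, units, activation))
        previous = units
    # Final layer maps the last hidden layer to the requested output size
    plan.append((previous, output_size, output_activation))
    return plan


# Critic over 3 observations and 1 action (Shape.STATES_ACTIONS -> 4 inputs)
# with a scalar output (Shape.ONE -> 1 output) and no output activation
critic_layers = layer_plan(3 + 1, [256, 256], ["relu", "relu"], 1, None)
```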

skrl.utils.model_instantiators.jax.gaussian_model(observation_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, action_space: int | Tuple[int] | gym.Space | gymnasium.Space | None = None, device: str | jax.Device | None = None, clip_actions: bool = False, clip_log_std: bool = True, min_log_std: float = -20, max_log_std: float = 2, initial_log_std: float = 0, input_shape: Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: Shape = Shape.ACTIONS, output_activation: str | None = 'tanh', output_scale: float = 1.0) → Model

Instantiate a Gaussian model

Parameters:
  • observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space

  • action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space

  • device (str or jax.Device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, "cuda" is used if available, otherwise "cpu"

  • clip_actions (bool, optional) – Flag to indicate whether the actions should be clipped to the action space (default: False)

  • clip_log_std (bool, optional) – Flag to indicate whether the log standard deviations should be clipped (default: True)

  • min_log_std (float, optional) – Minimum value of the log standard deviation (default: -20)

  • max_log_std (float, optional) – Maximum value of the log standard deviation (default: 2)

  • initial_log_std (float, optional) – Initial value for the log standard deviation (default: 0)

  • input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)

  • hiddens (int or list of ints, optional) – Number of hidden units in each hidden layer (default: [256, 256])

  • hidden_activation (list of strings, optional) – Activation function for each hidden layer (default: ["relu", "relu"])

  • output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)

  • output_activation (str or None, optional) – Activation function for the output layer (default: “tanh”)

  • output_scale (float, optional) – Scale of the output layer (default: 1.0). If None, the output layer will not be scaled

Returns:

Gaussian model instance

Return type:

Model