Model instantiators

Basic usage

TODO: add snippet

API

class skrl.utils.model_instantiators.Shape(value)

Enum to select the shape of the model’s inputs and outputs

property ONE

Flag to indicate that the model’s input/output has shape (1,)

This flag is useful for the definition of critic models, where the critic’s output is a scalar

property STATES

Flag to indicate that the model’s input/output is the state (observation) space of the environment. It is an alias for OBSERVATIONS

property OBSERVATIONS

Flag to indicate that the model’s input/output is the observation space of the environment

property ACTIONS

Flag to indicate that the model’s input/output is the action space of the environment

property STATES_ACTIONS

Flag to indicate that the model’s input/output is the combination (concatenation) of the state (observation) and action spaces of the environment
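The flags above can be read as a way to pick a layer size from the environment's spaces. The following is a minimal sketch of that idea, not skrl's implementation: the enum below is a stand-in with the same member names, its integer values are assumptions, and resolve_size is a hypothetical helper.

```python
from enum import Enum


class Shape(Enum):
    """Illustrative stand-in for the Shape enum (integer values are assumptions)."""
    ONE = 1
    OBSERVATIONS = 0
    STATES = 0          # same value as OBSERVATIONS, so Enum treats it as an alias
    ACTIONS = -1
    STATES_ACTIONS = -2


def resolve_size(shape, num_observations, num_actions):
    """Hypothetical helper: map a Shape flag to a number of input/output units."""
    if shape is Shape.ONE:
        return 1
    if shape is Shape.OBSERVATIONS:  # also matches Shape.STATES (alias)
        return num_observations
    if shape is Shape.ACTIONS:
        return num_actions
    if shape is Shape.STATES_ACTIONS:
        # concatenation of the observation and action vectors
        return num_observations + num_actions
    raise ValueError(f"Unknown shape: {shape}")


# A critic that scores (state, action) pairs: concatenated input, scalar output
critic_input_size = resolve_size(Shape.STATES_ACTIONS, num_observations=4, num_actions=2)
critic_output_size = resolve_size(Shape.ONE, num_observations=4, num_actions=2)
```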

skrl.utils.model_instantiators.categorical_model(observation_space: Optional[Union[int, Tuple[int], gym.spaces.space.Space, gymnasium.spaces.space.Space]] = None, action_space: Optional[Union[int, Tuple[int], gym.spaces.space.Space, gymnasium.spaces.space.Space]] = None, device: Optional[Union[str, torch.device]] = None, unnormalized_log_prob: bool = False, input_shape: skrl.utils.model_instantiators.Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: skrl.utils.model_instantiators.Shape = Shape.ACTIONS, output_activation: Optional[str] = None) skrl.models.torch.base.Model

Instantiate a categorical model

Parameters
  • observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space

  • action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space

  • device (str or torch.device, optional) – Device on which a torch tensor is or will be allocated (default: None). If None, the device will be either "cuda:0" if available or "cpu"

  • unnormalized_log_prob (bool, optional) – Flag to indicate how the model’s output is interpreted (default: False). If True, the model’s output is interpreted as unnormalized log probabilities (it can be any real number), otherwise as normalized probabilities (the output must be non-negative, finite and have a non-zero sum)

  • input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)

  • hiddens (int or list of ints) – Number of hidden units in each hidden layer (default: [256, 256])

  • hidden_activation (list of strings) – Activation function for each hidden layer (default: ["relu", "relu"])

  • output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)

  • output_activation (str or None, optional) – Activation function for the output layer (default: None)

Returns

Categorical model instance

Return type

Model
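The two interpretations selected by unnormalized_log_prob can be illustrated without any deep learning library. This is a sketch of the underlying arithmetic only, under the assumption that the flag corresponds to the usual logits-versus-probabilities distinction of a categorical distribution; to_probabilities is a hypothetical helper, not part of skrl.

```python
import math


def to_probabilities(outputs, unnormalized_log_prob):
    """Hypothetical helper: turn raw model outputs into action probabilities."""
    if unnormalized_log_prob:
        # Any real numbers are allowed: apply a numerically stable softmax
        largest = max(outputs)
        exps = [math.exp(o - largest) for o in outputs]
        total = sum(exps)
        return [e / total for e in exps]
    # Otherwise the outputs must be non-negative, finite and have a non-zero sum
    if any(o < 0 or not math.isfinite(o) for o in outputs):
        raise ValueError("outputs must be non-negative and finite")
    total = sum(outputs)
    if total == 0:
        raise ValueError("outputs must have a non-zero sum")
    return [o / total for o in outputs]


from_logits = to_probabilities([-1.0, 0.5, 2.0], unnormalized_log_prob=True)
from_probs = to_probabilities([1.0, 1.0, 2.0], unnormalized_log_prob=False)
```

In the second mode a negative output is an error, which is why an unbounded output layer (output_activation of None) pairs more naturally with unnormalized_log_prob set to True.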

skrl.utils.model_instantiators.deterministic_model(observation_space: Optional[Union[int, Tuple[int], gym.spaces.space.Space, gymnasium.spaces.space.Space]] = None, action_space: Optional[Union[int, Tuple[int], gym.spaces.space.Space, gymnasium.spaces.space.Space]] = None, device: Optional[Union[str, torch.device]] = None, clip_actions: bool = False, input_shape: skrl.utils.model_instantiators.Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: skrl.utils.model_instantiators.Shape = Shape.ACTIONS, output_activation: Optional[str] = 'tanh', output_scale: float = 1.0) skrl.models.torch.base.Model

Instantiate a deterministic model

Parameters
  • observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space

  • action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space

  • device (str or torch.device, optional) – Device on which a torch tensor is or will be allocated (default: None). If None, the device will be either "cuda:0" if available or "cpu"

  • clip_actions (bool, optional) – Flag to indicate whether the actions should be clipped to the action space (default: False)

  • input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)

  • hiddens (int or list of ints) – Number of hidden units in each hidden layer (default: [256, 256])

  • hidden_activation (list of strings) – Activation function for each hidden layer (default: ["relu", "relu"])

  • output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)

  • output_activation (str or None, optional) – Activation function for the output layer (default: “tanh”)

  • output_scale (float, optional) – Scale of the output layer (default: 1.0). If None, the output layer will not be scaled

Returns

Deterministic model instance

Return type

Model
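As a usage sketch, a critic that maps a (state, action) pair to a scalar could be requested as follows. It assumes an installed skrl version whose signature matches the one listed above; make_critic and the layer sizes are illustrative choices, and the import is done inside the function so the snippet can be loaded without skrl.

```python
def make_critic(observation_space, action_space, device=None):
    """Illustrative wrapper: Q-function critic built with deterministic_model."""
    # Imported lazily; requires skrl with the signature documented above
    from skrl.utils.model_instantiators import Shape, deterministic_model

    return deterministic_model(
        observation_space=observation_space,
        action_space=action_space,
        device=device,
        clip_actions=False,
        input_shape=Shape.STATES_ACTIONS,   # concatenated state and action
        hiddens=[64, 64],                   # one entry per hidden layer
        hidden_activation=["relu", "relu"], # one activation per hidden layer
        output_shape=Shape.ONE,             # scalar value estimate
        output_activation=None,             # leave the value unbounded
        output_scale=1.0,
    )
```

Setting output_activation to None overrides the "tanh" default, which would otherwise bound the critic's output.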

skrl.utils.model_instantiators.gaussian_model(observation_space: Optional[Union[int, Tuple[int], gym.spaces.space.Space, gymnasium.spaces.space.Space]] = None, action_space: Optional[Union[int, Tuple[int], gym.spaces.space.Space, gymnasium.spaces.space.Space]] = None, device: Optional[Union[str, torch.device]] = None, clip_actions: bool = False, clip_log_std: bool = True, min_log_std: float = - 20, max_log_std: float = 2, input_shape: skrl.utils.model_instantiators.Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: skrl.utils.model_instantiators.Shape = Shape.ACTIONS, output_activation: Optional[str] = 'tanh', output_scale: float = 1.0) skrl.models.torch.base.Model

Instantiate a Gaussian model

Parameters
  • observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space

  • action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space

  • device (str or torch.device, optional) – Device on which a torch tensor is or will be allocated (default: None). If None, the device will be either "cuda:0" if available or "cpu"

  • clip_actions (bool, optional) – Flag to indicate whether the actions should be clipped (default: False)

  • clip_log_std (bool, optional) – Flag to indicate whether the log standard deviations should be clipped (default: True)

  • min_log_std (float, optional) – Minimum value of the log standard deviation (default: -20)

  • max_log_std (float, optional) – Maximum value of the log standard deviation (default: 2)

  • input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)

  • hiddens (int or list of ints) – Number of hidden units in each hidden layer (default: [256, 256])

  • hidden_activation (list of strings) – Activation function for each hidden layer (default: ["relu", "relu"])

  • output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)

  • output_activation (str or None, optional) – Activation function for the output layer (default: “tanh”)

  • output_scale (float, optional) – Scale of the output layer (default: 1.0). If None, the output layer will not be scaled

Returns

Gaussian model instance

Return type

Model
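The effect of clip_log_std, min_log_std and max_log_std can be sketched in plain Python. This shows only the clamping arithmetic for a single value, assuming the clipped log standard deviation is exponentiated to obtain the standard deviation; both helpers are hypothetical.

```python
import math


def clamp_log_std(log_std, clip_log_std=True, min_log_std=-20.0, max_log_std=2.0):
    """Hypothetical helper: restrict a log standard deviation to [min, max]."""
    if not clip_log_std:
        return log_std
    return max(min_log_std, min(max_log_std, log_std))


def std_from_log_std(log_std, **kwargs):
    """Standard deviation after optional clipping of its logarithm."""
    return math.exp(clamp_log_std(log_std, **kwargs))


too_wide = clamp_log_std(5.0)      # clipped down to max_log_std
too_narrow = clamp_log_std(-50.0)  # clipped up to min_log_std
unclipped = clamp_log_std(5.0, clip_log_std=False)
```

With the defaults, the standard deviation stays between exp(-20) and exp(2), which keeps the distribution from collapsing to a point or becoming arbitrarily wide.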

skrl.utils.model_instantiators.multivariate_gaussian_model(observation_space: Optional[Union[int, Tuple[int], gym.spaces.space.Space, gymnasium.spaces.space.Space]] = None, action_space: Optional[Union[int, Tuple[int], gym.spaces.space.Space, gymnasium.spaces.space.Space]] = None, device: Optional[Union[str, torch.device]] = None, clip_actions: bool = False, clip_log_std: bool = True, min_log_std: float = - 20, max_log_std: float = 2, input_shape: skrl.utils.model_instantiators.Shape = Shape.STATES, hiddens: list = [256, 256], hidden_activation: list = ['relu', 'relu'], output_shape: skrl.utils.model_instantiators.Shape = Shape.ACTIONS, output_activation: Optional[str] = 'tanh', output_scale: float = 1.0) skrl.models.torch.base.Model

Instantiate a multivariate Gaussian model

Parameters
  • observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space

  • action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space

  • device (str or torch.device, optional) – Device on which a torch tensor is or will be allocated (default: None). If None, the device will be either "cuda:0" if available or "cpu"

  • clip_actions (bool, optional) – Flag to indicate whether the actions should be clipped (default: False)

  • clip_log_std (bool, optional) – Flag to indicate whether the log standard deviations should be clipped (default: True)

  • min_log_std (float, optional) – Minimum value of the log standard deviation (default: -20)

  • max_log_std (float, optional) – Maximum value of the log standard deviation (default: 2)

  • input_shape (Shape, optional) – Shape of the input (default: Shape.STATES)

  • hiddens (int or list of ints) – Number of hidden units in each hidden layer (default: [256, 256])

  • hidden_activation (list of strings) – Activation function for each hidden layer (default: ["relu", "relu"])

  • output_shape (Shape, optional) – Shape of the output (default: Shape.ACTIONS)

  • output_activation (str or None, optional) – Activation function for the output layer (default: “tanh”)

  • output_scale (float, optional) – Scale of the output layer (default: 1.0). If None, the output layer will not be scaled

Returns

Multivariate Gaussian model instance

Return type

Model
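The interaction between output_activation and output_scale can be sketched as follows. The order of operations (activation first, then scaling) is an assumption, since the reference above does not state it, and scaled_output is a hypothetical helper that supports only "tanh" and None.

```python
import math


def scaled_output(pre_activations, output_activation="tanh", output_scale=1.0):
    """Hypothetical helper: apply the output activation, then the output scale."""
    activations = {"tanh": math.tanh, None: lambda x: x}
    activate = activations[output_activation]
    outputs = [activate(x) for x in pre_activations]
    if output_scale is None:
        # No scaling requested
        return outputs
    return [output_scale * y for y in outputs]


# With tanh and a scale of 2.0, every output lies in [-2.0, 2.0]
bounded = scaled_output([-100.0, 0.0, 100.0], output_activation="tanh", output_scale=2.0)
# Without activation or scale, values pass through unchanged
raw = scaled_output([3.0], output_activation=None, output_scale=None)
```

Under this assumption, choosing output_scale equal to the largest action magnitude lets a tanh output cover a symmetric action range.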

skrl.utils.model_instantiators.shared_model(observation_space: Optional[Union[int, Tuple[int], gym.spaces.space.Space, gymnasium.spaces.space.Space]] = None, action_space: Optional[Union[int, Tuple[int], gym.spaces.space.Space, gymnasium.spaces.space.Space]] = None, device: Optional[Union[str, torch.device]] = None, structure: str = '', roles: Sequence[str] = [], parameters: Sequence[Mapping[str, Any]] = []) skrl.models.torch.base.Model

Instantiate a shared model

Parameters
  • observation_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Observation/state space or shape (default: None). If it is not None, the num_observations property will contain the size of that space

  • action_space (int, tuple or list of integers, gym.Space, gymnasium.Space or None, optional) – Action space or shape (default: None). If it is not None, the num_actions property will contain the size of that space

  • device (str or torch.device, optional) – Device on which a torch tensor is or will be allocated (default: None). If None, the device will be either "cuda:0" if available or "cpu"

  • structure (str, optional) – Shared model structure (default: ""). Note: this parameter is currently ignored

  • roles (sequence of strings, optional) – Ordered list of model roles (default: [])

  • parameters (sequence of dict, optional) – Ordered list of model instantiator parameters, one dict per role (default: [])

Returns

Shared model instance

Return type

Model
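A usage sketch for a policy and value function that share one model might look like the following. It rests on several assumptions not stated above: that the i-th dict in parameters configures the i-th role, that each dict uses the keyword names of gaussian_model and deterministic_model, and that "policy" and "value" are accepted role names. make_shared_policy_value is an illustrative wrapper, and the import is done lazily so the snippet loads without skrl.

```python
def make_shared_policy_value(observation_space, action_space, device=None):
    """Illustrative wrapper: Gaussian policy and scalar value sharing one model."""
    # Imported lazily; requires skrl with the signature documented above
    from skrl.utils.model_instantiators import Shape, shared_model

    # Assumed to follow gaussian_model's keyword names
    policy_parameters = {
        "clip_actions": False,
        "clip_log_std": True,
        "min_log_std": -20.0,
        "max_log_std": 2.0,
        "input_shape": Shape.STATES,
        "hiddens": [256, 256],
        "hidden_activation": ["relu", "relu"],
        "output_shape": Shape.ACTIONS,
        "output_activation": "tanh",
        "output_scale": 1.0,
    }
    # Assumed to follow deterministic_model's keyword names
    value_parameters = {
        "clip_actions": False,
        "input_shape": Shape.STATES,
        "hiddens": [256, 256],
        "hidden_activation": ["relu", "relu"],
        "output_shape": Shape.ONE,
        "output_activation": None,
        "output_scale": 1.0,
    }
    return shared_model(
        observation_space=observation_space,
        action_space=action_space,
        device=device,
        roles=["policy", "value"],                       # same order as parameters
        parameters=[policy_parameters, value_parameters],
    )
```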