Wrapping (multi-agents)



This library works with a common API to interact with the following RL multi-agent environments:

  • Farama PettingZoo (parallel API)

  • Bi-DexHands

To operate with them, and to support interoperability between these otherwise incompatible interfaces, a wrapping mechanism is provided, as shown in the diagram below.


[Figure: Environment wrapping]

Usage

# import the environment wrapper
from skrl.envs.wrappers.torch import wrap_env

# import a PettingZoo environment
from pettingzoo.sisl import multiwalker_v9

# load the environment
env = multiwalker_v9.parallel_env()

# wrap the environment
env = wrap_env(env)  # or 'env = wrap_env(env, wrapper="pettingzoo")'
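
Once wrapped, the environment exposes the common multi-agent interface described below, in which observations, rewards and actions are dictionaries keyed by agent name. The following is a minimal interaction sketch (randomly sampled actions are used only for illustration; a trained agent would produce the action tensors directly):

import torch

# reset the environment: per-agent dictionaries of observations and infos
observations, infos = env.reset()

for timestep in range(100):
    # sample a random action for each agent and convert it to a torch tensor
    actions = {agent: torch.as_tensor(env.action_space(agent).sample(), device=env.device)
               for agent in env.agents}
    # step the environment: per-agent dictionaries of observations, rewards, etc.
    observations, rewards, terminated, truncated, infos = env.step(actions)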

API (PyTorch)

skrl.envs.wrappers.torch.wrap_env(env: Any, wrapper: str = 'auto', verbose: bool = True) Wrapper | MultiAgentEnvWrapper

Wrap an environment to use a common interface

Example:

>>> from skrl.envs.wrappers.torch import wrap_env
>>>
>>> # assuming that there is an environment called "env"
>>> env = wrap_env(env)
Parameters:
  • env (gym.Env, gymnasium.Env, dm_env.Environment or VecTask) – The environment to be wrapped

  • wrapper (str, optional) –

    The type of wrapper to use (default: "auto"). If "auto", the wrapper will be automatically selected based on the environment class. The supported wrappers are described in the following table:

    Environment              Wrapper tag
    -----------------------  ---------------------
    OpenAI Gym               "gym"
    Gymnasium                "gymnasium"
    Petting Zoo              "pettingzoo"
    DeepMind                 "dm"
    Robosuite                "robosuite"
    Bi-DexHands              "bidexhands"
    Isaac Gym preview 2      "isaacgym-preview2"
    Isaac Gym preview 3      "isaacgym-preview3"
    Isaac Gym preview 4      "isaacgym-preview4"
    Omniverse Isaac Gym      "omniverse-isaacgym"
    Isaac Sim (orbit)        "isaac-orbit"

  • verbose (bool, optional) – Whether to print the wrapper type (default: True)

Raises:

ValueError – Unknown wrapper type

Returns:

Wrapped environment

Return type:

Wrapper or MultiAgentEnvWrapper
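
If automatic detection is not desired, or fails for a custom environment class, the wrapper tag from the table above can be passed explicitly, and the log message can be silenced with verbose=False. A short sketch, assuming an already instantiated PettingZoo parallel environment called "env":

from skrl.envs.wrappers.torch import wrap_env

# force the PettingZoo wrapper instead of relying on automatic selection
env = wrap_env(env, wrapper="pettingzoo", verbose=False)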


API (JAX)

skrl.envs.wrappers.jax.wrap_env(env: Any, wrapper: str = 'auto', verbose: bool = True) Wrapper | MultiAgentEnvWrapper

Wrap an environment to use a common interface

Example:

>>> from skrl.envs.wrappers.jax import wrap_env
>>>
>>> # assuming that there is an environment called "env"
>>> env = wrap_env(env)
Parameters:
  • env (gym.Env, gymnasium.Env, dm_env.Environment or VecTask) – The environment to be wrapped

  • wrapper (str, optional) –

    The type of wrapper to use (default: "auto"). If "auto", the wrapper will be automatically selected based on the environment class. The supported wrappers are described in the following table:

    Environment              Wrapper tag
    -----------------------  ---------------------
    OpenAI Gym               "gym"
    Gymnasium                "gymnasium"
    Petting Zoo              "pettingzoo"
    Bi-DexHands              "bidexhands"
    Isaac Gym preview 2      "isaacgym-preview2"
    Isaac Gym preview 3      "isaacgym-preview3"
    Isaac Gym preview 4      "isaacgym-preview4"
    Omniverse Isaac Gym      "omniverse-isaacgym"
    Isaac Sim (orbit)        "isaac-orbit"

  • verbose (bool, optional) – Whether to print the wrapper type (default: True)

Raises:

ValueError – Unknown wrapper type

Returns:

Wrapped environment

Return type:

Wrapper or MultiAgentEnvWrapper
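
The JAX variant is used in the same way; the difference is that the wrapped environment exchanges np.ndarray or jax.Array data instead of torch tensors. A short sketch using the PettingZoo environment from the usage example above:

from pettingzoo.sisl import multiwalker_v9
from skrl.envs.wrappers.jax import wrap_env

# load and wrap the environment
env = wrap_env(multiwalker_v9.parallel_env(), wrapper="pettingzoo")

# per-agent dictionaries of np.ndarray or jax.Array values
observations, infos = env.reset()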


Internal API (PyTorch)

class skrl.envs.wrappers.torch.MultiAgentEnvWrapper(env: Any)

Bases: object

__init__(env: Any) None

Base wrapper class for multi-agent environments

Parameters:

env (Any supported multi-agent environment) – The multi-agent environment to wrap

property device

The device used by the environment

If the wrapped environment does not have the device property, the value of this property will be "cuda:0" or "cpu" depending on device availability

property possible_agents

A list of all the possible agents the environment could generate

action_space(agent: str) gym.Space

Action space

Parameters:

agent (str) – Name of the agent

Returns:

The action space for the specified agent

Return type:

gym.Space

property action_spaces: Mapping[str, gym.Space]

Action spaces

property agents: Sequence[str]

Names of all current agents

These may be changed as an environment progresses (i.e. agents can be added or removed)

close() None

Close the environment

property num_agents: int

Number of agents

If the wrapped environment does not have the num_agents property, it will be set to 1

property num_envs: int

Number of environments

If the wrapped environment does not have the num_envs property, it will be set to 1

observation_space(agent: str) gym.Space

Observation space

Parameters:

agent (str) – Name of the agent

Returns:

The observation space for the specified agent

Return type:

gym.Space

property observation_spaces: Mapping[str, gym.Space]

Observation spaces

render(*args, **kwargs) None

Render the environment

reset() Tuple[Mapping[str, torch.Tensor], Mapping[str, Any]]

Reset the environment

Raises:

NotImplementedError – Not implemented

Returns:

Observation, info

Return type:

tuple of dictionaries of torch.Tensor and any other info

shared_observation_space(agent: str) gym.Space

Shared observation space

Parameters:

agent (str) – Name of the agent

Returns:

The shared observation space for the specified agent

Return type:

gym.Space

property shared_observation_spaces: Mapping[str, gym.Space]

Shared observation spaces

shared_state_space(agent: str) gym.Space

Shared state space

Parameters:

agent (str) – Name of the agent

Returns:

The shared state space for the specified agent

Return type:

gym.Space

property shared_state_spaces: Mapping[str, gym.Space]

Shared state spaces

An alias for the shared_observation_spaces property

state_space(agent: str) gym.Space

State space

Parameters:

agent (str) – Name of the agent

Returns:

The state space for the specified agent

Return type:

gym.Space

property state_spaces: Mapping[str, gym.Space]

State spaces

An alias for the observation_spaces property

step(actions: Mapping[str, torch.Tensor]) Tuple[Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, Any]]

Perform a step in the environment

Parameters:

actions (dictionary of torch.Tensor) – The actions to perform

Raises:

NotImplementedError – Not implemented

Returns:

Observation, reward, terminated, truncated, info

Return type:

tuple of dictionaries of torch.Tensor and any other info
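
reset and step are left unimplemented in this base class and are overridden by the concrete wrappers below. As an illustration only, a custom wrapper for an otherwise unsupported multi-agent environment could be sketched as follows (the environment is assumed, hypothetically, to exchange per-agent numpy data, and the wrapped instance is assumed to be stored as self._env, as in the other skrl wrappers):

from typing import Any, Mapping, Tuple

import torch

from skrl.envs.wrappers.torch import MultiAgentEnvWrapper

class CustomMultiAgentWrapper(MultiAgentEnvWrapper):
    """Sketch of a wrapper for a hypothetical multi-agent environment"""

    def _to_tensors(self, data: Mapping[str, Any]) -> Mapping[str, torch.Tensor]:
        # convert per-agent values to torch tensors on the wrapper's device
        return {agent: torch.as_tensor(value, device=self.device) for agent, value in data.items()}

    def reset(self) -> Tuple[Mapping[str, torch.Tensor], Mapping[str, Any]]:
        observations, infos = self._env.reset()
        return self._to_tensors(observations), infos

    def step(self, actions: Mapping[str, torch.Tensor]):
        # convert per-agent torch tensors to the (assumed) numpy format of the wrapped environment
        actions = {agent: action.cpu().numpy() for agent, action in actions.items()}
        observations, rewards, terminated, truncated, infos = self._env.step(actions)
        return (self._to_tensors(observations), self._to_tensors(rewards),
                self._to_tensors(terminated), self._to_tensors(truncated), infos)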

class skrl.envs.wrappers.torch.BiDexHandsWrapper(env: Any)

Bases: MultiAgentEnvWrapper

__init__(env: Any) None

Bi-DexHands wrapper

Parameters:

env (Any supported Bi-DexHands environment) – The environment to wrap

property action_spaces: Mapping[str, gym.Space]

Action spaces

property agents: Sequence[str]

Names of all current agents

These may be changed as an environment progresses (i.e. agents can be added or removed)

property observation_spaces: Mapping[str, gym.Space]

Observation spaces

reset() Tuple[Mapping[str, torch.Tensor], Mapping[str, Any]]

Reset the environment

Returns:

Observation, info

Return type:

tuple of dictionaries of torch.Tensor and any other info

property shared_observation_spaces: Mapping[str, gym.Space]

Shared observation spaces

step(actions: Mapping[str, torch.Tensor]) Tuple[Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, Any]]

Perform a step in the environment

Parameters:

actions (dictionary of torch.Tensor) – The actions to perform

Returns:

Observation, reward, terminated, truncated, info

Return type:

tuple of dictionaries of torch.Tensor and any other info

class skrl.envs.wrappers.torch.PettingZooWrapper(env: Any)

Bases: MultiAgentEnvWrapper

__init__(env: Any) None

PettingZoo (parallel) environment wrapper

Parameters:

env (Any supported PettingZoo (parallel) environment) – The environment to wrap

property action_spaces: Mapping[str, gymnasium.Space]

Action spaces

property agents: Sequence[str]

Names of all current agents

These may be changed as an environment progresses (i.e. agents can be added or removed)

close() None

Close the environment

property num_agents: int

Number of agents

property observation_spaces: Mapping[str, gymnasium.Space]

Observation spaces

render(*args, **kwargs) None

Render the environment

reset() Tuple[Mapping[str, torch.Tensor], Mapping[str, Any]]

Reset the environment

Returns:

Observation, info

Return type:

tuple of dictionaries of torch.Tensor and any other info

property shared_observation_spaces: Mapping[str, gymnasium.Space]

Shared observation spaces

step(actions: Mapping[str, torch.Tensor]) Tuple[Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, Any]]

Perform a step in the environment

Parameters:

actions (dictionary of torch.Tensor) – The actions to perform

Returns:

Observation, reward, terminated, truncated, info

Return type:

tuple of dictionaries of torch.Tensor and any other info
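
For orientation, the sketch below reuses the multiwalker environment from the usage example and queries the per-agent names and spaces exposed by the wrapper (the agent names, such as "walker_0", are those generated by that particular environment):

from pettingzoo.sisl import multiwalker_v9
from skrl.envs.wrappers.torch import wrap_env

env = wrap_env(multiwalker_v9.parallel_env(), wrapper="pettingzoo")

print(env.possible_agents)                # names of all the agents the environment could generate
print(env.num_agents, env.num_envs)       # number of agents and of environments
print(env.observation_space("walker_0"))  # observation space of a single agent
print(env.action_space("walker_0"))       # action space of a single agent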


Internal API (JAX)

class skrl.envs.wrappers.jax.MultiAgentEnvWrapper(env: Any)

Bases: object

__init__(env: Any) None

Base wrapper class for multi-agent environments

Parameters:

env (Any supported multi-agent environment) – The multi-agent environment to wrap

property device

The device used by the environment

If the wrapped environment does not have the device property, the value of this property will be "cuda:0" or "cpu" depending on device availability

property possible_agents

A list of all the possible agents the environment could generate

action_space(agent: str) gym.Space

Action space

Parameters:

agent (str) – Name of the agent

Returns:

The action space for the specified agent

Return type:

gym.Space

property action_spaces: Mapping[str, gym.Space]

Action spaces

property agents: Sequence[str]

Names of all current agents

These may be changed as an environment progresses (i.e. agents can be added or removed)

close() None

Close the environment

property num_agents: int

Number of agents

If the wrapped environment does not have the num_agents property, it will be set to 1

property num_envs: int

Number of environments

If the wrapped environment does not have the num_envs property, it will be set to 1

observation_space(agent: str) gym.Space

Observation space

Parameters:

agent (str) – Name of the agent

Returns:

The observation space for the specified agent

Return type:

gym.Space

property observation_spaces: Mapping[str, gym.Space]

Observation spaces

render(*args, **kwargs) None

Render the environment

reset() Tuple[Mapping[str, ndarray | jax.Array], Mapping[str, Any]]

Reset the environment

Raises:

NotImplementedError – Not implemented

Returns:

Observation, info

Return type:

tuple of dict of np.ndarray or jax.Array and any other info

shared_observation_space(agent: str) gym.Space

Shared observation space

Parameters:

agent (str) – Name of the agent

Returns:

The shared observation space for the specified agent

Return type:

gym.Space

property shared_observation_spaces: Mapping[str, gym.Space]

Shared observation spaces

shared_state_space(agent: str) gym.Space

Shared state space

Parameters:

agent (str) – Name of the agent

Returns:

The shared state space for the specified agent

Return type:

gym.Space

property shared_state_spaces: Mapping[str, gym.Space]

Shared state spaces

An alias for the shared_observation_spaces property

state_space(agent: str) gym.Space

State space

Parameters:

agent (str) – Name of the agent

Returns:

The state space for the specified agent

Return type:

gym.Space

property state_spaces: Mapping[str, gym.Space]

State spaces

An alias for the observation_spaces property

step(actions: Mapping[str, ndarray | jax.Array]) Tuple[Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, Any]]

Perform a step in the environment

Parameters:

actions (dict of np.ndarray or jax.Array) – The actions to perform

Raises:

NotImplementedError – Not implemented

Returns:

Observation, reward, terminated, truncated, info

Return type:

tuple of dict of np.ndarray or jax.Array and any other info
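
As in the PyTorch implementation, reset and step are overridden by the concrete wrappers below; from the user's point of view, interaction is the same loop over per-agent dictionaries, with np.ndarray or jax.Array values instead of torch tensors. A minimal sketch, assuming a wrapped PettingZoo environment called "env" (random actions are used only for illustration):

observations, infos = env.reset()

for timestep in range(100):
    # per-agent random actions; the spaces yield np.ndarray values, which the step signature accepts
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminated, truncated, infos = env.step(actions)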

class skrl.envs.wrappers.jax.BiDexHandsWrapper(env: Any)

Bases: MultiAgentEnvWrapper

__init__(env: Any) None

Bi-DexHands wrapper

Parameters:

env (Any supported Bi-DexHands environment) – The environment to wrap

property action_spaces: Mapping[str, gym.Space]

Action spaces

property agents: Sequence[str]

Names of all current agents

These may be changed as an environment progresses (i.e. agents can be added or removed)

property observation_spaces: Mapping[str, gym.Space]

Observation spaces

reset() Tuple[Mapping[str, ndarray | jax.Array], Mapping[str, Any]]

Reset the environment

Returns:

Observation, info

Return type:

tuple of dict of np.ndarray or jax.Array and any other info

property shared_observation_spaces: Mapping[str, gym.Space]

Shared observation spaces

step(actions: Mapping[str, ndarray | jax.Array]) Tuple[Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, Any]]

Perform a step in the environment

Parameters:

actions (dict of np.ndarray or jax.Array) – The actions to perform

Returns:

Observation, reward, terminated, truncated, info

Return type:

tuple of dict of np.ndarray or jax.Array and any other info

class skrl.envs.wrappers.jax.PettingZooWrapper(env: Any)

Bases: MultiAgentEnvWrapper

__init__(env: Any) None

PettingZoo (parallel) environment wrapper

Parameters:

env (Any supported PettingZoo (parallel) environment) – The environment to wrap

property action_spaces: Mapping[str, gymnasium.Space]

Action spaces

property agents: Sequence[str]

Names of all current agents

These may be changed as an environment progresses (i.e. agents can be added or removed)

close() None

Close the environment

property num_agents: int

Number of agents

property observation_spaces: Mapping[str, gymnasium.Space]

Observation spaces

render(*args, **kwargs) None

Render the environment

reset() Tuple[Mapping[str, ndarray | jax.Array], Mapping[str, Any]]

Reset the environment

Returns:

Observation, info

Return type:

tuple of dict of np.ndarray or jax.Array and any other info

property shared_observation_spaces: Mapping[str, gymnasium.Space]

Shared observation spaces

step(actions: Mapping[str, ndarray | jax.Array]) Tuple[Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, Any]]

Perform a step in the environment

Parameters:

actions (dict of np.ndarray or jax.Array) – The actions to perform

Returns:

Observation, reward, terminated, truncated, info

Return type:

tuple of dict of np.ndarray or jax.Array and any other info