Wrapping (multi-agents)



This library provides a common API for interacting with the following multi-agent RL environments:

  • PettingZoo (Parallel API)

  • Isaac Lab (multi-agent environments)

To use these environments out of the box and to bridge their incompatible interfaces, the library provides a wrapping mechanism, as shown in the following image.


[Figure: Environment wrapping]

Usage

The following snippet shows how to load and wrap a multi-agent environment, using Isaac Lab with PyTorch as the example:


# import the environment wrapper and loader
from skrl.envs.wrappers.torch import wrap_env
from skrl.envs.loaders.torch import load_isaaclab_env

# load the environment
env = load_isaaclab_env(task_name="Isaac-Cart-Double-Pendulum-Direct-v0")

# wrap the environment
env = wrap_env(env)  # or 'env = wrap_env(env, wrapper="isaaclab-multi-agent")'

API


PyTorch

skrl.envs.wrappers.torch.wrap_env(env: Any, wrapper: Literal['auto', 'gym', 'gymnasium', 'isaaclab', 'isaaclab-single-agent', 'isaaclab-multi-agent', 'mani-skill', 'pettingzoo', 'playground'] = 'auto', verbose: bool = True) → Wrapper | MultiAgentEnvWrapper

Wrap an environment to use a common interface.

Example:

>>> from skrl.envs.wrappers.torch import wrap_env
>>>
>>> # assuming that there is an environment called "env"
>>> env = wrap_env(env)
Parameters:
  • env – The environment instance to be wrapped.

  • wrapper – The type of wrapper to use. If "auto", the wrapper will be automatically selected based on the environment class. The supported wrappers are described in the following tables:

    Single-agent environments

    Environment          Wrapper tag
    OpenAI Gym           "gym"
    Gymnasium            "gymnasium"
    Isaac Lab            "isaaclab" ("isaaclab-single-agent")
    ManiSkill            "mani-skill"
    MuJoCo Playground    "playground"

    Multi-agent environments

    Environment          Wrapper tag
    PettingZoo           "pettingzoo"
    Isaac Lab            "isaaclab" ("isaaclab-multi-agent")

  • verbose – Whether to print verbose information about the environment and the wrapper.

Returns:

Wrapped environment instance.

Raises:

ValueError – Unknown wrapper type.
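
The tag handling documented above can be pictured with a small, self-contained sketch. It is not skrl's implementation: SUPPORTED_TAGS, MULTI_AGENT_TAGS, check_wrapper_tag and is_multi_agent_tag are hypothetical names, and only the tag values and the ValueError behavior are taken from the signature and parameter description above.

```python
# Hypothetical sketch (not skrl code): validate a wrapper tag against the
# values listed in the wrap_env signature.
SUPPORTED_TAGS = (
    "auto", "gym", "gymnasium",
    "isaaclab", "isaaclab-single-agent", "isaaclab-multi-agent",
    "mani-skill", "pettingzoo", "playground",
)

# Tags that explicitly select a multi-agent wrapper. The plain "isaaclab" tag
# is left out because it can refer to either kind of environment.
MULTI_AGENT_TAGS = ("pettingzoo", "isaaclab-multi-agent")


def check_wrapper_tag(wrapper: str = "auto") -> str:
    """Return the tag unchanged if it is supported, otherwise raise ValueError."""
    if wrapper not in SUPPORTED_TAGS:
        raise ValueError(f"Unknown wrapper type: {wrapper}")
    return wrapper


def is_multi_agent_tag(wrapper: str) -> bool:
    """Return True for tags that explicitly select a multi-agent wrapper."""
    return check_wrapper_tag(wrapper) in MULTI_AGENT_TAGS
```

With "auto", the real function inspects the environment class instead of relying on the tag; that detection logic is not reproduced here.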


JAX

skrl.envs.wrappers.jax.wrap_env(env: Any, wrapper: Literal['auto', 'gym', 'gymnasium', 'isaaclab', 'isaaclab-single-agent', 'isaaclab-multi-agent', 'mani-skill', 'pettingzoo', 'playground'] = 'auto', verbose: bool = True) → Wrapper | MultiAgentEnvWrapper

Wrap an environment to use a common interface.

Example:

>>> from skrl.envs.wrappers.jax import wrap_env
>>>
>>> # assuming that there is an environment called "env"
>>> env = wrap_env(env)
Parameters:
  • env – The environment instance to be wrapped.

  • wrapper – The type of wrapper to use. If "auto", the wrapper will be automatically selected based on the environment class. The supported wrappers are described in the following tables:

    Single-agent environments

    Environment          Wrapper tag
    OpenAI Gym           "gym"
    Gymnasium            "gymnasium"
    Isaac Lab            "isaaclab" ("isaaclab-single-agent")
    ManiSkill            "mani-skill"
    MuJoCo Playground    "playground"

    Multi-agent environments

    Environment          Wrapper tag
    PettingZoo           "pettingzoo"
    Isaac Lab            "isaaclab" ("isaaclab-multi-agent")

  • verbose – Whether to print verbose information about the environment and the wrapper.

Returns:

Wrapped environment instance.

Raises:

ValueError – Unknown wrapper type.


Internal API


PyTorch

MultiAgentEnvWrapper

Base wrapper class for multi-agent environments.

IsaacLabMultiAgentWrapper

Isaac Lab environment wrapper for multi-agent implementation.

PettingZooWrapper

PettingZoo (Parallel API) environment wrapper.

class skrl.envs.wrappers.torch.MultiAgentEnvWrapper(env: Any)

Bases: ABC

Base wrapper class for multi-agent environments.

Parameters:

env – The multi-agent environment instance to wrap.

Methods:

action_space(agent)

Action space.

close()

Close the environment.

observation_space(agent)

Observation space.

render(*args, **kwargs)

Render the environment.

reset()

Reset the environment.

state()

Get the environment state.

state_space(agent)

State space.

step(actions)

Perform a step in the environment.

Attributes:

action_spaces

Action spaces.

agents

Names of all current agents.

device

The device used by the environment.

max_num_agents

Number of possible agents the environment could generate.

num_agents

Number of current agents.

num_envs

Number of environments.

observation_spaces

Observation spaces.

possible_agents

Names of all possible agents the environment could generate.

state_spaces

State spaces.

action_space(agent: str) → gymnasium.Space

Action space.

Parameters:

agent – Name of the agent.

Returns:

The action space for the specified agent.

abstractmethod close() → None

Close the environment.

observation_space(agent: str) → gymnasium.Space

Observation space.

Parameters:

agent – Name of the agent.

Returns:

The observation space for the specified agent.

abstractmethod render(*args, **kwargs) → Any

Render the environment.

Returns:

Any value from the wrapped environment.

abstractmethod reset() → tuple[dict[str, torch.Tensor], dict[str, Any]]

Reset the environment.

Returns:

Observation, info.

abstractmethod state() → dict[str, torch.Tensor | None]

Get the environment state.

Returns:

State.

state_space(agent: str) → gymnasium.Space | None

State space.

See state_spaces for more details.

Parameters:

agent – Name of the agent.

Returns:

The state space for the specified agent.

abstractmethod step(actions: dict[str, torch.Tensor]) → tuple[dict[str, torch.Tensor], dict[str, torch.Tensor], dict[str, torch.Tensor], dict[str, torch.Tensor], dict[str, Any]]

Perform a step in the environment.

Parameters:

actions – The actions to perform.

Returns:

Observation, reward, terminated, truncated, info.
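
To make the dict-per-agent convention of reset() and step() concrete, the following is a minimal sketch of an interaction loop. ToyParallelEnv is a hypothetical stand-in written only for this illustration (it is not part of skrl), and plain floats take the place of tensors so that the example runs without any dependency.

```python
from typing import Any


class ToyParallelEnv:
    """Hypothetical two-agent environment using the dict-per-agent convention."""

    possible_agents = ["agent_0", "agent_1"]

    def __init__(self, max_steps: int = 3) -> None:
        self.max_steps = max_steps
        self.agents: list[str] = []
        self._t = 0

    def reset(self) -> tuple[dict[str, float], dict[str, Any]]:
        # every possible agent is active at the start of an episode
        self.agents = list(self.possible_agents)
        self._t = 0
        return {agent: 0.0 for agent in self.agents}, {}

    def step(self, actions: dict[str, float]):
        self._t += 1
        observations = {agent: float(self._t) for agent in self.agents}
        rewards = {agent: actions[agent] for agent in self.agents}
        terminated = {agent: False for agent in self.agents}
        # the episode is truncated for all agents after max_steps steps
        truncated = {agent: self._t >= self.max_steps for agent in self.agents}
        return observations, rewards, terminated, truncated, {}


env = ToyParallelEnv()
observations, infos = env.reset()
returns = {agent: 0.0 for agent in env.possible_agents}

done = False
while not done:
    # one action per currently active agent, keyed by agent name
    actions = {agent: 1.0 for agent in env.agents}
    observations, rewards, terminated, truncated, infos = env.step(actions)
    for agent, reward in rewards.items():
        returns[agent] += reward
    done = all(terminated[agent] or truncated[agent] for agent in env.agents)
```

Every value returned by step() is a dictionary keyed by agent name, so per-agent bookkeeping such as the episode return reduces to a loop over the dictionary items.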

property action_spaces: dict[str, gymnasium.Space]

Action spaces.

property agents: list[str]

Names of all current agents.

These may be changed as an environment progresses (i.e. agents can be added or removed).

property device: torch.device

The device used by the environment.

If the wrapped environment does not have the device property, the value of this property will be "cuda" or "cpu" depending on the device availability.

property max_num_agents: int

Number of possible agents the environment could generate.

Read from the length of the possible_agents property if the wrapped environment doesn’t define it.

property num_agents: int

Number of current agents.

Read from the length of the agents property if the wrapped environment doesn’t define it.

property num_envs: int

Number of environments.

If the wrapped environment does not have the num_envs property, it will be set to 1.
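
The fallbacks described for max_num_agents, num_agents and num_envs can be sketched as follows. FallbackSketch, BareEnv and VectorizedEnv are hypothetical classes written for this illustration; the sketch only mirrors the documented behavior and is not skrl's code.

```python
class FallbackSketch:
    """Hypothetical wrapper-like class showing the documented fallbacks."""

    def __init__(self, env) -> None:
        self._env = env

    @property
    def agents(self) -> list:
        return self._env.agents

    @property
    def possible_agents(self) -> list:
        return self._env.possible_agents

    @property
    def num_agents(self) -> int:
        # fall back to the length of `agents` if the environment lacks it
        return getattr(self._env, "num_agents", len(self.agents))

    @property
    def max_num_agents(self) -> int:
        # fall back to the length of `possible_agents` if the environment lacks it
        return getattr(self._env, "max_num_agents", len(self.possible_agents))

    @property
    def num_envs(self) -> int:
        # a non-vectorized environment counts as a single environment
        return getattr(self._env, "num_envs", 1)


class BareEnv:
    """Hypothetical environment that defines only the agent lists."""

    possible_agents = ["a", "b", "c"]
    agents = ["a", "b"]


class VectorizedEnv(BareEnv):
    """Hypothetical environment that also defines num_envs."""

    num_envs = 16


bare = FallbackSketch(BareEnv())
vectorized = FallbackSketch(VectorizedEnv())
```

Here bare reports its counts from the agent lists, while vectorized forwards the num_envs value defined by the environment itself.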

property observation_spaces: dict[str, gymnasium.Space]

Observation spaces.

property possible_agents: list[str]

Names of all possible agents the environment could generate.

These cannot be changed as an environment progresses.

property state_spaces: dict[str, gymnasium.Space | None]

State spaces.

Although this property returns a dictionary, the space for each agent follows these rules:

  • The wrapped environment has the state_space attribute (homogeneous state). The state is a global view of the environment, so the space is the same for all agents.

  • The wrapped environment has the state_spaces attribute (heterogeneous state). The state may differ for each agent, so the agent spaces may also differ.

  • The wrapped environment does not have the previous attributes. The space is None for all agents.
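
The three rules above can be sketched as a small resolution function. resolve_state_spaces and the three toy environment classes are hypothetical, strings stand in for gymnasium.Space objects, and checking state_space before state_spaces is an assumption based on the order in which the rules are listed.

```python
def resolve_state_spaces(env, possible_agents):
    """Hypothetical helper returning one state space (or None) per agent."""
    if hasattr(env, "state_space"):
        # homogeneous state: the same global space for every agent
        return {agent: env.state_space for agent in possible_agents}
    if hasattr(env, "state_spaces"):
        # heterogeneous state: each agent may have its own space
        return {agent: env.state_spaces[agent] for agent in possible_agents}
    # no state information available
    return {agent: None for agent in possible_agents}


class HomogeneousEnv:
    state_space = "Box(10,)"  # placeholder for a gymnasium.Space


class HeterogeneousEnv:
    state_spaces = {"a": "Box(4,)", "b": "Box(6,)"}  # placeholders


class StatelessEnv:
    pass


agent_names = ["a", "b"]
homogeneous = resolve_state_spaces(HomogeneousEnv(), agent_names)
heterogeneous = resolve_state_spaces(HeterogeneousEnv(), agent_names)
stateless = resolve_state_spaces(StatelessEnv(), agent_names)
```

In all three cases the result is keyed by agent name, which is why the property can always return a dictionary regardless of how the wrapped environment exposes its state.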

class skrl.envs.wrappers.torch.isaaclab_envs.IsaacLabMultiAgentWrapper(env: Any)

Bases: MultiAgentEnvWrapper

Isaac Lab environment wrapper for multi-agent implementation.

Parameters:

env – The environment instance to wrap.

Methods:

action_space(agent)

Action space.

close()

Close the environment.

observation_space(agent)

Observation space.

render(*args, **kwargs)

Render the environment.

reset()

Reset the environment.

state()

Get the environment state.

state_space(agent)

State space.

step(actions)

Perform a step in the environment.

Attributes:

action_spaces

Action spaces.

agents

Names of all current agents.

device

The device used by the environment.

max_num_agents

Number of possible agents the environment could generate.

num_agents

Number of current agents.

num_envs

Number of environments.

observation_spaces

Observation spaces.

possible_agents

Names of all possible agents the environment could generate.

state_spaces

State spaces.

action_space(agent: str) → gymnasium.Space

Action space.

Parameters:

agent – Name of the agent.

Returns:

The action space for the specified agent.

close() → None

Close the environment.

observation_space(agent: str) → gymnasium.Space

Observation space.

Parameters:

agent – Name of the agent.

Returns:

The observation space for the specified agent.

render(*args, **kwargs) → None

Render the environment.

reset() → tuple[dict[str, torch.Tensor], dict[str, Any]]

Reset the environment.

Returns:

Observation, info.

state() → dict[str, torch.Tensor | None]

Get the environment state.

Returns:

State.

state_space(agent: str) → gymnasium.Space | None

State space.

See state_spaces for more details.

Parameters:

agent – Name of the agent.

Returns:

The state space for the specified agent.

step(actions: dict[str, torch.Tensor]) → tuple[dict[str, torch.Tensor], dict[str, torch.Tensor], dict[str, torch.Tensor], dict[str, torch.Tensor], dict[str, Any]]

Perform a step in the environment.

Parameters:

actions – The actions to perform.

Returns:

Observation, reward, terminated, truncated, info.

property action_spaces: dict[str, gymnasium.Space]

Action spaces.

property agents: list[str]

Names of all current agents.

These may be changed as an environment progresses (i.e. agents can be added or removed).

property device: torch.device

The device used by the environment.

If the wrapped environment does not have the device property, the value of this property will be "cuda" or "cpu" depending on the device availability.

property max_num_agents: int

Number of possible agents the environment could generate.

Read from the length of the possible_agents property if the wrapped environment doesn’t define it.

property num_agents: int

Number of current agents.

Read from the length of the agents property if the wrapped environment doesn’t define it.

property num_envs: int

Number of environments.

If the wrapped environment does not have the num_envs property, it will be set to 1.

property observation_spaces: dict[str, gymnasium.Space]

Observation spaces.

property possible_agents: list[str]

Names of all possible agents the environment could generate.

These cannot be changed as an environment progresses.

property state_spaces: dict[str, gymnasium.Space | None]

State spaces.

Although this property returns a dictionary, the space for each agent follows these rules:

  • The wrapped environment has the state_space attribute (homogeneous state). The state is a global view of the environment, so the space is the same for all agents.

  • The wrapped environment has the state_spaces attribute (heterogeneous state). The state may differ for each agent, so the agent spaces may also differ.

  • The wrapped environment does not have the previous attributes. The space is None for all agents.

class skrl.envs.wrappers.torch.pettingzoo_envs.PettingZooWrapper(env: Any)

Bases: MultiAgentEnvWrapper

PettingZoo (Parallel API) environment wrapper.

Parameters:

env – The environment instance to wrap.

Methods:

action_space(agent)

Action space.

close()

Close the environment.

observation_space(agent)

Observation space.

render(*args, **kwargs)

Render the environment.

reset()

Reset the environment.

state()

Get the environment state.

state_space(agent)

State space.

step(actions)

Perform a step in the environment.

Attributes:

action_spaces

Action spaces.

agents

Names of all current agents.

device

The device used by the environment.

max_num_agents

Number of possible agents the environment could generate.

num_agents

Number of current agents.

num_envs

Number of environments.

observation_spaces

Observation spaces.

possible_agents

Names of all possible agents the environment could generate.

state_spaces

State spaces.

action_space(agent: str) → gymnasium.Space

Action space.

Parameters:

agent – Name of the agent.

Returns:

The action space for the specified agent.

close() → None

Close the environment.

observation_space(agent: str) → gymnasium.Space

Observation space.

Parameters:

agent – Name of the agent.

Returns:

The observation space for the specified agent.

render(*args, **kwargs) → Any

Render the environment.

reset() → tuple[dict[str, torch.Tensor], dict[str, Any]]

Reset the environment.

Returns:

Observation, info.

state() → dict[str, torch.Tensor | None]

Get the environment state.

In PettingZoo, the state is a global view of the environment, so it is the same for all agents.

Returns:

State.
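
Since the PettingZoo state is a single global view, a per-agent dictionary can be produced by mapping every agent name to the same object. The sketch below illustrates that idea only: broadcast_state is a hypothetical helper, a list of floats stands in for a tensor, and the way the actual wrapper builds its return value is not shown in this page.

```python
def broadcast_state(global_state, agents):
    """Hypothetical helper: map every agent name to the same global state."""
    return {agent: global_state for agent in agents}


# a plain list stands in for the tensor holding the global state
global_state = [0.1, 0.2, 0.3]
per_agent_state = broadcast_state(global_state, ["agent_0", "agent_1"])
```

No copy is made, so all agents reference the same underlying data.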

state_space(agent: str) → gymnasium.Space | None

State space.

See state_spaces for more details.

Parameters:

agent – Name of the agent.

Returns:

The state space for the specified agent.

step(actions: dict[str, torch.Tensor]) → tuple[dict[str, torch.Tensor], dict[str, torch.Tensor], dict[str, torch.Tensor], dict[str, torch.Tensor], dict[str, Any]]

Perform a step in the environment.

Parameters:

actions – The actions to perform.

Returns:

Observation, reward, terminated, truncated, info.

property action_spaces: dict[str, gymnasium.Space]

Action spaces.

property agents: list[str]

Names of all current agents.

These may be changed as an environment progresses (i.e. agents can be added or removed).

property device: torch.device

The device used by the environment.

If the wrapped environment does not have the device property, the value of this property will be "cuda" or "cpu" depending on the device availability.

property max_num_agents: int

Number of possible agents the environment could generate.

Read from the length of the possible_agents property if the wrapped environment doesn’t define it.

property num_agents: int

Number of current agents.

Read from the length of the agents property if the wrapped environment doesn’t define it.

property num_envs: int

Number of environments.

If the wrapped environment does not have the num_envs property, it will be set to 1.

property observation_spaces: dict[str, gymnasium.Space]

Observation spaces.

property possible_agents: list[str]

Names of all possible agents the environment could generate.

These cannot be changed as an environment progresses.

property state_spaces: dict[str, gymnasium.Space | None]

State spaces.

Although this property returns a dictionary, the space for each agent follows these rules:

  • The wrapped environment has the state_space attribute (homogeneous state). The state is a global view of the environment, so the space is the same for all agents.

  • The wrapped environment has the state_spaces attribute (heterogeneous state). The state may differ for each agent, so the agent spaces may also differ.

  • The wrapped environment does not have the previous attributes. The space is None for all agents.


JAX

MultiAgentEnvWrapper

Base wrapper class for multi-agent environments.

IsaacLabMultiAgentWrapper

Isaac Lab environment wrapper for multi-agent implementation.

PettingZooWrapper

PettingZoo (Parallel API) environment wrapper.

class skrl.envs.wrappers.jax.MultiAgentEnvWrapper(env: Any)

Bases: ABC

Base wrapper class for multi-agent environments.

Parameters:

env – The multi-agent environment instance to wrap.

Methods:

action_space(agent)

Action space.

close()

Close the environment.

observation_space(agent)

Observation space.

render(*args, **kwargs)

Render the environment.

reset()

Reset the environment.

state()

Get the environment state.

state_space(agent)

State space.

step(actions)

Perform a step in the environment.

Attributes:

action_spaces

Action spaces.

agents

Names of all current agents.

device

The device used by the environment.

max_num_agents

Number of possible agents the environment could generate.

num_agents

Number of current agents.

num_envs

Number of environments.

observation_spaces

Observation spaces.

possible_agents

Names of all possible agents the environment could generate.

state_spaces

State spaces.

action_space(agent: str) → gymnasium.Space

Action space.

Parameters:

agent – Name of the agent.

Returns:

The action space for the specified agent.

abstractmethod close() → None

Close the environment.

observation_space(agent: str) → gymnasium.Space

Observation space.

Parameters:

agent – Name of the agent.

Returns:

The observation space for the specified agent.

abstractmethod render(*args, **kwargs) → Any

Render the environment.

Returns:

Any value from the wrapped environment.

abstractmethod reset() → tuple[dict[str, jax.Array], dict[str, Any]]

Reset the environment.

Returns:

Observation, info.

abstractmethod state() → dict[str, jax.Array | None]

Get the environment state.

Returns:

State.

state_space(agent: str) → gymnasium.Space | None

State space.

See state_spaces for more details.

Parameters:

agent – Name of the agent.

Returns:

The state space for the specified agent.

abstractmethod step(actions: dict[str, jax.Array]) → tuple[dict[str, jax.Array], dict[str, jax.Array], dict[str, jax.Array], dict[str, jax.Array], dict[str, Any]]

Perform a step in the environment.

Parameters:

actions – The actions to perform.

Returns:

Observation, reward, terminated, truncated, info.

property action_spaces: dict[str, gymnasium.Space]

Action spaces.

property agents: list[str]

Names of all current agents.

These may be changed as an environment progresses (i.e. agents can be added or removed).

property device: jax.Device

The device used by the environment.

If the wrapped environment does not have the device property, the value of this property will be "cuda" or "cpu" depending on the device availability.

property max_num_agents: int

Number of possible agents the environment could generate.

Read from the length of the possible_agents property if the wrapped environment doesn’t define it.

property num_agents: int

Number of current agents.

Read from the length of the agents property if the wrapped environment doesn’t define it.

property num_envs: int

Number of environments.

If the wrapped environment does not have the num_envs property, it will be set to 1.

property observation_spaces: dict[str, gymnasium.Space]

Observation spaces.

property possible_agents: list[str]

Names of all possible agents the environment could generate.

These cannot be changed as an environment progresses.

property state_spaces: dict[str, gymnasium.Space | None]

State spaces.

Although this property returns a dictionary, the space for each agent follows these rules:

  • The wrapped environment has the state_space attribute (homogeneous state). The state is a global view of the environment, so the space is the same for all agents.

  • The wrapped environment has the state_spaces attribute (heterogeneous state). The state may differ for each agent, so the agent spaces may also differ.

  • The wrapped environment does not have the previous attributes. The space is None for all agents.

class skrl.envs.wrappers.jax.isaaclab_envs.IsaacLabMultiAgentWrapper(env: Any)

Bases: MultiAgentEnvWrapper

Isaac Lab environment wrapper for multi-agent implementation.

Parameters:

env – The environment instance to wrap.

Methods:

action_space(agent)

Action space.

close()

Close the environment.

observation_space(agent)

Observation space.

render(*args, **kwargs)

Render the environment.

reset()

Reset the environment.

state()

Get the environment state.

state_space(agent)

State space.

step(actions)

Perform a step in the environment.

Attributes:

action_spaces

Action spaces.

agents

Names of all current agents.

device

The device used by the environment.

max_num_agents

Number of possible agents the environment could generate.

num_agents

Number of current agents.

num_envs

Number of environments.

observation_spaces

Observation spaces.

possible_agents

Names of all possible agents the environment could generate.

state_spaces

State spaces.

action_space(agent: str) → gymnasium.Space

Action space.

Parameters:

agent – Name of the agent.

Returns:

The action space for the specified agent.

close() → None

Close the environment.

observation_space(agent: str) → gymnasium.Space

Observation space.

Parameters:

agent – Name of the agent.

Returns:

The observation space for the specified agent.

render(*args, **kwargs) → None

Render the environment.

reset() → tuple[dict[str, jax.Array], dict[str, Any]]

Reset the environment.

Returns:

Observation, info.

state() → dict[str, jax.Array | None]

Get the environment state.

Returns:

State.

state_space(agent: str) → gymnasium.Space | None

State space.

See state_spaces for more details.

Parameters:

agent – Name of the agent.

Returns:

The state space for the specified agent.

step(actions: dict[str, jax.Array]) → tuple[dict[str, jax.Array], dict[str, jax.Array], dict[str, jax.Array], dict[str, jax.Array], dict[str, Any]]

Perform a step in the environment.

Parameters:

actions – The actions to perform.

Returns:

Observation, reward, terminated, truncated, info.

property action_spaces: dict[str, gymnasium.Space]

Action spaces.

property agents: list[str]

Names of all current agents.

These may be changed as an environment progresses (i.e. agents can be added or removed).

property device: jax.Device

The device used by the environment.

If the wrapped environment does not have the device property, the value of this property will be "cuda" or "cpu" depending on the device availability.

property max_num_agents: int

Number of possible agents the environment could generate.

Read from the length of the possible_agents property if the wrapped environment doesn’t define it.

property num_agents: int

Number of current agents.

Read from the length of the agents property if the wrapped environment doesn’t define it.

property num_envs: int

Number of environments.

If the wrapped environment does not have the num_envs property, it will be set to 1.

property observation_spaces: dict[str, gymnasium.Space]

Observation spaces.

property possible_agents: list[str]

Names of all possible agents the environment could generate.

These cannot be changed as an environment progresses.

property state_spaces: dict[str, gymnasium.Space | None]

State spaces.

Although this property returns a dictionary, the space for each agent follows these rules:

  • The wrapped environment has the state_space attribute (homogeneous state). The state is a global view of the environment, so the space is the same for all agents.

  • The wrapped environment has the state_spaces attribute (heterogeneous state). The state may differ for each agent, so the agent spaces may also differ.

  • The wrapped environment does not have the previous attributes. The space is None for all agents.

class skrl.envs.wrappers.jax.pettingzoo_envs.PettingZooWrapper(env: Any)

Bases: MultiAgentEnvWrapper

PettingZoo (Parallel API) environment wrapper.

Parameters:

env – The environment instance to wrap.

Methods:

action_space(agent)

Action space.

close()

Close the environment.

observation_space(agent)

Observation space.

render(*args, **kwargs)

Render the environment.

reset()

Reset the environment.

state()

Get the environment state.

state_space(agent)

State space.

step(actions)

Perform a step in the environment.

Attributes:

action_spaces

Action spaces.

agents

Names of all current agents.

device

The device used by the environment.

max_num_agents

Number of possible agents the environment could generate.

num_agents

Number of current agents.

num_envs

Number of environments.

observation_spaces

Observation spaces.

possible_agents

Names of all possible agents the environment could generate.

state_spaces

State spaces.

action_space(agent: str) → gymnasium.Space

Action space.

Parameters:

agent – Name of the agent.

Returns:

The action space for the specified agent.

close() → None

Close the environment.

observation_space(agent: str) → gymnasium.Space

Observation space.

Parameters:

agent – Name of the agent.

Returns:

The observation space for the specified agent.

render(*args, **kwargs) → Any

Render the environment.

reset() → tuple[dict[str, jax.Array], dict[str, Any]]

Reset the environment.

Returns:

Observation, info.

state() → dict[str, jax.Array | None]

Get the environment state.

In PettingZoo, the state is a global view of the environment, so it is the same for all agents.

Returns:

State.

state_space(agent: str) → gymnasium.Space | None

State space.

See state_spaces for more details.

Parameters:

agent – Name of the agent.

Returns:

The state space for the specified agent.

step(actions: dict[str, jax.Array]) → tuple[dict[str, jax.Array], dict[str, jax.Array], dict[str, jax.Array], dict[str, jax.Array], dict[str, Any]]

Perform a step in the environment.

Parameters:

actions – The actions to perform.

Returns:

Observation, reward, terminated, truncated, info.

property action_spaces: dict[str, gymnasium.Space]

Action spaces.

property agents: list[str]

Names of all current agents.

These may be changed as an environment progresses (i.e. agents can be added or removed).

property device: jax.Device

The device used by the environment.

If the wrapped environment does not have the device property, the value of this property will be "cuda" or "cpu" depending on the device availability.

property max_num_agents: int

Number of possible agents the environment could generate.

Read from the length of the possible_agents property if the wrapped environment doesn’t define it.

property num_agents: int

Number of current agents.

Read from the length of the agents property if the wrapped environment doesn’t define it.

property num_envs: int

Number of environments.

If the wrapped environment does not have the num_envs property, it will be set to 1.

property observation_spaces: dict[str, gymnasium.Space]

Observation spaces.

property possible_agents: list[str]

Names of all possible agents the environment could generate.

These cannot be changed as an environment progresses.

property state_spaces: dict[str, gymnasium.Space | None]

State spaces.

Although this property returns a dictionary, the space for each agent follows these rules:

  • The wrapped environment has the state_space attribute (homogeneous state). The state is a global view of the environment, so the space is the same for all agents.

  • The wrapped environment has the state_spaces attribute (heterogeneous state). The state may differ for each agent, so the agent spaces may also differ.

  • The wrapped environment does not have the previous attributes. The space is None for all agents.