Wrapping (multi-agents)

This library provides a common API for interacting with the following multi-agent RL environments:

Farama PettingZoo (parallel API)
Bi-DexHands

Because these environments expose incompatible interfaces, a wrapping mechanism is provided to operate them through the common API, as shown in the diagram below.
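The wrapping idea can be sketched as an adapter: two toy environments with incompatible method names and return types are exposed through one `reset`/`step` interface. Every name below (`ToyDictEnv`, `ToyListEnv`, the adapters, `toy_wrap`) is hypothetical and only illustrates the pattern; it is not skrl's implementation.

```python
from typing import Dict, List


class ToyDictEnv:
    """Hypothetical environment that already speaks in per-agent dictionaries."""

    def reset(self) -> Dict[str, float]:
        return {"agent_0": 0.0, "agent_1": 0.0}

    def step(self, actions: Dict[str, float]) -> Dict[str, float]:
        return {name: value + 1.0 for name, value in actions.items()}


class ToyListEnv:
    """Hypothetical environment with different method names and list outputs."""

    names: List[str] = ["agent_0", "agent_1"]

    def restart(self) -> List[float]:
        return [0.0, 0.0]

    def advance(self, actions: List[float]) -> List[float]:
        return [value + 1.0 for value in actions]


class DictEnvAdapter:
    """Pass-through adapter: the interface already matches."""

    def __init__(self, env: ToyDictEnv) -> None:
        self._env = env

    def reset(self) -> Dict[str, float]:
        return self._env.reset()

    def step(self, actions: Dict[str, float]) -> Dict[str, float]:
        return self._env.step(actions)


class ListEnvAdapter:
    """Translate list-based calls into the common per-agent dictionary form."""

    def __init__(self, env: ToyListEnv) -> None:
        self._env = env

    def reset(self) -> Dict[str, float]:
        return dict(zip(self._env.names, self._env.restart()))

    def step(self, actions: Dict[str, float]) -> Dict[str, float]:
        # Order the actions the way the list-based environment expects them
        ordered = [actions[name] for name in self._env.names]
        return dict(zip(self._env.names, self._env.advance(ordered)))


def toy_wrap(env):
    """Pick an adapter from the environment class (illustrative only)."""
    if isinstance(env, ToyDictEnv):
        return DictEnvAdapter(env)
    if isinstance(env, ToyListEnv):
        return ListEnvAdapter(env)
    raise ValueError(f"Unknown environment type: {type(env).__name__}")
```

After `toy_wrap`, calling code no longer needs to know which of the two toy environments it is driving.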

Usage

# PettingZoo (PyTorch): import the environment wrapper
from skrl.envs.wrappers.torch import wrap_env

# import a PettingZoo environment
from pettingzoo.sisl import multiwalker_v9

# load the environment
env = multiwalker_v9.parallel_env()

# wrap the environment
env = wrap_env(env)  # or 'env = wrap_env(env, wrapper="pettingzoo")'

# PettingZoo (JAX): import the environment wrapper
from skrl.envs.wrappers.jax import wrap_env

# import a PettingZoo environment
from pettingzoo.sisl import multiwalker_v9

# load the environment
env = multiwalker_v9.parallel_env()

# wrap the environment
env = wrap_env(env)  # or 'env = wrap_env(env, wrapper="pettingzoo")'

# Bi-DexHands (PyTorch): import the environment wrapper and loader
from skrl.envs.wrappers.torch import wrap_env
from skrl.envs.loaders.torch import load_bidexhands_env

# load the environment
env = load_bidexhands_env(task_name="ShadowHandOver")

# wrap the environment
env = wrap_env(env, wrapper="bidexhands")

# Bi-DexHands (JAX): import the environment wrapper and loader
from skrl.envs.wrappers.jax import wrap_env
from skrl.envs.loaders.jax import load_bidexhands_env

# load the environment
env = load_bidexhands_env(task_name="ShadowHandOver")

# wrap the environment
env = wrap_env(env, wrapper="bidexhands")
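Once wrapped, an environment is driven through `reset` and `step`, which return dictionaries keyed by agent name (see the internal API below). The following is a minimal sketch of such a loop. `StubMultiAgentEnv` and `rollout` are hypothetical stand-ins that use plain Python floats and booleans so the example runs without PettingZoo or Bi-DexHands; with a real wrapped environment the values would be tensors or arrays.

```python
from typing import Any, Dict, List, Tuple


class StubMultiAgentEnv:
    """Hypothetical stand-in that mimics the wrapped-environment interface."""

    possible_agents: List[str] = ["walker_0", "walker_1"]

    def __init__(self, max_steps: int = 3) -> None:
        self._max_steps = max_steps
        self._t = 0

    def reset(self) -> Tuple[Dict[str, float], Dict[str, Any]]:
        self._t = 0
        observations = {name: 0.0 for name in self.possible_agents}
        infos = {name: {} for name in self.possible_agents}
        return observations, infos

    def step(self, actions: Dict[str, float]):
        self._t += 1
        observations = {name: float(self._t) for name in actions}
        rewards = {name: 1.0 for name in actions}
        terminated = {name: False for name in actions}
        # Truncate every agent once the step budget is used up
        truncated = {name: self._t >= self._max_steps for name in actions}
        infos = {name: {} for name in actions}
        return observations, rewards, terminated, truncated, infos

    def close(self) -> None:
        pass


def rollout(env) -> Dict[str, float]:
    """Run one episode and accumulate the reward of each agent."""
    returns = {name: 0.0 for name in env.possible_agents}
    observations, infos = env.reset()
    done = False
    while not done:
        # Placeholder policy: a constant action for every observed agent
        actions = {name: 0.0 for name in observations}
        observations, rewards, terminated, truncated, infos = env.step(actions)
        for name, reward in rewards.items():
            returns[name] += reward
        done = all(terminated[name] or truncated[name] for name in rewards)
    env.close()
    return returns
```

The loop ends when every agent is either terminated or truncated; other stopping rules (for example, stopping when any agent finishes) are equally plausible and depend on the task.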

API (PyTorch)

skrl.envs.wrappers.torch.wrap_env(env: Any, wrapper: str = 'auto', verbose: bool = True) → Wrapper | MultiAgentEnvWrapper¶

Wrap an environment to use a common interface

Example:

>>> from skrl.envs.wrappers.torch import wrap_env
>>>
>>> # assuming that there is an environment called "env"
>>> env = wrap_env(env)

Parameters:

env (gym.Env, gymnasium.Env, dm_env.Environment or VecTask) – The environment to be wrapped

wrapper (str, optional) –

The type of wrapper to use (default: "auto"). If "auto", the wrapper will be automatically selected based on the environment class. The supported wrappers are described in the following table:

Environment	Wrapper tag
OpenAI Gym	`"gym"`
Gymnasium	`"gymnasium"`
PettingZoo	`"pettingzoo"`
DeepMind	`"dm"`
Robosuite	`"robosuite"`
Bi-DexHands	`"bidexhands"`
Isaac Gym preview 2	`"isaacgym-preview2"`
Isaac Gym preview 3	`"isaacgym-preview3"`
Isaac Gym preview 4	`"isaacgym-preview4"`
Omniverse Isaac Gym	`"omniverse-isaacgym"`
Isaac Sim (orbit)	`"isaac-orbit"`

verbose (bool, optional) – Whether to print the wrapper type (default: True)

Raises:

ValueError – Unknown wrapper type

Returns:

Wrapped environment

Return type:

Wrapper or MultiAgentEnvWrapper
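The role of the `wrapper` argument and of the `ValueError` can be illustrated with a small tag check. `resolve_wrapper_tag` is a hypothetical helper written for this example and is not part of skrl; only the tag strings come from the table above. The `"auto"` case is passed through unchanged because the class-based detection is not described on this page.

```python
# Tags transcribed from the PyTorch table above
SUPPORTED_TAGS = (
    "gym",
    "gymnasium",
    "pettingzoo",
    "dm",
    "robosuite",
    "bidexhands",
    "isaacgym-preview2",
    "isaacgym-preview3",
    "isaacgym-preview4",
    "omniverse-isaacgym",
    "isaac-orbit",
)


def resolve_wrapper_tag(wrapper: str = "auto") -> str:
    """Validate a wrapper tag (hypothetical helper, not skrl's code)."""
    if wrapper == "auto" or wrapper in SUPPORTED_TAGS:
        return wrapper
    raise ValueError(f"Unknown wrapper type: {wrapper!r}")
```

Passing an explicit tag such as `"bidexhands"` skips any guessing, which is why the Bi-DexHands usage example above names the wrapper directly.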

API (JAX)

skrl.envs.wrappers.jax.wrap_env(env: Any, wrapper: str = 'auto', verbose: bool = True) → Wrapper | MultiAgentEnvWrapper¶

Wrap an environment to use a common interface

Example:

>>> from skrl.envs.wrappers.jax import wrap_env
>>>
>>> # assuming that there is an environment called "env"
>>> env = wrap_env(env)

Parameters:

env (gym.Env, gymnasium.Env, dm_env.Environment or VecTask) – The environment to be wrapped

wrapper (str, optional) –

The type of wrapper to use (default: "auto"). If "auto", the wrapper will be automatically selected based on the environment class. The supported wrappers are described in the following table:

Environment	Wrapper tag
OpenAI Gym	`"gym"`
Gymnasium	`"gymnasium"`
PettingZoo	`"pettingzoo"`
Bi-DexHands	`"bidexhands"`
Isaac Gym preview 2	`"isaacgym-preview2"`
Isaac Gym preview 3	`"isaacgym-preview3"`
Isaac Gym preview 4	`"isaacgym-preview4"`
Omniverse Isaac Gym	`"omniverse-isaacgym"`
Isaac Sim (orbit)	`"isaac-orbit"`

verbose (bool, optional) – Whether to print the wrapper type (default: True)

Raises:

ValueError – Unknown wrapper type

Returns:

Wrapped environment

Return type:

Wrapper or MultiAgentEnvWrapper
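The two tag tables on this page differ slightly. A quick set comparison, with both sets transcribed from the tables above (and therefore only as accurate as this page), makes the difference explicit:

```python
# Wrapper tags listed in the PyTorch table
TORCH_TAGS = {
    "gym", "gymnasium", "pettingzoo", "dm", "robosuite", "bidexhands",
    "isaacgym-preview2", "isaacgym-preview3", "isaacgym-preview4",
    "omniverse-isaacgym", "isaac-orbit",
}

# Wrapper tags listed in the JAX table
JAX_TAGS = {
    "gym", "gymnasium", "pettingzoo", "bidexhands",
    "isaacgym-preview2", "isaacgym-preview3", "isaacgym-preview4",
    "omniverse-isaacgym", "isaac-orbit",
}

# Tags that appear only in the PyTorch table
torch_only = sorted(TORCH_TAGS - JAX_TAGS)
print(torch_only)
```

Both multi-agent tags used in the usage section, `"pettingzoo"` and `"bidexhands"`, appear in both tables.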

Internal API (PyTorch)

class skrl.envs.wrappers.torch.MultiAgentEnvWrapper(env: Any)¶

Bases: object

__init__(env: Any) → None¶

Base wrapper class for multi-agent environments

Parameters:: env (Any supported multi-agent environment) – The multi-agent environment to wrap

property device¶

The device used by the environment

If the wrapped environment does not have the device property, the value of this property will be "cuda:0" or "cpu" depending on the device availability

property possible_agents: A list of all possible agents the environment could generate

action_space(agent: str) → gym.Space¶

Action space

Parameters:: agent (str) – Name of the agent
Returns:: The action space for the specified agent
Return type:: gym.Space

property action_spaces: Mapping[str, gym.Space]¶: Action spaces

property agents: Sequence[str]¶

Names of all current agents

These may change as the environment progresses (i.e. agents can be added or removed)
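Since the set of current agents can change while `possible_agents` stays fixed, calling code may want to send actions only for the agents that are currently present. A minimal sketch of that filtering follows; `select_actions` and the example values are hypothetical and not part of the wrapper API.

```python
from typing import Dict, Sequence


def select_actions(agents: Sequence[str], proposals: Dict[str, float]) -> Dict[str, float]:
    """Keep only the proposed actions whose agent is currently present."""
    return {name: proposals[name] for name in agents if name in proposals}


# A policy might propose an action for every possible agent...
possible_agents = ["agent_0", "agent_1", "agent_2"]
proposals = {name: 0.5 for name in possible_agents}

# ...while only a subset of agents is active at this step
current_agents = ["agent_0", "agent_2"]
actions = select_actions(current_agents, proposals)
```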

close() → None¶: Close the environment

property num_agents: int¶

Number of agents

If the wrapped environment does not have the num_agents property, it will be set to 1

property num_envs: int¶

Number of environments

If the wrapped environment does not have the num_envs property, it will be set to 1
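The fallback described for `num_agents` and `num_envs` can be sketched with `getattr` and a default value. The classes below are hypothetical illustrations of that behavior, not the library's source.

```python
class BareEnv:
    """Hypothetical environment that defines neither num_envs nor num_agents."""


class VectorizedEnv:
    """Hypothetical environment that reports its own counts."""

    num_envs = 4
    num_agents = 2


class FallbackSketch:
    """Expose num_envs/num_agents, defaulting to 1 when the env lacks them."""

    def __init__(self, env) -> None:
        self._env = env

    @property
    def num_envs(self) -> int:
        return getattr(self._env, "num_envs", 1)

    @property
    def num_agents(self) -> int:
        return getattr(self._env, "num_agents", 1)
```

With this pattern, code that sizes buffers from `num_envs` works unchanged for both single and vectorized environments.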

observation_space(agent: str) → gym.Space¶

Observation space

Parameters:: agent (str) – Name of the agent
Returns:: The observation space for the specified agent
Return type:: gym.Space

property observation_spaces: Mapping[str, gym.Space]¶: Observation spaces

render(*args, **kwargs) → None¶: Render the environment

reset() → Tuple[Mapping[str, torch.Tensor], Mapping[str, Any]]¶

Reset the environment

Raises:: NotImplementedError – Not implemented
Returns:: Observation, info
Return type:: tuple of dictionaries of torch.Tensor and any other info

shared_observation_space(agent: str) → gym.Space¶

Shared observation space

Parameters:: agent (str) – Name of the agent
Returns:: The shared observation space for the specified agent
Return type:: gym.Space

property shared_observation_spaces: Mapping[str, gym.Space]¶: Shared observation spaces

shared_state_space(agent: str) → gym.Space¶

Shared state space

Parameters:: agent (str) – Name of the agent
Returns:: The shared state space for the specified agent
Return type:: gym.Space

property shared_state_spaces: Mapping[str, gym.Space]¶

Shared state spaces

An alias for the shared_observation_spaces property

state_space(agent: str) → gym.Space¶

State space

Parameters:: agent (str) – Name of the agent
Returns:: The state space for the specified agent
Return type:: gym.Space

property state_spaces: Mapping[str, gym.Space]¶

State spaces

An alias for the observation_spaces property
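The alias relationships between the `state_*` and `observation_*` members can be sketched as properties that simply forward to one another. In this hypothetical `SpacesSketch`, strings stand in for `gym.Space` objects, and the per-agent methods are assumed to index the corresponding mapping by agent name.

```python
from typing import Dict


class SpacesSketch:
    """Hypothetical illustration of the alias properties; not skrl's code."""

    def __init__(self, observation_spaces: Dict[str, str],
                 shared_observation_spaces: Dict[str, str]) -> None:
        self._observation_spaces = observation_spaces
        self._shared_observation_spaces = shared_observation_spaces

    @property
    def observation_spaces(self) -> Dict[str, str]:
        return self._observation_spaces

    @property
    def shared_observation_spaces(self) -> Dict[str, str]:
        return self._shared_observation_spaces

    @property
    def state_spaces(self) -> Dict[str, str]:
        # Alias for observation_spaces
        return self.observation_spaces

    @property
    def shared_state_spaces(self) -> Dict[str, str]:
        # Alias for shared_observation_spaces
        return self.shared_observation_spaces

    def observation_space(self, agent: str) -> str:
        return self.observation_spaces[agent]

    def state_space(self, agent: str) -> str:
        return self.state_spaces[agent]

    def shared_state_space(self, agent: str) -> str:
        return self.shared_state_spaces[agent]


spaces = SpacesSketch(
    observation_spaces={"agent_0": "Box(4,)", "agent_1": "Box(6,)"},
    shared_observation_spaces={"agent_0": "Box(10,)", "agent_1": "Box(10,)"},
)
```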

step(actions: Mapping[str, torch.Tensor]) → Tuple[Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, Any]]¶

Perform a step in the environment

Parameters:: actions (dictionary of torch.Tensor) – The actions to perform
Raises:: NotImplementedError – Not implemented
Returns:: Observation, reward, terminated, truncated, info
Return type:: tuple of dictionaries of torch.Tensor and any other info
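Because `reset` and `step` raise `NotImplementedError` in the base class, concrete wrappers are expected to override them. A minimal sketch of that pattern, using hypothetical class names and plain Python values in place of tensors:

```python
from typing import Any, Dict, Tuple


class BaseSketch:
    """Hypothetical base class: declares the interface, implements nothing."""

    def reset(self) -> Tuple[Dict[str, float], Dict[str, Any]]:
        raise NotImplementedError

    def step(self, actions: Dict[str, float]):
        raise NotImplementedError


class ConcreteSketch(BaseSketch):
    """Hypothetical subclass that supplies both methods."""

    def reset(self) -> Tuple[Dict[str, float], Dict[str, Any]]:
        return {"agent_0": 0.0}, {}

    def step(self, actions: Dict[str, float]):
        observations = {name: value for name, value in actions.items()}
        rewards = {name: 0.0 for name in actions}
        terminated = {name: True for name in actions}
        truncated = {name: False for name in actions}
        return observations, rewards, terminated, truncated, {}
```

The five-element return of `step` mirrors the order documented above: observation, reward, terminated, truncated, info.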

class skrl.envs.wrappers.torch.BiDexHandsWrapper(env: Any)¶

Bases: MultiAgentEnvWrapper

__init__(env: Any) → None¶

Bi-DexHands wrapper

Parameters:: env (Any supported Bi-DexHands environment) – The environment to wrap

property action_spaces: Mapping[str, gym.Space]¶: Action spaces

property agents: Sequence[str]¶

Names of all current agents

These may change as the environment progresses (i.e. agents can be added or removed)

property observation_spaces: Mapping[str, gym.Space]¶: Observation spaces

reset() → Tuple[Mapping[str, torch.Tensor], Mapping[str, Any]]¶

Reset the environment

Returns:: Observation, info
Return type:: tuple of dictionaries of torch.Tensor and any other info

property shared_observation_spaces: Mapping[str, gym.Space]¶: Shared observation spaces

step(actions: Mapping[str, torch.Tensor]) → Tuple[Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, Any]]¶

Perform a step in the environment

Parameters:: actions (dictionary of torch.Tensor) – The actions to perform
Returns:: Observation, reward, terminated, truncated, info
Return type:: tuple of dictionaries of torch.Tensor and any other info

class skrl.envs.wrappers.torch.PettingZooWrapper(env: Any)¶

Bases: MultiAgentEnvWrapper

__init__(env: Any) → None¶

PettingZoo (parallel) environment wrapper

Parameters:: env (Any supported PettingZoo (parallel) environment) – The environment to wrap

property action_spaces: Mapping[str, gymnasium.Space]¶: Action spaces

property agents: Sequence[str]¶

Names of all current agents

These may change as the environment progresses (i.e. agents can be added or removed)

close() → None¶: Close the environment

property num_agents: int¶: Number of agents

property observation_spaces: Mapping[str, gymnasium.Space]¶: Observation spaces

render(*args, **kwargs) → None¶: Render the environment

reset() → Tuple[Mapping[str, torch.Tensor], Mapping[str, Any]]¶

Reset the environment

Returns:: Observation, info
Return type:: tuple of dictionaries of torch.Tensor and any other info

property shared_observation_spaces: Mapping[str, gymnasium.Space]¶: Shared observation spaces

step(actions: Mapping[str, torch.Tensor]) → Tuple[Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, torch.Tensor], Mapping[str, Any]]¶

Perform a step in the environment

Parameters:: actions (dictionary of torch.Tensor) – The actions to perform
Returns:: Observation, reward, terminated, truncated, info
Return type:: tuple of dictionaries of torch.Tensor and any other info

Internal API (JAX)

class skrl.envs.wrappers.jax.MultiAgentEnvWrapper(env: Any)¶

Bases: object

__init__(env: Any) → None¶

Base wrapper class for multi-agent environments

Parameters:: env (Any supported multi-agent environment) – The multi-agent environment to wrap

property device¶

The device used by the environment

If the wrapped environment does not have the device property, the value of this property will be "cuda:0" or "cpu" depending on the device availability

property possible_agents: A list of all possible agents the environment could generate

action_space(agent: str) → gym.Space¶

Action space

Parameters:: agent (str) – Name of the agent
Returns:: The action space for the specified agent
Return type:: gym.Space

property action_spaces: Mapping[str, gym.Space]¶: Action spaces

property agents: Sequence[str]¶

Names of all current agents

These may change as the environment progresses (i.e. agents can be added or removed)

close() → None¶: Close the environment

property num_agents: int¶

Number of agents

If the wrapped environment does not have the num_agents property, it will be set to 1

property num_envs: int¶

Number of environments

If the wrapped environment does not have the num_envs property, it will be set to 1

observation_space(agent: str) → gym.Space¶

Observation space

Parameters:: agent (str) – Name of the agent
Returns:: The observation space for the specified agent
Return type:: gym.Space

property observation_spaces: Mapping[str, gym.Space]¶: Observation spaces

render(*args, **kwargs) → None¶: Render the environment

reset() → Tuple[Mapping[str, ndarray | jax.Array], Mapping[str, Any]]¶

Reset the environment

Raises:: NotImplementedError – Not implemented
Returns:: Observation, info
Return type:: tuple of dict of np.ndarray or jax.Array and any other info

shared_observation_space(agent: str) → gym.Space¶

Shared observation space

Parameters:: agent (str) – Name of the agent
Returns:: The shared observation space for the specified agent
Return type:: gym.Space

property shared_observation_spaces: Mapping[str, gym.Space]¶: Shared observation spaces

shared_state_space(agent: str) → gym.Space¶

Shared state space

Parameters:: agent (str) – Name of the agent
Returns:: The shared state space for the specified agent
Return type:: gym.Space

property shared_state_spaces: Mapping[str, gym.Space]¶

Shared state spaces

An alias for the shared_observation_spaces property

state_space(agent: str) → gym.Space¶

State space

Parameters:: agent (str) – Name of the agent
Returns:: The state space for the specified agent
Return type:: gym.Space

property state_spaces: Mapping[str, gym.Space]¶

State spaces

An alias for the observation_spaces property

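step(actions: Mapping[str, ndarray | jax.Array]) → Tuple[Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, Any]]

Perform a step in the environment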

Parameters:: actions (dict of np.ndarray or jax.Array) – The actions to perform
Raises:: NotImplementedError – Not implemented
Returns:: Observation, reward, terminated, truncated, info
Return type:: tuple of dict of np.ndarray or jax.Array and any other info

class skrl.envs.wrappers.jax.BiDexHandsWrapper(env: Any)¶

Bases: MultiAgentEnvWrapper

__init__(env: Any) → None¶

Bi-DexHands wrapper

Parameters:: env (Any supported Bi-DexHands environment) – The environment to wrap

property action_spaces: Mapping[str, gym.Space]¶: Action spaces

property agents: Sequence[str]¶

Names of all current agents

These may change as the environment progresses (i.e. agents can be added or removed)

property observation_spaces: Mapping[str, gym.Space]¶: Observation spaces

reset() → Tuple[Mapping[str, ndarray | jax.Array], Mapping[str, Any]]¶

Reset the environment

Returns:: Observation, info
Return type:: tuple of dict of np.ndarray or jax.Array and any other info

property shared_observation_spaces: Mapping[str, gym.Space]¶: Shared observation spaces

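step(actions: Mapping[str, ndarray | jax.Array]) → Tuple[Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, Any]]

Perform a step in the environment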

Parameters:: actions (dict of np.ndarray or jax.Array) – The actions to perform
Returns:: Observation, reward, terminated, truncated, info
Return type:: tuple of dict of np.ndarray or jax.Array and any other info

class skrl.envs.wrappers.jax.PettingZooWrapper(env: Any)¶

Bases: MultiAgentEnvWrapper

__init__(env: Any) → None¶

PettingZoo (parallel) environment wrapper

Parameters:: env (Any supported PettingZoo (parallel) environment) – The environment to wrap

property action_spaces: Mapping[str, gymnasium.Space]¶: Action spaces

property agents: Sequence[str]¶

Names of all current agents

These may change as the environment progresses (i.e. agents can be added or removed)

close() → None¶: Close the environment

property num_agents: int¶: Number of agents

property observation_spaces: Mapping[str, gymnasium.Space]¶: Observation spaces

render(*args, **kwargs) → None¶: Render the environment

reset() → Tuple[Mapping[str, ndarray | jax.Array], Mapping[str, Any]]¶

Reset the environment

Returns:: Observation, info
Return type:: tuple of dict of np.ndarray or jax.Array and any other info

property shared_observation_spaces: Mapping[str, gymnasium.Space]¶: Shared observation spaces

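step(actions: Mapping[str, ndarray | jax.Array]) → Tuple[Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, ndarray | jax.Array], Mapping[str, Any]]

Perform a step in the environment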

Parameters:: actions (dict of np.ndarray or jax.Array) – The actions to perform
Returns:: Observation, reward, terminated, truncated, info
Return type:: tuple of dict of np.ndarray or jax.Array and any other info