Memories¶

Memories are storage components that allow agents to collect and reuse current or past experiences from their interaction with the environment, as well as other types of information.

Implemented memories¶

The following table lists the implemented memories and their support for different frameworks.

Memories	PyTorch	JAX	Warp
Random memory	\(\blacksquare\)	\(\blacksquare\)	\(\blacksquare\)

Base class¶

Base class for memories.

API¶

PyTorch¶

Memory

Base class that represents a memory with circular buffers.

class skrl.memories.torch.Memory(*, memory_size: int, num_envs: int = 1, device: str | torch.device | None = None, export: bool = False, export_format: Literal['pt', 'npz', 'csv'] = 'pt', export_directory: str = '')[source]¶

Bases: ABC

Base class that represents a memory with circular buffers.

Buffers are tensors with shape (memory_size, num_envs, data_size). The circular buffer is implemented with two integers: a memory index (memory_index, dimension 0) and an environment index (env_index, dimension 1).

Parameters:

memory_size – Maximum number of elements in the first dimension for each tensor.
num_envs – Number of parallel environments.
device – Data allocation and computation device. If not specified, the default device will be used.
export – Export the memory to a file. If True, the memory will be exported once it is filled and before the circular buffer starts to overwrite the oldest data.
export_format – File format to export the memory. Supported formats: PyTorch ("pt"), NumPy ("npz") or comma separated values ("csv").
export_directory – Directory where the memory files will be exported. If not specified, the agent’s experiment directory will be used.

Raises:

ValueError – Unsupported export format.

__len__() → int[source]¶

Compute and return the current (valid) size of the memory.

The valid size is computed as:

memory_size * num_envs if the memory is full (filled)
memory_index * num_envs + env_index otherwise

Returns: Valid size.
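
The two cases above can be written as a small function. This is a minimal, framework-independent sketch in plain Python; `valid_size` and its arguments are hypothetical names that mirror the internal indexes and are not part of the library API.

```python
def valid_size(memory_size: int, num_envs: int, memory_index: int, env_index: int, filled: bool) -> int:
    """Return the number of valid samples, following the two documented cases."""
    if filled:
        # every slot of every environment holds valid data
        return memory_size * num_envs
    # complete rows written so far, plus the partially written row
    return memory_index * num_envs + env_index


# memory with 8 rows and 4 environments
print(valid_size(8, 4, memory_index=3, env_index=0, filled=False))  # 12
print(valid_size(8, 4, memory_index=3, env_index=2, filled=False))  # 14
print(valid_size(8, 4, memory_index=3, env_index=0, filled=True))   # 32
```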

Methods:

`add_samples`(**tensors)	Add/store samples in memory.
`create_tensor`(name, *, size[, dtype, ...])	Create a new internal tensor in memory.
`get_sampling_indexes`()	Get the last indexes used for sampling.
`get_tensor_by_name`(name)	Get a tensor by its name.
`get_tensor_names`()	Get the name of the internal tensors, sorted alphabetically.
`load`(path)	Load the memory from a file.
`reset`()	Reset the memory by clearing internal indexes and flags.
`sample`(names, *, batch_size[, mini_batches, ...])	Data sampling method to be implemented by the inheriting classes.
`sample_all`(names, *[, mini_batches, ...])	Sample all data from memory.
`sample_by_index`(names, *, indexes[, ...])	Sample data from memory according to their indexes.
`save`([directory, format])	Save the memory to a file.
`set_tensor_by_name`(name, tensor)	Set a tensor by its name.
`share_memory`()	Set the tensors to be shared between processes.

add_samples(**tensors: dict[str, torch.Tensor]) → None[source]¶

Add/store samples in memory.

Important

All tensors must have the same dimensions (2 dimensions) and shape: (current_num_envs, data_size). If the tensors have one dimension, it is assumed that current_num_envs is 1.

No check is performed for compatibility of the shapes or for memory write overflow.

According to the number of environments, the following behavior is performed:

current_num_envs = num_envs: store samples and increment the memory index (1st index) by one.
current_num_envs < num_envs: store samples and increment the environment index (2nd index) by the current number of environments.
current_num_envs > num_envs and num_envs = 1: store multiple samples and increment the memory index (1st index) by the number of samples. If the number of samples is greater than the remaining memory size, the memory will be filled and the circular buffer will overwrite the oldest data with the remaining samples.

Parameters: tensors – Sample data, as key-value arguments (keys: tensor names). Non-existing tensors will be skipped.
Raises: ValueError – No tensors were provided or the tensors have incompatible shapes.
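
The index bookkeeping for the three cases can be sketched as follows. This is an illustrative model in plain Python, not the library implementation: `Indexes` and `advance` are hypothetical names, no data is stored, and two details are assumptions not stated above (the environment index rolls over into the memory index once a row is complete, and the wrap-around is modeled with a modulo, which presumes that a single call never adds more samples than memory_size).

```python
class Indexes:
    """Hypothetical stand-in for the internal indexes and flag of a memory."""

    def __init__(self, memory_size: int, num_envs: int) -> None:
        self.memory_size = memory_size
        self.num_envs = num_envs
        self.memory_index = 0  # 1st index (dimension 0)
        self.env_index = 0     # 2nd index (dimension 1)
        self.filled = False


def advance(idx: Indexes, current_num_envs: int) -> None:
    """Update the indexes as if a batch with ``current_num_envs`` rows had been stored."""
    if current_num_envs == idx.num_envs:
        # one sample per environment: move to the next memory row
        idx.memory_index += 1
    elif current_num_envs < idx.num_envs:
        # partial batch: advance within the current row
        idx.env_index += current_num_envs
        if idx.env_index >= idx.num_envs:  # assumption: a completed row rolls over
            idx.env_index = 0
            idx.memory_index += 1
    elif idx.num_envs == 1:
        # several samples for a single environment
        idx.memory_index += current_num_envs
    else:
        raise ValueError("Unsupported combination of current_num_envs and num_envs")
    if idx.memory_index >= idx.memory_size:
        # wrap around: from now on the oldest data is overwritten
        idx.memory_index %= idx.memory_size
        idx.filled = True


idx = Indexes(memory_size=3, num_envs=4)
advance(idx, 4)  # full batch
advance(idx, 2)  # partial batch
print(idx.memory_index, idx.env_index, idx.filled)  # 1 2 False
```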

create_tensor(name: str, *, size: int | list[int] | gymnasium.Space | None, dtype: torch.dtype | None = None, keep_dimensions: bool = False) → bool[source]¶

Create a new internal tensor in memory.

The tensor will be 3-dimensional, with shape (memory_size, num_envs, data_size). The internal representation will use _tensor_<name> as the name of the class property.

Parameters:

name – Tensor name (the name must follow the Python PEP 8 style).
size – Number of elements in the last dimension (effective data size). If a space is provided, the size will be computed as the number of elements occupied by the space.
dtype – Data type. If not specified, the global default data type for PyTorch will be used.
keep_dimensions – Whether to create a tensor with the original data dimensions. If enabled, only sequences of integers are supported as data size.

Returns:

True if the tensor was created, otherwise False.

Raises:

ValueError – A tensor with the same name exists already but its size and/or dtype is different.

get_sampling_indexes() → list | np.ndarray | torch.Tensor[source]¶

Get the last indexes used for sampling.

Returns: Last sampling indexes.

get_tensor_by_name(name: str) → torch.Tensor[source]¶

Get a tensor by its name.

Parameters: name – Name of the tensor to get.
Returns: Tensor.
Raises: KeyError – The tensor does not exist.

get_tensor_names() → list[str][source]¶

Get the name of the internal tensors, sorted alphabetically.

Returns: Tensor names without the internal prefix (_tensor_).
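
The naming convention shared by create_tensor, get_tensor_by_name and get_tensor_names can be illustrated with a toy container. This is a sketch under stated assumptions: `TensorRegistry` is a hypothetical class, nested lists of zeros stand in for framework tensors, and only an integer size is handled (no spaces, dtype or keep_dimensions).

```python
class TensorRegistry:
    """Toy container that mimics the ``_tensor_<name>`` convention (not the library class)."""

    PREFIX = "_tensor_"

    def __init__(self, memory_size: int, num_envs: int) -> None:
        self.memory_size = memory_size
        self.num_envs = num_envs

    def create_tensor(self, name: str, *, size: int) -> bool:
        attribute = self.PREFIX + name
        if hasattr(self, attribute):
            # same name: only acceptable if the data size matches
            if len(getattr(self, attribute)[0][0]) != size:
                raise ValueError(f"Tensor '{name}' already exists with a different size")
            return False
        # nested lists of zeros stand in for a (memory_size, num_envs, size) tensor
        buffer = [[[0.0] * size for _ in range(self.num_envs)] for _ in range(self.memory_size)]
        setattr(self, attribute, buffer)
        return True

    def get_tensor_by_name(self, name: str) -> list:
        try:
            return getattr(self, self.PREFIX + name)
        except AttributeError:
            raise KeyError(f"Tensor '{name}' does not exist") from None

    def get_tensor_names(self) -> list:
        # strip the internal prefix and sort alphabetically
        names = [key[len(self.PREFIX):] for key in vars(self) if key.startswith(self.PREFIX)]
        return sorted(names)


registry = TensorRegistry(memory_size=4, num_envs=2)
registry.create_tensor("states", size=3)
registry.create_tensor("actions", size=1)
print(registry.get_tensor_names())  # ['actions', 'states']
```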

load(path: str) → None[source]¶

Load the memory from a file.

Supported formats: PyTorch ("pt"), NumPy ("npz") or comma separated values ("csv").

Parameters: path – Path to the file from which the memory will be loaded.
Raises: ValueError – Unsupported format.

reset() → None[source]¶

Reset the memory by clearing internal indexes and flags.

Note

Old data will be retained until overwritten, but access through the available methods will not be guaranteed.

Default values of the internal indexes and flags after the reset:

filled: False
env_index: 0
memory_index: 0
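
The effect described in the note (data retained, but no longer counted as valid) can be seen in a minimal model. `TinyMemory` is a hypothetical single-environment class written in plain Python for illustration only; it is not the library implementation.

```python
class TinyMemory:
    """Hypothetical single-environment memory used only to illustrate ``reset``."""

    def __init__(self, memory_size: int) -> None:
        self.memory_size = memory_size
        self.data = [None] * memory_size
        self.reset()

    def reset(self) -> None:
        # only the indexes and the flag are cleared; the stored data is left untouched
        self.filled = False
        self.env_index = 0
        self.memory_index = 0

    def add(self, value) -> None:
        self.data[self.memory_index] = value
        self.memory_index += 1
        if self.memory_index >= self.memory_size:
            self.memory_index = 0
            self.filled = True

    def __len__(self) -> int:
        return self.memory_size if self.filled else self.memory_index


memory = TinyMemory(memory_size=4)
for value in (10, 20, 30):
    memory.add(value)
print(len(memory))     # 3
memory.reset()
print(len(memory))     # 0
print(memory.data[0])  # 10 (still stored, but no longer counted as valid)
```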

abstractmethod sample(names: list[str], *, batch_size: int, mini_batches: int = 1, sequence_length: int = 1) → list[list[torch.Tensor]][source]¶

Data sampling method to be implemented by the inheriting classes.

Parameters:

names – Tensor names from which to obtain the samples.
batch_size – Number of elements to sample.
mini_batches – Number of mini-batches to sample.
sequence_length – Length of each sequence.

Returns:

Sampled data from tensors sorted according to their position in the list of names. The sampled tensors will have the following shape: (batch_size, data_size).
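
As an illustration of how an inheriting class might implement this method, the sketch below draws indexes uniformly at random. It is plain Python and makes several assumptions: `FlatMemory` and `UniformMemory` are hypothetical classes, tensors are flat lists of rows, the mini_batches and sequence_length arguments are omitted, and the outer list of the return value is taken to hold the mini-batches (a single one here).

```python
import random
from abc import ABC, abstractmethod


class FlatMemory(ABC):
    """Hypothetical, framework-free stand-in for the base class."""

    def __init__(self, tensors: dict) -> None:
        # each tensor is a flat list of rows, i.e. shape (number_of_samples, data_size)
        self.tensors = tensors
        self.sampling_indexes = []

    def __len__(self) -> int:
        return len(next(iter(self.tensors.values())))

    def get_sampling_indexes(self) -> list:
        return self.sampling_indexes

    @abstractmethod
    def sample(self, names: list, *, batch_size: int) -> list:
        """Return the sampled data, to be implemented by the inheriting classes."""


class UniformMemory(FlatMemory):
    def __init__(self, tensors: dict, seed=None) -> None:
        super().__init__(tensors)
        self._rng = random.Random(seed)

    def sample(self, names: list, *, batch_size: int) -> list:
        # draw flat indexes uniformly, with replacement, and remember them
        self.sampling_indexes = [self._rng.randrange(len(self)) for _ in range(batch_size)]
        # outer list: mini-batches (one here); inner list: follows the order of names
        return [[[self.tensors[name][i] for i in self.sampling_indexes] for name in names]]


memory = UniformMemory({"states": [[0, 0], [1, 1], [2, 2]], "rewards": [[0.0], [0.1], [0.2]]}, seed=0)
states, rewards = memory.sample(["states", "rewards"], batch_size=4)[0]
print(len(states), len(rewards))  # 4 4
```

Keeping the drawn indexes around is what makes a method such as get_sampling_indexes useful, for example to update the same rows after a training step.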

sample_all(names: list[str], *, mini_batches: int = 1, sequence_length: int = 1) → list[list[torch.Tensor]][source]¶

Sample all data from memory.

Parameters:

names – Tensor names from which to obtain the samples.
mini_batches – Number of mini-batches to sample.
sequence_length – Length of each sequence.

Returns:

Sampled data from memory. The sampled tensors will have the following shape: (memory_size * number_of_environments, data_size).

sample_by_index(names: list[str], *, indexes: list | np.ndarray | torch.Tensor, mini_batches: int = 1) → list[list[torch.Tensor]][source]¶

Sample data from memory according to their indexes.

Parameters:

names – Tensor names from which to obtain the samples.
indexes – Indexes used for sampling.
mini_batches – Number of mini-batches to sample.

Returns:

Sampled data from tensors sorted according to their position in the list of names. The sampled tensors will have the following shape: (number_of_indexes, data_size).
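
The relation between indexes, names and the returned nested list can be sketched in plain Python. The assumptions are loud ones: nested lists stand in for tensors, indexes are taken to address a row-major flattened view of shape (memory_size * num_envs, data_size), the outer list holds the mini-batches, and the indexes are split into contiguous chunks. The function names mirror the method for readability only; this is not the library code.

```python
def flatten(buffer: list) -> list:
    """View a (memory_size, num_envs, data_size) nested list as (memory_size * num_envs, data_size)."""
    return [row for step in buffer for row in step]


def sample_by_index(tensors: dict, names: list, *, indexes: list, mini_batches: int = 1) -> list:
    indexes = list(indexes)
    # ceiling division, so that every index ends up in exactly one chunk
    chunk = max(1, -(-len(indexes) // mini_batches))
    batches = [indexes[i:i + chunk] for i in range(0, len(indexes), chunk)]
    views = {name: flatten(tensors[name]) for name in names}
    # outer list: mini-batches; inner list: one sampled tensor per name, in the given order
    return [[[views[name][i] for i in batch] for name in names] for batch in batches]


tensors = {
    "states": [[[0, 0], [1, 1]], [[2, 2], [3, 3]]],  # memory_size=2, num_envs=2, data_size=2
    "rewards": [[[0.0], [0.1]], [[0.2], [0.3]]],     # memory_size=2, num_envs=2, data_size=1
}
batches = sample_by_index(tensors, ["states", "rewards"], indexes=[3, 0, 2, 1], mini_batches=2)
print(batches[0][0])  # [[3, 3], [0, 0]]
print(batches[1][1])  # [[0.2], [0.1]]
```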

save(directory: str = '', *, format: Literal['pt', 'npz', 'csv'] = 'pt') → None[source]¶

Save the memory to a file.

Parameters:

directory – Path to the folder where the memory will be saved. If not provided, the directory defined in the constructor will be used.
format – Format of the file where the memory will be saved. Supported formats: PyTorch ("pt"), NumPy ("npz") or comma separated values ("csv").

Raises:

ValueError – Unsupported format.

set_tensor_by_name(name: str, tensor: torch.Tensor) → None[source]¶

Set a tensor by its name.

Parameters:

name – Name of the tensor to set.
tensor – Tensor to set.

Raises:

KeyError – The tensor does not exist.

share_memory() → None[source]¶

Set the tensors to be shared between processes.

JAX¶

Memory

Base class that represents a memory with circular buffers.

class skrl.memories.jax.Memory(*, memory_size: int, num_envs: int = 1, device: str | jax.Device | None = None, export: bool = False, export_format: Literal['pt', 'npz', 'csv'] = 'pt', export_directory: str = '')[source]¶

Bases: ABC

Base class that represents a memory with circular buffers.

Parameters:

memory_size – Maximum number of elements in the first dimension for each tensor.
num_envs – Number of parallel environments.
device – Data allocation and computation device. If not specified, the default device will be used.
export – Export the memory to a file. If True, the memory will be exported once it is filled and before the circular buffer starts to overwrite the oldest data.
export_format – File format to export the memory. Supported formats: PyTorch ("pt"), NumPy ("npz") or comma separated values ("csv").
export_directory – Directory where the memory files will be exported. If not specified, the agent’s experiment directory will be used.

Raises:

ValueError – Unsupported export format.

__len__() → int[source]¶

Compute and return the current (valid) size of the memory.

The valid size is computed as:

memory_size * num_envs if the memory is full (filled)
memory_index * num_envs + env_index otherwise

Returns: Valid size.

Methods:

`add_samples`(**tensors)	Add/store samples in memory.
`create_tensor`(name, *, size[, dtype, ...])	Create a new internal tensor in memory.
`get_sampling_indexes`()	Get the last indexes used for sampling.
`get_tensor_by_name`(name)	Get a tensor by its name.
`get_tensor_names`()	Get the name of the internal tensors, sorted alphabetically.
`load`(path)	Load the memory from a file.
`reset`()	Reset the memory by clearing internal indexes and flags.
`sample`(names, *, batch_size[, mini_batches, ...])	Data sampling method to be implemented by the inheriting classes.
`sample_all`(names, *[, mini_batches, ...])	Sample all data from memory.
`sample_by_index`(names, *, indexes[, ...])	Sample data from memory according to their indexes.
`save`([directory, format])	Save the memory to a file.
`set_tensor_by_name`(name, tensor)	Set a tensor by its name.
`share_memory`()	Set the tensors to be shared between processes.

add_samples(**tensors: dict[str, jax.Array]) → None[source]¶

Add/store samples in memory.

Important

All tensors must have the same dimensions (2 dimensions) and shape: (current_num_envs, data_size). If the tensors have one dimension, it is assumed that current_num_envs is 1.

No check is performed for compatibility of the shapes or for memory write overflow.

According to the number of environments, the following behavior is performed:

current_num_envs = num_envs: store samples and increment the memory index (1st index) by one.
current_num_envs < num_envs: store samples and increment the environment index (2nd index) by the current number of environments.
current_num_envs > num_envs and num_envs = 1: store multiple samples and increment the memory index (1st index) by the number of samples. If the number of samples is greater than the remaining memory size, the memory will be filled and the circular buffer will overwrite the oldest data with the remaining samples.

Parameters: tensors – Sample data, as key-value arguments (keys: tensor names). Non-existing tensors will be skipped.
Raises: ValueError – No tensors were provided or the tensors have incompatible shapes.

create_tensor(name: str, *, size: int | list[int] | gymnasium.Space | None, dtype: jnp.dtype | None = None, keep_dimensions: bool = False) → bool[source]¶

Create a new internal tensor in memory.

The tensor will be 3-dimensional, with shape (memory_size, num_envs, data_size). The internal representation will use _tensor_<name> as the name of the class property.

Parameters:

name – Tensor name (the name must follow the Python PEP 8 style).
size – Number of elements in the last dimension (effective data size). If a space is provided, the size will be computed as the number of elements occupied by the space.
dtype – Data type. If not specified, the default data type for JAX will be used.
keep_dimensions – Whether to create a tensor with the original data dimensions. If enabled, only sequences of integers are supported as data size.

Returns:

True if the tensor was created, otherwise False.

Raises:

ValueError – A tensor with the same name exists already but its size and/or dtype is different.

get_sampling_indexes() → list | jax.Array[source]¶

Get the last indexes used for sampling.

Returns: Last sampling indexes.

get_tensor_by_name(name: str) → jax.Array[source]¶

Get a tensor by its name.

Parameters: name – Name of the tensor to get.
Returns: Tensor.
Raises: KeyError – The tensor does not exist.

get_tensor_names() → list[str][source]¶

Get the name of the internal tensors, sorted alphabetically.

Returns: Tensor names without the internal prefix (_tensor_).

load(path: str) → None[source]¶

Load the memory from a file.

Supported formats: PyTorch ("pt"), NumPy ("npz") or comma separated values ("csv").

Parameters: path – Path to the file from which the memory will be loaded.
Raises: ValueError – Unsupported format.

reset() → None[source]¶

Reset the memory by clearing internal indexes and flags.

Note

Old data will be retained until overwritten, but access through the available methods will not be guaranteed.

Default values of the internal indexes and flags after the reset:

filled: False
env_index: 0
memory_index: 0

abstractmethod sample(names: list[str], *, batch_size: int, mini_batches: int = 1, sequence_length: int = 1) → list[list[jax.Array]][source]¶

Data sampling method to be implemented by the inheriting classes.

Parameters:

names – Tensor names from which to obtain the samples.
batch_size – Number of elements to sample.
mini_batches – Number of mini-batches to sample.
sequence_length – Length of each sequence.

Returns:

Sampled data from tensors sorted according to their position in the list of names. The sampled tensors will have the following shape: (batch_size, data_size).

sample_all(names: list[str], *, mini_batches: int = 1, sequence_length: int = 1) → list[list[jax.Array]][source]¶

Sample all data from memory.

Parameters:

names – Tensor names from which to obtain the samples.
mini_batches – Number of mini-batches to sample.
sequence_length – Length of each sequence.

Returns:

Sampled data from memory. The sampled tensors will have the following shape: (memory_size * number_of_environments, data_size).

sample_by_index(names: list[str], *, indexes: list | jax.Array, mini_batches: int = 1) → list[list[jax.Array]][source]¶

Sample data from memory according to their indexes.

Parameters:

names – Tensor names from which to obtain the samples.
indexes – Indexes used for sampling.
mini_batches – Number of mini-batches to sample.

Returns:

Sampled data from tensors sorted according to their position in the list of names. The sampled tensors will have the following shape: (number_of_indexes, data_size).

save(directory: str = '', *, format: Literal['pt', 'npz', 'csv'] = 'pt') → None[source]¶

Save the memory to a file.

Parameters:

directory – Path to the folder where the memory will be saved. If not provided, the directory defined in the constructor will be used.
format – Format of the file where the memory will be saved. Supported formats: PyTorch ("pt"), NumPy ("npz") or comma separated values ("csv").

Raises:

ValueError – Unsupported format.

set_tensor_by_name(name: str, tensor: jax.Array) → None[source]¶

Set a tensor by its name.

Parameters:

name – Name of the tensor to set.
tensor – Tensor to set.

Raises:

KeyError – The tensor does not exist.

share_memory() → None[source]¶

Set the tensors to be shared between processes.

Warp¶

Memory

Base class that represents a memory with circular buffers.

class skrl.memories.warp.Memory(*, memory_size: int, num_envs: int = 1, device: str | wp.Device | None = None, export: bool = False, export_format: Literal['pt', 'npz', 'csv'] = 'pt', export_directory: str = '')[source]¶

Bases: ABC

Base class that represents a memory with circular buffers.

Parameters:

memory_size – Maximum number of elements in the first dimension for each tensor.
num_envs – Number of parallel environments.
device – Data allocation and computation device. If not specified, the default device will be used.
export – Export the memory to a file. If True, the memory will be exported once it is filled and before the circular buffer starts to overwrite the oldest data.
export_format – File format to export the memory. Supported formats: PyTorch ("pt"), NumPy ("npz") or comma separated values ("csv").
export_directory – Directory where the memory files will be exported. If not specified, the agent’s experiment directory will be used.

Raises:

ValueError – Unsupported export format.

__len__() → int[source]¶

Compute and return the current (valid) size of the memory.

The valid size is computed as:

memory_size * num_envs if the memory is full (filled)
memory_index * num_envs + env_index otherwise

Returns: Valid size.

Methods:

`add_samples`(**tensors)	Add/store samples in memory.
`create_tensor`(name, *, size[, dtype, ...])	Create a new internal tensor in memory.
`get_sampling_indexes`()	Get the last indexes used for sampling.
`get_tensor_by_name`(name)	Get a tensor by its name.
`get_tensor_names`()	Get the name of the internal tensors, sorted alphabetically.
`load`(path)	Load the memory from a file.
`reset`()	Reset the memory by clearing internal indexes and flags.
`sample`(names, *, batch_size[, mini_batches, ...])	Data sampling method to be implemented by the inheriting classes.
`sample_all`(names, *[, mini_batches, ...])	Sample all data from memory.
`sample_by_index`(names, *, indexes[, ...])	Sample data from memory according to their indexes.
`save`([directory, format])	Save the memory to a file.
`set_tensor_by_name`(name, tensor)	Set a tensor by its name.
`share_memory`()	Set the tensors to be shared between processes.

add_samples(**tensors: dict[str, warp.array]) → None[source]¶

Add/store samples in memory.

Important

All tensors must have the same dimensions (2 dimensions) and shape: (current_num_envs, data_size). If the tensors have one dimension, it is assumed that current_num_envs is 1.

No check is performed for compatibility of the shapes or for memory write overflow.

According to the number of environments, the following behavior is performed:

current_num_envs = num_envs: store samples and increment the memory index (1st index) by one.
current_num_envs < num_envs: store samples and increment the environment index (2nd index) by the current number of environments.
current_num_envs > num_envs and num_envs = 1: store multiple samples and increment the memory index (1st index) by the number of samples. If the number of samples is greater than the remaining memory size, the memory will be filled and the circular buffer will overwrite the oldest data with the remaining samples.

Parameters: tensors – Sample data, as key-value arguments (keys: tensor names). Non-existing tensors will be skipped.
Raises: ValueError – No tensors were provided or the tensors have incompatible shapes.

create_tensor(name: str, *, size: int | list[int] | gymnasium.Space | None, dtype: type | None = None, keep_dimensions: bool = False) → bool[source]¶

Create a new internal tensor in memory.

The tensor will be 3-dimensional, with shape (memory_size, num_envs, data_size). The internal representation will use _tensor_<name> as the name of the class property.

Parameters:

name – Tensor name (the name must follow the Python PEP 8 style).
size – Number of elements in the last dimension (effective data size). If a space is provided, the size will be computed as the number of elements occupied by the space.
dtype – Data type. If not specified, the default data type for Warp will be used.
keep_dimensions – Whether to create a tensor with the original data dimensions. If enabled, only sequences of integers are supported as data size.

Returns:

True if the tensor was created, otherwise False.

Raises:

ValueError – A tensor with the same name exists already but its size and/or dtype is different.

get_sampling_indexes() → list | np.ndarray | wp.array[source]¶

Get the last indexes used for sampling.

Returns: Last sampling indexes.

get_tensor_by_name(name: str) → warp.array[source]¶

Get a tensor by its name.

Parameters: name – Name of the tensor to get.
Returns: Tensor.
Raises: KeyError – The tensor does not exist.

get_tensor_names() → list[str][source]¶

Get the name of the internal tensors, sorted alphabetically.

Returns: Tensor names without the internal prefix (_tensor_).

load(path: str) → None[source]¶

Load the memory from a file.

Supported formats: PyTorch ("pt"), NumPy ("npz") or comma separated values ("csv").

Parameters: path – Path to the file from which the memory will be loaded.
Raises: ValueError – Unsupported format.

reset() → None[source]¶

Reset the memory by clearing internal indexes and flags.

Note

Old data will be retained until overwritten, but access through the available methods will not be guaranteed.

Default values of the internal indexes and flags after the reset:

filled: False
env_index: 0
memory_index: 0

abstractmethod sample(names: list[str], *, batch_size: int, mini_batches: int = 1, sequence_length: int = 1) → list[list[warp.array]][source]¶

Data sampling method to be implemented by the inheriting classes.

Parameters:

names – Tensor names from which to obtain the samples.
batch_size – Number of elements to sample.
mini_batches – Number of mini-batches to sample.
sequence_length – Length of each sequence.

Returns:

Sampled data from tensors sorted according to their position in the list of names. The sampled tensors will have the following shape: (batch_size, data_size).

sample_all(names: list[str], *, mini_batches: int = 1, sequence_length: int = 1) → list[list[warp.array]][source]¶

Sample all data from memory.

Parameters:

names – Tensor names from which to obtain the samples.
mini_batches – Number of mini-batches to sample.
sequence_length – Length of each sequence.

Returns:

Sampled data from memory. The sampled tensors will have the following shape: (memory_size * number_of_environments, data_size).

sample_by_index(names: list[str], *, indexes: list | np.ndarray | wp.array, mini_batches: int = 1) → list[list[wp.array]][source]¶

Sample data from memory according to their indexes.

Parameters:

names – Tensor names from which to obtain the samples.
indexes – Indexes used for sampling.
mini_batches – Number of mini-batches to sample.

Returns:

Sampled data from tensors sorted according to their position in the list of names. The sampled tensors will have the following shape: (number_of_indexes, data_size).

save(directory: str = '', *, format: Literal['pt', 'npz', 'csv'] = 'pt') → None[source]¶

Save the memory to a file.

Parameters:

directory – Path to the folder where the memory will be saved. If not provided, the directory defined in the constructor will be used.
format – Format of the file where the memory will be saved. Supported formats: PyTorch ("pt"), NumPy ("npz") or comma separated values ("csv").

Raises:

ValueError – Unsupported format.

set_tensor_by_name(name: str, tensor: warp.array) → None[source]¶

Set a tensor by its name.

Parameters:

name – Name of the tensor to set.
tensor – Tensor to set.

Raises:

KeyError – The tensor does not exist.

share_memory() → None[source]¶

Set the tensors to be shared between processes.