Memories#
Memories are storage components that allow agents to collect, reuse, and sample current or past experiences of their interaction with the environment, as well as other types of information.
Base class#
Note
This is the base class for all the other classes in this module. It provides the basic functionality for the other classes. It is not intended to be used directly.
Basic inheritance usage#
from typing import Union, Tuple, List

import torch

from skrl.memories.torch import Memory


class CustomMemory(Memory):
    def __init__(self, memory_size: int, num_envs: int = 1, device: Union[str, torch.device] = "cuda:0") -> None:
        """Custom memory

        :param memory_size: Maximum number of elements in the first dimension of each internal storage
        :type memory_size: int
        :param num_envs: Number of parallel environments (default: 1)
        :type num_envs: int, optional
        :param device: Device on which a torch tensor is or will be allocated (default: "cuda:0")
        :type device: str or torch.device, optional
        """
        super().__init__(memory_size, num_envs, device)

    def sample(self, names: Tuple[str], batch_size: int, mini_batches: int = 1) -> List[List[torch.Tensor]]:
        """Sample a batch from memory

        :param names: Tensor names from which to obtain the samples
        :type names: tuple or list of strings
        :param batch_size: Number of elements to sample
        :type batch_size: int
        :param mini_batches: Number of mini-batches to sample (default: 1)
        :type mini_batches: int, optional

        :return: Sampled data from tensors sorted according to their position in the list of names.
                 The sampled tensors will have the following shape: (batch size, data size)
        :rtype: list of torch.Tensor list
        """
        # ================================
        # - Sample a batch from memory.
        #   It is possible to generate only the sampling indexes and call self.sample_by_index(...)
        # ================================
from typing import Optional, Union, Tuple, List

import jaxlib
import jax.numpy as jnp

from skrl.memories.jax import Memory


class CustomMemory(Memory):
    def __init__(self, memory_size: int,
                 num_envs: int = 1,
                 device: Optional[jaxlib.xla_extension.Device] = None) -> None:
        """Custom memory

        :param memory_size: Maximum number of elements in the first dimension of each internal storage
        :type memory_size: int
        :param num_envs: Number of parallel environments (default: 1)
        :type num_envs: int, optional
        :param device: Device on which an array is or will be allocated (default: None)
        :type device: jaxlib.xla_extension.Device, optional
        """
        super().__init__(memory_size, num_envs, device)

    def sample(self, names: Tuple[str], batch_size: int, mini_batches: int = 1) -> List[List[jnp.ndarray]]:
        """Sample a batch from memory

        :param names: Tensor names from which to obtain the samples
        :type names: tuple or list of strings
        :param batch_size: Number of elements to sample
        :type batch_size: int
        :param mini_batches: Number of mini-batches to sample (default: 1)
        :type mini_batches: int, optional

        :return: Sampled data from tensors sorted according to their position in the list of names.
                 The sampled tensors will have the following shape: (batch size, data size)
        :rtype: list of jnp.ndarray list
        """
        # ================================
        # - Sample a batch from memory.
        #   It is possible to generate only the sampling indexes and call self.sample_by_index(...)
        # ================================
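Both templates hint at the same pattern: generate the sampling indexes first, then gather the corresponding rows (e.g. via self.sample_by_index(...)). A framework-free sketch of that pattern, using plain Python lists in place of tensors (hypothetical helpers, not the library's code):

```python
import random

def sample_indexes(valid_size: int, batch_size: int) -> list:
    """Draw distinct random positions from the valid portion of the memory."""
    return random.sample(range(valid_size), batch_size)

def gather(storage: list, indexes: list) -> list:
    """Collect the rows of a flat storage at the sampled positions."""
    return [storage[i] for i in indexes]

# a toy "tensor" with 10 rows, each of data size 2
storage = [[float(i), float(i) * 2] for i in range(10)]
indexes = sample_indexes(valid_size=len(storage), batch_size=4)
batch = gather(storage, indexes)  # 4 rows, each of data size 2
```

Separating index generation from gathering is what lets a custom memory reuse the base class's sample_by_index instead of reimplementing the gather step.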
API (PyTorch)#
- class skrl.memories.torch.base.Memory(memory_size: int, num_envs: int = 1, device: str | torch.device | None = None, export: bool = False, export_format: str = 'pt', export_directory: str = '')#
Bases:
object
- __init__(memory_size: int, num_envs: int = 1, device: str | torch.device | None = None, export: bool = False, export_format: str = 'pt', export_directory: str = '') None #
Base class representing a memory with circular buffers
Buffers are torch tensors with shape (memory size, number of environments, data size). Circular buffers are implemented with two integers: a memory index and an environment index
- Parameters:
memory_size (int) – Maximum number of elements in the first dimension of each internal storage
num_envs (int, optional) – Number of parallel environments (default: 1)
device (str or torch.device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
export (bool, optional) – Export the memory to a file (default: False). If True, the memory will be exported when the memory is filled
export_format (str, optional) – Export format (default: "pt"). Supported formats: torch (pt), numpy (np), comma-separated values (csv)
export_directory (str, optional) – Directory where the memory will be exported (default: ""). If empty, the agent's experiment directory will be used
- Raises:
ValueError – The export format is not supported
- __len__() int #
Compute and return the current (valid) size of the memory
The valid size is calculated as memory_size * num_envs if the memory is full (filled). Otherwise, memory_index * num_envs + env_index is returned
- Returns:
Valid size
- Return type:
int
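The valid-size rule can be checked with a short snippet (plain Python, with values chosen purely for illustration):

```python
def valid_size(memory_size: int, num_envs: int, filled: bool,
               memory_index: int, env_index: int) -> int:
    """Current (valid) number of stored samples, following the rule above."""
    if filled:
        return memory_size * num_envs
    return memory_index * num_envs + env_index

# a memory of 1000 steps for 4 environments, not yet full:
print(valid_size(1000, 4, filled=False, memory_index=10, env_index=2))  # 42
# the same memory once filled:
print(valid_size(1000, 4, filled=True, memory_index=10, env_index=2))   # 4000
```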
- add_samples(**tensors: torch.Tensor) None #
Record samples in memory
Samples should be a tensor with 2-components shape (number of environments, data size). All tensors must be of the same shape
According to the number of environments, the following classification is made:
one environment: Store a single sample (tensors with one dimension) and increment the environment index (second index) by one
number of environments less than num_envs: Store the samples and increment the environment index (second index) by the number of the environments
number of environments equals num_envs: Store the samples and increment the memory index (first index) by one
- Parameters:
tensors (dict) – Sampled data as key-value arguments where the keys are the names of the tensors to be modified. Non-existing tensors will be skipped
- Raises:
ValueError – No tensors were provided or the tensors have incompatible shapes
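One plausible reading of the index bookkeeping described above, sketched framework-free (a hypothetical helper, not the library's implementation):

```python
def advance(memory_index: int, env_index: int, filled: bool,
            memory_size: int, num_envs: int, sampled_envs: int):
    """Advance the circular-buffer indexes after storing samples for `sampled_envs` environments."""
    if sampled_envs == num_envs:
        # a full row of environments: advance the memory (first) index
        memory_index += 1
        env_index = 0
    else:
        # fewer environments than num_envs: advance the environment (second) index
        env_index += sampled_envs
        if env_index >= num_envs:
            memory_index += 1
            env_index = 0
    if memory_index >= memory_size:
        # the buffer wraps around: it is now full
        memory_index = 0
        filled = True
    return memory_index, env_index, filled
```

For example, with memory_size=3 and num_envs=4, storing a full batch of 4 environments advances the memory index by one, while storing a single environment's sample only advances the environment index.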
- create_tensor(name: str, size: int | Tuple[int] | gym.Space | gymnasium.Space, dtype: torch.dtype | None = None, keep_dimensions: bool = False) bool #
Create a new internal tensor in memory
The tensor will have a 3-components shape (memory size, number of environments, size). The internal representation will use _tensor_<name> as the name of the class property
- Parameters:
name (str) – Tensor name (the name has to follow the python PEP 8 style)
size (int, tuple or list of integers, gym.Space, or gymnasium.Space) – Number of elements in the last dimension (effective data size). The product of the elements will be computed for sequences or gym/gymnasium spaces
dtype (torch.dtype or None, optional) – Data type (default: None). If None, the global default torch data type will be used
keep_dimensions (bool, optional) – Whether or not to keep the dimensions defined through the size parameter (default: False)
- Raises:
ValueError – The tensor name exists already but the size or dtype are different
- Returns:
True if the tensor was created, otherwise False
- Return type:
bool
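The "product of the elements" rule for size can be illustrated without gym/gymnasium installed (an assumed-equivalent computation, not the library's code):

```python
from math import prod

def data_size(size) -> int:
    """Effective data size: ints pass through, sequences (e.g. a space's shape) are multiplied out."""
    if isinstance(size, int):
        return size
    return int(prod(size))

print(data_size(7))       # 7
print(data_size((3, 4)))  # 12 -> e.g. a 3x4 observation stored as 12 values per step
```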
- get_sampling_indexes() tuple | ndarray | torch.Tensor #
Get the last indexes used for sampling
- Returns:
Last sampling indexes
- Return type:
tuple or list, numpy.ndarray or torch.Tensor
- get_tensor_by_name(name: str, keepdim: bool = True) torch.Tensor #
Get a tensor by its name
- Parameters:
name (str) – Name of the tensor to retrieve
keepdim (bool, optional) – Keep the tensor with its original 3-component shape (memory size, number of environments, size) (default: True)
- Raises:
KeyError – The tensor does not exist
- Returns:
Tensor
- Return type:
torch.Tensor
- get_tensor_names() Tuple[str] #
Get the name of the internal tensors in alphabetical order
- Returns:
Tensor names without internal prefix (_tensor_)
- Return type:
tuple of strings
- load(path: str) None #
Load the memory from a file
Supported formats:
PyTorch (pt)
NumPy (npz)
Comma-separated values (csv)
- Parameters:
path (str) – Path to the file where the memory will be loaded
- Raises:
ValueError – If the format is not supported
- reset() None #
Reset the memory by cleaning internal indexes and flags
Old data will be retained until overwritten, but access through the available methods will not be guaranteed
Default values of the internal indexes and flags
filled: False
env_index: 0
memory_index: 0
- sample(names: Tuple[str], batch_size: int, mini_batches: int = 1, sequence_length: int = 1) List[List[torch.Tensor]] #
Data sampling method to be implemented by the inheriting classes
- Parameters:
names (tuple or list of strings) – Tensor names from which to obtain the samples
batch_size (int) – Number of elements to sample
mini_batches (int, optional) – Number of mini-batches to sample (default: 1)
sequence_length (int, optional) – Length of each sequence (default: 1)
- Raises:
NotImplementedError – The method has not been implemented
- Returns:
Sampled data from tensors sorted according to their position in the list of names. The sampled tensors will have the following shape: (batch size, data size)
- Return type:
list of torch.Tensor list
- sample_all(names: Tuple[str], mini_batches: int = 1, sequence_length: int = 1) List[List[torch.Tensor]] #
Sample all data from memory
- Parameters:
names (tuple or list of strings) – Tensor names from which to obtain the samples
mini_batches (int, optional) – Number of mini-batches to sample (default: 1)
sequence_length (int, optional) – Length of each sequence (default: 1)
- Returns:
Sampled data from memory. The sampled tensors will have the following shape: (memory size * number of environments, data size)
- Return type:
list of torch.Tensor list
- sample_by_index(names: Tuple[str], indexes: tuple | ndarray | torch.Tensor, mini_batches: int = 1) List[List[torch.Tensor]] #
Sample data from memory according to their indexes
- Parameters:
names (tuple or list of strings) – Tensor names from which to obtain the samples
indexes (tuple or list, numpy.ndarray or torch.Tensor) – Indexes used for sampling
mini_batches (int, optional) – Number of mini-batches to sample (default: 1)
- Returns:
Sampled data from tensors sorted according to their position in the list of names. The sampled tensors will have the following shape: (number of indexes, data size)
- Return type:
list of torch.Tensor list
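When mini_batches is greater than 1, the sampled indexes are divided into that many groups before gathering. A minimal sketch of such a split (a hypothetical helper, shown on plain Python lists):

```python
def split_mini_batches(indexes: list, mini_batches: int) -> list:
    """Split sampled indexes into `mini_batches` equal-sized chunks (any remainder is dropped)."""
    batch_size = len(indexes) // mini_batches
    return [indexes[i * batch_size:(i + 1) * batch_size] for i in range(mini_batches)]

chunks = split_mini_batches(list(range(12)), mini_batches=3)
print([len(c) for c in chunks])  # [4, 4, 4]
```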
- save(directory: str = '', format: str = 'pt') None #
Save the memory to a file
Supported formats:
PyTorch (pt)
NumPy (npz)
Comma-separated values (csv)
- Parameters:
directory (str, optional) – Directory where the memory will be saved (default: ""). If empty, the agent's experiment directory will be used
format (str, optional) – Format of the file where the memory will be saved (default: "pt")
- Raises:
ValueError – If the format is not supported
- set_tensor_by_name(name: str, tensor: torch.Tensor) None #
Set a tensor by its name
- Parameters:
name (str) – Name of the tensor to set
tensor (torch.Tensor) – Tensor to set
- Raises:
KeyError – The tensor does not exist
Share the tensors between processes
API (JAX)#
- class skrl.memories.jax.base.Memory(memory_size: int, num_envs: int = 1, device: jax.Device | None = None, export: bool = False, export_format: str = 'pt', export_directory: str = '')#
Bases:
object
- __init__(memory_size: int, num_envs: int = 1, device: jax.Device | None = None, export: bool = False, export_format: str = 'pt', export_directory: str = '') None #
Base class representing a memory with circular buffers
Buffers are jax or numpy arrays with shape (memory size, number of environments, data size). Circular buffers are implemented with two integers: a memory index and an environment index
- Parameters:
memory_size (int) – Maximum number of elements in the first dimension of each internal storage
num_envs (int, optional) – Number of parallel environments (default: 1)
device (str or jax.Device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
export (bool, optional) – Export the memory to a file (default: False). If True, the memory will be exported when the memory is filled
export_format (str, optional) – Export format (default: "pt"). Supported formats: torch (pt), numpy (np), comma-separated values (csv)
export_directory (str, optional) – Directory where the memory will be exported (default: ""). If empty, the agent's experiment directory will be used
- Raises:
ValueError – The export format is not supported
- __len__() int #
Compute and return the current (valid) size of the memory
The valid size is calculated as memory_size * num_envs if the memory is full (filled). Otherwise, memory_index * num_envs + env_index is returned
- Returns:
Valid size
- Return type:
int
- add_samples(**tensors: Mapping[str, ndarray | jax.Array]) None #
Record samples in memory
Samples should be a tensor with 2-components shape (number of environments, data size). All tensors must be of the same shape
According to the number of environments, the following classification is made:
one environment: Store a single sample (tensors with one dimension) and increment the environment index (second index) by one
number of environments less than num_envs: Store the samples and increment the environment index (second index) by the number of the environments
number of environments equals num_envs: Store the samples and increment the memory index (first index) by one
- Parameters:
tensors (dict) – Sampled data as key-value arguments where the keys are the names of the tensors to be modified. Non-existing tensors will be skipped
- Raises:
ValueError – No tensors were provided or the tensors have incompatible shapes
- create_tensor(name: str, size: int | Tuple[int] | gym.Space | gymnasium.Space, dtype: dtype | None = None, keep_dimensions: bool = False) bool #
Create a new internal tensor in memory
The tensor will have a 3-components shape (memory size, number of environments, size). The internal representation will use _tensor_<name> as the name of the class property
- Parameters:
name (str) – Tensor name (the name has to follow the python PEP 8 style)
size (int, tuple or list of integers, gym.Space, or gymnasium.Space) – Number of elements in the last dimension (effective data size). The product of the elements will be computed for sequences or gym/gymnasium spaces
dtype (np.dtype or None, optional) – Data type (default: None). If None, the global default jax.numpy.float32 data type will be used
keep_dimensions (bool, optional) – Whether or not to keep the dimensions defined through the size parameter (default: False)
- Raises:
ValueError – The tensor name exists already but the size or dtype are different
- Returns:
True if the tensor was created, otherwise False
- Return type:
bool
- get_tensor_by_name(name: str, keepdim: bool = True) ndarray | jax.Array #
Get a tensor by its name
- Parameters:
name (str) – Name of the tensor to retrieve
keepdim (bool, optional) – Keep the tensor with its original 3-component shape (memory size, number of environments, size) (default: True)
- Raises:
KeyError – The tensor does not exist
- Returns:
Tensor
- Return type:
np.ndarray or jax.Array
- get_tensor_names() Tuple[str] #
Get the name of the internal tensors in alphabetical order
- Returns:
Tensor names without internal prefix (_tensor_)
- Return type:
tuple of strings
- load(path: str) None #
Load the memory from a file
Supported formats:
PyTorch (pt)
NumPy (npz)
Comma-separated values (csv)
- Parameters:
path (str) – Path to the file where the memory will be loaded
- Raises:
ValueError – If the format is not supported
- reset() None #
Reset the memory by cleaning internal indexes and flags
Old data will be retained until overwritten, but access through the available methods will not be guaranteed
Default values of the internal indexes and flags
filled: False
env_index: 0
memory_index: 0
- sample(names: Tuple[str], batch_size: int, mini_batches: int = 1, sequence_length: int = 1) List[List[ndarray | jax.Array]] #
Data sampling method to be implemented by the inheriting classes
- Parameters:
names (tuple or list of strings) – Tensor names from which to obtain the samples
batch_size (int) – Number of elements to sample
mini_batches (int, optional) – Number of mini-batches to sample (default: 1)
sequence_length (int, optional) – Length of each sequence (default: 1)
- Raises:
NotImplementedError – The method has not been implemented
- Returns:
Sampled data from tensors sorted according to their position in the list of names. The sampled tensors will have the following shape: (batch size, data size)
- Return type:
list of np.ndarray or jax.Array list
- sample_all(names: Tuple[str], mini_batches: int = 1, sequence_length: int = 1) List[List[ndarray | jax.Array]] #
Sample all data from memory
- Parameters:
names (tuple or list of strings) – Tensor names from which to obtain the samples
mini_batches (int, optional) – Number of mini-batches to sample (default: 1)
sequence_length (int, optional) – Length of each sequence (default: 1)
- Returns:
Sampled data from memory. The sampled tensors will have the following shape: (memory size * number of environments, data size)
- Return type:
list of np.ndarray or jax.Array list
- sample_by_index(names: Tuple[str], indexes: tuple | ndarray | jax.Array, mini_batches: int = 1) List[List[ndarray | jax.Array]] #
Sample data from memory according to their indexes
- Parameters:
names (tuple or list of strings) – Tensor names from which to obtain the samples
indexes (tuple or list, numpy.ndarray or jax.Array) – Indexes used for sampling
mini_batches (int, optional) – Number of mini-batches to sample (default: 1)
- Returns:
Sampled data from tensors sorted according to their position in the list of names. The sampled tensors will have the following shape: (number of indexes, data size)
- Return type:
list of np.ndarray or jax.Array list
- save(directory: str = '', format: str = 'pt') None #
Save the memory to a file
Supported formats:
PyTorch (pt)
NumPy (npz)
Comma-separated values (csv)
- Parameters:
directory (str, optional) – Directory where the memory will be saved (default: ""). If empty, the agent's experiment directory will be used
format (str, optional) – Format of the file where the memory will be saved (default: "pt")
- Raises:
ValueError – If the format is not supported
Share the tensors between processes