Running standard scaler
Standardize input features by removing the mean and scaling to unit variance.
Algorithm
Algorithm implementation
Standardization by centering and scaling
Scale back the data to the original representation (inverse transform)
Update the running mean and variance (See parallel algorithm)
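The three steps above can be written out as follows. This is a sketch in notation chosen here, not taken from the source: $\mu$, $\sigma^2$ and $n$ are the running mean, variance and sample count; $\mu_B$, $\sigma_B^2$ and $n_B$ are the statistics of an incoming batch; $c$ is the clip threshold. The placement of $\epsilon$ outside the square root and the clipping applied in the inverse step are assumptions about the implementation.

```latex
\begin{aligned}
\text{standardize:}\quad & \hat{x} = \operatorname{clip}\!\left(\frac{x - \mu}{\sqrt{\sigma^2} + \epsilon},\; -c,\; c\right) \\
\text{inverse:}\quad & x = \sqrt{\sigma^2}\,\operatorname{clip}(\hat{x},\, -c,\, c) + \mu \\
\text{update:}\quad & \delta = \mu_B - \mu, \qquad n' = n + n_B \\
& \mu' = \mu + \delta\,\frac{n_B}{n'} \\
& \sigma'^2 = \frac{n\,\sigma^2 + n_B\,\sigma_B^2 + \delta^2\,\dfrac{n\,n_B}{n'}}{n'}
\end{aligned}
```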
Usage
Preprocessors are configured through each agent's configuration dictionary.
The preprocessor class is set under the "<variable>_preprocessor" key,
and its constructor arguments are set under the "<variable>_preprocessor_kwargs" key
as a dictionary of keyword arguments. The following examples show how to set the state and value preprocessors for an agent, first with PyTorch and then with JAX:
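# PyTorch: import the preprocessor class
from skrl.resources.preprocessors.torch import RunningStandardScaler
# DEFAULT_CONFIG is the selected agent's default configuration dictionary;
# env and device are assumed to be defined earlier in the script
cfg = DEFAULT_CONFIG.copy()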
cfg["state_preprocessor"] = RunningStandardScaler
cfg["state_preprocessor_kwargs"] = {"size": env.observation_space, "device": device}
cfg["value_preprocessor"] = RunningStandardScaler
cfg["value_preprocessor_kwargs"] = {"size": 1, "device": device}
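# JAX: import the preprocessor class
from skrl.resources.preprocessors.jax import RunningStandardScaler
# DEFAULT_CONFIG is the selected agent's default configuration dictionary;
# env is assumed to be defined earlier in the script
cfg = DEFAULT_CONFIG.copy()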
cfg["state_preprocessor"] = RunningStandardScaler
cfg["state_preprocessor_kwargs"] = {"size": env.observation_space}
cfg["value_preprocessor"] = RunningStandardScaler
cfg["value_preprocessor_kwargs"] = {"size": 1}
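To make the class/kwargs pairing concrete, here is a minimal sketch of how such a pair of keys could be turned into a preprocessor instance. `DummyScaler` and `build_preprocessor` are hypothetical names introduced only for illustration; they are not part of skrl, and the agents' actual instantiation code may differ.

```python
class DummyScaler:
    """Hypothetical stand-in for a preprocessor class (not part of skrl)."""

    def __init__(self, size, device=None):
        self.size = size
        self.device = device

    def __call__(self, x, train=False, inverse=False):
        # placeholder behavior: return the input unchanged
        return x


def build_preprocessor(cfg, variable):
    """Instantiate cfg["<variable>_preprocessor"] with its kwargs, or return None."""
    cls = cfg.get(f"{variable}_preprocessor")
    if cls is None:
        return None
    kwargs = cfg.get(f"{variable}_preprocessor_kwargs", {})
    return cls(**kwargs)


cfg = {
    "state_preprocessor": DummyScaler,
    "state_preprocessor_kwargs": {"size": 4, "device": "cpu"},
    "value_preprocessor": None,  # no preprocessing for this variable
    "value_preprocessor_kwargs": {},
}

state_preprocessor = build_preprocessor(cfg, "state")
value_preprocessor = build_preprocessor(cfg, "value")
```

Keeping the class and its arguments under separate keys lets the configuration stay a plain dictionary while deferring construction until the agent is created.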
API (PyTorch)
- class skrl.resources.preprocessors.torch.running_standard_scaler.RunningStandardScaler(*args: Any, **kwargs: Any)
- __init__(size: int | Tuple[int] | gym.Space | gymnasium.Space, epsilon: float = 1e-08, clip_threshold: float = 5.0, device: str | torch.device | None = None) -> None
Standardize the input data by removing the mean and scaling by the standard deviation
The implementation is adapted from the rl_games library (https://github.com/Denys88/rl_games/blob/master/rl_games/algos_torch/running_mean_std.py)
Example:
>>> running_standard_scaler = RunningStandardScaler(size=2)
>>> data = torch.rand(3, 2)  # tensor of shape (N, 2)
>>> running_standard_scaler(data)
tensor([[0.1954, 0.3356],
        [0.9719, 0.4163],
        [0.8540, 0.1982]])
- Parameters:
size (int, tuple or list of integers, gym.Space, or gymnasium.Space) – Size of the input space
epsilon (float) – Small number to avoid division by zero (default: 1e-8)
clip_threshold (float) – Threshold to clip the data (default: 5.0)
device (str or torch.device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
- _get_space_size(space: int | Tuple[int] | gym.Space | gymnasium.Space) -> int
Get the size (number of elements) of a space
- Parameters:
space (int, tuple or list of integers, gym.Space, or gymnasium.Space) – Space or shape from which to obtain the number of elements
- Raises:
ValueError – If the space is not supported
- Returns:
Size of the space data (number of elements)
- Return type:
int
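As an illustration of what "number of elements" means, the sketch below handles only the int and tuple/list cases. `space_size` is a hypothetical helper written for this example; the library method also accepts gym/gymnasium spaces, which are deliberately left out here.

```python
import math


def space_size(space):
    """Number of elements described by an int or by a tuple/list shape."""
    if isinstance(space, int):
        return space
    if isinstance(space, (tuple, list)):
        # a shape such as (3, 2) describes 3 * 2 = 6 elements
        return math.prod(space)
    # mirrors the documented behavior of raising ValueError for unsupported input
    raise ValueError(f"Space type {type(space)} not supported")


sizes = [space_size(1), space_size((3, 2)), space_size([4, 4, 3])]
```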
- _parallel_variance(input_mean: torch.Tensor, input_var: torch.Tensor, input_count: int) -> None
Update internal variables using the parallel algorithm for computing variance
https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm
- Parameters:
input_mean (torch.Tensor) – Mean of the input data
input_var (torch.Tensor) – Variance of the input data
input_count (int) – Batch size of the input data
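The parallel algorithm referenced above merges the statistics of two sample sets without revisiting the raw data. The following is a minimal scalar sketch using only the standard library; `merge_stats` is a hypothetical name, and population (biased) variance is assumed for both the running and the batch statistics. The library method operates on tensors and updates internal state instead of returning values.

```python
import statistics


def merge_stats(mean, var, count, batch_mean, batch_var, batch_count):
    """Combine running (mean, var, count) with batch statistics."""
    delta = batch_mean - mean
    total = count + batch_count
    # sum of squared deviations of the combined data
    m2 = var * count + batch_var * batch_count + delta**2 * count * batch_count / total
    new_mean = mean + delta * batch_count / total
    return new_mean, m2 / total, total


first = [1.0, 2.0, 3.0, 4.0]
second = [10.0, 12.0, 14.0]

# start from the statistics of the first batch, then merge the second one
mean, var, count = statistics.fmean(first), statistics.pvariance(first), len(first)
mean, var, count = merge_stats(
    mean, var, count,
    statistics.fmean(second), statistics.pvariance(second), len(second),
)
```

The merged values match the mean and population variance computed directly over the concatenated data, which is what makes the update safe to apply batch by batch.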
- forward(x: torch.Tensor, train: bool = False, inverse: bool = False, no_grad: bool = True) -> torch.Tensor
Forward pass of the standardizer
Example:
>>> x = torch.rand(3, 2, device="cuda:0")
>>> running_standard_scaler(x)
tensor([[0.6933, 0.1905],
        [0.3806, 0.3162],
        [0.1140, 0.0272]], device='cuda:0')
>>> running_standard_scaler(x, train=True)
tensor([[ 0.8681, -0.6731],
        [ 0.0560, -0.3684],
        [-0.6360, -1.0690]], device='cuda:0')
>>> running_standard_scaler(x, inverse=True)
tensor([[0.6260, 0.5468],
        [0.5056, 0.5987],
        [0.4029, 0.4795]], device='cuda:0')
- Parameters:
x (torch.Tensor) – Input tensor
train (bool, optional) – Whether to train the standardizer (default: False)
inverse (bool, optional) – Whether to inverse the standardizer to scale back the data (default: False)
no_grad (bool, optional) – Whether to disable the gradient computation (default: True)
- Returns:
Standardized tensor
- Return type:
torch.Tensor
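The effect of the inverse flag and of the epsilon and clip_threshold settings can be illustrated with fixed statistics. This is a pure-Python sketch, not the library code: `scale` and `clip` are hypothetical helpers, the running-statistics update triggered by train=True is omitted, and adding epsilon to the standard deviation and clipping the input of the inverse step are assumptions.

```python
import math


def clip(value, limit):
    """Clamp value to the interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def scale(x, mean, var, inverse=False, epsilon=1e-8, clip_threshold=5.0):
    """Standardize a list of floats, or scale it back when inverse=True."""
    std = math.sqrt(var)
    if inverse:
        # scale back to the original representation
        return [std * clip(v, clip_threshold) + mean for v in x]
    # center, scale, then clip the standardized values
    return [clip((v - mean) / (std + epsilon), clip_threshold) for v in x]


x = [0.0, 2.0, 4.0, 100.0]
z = scale(x, mean=2.0, var=4.0)
x_back = scale(z, mean=2.0, var=4.0, inverse=True)
```

Values within the clip range survive the round trip up to the small epsilon error, while the outlier 100.0 is clipped during standardization and therefore cannot be recovered by the inverse step.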
API (JAX)
- class skrl.resources.preprocessors.jax.running_standard_scaler.RunningStandardScaler(size: int | Tuple[int] | gym.Space | gymnasium.Space, epsilon: float = 1e-08, clip_threshold: float = 5.0, device: str | jax.Device | None = None)
- __init__(size: int | Tuple[int] | gym.Space | gymnasium.Space, epsilon: float = 1e-08, clip_threshold: float = 5.0, device: str | jax.Device | None = None) -> None
Standardize the input data by removing the mean and scaling by the standard deviation
The implementation is adapted from the rl_games library (https://github.com/Denys88/rl_games/blob/master/rl_games/algos_torch/running_mean_std.py)
Example:
>>> running_standard_scaler = RunningStandardScaler(size=2)
>>> data = jax.random.uniform(jax.random.PRNGKey(0), (3,2))  # tensor of shape (N, 2)
>>> running_standard_scaler(data)
Array([[0.57450044, 0.09968603],
       [0.7419659 , 0.8941783 ],
       [0.59656656, 0.45325184]], dtype=float32)
- Parameters:
size (int, tuple or list of integers, gym.Space, or gymnasium.Space) – Size of the input space
epsilon (float) – Small number to avoid division by zero (default: 1e-8)
clip_threshold (float) – Threshold to clip the data (default: 5.0)
device (str or jax.Device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
- __call__(x: ndarray | jax.Array, train: bool = False, inverse: bool = False) -> ndarray | jax.Array
Forward pass of the standardizer
Example:
>>> x = jax.random.uniform(jax.random.PRNGKey(0), (3,2))
>>> running_standard_scaler(x)
Array([[0.57450044, 0.09968603],
       [0.7419659 , 0.8941783 ],
       [0.59656656, 0.45325184]], dtype=float32)
>>> running_standard_scaler(x, train=True)
Array([[ 0.167439  , -0.4292293 ],
       [ 0.45878986,  0.8719094 ],
       [ 0.20582889,  0.14980486]], dtype=float32)
>>> running_standard_scaler(x, inverse=True)
Array([[0.80847514, 0.4226486 ],
       [0.9047325 , 0.90777594],
       [0.8211585 , 0.6385405 ]], dtype=float32)
- _get_space_size(space: int | Tuple[int] | gym.Space | gymnasium.Space) -> int
Get the size (number of elements) of a space
- Parameters:
space (int, tuple or list of integers, gym.Space, or gymnasium.Space) – Space or shape from which to obtain the number of elements
- Raises:
ValueError – If the space is not supported
- Returns:
Size of the space data (number of elements)
- Return type:
int