Running standard scaler
Standardize input features by removing the mean and scaling to unit variance.
Algorithm
Algorithm implementation
- Standardization by centering and scaling
- Scale back the data to the original representation (inverse transform)
- Update the running mean and variance (see parallel algorithm)
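The first two operations listed above can be illustrated with a minimal scalar sketch in plain Python. It assumes that `epsilon` is added to the standard deviation before dividing and that the inverse transform clips its input before rescaling. The function names `standardize` and `inverse_transform` are hypothetical and not part of skrl, and the running statistics are example values.

```python
import math


def standardize(x, mean, var, epsilon=1e-8, clip_threshold=5.0):
    """Center x by the running mean, scale by the running std, then clip."""
    z = (x - mean) / (math.sqrt(var) + epsilon)
    return max(-clip_threshold, min(clip_threshold, z))


def inverse_transform(z, mean, var, clip_threshold=5.0):
    """Map a standardized value back to the original scale (assumed form)."""
    z = max(-clip_threshold, min(clip_threshold, z))
    return math.sqrt(var) * z + mean


mean, var = 10.0, 4.0                      # example running statistics (std = 2)
z = standardize(13.0, mean, var)           # roughly (13 - 10) / 2
x = inverse_transform(z, mean, var)        # roughly recovers 13.0
clipped = standardize(100.0, mean, var)    # 45 std away, so clipped to the threshold
```

Because of `epsilon`, the round trip is only approximate, and any value clipped during standardization cannot be recovered by the inverse transform.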
Usage
Preprocessor usage is defined in each agent’s configuration dictionary. The preprocessor class is set under the "<variable>_preprocessor" key, and its arguments are set under the "<variable>_preprocessor_kwargs" key as a keyword argument dictionary. The following examples show how to set the preprocessors for an agent:
PyTorch:
# import the preprocessor class
from skrl.resources.preprocessors.torch import RunningStandardScaler

# DEFAULT_CONFIG stands for the agent's default configuration
# (e.g. PPO_DEFAULT_CONFIG for a PPO agent); env and device must already be defined
cfg = DEFAULT_CONFIG.copy()
cfg["state_preprocessor"] = RunningStandardScaler
cfg["state_preprocessor_kwargs"] = {"size": env.observation_space, "device": device}
cfg["value_preprocessor"] = RunningStandardScaler
cfg["value_preprocessor_kwargs"] = {"size": 1, "device": device}

JAX:
# import the preprocessor class
from skrl.resources.preprocessors.jax import RunningStandardScaler

# DEFAULT_CONFIG stands for the agent's default configuration; env must already be defined
cfg = DEFAULT_CONFIG.copy()
cfg["state_preprocessor"] = RunningStandardScaler
cfg["state_preprocessor_kwargs"] = {"size": env.observation_space}
cfg["value_preprocessor"] = RunningStandardScaler
cfg["value_preprocessor_kwargs"] = {"size": 1}
API (PyTorch)
- class skrl.resources.preprocessors.torch.running_standard_scaler.RunningStandardScaler(*args: Any, **kwargs: Any)
Standardize the input data by removing the mean and scaling by the standard deviation
The implementation is adapted from the rl_games library (https://github.com/Denys88/rl_games/blob/master/rl_games/algos_torch/running_mean_std.py)
Example:
>>> running_standard_scaler = RunningStandardScaler(size=2)
>>> data = torch.rand(3, 2)  # tensor of shape (N, 2)
>>> running_standard_scaler(data)
tensor([[0.1954, 0.3356],
        [0.9719, 0.4163],
        [0.8540, 0.1982]])
- Parameters:
size (int, tuple or list of integers, or gymnasium.Space) – Size of the input space
epsilon (float) – Small number to avoid division by zero (default: 1e-8)
clip_threshold (float) – Threshold to clip the data (default: 5.0)
device (str or torch.device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
- _parallel_variance(input_mean: torch.Tensor, input_var: torch.Tensor, input_count: int) -> None
Update internal variables using the parallel algorithm for computing variance
https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm
- Parameters:
input_mean (torch.Tensor) – Mean of the input data
input_var (torch.Tensor) – Variance of the input data
input_count (int) – Batch size of the input data
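The update described by the linked parallel algorithm can be sketched in plain Python for a single feature. This is an illustrative sketch rather than the library's code: `merge_statistics` is a hypothetical name, scalars are used instead of tensors, and variances are taken as population variances (divide by N) so that the merged result can be checked exactly against the concatenated data.

```python
import statistics


def merge_statistics(mean, var, count, input_mean, input_var, input_count):
    """Merge running (mean, var, count) with the statistics of a new batch."""
    delta = input_mean - mean
    total_count = count + input_count
    # combined sum of squared deviations (M2) of both groups
    m2 = var * count + input_var * input_count + delta**2 * count * input_count / total_count
    new_mean = mean + delta * input_count / total_count
    new_var = m2 / total_count
    return new_mean, new_var, total_count


first = [1.0, 2.0, 3.0, 4.0]    # data seen so far
second = [10.0, 12.0, 14.0]     # new batch

mean, var, count = statistics.fmean(first), statistics.pvariance(first), len(first)
mean, var, count = merge_statistics(
    mean, var, count,
    statistics.fmean(second), statistics.pvariance(second), len(second),
)

# the merged values agree with the statistics of the concatenated data
full = first + second
mean_error = abs(mean - statistics.fmean(full))
var_error = abs(var - statistics.pvariance(full))
```

Only the running mean, variance and count need to be stored, so each new batch can be folded in without keeping earlier samples in memory.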
- forward(x: torch.Tensor, train: bool = False, inverse: bool = False, no_grad: bool = True) -> torch.Tensor
Forward pass of the standardizer
Example:
>>> x = torch.rand(3, 2, device="cuda:0")
>>> running_standard_scaler(x)
tensor([[0.6933, 0.1905],
        [0.3806, 0.3162],
        [0.1140, 0.0272]], device='cuda:0')
>>> running_standard_scaler(x, train=True)
tensor([[ 0.8681, -0.6731],
        [ 0.0560, -0.3684],
        [-0.6360, -1.0690]], device='cuda:0')
>>> running_standard_scaler(x, inverse=True)
tensor([[0.6260, 0.5468],
        [0.5056, 0.5987],
        [0.4029, 0.4795]], device='cuda:0')
- Parameters:
x (torch.Tensor) – Input tensor
train (bool, optional) – Whether to train the standardizer (default: False)
inverse (bool, optional) – Whether to inverse the standardizer to scale back the data (default: False)
no_grad (bool, optional) – Whether to disable the gradient computation (default: True)
- Returns:
Standardized tensor
- Return type:
torch.Tensor
API (JAX)
- class skrl.resources.preprocessors.jax.running_standard_scaler.RunningStandardScaler(size: int | Tuple[int] | gymnasium.Space, epsilon: float = 1e-08, clip_threshold: float = 5.0, device: str | jax.Device | None = None)
Standardize the input data by removing the mean and scaling by the standard deviation
The implementation is adapted from the rl_games library (https://github.com/Denys88/rl_games/blob/master/rl_games/algos_torch/running_mean_std.py)
Example:
>>> running_standard_scaler = RunningStandardScaler(size=2)
>>> data = jax.random.uniform(jax.random.PRNGKey(0), (3,2))  # tensor of shape (N, 2)
>>> running_standard_scaler(data)
Array([[0.57450044, 0.09968603],
       [0.7419659 , 0.8941783 ],
       [0.59656656, 0.45325184]], dtype=float32)
- Parameters:
size (int, tuple or list of integers, or gymnasium.Space) – Size of the input space
epsilon (float) – Small number to avoid division by zero (default: 1e-8)
clip_threshold (float) – Threshold to clip the data (default: 5.0)
device (str or jax.Device, optional) – Device on which a tensor/array is or will be allocated (default: None). If None, the device will be either "cuda" if available or "cpu"
- __call__(x: ndarray | jax.Array, train: bool = False, inverse: bool = False) -> ndarray | jax.Array
Forward pass of the standardizer
Example:
>>> x = jax.random.uniform(jax.random.PRNGKey(0), (3,2))
>>> running_standard_scaler(x)
Array([[0.57450044, 0.09968603],
       [0.7419659 , 0.8941783 ],
       [0.59656656, 0.45325184]], dtype=float32)
>>> running_standard_scaler(x, train=True)
Array([[ 0.167439  , -0.4292293 ],
       [ 0.45878986,  0.8719094 ],
       [ 0.20582889,  0.14980486]], dtype=float32)
>>> running_standard_scaler(x, inverse=True)
Array([[0.80847514, 0.4226486 ],
       [0.9047325 , 0.90777594],
       [0.8211585 , 0.6385405 ]], dtype=float32)