SKRL - Reinforcement Learning library (0.6.0)
skrl is an open-source modular library for Reinforcement Learning written in Python (using PyTorch) and designed with a focus on readability, simplicity, and transparency of algorithm implementation. In addition to supporting the OpenAI Gym and DeepMind environment interfaces, it allows loading and configuring NVIDIA Isaac Gym and NVIDIA Omniverse Isaac Gym environments, enabling the simultaneous training of several agents by scopes (subsets of the available environments), which may or may not share resources, in the same run.
- Main features:
Clean code
Modularity and reusability
Documented library, code and implementations
Support for OpenAI Gym, DeepMind, NVIDIA Isaac Gym (preview 2 and 3) and NVIDIA Omniverse Isaac Gym environments
Simultaneous learning by scopes in Isaac Gym and Omniverse Isaac Gym
Warning
skrl is under active and continuous development. Make sure you always have the latest version.
Citing skrl: To cite this library (created at Mondragon Unibertsitatea), use the following reference to its article: “skrl: Modular and Flexible Library for Reinforcement Learning”
@article{serrano2022skrl,
  title={skrl: Modular and Flexible Library for Reinforcement Learning},
  author={Serrano-Mu{\~n}oz, Antonio and Arana-Arexolaleiba, Nestor and Chrysostomou, Dimitrios and B{\o}gh, Simon},
  journal={arXiv preprint arXiv:2202.03825},
  year={2022}
}
User guide
- Installation
- Getting Started
- Examples
- Learning in a Gym environment (one agent, one environment)
- Learning in a DeepMind environment (one agent, one environment)
- Learning in an Isaac Gym environment (one agent, multiple environments)
- Learning by scopes in an Isaac Gym environment (multiple agents and environments)
- Learning in an Omniverse Isaac Gym environment (one agent, multiple environments)
- Learning in an Omniverse Isaac Sim environment (one agent, one environment)
- Library utilities (skrl.utils module)
- Saving, loading and logging
Library components (overview)
Agents
Definition of reinforcement learning algorithms that compute an optimal policy. All agents inherit from one and only one base class that defines a uniform interface and provides common functionality, but is not tied to the implementation details of the algorithms (see the sketch after this list)
Cross-Entropy Method (CEM)
Double Deep Q-Network (DDQN)
Deep Q-Network (DQN)
Q-learning
Soft Actor-Critic (SAC)
State Action Reward State Action (SARSA)
Twin-Delayed DDPG (TD3)
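As a minimal sketch of this uniform interface, the snippet below instantiates a DQN agent. The models dictionary keys, the default-configuration dictionary and the keyword names are assumptions to be checked against each agent's API documentation; q_network and target_q_network stand for models defined as described in the Models section, and memory and env come from the sections below:

    from skrl.agents.torch.dqn import DQN, DQN_DEFAULT_CONFIG

    # start from the agent's default configuration and override selected keys
    cfg = DQN_DEFAULT_CONFIG.copy()
    cfg["discount_factor"] = 0.99

    # the models (q_network, target_q_network) and the memory are created beforehand
    agent = DQN(models={"q_network": q_network, "target_q_network": target_q_network},
                memory=memory,
                cfg=cfg,
                observation_space=env.observation_space,
                action_space=env.action_space,
                device=env.device)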
Environments
Definition of the Isaac Gym (preview 2 and preview 3) and Omniverse Isaac Gym environment loaders, and wrappers for the OpenAI Gym, DeepMind, Isaac Gym and Omniverse Isaac Gym environments (see the sketch after this list)
Wrapping OpenAI Gym, DeepMind, Isaac Gym and Omniverse Isaac Gym environments
Loading Isaac Gym environments
Loading Omniverse Isaac Gym environments
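For instance, wrapping and loading might look as follows; the helpers belong to the skrl.envs.torch module, and the task name "Cartpole" is an illustrative placeholder:

    import gym

    from skrl.envs.torch import wrap_env, load_isaacgym_env_preview3

    # wrap an OpenAI Gym environment so that agents and trainers
    # interact with it through a uniform interface
    env = wrap_env(gym.make("Pendulum-v1"))

    # or load (and wrap) an Isaac Gym preview 3 environment instead
    # env = wrap_env(load_isaacgym_env_preview3(task_name="Cartpole"))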
Memories
Generic memory definitions. Such memories are not bound to any agent and can be used for any role, such as a rollout buffer or an experience replay memory. All memories inherit from a base class that defines a uniform interface and keeps track (in allocated tensors) of transitions with the environment or other defined data
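As a sketch, a random-sampling memory usable as an experience replay buffer could be created as follows (parameter names are assumptions to be checked against the skrl.memories.torch API):

    from skrl.memories.torch import RandomMemory

    # replay buffer with 100000 rows for each of 64 parallel environments
    memory = RandomMemory(memory_size=100000, num_envs=64, device="cuda:0")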
Models
Definition of helper classes for the construction of tabular functions or function approximators using artificial neural networks. This library does not provide predefined policies, but rather helper classes for creating discrete and continuous (stochastic or deterministic) policies in which the user only has to define the tables (tensors) or artificial neural networks. All models inherit from one base class that defines a uniform interface and provides common functionality (see the sketch after this list)
Tabular model (discrete domain)
Categorical model (discrete domain)
Gaussian model (continuous domain)
Deterministic model (continuous domain)
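The sketch below defines a simple Gaussian policy by subclassing the corresponding helper class. It assumes the 0.6.0-era interface in which the user implements compute to return the mean actions and the log standard deviation; exact signatures should be checked against the Models API documentation:

    import torch
    import torch.nn as nn

    from skrl.models.torch import GaussianModel

    class Policy(GaussianModel):
        def __init__(self, observation_space, action_space, device, clip_actions=False):
            super().__init__(observation_space, action_space, device, clip_actions)
            # the user only defines the network; sampling is handled by the base class
            self.net = nn.Sequential(nn.Linear(self.num_observations, 64),
                                     nn.ReLU(),
                                     nn.Linear(64, self.num_actions))
            self.log_std_parameter = nn.Parameter(torch.zeros(self.num_actions))

        def compute(self, states, taken_actions):
            # return the mean actions and the log standard deviation
            return self.net(states), self.log_std_parameter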
Trainers
Definition of the procedures responsible for managing the agent’s training and interaction with the environment. All trainers inherit from a base class that defines a uniform interface and provides common functionality
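A sequential training run might then be set up as follows, reusing the env and agent from the sections above; the configuration keys are assumptions to be checked against the trainer's documentation:

    from skrl.trainers.torch import SequentialTrainer

    cfg = {"timesteps": 50000, "headless": True}
    trainer = SequentialTrainer(env=env, agents=agent, cfg=cfg)

    trainer.train()   # interact with the environment and train the agent(s)
    trainer.eval()    # evaluate the trained agent(s)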
Resources
Definition of resources used by the agents during training and/or evaluation, such as exploration noises or learning rate schedulers
Noises: Definition of the noises used by the agents during the exploration stage. All noises inherit from a base class that defines a uniform interface (see the sketch after this list)
Gaussian noise
Ornstein-Uhlenbeck noise
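For example (the module path and constructor parameters are assumptions to be checked against the skrl.resources.noises.torch API):

    from skrl.resources.noises.torch import GaussianNoise, OrnsteinUhlenbeckNoise

    gaussian_noise = GaussianNoise(mean=0.0, std=0.2, device="cuda:0")
    ou_noise = OrnsteinUhlenbeckNoise(theta=0.15, sigma=0.2, base_scale=1.0, device="cuda:0")

    # the uniform interface exposes sampling, e.g. one noise value per action
    # for 64 parallel environments with 3-dimensional actions
    noise = gaussian_noise.sample((64, 3))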
Learning rate schedulers: Definition of learning rate schedulers. All schedulers inherit from the PyTorch _LRScheduler class (see "How to adjust learning rate" in the PyTorch documentation for more details)
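Since any class inheriting from _LRScheduler qualifies, a standard PyTorch scheduler can be passed to an agent through its configuration. In this sketch the configuration key names are assumptions to be checked against the agent documentation:

    import torch

    from skrl.agents.torch.dqn import DQN_DEFAULT_CONFIG

    cfg = DQN_DEFAULT_CONFIG.copy()
    # scheduler class and its constructor arguments (key names are assumptions)
    cfg["learning_rate_scheduler"] = torch.optim.lr_scheduler.StepLR
    cfg["learning_rate_scheduler_kwargs"] = {"step_size": 1000, "gamma": 0.9}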
Utils
Definition of helper functions and classes
Utilities, e.g. setting the random seed
Memory and Tensorboard file post-processing
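For example, seeding the random number generators and iterating over exported memory files; the file pattern is an illustrative placeholder:

    from skrl.utils import set_seed, postprocessing

    # seed the random number generators for reproducibility
    set_seed(42)

    # post-process memory files exported during training
    for filename, data in postprocessing.MemoryFileIterator("memories/*.pt"):
        print(filename)  # "data" holds the tensors stored in each file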