SKRL - Reinforcement Learning library (1.4.0)


skrl is an open-source library for Reinforcement Learning written in Python (on top of PyTorch and JAX) and designed with a focus on modularity, readability, simplicity and transparency of algorithm implementation. It supports the OpenAI Gym, Farama Gymnasium and PettingZoo, Google DeepMind and Brax environment interfaces, among others. It also allows loading and configuring NVIDIA Isaac Lab (as well as Isaac Gym and Omniverse Isaac Gym) environments, and it enables several agents to be trained simultaneously, in the same run, by scopes (subsets of all available environments), which may or may not share resources.

Main features:
  • PyTorch and JAX

  • Clean code

  • Modularity and reusability

  • Documented library, code and implementations

  • Support for Gym/Gymnasium (single and vectorized), Google DeepMind and Brax, NVIDIA Isaac Lab (as well as Isaac Gym and Omniverse Isaac Gym) environments, among others

  • Simultaneous learning by scopes in Gym/Gymnasium (vectorized), Google Brax, and NVIDIA Isaac Lab (as well as Isaac Gym and Omniverse Isaac Gym)
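
The idea of scopes mentioned above can be pictured as a partition of the environment indices among the agents. The following is a minimal sketch in plain Python, not skrl's own implementation: the function name `split_into_scopes` and the example sizes are assumptions chosen only for illustration.

```python
def split_into_scopes(num_envs, scope_sizes):
    """Assign consecutive, non-overlapping ranges of environment indices to agents.

    Hypothetical helper (not part of skrl): `scope_sizes[i]` is the number of
    environments handled by agent `i`.
    """
    if sum(scope_sizes) != num_envs:
        raise ValueError("scope sizes must add up to the number of environments")
    scopes = []
    start = 0
    for size in scope_sizes:
        scopes.append(range(start, start + size))
        start += size
    return scopes


# Three agents sharing 512 parallel environments in the same run
scopes = split_into_scopes(512, [100, 200, 212])
for agent_index, scope in enumerate(scopes):
    print(f"agent {agent_index}: environments {scope.start}..{scope.stop - 1}")
```

Each agent then only acts in, and learns from, the environments that fall inside its own range.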


Warning

skrl is under active continuous development. Make sure you always have the latest version. Visit the develop branch or its documentation to access the latest updates to be released.

Citing skrl: To cite this library (created at Mondragon Unibertsitatea) use the following reference to its article: skrl: Modular and Flexible Library for Reinforcement Learning.

@article{serrano2023skrl,
  author  = {Antonio Serrano-Muñoz and Dimitrios Chrysostomou and Simon Bøgh and Nestor Arana-Arexolaleiba},
  title   = {skrl: Modular and Flexible Library for Reinforcement Learning},
  journal = {Journal of Machine Learning Research},
  year    = {2023},
  volume  = {24},
  number  = {254},
  pages   = {1--9},
  url     = {http://jmlr.org/papers/v24/23-0112.html}
}


User guide

To start using the library, visit the following links:



Library components (overview)

Agents

Definition of reinforcement learning algorithms that compute an optimal policy. All agents inherit from a single base class that defines a uniform interface and provides common functionality, but is not tied to the implementation details of the algorithms

Multi-agents

Definition of reinforcement learning algorithms that compute optimal policies. All multi-agents inherit from a single base class that defines a uniform interface and provides common functionality, but is not tied to the implementation details of the algorithms

Environments

Definition of the environment loaders for Isaac Gym (preview 2, 3 and 4), Omniverse Isaac Gym and Isaac Lab, and of the wrappers for Gym/Gymnasium, DeepMind, Brax and Isaac Lab (as well as Isaac Gym and Omniverse Isaac Gym) environments, among others

Memories

Generic memory definitions. Memories are not bound to any agent and can be used in any role, for example as a rollout buffer or as an experience replay memory. All memories inherit from a base class that defines a uniform interface and keeps track (in allocated tensors) of transitions with the environment or other defined data
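
To make the idea of an agent-independent memory concrete, here is a minimal sketch in plain Python. It is not skrl's memory class: the name `RingMemory`, its methods, and the use of Python lists in place of allocated tensors are all assumptions made for illustration.

```python
import random


class RingMemory:
    """Fixed-size circular buffer that stores named fields for each transition.

    Hypothetical sketch: plain lists stand in for preallocated tensors.
    """

    def __init__(self, memory_size, names):
        self.memory_size = memory_size
        # One preallocated slot list per named field (states, actions, ...)
        self.storage = {name: [None] * memory_size for name in names}
        self.index = 0
        self.filled = False

    def add(self, **fields):
        """Write one transition at the current index, overwriting the oldest data."""
        for name, value in fields.items():
            self.storage[name][self.index] = value
        self.index = (self.index + 1) % self.memory_size
        if self.index == 0:
            self.filled = True

    def __len__(self):
        return self.memory_size if self.filled else self.index

    def sample(self, names, batch_size, rng=random):
        """Return one list per requested name, all drawn at the same random indices."""
        indices = [rng.randrange(len(self)) for _ in range(batch_size)]
        return [[self.storage[name][i] for i in indices] for name in names]


memory = RingMemory(memory_size=3, names=["states", "actions", "rewards"])
for step in range(5):
    memory.add(states=step, actions=-step, rewards=float(step))

states, rewards = memory.sample(["states", "rewards"], batch_size=4)
```

The same object can serve as a rollout buffer (read everything in order) or as a replay memory (sample at random), which is why it does not need to know which agent uses it.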

Models

Definition of helper mixins for the construction of tabular functions or function approximators using artificial neural networks. This library does not provide predefined policies but helper mixins to create discrete and continuous (stochastic or deterministic) policies in which the user only has to define the tables (tensors) or artificial neural networks. All models inherit from one base class that defines a uniform interface and provides common functionality. In addition, it is possible to create shared models by combining the implemented definitions
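
The mixin pattern described above can be sketched as follows. This is an illustration in plain Python under stated assumptions, not skrl's API: the class names `Model`, `GreedyTabularMixin` and `QTable`, and the `compute`/`act` split, are hypothetical stand-ins for the idea that the user writes only the table (or network) while the mixin supplies the policy behavior.

```python
class Model:
    """Hypothetical base class: the uniform interface shared by every model."""

    def __init__(self, num_observations, num_actions):
        self.num_observations = num_observations
        self.num_actions = num_actions

    def compute(self, inputs):
        # The user defines this: a table lookup or a neural network forward pass
        raise NotImplementedError


class GreedyTabularMixin:
    """Hypothetical mixin: turns the values returned by compute() into a discrete action."""

    def act(self, inputs):
        values = self.compute(inputs)
        # Index of the largest value; ties resolve to the lowest index
        return max(range(len(values)), key=values.__getitem__)


class QTable(GreedyTabularMixin, Model):
    """User-defined model: only the table and its lookup are written here."""

    def __init__(self, num_observations, num_actions):
        super().__init__(num_observations, num_actions)
        self.table = [[0.0] * num_actions for _ in range(num_observations)]

    def compute(self, inputs):
        return self.table[inputs["state"]]


q_table = QTable(num_observations=4, num_actions=2)
q_table.table[1][1] = 1.0
action = q_table.act({"state": 1})
```

Swapping the mixin (for example, for one that samples from a distribution) would change how actions are produced without touching the user's `compute` definition.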

Trainers

Definition of the procedures responsible for managing the agent’s training and interaction with the environment. All trainers inherit from a base class that defines a uniform interface and provides common functionality
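
The division of labor between an agent and a trainer can be sketched with a toy loop. Everything below is a hypothetical illustration in plain Python, not skrl's classes: `CountingEnv`, `Agent`, `ConstantAgent`, `Trainer` and their method names are assumptions chosen to show how a trainer can drive any agent through one uniform interface.

```python
class CountingEnv:
    """Toy environment: each step gives reward 1; an episode ends after `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done


class Agent:
    """Hypothetical base class: the uniform interface the trainer relies on."""

    def act(self, state, timestep):
        raise NotImplementedError

    def record_transition(self, state, action, reward, next_state, done):
        pass

    def post_interaction(self, timestep):
        pass  # a learning algorithm would update its models here


class ConstantAgent(Agent):
    """Trivial agent that always picks action 0 and accumulates the reward it sees."""

    def __init__(self):
        self.total_reward = 0.0

    def act(self, state, timestep):
        return 0

    def record_transition(self, state, action, reward, next_state, done):
        self.total_reward += reward


class Trainer:
    """Sequential loop: act, step the environment, record, then let the agent learn."""

    def __init__(self, env, agent, timesteps):
        self.env = env
        self.agent = agent
        self.timesteps = timesteps

    def train(self):
        state = self.env.reset()
        for timestep in range(self.timesteps):
            action = self.agent.act(state, timestep)
            next_state, reward, done = self.env.step(action)
            self.agent.record_transition(state, action, reward, next_state, done)
            self.agent.post_interaction(timestep)
            state = self.env.reset() if done else next_state


env = CountingEnv(horizon=3)
agent = ConstantAgent()
Trainer(env, agent, timesteps=10).train()
```

Because the trainer only calls the methods of the base class, any agent that implements them can be trained without changing the loop.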

Resources

Definition of resources used by the agents during training and/or evaluation, such as exploration noises or learning rate schedulers

Noises: Definition of the noises used by the agents during the exploration stage. All noises inherit from a base class that defines a uniform interface
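
As an illustration of a noise hierarchy with a uniform interface, here is a minimal sketch in plain Python. The names `Noise`, `GaussianNoise` and `explore`, the `sample(size)` signature, and the clipping range are assumptions for this example rather than skrl's definitions.

```python
import random


class Noise:
    """Hypothetical base class: every noise exposes the same sample() method."""

    def sample(self, size):
        raise NotImplementedError


class GaussianNoise(Noise):
    """Independent Gaussian noise with a fixed mean and standard deviation."""

    def __init__(self, mean, std, seed=None):
        self.mean = mean
        self.std = std
        self.rng = random.Random(seed)

    def sample(self, size):
        return [self.rng.gauss(self.mean, self.std) for _ in range(size)]


def explore(action, noise, low=-1.0, high=1.0):
    """Add noise to each action component, then clip to the valid action range."""
    perturbation = noise.sample(len(action))
    return [min(high, max(low, a + n)) for a, n in zip(action, perturbation)]


noisy_action = explore([0.0, 0.5, 0.99], GaussianNoise(mean=0.0, std=0.1, seed=0))
```

Since `explore` depends only on `sample`, a different noise (for example, a temporally correlated one) could be substituted without changing the exploration code.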

Learning rate schedulers: Definition of learning rate schedulers. All schedulers inherit from the PyTorch _LRScheduler class (see how to adjust learning rate in the PyTorch documentation for more details)

Preprocessors: Definition of preprocessors

Optimizers: Definition of optimizers

Utils and configurations