lis-epfl/World-Model-Aerial-Swarms

Software of World Model Aerial Swarms

This repository studies how a small swarm of aerial robots can learn to find and track people in a bounded outdoor area. The goal is to train policies that make multiple drones search for people, keep detected people in range, and coordinate coverage of the area without relying on a hand-designed swarm controller.

It uses Isaac Lab to simulate OmniNXT drones and their dynamics, moving people, and task rewards. The learned policies are intended as research tools for understanding swarm behavior before moving toward distributed control or real hardware tests.

What This Repo Does

The core task is simple at a high level:

  • A group of drones starts in a field.
  • People move through the field.
  • Drones must search the area, detect people, and stay close enough to keep tracking them.
  • The policy is rewarded for keeping people in range and for continuing to cover unexplored or stale parts of the field.
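As a rough illustration of the reward idea in the list above, the sketch below combines a tracking term (fraction of people within the detection radius of at least one drone) with a coverage term (fraction of field cells visited recently). It is not the repository's reward function: the function names, the weights, the 10 s staleness threshold, and the grid-of-ages representation are assumptions made for this example. Only the 3 m detection radius comes from the default setup described in this README.

```python
import math

DETECTION_RADIUS_M = 3.0  # default person detection radius from this README


def tracking_reward(drones, people, radius=DETECTION_RADIUS_M):
    """Fraction of people within `radius` (horizontal) of at least one drone."""
    if not people:
        return 0.0
    tracked = sum(
        1
        for px, py in people
        if any(math.hypot(px - dx, py - dy) <= radius for dx, dy in drones)
    )
    return tracked / len(people)


def coverage_reward(cell_ages, stale_after_s=10.0):
    """Fraction of cells visited within the last `stale_after_s` seconds.

    `cell_ages` is a hypothetical 2D grid of seconds since each cell was
    last covered; the threshold is an assumed value.
    """
    cells = [age for row in cell_ages for age in row]
    if not cells:
        return 0.0
    fresh = sum(1 for age in cells if age <= stale_after_s)
    return fresh / len(cells)


def total_reward(drones, people, cell_ages, w_track=1.0, w_cover=0.5):
    """Weighted sum of both terms; the weights are illustrative only."""
    return (w_track * tracking_reward(drones, people)
            + w_cover * coverage_reward(cell_ages))
```

In a sketch like this, the coverage term is what keeps drones exploring when no one is in range, while the tracking term dominates once people are detected.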

Two learning approaches are supported:

  • Dreamer / World Models: the policy learns an internal predictive model of the swarm and task dynamics, then improves behavior through imagined rollouts. This is the main research direction of the repo.
  • PPO-style reinforcement learning: model-free PPO baselines are available for comparison, including centralized policies that receive the same high-level state information.

Practical Drone Interface

The current implementation is state-based, not camera-based. The policy receives a compact numerical state from the simulator rather than raw images. This makes it easier to test the swarm-control idea before adding perception.

At each policy step, the controller receives information such as:

  • drone positions and velocities,
  • relative spacing between drones,
  • relative position of detected or assigned people,
  • a coarse map of how recently the field has been covered,
  • flags indicating whether target information is reliable.
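To make the list above concrete, here is a minimal sketch of how such a state could be flattened into one observation vector for a single drone. The field order, the function name `build_observation`, the zero-filling of unreliable target data, and the 30 s age normalization are all assumptions for illustration; the actual observation layout is defined by the task code in source/nxt_dreamer/.

```python
def build_observation(drone_idx, positions, velocities, target_rel,
                      target_valid, coverage_ages, max_age_s=30.0):
    """Flatten one drone's state into a list of floats (hypothetical layout)."""
    own_pos = positions[drone_idx]

    # Own position and velocity.
    obs = list(own_pos) + list(velocities[drone_idx])

    # Relative spacing: offset from this drone to every other drone.
    for j, other in enumerate(positions):
        if j != drone_idx:
            obs.extend(o - p for o, p in zip(other, own_pos))

    # Relative target position, zeroed when unreliable, plus a validity flag.
    if target_valid:
        obs.extend(target_rel)
    else:
        obs.extend([0.0] * len(target_rel))
    obs.append(1.0 if target_valid else 0.0)

    # Coarse coverage map: seconds since last visit, normalized to [0, 1].
    for row in coverage_ages:
        obs.extend(min(age, max_age_s) / max_age_s for age in row)

    return obs
```

Passing the validity flag alongside the zeroed target position lets a policy distinguish "target at the drone's own location" from "no reliable target information".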

The policy outputs high-level motion commands for each drone. These are not direct motor commands: the simulator converts them into rotor forces through onboard-style controllers. The command ranges are:

  • horizontal/vertical acceleration command range: approximately -8 to 8 m/s^2,
  • body-rate command range: approximately -5 to 5 rad/s,
  • yaw is currently held fixed by the controller.
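One common way to respect ranges like these is to clip a normalized policy output and scale it to physical units. The sketch below assumes the policy emits values in [-1, 1]; that normalization, the constant names, and the helper `scale_command` are assumptions for illustration and are not taken from the repository's controller code. The limits are the approximate values listed above.

```python
ACCEL_LIMIT = 8.0      # m/s^2, approximate range from this README
BODY_RATE_LIMIT = 5.0  # rad/s, approximate range from this README


def scale_command(raw, limit):
    """Clip each normalized value to [-1, 1], then scale to +/- `limit`."""
    return [limit * max(-1.0, min(1.0, x)) for x in raw]


# Example: an out-of-range raw output saturates at the limit.
accel_cmd = scale_command([0.5, -2.0, 1.5], ACCEL_LIMIT)
rate_cmd = scale_command([-1.0, 0.0], BODY_RATE_LIMIT)
```

Clipping before scaling guarantees that the downstream controller never receives a command outside the stated range, regardless of what the policy network outputs.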

Important task ranges in the default setup:

  • field size: approximately 20 m x 20 m,
  • person detection radius: 3 m,
  • nominal maximum flight height: 4 m,
  • low-altitude termination threshold: 0.15 m,
  • default simulated swarm: 2 drones,
  • default simulated targets: 2 people.
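For reference, the defaults above can be collected in a small configuration object together with two checks that use them. This is only a sketch: the class and function names are hypothetical, the field is assumed to be a square centered on the origin, and the repository's actual task configuration lives in its Isaac Lab extension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FindAndTrackDefaults:
    """Default task ranges as listed in this README (names are hypothetical)."""
    field_size_m: float = 20.0       # side length of the square field
    detection_radius_m: float = 3.0  # person detection radius
    max_height_m: float = 4.0        # nominal maximum flight height
    min_height_m: float = 0.15       # low-altitude termination threshold
    num_drones: int = 2
    num_people: int = 2


def below_min_height(z, cfg):
    """True if altitude `z` is under the low-altitude termination threshold."""
    return z < cfg.min_height_m


def inside_field(x, y, cfg):
    """True if (x, y) lies in the field, assuming it is centered on the origin."""
    half = cfg.field_size_m / 2.0
    return abs(x) <= half and abs(y) <= half
```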

These values are simulation parameters, not hardware safety limits. Anyone using the repo for real drones should treat the learned policy as experimental and add independent safety, geofencing, communication, and emergency-stop layers.

Repository Layout

  • source/nxt_dreamer/: Isaac Lab extension with drone assets, controllers, and swarm tasks.
  • source/DreamerRL/: PyTorch Dreamer implementation and wrappers used by the simulator.
  • scripts/dreamer/: train and play scripts for Dreamer policies.
  • scripts/skrl/: train and play scripts for PPO/IPPO/MAPPO baselines.
  • logs/: experiment outputs, TensorBoard events, checkpoints, and saved rollouts when present.

Setup

Install Isaac Lab and Isaac Sim first using the official Isaac Lab instructions. Then install this repository into the Python environment that has Isaac Lab available:

python -m pip install -e source/nxt_dreamer

If your Isaac Lab installation is not attached to the active Python environment, run this command, and the commands in the sections below, through the Python interpreter provided by the Isaac Lab launcher instead.

Basic Commands

List registered tasks:

python scripts/list_envs.py

Run a random policy to check that the simulation starts:

python scripts/random_agent.py --task Isaac-NXT-FindAndTrack-Direct-v0 --num_envs 1

Train Dreamer:

python scripts/dreamer/train.py --task Isaac-NXT-FindAndTrack-Direct-v0 --algorithm DreamerV3 --num_envs 1

Train PPO:

python scripts/skrl/train.py --task Isaac-NXT-FindAndTrack-Direct-v0 --algorithm PPO --num_envs 1024

Play a trained Dreamer checkpoint:

python scripts/dreamer/play.py --task Isaac-NXT-FindAndTrack-Direct-v0 --checkpoint /path/to/checkpoint.pt

Play a trained PPO checkpoint:

python scripts/skrl/play.py --task Isaac-NXT-FindAndTrack-Direct-v0 --algorithm PPO --checkpoint /path/to/checkpoint.pt
