Software of World Model Aerial Swarms

This repository studies how a small swarm of aerial robots can learn to find and track people in a bounded outdoor area. The goal is to train policies that make multiple drones search for people, keep detected people in range, and coordinate coverage of the area without relying on a hand-designed swarm controller.

It uses Isaac Lab to model OmniNXT drones, moving people, drone dynamics, and task rewards. The learned policies are intended as research tools for understanding swarm behavior before moving toward distributed control or real hardware tests.

What This Repo Does

The core task is simple at a high level:

A group of drones starts in a field.
People move through the field.
Drones must search the area, detect people, and stay close enough to keep tracking them.
The policy is rewarded for keeping people in range and for continuing to cover unexplored or stale parts of the field.
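The reward described above can be sketched as a weighted sum of a tracking term and a coverage term. This is an illustration only, not the repository's reward function: the function names, the weights, and the 30 s staleness threshold are all assumptions, and only the 3 m detection radius comes from the default setup listed in this README.

```python
import math

DETECTION_RADIUS_M = 3.0  # default detection radius stated in this README


def tracking_reward(drones, people, radius=DETECTION_RADIUS_M):
    """Fraction of people that lie within `radius` of at least one drone."""
    if not people:
        return 0.0
    in_range = sum(
        1 for person in people
        if any(math.dist(person, drone) <= radius for drone in drones)
    )
    return in_range / len(people)


def coverage_reward(cell_age_s, stale_after_s=30.0):
    """Fraction of field cells visited within the last `stale_after_s` seconds.

    `stale_after_s` is a hypothetical threshold chosen for illustration.
    """
    if not cell_age_s:
        return 0.0
    fresh = sum(1 for age in cell_age_s if age <= stale_after_s)
    return fresh / len(cell_age_s)


def step_reward(drones, people, cell_age_s, w_track=1.0, w_cover=0.5):
    """Weighted sum of both terms; the weights are assumed, not taken from the repo."""
    return (w_track * tracking_reward(drones, people)
            + w_cover * coverage_reward(cell_age_s))


drones = [(0.0, 0.0), (10.0, 10.0)]
people = [(1.0, 1.0), (5.0, 5.0)]   # only the first person is within 3 m of a drone
cell_age_s = [0.0, 10.0, 40.0, 100.0]  # two of four cells are still fresh
print(step_reward(drones, people, cell_age_s))
```

Normalizing each term to the range 0 to 1 keeps the two weights comparable regardless of how many people or map cells a scenario contains.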

Two learning approaches are supported:

Dreamer / World Models: the policy learns an internal predictive model of the swarm and task dynamics, then improves behavior through imagined rollouts. This is the main research direction of the repo.
PPO-style reinforcement learning: model-free PPO baselines are available for comparison, including centralized policies that receive the same high-level state information.

Practical Drone Interface

The current implementation is state-based, not camera-based. The policy receives a compact numerical state from the simulator rather than raw images. This makes it easier to test the swarm-control idea before adding perception.

At each policy step, the controller receives information such as:

drone positions and velocities,
relative spacing between drones,
relative position of detected or assigned people,
a coarse map of how recently the field has been covered,
flags indicating whether target information is reliable.
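As a rough picture of how such a state could be packed into one flat vector per drone, consider the sketch below. The field order, the `DroneState` type, and the choice to zero out an unreliable target are assumptions made for illustration; the actual observation layout is defined by the task code in source/nxt_dreamer/.

```python
from dataclasses import dataclass


@dataclass
class DroneState:
    position: tuple  # (x, y, z) in metres
    velocity: tuple  # (vx, vy, vz) in m/s


def build_observation(own, others, target_pos, target_reliable, coverage_age):
    """Flatten one drone's view of the swarm into a list of floats (hypothetical layout)."""
    obs = []
    obs.extend(own.position)
    obs.extend(own.velocity)
    # Relative spacing: offset from this drone to every other drone.
    for other in others:
        obs.extend(o - s for o, s in zip(other.position, own.position))
    # Relative target position, zeroed when the target information is unreliable.
    if target_reliable and target_pos is not None:
        obs.extend(t - s for t, s in zip(target_pos, own.position))
    else:
        obs.extend((0.0, 0.0, 0.0))
    # Reliability flag, so the policy can tell a zeroed target from a real one.
    obs.append(1.0 if target_reliable else 0.0)
    # Coarse coverage map, flattened; each cell encodes how recently it was covered.
    obs.extend(coverage_age)
    return obs


own = DroneState((1.0, 2.0, 3.0), (0.0, 0.0, 0.0))
other = DroneState((4.0, 6.0, 3.0), (0.0, 0.0, 0.0))
obs = build_observation(own, [other], (2.0, 2.0, 3.0), True, [0.1, 0.9])
print(len(obs), obs)
```

Passing the reliability flag alongside the zeroed target keeps "no target known" distinguishable from "target directly at the drone's position".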

The policy outputs high-level motion commands for each drone, not direct motor commands. The simulator converts these commands into rotor forces through onboard-style controllers. The command ranges are:

horizontal/vertical acceleration command range: approximately -8 to 8 m/s^2,
body-rate command range: approximately -5 to 5 rad/s,
yaw is currently held fixed by the controller.
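One common way to respect ranges like these is to have the policy emit normalized values and scale them before they reach the controller. The sketch below assumes a normalized action in [-1, 1], which this README does not state; the function names are hypothetical, and only the 8 m/s^2 and 5 rad/s limits come from the list above.

```python
MAX_ACCEL_MPS2 = 8.0      # approximate acceleration command limit from this README
MAX_BODY_RATE_RADPS = 5.0  # approximate body-rate command limit from this README


def scale_action(raw, limit):
    """Clip a normalized action to [-1, 1], then scale it to [-limit, limit]."""
    clipped = max(-1.0, min(1.0, raw))
    return clipped * limit


def to_accel_command(raw_action):
    """Convert a normalized (ax, ay, az) action into an acceleration command."""
    return tuple(scale_action(a, MAX_ACCEL_MPS2) for a in raw_action)


def to_body_rate_command(raw_action):
    """Convert a normalized body-rate action into a rad/s command."""
    return tuple(scale_action(a, MAX_BODY_RATE_RADPS) for a in raw_action)


print(to_accel_command((0.5, -2.0, 1.0)))   # the out-of-range -2.0 is clipped first
print(to_body_rate_command((0.5, -1.0)))
```

Clipping before scaling means an out-of-range policy output saturates at the limit instead of producing a command the controller was never tuned for.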

Important task ranges in the default setup:

field size: approximately 20 m x 20 m,
person detection radius: 3 m,
nominal maximum flight height: 4 m,
low-altitude termination threshold: 0.15 m,
default simulated swarm: 2 drones,
default simulated targets: 2 people.
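For quick experiments it can help to keep these defaults in one place. The sketch below collects them in a frozen dataclass with two helper checks. The class, the field names, and the helpers are hypothetical, and treating detection as a horizontal distance test is an assumption; the values themselves are copied from the list above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskDefaults:
    """Default task ranges listed in this README (names are illustrative)."""
    field_size_m: float = 20.0
    detection_radius_m: float = 3.0
    max_height_m: float = 4.0
    min_height_m: float = 0.15
    num_drones: int = 2
    num_people: int = 2


def is_detected(drone_xy, person_xy, cfg):
    """True if the person is within the detection radius (horizontal distance assumed)."""
    return math.dist(drone_xy, person_xy) <= cfg.detection_radius_m


def below_termination_height(height_m, cfg):
    """True if the drone has dropped under the low-altitude termination threshold."""
    return height_m < cfg.min_height_m


cfg = TaskDefaults()
print(is_detected((0.0, 0.0), (2.0, 2.0), cfg))  # distance is about 2.83 m
print(below_termination_height(0.10, cfg))
```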

These values are simulation parameters, not hardware safety limits. Anyone using the repo for real drones should treat the learned policy as experimental and add independent safety, geofencing, communication, and emergency-stop layers.

Repository Layout

source/nxt_dreamer/: Isaac Lab extension with drone assets, controllers, and swarm tasks.
source/DreamerRL/: PyTorch Dreamer implementation and wrappers used by the simulator.
scripts/dreamer/: train and play scripts for Dreamer policies.
scripts/skrl/: train and play scripts for PPO/IPPO/MAPPO baselines.
logs/: experiment outputs, TensorBoard events, checkpoints, and saved rollouts when present.

Setup

Install Isaac Lab and Isaac Sim first using the official Isaac Lab instructions. Then install this repository into the Python environment that has Isaac Lab available:

python -m pip install -e source/nxt_dreamer

If your Isaac Lab installation is not attached to the active Python environment, run this command and the commands below through the Isaac Lab launcher's Python instead (on Linux, typically isaaclab.sh -p in place of python).

Basic Commands

List registered tasks:

python scripts/list_envs.py

Run a random policy to check that the simulation starts:

python scripts/random_agent.py --task Isaac-NXT-FindAndTrack-Direct-v0 --num_envs 1

Train Dreamer:

python scripts/dreamer/train.py --task Isaac-NXT-FindAndTrack-Direct-v0 --algorithm DreamerV3 --num_envs 1

Train PPO:

python scripts/skrl/train.py --task Isaac-NXT-FindAndTrack-Direct-v0 --algorithm PPO --num_envs 1024

Play a trained Dreamer checkpoint:

python scripts/dreamer/play.py --task Isaac-NXT-FindAndTrack-Direct-v0 --checkpoint /path/to/checkpoint.pt

Play a trained PPO checkpoint:

python scripts/skrl/play.py --task Isaac-NXT-FindAndTrack-Direct-v0 --algorithm PPO --checkpoint /path/to/checkpoint.pt
