Baby Dragon Hatchling (BDH) is a biologically inspired large language model architecture that connects principles of deep learning with the foundations of neuroscience. Developed by researchers at Pathway, BDH provides a theoretical and practical framework for understanding the emergence of reasoning and generalization in artificial systems.
This repository contains the official implementation from the paper:
A. Kosowski, P. Uznański, J. Chorowski, Z. Stamirowska, M. Bartoszkiewicz. The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain, arXiv (2025).
BDH represents a scale-free, locally interacting network of neurons capable of intrinsic reasoning dynamics. BDH scales like a Transformer on performance benchmarks—yet retains full interpretability and theoretical grounding in the fine-grained dynamics of neuron interactions.
Key properties:
- Scale-free network topology mimicking biological connectivity
- Locally interacting neuron particles with excitatory/inhibitory dynamics
- Hebbian working memory based on synaptic plasticity, displaying monosemanticity
- GPU-friendly state-space formulation for efficient implementation
- Interpretable activations that are sparse and positive
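The sketch below loosely illustrates two of the properties listed above: a Hebbian working-memory state updated from co-activations, and sparse, positive (ReLU) activations. It is a minimal toy example in NumPy, not the repository's API; the names (`W`, `sigma`, `step`), shapes, decay, and learning-rate values are all assumptions chosen for illustration.

```python
# Illustrative sketch only (not the repository's code): a Hebbian-style
# working-memory update with sparse, positive activations.
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 64

# Fixed "slow" weights (learned connectivity) and a fast Hebbian state
# (the working memory); both are assumptions for this toy example.
W = rng.standard_normal((n_neurons, n_neurons)) / np.sqrt(n_neurons)
sigma = np.zeros((n_neurons, n_neurons))  # fast synaptic state, starts empty

def step(x, sigma, decay=0.95, lr=0.1):
    """One toy update: positive, sparse activations plus a Hebbian outer-product trace."""
    pre = np.maximum(W @ x, 0.0)               # ReLU keeps activations sparse and positive
    post = np.maximum(pre + sigma @ x, 0.0)    # fast state modulates the response
    sigma = decay * sigma + lr * np.outer(post, pre)  # Hebbian: co-active neurons strengthen
    return post, sigma

x = np.maximum(rng.standard_normal(n_neurons), 0.0)
for _ in range(5):
    x, sigma = step(x, sigma)
print("fraction of active neurons:", float((x > 0).mean()))
```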
BDH formalizes a bridge between neural computation and machine-based language understanding. It shows how macro reasoning behavior in large AI models emerges from micro-level neuron dynamics, guided by principles of graph theory and local computation.
Empirically, BDH matches GPT-2–scale Transformers across language and translation tasks at equivalent parameter scales (10M–1B).
BDH and the Transformer share attention-style computation; in BDH, however, attention emerges from neuron-level interactions on the graph, mirroring attention mechanisms observed in biological systems.
BDH follows Transformer-like scaling laws, maintaining parameter efficiency while achieving interpretability at any scale.
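As a rough intuition for how attention-like behavior can arise from a synapse-level state update (and why a state-space form is GPU-friendly), the sketch below shows standard unnormalized causal linear attention written two ways: as a recurrence over a d×d "synaptic" state that accumulates key/value outer products, and as a masked matrix product. This is a generic equivalence used for illustration; the dimensions, positivity of queries/keys, and the absence of normalization are assumptions, not the paper's exact formulation.

```python
# Rough sketch (assumptions, not the paper's formulation): attention-like
# readout expressed as a recurrent state over key/value outer products.
import numpy as np

rng = np.random.default_rng(1)
T, d = 8, 16                                      # sequence length and dimension (arbitrary)
Q = np.maximum(rng.standard_normal((T, d)), 0.0)  # positive "queries"
K = np.maximum(rng.standard_normal((T, d)), 0.0)  # positive "keys"
V = rng.standard_normal((T, d))

# Recurrent form: a synapse-like state accumulates k_t v_t^T, and each output
# is a read-out of that state by the current query.
S = np.zeros((d, d))
out_recurrent = np.zeros((T, d))
for t in range(T):
    S = S + np.outer(K[t], V[t])
    out_recurrent[t] = Q[t] @ S

# Parallel form: causal (unnormalized) linear attention gives the same result.
scores = np.tril(Q @ K.T)   # causal mask keeps only past and current positions
out_parallel = scores @ V
print("max difference:", float(np.abs(out_recurrent - out_parallel).max()))
```

In this reading, the "memory" of the sequence lives in the d×d state S rather than in a growing cache, which is one way to see how attention-style computation can be phrased as local, synapse-level updates.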
# install dependencies
pip install -r requirements.txt
# train BDH on a toy dataset
python train.py
- Watch the SuperDataScience podcast ▶️ Dragon Hatchling: The Missing Link Between Transformers and the Brain (72 min.), featuring Adrian Kosowski in conversation with Jon Krohn, unpacking BDH’s neuron-level architecture and sparse reasoning dynamics.
- Read about BDH in Forbes, Semafor, The Turing Post, Quantum Zeitgeist, Golem, and elsewhere in the media.
- Discuss and share the BDH paper on Hugging Face Papers, Alphaxiv, and EmergentMind.
- adamskrodzki/bdh: dynamic vocabulary, stateful attention
- mosure/burn_dragon_hatchling: Burn port
- severian42/bdh: MLX port
- Git-Faisal/bdh
- GrahLnn/bdh
We thank Andrej Karpathy for the nanoGPT code and the tiny Shakespeare dataset used in this demonstration.
BDH research stands at the intersection of AI architecture, biological learning models, and theoretical computer science—an effort to map the equations of reasoning between artificial and biological intelligence.


