Commit eeeeecc

Add source for JOSS'25 paper
Signed-off-by: Antonio Paolillo <[email protected]>
1 parent b3fc5a5 commit eeeeecc

3 files changed: +365 -0 lines changed
doc/joss25/build.py

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
#!/usr/bin/env python3
"""
Build the JOSS25 paper using pythainer.
"""

from pythainer.runners import ConcreteDockerRunner
from pathlib import Path
IMAGE = "myimg"
CONTAINER = "myimg"
LIB_DIR = "/home/${USER_NAME}/workspace/libraries"


def main() -> None:
    """Build the JOSS25 paper according to JOSS website instructions."""
    this_dir = Path(__file__).parent.resolve()

    runner = ConcreteDockerRunner(
        image="openjournals/inara",
        environment_variables={
            "JOURNAL": "joss",
        },
        volumes={
            f"{this_dir}": "/data",
        },
        other_options=[],
        tty=False,
        interactive=False,
    )

    runner.run()


if __name__ == "__main__":
    main()
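
For orientation, the runner arguments in `build.py` can be read as a plain `docker run` invocation. The sketch below is standard-library Python only and does not call pythainer. It assumes the runner maps `environment_variables` to `--env` and `volumes` to `--volume`; the helper name `inara_command` and the `--rm` flag are illustrative assumptions, not something this commit shows.

```python
import shlex
from pathlib import Path
from typing import Union


def inara_command(paper_dir: Union[str, Path], journal: str = "joss") -> list[str]:
    """Build a docker-run argv that mirrors the arguments used in build.py.

    Hypothetical helper: the flag mapping is an assumption about what a
    runner would emit, not pythainer's documented behavior.
    """
    return [
        "docker", "run", "--rm",            # --rm is an assumption
        "--env", f"JOURNAL={journal}",      # environment_variables
        "--volume", f"{paper_dir}:/data",   # volumes (paper directory -> /data)
        "openjournals/inara",               # image
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(inara_command(Path.cwd())))
```

Because `tty` and `interactive` are both `False` in the script, the sketch adds neither `-t` nor `-i`.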

doc/joss25/paper.bib

Lines changed: 189 additions & 0 deletions
@@ -0,0 +1,189 @@
@misc{nvblox,
  title = {{nvblox: GPU-Accelerated Incremental Signed Distance Field Mapping}},
  author = {Alexander Millane and Helen Oleynikova and Emilie Wirbel and Remo Steiner and Vikram Ramasamy and David Tingdahl and Roland Siegwart},
  year = 2024,
  url = {https://arxiv.org/abs/2311.00626},
  eprint = {2311.00626},
  archiveprefix = {arXiv},
  primaryclass = {cs.RO}
}

@misc{robotcore,
  title = {{RobotCore: An Open Architecture for Hardware Acceleration in ROS 2}},
  author = {Víctor Mayoral-Vilches and Sabrina M. Neuman and Brian Plancher and Vijay Janapa Reddi},
  year = 2023,
  url = {https://arxiv.org/abs/2205.03929},
  eprint = {2205.03929},
  archiveprefix = {arXiv},
  primaryclass = {cs.RO}
}

@inproceedings{tani2020reproducible,
  title = {{Integrated Benchmarking and Design for Reproducible and Accessible Evaluation of Robotic Agents}},
  author = {Tani, Jacopo and Daniele, Andrea F. and Bernasconi, Gianmarco and Camus, Amaury and Petrov, Aleksandar and Courchesne, Anthony and Mehta, Bhairav and Suri, Rohit and Zaluska, Tomasz and Walter, Matthew R. and Frazzoli, Emilio and Paull, Liam and Censi, Andrea},
  year = 2020,
  booktitle = {2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  volume = {},
  number = {},
  pages = {6229--6236},
  doi = {10.1109/IROS45743.2020.9341677},
  keywords = {Benchmark testing;Tools;Hardware;Software;Complexity theory;Robots;System analysis and design}
}

@article{macenski2022ros2,
  title = {{Robot Operating System 2: Design, architecture, and uses in the wild}},
  author = {Steven Macenski and Tully Foote and Brian Gerkey and Chris Lalancette and William Woodall},
  year = 2022,
  journal = {Science Robotics},
  volume = 7,
  number = 66,
  pages = {eabm6074},
  doi = {10.1126/scirobotics.abm6074},
  url = {https://www.science.org/doi/abs/10.1126/scirobotics.abm6074},
  eprint = {https://www.science.org/doi/pdf/10.1126/scirobotics.abm6074}
}

@manual{nvidiaCudaPg13,
  title = {{CUDA C++ Programming Guide}},
  year = 2025,
  note = {Version 13.0},
  organization = {NVIDIA Corporation},
  howpublished = {\url{https://docs.nvidia.com/cuda/cuda-c-programming-guide/}, accessed 2025-09-15}
}

@misc{qemu,
  title = {{QEMU: A generic and open source machine emulator and virtualizer}},
  author = {{QEMU Project}},
  year = 2003,
  howpublished = {\url{https://www.qemu.org/}, accessed 2025-09-15}
}

@misc{docker,
  title = {{Docker: Accelerated Container Application Development}},
  author = {{Docker, Inc.}},
  year = 2013,
  howpublished = {\url{https://www.docker.com/}, accessed 2025-09-15}
}

@inproceedings{llvm,
  title = {{LLVM: a compilation framework for lifelong program analysis \& transformation}},
  author = {Lattner, C. and Adve, V.},
  year = 2004,
  booktitle = {International Symposium on Code Generation and Optimization, 2004. CGO 2004.},
  volume = {},
  number = {},
  pages = {75--86},
  doi = {10.1109/CGO.2004.1281665},
  keywords = {Information analysis;Program processors;Performance analysis;High level languages;Virtual machining;Runtime;Arithmetic;Application software;Software safety;Algorithm design and analysis}
}

@inproceedings{mlir,
  title = {{MLIR: Scaling Compiler Infrastructure for Domain Specific Computation}},
  author = {Lattner, Chris and Amini, Mehdi and Bondhugula, Uday and Cohen, Albert and Davis, Andy and Pienaar, Jacques and Riddle, River and Shpeisman, Tatiana and Vasilache, Nicolas and Zinenko, Oleksandr},
  year = 2021,
  booktitle = {2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)},
  volume = {},
  number = {},
  pages = {2--14},
  doi = {10.1109/CGO51591.2021.9370308},
  keywords = {Program processors;Buildings;Semantics;Hardware;Software;Generators;Optimization}
}

@inproceedings{shen2025sentryrt1,
  title = {{SentryRT-1: A Case Study in Evaluating Real-Time Linux for Safety-Critical Robotic Perception}},
  author = {Yuwen Shen and Jorrit Vander Mynsbrugge and Nima Roshandel and Robin Bouchez and Hamed FirouziPouyaei and Constantin Scholz and Hoang-Long Cao and Bram Vanderborght and Wouter Joosen and Antonio Paolillo},
  year = 2025,
  month = {July 8},
  booktitle = {Proceedings of the 19th Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT 2025)},
  address = {Brussels, Belgium},
  series = {ECRTS Workshops},
  pages = {35--41}
}

@misc{itf24safebot,
  title = {{imec ITF World 2024 SAFEBOT demo}},
  year = 2024,
  note = {Demonstration booth at imec Technology Forum (ITF) 2024},
  howpublished = {\url{https://www.youtube.com/watch?v=F7m5_kQ_mRQ}, accessed 2025-09-15}
}

@inproceedings{degreef2025macros,
  title = {{Towards Macro-Aware C-to-Rust Transpilation (WIP)}},
  author = {De Greef, Robbe and Discepoli, Attilio and Aguililla Klein, Esteban and Engels, Th\'{e}o and Hasselmann, Ken and Paolillo, Antonio},
  year = 2025,
  booktitle = {Proceedings of the 26th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems},
  location = {Seoul, Republic of Korea},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  series = {LCTES '25},
  pages = {57--61},
  doi = {10.1145/3735452.3735535},
  isbn = 9798400719219,
  url = {https://doi.org/10.1145/3735452.3735535},
  abstract = {The automatic translation of legacy C code to Rust presents significant challenges, particularly in handling preprocessor macros. C macros introduce metaprogramming constructs that operate at the text level, outside of C's syntax tree, making their direct translation to Rust non-trivial. Existing transpilers --- source-to-source compilers --- expand macros before translation, sacrificing their abstraction and reducing code maintainability. In this work, we introduce Oxidize, a macro-aware C-to-Rust transpilation framework that preserves macro semantics by translating C macros into Rust-compatible constructs while selectively expanding only those that interfere with Rust's stricter semantics. We evaluate our techniques on a small-scale study of real-world macros and find that the majority can be safely and idiomatically transpiled without full expansion.},
  numpages = 5,
  keywords = {Abstract Syntax Tree, C, Embedded, Macros, Metaprogramming, Preprocessor, Rust, Transpilation}
}

@inproceedings{discepoli2025computeKernels,
  title = {{Compute Kernels as Moldable Tasks: Towards Real-Time Gang Scheduling in GPUs}},
  author = {Attilio Discepoli and Mathias Louis Huygen and Antonio Paolillo},
  year = 2025,
  month = {July 8},
  booktitle = {Proceedings of the 19th Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT 2025)},
  address = {Brussels, Belgium},
  series = {ECRTS Workshops},
  pages = {29--33}
}

@misc{docker_compose,
  title = {{Docker Compose}},
  author = {{Docker, Inc.}},
  year = 2014,
  howpublished = {\url{https://docs.docker.com/compose/}, accessed 2025-09-15}
}

@misc{docker_sdk_python,
  title = {{Docker SDK for Python}},
  author = {{Docker, Inc.}},
  year = 2014,
  howpublished = {\url{https://docker-py.readthedocs.io/}, accessed 2025-09-15}
}

@misc{devcontainers,
  title = {{Development Containers Specification}},
  author = {{Dev Containers Spec}},
  year = 2022,
  howpublished = {\url{https://containers.dev}, accessed 2025-09-15}
}

@misc{repo2docker,
  title = {{repo2docker: Turn repositories into Jupyter-enabled Docker images}},
  author = {{Project Jupyter}},
  year = 2017,
  howpublished = {\url{https://repo2docker.readthedocs.io/}, accessed 2025-09-15}
}

@inproceedings{nix04lisa,
  title = {{Nix: A Safe and Policy-Free System for Software Deployment}},
  author = {Dolstra, Eelco and de Jonge, Merijn and Visser, Eelco},
  year = 2004,
  booktitle = {Proceedings of the 18th USENIX Conference on System Administration},
  location = {Atlanta, GA},
  publisher = {USENIX Association},
  address = {USA},
  series = {LISA '04},
  pages = {79--92},
  abstract = {Existing systems for software deployment are neither safe nor sufficiently flexible. Primary safety issues are the inability to enforce reliable specification of component dependencies, and the lack of support for multiple versions or variants of a component. This renders deployment operations such as upgrading or deleting components dangerous and unpredictable. A deployment system must also be flexible (i.e., policy-free) enough to support both centralised and local package management, and to allow a variety of mechanisms for transferring components. In this paper we present Nix, a deployment system that addresses these issues through a simple technique of using cryptographic hashes to compute unique paths for component instances.},
  numpages = 14
}

@misc{courtes2013guix,
  title = {{Functional Package Management with Guix}},
  author = {Ludovic Courtès},
  year = 2013,
  url = {https://arxiv.org/abs/1305.4584},
  eprint = {1305.4584},
  archiveprefix = {arXiv},
  primaryclass = {cs.PL}
}

doc/joss25/paper.md

Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
---
title: "pythainer: composable and reusable Docker builders and runners for reproducible research"
tags:
  - Python
  - Docker
  - reproducibility
  - software engineering
  - research tooling
  - system software
authors:
  - name: Antonio Paolillo
    orcid: 0000-0001-6608-6562
    # affiliation: 1
affiliations:
  - name: Software Languages Lab, Vrije Universiteit Brussel (VUB), Belgium
    # index: 1
    ror: 006e5kg04
date: 16 September 2025
bibliography: paper.bib
---

# Summary

Software experiments today often depend on complex Linux environments that
combine several toolchains, devices, and graphical interfaces. Many research
projects [@nvblox; @robotcore], for instance, need to
compose ROS 2 [@macenski2022ros2] with CUDA [@nvidiaCudaPg13], run as non-root
users, access GPUs and GUIs, and remain reproducible across time and
machines. Docker [@docker] is a widely adopted substrate for packaging and
running such environments, and is commonly used to improve reproducibility in
research software [@tani2020reproducible]. However, writing and maintaining
Dockerfiles and project-specific `docker run` scripts becomes a burden as
requirements grow.

`pythainer` raises the level of abstraction while remaining Docker-native. It
lets users describe images as small, testable Python *builders* that can be
composed (e.g., ROS 2 + CUDA) and executed with reusable *runners* that capture
runtime policy (GPU, GUI, users, mounts). `pythainer` renders deterministic
Dockerfiles, builds standard images, and centralizes run
configuration, which improves reuse and reduces duplication across repositories.

# Statement of need

Plain Dockerfiles are intentionally minimal: they offer sequential shell steps
but no first-class functions, loops, or composition. This is adequate for simple
images, yet it complicates reuse in research settings where environments must be
combined and parameterized. In particular, merging two existing images (e.g.,
community ROS 2 and NVIDIA CUDA) is not first-class; multi-stage builds help
trim artifacts but require intimate knowledge of which files, environment
variables, and paths must be copied and preserved. On the runtime side, real
projects often need non-root users, persistent volumes, access to GPUs and GUIs
(X11/Wayland), and device mappings. These concerns are typically maintained as
long shell scripts that are copy-pasted and diverge across projects.

`pythainer` addresses these pain points by adding a programmable front-end for
image construction and a reusable abstraction for execution policy. Builders are
Python objects and functions that support ordinary programming constructs
(conditionals, loops, parameters) and can be composed with a simple operator.
Runners encapsulate repeatable `docker run` policy, so launching a container is
a matter of selecting presets rather than rewriting long commands. The target
audience includes research groups and labs (robotics, vision, ML systems,
compilers, systems), instructors who need reliable student environments, and
continuous integration (CI) maintainers who prefer deterministic builds and
centralized run policy over ad-hoc scripts.
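
The builder idea can be illustrated with a small, self-contained sketch. The `Builder` class, its method names, the `|=` operator, and the helper functions below are hypothetical stand-ins written for this example; they are not `pythainer`'s actual API. The sketch only shows how ordinary Python (parameters, conditionals, an in-place operator) can drive Dockerfile rendering that is the same for the same inputs.

```python
from dataclasses import dataclass, field


@dataclass
class Builder:
    """Hypothetical stand-in for an image builder (not pythainer's real API)."""

    steps: list[str] = field(default_factory=list)

    def base(self, image: str) -> "Builder":
        self.steps.append(f"FROM {image}")
        return self

    def run(self, command: str) -> "Builder":
        self.steps.append(f"RUN {command}")
        return self

    def __ior__(self, other: "Builder") -> "Builder":
        """Compose in place: append the other builder's steps after ours."""
        self.steps.extend(other.steps)
        return self

    def render(self) -> str:
        """Render the Dockerfile text; the output depends only on the steps."""
        return "\n".join(self.steps) + "\n"


def apt_install(packages: list[str]) -> Builder:
    """Parameterized recipe; sorting makes the output independent of input order."""
    pkgs = " ".join(sorted(set(packages)))
    return Builder().run(f"apt-get update && apt-get install -y {pkgs}")


def toolchain(with_debugger: bool) -> Builder:
    """Ordinary control flow decides which steps end up in the image."""
    recipe = apt_install(["git", "cmake", "build-essential"])
    if with_debugger:
        recipe |= apt_install(["gdb"])
    return recipe


image = Builder().base("ubuntu:24.04")
image |= toolchain(with_debugger=True)
print(image.render())
```

Since rendering is a pure function of the recorded steps, a unit test can compare the rendered text against a stored expected Dockerfile.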

# Functionality

`pythainer` is a lightweight Python package and CLI that provides a
programmable front-end to Docker. Instead of writing raw Dockerfiles and shell
scripts, users compose images with builders and control runtime behavior with
runners. The library integrates naturally into Python workflows but remains
Docker-native: it renders deterministic Dockerfiles, builds them with the
Docker engine, and executes containers with reproducible runtime settings.
`pythainer` provides:

- **Builders (image construction).** A small API exposes common steps (e.g.,
  FROM/RUN/ENV/WORKDIR, package installs). Builders can be composed via an
  in-place operator to form larger images (e.g., ROS 2 + CUDA). Output
  rendering is deterministic, which simplifies testing and review. The tool
  remains Docker-native: it emits standard Dockerfiles and uses the Docker
  engine to build images [@docker].

- **Runners (execution policy).** A runner object assembles `docker run` flags
  for typical research needs: non-root user mapping, volumes, devices, GPUs, and
  GUI/X11 forwarding. Presets capture best practices (e.g., mapping the X socket
  and `DISPLAY`, requesting `--gpus all` with the expected environment
  variables), reducing duplication across repositories.

- **CLI.** A command-line interface provides two convenience commands:
  `scaffold` generates a starter Python script (builders + runners) and `run`
  composes and executes directly for one-offs.

- **Examples and tests.** The package ships small composition recipes (e.g.,
  LLVM/MLIR, QEMU, Rust) [@llvm; @mlir; @qemu]. Unit tests lock down Dockerfile
  rendering and CLI behavior; an opt-in integration test builds a tiny image to
  validate the end-to-end flow. Continuous integration runs tests and linters.

`pythainer` is designed to be naturally extensible: users can easily define
their own builders or runners, or build on top of those provided in the
library.
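
As a rough illustration of the runner concept, the sketch below turns a small policy object into a `docker run` argument list. The `Runner` class and its field names are hypothetical and chosen for this example only; they do not reproduce `pythainer`'s API. The Docker flags themselves (`--user`, `--volume`, `--device`, `--gpus all`, `--env DISPLAY`, and the `/tmp/.X11-unix` socket mount) are standard.

```python
import os
from dataclasses import dataclass, field


@dataclass
class Runner:
    """Hypothetical run policy; field names are not pythainer's API."""

    image: str
    volumes: dict[str, str] = field(default_factory=dict)
    devices: list[str] = field(default_factory=list)
    gpus: bool = False
    gui: bool = False
    as_host_user: bool = True

    def command(self) -> list[str]:
        """Translate the policy into a `docker run` argument list."""
        cmd = ["docker", "run", "--rm"]
        if self.as_host_user:
            # Run as the invoking user so files written to mounts are not root-owned.
            cmd += ["--user", f"{os.getuid()}:{os.getgid()}"]
        for host, guest in sorted(self.volumes.items()):
            cmd += ["--volume", f"{host}:{guest}"]
        for device in self.devices:
            cmd += ["--device", device]
        if self.gpus:
            cmd += ["--gpus", "all"]
        if self.gui:
            # X11 forwarding: pass DISPLAY through and share the X socket.
            cmd += ["--env", "DISPLAY",
                    "--volume", "/tmp/.X11-unix:/tmp/.X11-unix"]
        cmd.append(self.image)
        return cmd


runner = Runner(image="myimg", volumes={"/data": "/workspace"}, gpus=True, gui=True)
print(" ".join(runner.command()))
```

Keeping the policy in one object means a project selects options (`gpus=True`, `gui=True`) instead of maintaining its own copy of a long shell command.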

# Research applications

We have used `pythainer` to assemble environments for
(i) robotics experiments combining ROS 2 with CUDA toolchains [@shen2025sentryrt1; @itf24safebot];
(ii) compiler research that requires pinned LLVM toolchains [@degreef2025macros];
(iii) systems evaluations using QEMU built from source; and
(iv) GPU scheduling experiments where deterministic containerized environments
are required [@discepoli2025computeKernels].

In each case, the same small recipes are reused and composed across projects,
which shortens setup time and reduces configuration drift. Because `pythainer`
emits human-readable Dockerfiles, the resulting images remain transparent and
easy to audit, and the approach integrates well with existing Docker-centric CI.

# Relation to other work

`pythainer` complements the Docker ecosystem by adding a programmable composition
model on top of Dockerfiles. Unlike Docker Compose or the Docker SDK for Python,
which focus on orchestrating multi-service deployments or driving the daemon
[@docker_compose; @docker_sdk_python], `pythainer` focuses on single-image
construction and single-container execution policy. This makes it especially
suited for research projects where the goal is to provide a *single reproducible
environment* for experiments rather than a full service-oriented stack.

Compared with editor-centric templates such as VS Code devcontainers
[@devcontainers] or domain-specific generators such as repo2docker [@repo2docker],
`pythainer` treats environment recipes as code with tests and deterministic
rendering. Functional package managers such as Nix and Guix offer deep
system-level reproducibility but require adopting a different stack
[@nix04lisa; @courtes2013guix]; `pythainer` stays Docker-native for easier
adoption in labs and CI. Pragmatically, many third-party packages (e.g., CUDA and
ROS 2) are primarily supported on Ubuntu, so staying Docker-native with
Ubuntu-based images eases reproduction without changing the base distribution.

# Acknowledgements

We thank contributors for feedback and patches that improved early designs and
examples, including Attilio Discepoli, Yuwen Shen, Aaron Bogaert, and Samuel
Beesoon.

# References
