Run hundreds of isolated Linux development environments on a single VM. Built with LXC, SSH jump hosts, and cloud-native automation.
No Kubernetes. No VM-per-user. Just fast, cheap, isolated Linux environments.
Most teams still provision one VM per developer for SSH-based development.
That approach is:
- Expensive
- Slow to provision
- Wasteful (idle CPU, memory, disk)
Containarium replaces that model with multi-tenant system containers (LXC):
One VM → many isolated Linux environments → massive cost savings
In real deployments, this reduces infrastructure costs by up to 90%.
Containarium is a container-based development environment platform that:
- Hosts many isolated Linux environments on a single cloud VM
- Gives each user SSH access to their own container
- Uses LXC system containers (not Docker app containers)
- Keeps containers persistent, even across VM restarts
- Is managed via a single Go CLI + gRPC
Each container behaves like a lightweight VM:
- Full Linux OS
- User accounts
- SSH access
- Can run Docker, build tools, ML workloads, etc.
Developer Laptop
|
| ssh (ProxyJump)
v
+-------------------+
| SSH Jump Host | (no shell access)
+-------------------+
|
v
+----------------------------------+
| Cloud VM (Host) |
| |
| +---------+ +---------+ |
| | LXC #1 | | LXC #2 | ... |
| | user A | | user B | |
| +---------+ +---------+ |
| |
| ZFS-backed persistent storage |
+----------------------------------+
Fast Provisioning
- Create a full Linux environment in seconds
- No VM boot, no OS installation per user
Strong Isolation
- Unprivileged LXC containers
- Separate users, filesystems, and processes
- SSH jump host prevents direct host access
Persistent Storage
- Containers survive:
- VM restarts
- Spot/preemptible instance termination
- Backed by ZFS persistent disks
Simple Management
- Single Go binary
- gRPC-based control plane
- Terraform for infrastructure provisioning
Cost Efficient
Example (illustrative):
| Setup | Monthly Cost |
|---|---|
| 50 VMs (1 per user) | $$$$ |
| 1 VM + 50 LXC containers | $$ |
Containarium uses LXC system containers because:
- Each container runs a full Linux OS
- Better fit for:
- SSH-based workflows
- Long-running dev environments
- "Feels like a VM" usage
This is not:
- A Kubernetes cluster
- An application container platform
- A web IDE
It is intentionally simple.
- Shared developer environments
- Education, bootcamps, workshops
- AI / ML experimentation sandboxes
- Intern or contractor onboarding
- Cost-sensitive enterprises with SSH workflows
| Tool | What It Optimizes For |
|---|---|
| Kubernetes | Application orchestration |
| Docker | App packaging |
| Proxmox | General virtualization |
| Codespaces | Browser IDEs |
| Containarium | Cheap, fast, SSH-based dev environments |
- Actively used internally
- Early-stage open source
- APIs and CLI may evolve
- Contributions and feedback welcome
Host System:
- Ubuntu 24.04 LTS (Noble) or later
- Incus 6.19 or later (required for Docker build support)
- Ubuntu 24.04's default repos ship Incus 6.0.0, which has an AppArmor bug (CVE-2025-52881)
- This bug breaks Docker builds in unprivileged containers
- Solution: Use Zabbly Incus repository for latest stable builds
- ZFS kernel module (for disk quotas)
- Kernel modules: overlay, br_netfilter, nf_nat (for Docker in containers)
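These modules are normally handled by the provisioning scripts, but if you are preparing a host by hand, a minimal sketch (standard modprobe/modules-load.d mechanism; the file name below is an arbitrary choice) looks like this:

# Load the modules Docker-in-LXC needs right now
sudo modprobe overlay
sudo modprobe br_netfilter
sudo modprobe nf_nat

# Persist them across reboots (file name is illustrative)
printf 'overlay\nbr_netfilter\nnf_nat\n' | sudo tee /etc/modules-load.d/containarium.conf

# Verify they are loaded
lsmod | grep -E 'overlay|br_netfilter|nf_nat'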
Quick Incus Installation (6.19+):
# Add Zabbly repository (recommended)
curl -fsSL https://pkgs.zabbly.com/key.asc | sudo gpg --dearmor -o /usr/share/keyrings/zabbly-incus.gpg
echo 'deb [signed-by=/usr/share/keyrings/zabbly-incus.gpg] https://pkgs.zabbly.com/incus/stable noble main' | sudo tee /etc/apt/sources.list.d/zabbly-incus-stable.list
sudo apt update
sudo apt install incus incus-tools incus-client
# Verify version
incus --version  # Should show 6.19 or later

- Provision infrastructure with Terraform
- Install Containarium CLI
- Create LXC containers
- Assign users
- Connect via SSH
See docs/ for detailed setup instructions.
Containarium follows the same principle as Footprint-AI's platform:
Do more with less compute.
- Less idle.
- Less waste.
- Less cost.
Apache 2.0
Containarium is an open-source project by Footprint-AI, focused on resource-efficient computing for modern development and AI workloads.
Containarium provides a multi-layer architecture combining cloud infrastructure, container management, and secure access:
                         Users (SSH)
                              |
                              v
       +---------------------------------------------+
       |        GCE Load Balancer (Optional)         |
       |        - SSH Traffic Distribution           |
       |        - Health Checks (Port 22)            |
       |        - Session Affinity                   |
       +----------------------+----------------------+
                              |
         +--------------------+--------------------+
         |                    |                    |
         v                    v                    v
+------------------+ +------------------+ +------------------+
|  Jump Server 1   | |  Jump Server 2   | |  Jump Server 3   |
|  (Spot Instance) | |  (Spot Instance) | |  (Spot Instance) |
+------------------+ +------------------+ +------------------+
| - Debian 12      | | - Debian 12      | | - Debian 12      |
| - Incus LXC      | | - Incus LXC      | | - Incus LXC      |
| - ZFS Storage    | | - ZFS Storage    | | - ZFS Storage    |
| - Containarium   | | - Containarium   | | - Containarium   |
+--------+---------+ +--------+---------+ +--------+---------+
         |                    |                    |
         v                    v                    v
 +---------------+    +---------------+    +---------------+
 | Persistent    |    | Persistent    |    | Persistent    |
 | Disk (ZFS)    |    | Disk (ZFS)    |    | Disk (ZFS)    |
 +---------------+    +---------------+    +---------------+
         |                    |                    |
         v                    v                    v
 [C1] [C2] [C3]...    [C1] [C2] [C3]...    [C1] [C2] [C3]...
   50 Containers        50 Containers        50 Containers
- Compute: Spot instances with persistent disks
- Storage: ZFS on dedicated persistent disks (survives termination)
- Network: VPC with firewall rules, optional load balancer
- HA: Auto-start on boot, snapshot backups
- Runtime: Unprivileged LXC containers
- Storage: ZFS with compression (lz4) and quotas
- Network: Bridge networking with isolated namespaces
- Security: AppArmor profiles, resource limits
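To see how the storage settings above (lz4 compression, per-container quotas) show up on the host, you can inspect the pool directly. A small read-only sketch, with the pool name taken from the examples later in this README and the dataset layout assumed:

# Pool-wide compression setting (lz4 in the default setup)
sudo zfs get compression incus-pool

# Per-container datasets and the quotas applied to them (layout may differ)
sudo zfs list -o name,used,quota -r incus-pool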
- Language: Go with Protobuf contracts
- Operations: Create, delete, list, info, resize, export
- API: Local CLI + optional gRPC daemon
- Automation: Automated container lifecycle
- Jump Server: SSH bastion host
- ProxyJump: Transparent container access
- Authentication: SSH key-based only
- Isolation: Per-user containers
+--------------------------------------------------+
|  User Machine                                     |
|                                                   |
|  $ ssh my-dev                                     |
|      |                                            |
|      +--> ProxyJump via Jump Server               |
|      |                                            |
|      +--> SSH to Container IP (10.0.3.x)          |
|      |                                            |
|      +--> User in isolated Ubuntu container       |
|           with Docker installed                   |
+--------------------------------------------------+
+----------------------------------------------------------+
|  Terraform Workflow                                       |
|                                                           |
|  terraform apply                                          |
|      |                                                    |
|      +--> Create GCE instances (spot + persistent disk)   |
|      +--> Configure ZFS on persistent disk                |
|      +--> Install Incus from official repo                |
|      +--> Setup firewall rules                            |
|      +--> Optional: Deploy containarium daemon            |
+----------------------------------------------------------+
+----------------------------------------------------------+
|  Containarium CLI Workflow                                |
|                                                           |
|  containarium create alice --ssh-key ~/.ssh/alice.pub     |
|      |                                                    |
|      +--> Generate container profile (ZFS quota, limits)  |
|      +--> Launch Incus container (Ubuntu 24.04)           |
|      +--> Configure networking (get IP from pool)         |
|      +--> Inject SSH key for user                         |
|      +--> Install Docker and dev tools                    |
|      +--> Return container IP and SSH command             |
+----------------------------------------------------------+
Internet
   |
   v
+----------------------------------+
|  GCE Spot Instance               |
|  - n2-standard-8 (32GB RAM)      |
|  - 100GB boot + 100GB data disk  |
|  - ZFS pool on data disk         |
|  - 50 containers @ 500MB each    |
+----------------------------------+
Cost: $98/month | $1.96/user
Availability: ~99% (with auto-restart)
             Load Balancer
             (SSH Port 22)
                   |
      +------------+------------+
      |            |            |
      v            v            v
   Jump-1       Jump-2       Jump-3
 (50 users)   (50 users)   (50 users)
      |            |            |
      v            v            v
Persistent-1  Persistent-2  Persistent-3
(100GB ZFS)   (100GB ZFS)   (100GB ZFS)
Cost: $312/month | $2.08/user (150 users)
Availability: 99.9% (multi-server)
Each jump server is independent with its own containers and persistent storage.
1. User: containarium create alice --ssh-key alice.pub
2. CLI: Read SSH public key from file
3. CLI: Call Incus API to launch container
4. Incus: Pull Ubuntu 24.04 image (cached after first use)
5. Incus: Create ZFS dataset with quota (default 20GB)
6. Incus: Assign IP from pool (10.0.3.x)
7. CLI: Wait for container network ready
8. CLI: Inject SSH key into container
9. CLI: Install Docker and dev tools
10. CLI: Return IP and connection info
1. User: ssh my-dev (from ~/.ssh/config)
2. SSH: Connect to jump server as alice (ProxyJump)
3. Jump: Authenticate alice's key (proxy-only account, no shell)
4. SSH: Forward connection to container IP (10.0.3.x)
5. Container: Authenticate alice's key (same key!)
6. User: Shell access in isolated container
Secure Multi-Tenant Architecture:
+----------------------------------------------------------------+
|  User's Local Machine                                           |
|                                                                 |
|  ~/.ssh/config:                                                 |
|    Host my-dev                                                  |
|      HostName 10.0.3.100                                        |
|      User alice                                                 |
|      IdentityFile ~/.ssh/containarium    <-- ALICE'S KEY        |
|      ProxyJump containarium-jump                                |
|                                                                 |
|    Host containarium-jump                                       |
|      HostName 35.229.246.67                                     |
|      User alice                          <-- ALICE'S ACCOUNT!   |
|      IdentityFile ~/.ssh/containarium    <-- SAME KEY           |
+----------------------------------------------------------------+
                |
                | (1) SSH as alice (proxy-only)
                v
+----------------------------------------------------------------+
|  GCE Instance (Jump Server)                                     |
|                                                                 |
|  /home/admin/.ssh/authorized_keys:                              |
|    ssh-ed25519 AAAA... admin@laptop      <-- ADMIN ONLY         |
|    Shell: /bin/bash                      <-- FULL ACCESS        |
|                                                                 |
|  /home/alice/.ssh/authorized_keys:                              |
|    ssh-ed25519 AAAA... alice@laptop      <-- ALICE'S KEY        |
|    Shell: /usr/sbin/nologin              <-- NO SHELL ACCESS!   |
|                                                                 |
|  /home/bob/.ssh/authorized_keys:                                |
|    ssh-ed25519 AAAA... bob@laptop        <-- BOB'S KEY          |
|    Shell: /usr/sbin/nologin              <-- NO SHELL ACCESS!   |
|                                                                 |
|  ✓ Alice authenticated for proxy only                           |
|  ✓ Cannot execute commands on jump server                       |
|  ✓ ProxyJump forwards connection to container                   |
|  ✓ Audit log: alice@jump-server -> 10.0.3.100                   |
+----------------------------------------------------------------+
                |
                | (2) SSH with same key
                v
+----------------------------------------------------------------+
|  Guest Container (alice-container)                              |
|                                                                 |
|  /home/alice/.ssh/authorized_keys:                              |
|    ssh-ed25519 AAAA... alice@laptop      <-- SAME KEY           |
|                                                                 |
|  ✓ Alice authenticated                                          |
|  ✓ Shell access granted                                         |
|  ✓ Audit log: alice@alice-container                             |
+----------------------------------------------------------------+
Security Architecture:
- Separate accounts: Each user has their own account on jump server
- No shell access: User accounts use /usr/sbin/nologin (proxy-only)
- Same key: Users use one key for both jump server and container
- Admin isolation: Only admin can access jump server shell
- Audit trail: Each user's connections logged separately
- DDoS protection: fail2ban can block malicious users per account
- Zero trust: Users cannot see other containers or inspect system
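Under the hood, a proxy-only account is just a regular Linux user with a disabled login shell. A hedged sketch of what the CLI effectively does when it provisions one (the exact commands are an assumption, not the verbatim implementation):

# Create a proxy-only account on the jump server (Containarium automates this)
sudo useradd --create-home --shell /usr/sbin/nologin alice

# Install the user's public key so the ProxyJump hop can authenticate
sudo mkdir -p /home/alice/.ssh
echo 'ssh-ed25519 AAAA... alice@laptop' | sudo tee /home/alice/.ssh/authorized_keys
sudo chown -R alice:alice /home/alice/.ssh
sudo chmod 700 /home/alice/.ssh
sudo chmod 600 /home/alice/.ssh/authorized_keys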
1. GCE: Spot instance terminated (preemption)
2. GCE: Persistent disk detached (data preserved)
3. GCE: Instance restarts (within 5 minutes)
4. Startup: Mount persistent disk to /var/lib/incus
5. Startup: Import existing ZFS pool (incus-pool)
6. Incus: Auto-start containers (boot.autostart=true)
7. Total downtime: 2-5 minutes
8. Data: 100% preserved
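A hedged sketch of the recovery steps the startup script performs after a preemption (pool name from step 5 above; the real script also mounts the persistent disk first):

# On boot after a preemption (sketch; Terraform's startup script automates this)
sudo zpool import -f incus-pool   # re-import the ZFS pool from the persistent disk
sudo systemctl start incus        # start the Incus daemon
sudo incus list                   # containers with boot.autostart=true come back up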
- Development Teams: Isolated dev environments for each developer (100+ users)
- Training & Education: Spin up temporary environments for students
- CI/CD Runners: Ephemeral build and test environments
- Testing: Isolated test environments with Docker support
- Multi-Tenancy: Safe isolation between users, teams, or projects
| Users | Traditional VMs | Containarium | Savings |
|---|---|---|---|
| 50 | $1,250/mo | $98/mo | 92% |
| 150 | $3,750/mo | $312/mo | 92% |
| 250 | $6,250/mo | $508/mo | 92% |
How?
- LXC containers: 10x more density than VMs
- Spot instances: 76% cheaper than regular VMs
- Persistent disks: Survive spot termination
- Single infrastructure: No VM-per-user overhead
Choose your deployment size:
Small Team (20-50 users):
cd terraform/gce
cp examples/single-server-spot.tfvars terraform.tfvars
vim terraform.tfvars # Add your project_id and SSH keys
terraform init
terraform apply

Medium Team (100-150 users):
cp examples/horizontal-scaling-3-servers.tfvars terraform.tfvars
vim terraform.tfvars  # Configure
terraform apply

Large Team (200-250 users):
cp examples/horizontal-scaling-5-servers.tfvars terraform.tfvars
terraform apply

Option A: Deploy for Local Mode (SSH to server)
# Build containarium CLI for Linux
make build-linux
# Copy to jump server(s)
scp bin/containarium-linux-amd64 admin@<jump-server-ip>:/tmp/
ssh admin@<jump-server-ip>
sudo mv /tmp/containarium-linux-amd64 /usr/local/bin/containarium
sudo chmod +x /usr/local/bin/containarium

Option B: Setup for Remote Mode (Run from anywhere)
# Build containarium for your platform
make build # macOS/Linux on your laptop
# Setup daemon on server
scp bin/containarium-linux-amd64 admin@<jump-server-ip>:/tmp/
ssh admin@<jump-server-ip>
sudo mv /tmp/containarium-linux-amd64 /usr/local/bin/containarium
sudo chmod +x /usr/local/bin/containarium
# Generate mTLS certificates
sudo containarium cert generate \
--server-ip <jump-server-ip> \
--output-dir /etc/containarium/certs
# Start daemon (via systemd or manually)
sudo systemctl start containarium
# Copy client certificates to your machine
exit
mkdir -p ~/.config/containarium/certs
scp admin@<jump-server-ip>:/etc/containarium/certs/{ca.crt,client.crt,client.key} \
  ~/.config/containarium/certs/

Option A: Local Mode (SSH to server)
# SSH to jump server
ssh admin@<jump-server-ip>
# Create container for a user
sudo containarium create alice --ssh-key ~/.ssh/alice.pub
# Output:
# ✓ Creating container for user: alice
# ✓ [1/7] Creating container...
# ✓ [2/7] Starting container...
# ✓ [3/7] Creating jump server account (proxy-only)...
# ✓ Jump server account created: alice (no shell access, proxy-only)
# ✓ [4/7] Waiting for network...
# Container IP: 10.0.3.100
# ✓ [5/7] Installing Docker, SSH, and tools...
# ✓ [6/7] Creating user: alice...
# ✓ [7/7] Adding SSH keys (including jump server key for ProxyJump)...
# ✓ Container alice-container created successfully!
#
# Container Details:
# Name: alice-container
# User: alice
# IP: 10.0.3.100
# Disk: 50GB
# Auto-start: enabled
#
# Jump Server Account (Secure Multi-Tenant):
# Username: alice
# Shell: /usr/sbin/nologin (proxy-only, no shell access)
# SSH ProxyJump: enabled
#
# SSH Access (via ProxyJump):
# ssh alice-dev # (after SSH config setup)
# List containers
sudo containarium list
# +------------------+---------+----------------------+------+-----------+
# | NAME | STATE | IPV4 | TYPE | SNAPSHOTS |
# +------------------+---------+----------------------+------+-----------+
# | alice-container | RUNNING | 10.0.3.100 (eth0) | C | 0 |
# +------------------+---------+----------------------+------+-----------+

Option B: Remote Mode (from your laptop)
# No SSH required - direct gRPC call with mTLS
containarium create alice --ssh-key ~/.ssh/alice.pub \
--server 35.229.246.67:50051 \
--certs-dir ~/.config/containarium/certs \
--cpu 4 --memory 8GB -v
# List containers remotely
containarium list \
--server 35.229.246.67:50051 \
--certs-dir ~/.config/containarium/certs
# Export SSH config remotely (run on server)
ssh admin@<jump-server-ip>
sudo containarium export alice --jump-ip 35.229.246.67 >> ~/.ssh/config

Each user needs their own SSH key pair for container access.
User generates SSH key (on their local machine):
# Generate new SSH key pair
ssh-keygen -t ed25519 -C "[email protected]" -f ~/.ssh/containarium_alice
# Output:
# ~/.ssh/containarium_alice (private key - keep secret!)
# ~/.ssh/containarium_alice.pub (public key - share with admin)

Admin creates container with user's public key:
# User sends their public key to admin
# Admin receives: alice_id_ed25519.pub
# SSH to jump server
ssh admin@<jump-server-ip>
# Create container with user's public key
sudo containarium create alice --ssh-key /path/to/alice_id_ed25519.pub
# Or if key is on admin's local machine, copy it first:
scp alice_id_ed25519.pub admin@<jump-server-ip>:/tmp/
ssh admin@<jump-server-ip>
sudo containarium create alice --ssh-key /tmp/alice_id_ed25519.pub

Containarium implements a secure proxy-only jump server architecture:
- ✅ Each user has a separate jump server account with a /usr/sbin/nologin shell
- ✅ Jump server accounts are proxy-only (no direct shell access)
- ✅ SSH ProxyJump works transparently through the jump server
- ✅ Users cannot access jump server data or see other users
- ✅ Automatic jump server account creation when container is created
- ✅ Jump server accounts deleted when container is deleted
User's Laptop                     Jump Server                    Container
      |                                |                              |
      |  SSH to alice-jump             |                              |
      |------------------------------->|  (alice account:             |
      |  (ProxyJump)                   |   /usr/sbin/nologin)         |
      |                                |     +--> Blocks shell        |
      |                                |     +--> Allows proxy        |
      |                                |                              |
      |                                |  SSH forward                 |
      |                                |----------------------------->|
      |                                                               |
      |  Direct SSH to container (10.0.3.100)                         |
      |<--------------------------------------------------------------|
Users configure SSH on their local machine:
Add to ~/.ssh/config:
# Jump server (proxy-only account - NO shell access)
Host containarium-jump
HostName <jump-server-ip>
User alice # Each user has their own jump account
IdentityFile ~/.ssh/containarium_alice
# Your dev container
Host alice-dev
HostName 10.0.3.100
User alice
IdentityFile ~/.ssh/containarium_alice
ProxyJump containarium-jump
StrictHostKeyChecking accept-new

Test the setup:
# This will FAIL (proxy-only account - no shell)
ssh containarium-jump
# Output: "This account is currently not available."
# This WORKS (ProxyJump to container)
ssh alice-dev
# Output: alice@alice-container:~$

Connect:
ssh my-dev
# Alice is now in her Ubuntu container with Docker!
# First connection will ask to verify host key:
# The authenticity of host '10.0.3.100 (<no hostip for proxy command>)' can't be established.
# ED25519 key fingerprint is SHA256:...
# Are you sure you want to continue connecting (yes/no)? yes

# Method 1: Using incus exec
sudo incus exec alice-container -- bash -c "echo 'ssh-ed25519 AAAA...' >> /home/alice/.ssh/authorized_keys"
# Method 2: Using incus file push
echo 'ssh-ed25519 AAAA...' > /tmp/new_key.pub
sudo incus file push /tmp/new_key.pub alice-container/home/alice/.ssh/authorized_keys --mode 0600 --uid 1000 --gid 1000
# Method 3: SSH into container and add manually
ssh [email protected] # (from jump server)
echo 'ssh-ed25519 AAAA...' >> ~/.ssh/authorized_keys

# Overwrite authorized_keys with new key
echo 'ssh-ed25519 NEW_KEY_AAAA...' | sudo incus exec alice-container -- \
tee /home/alice/.ssh/authorized_keys > /dev/null
# Set correct permissions
sudo incus exec alice-container -- chown alice:alice /home/alice/.ssh/authorized_keys
sudo incus exec alice-container -- chmod 600 /home/alice/.ssh/authorized_keys

# Edit authorized_keys file
sudo incus exec alice-container -- bash -c \
"sed -i '/alice@old-laptop/d' /home/alice/.ssh/authorized_keys"# List all authorized keys for a user
sudo incus exec alice-container -- cat /home/alice/.ssh/authorized_keys

Containarium/
├── proto/                          # Protobuf contracts (type-safe)
│   └── containarium/v1/
│       ├── container.proto         # Container operations
│       └── config.proto            # System configuration
│
├── cmd/containarium/               # CLI entry point
├── internal/
│   ├── cmd/                        # CLI commands (create, list, delete, info)
│   ├── container/                  # Container management logic
│   ├── incus/                      # Incus API wrapper
│   └── ssh/                        # SSH key management
│
├── terraform/
│   ├── gce/                        # GCP deployment
│   │   ├── main.tf                 # Main infrastructure
│   │   ├── horizontal-scaling.tf   # Multi-server setup
│   │   ├── spot-instance.tf        # Spot VM + persistent disk
│   │   ├── examples/               # Ready-to-use configurations
│   │   └── scripts/                # Startup scripts
│   └── embed/                      # Terraform file embedding for tests
│       ├── terraform.go            # go:embed declarations
│       └── README.md               # Embedding documentation
│
├── test/integration/               # E2E tests
│   ├── e2e_terraform_test.go       # Terraform-based E2E tests
│   ├── e2e_reboot_test.go          # gcloud-based E2E tests
│   ├── TERRAFORM-E2E.md            # Terraform testing guide
│   └── E2E-README.md               # gcloud testing guide
│
├── docs/                           # Documentation
│   ├── HORIZONTAL-SCALING-QUICKSTART.md
│   ├── SSH-JUMP-SERVER-SETUP.md
│   └── SPOT-INSTANCES-AND-SCALING.md
│
├── Makefile                        # Build automation
└── IMPLEMENTATION-PLAN.md          # Detailed roadmap
# Show all commands
make help
# Build for current platform
make build
# Build for Linux (deployment)
make build-linux
# Generate protobuf code
make proto
# Run tests
make test
# Run E2E tests (requires GCP credentials)
export GCP_PROJECT=your-project-id
make test-e2e
# Lint and format
make lint fmt

# Build and run locally
make run-local
# Test commands
./bin/containarium create alice
./bin/containarium list
./bin/containarium info alice

Containarium uses a comprehensive testing strategy with real infrastructure validation:
The E2E test suite leverages the same Terraform configuration used for production deployments:
test/integration/
├── e2e_terraform_test.go   # Terraform-based E2E tests
├── e2e_reboot_test.go      # Alternative gcloud-based tests
├── TERRAFORM-E2E.md        # Terraform E2E documentation
└── E2E-README.md           # gcloud E2E documentation

terraform/embed/
├── terraform.go            # Embeds Terraform files (go:embed)
└── README.md               # Embedding documentation
Key Features:
- ✅ go:embed Integration: Terraform files embedded in test binary for portability
- ✅ ZFS Persistence: Verifies data survives spot instance reboots
- ✅ No Hardcoded Values: All configuration from Terraform outputs
- ✅ Reproducible: Same Terraform config as production
- ✅ Automatic Cleanup: Infrastructure destroyed after tests
Running E2E Tests:
# Set GCP project
export GCP_PROJECT=your-gcp-project-id
# Run full E2E test (25-30 min)
make test-e2e
# Test workflow:
# 1. Deploy infrastructure with Terraform
# 2. Wait for instance ready
# 3. Verify ZFS setup
# 4. Create container with test data
# 5. Reboot instance (stop/start)
# 6. Verify data persisted
# 7. Cleanup infrastructure

Test Reports:
- Creates temporary Terraform workspace
- Verifies ZFS pool status
- Validates container quota enforcement
- Confirms data persistence across reboots
See test/integration/TERRAFORM-E2E.md for detailed documentation.
With separate user accounts, every SSH connection is logged with the actual username:
SSH Audit Logs (/var/log/auth.log):
# Alice connects to her container
Jan 10 14:23:15 jump-server sshd[12345]: Accepted publickey for alice from 203.0.113.10
Jan 10 14:23:15 jump-server sshd[12345]: pam_unix(sshd:session): session opened for user alice
# Bob connects to his container
Jan 10 14:25:32 jump-server sshd[12346]: Accepted publickey for bob from 203.0.113.11
Jan 10 14:25:32 jump-server sshd[12346]: pam_unix(sshd:session): session opened for user bob
# Failed login attempt
Jan 10 14:30:01 jump-server sshd[12347]: Failed publickey for charlie from 203.0.113.12
Jan 10 14:30:05 jump-server sshd[12348]: Failed publickey for charlie from 203.0.113.12
Jan 10 14:30:09 jump-server sshd[12349]: Failed publickey for charlie from 203.0.113.12

View Audit Logs:
# SSH to jump server as admin
ssh admin@<jump-server-ip>
# View all SSH connections
sudo journalctl -u sshd -f
# View connections for specific user
sudo journalctl -u sshd | grep "for alice"
# View failed login attempts
sudo journalctl -u sshd | grep "Failed"
# View connections from specific IP
sudo journalctl -u sshd | grep "from 203.0.113.10"
# Export logs for security audit
sudo journalctl -u sshd --since "2025-01-01" --until "2025-01-31" > ssh-audit-jan-2025.logAutomatically block brute force attacks and unauthorized access attempts:
Install fail2ban (added to startup script):
# Automatically installed by Terraform startup script
sudo apt install -y fail2ban

Configure fail2ban for SSH (/etc/fail2ban/jail.d/sshd.conf):
[sshd]
enabled = true
port = 22
filter = sshd
logpath = /var/log/auth.log
maxretry = 3 # Block after 3 failed attempts
findtime = 600 # Within 10 minutes
bantime = 3600 # Ban for 1 hour
banaction = iptables-multiport

Monitor fail2ban:
# Check fail2ban status
sudo fail2ban-client status
# Check SSH jail status
sudo fail2ban-client status sshd
# Output:
# Status for the jail: sshd
# |- Filter
# | |- Currently failed: 2
# | |- Total failed: 15
# | `- File list: /var/log/auth.log
# `- Actions
# |- Currently banned: 1
# |- Total banned: 3
# `- Banned IP list: 203.0.113.12
# View banned IPs
sudo fail2ban-client get sshd banip
# Unban IP manually (if needed)
sudo fail2ban-client set sshd unbanip 203.0.113.12

fail2ban Logs:
# View fail2ban activity
sudo tail -f /var/log/fail2ban.log
# Example output:
# 2025-01-10 14:30:15,123 fail2ban.filter [12345]: INFO [sshd] Found 203.0.113.12 - 2025-01-10 14:30:09
# 2025-01-10 14:30:20,456 fail2ban.actions [12346]: NOTICE [sshd] Ban 203.0.113.12
# 2025-01-10 15:30:20,789 fail2ban.actions [12347]: NOTICE [sshd] Unban 203.0.113.12

Create monitoring script (/usr/local/bin/security-monitor.sh):
#!/bin/bash
echo "=== Containarium Security Status ==="
echo ""
echo "π Active SSH Sessions:"
who
echo ""
echo "π« Banned IPs (fail2ban):"
sudo fail2ban-client status sshd | grep "Banned IP"
echo ""
echo "β οΈ Recent Failed Login Attempts:"
sudo journalctl -u sshd --since "1 hour ago" | grep "Failed" | tail -10
echo ""
echo "β
Successful Logins (last hour):"
sudo journalctl -u sshd --since "1 hour ago" | grep "Accepted publickey" | tail -10
echo ""
echo "π₯ Unique Users Connected Today:"
sudo journalctl -u sshd --since "today" | grep "Accepted publickey" | \
awk '{print $9}' | sort -uRun monitoring:
# Make executable
sudo chmod +x /usr/local/bin/security-monitor.sh
# Run manually
sudo /usr/local/bin/security-monitor.sh
# Add to cron for daily reports
echo "0 9 * * * /usr/local/bin/security-monitor.sh | mail -s 'Daily Security Report' [email protected]" | sudo crontab -Since each user has their own account, you can track:
User-specific metrics:
# Count connections per user
sudo journalctl -u sshd --since "today" | grep "Accepted publickey" | \
awk '{print $9}' | sort | uniq -c | sort -rn
# Output:
# 45 alice
# 32 bob
# 18 charlie
# 5 david
# View all of Alice's connections
sudo journalctl -u sshd | grep "for alice" | grep "Accepted publickey"
# Find when Bob last connected
sudo journalctl -u sshd | grep "for bob" | grep "Accepted publickey" | tail -1With separate accounts, DDoS attacks are isolated:
Scenario: Alice's laptop is compromised and spams connections
# fail2ban detects excessive failed attempts from alice's IP
2025-01-10 15:00:00 fail2ban.filter [12345]: INFO [sshd] Found alice from 203.0.113.10
2025-01-10 15:00:05 fail2ban.filter [12346]: INFO [sshd] Found alice from 203.0.113.10
2025-01-10 15:00:10 fail2ban.filter [12347]: INFO [sshd] Found alice from 203.0.113.10
2025-01-10 15:00:15 fail2ban.actions [12348]: NOTICE [sshd] Ban 203.0.113.10
# Result:
# ✅ Alice's IP is banned (her laptop is blocked)
# ✅ Bob, Charlie, and other users are NOT affected
# ✅ Service continues for everyone else
# ✅ Admin can investigate Alice's account specifically

Without separate accounts (everyone uses 'admin'):
# ❌ Can't tell which user is causing the issue
# ❌ Banning the IP might affect legitimate users behind NAT
# ❌ No per-user accountability

Export security logs for compliance:
# Export all SSH activity for user 'alice' in January
sudo journalctl -u sshd --since "2025-01-01" --until "2025-02-01" | \
grep "for alice" > alice-ssh-audit-jan-2025.log
# Export all failed login attempts
sudo journalctl -u sshd --since "2025-01-01" --until "2025-02-01" | \
grep "Failed" > failed-logins-jan-2025.log
# Export fail2ban bans
sudo fail2ban-client get sshd banhistory > ban-history-jan-2025.log

- Regular Log Reviews: Check logs weekly for suspicious activity
- fail2ban Tuning: Adjust maxretry and bantime based on your security needs
- Alert on Anomalies: Set up alerts for unusual patterns (e.g., 100+ connections from one user)
- Log Retention: Keep logs for at least 90 days for compliance
- Separate Admin Access: Never use user accounts for admin tasks
- Monitor fail2ban: Ensure fail2ban service is always running
Complete end-to-end workflow for adding a new user:
User (on their local machine):
# Generate SSH key pair
ssh-keygen -t ed25519 -C "[email protected]" -f ~/.ssh/containarium_alice
# Output:
# Generating public/private ed25519 key pair.
# Enter passphrase (empty for no passphrase): [optional]
# Your identification has been saved in ~/.ssh/containarium_alice
# Your public key has been saved in ~/.ssh/containarium_alice.pub
# View and copy public key to send to admin
cat ~/.ssh/containarium_alice.pub
# ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJqL+XYZ... [email protected]

Admin receives public key and creates container:
# Save user's public key to file
echo 'ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJqL... [email protected]' > /tmp/alice.pub
# SSH to jump server
ssh admin@<jump-server-ip>
# Create container with user's public key
# This automatically:
# 1. Creates jump server account for alice (proxy-only, no shell)
# 2. Creates alice-container with SSH access
# 3. Sets up SSH keys for both
sudo containarium create alice --ssh-key /tmp/alice.pub
# Output:
# ✓ Creating jump server account: alice (proxy-only)
# ✓ Creating container for user: alice
# ✓ Container started: alice-container
# ✓ IP Address: 10.0.3.100
# ✓ Installing Docker and dev tools
# ✓ Container alice-container created successfully!
#
# ✓ Jump server account: [email protected] (proxy-only, no shell)
# ✓ Container access: [email protected]
#
# Send this to user:
# Jump Server: 35.229.246.67 (user: alice)
# Container IP: 10.0.3.100
# Username: alice
# Enable auto-start for spot instance recovery
sudo incus config set alice-container boot.autostart true

Method 1: Export SSH Config (Recommended)
# Admin exports SSH configuration
sudo containarium export alice --jump-ip 35.229.246.67 --key ~/.ssh/containarium_alice > alice-ssh-config.txt
# Send alice-ssh-config.txt to user via email/Slack

Method 2: Manual SSH Config
Admin sends to user via email/Slack:
Your development container is ready!
Jump Server IP: 35.229.246.67
Your Username: alice (for both jump server and container)
Container IP: 10.0.3.100
Add this to your ~/.ssh/config:
Host containarium-jump
HostName 35.229.246.67
User alice                              # <-- Your own username!
IdentityFile ~/.ssh/containarium_alice
Host alice-dev
HostName 10.0.3.100
User alice                              # <-- Same username
IdentityFile ~/.ssh/containarium_alice  # <-- Same key
ProxyJump containarium-jump
Then connect with: ssh alice-dev
Note: Your jump server account is proxy-only (no shell access).
You can only access your container, not the jump server itself.
User (on their local machine):
Method 1: Using Exported Config (Recommended)
# Add exported config to your SSH config
cat alice-ssh-config.txt >> ~/.ssh/config
# Connect to container
ssh alice-dev
# You're now in your container!
alice@alice-container:~$ docker run hello-world
alice@alice-container:~$ sudo apt install vim git tmux

Method 2: Manual Configuration
# Add to ~/.ssh/config
vim ~/.ssh/config
# Paste the configuration provided by admin
# Connect to container
ssh alice-dev
# First time: verify host key
# The authenticity of host '10.0.3.100' can't be established.
# ED25519 key fingerprint is SHA256:...
# Are you sure you want to continue connecting (yes/no)? yes
# You're now in your container!
alice@alice-container:~$ docker run hello-world
alice@alice-container:~$ sudo apt install vim git tmux

User wants to access from second laptop:
# On second laptop, generate new key
ssh-keygen -t ed25519 -C "alice@home-laptop" -f ~/.ssh/containarium_alice_home
# Send new public key to admin
cat ~/.ssh/containarium_alice_home.pub

Admin adds second key:
# Add second key to container (keeps existing keys)
NEW_KEY='ssh-ed25519 AAAAC3... alice@home-laptop'
sudo incus exec alice-container -- bash -c \
"echo '$NEW_KEY' >> /home/alice/.ssh/authorized_keys"User can now connect from both laptops!
Containarium provides a simple, intuitive CLI for container management.
Containarium uses a single binary that operates in two modes:
# Execute directly on the jump server (requires sudo)
sudo containarium create alice --ssh-key ~/.ssh/alice.pub
sudo containarium list
sudo containarium delete bob

- ✅ Direct Incus API access via Unix socket
- ✅ No daemon required
- ✅ Fastest execution
- ❌ Must be run on the server
- ❌ Requires sudo/root privileges
# Execute from anywhere (laptop, CI/CD, etc.)
containarium create alice --ssh-key ~/.ssh/alice.pub \
--server 35.229.246.67:50051 \
--certs-dir ~/.config/containarium/certs
containarium list --server 35.229.246.67:50051 \
--certs-dir ~/.config/containarium/certs

- ✅ Remote execution from any machine
- ✅ Secure mTLS authentication
- ✅ No SSH required
- ✅ Perfect for automation/CI/CD
- ❌ Requires daemon running on server
- ❌ Requires certificate setup
# Run as systemd service on the jump server
containarium daemon --address 0.0.0.0 --port 50051 --mtls
# Systemd service configuration
sudo systemctl start containarium
sudo systemctl enable containarium
sudo systemctl status containarium

- Listens on port 50051 (gRPC)
- Enforces mTLS client authentication
- Manages concurrent container operations
- Automatically started via systemd
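Terraform can deploy the daemon for you; purely as an illustrative sketch of the systemd wiring (the unit contents below are an assumption, not the shipped service file), it amounts to something like:

# Sketch: install a minimal systemd unit for the daemon (adjust paths/flags)
sudo tee /etc/systemd/system/containarium.service > /dev/null <<'EOF'
[Unit]
Description=Containarium gRPC daemon
After=network-online.target incus.service

[Service]
ExecStart=/usr/local/bin/containarium daemon --address 0.0.0.0 --port 50051 --mtls
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now containarium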
Generate mTLS certificates:
# On server: Generate server and client certificates
containarium cert generate \
--server-ip 35.229.246.67 \
--output-dir /etc/containarium/certs
# Copy client certificates to local machine
scp [email protected]:/etc/containarium/certs/{ca.crt,client.crt,client.key} \
  ~/.config/containarium/certs/

Verify connection:
# Test remote connection
containarium list \
--server 35.229.246.67:50051 \
--certs-dir ~/.config/containarium/certs

# Basic usage
sudo containarium create <username> --ssh-key <path-to-public-key>
# Example
sudo containarium create alice --ssh-key ~/.ssh/alice.pub
# With custom disk quota
sudo containarium create bob --ssh-key ~/.ssh/bob.pub --disk-quota 50GB
# Enable auto-start on boot
sudo containarium create charlie --ssh-key ~/.ssh/charlie.pub --autostart

Output:
✓ Creating container for user: alice
✓ Launching Ubuntu 24.04 container
✓ Container started: alice-container
✓ IP Address: 10.0.3.100
✓ Installing Docker and dev tools
✓ Configuring SSH access
✓ Container alice-container created successfully!
Container Details:
Name: alice-container
User: alice
IP: 10.0.3.100
Disk Quota: 20GB (ZFS)
SSH: ssh [email protected]
# List all containers
sudo containarium list
# Example output
NAME STATUS IP QUOTA AUTOSTART
alice-container Running 10.0.3.100 20GB Yes
bob-container Running 10.0.3.101 50GB Yes
charlie-container Stopped - 20GB No

# Get detailed information
sudo containarium info alice
# Example output
Container: alice-container
Status: Running
User: alice
IP Address: 10.0.3.100
Disk Quota: 20GB
Disk Used: 4.2GB (21%)
Memory: 512MB / 2GB
CPU Usage: 5%
Uptime: 3 days
Auto-start: Enabled

# Delete container (with confirmation)
sudo containarium delete alice
# Force delete (no confirmation)
sudo containarium delete bob --force
# Delete with data backup
sudo containarium delete charlie --backup

Dynamically adjust container resources (CPU, memory, disk) without any downtime. All changes take effect immediately without restarting the container.
# Resize CPU only
sudo containarium resize alice --cpu 4
# Resize memory only
sudo containarium resize alice --memory 8GB
# Resize disk only
sudo containarium resize alice --disk 100GB
# Resize all three at once
sudo containarium resize alice --cpu 4 --memory 8GB --disk 100GB
# With verbose output
sudo containarium resize alice --cpu 8 --memory 16GB -v

Advanced CPU Options:
# Set specific number of cores
sudo containarium resize alice --cpu 4
# Set CPU range (flexible allocation)
sudo containarium resize alice --cpu 2-4
# Pin to specific CPU cores (performance)
sudo containarium resize alice --cpu 0-3

Memory Formats:
# Gigabytes
sudo containarium resize alice --memory 8GB
# Megabytes
sudo containarium resize alice --memory 4096MB
# Gibibytes (binary)
sudo containarium resize alice --memory 8GiB

Important Notes:
- CPU: Always safe to increase or decrease. Supports over-provisioning (4-8x).
- Memory: Safe to increase. Check current usage before decreasing to avoid OOM kills.
- Disk: Can only increase (cannot shrink below current usage).
- All changes are instant with no container restart required.
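As noted above, check current usage before shrinking memory; a quick sketch using the container from the examples:

# See what the container is actually using before lowering its memory limit
sudo incus exec alice-container -- free -h

# Incus also reports current memory usage alongside the container config
sudo incus info alice-container | grep -i memory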
Verbose Output Example:
$ sudo containarium resize alice --cpu 4 --memory 8GB -v
Resizing container: alice-container
Setting CPU limit: 4
Setting memory limit: 8GB
✓ Resources updated successfully (no restart required)
✓ Container alice-container resized successfully!
Updated configuration:
CPU: 4
Memory: 8GB

# Export to stdout (copy/paste to ~/.ssh/config)
sudo containarium export alice --jump-ip 35.229.246.67
# Export to file
sudo containarium export alice --jump-ip 35.229.246.67 --output ~/.ssh/config.d/containarium-alice
# With custom SSH key path
sudo containarium export alice --jump-ip 35.229.246.67 --key ~/.ssh/containarium_alice
# Append directly to SSH config
sudo containarium export alice --jump-ip 35.229.246.67 >> ~/.ssh/config

Output:
# Containarium SSH Configuration
# User: alice
# Generated: 2026-01-10 08:43:18
# Jump server (GCE instance with proxy-only account)
Host containarium-jump
HostName 35.229.246.67
User alice
IdentityFile ~/.ssh/containarium_alice
# No shell access - proxy-only account
# User's development container
Host alice-dev
HostName 10.0.3.100
User alice
IdentityFile ~/.ssh/containarium_alice
ProxyJump containarium-jump
Usage:
# After exporting, connect with:
ssh alice-dev

# User generates their own key pair
ssh-keygen -t ed25519 -C "[email protected]" -f ~/.ssh/containarium
# Output files:
# ~/.ssh/containarium (private - never share!)
# ~/.ssh/containarium.pub (public - give to admin)
# View public key (to send to admin)
cat ~/.ssh/containarium.pub

# Method 1: Admin has key file locally
sudo containarium create alice --ssh-key /path/to/alice.pub
# Method 2: Admin receives key via secure channel
# User sends their public key:
cat ~/.ssh/containarium.pub
# ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJqL... [email protected]
# Admin creates container with key inline
echo 'ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJqL... [email protected]' > /tmp/alice.pub
sudo containarium create alice --ssh-key /tmp/alice.pub
# Method 3: Multiple users with different keys
sudo containarium create alice --ssh-key /tmp/alice.pub
sudo containarium create bob --ssh-key /tmp/bob.pub
sudo containarium create charlie --ssh-key /tmp/charlie.pub

# Add additional key (keep existing keys)
NEW_KEY='ssh-ed25519 AAAAC3... user@laptop'
sudo incus exec alice-container -- bash -c \
"echo '$NEW_KEY' >> /home/alice/.ssh/authorized_keys"
# Verify key was added
sudo incus exec alice-container -- cat /home/alice/.ssh/authorized_keys

# Replace all keys with new key
NEW_KEY='ssh-ed25519 AAAAC3... alice@new-laptop'
echo "$NEW_KEY" | sudo incus exec alice-container -- \
tee /home/alice/.ssh/authorized_keys > /dev/null
# Fix permissions
sudo incus exec alice-container -- chown alice:alice /home/alice/.ssh/authorized_keys
sudo incus exec alice-container -- chmod 600 /home/alice/.ssh/authorized_keys

# Add work laptop key
WORK_KEY='ssh-ed25519 AAAAC3... alice@work-laptop'
sudo incus exec alice-container -- bash -c \
"echo '$WORK_KEY' >> /home/alice/.ssh/authorized_keys"
# Add home laptop key
HOME_KEY='ssh-ed25519 AAAAC3... alice@home-laptop'
sudo incus exec alice-container -- bash -c \
"echo '$HOME_KEY' >> /home/alice/.ssh/authorized_keys"
# View all keys
sudo incus exec alice-container -- cat /home/alice/.ssh/authorized_keys
# ssh-ed25519 AAAAC3... alice@work-laptop
# ssh-ed25519 AAAAC3... alice@home-laptop

# Remove key by comment (last part of key)
sudo incus exec alice-container -- bash -c \
"sed -i '/alice@old-laptop/d' /home/alice/.ssh/authorized_keys"
# Remove key by fingerprint pattern
sudo incus exec alice-container -- bash -c \
"sed -i '/AAAAC3NzaC1lZDI1NTE5AAAAIAbc123/d' /home/alice/.ssh/authorized_keys"# Check authorized_keys permissions
sudo incus exec alice-container -- ls -la /home/alice/.ssh/
# Should show:
# drwx------ 2 alice alice 4096 ... .ssh
# -rw------- 1 alice alice 123 ... authorized_keys
# Fix permissions if wrong
sudo incus exec alice-container -- chown -R alice:alice /home/alice/.ssh
sudo incus exec alice-container -- chmod 700 /home/alice/.ssh
sudo incus exec alice-container -- chmod 600 /home/alice/.ssh/authorized_keys
# Test SSH from jump server
ssh -v [email protected]
# -v shows verbose output for debugging
# Check SSH logs in container
sudo incus exec alice-container -- tail -f /var/log/auth.log

# Execute command in container
sudo incus exec alice-container -- df -h
# Shell into container
sudo incus exec alice-container -- su - alice
# View container logs
sudo incus console alice-container --show-log
# Snapshot container
sudo incus snapshot alice-container snap1
# Restore snapshot
sudo incus restore alice-container snap1
# Copy container
sudo incus copy alice-container alice-backup

# Set memory limit
sudo incus config set alice-container limits.memory 4GB
# Set CPU limit
sudo incus config set alice-container limits.cpu 2
# View container metrics
sudo incus info alice-container
# Resize disk quota
sudo containarium resize alice --disk-quota 100GB

cd terraform/gce
# Initialize Terraform
terraform init
# Preview changes
terraform plan
# Deploy infrastructure
terraform apply
# Deploy with custom variables
terraform apply -var-file=examples/horizontal-scaling-3-servers.tfvars
# Show outputs
terraform output
# Get specific output
terraform output jump_server_ip

# Update infrastructure
terraform apply
# Destroy specific resource
terraform destroy -target=google_compute_instance.jump_server_spot[0]
# Destroy everything
terraform destroy
# Import existing resource
terraform import google_compute_instance.jump_server projects/my-project/zones/us-central1-a/instances/my-instance
# Refresh state
terraform refresh

# Backup ZFS pool
sudo zfs snapshot incus-pool@backup-$(date +%Y%m%d)
# List snapshots
sudo zfs list -t snapshot
# Rollback to snapshot
sudo zfs rollback incus-pool@backup-20240115
# Export container
sudo incus export alice-container alice-backup.tar.gz
# Import container
sudo incus import alice-backup.tar.gz

# Check ZFS pool status
sudo zpool status
# Check disk usage
sudo zfs list
# Check container resource usage
sudo incus list --columns ns4mDcup
# View system load
htop
# Check Incus daemon status
sudo systemctl status incus

1. "cannot lock /etc/passwd" Error
This occurs when google_guest_agent is managing users while Containarium tries to create jump server accounts.
Solution: Containarium includes automatic retry logic with exponential backoff:
- ✅ Pre-checks for lock files before attempting
- ✅ 6 retry attempts with exponential backoff (500ms → 30s)
- ✅ Jitter to prevent thundering herd
- ✅ Smart error detection (only retries lock errors)
If retries are exhausted, check agent activity:
# Check what google_guest_agent is doing
sudo journalctl -u google-guest-agent --since "5 minutes ago" | grep -E "account|user|Updating"
# Temporarily disable account management (if needed)
sudo systemctl stop google-guest-agent
sudo containarium create alice --ssh-key ~/.ssh/alice.pub
sudo systemctl start google-guest-agent
# Or wait and retry - agent usually releases lock within 30-60 seconds

2. Container Network Issues
# View Incus logs
sudo journalctl -u incus -f
# Check container network
sudo incus network list
sudo incus network show incusbr0
# Restart Incus daemon
sudo systemctl restart incus

3. Infrastructure Issues
# Check startup script logs (GCE)
gcloud compute instances get-serial-port-output <instance-name> --zone=<zone>
# Verify ZFS health
sudo zpool scrub incus-pool
sudo zpool status -v

Check daemon status:
# View daemon status
sudo systemctl status containarium
# View daemon logs
sudo journalctl -u containarium -f
# Restart daemon
sudo systemctl restart containarium
# Test daemon connection (with mTLS)
containarium list \
--server 35.229.246.67:50051 \
--certs-dir ~/.config/containarium/certs

Direct gRPC testing with grpcurl:
# Install grpcurl if needed
go install github.com/fullstorydev/grpcurl/cmd/grpcurl@latest
# List services (with mTLS)
grpcurl -cacert /etc/containarium/certs/ca.crt \
-cert /etc/containarium/certs/client.crt \
-key /etc/containarium/certs/client.key \
35.229.246.67:50051 list
# Create container via gRPC (with mTLS)
grpcurl -cacert /etc/containarium/certs/ca.crt \
-cert /etc/containarium/certs/client.crt \
-key /etc/containarium/certs/client.key \
-d '{"username": "alice", "ssh_keys": ["ssh-ed25519 AAA..."]}' \
35.229.246.67:50051 containarium.v1.ContainerService/CreateContainer

# Create multiple containers
for user in alice bob charlie; do
sudo containarium create $user --ssh-key ~/.ssh/${user}.pub
done
# Enable autostart for all containers
sudo incus list --format csv -c n | while read name; do
sudo incus config set $name boot.autostart true
done
# Snapshot all containers
for container in $(sudo incus list --format csv -c n); do
sudo incus snapshot $container "backup-$(date +%Y%m%d)"
done

# terraform.tfvars
use_spot_instance = true # 76% cheaper
use_persistent_disk = true # Survives restarts
machine_type = "n2-standard-8" # 32GB RAM, 50 users# terraform.tfvars
enable_horizontal_scaling = true
jump_server_count = 3 # 3 independent servers
enable_load_balancer = true # SSH load balancing
use_spot_instance = true

Deploy:
cd terraform/gce
terraform init
terraform plan
terraform apply

- Deployment Guide - START HERE! Complete workflow from zero to running containers
- Production Deployment - PRODUCTION READY! Remote state, secrets management, CI/CD
- Horizontal Scaling Quick Start - Deploy 3-5 jump servers
- SSH Jump Server Setup - SSH configuration guide
- Spot Instances & Scaling - Cost optimization
- Horizontal Scaling Architecture - Scaling strategies
- Terraform GCE README - Deployment details
- Implementation Plan - Architecture & roadmap
- Testing Architecture - E2E testing with Terraform
❌ Create 1 GCE VM per user
❌ Each VM: 2-4GB RAM (most unused)
❌ Cost: $25-50/month per user
❌ Slow: 30-60 seconds to provision
❌ Unmanageable: 50+ VMs to maintain

✅ 1 GCE VM hosts 50 containers
✅ Each container: 100-500MB RAM (efficient)
✅ Cost: $1.96-2.08/month per user
✅ Fast: <60 seconds to provision
✅ Scalable: Add servers as you grow
✅ Resilient: Spot instances + persistent storage
| Metric | VM-per-User | Containarium | Improvement |
|---|---|---|---|
| Memory/User | 2-4 GB | 100-500 MB | 10x |
| Startup Time | 30-60s | 2-5s | 12x |
| Density | 2-3/host | 150/host | 50x |
| Cost (50 users) | $1,250/mo | $98/mo | 92% savings |
- Separate User Accounts: Each user has proxy-only account on jump server
- No Shell Access: User accounts use /usr/sbin/nologin (cannot execute commands on jump server)
- SSH Key Auth: Password authentication disabled globally
- Per-User Isolation: Users can only access their own containers
- Admin Separation: Only admin account has jump server shell access
- Unprivileged Containers: Container root β host root (UID mapping)
- Resource Limits: CPU, memory, disk quotas per container
- Network Isolation: Separate network namespace per container
- AppArmor Profiles: Additional security layer per container
- Firewall Rules: Restrict SSH access to known IPs
- fail2ban Integration: Auto-block brute force attacks per user account
- DDoS Protection: Per-account rate limiting
- Private Container IPs: Only jump server has public IP
- SSH Audit Logging: Track all connections by user account
- Per-User Logs: Separate logs for each user (alice@jump → container)
- Container Access Logs: Track who accessed which container and when
- Security Event Alerts: Monitor for suspicious activity
- Persistent Disk Encryption: Data encrypted at rest
- Automated Backups: Daily snapshots with 30-day retention
- ZFS Checksums: Detect data corruption automatically
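If you also want copies on the ZFS side (in addition to GCE disk snapshots), a hedged daily-snapshot sketch following the manual zfs command shown earlier in this README (pool name assumed, retention logic simplified):

# Take a date-stamped snapshot of the pool (mirrors the manual backup command)
sudo zfs snapshot incus-pool@daily-$(date +%Y%m%d)

# Prune snapshots older than 30 days (simplified retention sketch)
cutoff=$(date -d '30 days ago' +%Y%m%d)
sudo zfs list -H -t snapshot -o name | grep '^incus-pool@daily-' | while read -r snap; do
  day=${snap##*daily-}
  [ "$day" -lt "$cutoff" ] && sudo zfs destroy "$snap"
done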
| Configuration | Users | Servers | Cost/Month | Use Case |
|---|---|---|---|---|
| Dev/Test | 20-50 | 1 spot | $98 | Development, testing |
| Small Team | 50-100 | 1 regular | $242 | Production, small team |
| Medium Team | 100-150 | 3 spot | $312 | Production, medium team |
| Large Team | 200-250 | 5 spot | $508 | Production, large team |
| Enterprise | 500+ | 10+ or cluster | Custom | Enterprise scale |
- Phase 1: Protobuf contracts - Type-safe gRPC contracts for container operations
- Phase 2: Go CLI framework - Full-featured CLI with create, delete, list, info, resize commands
- Phase 3: Terraform GCE deployment - Single-server and horizontal scaling configurations
- Phase 4: Spot instances + persistent disk - ZFS-backed storage surviving spot termination
- Phase 5: Horizontal scaling with load balancer - Multi-server SSH load balancing
- Phase 6: Container management (Incus integration) - Complete LXC container lifecycle management
- Phase 7: End-to-end testing - Terraform-based E2E tests with ZFS persistence validation
- Phase 8: gRPC daemon with mTLS - Remote container management with mutual TLS authentication
- Phase 9: Secure multi-tenant architecture - Proxy-only jump server accounts, per-user isolation
- Phase 10: Production deployment guides - Complete documentation for production deployments
- Phase 11: Monitoring & observability - Metrics, alerts, and dashboards for production systems
- Phase 12: Automated backup & disaster recovery - Scheduled snapshots and recovery procedures
- Phase 13: AWS support - Terraform modules for AWS EC2 deployment
- Phase 14: Azure support - Terraform modules for Azure VM deployment
- Phase 15: Web UI dashboard - Browser-based container management interface
- Phase 16: Container templates - Pre-configured environments (ML, web dev, data science)
- Phase 17: Resource usage analytics - Per-user and per-container cost tracking
- Phase 18: Auto-scaling - Dynamic server provisioning based on demand
Containers automatically restart when spot instances recover:
# Spot VM terminated → Containers stop
# VM restarts → Containers auto-start (boot.autostart=true)
# Downtime: 2-5 minutes
# Data: Preserved on persistent disk

# Automatic snapshots enabled by default
enable_disk_snapshots = true
# 30-day retention
# Point-in-time recovery available

# SSH traffic distributed across healthy servers
# Session affinity keeps users on same server
# Health checks on port 22Q: Can multiple users share the same SSH key?
A: No, never! Each user must have their own SSH key pair. Sharing keys:
- Violates security best practices
- Makes it impossible to revoke access for one user
- Prevents audit logging of who accessed what
- Creates compliance issues
Q: What's the difference between admin and user accounts?
A: Containarium uses separate user accounts for security:
- Admin account: Full access to jump server shell, can manage containers
- User accounts: Proxy-only (no shell), can only connect to their container
Example:
/home/admin/.ssh/authorized_keys (admin - full shell access)
/home/alice/.ssh/authorized_keys (alice - proxy only, /usr/sbin/nologin)
/home/bob/.ssh/authorized_keys (bob - proxy only, /usr/sbin/nologin)
Q: Can I use the same key for both jump server and my container?
A: Yes, and that's the recommended approach! Each user has ONE key that works for both:
- Simpler for users (one key to manage)
- Same key authenticates to jump server account (proxy-only)
- Same key authenticates to container (full access)
- Users cannot access jump server shell (secured by /usr/sbin/nologin)
- Admin can still track per-user activity in logs
Q: How do I rotate SSH keys?
A:
# User generates new key
ssh-keygen -t ed25519 -C "[email protected]" -f ~/.ssh/containarium_alice_new
# Admin replaces old key with new key
NEW_KEY='ssh-ed25519 AAAAC3... alice@new-laptop'
echo "$NEW_KEY" | sudo incus exec alice-container -- \
tee /home/alice/.ssh/authorized_keys > /dev/null

Q: Can users access the jump server itself?
A: No! User accounts are configured with /usr/sbin/nologin:
- Users can proxy through jump server to their container
- Users CANNOT get a shell on the jump server
- Users CANNOT see other containers or system processes
- Users CANNOT inspect other users' data
- Only admin has shell access to jump server
Q: Can one user access another user's container?
A: No! Each user only has SSH keys for their own container. Users cannot:
- Access other users' containers (no SSH keys for them)
- Become admin on the jump server (no shell access)
- See other users' data or processes (isolated)
- Execute commands on jump server (nologin shell)
Q: How does fail2ban protect against attacks?
A: With separate user accounts, fail2ban provides granular protection:
- Per-user banning: If alice's IP attacks, only alice is blocked
- Other users unaffected: bob, charlie continue working normally
- Audit trail: Logs show which user account was targeted
- DDoS isolation: Attacks on one user don't impact others
- Automatic recovery: Banned IPs are unbanned after timeout
Q: Why is separate user accounts more secure than shared admin?
A: Shared admin account (everyone uses admin) has serious flaws:
❌ Without separate accounts:
- All users can execute commands on jump server
- All users can see all containers (incus list)
- All users can inspect system processes (ps aux)
- All users can spy on other users
- Logs show only "admin" - can't tell who did what
- Banning one attacker affects all users

✅ With separate accounts (our design):
- Users cannot execute commands (nologin shell)
- Users cannot see other containers
- Users cannot inspect system
- Each user's activity logged separately
- Per-user banning without affecting others
- Follows principle of least privilege
Q: What happens if I lose my SSH private key?
A: You'll need to:
- Generate a new SSH key pair
- Send new public key to admin
- Admin updates your container with new key
- Old key is automatically invalid
Q: Can I have different keys for work laptop and home laptop?
A: Yes! You can have multiple public keys in your container:
# Admin adds second key for same user
sudo incus exec alice-container -- bash -c \
"echo 'ssh-ed25519 AAAAC3... alice@home-laptop' >> /home/alice/.ssh/authorized_keys"Q: What happens when a spot instance is terminated?
A: Containers automatically restart:
- Spot instance terminated (by GCP)
- Persistent disk preserved (data safe)
- Instance restarts within ~5 minutes
- Containers auto-start (boot.autostart=true)
- Users can reconnect
- Downtime: 2-5 minutes
- Data loss: None
Q: How many containers can fit on one server?
A: Depends on machine type:
- e2-standard-2 (8GB RAM): 10-15 containers
- n2-standard-4 (16GB RAM): 20-30 containers
- n2-standard-8 (32GB RAM): 40-60 containers
Each container uses ~100-500MB RAM depending on workload.
Q: Can containers run Docker?
A: Yes! Each container has Docker pre-installed and working.
# Inside your container
docker run hello-world
docker-compose up -d
docker build -t myapp .  # Docker builds work with Incus 6.19+

Important: Requires Incus 6.19 or later on the host. Earlier versions (including Ubuntu 24.04's default Incus 6.0.0) have an AppArmor bug (CVE-2025-52881) that breaks Docker builds in unprivileged containers. Use the Zabbly Incus repository for the latest stable builds.
Q: Is my data backed up?
A: If you enabled snapshots:
# In terraform.tfvars
enable_disk_snapshots = true

Automatic daily snapshots with 30-day retention.
Q: Can I resize my container's disk quota?
A: Yes:
# Increase quota to 50GB
sudo containarium resize alice --disk-quota 50GB

Q: How do I install software in my container?
A:
# SSH to your container
ssh my-dev
# Install packages as usual
sudo apt update
sudo apt install vim git tmux htop
# Or use Docker
docker run -it ubuntu bash

Contributions are welcome! Please:
- Read the Implementation Plan
- Check existing issues and PRs
- Follow the existing code style
- Add tests for new features
- Update documentation
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
- Incus - Modern LXC container manager
- Protocol Buffers - Type-safe data contracts
- Cobra - Powerful CLI framework
- Terraform - Infrastructure as Code
- Documentation: See docs/ directory
- Issues: GitHub Issues
- Organization: FootprintAI
- Horizontal Scaling Guide
- SSH Setup Guide
- Cost & Scaling Strategies
- Implementation Plan
- Terraform Examples
# Clone repo
git clone https://github.com/footprintai/Containarium.git
cd Containarium/terraform/gce
# Choose your size and configure
cp examples/single-server-spot.tfvars terraform.tfvars
vim terraform.tfvars # Add: project_id, admin_ssh_keys, allowed_ssh_sources
# Deploy to GCP
terraform init
terraform apply # Creates VM with Incus pre-installed
# Save the jump server IP from output!

# Build for Linux
cd ../..
make build-linux
# Copy to jump server
scp bin/containarium-linux-amd64 admin@<jump-server-ip>:/tmp/
# SSH and install
ssh admin@<jump-server-ip>
sudo mv /tmp/containarium-linux-amd64 /usr/local/bin/containarium
sudo chmod +x /usr/local/bin/containarium
exit

Each user must generate their own SSH key pair first:
# User generates their key (on their local machine)
ssh-keygen -t ed25519 -C "[email protected]" -f ~/.ssh/containarium_alice
# User sends public key file to admin: ~/.ssh/containarium_alice.pub

Admin creates containers with users' public keys:
# SSH to jump server
ssh admin@<jump-server-ip>
# Save users' public keys (received from users)
echo 'ssh-ed25519 AAAAC3... [email protected]' > /tmp/alice.pub
echo 'ssh-ed25519 AAAAC3... [email protected]' > /tmp/bob.pub
# Create containers with users' keys
sudo containarium create alice --ssh-key /tmp/alice.pub --image images:ubuntu/24.04
sudo containarium create bob --ssh-key /tmp/bob.pub --image images:ubuntu/24.04
sudo containarium create charlie --ssh-key /tmp/charlie.pub --image images:ubuntu/24.04
# Output:
# ✓ Container alice-container created successfully!
# ✓ Jump server account: alice (proxy-only, no shell access)
# IP Address: 10.0.3.166
# Enable auto-start (survive spot instance restarts)
sudo incus config set alice-container boot.autostart true
sudo incus config set bob-container boot.autostart true
sudo incus config set charlie-container boot.autostart true
# Export SSH configs for users
sudo containarium export alice --jump-ip <jump-server-ip> > alice-ssh-config.txt
sudo containarium export bob --jump-ip <jump-server-ip> > bob-ssh-config.txt
sudo containarium export charlie --jump-ip <jump-server-ip> > charlie-ssh-config.txt
# Send config files to users
# List all containers
sudo containarium list

Admin sends exported SSH config to each user:
- Send alice-ssh-config.txt to Alice
- Send bob-ssh-config.txt to Bob
- Send charlie-ssh-config.txt to Charlie
Users add to their ~/.ssh/config:
# Alice on her laptop
cat alice-ssh-config.txt >> ~/.ssh/config
# Connect!
ssh alice-dev
# Alice is now in her Ubuntu container with Docker!
docker run hello-world

Done!
See Deployment Guide for complete details.
Made with ❤️ by the FootprintAI team
Save 92% on cloud costs. Deploy in 5 minutes. Scale to 250+ users.