Skip to content

Add comprehensive integration test suite (t/integration/)#31

Open
struanb wants to merge 14 commits into
mainfrom
claude/dockside-integration-tests-u8jdA
Open

Add comprehensive integration test suite (t/integration/)#31
struanb wants to merge 14 commits into
mainfrom
claude/dockside-integration-tests-u8jdA

Conversation

@struanb
Copy link
Copy Markdown
Contributor

@struanb struanb commented Mar 15, 2026

  • CLI: add --host-header / DOCKSIDE_HOST_HEADER flag so tests can connect to
    https://localhost with a correct Host header (needed for local/harness modes)

  • t/integration/config/users.json: four test users (admin, testdev1, testdev2,
    testviewer) with committed test-only Ed25519 SSH keypairs embedded in the
    authorized_keys and keypairs fields

  • t/integration/config/roles.json: role definitions (admin/developer/user)
    matching the roles referenced in users.json

  • t/integration/config/ssh/: committed test-only Ed25519 keypairs for testdev1
    and testdev2 (safe to commit; used only against ephemeral test instances)

  • t/integration/lib/dockside_test.py: DocksideClient (CLI subprocess + cookie
    jar), TestCase base (can_modify_networks, assertions), TestRunner (TAP output,
    signal/atexit cleanup)

  • t/integration/lib/run_tests_main.py: entry point; discovers and runs test
    modules; passes mode/credentials/network-modify flags to runner

  • Tests:
    01_auth Authentication and permission checks
    02_lifecycle_alpine Full create/start/stop/remove (alpine)
    03_lifecycle_debian Full create/start/stop/remove (debian)
    04_access_and_http MERGED: visibility, router filtering, HTTP proxy codes;
    testdev1 is owner, testdev2 is added/removed as developer
    05_edit Edit metadata; viewer cannot edit; developer can
    06_git_profile Git URL / branch / PR creation options
    07_ide IDE access: viewer denied, developer allowed, removed developer denied
    08_network Network assignment; harness-only (or ALLOW_NETWORK_MODIFY=1):
    create/attach/detach unique test networks
    09_ssh Inbound SSH via wstunnel ProxyCommand; outbound via
    integrated ssh-agent; uses committed test keypairs

  • t/integration/harness.sh: pulls image, starts Dockside container, injects
    passwords, exports env vars, registers cleanup trap

  • t/integration/run_tests.sh: mode detection, connection parameters per mode,
    delegates to run_tests_main.py; DOCKSIDE_TEST_ALLOW_NETWORK_MODIFY override

  • test.sh: add opt-in 'integration' check category

  • docs/setup.md: add comprehensive Access Control Model section covering
    profile auth array vs meta.access; viewer/developer distinctions; router
    filtering; IDE/SSH router restrictions

https://claude.ai/code/session_0159qacE1enywepN5NTcqtuf

claude and others added 14 commits March 15, 2026 19:20
- CLI: add --host-header / DOCKSIDE_HOST_HEADER flag so tests can connect to
  https://localhost with a correct Host header (needed for local/harness modes)

- t/integration/config/users.json: four test users (admin, testdev1, testdev2,
  testviewer) with committed test-only Ed25519 SSH keypairs embedded in the
  authorized_keys and keypairs fields

- t/integration/config/roles.json: role definitions (admin/developer/user)
  matching the roles referenced in users.json

- t/integration/config/ssh/: committed test-only Ed25519 keypairs for testdev1
  and testdev2 (safe to commit; used only against ephemeral test instances)

- t/integration/lib/dockside_test.py: DocksideClient (CLI subprocess + cookie
  jar), TestCase base (can_modify_networks, assertions), TestRunner (TAP output,
  signal/atexit cleanup)

- t/integration/lib/run_tests_main.py: entry point; discovers and runs test
  modules; passes mode/credentials/network-modify flags to runner

- Tests:
    01_auth           Authentication and permission checks
    02_lifecycle_alpine  Full create/start/stop/remove (alpine)
    03_lifecycle_debian  Full create/start/stop/remove (debian)
    04_access_and_http   MERGED: visibility, router filtering, HTTP proxy codes;
                         testdev1 is owner, testdev2 is added/removed as developer
    05_edit           Edit metadata; viewer cannot edit; developer can
    06_git_profile    Git URL / branch / PR creation options
    07_ide            IDE access: viewer denied, developer allowed, removed developer denied
    08_network        Network assignment; harness-only (or ALLOW_NETWORK_MODIFY=1):
                      create/attach/detach unique test networks
    09_ssh            Inbound SSH via wstunnel ProxyCommand; outbound via
                      integrated ssh-agent; uses committed test keypairs

- t/integration/harness.sh: pulls image, starts Dockside container, injects
  passwords, exports env vars, registers cleanup trap

- t/integration/run_tests.sh: mode detection, connection parameters per mode,
  delegates to run_tests_main.py; DOCKSIDE_TEST_ALLOW_NETWORK_MODIFY override

- test.sh: add opt-in 'integration' check category

- docs/setup.md: add comprehensive Access Control Model section covering
  profile auth array vs meta.access; viewer/developer distinctions; router
  filtering; IDE/SSH router restrictions

https://claude.ai/code/session_0159qacE1enywepN5NTcqtuf
…tion tests

- shellcheck: add t/integration/run_tests.sh and t/integration/harness.sh
- json: add t/integration/config/users.json and t/integration/config/roles.json
- python (new check): py_compile syntax-check all .py files under t/integration/ and cli/

https://claude.ai/code/session_0159qacE1enywepN5NTcqtuf
_run(): place subcommand before global flags (--server, --output, etc.)
so argparse routes to the correct subparser. These flags are only
registered on subparsers, not the top-level parser; passing them first
caused argparse to see the server URL as the COMMAND positional.

_register_cleanup(): after emergency cleanup, restore SIG_DFL and
re-raise the signal so the process actually terminates on Ctrl-C
rather than silently continuing.

https://claude.ai/code/session_0159qacE1enywepN5NTcqtuf
Bug 1: run_tests_main.py now exits 0 when called with --cleanup,
preventing the bash EXIT/INT trap from re-running the entire test suite
after completion or on Ctrl-C.

Bug 2: Add _UnavailableClient placeholder class. Non-admin users
(dev1, dev2, viewer) are now validated at startup via list_containers();
if auth fails, their client is replaced with _UnavailableClient so
that any test accessing the unavailable role raises SkipTest and
shows "ok N - ... # SKIP ..." instead of "not ok" with a confusing
auth error message.

https://claude.ai/code/session_0159qacE1enywepN5NTcqtuf
Profile names must be the config filename without .json extension, not the
human-readable name property from the JSON:
  'Stock Image - Alpine Linux' → '10-alpine'
  'Stock Image - Debian'       → '11-debian'
  'Stock Image - NGINX'        → '30-nginx'
  'Git Repo (Branch/PR)'       → '03-git-repo'

Each test file now has a top-level PROFILE_NAME (or PROFILE_ALPINE /
PROFILE_NGINX for 04_access_and_http.py) constant so the profile can be
updated in one place.

Add DOCKSIDE_TEST_DEBUG=1 support to DocksideClient._run: when set, each
CLI invocation's command, return code, stdout, and stderr are printed to
stderr before parsing. Also wrap json.loads so that a JSONDecodeError now
includes the raw stdout in the error message, making mismatched responses
immediately visible without needing the debug flag.

https://claude.ai/code/session_0159qacE1enywepN5NTcqtuf
The $logger closure passed to Expect->log_file() in run_pty() was
unconditionally printing every chunk of docker output to STDOUT via
`print $chunk`. In the forked child process that runs `docker create`,
STDOUT is the inherited HTTP response pipe, so docker's progress output
(e.g. the leading `.`) was being written into the response body before
the parent serialised the JSON, causing JSON parse errors in the CLI.

The docker output is already written to the logfile at `print $fh $_`
so no information is lost.

https://claude.ai/code/session_0159qacE1enywepN5NTcqtuf
All print() calls in wait_for() were going to stdout (the default),
causing progress dots and their trailing newlines to be prepended to the
JSON output when the CLI was invoked with --output json and capture_output
was used. This produced stdout like '.\n{...json...}' which failed JSON
parsing.

Add file=sys.stderr to every print() call inside wait_for() so progress
output is cleanly separated from the machine-readable JSON on stdout.

https://claude.ai/code/session_0159qacE1enywepN5NTcqtuf
…per-test tearDown

Two root causes made tests 04-06 fail:

1. TestRunner._run_class creates a fresh TestCase instance per method and
   calls setUp/tearDown around each one. LifecycleAlpineTests.setUp()
   registered the container for cleanup on every call, so tearDown would
   stop+remove the container after every test — including test_01_create.
   Tests 02/03 passed by chance (async --no-wait removal still visible),
   but by test_04 the docker container was gone and docker start exited 1.
   Each failing test then fired its own tearDown stop+remove pair,
   producing the triple stop/remove pattern visible in the server log.

2. test_01_create produces a running container (status=1). test_04_start
   called docker start on an already-running container, which exits 1 on
   this docker/containerd version.

Fix:
- Add setUpClass/tearDownClass support to TestRunner._run_class. Clients
  are injected as class attributes before the method loop. setUpClass is
  called once (failure marks all methods not ok). tearDownClass is called
  once after all methods and is included in _emergency_cleanup.
- LifecycleAlpineTests: drop register_cleanup from setUp (no per-test
  cleanup); add tearDownClass that stop+removes the container once.
  test_04_start now stops first (container is running after create) then
  starts, actually exercising the start path. Remove now-unnecessary
  _cleanup_names.clear() from test_06_remove.
- Remove the unused _name() classmethod.

https://claude.ai/code/session_0159qacE1enywepN5NTcqtuf
remove() with --no-wait returns immediately; the reservation stays visible
in the list for up to ~30s while dockside's async docker rm completes.
test_06_remove checked the list immediately after the call and always
saw the container still there.

Fix:
- Add wait/timeout params to DocksideClient.remove() (default wait=True).
  When wait=True, omits --no-wait and passes --timeout to the CLI, which
  polls until the reservation is fully gone or status <= -2 with no
  containerId — consistent with start/stop behaviour.
- Update tearDown (base class) and tearDownClass (LifecycleAlpineTests)
  to pass wait=False so cleanup paths stay non-blocking.
- test_06_remove now calls remove() with the default wait=True, so
  assert_not_in fires only after removal is confirmed.

https://claude.ai/code/session_0159qacE1enywepN5NTcqtuf
Reverting remove() default to wait=False (--no-wait) so test_06_remove
stays fast.  Rather than blocking up to 30s for the reservation to
vanish, check its state immediately:
  (a) absent from the list – remove already propagated, or
  (b) present with status <= -2 and no containerId – the docker
      container is gone and dockside is holding the reservation in
      prelaunch state briefly before purging it.

This mirrors the same 'gone' condition used by wait_for() in the CLI,
without the polling delay.

https://claude.ai/code/session_0159qacE1enywepN5NTcqtuf
The docker container is removed asynchronously; immediately after
--no-wait the status is still 0 (exited) and containerId is retained
even at status -3.  The previous 'status <= -2 and not containerId'
condition was therefore always false.

- STATUS_LABELS: add -3 → 'removed' so `dockside get` shows a
  human-readable label instead of the raw integer
- wait_for('gone'): fix condition to status <= -3 (drop the incorrect
  containerId check)
- test_06_remove: poll up to 10s (0.5s interval) for absent or
  status <= -3, keeping typical runtime well under 2s

https://claude.ai/code/session_0159qacE1enywepN5NTcqtuf
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants