Support clusters with internet access only for Ansible via proxy #679

Draft
wants to merge 22 commits into
base: main
from

sjpb
Collaborator

@sjpb sjpb commented May 23, 2025

This PR adds the ability to create clusters whose users have no internet access. It uses an HTTP proxy with basic auth, with the proxy environment variables set via Ansible's remote environment support rather than in e.g. /etc/environment. The authenticated proxy is therefore available only to Ansible tasks, and the credentials are never exposed to cluster users.

Currently deploying clusters in this way supports:

  • Groups enabled in the "everything" template environments/common/layouts/everything (i.e. features enabled by default in a new cookiecutter environment)
  • Groups enabled in the .stackhpc environment.
  • freeipa_client group for FreeIPA client (NB: not tested)

Plan

  • Configure test env again
    In a branch off this:
  • Disable all repos in a hook
  • Work out which appliances_environment_vars are actually needed then
    In here:
  • Back out the proxy_tasks_only var and replace with docs / common vars

Still to do

  • I'm not convinced the proxy_plays_only feature is good. In reality, for odd networking like this you really don't want to be running site.yml every time during dev, so you'll push the proxy vars into the environment anyway. I don't know if there's a reasonably DRY way to help with that.
  • For client, I added a hook which disables all the enabled repos. That avoids dnf problems where we don't actually need to install packages but dnf would otherwise try to update the repos. We could maybe add something similar to dnf_repos to do that properly here, and to make fatimages ship with all repos disabled. That would probably let us remove some of the appliances_environment_vars below.
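As a rough illustration of the repo-disabling hook mentioned above, the sketch below shows one way it might look. The hook path, the host group, and the use of the dnf config-manager plugin (from dnf-plugins-core) are assumptions, not taken from the PR's actual hook:

```yaml
# environments/$ENV/hooks/pre.yml  (hypothetical hook location)
- name: Disable all enabled dnf repos
  hosts: cluster            # hypothetical group
  become: true
  gather_facts: false
  tasks:
    - name: Disable every repo so dnf does not try to refresh metadata
      # Assumes the config-manager plugin (dnf-plugins-core) is in the image.
      ansible.builtin.command:
        cmd: dnf config-manager --set-disabled '*'
      changed_when: true    # the command gives no reliable change signal
```

This only edits the repo definitions on disk, so it should not itself need network access.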

PR changes

  • Modifies squid role to allow configuration of basic auth.
  • Skips dnf install tasks for packages already installed in StackHPC images, to avoid triggering dnf cache updates.
  • Adds proxy_plays_only to proxy role to allow configuring proxy variables via the remote environment.
  • Adds variables proxy_basic_user, proxy_basic_password to common environment to allow easy configuration of proxies with basic auth
  • Adds proxy_http_address to common environment to allow easy overriding of default proxy address when using an appliance-deployed squid where the hostname is not resolvable.
  • Adds environment: keyword to plays requiring internet access.
  • TEMPORARY: Removes any grafana dashboards sourced from downloads (the curl command used for other reasons doesn’t pick up the proxy vars). These should be replaced by building them into the image or adding them to the appliance. (We want these in the appliance anyway, to perform upgrades to grafana v11 at some point.)
  • Think about image build
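To make the new common-environment variables concrete, here is a minimal sketch of how an environment might set them. The user name, vault variable, IP address and file location are hypothetical; the port assumes squid's default of 3128; and the way proxy_http_proxy is composed is an assumption based on the http://user:password@address:port format described in this PR, not necessarily how the common environment does it:

```yaml
# environments/$ENV/inventory/group_vars/all/proxy.yml  (assumed location)
proxy_basic_user: ansible-proxy                            # hypothetical user
proxy_basic_password: "{{ vault_proxy_basic_password }}"   # hypothetical vaulted secret
proxy_http_address: 10.0.0.10                              # hypothetical squid IP

# Assumed composition of the full proxy URL from the parts above:
proxy_http_proxy: "http://{{ proxy_basic_user }}:{{ proxy_basic_password }}@{{ proxy_http_address }}:3128"
```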

Note

There are docs for this functionality in docs/experimental/isolated-clusters.md


Stuff below here needs tidying / removing where covered by docs.

To configure this:

  1. In your environment:
    1. Enable proxying for the cluster by adding cluster to the proxy group, see e.g. environments/.stackhpc/inventory/extra_groups
    2. Set proxy_http_proxy as for environments/.stackhpc/inventory/group_vars/all/proxy.yml. Note that this should be in the format http://user:password@address:port. In general address should be an IP address, not a hostname (TODO: explain). The corresponding proxy_https and no_proxy variables are derived automatically, see ansible/roles/proxy/defaults.md
    3. Set proxy_plays_only to true as shown in 2). This means the proxy is only available to Ansible.

Note: because the proxy role must be run to set the host facts for proxying, this precludes running individual playbooks during development. To work around this for development it would be possible to set the relevant variables directly, so it might look like:

# environments/$ENV/inventory/group_vars/all/proxy.yml

# your actual proxy configuration as per 1.b above:
proxy_http_proxy: http://user:password@address:port
proxy_plays_only: true

# from proxy role defaults:
proxy_https_proxy: "{{ proxy_http_proxy }}"
proxy_no_proxy_defaults: "{{ ['localhost', '127.0.0.1'] + groups['all'] + hostvars.values() | map(attribute='ansible_host') | list }}"
proxy_no_proxy_extras: []
proxy_no_proxy: "{{ (proxy_no_proxy_defaults + proxy_no_proxy_extras) | unique | sort | join(',') }}"
proxy_dnf: "{{ not proxy_plays_only }}"
proxy_systemd: "{{ not proxy_plays_only }}"

# then do what the role actually does:
appliances_environment_vars:
  http_proxy: "{{ proxy_http_proxy }}"
  https_proxy: "{{ proxy_https_proxy }}"
  no_proxy: "{{ proxy_no_proxy }}"

(TODO: consider if we should just set these in common inventory to facilitate this?)

Why the appliance can’t run without internet access:

  • While most repos point to Ark and are disabled, some are not disabled (cernvm, ondemand-web, TurboVNC, epel-cisco-openh264, epel.rpmnew), hence any dnf tasks require internet access as it will try to update the cache for those repos.
  • For in-appliance roles, the tasks running dnf can generally be skipped using an appliances_mode flag, where the packages are present in the image. But for “external” roles this is not always possible if the dnf tasks are not broken out into.
  • Some tasks do actually require internet access to install things not present in the image, generally either dnf packages or pypi packages.
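A sketch of how an in-appliance role might skip a dnf task using the appliances_mode flag is shown below. The package variable and the specific mode value checked (`configure`) are assumptions for illustration, not copied from the appliance:

```yaml
# tasks/install.yml in a hypothetical in-appliance role
- name: Install role packages
  ansible.builtin.dnf:
    name: "{{ example_role_packages }}"   # hypothetical variable
    state: present
  # Assumed convention: packages are already in the image when the appliance
  # is only being configured, so skipping avoids a dnf cache refresh.
  when: appliances_mode != 'configure'
```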

Plays with

environment: "{{ appliances_environment_vars }}"

are the ones currently requiring internet access.
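For illustration, a complete play using this keyword might look like the sketch below. The play name, host group and package are hypothetical:

```yaml
- name: Example play needing internet access   # hypothetical play
  hosts: cluster                               # hypothetical group
  become: true
  gather_facts: false
  # The proxy variables are passed only to the processes Ansible runs for
  # this play's tasks; nothing is written to e.g. /etc/environment.
  environment: "{{ appliances_environment_vars }}"
  tasks:
    - name: Install a package not present in the image
      ansible.builtin.dnf:
        name: htop                             # hypothetical package
        state: present
```

Because the variables exist only in the task environment, an interactive user on the same node does not inherit them.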

@sjpb
Collaborator Author

sjpb commented May 23, 2025

Confirmed at above revision that:

  • hpctests run, except the hpl test which tries to download the tar.gz. Maybe we should build that into the image?
  • ondemand dashboard works

@sjpb sjpb changed the title Support cluster deploy with only Ansible tasks proxied Support clusters with internet access only for Ansible via proxy May 23, 2025
proxy_http_address: "{{ hostvars[groups['squid'] | first].ansible_host }}"
```

If using an external squid, at a minimum set `proxy_http_address`. You may

Presumably this would go in environments/$ENV/inventory/group_vars/all/proxy.yml too?

@sjpb
Collaborator Author

sjpb commented Jun 20, 2025

Merged into client branch at 125669f.
