|
| 1 | +# Spindle in Docker |
| 2 | + |
| 3 | +This directory contains a set of container recipes and scripts to allow you |
| 4 | +to quickly bring up your own tiny cluster with [docker-compose](https://docs.docker.com/compose/install/), install |
| 5 | +spindle, and give it a try. You will need both [docker-compose](https://docs.docker.com/compose/install/) |
| 6 | +and [Docker](https://docs.docker.com/get-docker/) installed for this tutorial. |
| 7 | + |
| 8 | +## 1. Build Containers |
| 9 | + |
| 10 | +First, let's build a base image containing Slurm on CentOS from the [Dockerfile](Dockerfile) in this directory: |
| 11 | + |
| 12 | +```bash |
| 13 | +$ docker build -t vanessa/slurm:20.11.8 . |
| 14 | +``` |
| 15 | +Then building containers is as easy as: |
| 16 | + |
| 17 | +```bash |
| 18 | +$ docker-compose build |
| 19 | +``` |
| 20 | + |
| 21 | +And then bringing them up: |
| 22 | + |
| 23 | +```bash |
| 24 | +$ docker-compose up -d |
| 25 | +``` |
| 26 | + |
| 27 | +And checking that they are running: |
| 28 | + |
| 29 | +```bash |
| 30 | +$ docker-compose ps |
| 31 | + Name Command State Ports |
| 32 | +------------------------------------------------------------------------ |
| 33 | +c1 /usr/local/bin/docker-entr ... Up 6818/tcp |
| 34 | +c2 /usr/local/bin/docker-entr ... Up 6818/tcp |
| 35 | +mysql docker-entrypoint.sh mysqld Up 3306/tcp, 33060/tcp |
| 36 | +slurmctld /usr/local/bin/docker-entr ... Up 6817/tcp |
| 37 | +slurmdbd /usr/local/bin/docker-entr ... Up 6819/tcp |
| 38 | +``` |
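As a quick sanity check, the rows in the `State` column can be counted with a small helper. This is a minimal sketch, assuming the `docker-compose ps` layout shown above (a header line, a dashed separator, then one row per service); `count_up` is a hypothetical name for illustration, not part of docker-compose.

```bash
# count_up (hypothetical helper): read `docker-compose ps` style output on
# stdin and print the number of rows whose State column is "Up".
# Assumes the first two lines are the header and the dashed separator.
count_up() {
  awk 'NR > 2 && / Up / { n++ } END { print n + 0 }'
}

# Usage (on the host, in the directory with the compose file):
#   docker-compose ps | count_up
```

With all of the services listed above running, the helper should print `5`.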
| 39 | + |
| 40 | +c1 and c2 are the compute nodes of our cluster, and slurmctld acts as the login node. Open a shell on it: |
| 41 | + |
| 42 | +```bash |
| 43 | +$ docker exec -it slurmctld bash |
| 44 | +``` |
| 45 | + |
| 46 | +Try running a job! |
| 47 | + |
| 48 | +```bash |
| 49 | +$ sbatch --wrap="sleep 20" |
| 50 | +$ squeue |
| 51 | + JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) |
| 52 | + 1 normal wrap root R 0:00 1 c1 |
| 53 | +``` |
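To follow one specific job rather than the whole queue, the job ID can be captured at submission time. This is a minimal sketch, assuming `sbatch` prints its usual `Submitted batch job <id>` line on success; `job_id_from_sbatch` is a hypothetical helper name used only for illustration.

```bash
# job_id_from_sbatch (hypothetical helper): read sbatch's output on stdin
# and print the numeric job ID from the "Submitted batch job <id>" line.
job_id_from_sbatch() {
  awk '/^Submitted batch job/ { print $4 }'
}

# Usage (inside the slurmctld container):
#   jobid=$(sbatch --wrap="sleep 20" | job_id_from_sbatch)
#   squeue -j "$jobid"
```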
| 54 | + |
| 55 | +## 2. Install Spindle |
| 56 | + |
| 57 | +Now let's install Spindle from source. First, clone the repository: |
| 58 | + |
| 59 | +```bash |
| 60 | +$ git clone https://github.com/hpc/spindle |
| 61 | +$ cd spindle |
| 62 | +``` |
| 63 | + |
| 64 | +Configure the build with the paths to munge and Slurm, then compile and install: |
| 65 | + |
| 66 | +```bash |
| 67 | +./configure --with-munge-dir=/etc/munge --enable-sec-munge --with-slurm-dir=/etc/slurm --enable-testsuite=no |
| 68 | +make |
| 69 | +make install |
| 70 | +``` |
| 71 | + |
| 72 | +Note that we disable the test suite; otherwise the install fails with an error |
| 73 | +because no MPI library is detected. Now we can see spindle! |
| 74 | + |
| 75 | +``` |
| 76 | +# spindle --help |
| 77 | +Usage: spindle [OPTION...] mpi_command |
| 78 | + |
| 79 | + These options specify what types of files should be loaded through the Spindle |
| 80 | + network |
| 81 | + -a, --reloc-aout=yes|no Relocate the main executable through Spindle. |
| 82 | + Default: yes |
| 83 | + -f, --follow-fork=yes|no Relocate objects in fork'd child processes. |
| 84 | + Default: yes |
| 85 | + -l, --reloc-libs=yes|no Relocate shared libraries through Spindle. |
| 86 | + Default: yes |
| 87 | + -x, --reloc-exec=yes|no Relocate the targets of exec/execv/execve/... |
| 88 | + calls. Default: yes |
| 89 | + -y, --reloc-python=yes|no Relocate python modules (.py/.pyc) files when |
| 90 | + loaded via python. Default: yes |
| 91 | + |
| 92 | + These options specify how the Spindle network should distibute files. Push is |
| 93 | + better for SPMD programs. Pull is better for MPMD programs. Default is push. |
| 94 | + -p, --push Use a push model where objects loaded by any |
| 95 | + process are made available to all processes |
| 96 | + -q, --pull Use a pull model where objects are only made |
| 97 | + available to processes that require them |
| 98 | + |
| 99 | + These options configure Spindle's network model. Typical Spindle runs should |
| 100 | + not need to set these. |
| 101 | + -c, --cobo Use a tree-based cobo network for distributing |
| 102 | + objects |
| 103 | + -t, --port=port1-port2 TCP/IP port range for Spindle servers. Default: |
| 104 | + 21940-21964 |
| 105 | + |
| 106 | + These options specify the security model Spindle should use for validating TCP |
| 107 | + connections. Spindle will choose a default value if no option is specified. |
| 108 | + --security-munge Use munge for security authentication |
| 109 | + |
| 110 | + These options specify the job launcher Spindle is being run with. If |
| 111 | + unspecified, Spindle will try to autodetect. |
| 112 | + --launcher-startup Launch spindle daemons using the system's job |
| 113 | + launcher (requires an already set-up session). |
| 114 | + --no-mpi Run serial jobs instead of MPI job |
| 115 | + --openmpi MPI job is launched with the OpenMPI job jauncher. |
| 116 | + |
| 117 | + --slurm MPI job is launched with the srun job launcher. |
| 118 | + --wreck MPI Job is launched with the wreck job launcher. |
| 119 | + |
| 120 | + Options for managing sessions, which can run multiple jobs out of one spindle |
| 121 | + cache. |
| 122 | + --end-session=session-id End a persistent Spindle session with the |
| 123 | + given session-id |
| 124 | + --run-in-session=session-id |
| 125 | + Run a new job in the given session |
| 126 | + --start-session Start a persistent Spindle session and print the |
| 127 | + session-id to stdout |
| 128 | + |
| 129 | + Misc options |
| 130 | + -b, --shmcache-size=size Size of client shared memory cache in kilobytes, |
| 131 | + which can be used to improve performance if |
| 132 | + multiple processes are running on each node. |
| 133 | + Default: 0 |
| 134 | + --cache-prefix=path Alias for python-prefix |
| 135 | + --cleanup-proc=yes|no Fork a dedicated process to clean-up files |
| 136 | + post-spindle. Useful for high-fault situations. |
| 137 | + Default: no |
| 138 | + -d, --debug=yes|no If yes, hide spindle from debuggers so they think |
| 139 | + libraries come from the original locations. May |
| 140 | + cause extra overhead. Default: yes |
| 141 | + -e, --preload=FILE Provides a text file containing a white-space |
| 142 | + separated list of files that should be relocated |
| 143 | + to each node before execution begins |
| 144 | + --enable-rsh=yes|no Enable startint daemons with an rsh tree, if the |
| 145 | + startup mode supports it. Default: No |
| 146 | + --hostbin=EXECUTABLE Path to a script that returns the hostlist for a |
| 147 | + job on a cluster |
| 148 | + -h, --no-hide Don't hide spindle file descriptors from |
| 149 | + application |
| 150 | + -k, --audit-type=subaudit|audit |
| 151 | + Use the new-style subaudit interface for |
| 152 | + intercepting ld.so, or the old-style audit |
| 153 | + interface. The subaudit option reduces memory |
| 154 | + overhead, but is more complex. Default is audit. |
| 155 | + --msgcache-buffer=size Enables message buffering if size is non-zero, |
| 156 | + otherwise sets the size of the buffer in |
| 157 | + kilobytes |
| 158 | + --msgcache-timeout=timeout Enables message buffering if size is |
| 159 | + non-zero, otherwise sets the buffering timeout in |
| 160 | + milliseconds |
| 161 | + -n, --noclean=yes|no Don't remove local file cache after execution. |
| 162 | + Default: no (removes the cache) |
| 163 | + -o, --location=directory Back-end directory for storing relocated files. |
| 164 | + Should be a non-shared location such as a ramdisk. |
| 165 | + Default: $TMPDIR |
| 166 | + --persist=yes|no Allow spindle servers to persist after the last |
| 167 | + client job has exited. Default: No |
| 168 | + -r, --python-prefix=path Colon-seperated list of directories that contain |
| 169 | + the python install location |
| 170 | + -s, --strip=yes|no Strip debug and symbol information from binaries |
| 171 | + before distributing them. Default: yes |
| 172 | + |
| 173 | + -?, --help Give this help list |
| 174 | + --usage Give a short usage message |
| 175 | + -V, --version Print program version |
| 176 | + |
| 177 | +Mandatory or optional arguments to long options are also mandatory or optional |
| 178 | +for any corresponding short options. |
| 179 | + |
| 180 | + |
| 181 | +``` |
| 182 | + |
| 183 | +## 3. Use Spindle |
| 184 | + |
| 185 | +**TODO** we need a dummy example here |
| 186 | + |
| 187 | + |
| 188 | +## 4. Clean Up |
| 189 | + |
| 190 | +When you are done, exit from the container, then stop and remove the containers: |
| 191 | + |
| 192 | +```bash |
| 193 | +$ docker-compose stop |
| 194 | +$ docker-compose rm |
| 195 | +``` |