Commit b58f446

first work to add containers

The developer should easily be able to test spindle, and the user should be able to run a small example or tutorial. Ideally we can also extend a container to be able to build and test in CI.

Signed-off-by: vsoch <[email protected]>
1 parent a89872e commit b58f446

2 files changed, +209 -36 lines changed

docker/Dockerfile

+14 -36
@@ -1,15 +1,14 @@
 FROM centos:7
 
-# docker build -t vanessa/slurm:18.08.6 .
+# docker build -t vanessa/slurm:20.11.8 .
 
-LABEL org.label-schema.vcs-url="https://github.com/vsoch/ood-compose" \
+LABEL org.label-schema.vcs-url="https://github.com/hpc/spindle" \
       org.label-schema.docker.cmd="docker-compose up -d" \
-      org.label-schema.name="ood-composer" \
-      org.label-schema.description="Open On Demand with SLURM on Centos 7" \
+      org.label-schema.name="spindle" \
+      org.label-schema.description="Spindle with SLURM on Centos 7" \
       maintainer="Vanessa Sochat"
 
-ARG SLURM_TAG=slurm-18-08-6-2
-ARG GOSU_VERSION=1.11
+ARG SLURM_TAG=slurm-20-11-8-1
 
 RUN set -ex \
     && yum makecache fast \
@@ -28,28 +27,19 @@ RUN set -ex \
        munge-devel \
        python-devel \
        python-pip \
-       python34 \
-       python34-devel \
-       python34-pip \
+       python3 \
+       python3-devel \
+       python3-pip \
        mariadb-server \
        mariadb-devel \
        psmisc \
        bash-completion \
        vim-enhanced \
+       automake \
     && yum clean all \
     && rm -rf /var/cache/yum
 
-RUN pip install Cython nose && pip3.4 install Cython nose
-
-RUN set -ex \
-    && wget -O /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-amd64" \
-    && wget -O /usr/local/bin/gosu.asc "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-amd64.asc" \
-    && export GNUPGHOME="$(mktemp -d)" \
-    && gpg --keyserver ha.pool.sks-keyservers.net --recv-keys B42F6819007F00F88E364FD4036A9C25BF357DD4 \
-    && gpg --batch --verify /usr/local/bin/gosu.asc /usr/local/bin/gosu \
-    && rm -rf "${GNUPGHOME}" /usr/local/bin/gosu.asc \
-    && chmod +x /usr/local/bin/gosu \
-    && gosu nobody true
+RUN pip install Cython nose && pip3 install Cython nose
 
 RUN set -x \
     && git clone https://github.com/SchedMD/slurm.git \
@@ -91,21 +81,9 @@ COPY slurmdbd.conf /etc/slurm/slurmdbd.conf
 COPY docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
 ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]
 
-# Install Open On Demand, Singularity
 RUN yum install -y net-tools openssh-server openssh-clients singularity && \
-    yum install -y epel-release centos-release-scl lsof sudo httpd24-mod_ssl httpd24-mod_ldap && \
-    yum install -y https://yum.osc.edu/ondemand/latest/ondemand-release-web-latest-1-2.el7.noarch.rpm && \
-    yum install --nogpgcheck -y ondemand && \
-    mkdir -p /etc/ood/config/clusters.d && \
-    mkdir -p /etc/ood/config/apps/shell
-
-COPY ./ood_portal.yml /etc/ood/config/ood_portal.yml
-RUN /opt/ood/ood-portal-generator/sbin/update_ood_portal && \
-    systemctl enable httpd24-httpd && \
-    groupadd ood && \
-    useradd --create-home --gid ood ood && \
-    echo -n "ood" | passwd --stdin ood && \
-    scl enable httpd24 -- htdbm -bc /opt/rh/httpd24/root/etc/httpd/.htpasswd.dbm ood ood
+    yum install -y epel-release centos-release-scl lsof sudo httpd24-mod_ssl httpd24-mod_ldap
 
-COPY launch-httpd /usr/local/bin/
-CMD ["/usr/local/bin/launch-httpd"]
+RUN groupadd spindle && \
+    useradd --create-home --gid spindle spindle && \
+    echo -n "spindle" | passwd --stdin spindle

docker/README.md

+195
@@ -0,0 +1,195 @@
+# Spindle in Docker
+
+This directory contains a set of container recipes and scripts to allow you
+to quickly bring up your own tiny cluster with [docker-compose](https://docs.docker.com/compose/install/), install
+spindle, and give it a try. You will need both [docker-compose](https://docs.docker.com/compose/install/)
+and [Docker](https://docs.docker.com/get-docker/) installed for this tutorial.
+
+## 1. Build Containers
+
+First, let's build a base container with Slurm on CentOS using the [Dockerfile](Dockerfile) here:
+
+```bash
+$ docker build -t vanessa/slurm:20.11.8 .
+```
+Then building the containers is as easy as:
+
+```bash
+$ docker-compose build
+```
+
+And then bringing them up:
+
+```bash
+$ docker-compose up -d
+```
+
+And checking that they are running:
+
+```bash
+$ docker-compose ps
+  Name                 Command               State          Ports
+------------------------------------------------------------------------
+c1          /usr/local/bin/docker-entr ...   Up      6818/tcp
+c2          /usr/local/bin/docker-entr ...   Up      6818/tcp
+mysql       docker-entrypoint.sh mysqld      Up      3306/tcp, 33060/tcp
+slurmctld   /usr/local/bin/docker-entr ...   Up      6817/tcp
+slurmdbd    /usr/local/bin/docker-entr ...   Up      6819/tcp
+```
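
The "are they running" check can also be scripted. The following is a minimal sketch and not part of the commit: `services_up` is a hypothetical helper that reads the `docker-compose ps` table on stdin and succeeds only if every named service has a row in the `Up` state. It assumes the table layout shown in the README (service name in the first column, `Up` as a separate word); other Compose versions may print a different layout.

```shell
# Hypothetical helper (not in the repository): verify that each service
# named on the command line has an "Up" row in `docker-compose ps` output.
# The table is read from stdin, so the function can be exercised without Docker.
services_up() {
    table=$(cat)
    for service in "$@"; do
        # A row matches when column 1 is the service name and the line
        # contains "Up" as a standalone word.
        if ! printf '%s\n' "$table" |
            awk -v s="$service" '$1 == s && / Up( |$)/ { found = 1 } END { exit !found }'
        then
            echo "not running: $service" >&2
            return 1
        fi
    done
    return 0
}

# Usage, assuming the service names from the table in the README:
#   docker-compose ps | services_up c1 c2 mysql slurmctld slurmdbd && echo "cluster is up"
```

Reading from stdin keeps the parsing separate from the Docker call, which makes the helper easy to reuse in a CI step.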
+
+c1 and c2 are the compute nodes for our cluster, and slurmctld acts like the login node.
+
+```bash
+$ docker exec -it slurmctld bash
+```
+
+Try running a job!
+
+```bash
+$ sbatch --wrap="sleep 20"
+$ squeue
+  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
+      1    normal     wrap     root  R       0:00      1 c1
+```
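
To wait for the test job instead of re-running `squeue` by hand, a small polling loop is enough. This is a sketch under assumptions, not part of the commit: `job_in_queue` is a hypothetical helper that reads `squeue`-style output on stdin and succeeds while the given job ID still has a row. The commented usage relies on `sbatch --parsable`, which prints the job ID alone on a single-cluster setup like this one.

```shell
# Hypothetical helper (not in the repository): succeed while job ID $1
# still appears in the first column of squeue-style output on stdin.
job_in_queue() {
    # NR > 1 skips the header row.
    awk -v id="$1" 'NR > 1 && $1 == id { found = 1 } END { exit !found }'
}

# Usage inside the slurmctld container (needs a running cluster):
#   jobid=$(sbatch --parsable --wrap="sleep 20")
#   while squeue | job_in_queue "$jobid"; do sleep 2; done
#   echo "job $jobid has left the queue"
```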
+
+## 2. Install spindle
+
+Now let's follow the instructions to install spindle.
+
+```bash
+$ git clone https://github.com/hpc/spindle
+$ cd spindle
+```
+
+We want to configure the build with the paths to munge and slurm.
+
+```bash
+$ ./configure --with-munge-dir=/etc/munge --enable-sec-munge --with-slurm-dir=/etc/slurm --enable-testsuite=no
+$ make
+$ make install
+```
+
+Note that we are disabling the test suite; otherwise we'd get an install error because
+no MPI library is detected. Now we can see spindle!
+
```
76+
# spindle --help
77+
Usage: spindle [OPTION...] mpi_command
78+
79+
These options specify what types of files should be loaded through the Spindle
80+
network
81+
-a, --reloc-aout=yes|no Relocate the main executable through Spindle.
82+
Default: yes
83+
-f, --follow-fork=yes|no Relocate objects in fork'd child processes.
84+
Default: yes
85+
-l, --reloc-libs=yes|no Relocate shared libraries through Spindle.
86+
Default: yes
87+
-x, --reloc-exec=yes|no Relocate the targets of exec/execv/execve/...
88+
calls. Default: yes
89+
-y, --reloc-python=yes|no Relocate python modules (.py/.pyc) files when
90+
loaded via python. Default: yes
91+
92+
These options specify how the Spindle network should distibute files. Push is
93+
better for SPMD programs. Pull is better for MPMD programs. Default is push.
94+
-p, --push Use a push model where objects loaded by any
95+
process are made available to all processes
96+
-q, --pull Use a pull model where objects are only made
97+
available to processes that require them
98+
99+
These options configure Spindle's network model. Typical Spindle runs should
100+
not need to set these.
101+
-c, --cobo Use a tree-based cobo network for distributing
102+
objects
103+
-t, --port=port1-port2 TCP/IP port range for Spindle servers. Default:
104+
21940-21964
105+
106+
These options specify the security model Spindle should use for validating TCP
107+
connections. Spindle will choose a default value if no option is specified.
108+
--security-munge Use munge for security authentication
109+
110+
These options specify the job launcher Spindle is being run with. If
111+
unspecified, Spindle will try to autodetect.
112+
--launcher-startup Launch spindle daemons using the system's job
113+
launcher (requires an already set-up session).
114+
--no-mpi Run serial jobs instead of MPI job
115+
--openmpi MPI job is launched with the OpenMPI job jauncher.
116+
117+
--slurm MPI job is launched with the srun job launcher.
118+
--wreck MPI Job is launched with the wreck job launcher.
119+
120+
Options for managing sessions, which can run multiple jobs out of one spindle
121+
cache.
122+
--end-session=session-id End a persistent Spindle session with the
123+
given session-id
124+
--run-in-session=session-id
125+
Run a new job in the given session
126+
--start-session Start a persistent Spindle session and print the
127+
session-id to stdout
128+
129+
Misc options
130+
-b, --shmcache-size=size Size of client shared memory cache in kilobytes,
131+
which can be used to improve performance if
132+
multiple processes are running on each node.
133+
Default: 0
134+
--cache-prefix=path Alias for python-prefix
135+
--cleanup-proc=yes|no Fork a dedicated process to clean-up files
136+
post-spindle. Useful for high-fault situations.
137+
Default: no
138+
-d, --debug=yes|no If yes, hide spindle from debuggers so they think
139+
libraries come from the original locations. May
140+
cause extra overhead. Default: yes
141+
-e, --preload=FILE Provides a text file containing a white-space
142+
separated list of files that should be relocated
143+
to each node before execution begins
144+
--enable-rsh=yes|no Enable startint daemons with an rsh tree, if the
145+
startup mode supports it. Default: No
146+
--hostbin=EXECUTABLE Path to a script that returns the hostlist for a
147+
job on a cluster
148+
-h, --no-hide Don't hide spindle file descriptors from
149+
application
150+
-k, --audit-type=subaudit|audit
151+
Use the new-style subaudit interface for
152+
intercepting ld.so, or the old-style audit
153+
interface. The subaudit option reduces memory
154+
overhead, but is more complex. Default is audit.
155+
--msgcache-buffer=size Enables message buffering if size is non-zero,
156+
otherwise sets the size of the buffer in
157+
kilobytes
158+
--msgcache-timeout=timeout Enables message buffering if size is
159+
non-zero, otherwise sets the buffering timeout in
160+
milliseconds
161+
-n, --noclean=yes|no Don't remove local file cache after execution.
162+
Default: no (removes the cache)
163+
-o, --location=directory Back-end directory for storing relocated files.
164+
Should be a non-shared location such as a ramdisk.
165+
Default: $TMPDIR
166+
--persist=yes|no Allow spindle servers to persist after the last
167+
client job has exited. Default: No
168+
-r, --python-prefix=path Colon-seperated list of directories that contain
169+
the python install location
170+
-s, --strip=yes|no Strip debug and symbol information from binaries
171+
before distributing them. Default: yes
172+
173+
-?, --help Give this help list
174+
--usage Give a short usage message
175+
-V, --version Print program version
176+
177+
Mandatory or optional arguments to long options are also mandatory or optional
178+
for any corresponding short options.
179+
180+
Report bugs to [email protected].
181+
```
+
+## 3. Use Spindle
+
+**TODO** we need a dummy example here
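
The commit leaves this section as a TODO, and nothing here fills it in on the author's behalf. As one possible smoke test, the help text says Spindle is placed in front of the job launch command (`spindle [OPTION...] mpi_command`) and that `--slurm` selects the `srun` launcher. The sketch below builds such a command. `spindle_smoke_test` and `DRY_RUN` are hypothetical names, `hostname` is only a stand-in workload, and whether this runs cleanly in these containers is untested.

```shell
# Hypothetical helper (not in the repository): print or run a
# Spindle-wrapped srun command across $1 nodes (default 2).
# Set DRY_RUN=1 to print the command instead of executing it.
spindle_smoke_test() {
    nodes=${1:-2}
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "spindle --slurm srun -N $nodes hostname"
    else
        spindle --slurm srun -N "$nodes" hostname
    fi
}

# Usage inside the slurmctld container:
#   DRY_RUN=1; spindle_smoke_test 2    # show the command only
#   DRY_RUN=0; spindle_smoke_test 2    # attempt the run on c1 and c2
```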
+
+
+## 4. Clean Up
+
+When you are done, exit from the container, then stop and remove your containers:
+
+```bash
+$ docker-compose stop
+$ docker-compose rm
+```
