
Commit 64d277c

start of work to add demo containers
Signed-off-by: vsoch <[email protected]>
1 parent 8b66554 commit 64d277c

6 files changed, +495 -0 lines changed

docker/Dockerfile

+89
@@ -0,0 +1,89 @@
1+
FROM centos:7
2+
3+
# docker build -t vanessa/slurm:20.11.8 .
4+
5+
LABEL org.label-schema.vcs-url="https://github.com/hpc/spindle" \
6+
org.label-schema.docker.cmd="docker-compose up -d" \
7+
org.label-schema.name="spindle" \
8+
org.label-schema.description="Spindle with SLURM on Centos 7" \
9+
maintainer="Vanessa Sochat"
10+
11+
ARG SLURM_TAG=slurm-20-11-8-1
12+
13+
RUN set -ex \
14+
&& yum makecache fast \
15+
&& yum -y update \
16+
&& yum -y install epel-release \
17+
&& yum -y install \
18+
wget \
19+
bzip2 \
20+
perl \
21+
gcc \
22+
gcc-c++ \
23+
git \
24+
gnupg \
25+
make \
26+
munge \
27+
munge-devel \
28+
python-devel \
29+
python-pip \
30+
python3 \
31+
python3-devel \
32+
python3-pip \
33+
mariadb-server \
34+
mariadb-devel \
35+
psmisc \
36+
bash-completion \
37+
vim-enhanced \
38+
automake \
39+
&& yum clean all \
40+
&& rm -rf /var/cache/yum
41+
42+
RUN pip install Cython nose && pip3 install Cython nose
43+
44+
RUN set -x \
45+
&& git clone https://github.com/SchedMD/slurm.git \
46+
&& pushd slurm \
47+
&& git checkout tags/$SLURM_TAG \
48+
&& ./configure --enable-debug --prefix=/usr --sysconfdir=/etc/slurm \
49+
--with-mysql_config=/usr/bin --libdir=/usr/lib64 \
50+
&& make install \
51+
&& install -D -m644 etc/cgroup.conf.example /etc/slurm/cgroup.conf.example \
52+
&& install -D -m644 etc/slurm.conf.example /etc/slurm/slurm.conf.example \
53+
&& install -D -m644 etc/slurmdbd.conf.example /etc/slurm/slurmdbd.conf.example \
54+
&& install -D -m644 contribs/slurm_completion_help/slurm_completion.sh /etc/profile.d/slurm_completion.sh \
55+
&& popd \
56+
&& rm -rf slurm \
57+
&& groupadd -r --gid=995 slurm \
58+
&& useradd -r -g slurm --uid=995 slurm \
59+
&& mkdir /etc/sysconfig/slurm \
60+
/var/spool/slurmd \
61+
/var/run/slurmd \
62+
/var/run/slurmdbd \
63+
/var/lib/slurmd \
64+
/var/log/slurm \
65+
/data \
66+
&& touch /var/lib/slurmd/node_state \
67+
/var/lib/slurmd/front_end_state \
68+
/var/lib/slurmd/job_state \
69+
/var/lib/slurmd/resv_state \
70+
/var/lib/slurmd/trigger_state \
71+
/var/lib/slurmd/assoc_mgr_state \
72+
/var/lib/slurmd/assoc_usage \
73+
/var/lib/slurmd/qos_usage \
74+
/var/lib/slurmd/fed_mgr_state \
75+
&& chown -R slurm:slurm /var/*/slurm* \
76+
&& /sbin/create-munge-key
77+
78+
COPY slurm.conf /etc/slurm/slurm.conf
79+
COPY slurmdbd.conf /etc/slurm/slurmdbd.conf
80+
81+
COPY docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
82+
ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]
83+
84+
RUN yum install -y net-tools openssh-server openssh-clients singularity && \
85+
yum install -y epel-release centos-release-scl lsof sudo httpd24-mod_ssl httpd24-mod_ldap
86+
87+
RUN groupadd spindle && \
88+
useradd --create-home --gid spindle spindle && \
89+
echo -n "spindle" | passwd --stdin spindle

docker/Dockerfile.node

+3
@@ -0,0 +1,3 @@
1+
FROM vanessa/slurm:18.08.6
2+
3+
# This container will be built on docker-compose up -d

docker/README.md

+195
@@ -0,0 +1,195 @@
1+
# Spindle in Docker
2+
3+
This directory contains a set of container recipes and scripts to allow you
4+
to quickly bring up your own tiny cluster with [docker-compose](https://docs.docker.com/compose/install/), install
5+
spindle, and give it a try. You will need both [docker-compose](https://docs.docker.com/compose/install/)
6+
and [Docker](https://docs.docker.com/get-docker/) installed for this tutorial.
7+
8+
## 1. Build Containers
9+
10+
First, let's build a base image with Slurm on CentOS 7 using the [Dockerfile](Dockerfile) here:
11+
12+
```bash
13+
$ docker build -t vanessa/slurm:20.11.8 .
14+
```
15+
Then building containers is as easy as:
16+
17+
```bash
18+
$ docker-compose build
19+
```
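
Note that in this commit `docker-compose.yml` and `Dockerfile.node` reference `vanessa/slurm:18.08.6`, while the build command above tags `vanessa/slurm:20.11.8`. A minimal sketch for listing the image tags a compose file asks for, so you can compare them with what you built locally. The function name `list_images` is hypothetical, and the parsing assumes simple unquoted `image:` lines like the ones in this file:

```bash
# Hypothetical helper, not part of this repository.
# Reads a compose file on stdin and prints each distinct `image:` value.
list_images() {
    # Strip everything up to and including "image:" and keep only those lines.
    sed -n 's/^[[:space:]]*image:[[:space:]]*//p' | sort -u
}

# Usage (from the docker/ directory):
#   list_images < docker-compose.yml
#   docker images vanessa/slurm
```

If the tags differ, Compose will look for the tag named in the file rather than the one you just built.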
20+
21+
And then bringing them up:
22+
23+
```bash
24+
$ docker-compose up -d
25+
```
26+
27+
And checking that they are running:
28+
29+
```bash
30+
$ docker-compose ps
31+
Name Command State Ports
32+
------------------------------------------------------------------------
33+
c1 /usr/local/bin/docker-entr ... Up 6818/tcp
34+
c2 /usr/local/bin/docker-entr ... Up 6818/tcp
35+
mysql docker-entrypoint.sh mysqld Up 3306/tcp, 33060/tcp
36+
slurmctld /usr/local/bin/docker-entr ... Up 6817/tcp
37+
slurmdbd /usr/local/bin/docker-entr ... Up 6819/tcp
38+
```
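
The listing above can also be checked from a script. The sketch below reads `docker-compose ps` output on stdin and reports any of the five expected containers that is not in the `Up` state. The function name `check_up` is hypothetical, and the parsing assumes the table layout shown above (newer Compose releases may print a different format):

```bash
# Hypothetical helper, not part of this repository.
# Prints each expected container that is not listed as "Up" and
# returns non-zero if any is missing or stopped.
check_up() {
    listing=$(cat)
    missing=0
    for name in c1 c2 mysql slurmctld slurmdbd; do
        # A healthy row starts with the container name and has "Up" as its own column.
        if ! printf '%s\n' "$listing" | grep -Eq "^${name}[[:space:]].*[[:space:]]Up([[:space:]]|\$)"; then
            echo "not up: ${name}"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   docker-compose ps | check_up
```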
39+
40+
Both c1 and c2 are compute nodes for our cluster, and slurmctld (the controller) acts as the login node. Open a shell on it:
41+
42+
```bash
43+
$ docker exec -it slurmctld bash
44+
```
45+
46+
Try running a job!
47+
48+
```bash
49+
$ sbatch --wrap="sleep 20"
50+
# squeue
51+
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
52+
1 normal wrap root R 0:00 1 c1
53+
```
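
If you would rather wait for the job than re-run `squeue` by hand, a polling loop is one option. This is a rough sketch, assuming it runs inside the slurmctld container; `wait_for_job` is a hypothetical name, and the loop relies on `squeue -h -j ID` printing a line only while the job is still queued or running:

```bash
# Hypothetical helper, not part of this repository.
# Polls squeue until the given job no longer appears in the queue.
wait_for_job() {
    job_id=$1
    interval=${2:-2}   # seconds between polls, default 2
    # -h drops the header, -j restricts the listing to one job.
    # Empty output (or an error) means the job has left the queue.
    while [ -n "$(squeue -h -j "$job_id" 2>/dev/null)" ]; do
        sleep "$interval"
    done
    echo "job ${job_id} is no longer in the queue"
}

# Usage:
#   job_id=$(sbatch --parsable --wrap="sleep 20")
#   wait_for_job "$job_id"
```

`sbatch --parsable` prints just the job id on a single-cluster setup like this one, which makes it easy to capture.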
54+
55+
## 2. Install spindle
56+
57+
Now let's follow the instructions to install Spindle from source.
58+
59+
```bash
60+
$ git clone https://github.com/hpc/spindle
61+
$ cd spindle
62+
```
63+
64+
Configure the build with the paths to munge and Slurm, then compile and install:
65+
66+
```bash
67+
./configure --with-munge-dir=/etc/munge --enable-sec-munge --with-slurm-dir=/etc/slurm --enable-testsuite=no
68+
make
69+
make install
70+
```
71+
72+
Note that we disable the test suite; otherwise the install would fail because
73+
no MPI library is detected. Now we can see spindle!
74+
75+
```
76+
# spindle --help
77+
Usage: spindle [OPTION...] mpi_command
78+
79+
These options specify what types of files should be loaded through the Spindle
80+
network
81+
-a, --reloc-aout=yes|no Relocate the main executable through Spindle.
82+
Default: yes
83+
-f, --follow-fork=yes|no Relocate objects in fork'd child processes.
84+
Default: yes
85+
-l, --reloc-libs=yes|no Relocate shared libraries through Spindle.
86+
Default: yes
87+
-x, --reloc-exec=yes|no Relocate the targets of exec/execv/execve/...
88+
calls. Default: yes
89+
-y, --reloc-python=yes|no Relocate python modules (.py/.pyc) files when
90+
loaded via python. Default: yes
91+
92+
These options specify how the Spindle network should distibute files. Push is
93+
better for SPMD programs. Pull is better for MPMD programs. Default is push.
94+
-p, --push Use a push model where objects loaded by any
95+
process are made available to all processes
96+
-q, --pull Use a pull model where objects are only made
97+
available to processes that require them
98+
99+
These options configure Spindle's network model. Typical Spindle runs should
100+
not need to set these.
101+
-c, --cobo Use a tree-based cobo network for distributing
102+
objects
103+
-t, --port=port1-port2 TCP/IP port range for Spindle servers. Default:
104+
21940-21964
105+
106+
These options specify the security model Spindle should use for validating TCP
107+
connections. Spindle will choose a default value if no option is specified.
108+
--security-munge Use munge for security authentication
109+
110+
These options specify the job launcher Spindle is being run with. If
111+
unspecified, Spindle will try to autodetect.
112+
--launcher-startup Launch spindle daemons using the system's job
113+
launcher (requires an already set-up session).
114+
--no-mpi Run serial jobs instead of MPI job
115+
--openmpi MPI job is launched with the OpenMPI job jauncher.
116+
117+
--slurm MPI job is launched with the srun job launcher.
118+
--wreck MPI Job is launched with the wreck job launcher.
119+
120+
Options for managing sessions, which can run multiple jobs out of one spindle
121+
cache.
122+
--end-session=session-id End a persistent Spindle session with the
123+
given session-id
124+
--run-in-session=session-id
125+
Run a new job in the given session
126+
--start-session Start a persistent Spindle session and print the
127+
session-id to stdout
128+
129+
Misc options
130+
-b, --shmcache-size=size Size of client shared memory cache in kilobytes,
131+
which can be used to improve performance if
132+
multiple processes are running on each node.
133+
Default: 0
134+
--cache-prefix=path Alias for python-prefix
135+
--cleanup-proc=yes|no Fork a dedicated process to clean-up files
136+
post-spindle. Useful for high-fault situations.
137+
Default: no
138+
-d, --debug=yes|no If yes, hide spindle from debuggers so they think
139+
libraries come from the original locations. May
140+
cause extra overhead. Default: yes
141+
-e, --preload=FILE Provides a text file containing a white-space
142+
separated list of files that should be relocated
143+
to each node before execution begins
144+
--enable-rsh=yes|no Enable startint daemons with an rsh tree, if the
145+
startup mode supports it. Default: No
146+
--hostbin=EXECUTABLE Path to a script that returns the hostlist for a
147+
job on a cluster
148+
-h, --no-hide Don't hide spindle file descriptors from
149+
application
150+
-k, --audit-type=subaudit|audit
151+
Use the new-style subaudit interface for
152+
intercepting ld.so, or the old-style audit
153+
interface. The subaudit option reduces memory
154+
overhead, but is more complex. Default is audit.
155+
--msgcache-buffer=size Enables message buffering if size is non-zero,
156+
otherwise sets the size of the buffer in
157+
kilobytes
158+
--msgcache-timeout=timeout Enables message buffering if size is
159+
non-zero, otherwise sets the buffering timeout in
160+
milliseconds
161+
-n, --noclean=yes|no Don't remove local file cache after execution.
162+
Default: no (removes the cache)
163+
-o, --location=directory Back-end directory for storing relocated files.
164+
Should be a non-shared location such as a ramdisk.
165+
Default: $TMPDIR
166+
--persist=yes|no Allow spindle servers to persist after the last
167+
client job has exited. Default: No
168+
-r, --python-prefix=path Colon-seperated list of directories that contain
169+
the python install location
170+
-s, --strip=yes|no Strip debug and symbol information from binaries
171+
before distributing them. Default: yes
172+
173+
-?, --help Give this help list
174+
--usage Give a short usage message
175+
-V, --version Print program version
176+
177+
Mandatory or optional arguments to long options are also mandatory or optional
178+
for any corresponding short options.
179+
180+
Report bugs to [email protected].
181+
```
182+
183+
## 3. Use Spindle
184+
185+
**TODO** we need a dummy example here
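
Until a real example is written, here is one possible shape for it, based only on the usage line in the help output above (`spindle [OPTION...] mpi_command`) and the `--slurm` option. It is a sketch that has not been verified against this cluster: `spindle_launch` is a hypothetical wrapper, `hostname` is a placeholder program, and by default the function only prints the command it would run.

```bash
# Hypothetical helper, not part of this repository.
# Builds a Spindle launch line of the form
#   spindle --slurm srun -N <nodes> <program> [args...]
# and prints it; set RUN=1 to execute it instead.
spindle_launch() {
    nodes=$1
    shift
    # Replace the positional parameters with the full command line.
    set -- spindle --slurm srun -N "$nodes" "$@"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Dry run: show the command for a two-node job.
spindle_launch 2 hostname
```

Printing first makes it easy to inspect the command before trying it with `RUN=1` inside the slurmctld container.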
186+
187+
188+
## 4. Clean Up
189+
190+
When you are done, exit from the container, then stop and remove your containers:
191+
192+
```bash
193+
$ docker-compose stop
194+
$ docker-compose rm
195+
```

docker/docker-compose.yml

+77
@@ -0,0 +1,77 @@
1+
version: "2.2"
2+
3+
services:
4+
mysql:
5+
image: mysql:5.7
6+
hostname: mysql
7+
container_name: mysql
8+
environment:
9+
MYSQL_RANDOM_ROOT_PASSWORD: "yes"
10+
MYSQL_DATABASE: slurm_acct_db
11+
MYSQL_USER: slurm
12+
MYSQL_PASSWORD: password
13+
volumes:
14+
- var_lib_mysql:/var/lib/mysql
15+
16+
slurmdbd:
17+
image: vanessa/slurm:18.08.6
18+
command: "slurmdbd"
19+
container_name: slurmdbd
20+
hostname: slurmdbd
21+
volumes:
22+
- etc_munge:/etc/munge
23+
- etc_slurm:/etc/slurm
24+
- var_log_slurm:/var/log/slurm
25+
expose:
26+
- "6819"
27+
depends_on:
28+
- mysql
29+
30+
slurmctld:
31+
image: vanessa/slurm:18.08.6
32+
command: "slurmctld"
33+
container_name: slurmctld
34+
hostname: slurmctld
35+
volumes_from:
36+
- slurmdbd
37+
expose:
38+
- "6817"
39+
depends_on:
40+
- "slurmdbd"
41+
42+
c1:
43+
build:
44+
context: .
45+
dockerfile: Dockerfile.node
46+
command: "slurmd"
47+
privileged: true
48+
hostname: c1
49+
container_name: c1
50+
volumes_from:
51+
- slurmctld
52+
expose:
53+
- "6818"
54+
depends_on:
55+
- "slurmctld"
56+
57+
c2:
58+
build:
59+
context: .
60+
dockerfile: Dockerfile.node
61+
command: "slurmd"
62+
privileged: true
63+
hostname: c2
64+
container_name: c2
65+
volumes_from:
66+
- slurmctld
67+
expose:
68+
- "6818"
69+
depends_on:
70+
- "slurmctld"
71+
72+
volumes:
73+
etc_munge:
74+
etc_slurm:
75+
slurm_jobdir:
76+
var_lib_mysql:
77+
var_log_slurm:
