Suggested doc improvements based on first-time user experience

Thank you for this fantastic daskhub chart. @saschahofmann and I have just set this up on Google Kubernetes Engine. I thought it would be helpful to share our experience and gotchas, and potentially suggest improvements to the docs.

## Making the decision between charts

We started with the simpler "dask" helm chart.

However, it didn't really suit our needs, for a few reasons:
- The jupyter notebook has no persistent disk storage
- We wanted the ability to control the number of workers, without making updates via helm

It would have been helpful to have these limitations listed in the README, especially the persistence one, which I'd imagine most people would expect (though I guess #78 will fix it).

## Setting up daskhub

Setting up the initial "daskhub" worked fine, out of the box. The one part that is a bit flaky for us is the "launch cluster" button on the left sidebar. Sometimes it launches a local cluster, rather than via dask-gateway - we haven't worked out why yet.

Small thing - the quickstart does not quite work for us:

```
>>> from dask_gateway import GatewayCluster
>>> cluster = gateway.new_cluster()
>>> client = cluster.get_client()
```

Instead we ran:

```
from dask_gateway import Gateway
# This line is missing from the quickstart.
# Note: new_cluster() is a method of Gateway, not GatewayCluster.
gateway = Gateway()
cluster = gateway.new_cluster()
client = cluster.get_client()
```

## Customising the image 🏗️ 

For our use case, we needed to build a custom image.

The default image for daskhub is `pangeo/base-notebook:{XXX}`. Unfortunately, googling that led us to the pangeo stacks [Github](https://github.com/pangeo-data/pangeo-stacks) and [webpage](https://pangeo-data.github.io/pangeo-stacks/), and we spent time trying to get the "ONBUILD" images to work.

Thankfully, we eventually stumbled on [this comment](https://github.com/pangeo-data/pangeo-docker-images/issues/60#issuecomment-617384578), which led us to the correct [repo](https://github.com/pangeo-data/pangeo-docker-images/tree/master/pangeo-notebook). The "ONBUILD" trick is neat; it just took us time to get our heads around it.

For anyone else stuck doing the same thing, here is what worked for us:
- Create a local copy of the pangeo/base-notebook directory [here](https://github.com/pangeo-data/pangeo-docker-images/tree/master/pangeo-notebook). At a minimum you need the `Dockerfile`, `apt.txt`, `environment.yml`, `postBuild` and `start`.
- Add your system packages to `apt.txt` and your extra conda packages to `environment.yml`
- Build the image with Docker and push it to your registry
- Point `jupyterhub.singleuser.image.{name, tag, pullPolicy}` at the pushed image

For the custom dask image, we used `daskgateway/dask-gateway` as the base image and did:

```
# Dockerfile
FROM daskgateway/dask-gateway:0.9.0
COPY ./environment.yml ./environment.yml
RUN conda env update -f environment.yml

# environment.yml
name: base
dependencies:
  - # extra packages...
```

Then point `dask-gateway.gateway.backend.image.{name, tag, pullPolicy}` at the resulting image.
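In case it helps anyone, here is a rough sketch of how the two image settings could be collected into one values file. It only uses the two key paths mentioned above; the image names, tags and the `build_image_values` helper are made up for illustration, and `IfNotPresent` is just one valid Kubernetes pull policy. Since JSON is a subset of YAML, the printed output should be usable as a helm values file, but we haven't tested this exact script against the chart.

```python
import json


def build_image_values(notebook_image, notebook_tag,
                       worker_image, worker_tag,
                       pull_policy="IfNotPresent"):
    """Build the nested values for the notebook and dask-gateway images.

    Hypothetical helper: only the key paths come from the chart settings
    discussed above; everything else is an illustrative assumption.
    """
    def image(name, tag):
        return {"name": name, "tag": tag, "pullPolicy": pull_policy}

    return {
        # jupyterhub.singleuser.image.{name, tag, pullPolicy}
        "jupyterhub": {"singleuser": {"image": image(notebook_image, notebook_tag)}},
        # dask-gateway.gateway.backend.image.{name, tag, pullPolicy}
        "dask-gateway": {"gateway": {"backend": {"image": image(worker_image, worker_tag)}}},
    }


# Placeholder image names and tags; substitute your own registry and versions.
values = build_image_values(
    "gcr.io/my-project/custom-notebook", "2020.09.01",
    "gcr.io/my-project/custom-dask-gateway", "2020.09.01",
)
print(json.dumps(values, indent=2))
```

Generating both entries from one place makes it harder to bump one image and forget the other.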

## Making sure to tag specific versions 🤦 

A few days ago, we found that newly built custom images inexplicably no longer worked on the cluster. It turned out we hadn't pinned our images, and dask-gateway 0.9.0 had been released.
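A small check like the following might have caught this earlier. It is only a sketch: `is_pinned` is a hypothetical helper, and it treats an image reference as pinned if it has an explicit tag other than `latest`, or a digest.

```python
def is_pinned(image_ref: str) -> bool:
    """Return True if the image reference carries an explicit, non-latest tag or a digest."""
    if "@" in image_ref:
        # Digest reference such as name@sha256:..., which is always pinned.
        return True
    # Only look at the last path component, so a registry port
    # (e.g. localhost:5000/image) is not mistaken for a tag.
    last_component = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_component:
        return False
    tag = last_component.split(":", 1)[1]
    return tag not in ("", "latest")


for ref in ["daskgateway/dask-gateway:0.9.0", "pangeo/base-notebook"]:
    print(ref, "pinned" if is_pinned(ref) else "NOT pinned")
```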

This was obviously completely our fault, but we wanted to note that it's essential to have consistent versions for:
- The helm chart
- The `pangeo/base-notebook` in your jupyterhub.singleuser Dockerfile
- The `daskgateway/dask-gateway` in your Dockerfile

Otherwise things will break in hard-to-debug ways. A particularly nasty example was using dask dataframe, where we had pandas 1.1.* on the client and 1.0.* on the workers.
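To illustrate the kind of check that would have flagged the pandas mismatch, here is a minimal sketch that compares two package-to-version mappings on their major.minor components. The `find_mismatches` helper and the exact version numbers are made up for illustration (we only know it was 1.1.* versus 1.0.*). On a live cluster, `Client.get_versions(check=True)` from `distributed` should do a similar comparison for you.

```python
def minor_version(version: str) -> tuple:
    """Return the (major, minor) part of a version string, e.g. '1.1.2' -> ('1', '1')."""
    return tuple(version.split(".")[:2])


def find_mismatches(client_versions: dict, worker_versions: dict) -> dict:
    """Map each package that is missing or differs in major.minor to its (client, worker) versions."""
    mismatches = {}
    for package in sorted(set(client_versions) | set(worker_versions)):
        on_client = client_versions.get(package)
        on_worker = worker_versions.get(package)
        if (on_client is None or on_worker is None
                or minor_version(on_client) != minor_version(on_worker)):
            mismatches[package] = (on_client, on_worker)
    return mismatches


# Illustrative versions only; in practice these would come from each environment.
client_versions = {"pandas": "1.1.2", "dask": "2.25.0"}
worker_versions = {"pandas": "1.0.5", "dask": "2.25.0"}

mismatches = find_mismatches(client_versions, worker_versions)
for package, (on_client, on_worker) in mismatches.items():
    print(f"{package}: client={on_client} workers={on_worker}")
```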

We'd be happy to submit a PR to amend the docs, or maybe write a separate guide, if that's useful. Please let us know.

Once again, thanks for building this!
