A partitioned GPU-backed dataframe, using Dask.
Setup from source repo:

- Install dependencies into a new conda environment, where
  CUDA_VERSION is either 9.2 or 10:

      conda create -n dask-cudf \
          -c rapidsai -c numba -c conda-forge -c defaults \
          cudf dask cudatoolkit={CUDA_VERSION}

- Activate conda environment:

      source activate dask-cudf

- Clone dask-cudf repo:

      git clone https://github.com/rapidsai/dask-cudf

- Install from source:

      cd dask-cudf
      pip install .
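
After the install step it can be handy to confirm that the environment actually contains the packages named above. The following is a minimal sketch, not part of the project: the helper name `check_modules` is hypothetical, and the import names `cudf`, `dask`, and `dask_cudf` are assumed to match the packages installed in the setup steps. It only looks the modules up, so it does not need a GPU to run.

```python
# Sanity check: report which of the packages named in the setup steps can be
# found in the active environment. The modules are only looked up, not
# imported, so no GPU is required.
from importlib.util import find_spec

# Import names assumed from the packages installed in the setup steps.
REQUIRED_MODULES = ("cudf", "dask", "dask_cudf")


def check_modules(names=REQUIRED_MODULES):
    """Map each top-level module name to True if it can be found."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Run it with the `dask-cudf` environment activated; any module reported as missing suggests that the corresponding setup step did not complete.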
Testing:

- Install pytest:

      conda install pytest

- Run all tests:

      py.test dask_cudf

- Or, run individual tests:

      py.test dask_cudf/tests/test_file.py
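
The same test runs can also be started from a script, for example in a CI job. This is a rough sketch under stated assumptions: the helper names `pytest_command` and `run_tests` are hypothetical, and it uses `python -m pytest` in place of the `py.test` entry point so that the tests run under the current interpreter.

```python
import subprocess
import sys


def pytest_command(target="dask_cudf"):
    """Build the command that runs pytest on `target` with this interpreter."""
    return [sys.executable, "-m", "pytest", target]


def run_tests(target="dask_cudf"):
    """Run pytest on `target` and return its exit code (0 means all passed)."""
    return subprocess.run(pytest_command(target)).returncode


if __name__ == "__main__":
    # Pass a path such as dask_cudf/tests/test_file.py to run a single file.
    target = sys.argv[1] if len(sys.argv) > 1 else "dask_cudf"
    sys.exit(run_tests(target))
```

With no argument the script runs the whole `dask_cudf` package; with a path argument it runs only that file.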
For style we use black, isort, and flake8. These are available as
pre-commit hooks that will run every time you are about to commit code.
From the root directory of this project run the following:

    pip install pre-commit
    pre-commit install