Set up Dask with RAPIDS deployment on AKS using Helm

To deploy Dask with RAPIDS on AKS using Helm, complete the following steps:

  1. Create a namespace for installing Dask with RAPIDS.

    kubectl create namespace rapids-dask
  2. Create a persistent volume claim (PVC) to store the click-through rate dataset:

    1. Save the following YAML content to a file to create a PVC.

      kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: pvc-criteo-data
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 1000Gi
        storageClassName: azurenetappfiles
    2. Apply the YAML file to your Kubernetes cluster.

      kubectl -n rapids-dask apply -f <your yaml file>
  3. Clone the rapidsai Helm chart Git repository (https://github.com/rapidsai/helm-chart).

    git clone https://github.com/rapidsai/helm-chart helm-chart
  4. Modify values.yaml to include the PVC created earlier for the workers and the Jupyter workspace.

    1. Go to the rapidsai directory of the repository.

      cd helm-chart/rapidsai
    2. Update the values.yaml file to mount the volume using the PVC.

      dask:
        …
        worker:
          name: worker
          …
          mounts:
            volumes:
              - name: data
                persistentVolumeClaim:
                  claimName: pvc-criteo-data
            volumeMounts:
              - name: data
                mountPath: /data
          …
        jupyter:
          name: jupyter
          …
          mounts:
            volumes:
              - name: data
                persistentVolumeClaim:
                  claimName: pvc-criteo-data
            volumeMounts:
              - name: data
                mountPath: /data
          …
  5. Go to the repository’s root directory and deploy Dask with three worker nodes on AKS using Helm.

    cd ..
    helm dep update rapidsai
    helm install rapids-dask --namespace rapids-dask rapidsai
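
The steps above can be collected into one script. The following is a minimal sketch, not a definitive procedure. It assumes that kubectl, git, and helm are installed and that kubectl points at the target AKS cluster. The file name pvc-criteo-data.yaml, the DRY_RUN variable, and the run helper are hypothetical names introduced here for illustration. The two kubectl get commands at the end are an added check and are not part of the original steps. By default the script only writes the PVC manifest and prints each command; set DRY_RUN=0 to execute the commands.

```shell
#!/usr/bin/env bash
# Sketch of the steps above. Commands are printed only, unless DRY_RUN=0 is set.
set -eu

NAMESPACE="rapids-dask"                       # namespace from step 1
PVC_FILE="${PVC_FILE:-pvc-criteo-data.yaml}"  # hypothetical manifest file name
DRY_RUN="${DRY_RUN:-1}"                       # hypothetical switch: 1 = print only

# Print a command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Step 2.1: write the PVC manifest shown above to a file.
cat > "$PVC_FILE" <<'EOF'
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-criteo-data
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1000Gi
  storageClassName: azurenetappfiles
EOF

# Steps 1 to 3: namespace, PVC, and chart repository.
run kubectl create namespace "$NAMESPACE"
run kubectl -n "$NAMESPACE" apply -f "$PVC_FILE"
run git clone https://github.com/rapidsai/helm-chart helm-chart

# Step 4 (editing helm-chart/rapidsai/values.yaml) stays a manual step.

# Step 5: resolve chart dependencies and install the release.
run helm dep update helm-chart/rapidsai
run helm install rapids-dask --namespace "$NAMESPACE" helm-chart/rapidsai

# Added check (assumption, not in the original steps): confirm that the
# claim exists and that the pods have been created.
run kubectl -n "$NAMESPACE" get pvc pvc-criteo-data
run kubectl -n "$NAMESPACE" get pods
```

Run the script once in the default print-only mode and review the output. Edit values.yaml as described in step 4 before running it with DRY_RUN=0, because the install command picks up the mounts from that file.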