Set up Dask with RAPIDS deployment on AKS using Helm


To set up a Dask with RAPIDS deployment on AKS using Helm, complete the following steps:

  1. Create a namespace for installing Dask with RAPIDS.

    kubectl create namespace rapids-dask
  2. Create a persistent volume claim (PVC) to store the click-through rate dataset:

    1. Save the following YAML content to a file to create a PVC.

      kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: pvc-criteo-data
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 1000Gi
        storageClassName: azurenetappfiles
    2. Apply the YAML file to your Kubernetes cluster.

      kubectl -n rapids-dask apply -f <your yaml file>
  3. Clone the rapidsai Helm chart Git repository (https://github.com/rapidsai/helm-chart).

    git clone https://github.com/rapidsai/helm-chart helm-chart
  4. Modify values.yaml to include the PVC created earlier for the workers and the Jupyter workspace.

    1. Go to the rapidsai directory of the repository.

      cd helm-chart/rapidsai
    2. Update the values.yaml file to mount the volume using the PVC.

      dask:
        …
        worker:
          name: worker
          …
          mounts:
            volumes:
              - name: data
                persistentVolumeClaim:
                  claimName: pvc-criteo-data
            volumeMounts:
              - name: data
                mountPath: /data
          …
        jupyter:
          name: jupyter
          …
          mounts:
            volumes:
              - name: data
                persistentVolumeClaim:
                  claimName: pvc-criteo-data
            volumeMounts:
              - name: data
                mountPath: /data
          …
  5. Return to the repository’s root directory (helm-chart) and deploy Dask with three worker nodes on AKS using Helm.

    cd ..
    helm dep update rapidsai
    helm install rapids-dask --namespace rapids-dask rapidsai
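
After the install command returns, it can help to confirm that the PVC is bound and that the pods have started. The following is a minimal sketch, not part of the original procedure. It assumes that kubectl and helm are already configured for the AKS cluster, and it reuses the namespace, release, and PVC names from the steps above. The function name check_deployment is hypothetical.

```shell
#!/bin/sh
# Sketch only: post-install checks for the Dask with RAPIDS deployment.
# Assumes kubectl and helm already point at the target AKS cluster.

NAMESPACE="rapids-dask"    # namespace created in step 1
RELEASE="rapids-dask"      # Helm release name used in step 5
PVC="pvc-criteo-data"      # PVC created in step 2

# check_deployment is a hypothetical helper name used for illustration.
check_deployment() {
  # The PVC is usable once its STATUS column reports Bound.
  kubectl -n "$NAMESPACE" get pvc "$PVC"

  # Show Helm's view of the release.
  helm status "$RELEASE" --namespace "$NAMESPACE"

  # List the pods and services that the chart created in the namespace.
  kubectl -n "$NAMESPACE" get pods
  kubectl -n "$NAMESPACE" get svc
}

# Run the checks only when both command-line tools are available.
if command -v kubectl >/dev/null 2>&1 && command -v helm >/dev/null 2>&1; then
  check_deployment
else
  echo "kubectl or helm not found; skipping checks" >&2
fi
```

If the PVC stays in the Pending state, `kubectl -n rapids-dask describe pvc pvc-criteo-data` shows the provisioning events for the claim.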