Provision a Jupyter Notebook Workspace for Data Scientist or Developer Use
Kubeflow is capable of rapidly provisioning new Jupyter Notebook servers to act as data scientist workspaces. To provision a new Jupyter Notebook server with Kubeflow, perform the following tasks. For more information about Jupyter Notebooks within the Kubeflow context, see the official Kubeflow documentation.
From the Kubeflow central dashboard, click Notebook Servers in the main menu to navigate to the Jupyter Notebook server administration page.
Click New Server to provision a new Jupyter Notebook server.
Give your new server a name, choose the Docker image that you want your server to be based on, and specify the amount of CPU and RAM to be reserved by your server. If the Namespace field is blank, use the Select Namespace menu in the page header to choose a namespace. The Namespace field is then auto-populated with the chosen namespace.
In the following example, the kubeflow-anonymous namespace is chosen. In addition, the default values for Docker image, CPU, and RAM are accepted.
Specify the workspace volume details. If you choose to create a new volume, then that volume or PVC is provisioned using the default StorageClass. Because a StorageClass utilizing Trident was designated as the default StorageClass in the section Kubeflow Deployment, the volume or PVC is provisioned with Trident. This volume is automatically mounted as the default workspace within the Jupyter Notebook Server container. Any notebooks that a user creates on the server that are not saved to a separate data volume are automatically saved to this workspace volume. Therefore, the notebooks are persistent across reboots.
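Kubeflow creates the workspace PVC for you, so no manifest needs to be written by hand. Purely as an illustration of why the claim is served by Trident, the following minimal sketch builds a comparable PVC manifest in Python. The claim name, the size, and the access mode are hypothetical values chosen for this example; the key point is that storageClassName is left out, which is what causes Kubernetes to fall back to the default StorageClass.

```python
import json


def workspace_pvc_manifest(name, namespace, size):
    """Build a PersistentVolumeClaim manifest that relies on the default StorageClass."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # storageClassName is deliberately omitted so that the cluster's
            # default StorageClass (Trident-backed in this solution) is used.
            "accessModes": ["ReadWriteOnce"],  # assumed access mode
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical name and size; the namespace matches the example in this section.
manifest = workspace_pvc_manifest("workspace-example", "kubeflow-anonymous", "10Gi")
print(json.dumps(manifest, indent=2))
```

The printed JSON has the same shape as a claim that could be inspected with kubectl after the notebook server is created.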
Add data volumes. The following example specifies an existing PVC named 'pb-fg-all' and accepts the default mount point.
Optional: Request that the desired number of GPUs be allocated to your notebook server. In the following example, one GPU is requested.
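For readers who want to see what a GPU request looks like at the Kubernetes level, the following sketch builds the resources section of a container specification. It assumes that GPUs are exposed through the NVIDIA device plugin under the resource name nvidia.com/gpu; the function name and the CPU and memory values are hypothetical and are not taken from this deployment.

```python
def notebook_resources(cpu, memory, gpus=0):
    """Build a container 'resources' section with an optional GPU limit."""
    resources = {"requests": {"cpu": cpu, "memory": memory}}
    if gpus > 0:
        # GPUs are an extended resource: they are specified under limits,
        # and Kubernetes uses the limit as the request when none is given.
        resources["limits"] = {"nvidia.com/gpu": gpus}
    return resources


# Hypothetical CPU and memory values; one GPU, as in the example above.
print(notebook_resources("0.5", "1.0Gi", gpus=1))
```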
Click Launch to provision your new notebook server.
Wait for your notebook server to be fully provisioned. This can take several minutes if you have never provisioned a server using the Docker image that you specified because the image needs to be downloaded. When your server has been fully provisioned, you see a green check mark in the Status column on the Jupyter Notebook server administration page.
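If you prefer to check readiness from the command line instead of the dashboard, the pod that backs the notebook server can be inspected with kubectl get pod <pod-name> -n <namespace> -o json. The following sketch shows one way to interpret that JSON. It assumes that the green check mark corresponds to the pod's Ready condition, and the two pod objects are hypothetical, heavily trimmed samples.

```python
def is_pod_ready(pod):
    """Return True if the pod reports a Ready condition with status "True"."""
    conditions = pod.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


# Hypothetical, trimmed pod objects for illustration only.
still_pulling = {
    "status": {
        "phase": "Pending",
        "conditions": [{"type": "PodScheduled", "status": "True"}],
    }
}
running = {
    "status": {
        "phase": "Running",
        "conditions": [{"type": "Ready", "status": "True"}],
    }
}

print(is_pod_ready(still_pulling), is_pod_ready(running))
```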
Click Connect to connect to your new server web interface.
Confirm that the dataset volume that was specified in step 6 is mounted on the server. By default, this volume is mounted inside the default workspace, so from the perspective of the user it is just another folder within the workspace. The user, who is likely a data scientist and not an infrastructure expert, does not need any storage expertise to use this volume.
Open a terminal and, assuming that a new volume was requested in step 5, execute df -h to confirm that a new Trident-provisioned persistent volume is mounted as the default workspace.
The default workspace directory is the base directory that you are presented with when you first access the server’s web interface. Therefore, any artifacts that you create by using the web interface are stored on this Trident-provisioned persistent volume.
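The same check can be made from a notebook cell instead of a terminal. The following sketch looks up which mounted file system holds a given directory by reading a Linux mount table. The sample mount table, including the export path and the /home/jovyan mount point, is hypothetical, and the NFS file system type is an assumption that holds only if the Trident backend provisions NFS volumes.

```python
import os


def find_mount(path, mounts_text):
    """Return (device, mount_point, fs_type) for the longest mount point containing path."""
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        device, mount_point, fs_type = fields[:3]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[1]):
                best = (device, mount_point, fs_type)
    return best


# Hypothetical mount table in /proc/mounts format, for illustration only.
sample_mounts = (
    "overlay / overlay rw 0 0\n"
    "192.0.2.10:/trident_pvc_example /home/jovyan nfs4 rw 0 0\n"
)
print(find_mount("/home/jovyan/notebooks", sample_mounts))

# On a Linux notebook server, inspect the real mount table for the current directory.
if os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as f:
        print(find_mount(os.path.realpath(os.getcwd()), f.read()))
```

If the workspace is backed by a Trident-provisioned volume, the device reported for the workspace directory is the storage export, not the container's overlay file system.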
Using the terminal, run nvidia-smi to confirm that the correct number of GPUs was allocated to the notebook server. In the following example, one GPU has been allocated to the notebook server as requested in step 7.
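The GPU count can also be checked programmatically from a notebook cell. The following sketch counts the devices listed by nvidia-smi -L, which prints one line per GPU. The sample line uses a hypothetical GPU model and UUID, and the live check runs only if nvidia-smi is available on the PATH.

```python
import shutil
import subprocess


def count_gpus(listing):
    """Count GPU entries in the output of `nvidia-smi -L` (one line per GPU)."""
    return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


# Hypothetical sample of `nvidia-smi -L` output for a single GPU.
sample = "GPU 0: Tesla V100-SXM2-16GB (UUID: GPU-00000000-0000-0000-0000-000000000000)"
print(count_gpus(sample))

# Run the real command only where the NVIDIA tooling is installed.
if shutil.which("nvidia-smi"):
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    print("GPUs visible to this notebook server:", count_gpus(result.stdout))
```

Comparing the reported count with the number of GPUs requested when the server was created confirms that the allocation succeeded.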