Cloud Insights

Before Installing or Upgrading the NetApp Kubernetes Monitoring Operator

Contributors netapp-alavoie

Read this information before installing or upgrading your NetApp Kubernetes Monitoring Operator.

Pre-requisites:

  • If you are using a custom or private docker repository, follow the instructions in the Using a custom or private docker repository section.

  • NetApp Kubernetes Monitoring Operator installation is supported with Kubernetes version 1.20 or greater.

  • When Cloud Insights is monitoring the backend storage and Kubernetes is used with the Docker container runtime, Cloud Insights can display pod-to-PV-to-storage mappings and metrics for NFS and iSCSI; other runtimes only show NFS.

  • Beginning August 2022, the NetApp Kubernetes Monitoring Operator includes support for Pod Security Policy (PSP). You must upgrade to the latest NetApp Kubernetes Monitoring Operator if your environment uses PSP.

  • If you are running on OpenShift 4.6 or higher, you must follow the OpenShift Instructions below in addition to ensuring these pre-requisites are met.

  • Monitoring is installed only on Linux nodes.
    Cloud Insights supports monitoring of Kubernetes nodes that are running Linux by specifying a Kubernetes node selector that looks for the following Kubernetes labels on these platforms:

    Platform                                                    Label
    Kubernetes v1.20 and above                                  kubernetes.io/os = linux
    Rancher + cattle.io as orchestration/Kubernetes platform    cattle.io/os = linux

  • The NetApp Kubernetes Monitoring Operator and its dependencies (telegraf, kube-state-metrics, fluentbit, etc.) are not supported on nodes that are running with Arm64 architecture.

  • The following commands must be available: curl, kubectl. The docker command is required for an optional installation step. For best results, add these commands to the PATH. Note that kubectl needs to be configured with access to the following Kubernetes objects at a minimum: agents, clusterroles, clusterrolebindings, customresourcedefinitions, deployments, namespaces, roles, rolebindings, secrets, serviceaccounts, and services. See here for an example .yaml file with these minimum clusterrole privileges.

  • The host you will use for the NetApp Kubernetes Monitoring Operator installation must have kubectl configured to communicate with the target K8s cluster, and have Internet connectivity to your Cloud Insights environment.

  • If you are behind a proxy during installation, or when operating the K8s cluster to be monitored, follow the instructions in the Configuring Proxy Support section.

  • The NetApp Kubernetes Monitoring Operator installs its own kube-state-metrics to avoid conflict with any other instances.

  • For accurate audit and data reporting, it is strongly recommended to synchronize the time on the Agent machine using Network Time Protocol (NTP) or Simple Network Time Protocol (SNTP).

  • If you are re-deploying the Operator (i.e. you are updating or replacing it), there is no need to create a new API token; you can re-use the previous token.

  • Also note that if you have a recent NetApp Kubernetes Monitoring Operator installed and are using an API access token that is renewable, expiring tokens will automatically be replaced by new/refreshed API access tokens.

  • Network monitoring:

    • Requires Linux kernel version 4.18.0 or above.

    • Photon OS is not supported.
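Some of the prerequisites above can be checked from a shell before you start. The following is a minimal sketch, not part of the product: missing_commands and kernel_supported are hypothetical helper names, and the kernel comparison assumes GNU sort with the -V (version sort) option. The kernel requirement applies to the cluster nodes, so the release string must come from a node rather than from the installation host.

```shell
#!/bin/sh
# Hypothetical helpers for checking prerequisites; not shipped with the operator.

# Print each of the named commands that is NOT found on the PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Succeed if the kernel release in $1 (as printed by `uname -r`) is 4.18.0 or
# above, the minimum listed for network monitoring. Assumes GNU `sort -V`.
kernel_supported() {
    # Keep the leading numeric part: 5.15.0-91-generic -> 5.15.0
    release=$(printf '%s\n' "$1" | sed 's/[^0-9.].*$//')
    lowest=$(printf '%s\n%s\n' "4.18.0" "$release" | sort -V | head -n 1)
    [ "$lowest" = "4.18.0" ]
}

# Example use:
#   missing_commands curl kubectl     # prints nothing when both are available
#   kernel_supported "$(uname -r)" || echo "kernel too old for network monitoring"
#   kubectl get nodes -L kubernetes.io/os,kubernetes.io/arch   # review node OS and architecture
```

The last commented command adds the operating system and architecture labels as columns, which helps spot non-Linux or Arm64 nodes.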

Configuring the Operator

In newer versions of the operator, most commonly modified settings can be configured in the AgentConfiguration custom resource. You can edit this resource before deploying the operator by editing the operator-config.yaml file. This file includes commented out examples of some settings. See the list of available settings for the most recent version of the operator.

You can also edit this resource after the operator has been deployed using the following command:

kubectl -n netapp-monitoring edit AgentConfiguration

To determine if your deployed version of the operator supports AgentConfiguration, run the following command:

kubectl get crd agentconfigurations.monitoring.netapp.com

If you see an “Error from server (NotFound)” message, your operator must be upgraded before you can use the AgentConfiguration.
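The check above can be wrapped in a small script. This is a sketch only: agentconfiguration_status is a hypothetical function name, and it treats any failure other than NotFound (for example, a connectivity or permissions problem) as "unknown" rather than guessing.

```shell
#!/bin/sh
# Hypothetical helper: report whether the AgentConfiguration CRD is present.
agentconfiguration_status() {
    if output=$(kubectl get crd agentconfigurations.monitoring.netapp.com 2>&1); then
        echo "supported"
    elif printf '%s' "$output" | grep -q "NotFound"; then
        # The CRD does not exist, so the operator must be upgraded first.
        echo "upgrade-required"
    else
        # kubectl failed for some other reason; inspect the error manually.
        echo "unknown"
    fi
}

# Example use:
#   agentconfiguration_status
```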

Important Things to Note Before You Start

If you are running with a proxy, have a custom repository, or are using OpenShift, read the following sections carefully.

Also read about Permissions.

If you are upgrading from a previous installation, read the Upgrading information.

Configuring Proxy Support

There are two places where you may use a proxy in your environment in order to install the NetApp Kubernetes Monitoring Operator. These may be the same or separate proxy systems:

  • Proxy needed during execution of the installation code snippet (using "curl") to connect the system where the snippet is executed to your Cloud Insights environment

  • Proxy needed by the target Kubernetes cluster to communicate with your Cloud Insights environment

If you use a proxy for either or both of these, you must first ensure that the proxy allows communication with your Cloud Insights environment before you install the NetApp Kubernetes Monitoring Operator. For example, from the servers/VMs from which you wish to install the Operator, you must be able to access Cloud Insights and download binaries from it.

For the proxy used to install the NetApp Kubernetes Monitoring Operator, set the http_proxy/https_proxy environment variables before installing the Operator. For some proxy environments, you may also need to set the no_proxy environment variable.

To set the variable(s), perform the following steps on your system before installing the NetApp Kubernetes Monitoring Operator:

  1. Set the https_proxy and/or http_proxy environment variable(s) for the current user:

    1. If the proxy being set up does not require authentication (username/password), run the following command:

      export https_proxy=<proxy_server>:<proxy_port>

    2. If the proxy being set up does require authentication (username/password), run this command:

      export http_proxy=<proxy_username>:<proxy_password>@<proxy_server>:<proxy_port>
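The steps above can be sketched as a pair of shell functions. This is an illustration under stated assumptions: proxy_value and set_proxy_env are hypothetical names, the host and port in the example are placeholders, and exporting the same value to both variables is a convenience choice rather than a documented requirement.

```shell
#!/bin/sh
# Build a proxy value in the two forms shown above:
#   <proxy_server>:<proxy_port>                                    (no authentication)
#   <proxy_username>:<proxy_password>@<proxy_server>:<proxy_port>  (authentication)
proxy_value() {
    server=$1
    port=$2
    user=${3:-}
    password=${4:-}
    if [ -n "$user" ]; then
        printf '%s:%s@%s:%s\n' "$user" "$password" "$server" "$port"
    else
        printf '%s:%s\n' "$server" "$port"
    fi
}

# Export the same value for both variables in the current shell session.
set_proxy_env() {
    value=$(proxy_value "$@")
    export http_proxy="$value"
    export https_proxy="$value"
}

# Example use (placeholder host and port):
#   set_proxy_env proxy.example.com 3128
#   export no_proxy=localhost,127.0.0.1    # only if your environment needs it
```

The variables only apply to the current shell session, so run the installation snippet from the same session.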

For the proxy used for your Kubernetes cluster to communicate with your Cloud Insights environment, install the NetApp Kubernetes Monitoring Operator after reading all of these instructions.

Configure the proxy section of AgentConfiguration in operator-config.yaml before deploying the NetApp Kubernetes Monitoring Operator.

agent:
  ...
  proxy:
    server: <server for proxy>
    port: <port for proxy>
    username: <username for proxy>
    password: <password for proxy>

    # In the noproxy section, enter a comma-separated list of
    # IP addresses and/or resolvable hostnames that should bypass
    # the proxy
    noproxy: <comma separated list>

    isTelegrafProxyEnabled: true
    isFluentbitProxyEnabled: <true or false> # true if Events Log enabled
    isCollectorsProxyEnabled: <true or false> # true if Network Performance and Map enabled
    isAuProxyEnabled: <true or false> # true if AU enabled
  ...
...

Using a custom or private docker repository

By default, the NetApp Kubernetes Monitoring Operator will pull container images from the Cloud Insights repository. If you have a Kubernetes cluster used as the target for monitoring, and that cluster is configured to only pull container images from a custom or private Docker repository or container registry, you must configure access to the containers needed by the NetApp Kubernetes Monitoring Operator.

Run the “Image Pull Snippet” from the NetApp Monitoring Operator install tile. This command will log into the Cloud Insights repository, pull all image dependencies for the operator, and log out of the Cloud Insights repository. When prompted, enter the provided repository temporary password. This command downloads all images used by the operator, including for optional features. See below for which features these images are used for.

Core Operator Functionality and Kubernetes Monitoring

  • netapp-monitoring

  • kube-rbac-proxy

  • kube-state-metrics

  • telegraf

  • distroless-root-user

Events Log

  • fluent-bit

  • kubernetes-event-exporter

Network Performance and Map

  • ci-net-observer

Push the operator docker image to your private/local/enterprise docker repository according to your corporate policies. Ensure that the image tags and directory paths to these images in your repository are consistent with those in the Cloud Insights repository.
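As an illustration of the push step, the following sketch re-tags a previously pulled image and pushes it to your repository. The function names target_image and mirror_image are hypothetical, and the sketch assumes a flat layout of <your repo>/<image name>:<tag>; adjust it if your corporate policy requires a different path.

```shell
#!/bin/sh
# Map a source image reference to the same image name and tag under your own
# repository. Assumes a flat layout: <your repo>/<image name>:<tag>.
target_image() {
    repo=$1
    source_image=$2
    # ${source_image##*/} keeps everything after the last "/", i.e. name:tag
    printf '%s/%s\n' "$repo" "${source_image##*/}"
}

# Re-tag a previously pulled image and push it to your repository.
mirror_image() {
    target=$(target_image "$1" "$2")
    docker tag "$2" "$target" && docker push "$target"
}

# Example use (placeholder names):
#   mirror_image your.docker.repo/long/path/to/test <source repo>/telegraf:<version>
```

Repeat the call for each image pulled by the Image Pull Snippet, and log in to your own repository with docker login first if it requires authentication.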

Edit the monitoring-operator deployment in operator-deployment.yaml, and modify all image references to use your private Docker repository.

image: <docker repo of the enterprise/corp docker repo>/kube-rbac-proxy:<kube-rbac-proxy version>
image: <docker repo of the enterprise/corp docker repo>/netapp-monitoring:<version>

Edit the AgentConfiguration in operator-config.yaml to reflect the new docker repo location. Create a new imagePullSecret for your private repository; for more details, see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/

agent:
  ...
  # An optional docker registry where you want docker images to be pulled from as compared to CI's docker registry
  # Please see documentation link here: https://docs.netapp.com/us-en/cloudinsights/task_config_telegraf_agent_k8s.html#using-a-custom-or-private-docker-repository
  dockerRepo: your.docker.repo/long/path/to/test
  # Optional: A docker image pull secret that may be needed for your private docker registry
  dockerImagePullSecret: docker-secret-name
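One way to create the image pull secret is with kubectl create secret docker-registry. The sketch below is a hedged example: create_pull_secret is a hypothetical wrapper, the namespace netapp-monitoring is assumed from the edit command shown earlier on this page, and the registry and credentials are placeholders.

```shell
#!/bin/sh
# Create a docker-registry secret that pods can use to pull from your repository.
# Arguments: namespace, secret name, registry server, username, password.
create_pull_secret() {
    kubectl -n "$1" create secret docker-registry "$2" \
        --docker-server="$3" \
        --docker-username="$4" \
        --docker-password="$5"
}

# Example use (placeholder values; the secret name matches dockerImagePullSecret):
#   create_pull_secret netapp-monitoring docker-secret-name your.docker.repo <username> <password>
```

The secret must live in the same namespace as the pods that reference it.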

OpenShift Instructions

If you are running on OpenShift 4.6 or higher, you must edit the AgentConfiguration in operator-config.yaml to enable the runPrivileged setting:

# Set runPrivileged to true if SELinux is enabled on your kubernetes nodes
runPrivileged: true

OpenShift may implement an added level of security that may block access to some Kubernetes components.

Permissions

If the cluster you are monitoring contains Custom Resources which do not have a ClusterRole which aggregates to view, you will need to manually grant the operator access to these resources to monitor them with Event Logs.

  1. Edit operator-additional-permissions.yaml before installing or, after installing, edit the resource ClusterRole/<namespace>-additional-permissions

  2. Create a new rule for the desired apiGroups and resources with the verbs ["get", "watch", "list"]. See https://kubernetes.io/docs/reference/access-authn-authz/rbac/

  3. Apply your changes to the cluster
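To illustrate step 2, the sketch below prints a rule in standard Kubernetes RBAC syntax for one API group and resource. The function name rbac_rule, the API group example.com, and the resource widgets are all hypothetical; the exact layout of operator-additional-permissions.yaml is not shown here, so treat the output as a fragment to place under the ClusterRole's rules list.

```shell
#!/bin/sh
# Print an RBAC rule granting read-only access to one resource.
# Arguments: API group, resource (plural name).
rbac_rule() {
    cat <<EOF
- apiGroups: ["$1"]
  resources: ["$2"]
  verbs: ["get", "watch", "list"]
EOF
}

# Example use (hypothetical custom resource):
#   rbac_rule example.com widgets
```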

Tolerations and Taints

The netapp-ci-telegraf-ds, netapp-ci-fluent-bit-ds, and netapp-ci-net-observer-l4-ds DaemonSets must schedule a pod on every node in your cluster in order to correctly collect data on all nodes. The operator has been configured to tolerate some well-known taints. If you have configured any custom taints on your nodes, thus preventing pods from running on every node, you can create a toleration for those taints in the AgentConfiguration. If you have applied custom taints to all nodes in your cluster, you must also add the necessary tolerations to the operator deployment to allow the operator pod to be scheduled and executed.
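To find out which custom taints exist before writing tolerations, you can list the taint keys on each node and compare them with the DaemonSet pod counts. This is a sketch using standard kubectl output options; the function names are hypothetical, and the default namespace netapp-monitoring is an assumption taken from the edit command shown earlier on this page.

```shell
#!/bin/sh
# List each node with the keys of any taints applied to it.
list_node_taints() {
    kubectl get nodes \
        -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}

# Show desired versus ready pod counts for the DaemonSets in a namespace.
# Argument: namespace (defaults to netapp-monitoring, an assumption).
list_daemonsets() {
    kubectl -n "${1:-netapp-monitoring}" get daemonsets
}

# Example use:
#   list_node_taints
#   list_daemonsets
```

A node that carries a taint but runs no pod from these DaemonSets is a candidate for a new toleration.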

Learn More about Kubernetes Taints and Tolerations.