Alerting with Monitors

Contributors: netapp-alavoie

You create monitors to set thresholds that trigger alerts to notify you about issues related to the resources in your network. For example, you can create a monitor to alert for node write latency for any of a multitude of protocols.

Monitors and Alerting are available in Cloud Insights Standard Edition and higher.

When the monitored threshold and conditions are reached or exceeded, Cloud Insights creates an alert. A Monitor can have a Warning threshold, a Critical threshold, or both.
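As a rough illustration of how Warning and Critical thresholds relate, the sketch below classifies a single metric sample. It is not Cloud Insights code: the function name `classify` and its parameters are hypothetical, and it assumes a "reached or exceeded" (greater than or equal) comparison as described above.

```python
from typing import Optional


def classify(value: float,
             warning: Optional[float] = None,
             critical: Optional[float] = None) -> Optional[str]:
    """Return "Critical", "Warning", or None for one sample.

    Hypothetical helper: a threshold counts as breached when the value
    reaches or exceeds it. Either threshold may be omitted, because a
    monitor can define a Warning threshold, a Critical threshold, or both.
    """
    # Check the more severe level first so it takes precedence.
    if critical is not None and value >= critical:
        return "Critical"
    if warning is not None and value >= warning:
        return "Warning"
    return None
```

For example, with a Warning threshold of 200 and a Critical threshold of 400, a sample of 250 would be classified as Warning and a sample of 400 as Critical.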

Monitor or Performance Policy?

What’s the difference between a Performance Policy and a Monitor?

Policies allow you to set thresholds on "infrastructure" objects such as storage, VM, EC2, and ports. These policies trigger violations when thresholds are met or exceeded. Each violation can be investigated for troubleshooting. Policies are described in detail elsewhere in this documentation.

Monitors provide similar functionality for "integration" data such as those collected for Kubernetes, ONTAP advanced metrics, and Telegraf plugins, and alert when thresholds are crossed. With Monitors, you can set thresholds for Warning- or Critical-level alerts, or both.

Policies and Monitors are available under the Alerts menu.

Alerts Menu

Emails can be sent when a policy or monitor is triggered.

Creating a Monitor

In the example below, we will create a Monitor to raise a Warning alert when Volume Node NFS Write Latency reaches or exceeds 200 ms, and a Critical alert when it reaches or exceeds 400 ms. We only want to be alerted when either threshold is exceeded for at least 15 continuous minutes.

Requirements

  • Cloud Insights must be configured to collect integration data, and that data must be actively collected.

Create the Monitor

  1. From the Cloud Insights menu, click Alerts > Manage Monitors.

    The Monitors list page is displayed, showing currently configured monitors.

  2. To add a monitor, click + Monitor. To modify an existing monitor, click the monitor name in the list.

    The Monitor Configuration dialog is displayed.

  3. In the drop-down, search for and choose an object type and metric to monitor, for example netapp_ontap_volume_node_nfs_write_latency.

You can set filters to narrow down which object attributes or metrics to monitor.

Metrics Filtering

When working with integration data (Kubernetes, ONTAP Advanced Data, etc.), metric filtering removes the individual, unmatched data points from the plotted data series. This differs from infrastructure data (storage, VM, ports, etc.), where filters work on the aggregated value of the data series and can remove the entire object from the chart.
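The difference between the two filtering behaviors can be sketched as follows. This is an illustration only: the sample data, the object names, the filter value, and the use of a mean as the aggregated value are all assumptions, not a description of how Cloud Insights is implemented.

```python
from statistics import mean

# Hypothetical samples: object name -> metric values over time.
series = {
    "node-a": [5, 50, 8, 60],
    "node-b": [4, 6, 5, 7],
}


def filter_points(data, minimum):
    """Integration-style filtering: drop individual unmatched points."""
    return {name: [v for v in values if v >= minimum]
            for name, values in data.items()}


def filter_objects(data, minimum):
    """Infrastructure-style filtering: test the aggregated value
    (assumed here to be the mean) and keep or drop the whole object."""
    return {name: values for name, values in data.items()
            if mean(values) >= minimum}


point_filtered = filter_points(series, 10)
object_filtered = filter_objects(series, 10)
```

In this sketch, point-level filtering keeps `node-a` with only its two matching points and leaves `node-b` with an empty series, while aggregate filtering keeps all four points of `node-a` and removes `node-b` entirely.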

To create a multi-condition monitor (e.g., IOPS > X and latency > Y), define the first condition as a threshold and the second condition as a filter.
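A minimal sketch of the multi-condition idea, assuming each sample carries both metrics: the filter (latency > Y) narrows the data first, and the threshold (IOPS > X) is then evaluated only on what remains. The field names `iops` and `latency_ms` and the function `breaching_samples` are hypothetical.

```python
# Hypothetical samples carrying both metrics for one object.
samples = [
    {"iops": 1200, "latency_ms": 3},
    {"iops": 1500, "latency_ms": 25},
    {"iops": 400, "latency_ms": 30},
]


def breaching_samples(data, iops_threshold, latency_filter):
    """Apply the second condition as a filter, then the first as a threshold."""
    # Filter: keep only samples whose latency is above Y.
    kept = [s for s in data if s["latency_ms"] > latency_filter]
    # Threshold: of those, report samples whose IOPS is above X.
    return [s for s in kept if s["iops"] > iops_threshold]


result = breaching_samples(samples, iops_threshold=1000, latency_filter=20)
```

Only the second sample satisfies both conditions, so it is the only one that would count toward an alert in this sketch.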

Define the Conditions of the Monitor

  1. After choosing the object and metric to monitor, set the Warning-level and/or Critical-level thresholds.

  2. For the Warning level, enter 200. The dashed line indicating this Warning level displays in the example graph.

  3. For the Critical level, enter 400. The dashed line indicating this Critical level displays in the example graph.

    The graph displays historical data. The Warning and Critical level lines on the graph are a visual representation of the Monitor, so you can easily see when the Monitor might trigger an alert in each case.

  4. For the occurrence interval, choose Continuously for a period of 15 Minutes.

    You can choose to trigger an alert the moment a threshold is breached, or wait until the threshold has been in continuous breach for a period of time. In our example, we do not want to be alerted every time the write latency peaks above the Warning or Critical level, but only when a monitored object continuously exceeds one of these levels for at least 15 minutes.

    Define Conditions
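The "Continuously for a period of" condition can be thought of along the following lines. This is a sketch under stated assumptions, not the product's evaluation logic: the function `sustained_breach` is hypothetical, samples are assumed to arrive as time-ordered (timestamp, value) pairs, and any sample below the threshold is assumed to reset the timer.

```python
from datetime import datetime, timedelta


def sustained_breach(samples, threshold, window=timedelta(minutes=15)):
    """Return the timestamp at which a breach has lasted `window`, or None.

    `samples` is an iterable of (timestamp, value) pairs sorted by time.
    """
    breach_start = None
    for ts, value in samples:
        if value >= threshold:
            if breach_start is None:
                breach_start = ts          # breach begins with this sample
            if ts - breach_start >= window:
                return ts                  # breach has been continuous long enough
        else:
            breach_start = None            # a dip below the threshold resets the timer
    return None


start = datetime(2024, 1, 1, 0, 0)
values = [250, 250, 150, 250, 250, 250, 250]   # one sample every 5 minutes
samples = [(start + timedelta(minutes=5 * i), v) for i, v in enumerate(values)]
alert_time = sustained_breach(samples, threshold=200)
```

In this example the first two samples breach the 200 ms Warning level, but the dip at the third sample resets the timer, so the alert condition is only met once the later run of breaching samples has lasted 15 minutes.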

Select notification type and recipients

In the Set up team notification(s) section, you can choose whether to alert your team via email or Webhook.

Choose alerting method

Alerting via Email:

Specify the email recipients for alert notifications. If desired, you can choose different recipients for warning or critical alerts.

Email Alert Recipients

Alerting via Webhook:

Specify the webhook(s) for alert notifications. If desired, you can choose different webhooks for warning or critical alerts.

Webhook Alerting

Webhooks are considered a Preview feature and are therefore subject to change.
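The ability to choose different recipients per severity and per alerting method can be pictured as a simple lookup. The sketch below is purely illustrative: the class `TeamNotification`, the webhook name, and the example.com addresses are hypothetical and do not reflect any Cloud Insights API or data format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TeamNotification:
    """Hypothetical model of a monitor's notification settings."""
    # Each mapping goes from severity ("Warning" or "Critical") to recipients.
    email: Dict[str, List[str]] = field(default_factory=dict)
    webhook: Dict[str, List[str]] = field(default_factory=dict)

    def targets(self, severity: str) -> Dict[str, List[str]]:
        """Return the email addresses and webhooks to notify for a severity."""
        return {
            "email": list(self.email.get(severity, [])),
            "webhook": list(self.webhook.get(severity, [])),
        }


settings = TeamNotification(
    email={
        "Warning": ["storage-team@example.com"],
        "Critical": ["storage-team@example.com", "oncall@example.com"],
    },
    webhook={"Critical": ["pager-webhook"]},
)
```

With these settings, a Warning alert would go to the storage team by email only, while a Critical alert would also reach the on-call address and the webhook.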

Save your Monitor

  1. If desired, you can add a description of the monitor.

  2. Give the Monitor a meaningful name and click Save.

    Your new monitor is added to the list of active Monitors.

Monitor List

The Monitor page lists the currently configured monitors, showing the following:

  • Monitor Name

  • Status

  • Object/metric being monitored

  • Conditions of the Monitor

You can choose to temporarily suspend monitoring of an object type by clicking the menu to the right of the monitor and selecting Pause. When you are ready to resume monitoring, click Resume.

You can copy a monitor by selecting Duplicate from the menu. You can then modify the new monitor and change the object/metric, filter, conditions, email recipients, etc.

If a monitor is no longer needed, you can delete it by selecting Delete from the menu.

Monitor Groups

Grouping allows you to view and manage related monitors. For example, you can have a monitor group dedicated to the storage in your environment, or monitors relevant to a certain recipient list.

Monitor Grouping

The number of monitors contained in a group is shown next to the group name.

To create a new group, click the "+" Create New Monitor Group button. Enter a name for the group and click Create Group. An empty group is created with that name.

To add monitors to the group, go to the All Monitors group (recommended) and do one of the following:

  • To add a single monitor, click the menu to the right of the monitor and select Add to Group. Choose the group to which to add the monitor.

  • Click on the monitor name to open the monitor’s edit view, and select a group in the Associate to a monitor group section.

    Associate to group

Remove monitors by clicking on a group and selecting Remove from Group from the menu. You cannot remove monitors from the All Monitors or Custom Monitors groups. To delete a monitor from these groups, you must delete the monitor itself.

Removing a monitor from a group does not delete the monitor from Cloud Insights. To completely remove a monitor, select the monitor and click Delete. This also removes it from the group to which it belonged and it is no longer available to any user.

You can also move a monitor to a different group in the same manner, by selecting Move to Group.

Each monitor can belong to only a single group at any given time.

To pause or resume all monitors in a group at once, select the menu for the group and click Pause or Resume.

Use the same menu to rename or delete a group. Deleting a group does not delete the monitors from Cloud Insights; they are still available in All Monitors.

Pause a group
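The grouping rules described in this section (one group per monitor, removal from a group does not delete the monitor, deleting a group leaves its monitors in All Monitors) can be summarized in a small model. This is a sketch for illustration only; the class `MonitorRegistry` and its method names are hypothetical and are not part of Cloud Insights.

```python
class MonitorRegistry:
    """Hypothetical in-memory model of monitors and monitor groups."""

    def __init__(self):
        self.monitors = set()   # plays the role of "All Monitors"
        self.groups = set()     # user-created group names
        self.group_of = {}      # monitor name -> its single group

    def add_monitor(self, monitor):
        self.monitors.add(monitor)

    def create_group(self, group):
        self.groups.add(group)  # a new group starts out empty

    def add_to_group(self, monitor, group):
        if monitor not in self.monitors or group not in self.groups:
            raise KeyError("unknown monitor or group")
        # Overwriting enforces "only a single group at any given time";
        # it also models Move to Group.
        self.group_of[monitor] = group

    def remove_from_group(self, monitor):
        self.group_of.pop(monitor, None)   # the monitor itself is kept

    def delete_group(self, group):
        self.groups.discard(group)
        for monitor in [m for m, g in self.group_of.items() if g == group]:
            del self.group_of[monitor]     # monitors stay in All Monitors

    def delete_monitor(self, monitor):
        self.monitors.discard(monitor)
        self.group_of.pop(monitor, None)   # also leaves its group

    def count(self, group):
        """Number of monitors in a group, as shown next to the group name."""
        return sum(1 for g in self.group_of.values() if g == group)
```

For instance, adding a monitor to a "storage" group and then to an "oncall" group leaves it only in "oncall", and deleting "oncall" afterwards leaves the monitor ungrouped but still present in All Monitors.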