Skip to main content

How hardware-assisted takeover works

Contributors netapp-jsnyder netapp-thomi

Enabled by default, the hardware-assisted takeover feature can speed up the takeover process by using a node's remote management device (Service Processor).

When the remote management device detects a failure, it quickly initiates the takeover rather than waiting for ONTAP to recognize that the partner's heartbeat has stopped. If a failure occurs without this feature enabled, the partner waits until it notices that the node is no longer giving a heartbeat, confirms the loss of heartbeat, and then initiates the takeover.

The hardware-assisted takeover feature uses the following process to avoid that wait:

  1. The remote management device monitors the local system for certain types of failures.

  2. If a failure is detected, the remote management device immediately sends an alert to the partner node.

  3. Upon receiving the alert, the partner initiates takeover.

System events that trigger hardware-assisted takeover

The partner node might generate a takeover depending on the type of alert it receives from the remote management device (Service Processor).

Alert

Takeover initiated upon receipt?

Description

abnormal_reboot

No

An abnormal reboot of the node occurred.

l2_watchdog_reset

Yes

The system watchdog hardware detected an L2 reset.
The remote management device detected a lack of response from the system CPU and reset the system.

loss_of_heartbeat

No

The remote management device is no longer receiving the heartbeat message from the node.
This alert does not refer to the heartbeat messages between the nodes in the HA pair; it refers to the heartbeat between the node and its local remote management device.

periodic_message

No

A periodic message is sent during a normal hardware-assisted takeover operation.

power_cycle_via_sp

Yes

The remote management device cycled the system power off and on.

power_loss

Yes

A power loss occurred on the node.
The remote management device has a power supply that maintains power for a short period after a power loss, allowing it to report the power loss to the partner.

power_off_via_sp

Yes

The remote management device powered off the system.

reset_via_sp

Yes

The remote management device reset the system.

test

No

A test message is sent to verify a hardware-assisted takeover operation.