Testing the MetroCluster configuration
You can test failure scenarios to confirm that the MetroCluster configuration is operating correctly.
Verifying negotiated switchover
You can test the negotiated (planned) switchover operation to confirm nondisruptive data availability.
This test validates that data availability is not affected (except for Microsoft Server Message Block (SMB) and Solaris Fibre Channel protocols) by switching the cluster over to the second data center.
This test should take about 30 minutes.
This procedure has the following expected results:
- The metrocluster switchover command will present a warning prompt. If you respond yes to the prompt, the site the command is issued from will switch over the partner site.
For MetroCluster IP configurations:

- For ONTAP 9.4 and earlier:
  - Mirrored aggregates will become degraded after the negotiated switchover.
- For ONTAP 9.5 and later:
  - Mirrored aggregates will remain in a normal state if the remote storage is accessible.
  - Mirrored aggregates will become degraded after the negotiated switchover if access to the remote storage is lost.
- For ONTAP 9.8 and later:
  - Unmirrored aggregates that are located at the disaster site will become unavailable if access to the remote storage is lost. This might lead to a controller outage.
1. Confirm that all nodes are in the configured state and normal mode:

   metrocluster node show

   cluster_A::> metrocluster node show
   Cluster                        Configuration State    Mode
   ------------------------------ ---------------------- ------------------------
    Local: cluster_A              configured             normal
   Remote: cluster_B              configured             normal
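   If you run this test regularly, you can script the precondition check above instead of eyeballing the table. The following is a minimal Python sketch, assuming you have already captured the metrocluster node show text (for example over an SSH session); the Local:/Remote: row layout it matches is taken from the sample above and can differ between ONTAP releases.

   ```python
   import re

   def all_nodes_ready(node_show_output: str) -> bool:
       """Return True if every cluster row of 'metrocluster node show'
       reports a 'configured' configuration state and 'normal' mode."""
       # Matches rows such as " Local: cluster_A  configured  normal".
       rows = re.findall(r"^\s*(?:Local|Remote):\s+\S+\s+(\S+)\s+(\S+)\s*$",
                         node_show_output, flags=re.MULTILINE)
       return bool(rows) and all(state == "configured" and mode == "normal"
                                 for state, mode in rows)

   sample = """
    Local: cluster_A              configured             normal
   Remote: cluster_B              configured             normal
   """
   assert all_nodes_ready(sample)  # safe to start the switchover test
   ```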
2. Begin the switchover operation:

   metrocluster switchover

   cluster_A::> metrocluster switchover

   Warning: negotiated switchover is about to start. It will stop all the data
            Vservers on cluster "cluster_B" and automatically re-start them on
            cluster "cluster_A". It will finally gracefully shutdown cluster
            "cluster_B".
3. Confirm that the local cluster is in the configured state and switchover mode:

   metrocluster node show

   cluster_A::> metrocluster node show
   Cluster                        Configuration State    Mode
   ------------------------------ ---------------------- ------------------------
    Local: cluster_A              configured             switchover
   Remote: cluster_B              not-reachable          -
                                  configured             normal
4. Confirm that the switchover operation was successful:

   metrocluster operation show

   cluster_A::> metrocluster operation show
     Operation: switchover
         State: successful
    Start Time: 2/6/2016 13:28:50
      End Time: 2/6/2016 13:29:41
        Errors: -
5. Use the vserver show and network interface show commands to verify that the DR SVMs and LIFs have come online.
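To automate this last verification, a small parser can confirm that every data SVM reports a running operational state. This is a best-effort Python sketch keyed to the vserver show -type data sample shown later in this document; column layouts differ between releases, so revalidate it against your own output.

```python
def dr_svms_online(vserver_show_output: str) -> bool:
    """Return True if every data SVM row of 'vserver show -type data'
    reports a 'running' operational state."""
    rows = [line.split() for line in vserver_show_output.splitlines()]
    # Data SVM rows carry 'data' in the Type column (second field).
    svm_rows = [r for r in rows if len(r) >= 4 and r[1] == "data"]
    return bool(svm_rows) and all("running" in r for r in svm_rows)
```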
Verifying healing and manual switchback
You can test the healing and manual switchback operations by switching the cluster back to the original data center after a negotiated switchover, verifying that data availability is not affected (except for SMB and Solaris FC configurations).
This test should take about 30 minutes.
The expected result of this procedure is that services should be switched back to their home nodes.
The healing steps are not required on systems running ONTAP 9.5 or later, on which healing is performed automatically after a negotiated switchover. On systems running ONTAP 9.6 and later, healing is also performed automatically after an unscheduled switchover.
1. If the system is running ONTAP 9.4 or earlier, heal the data aggregates:

   metrocluster heal aggregates

   The following example shows the successful completion of the command:

   cluster_A::> metrocluster heal aggregates
   [Job 936] Job succeeded: Heal Aggregates is successful.
2. If the system is running ONTAP 9.4 or earlier, heal the root aggregates:

   metrocluster heal root-aggregates

   This step is required on the following configurations:

   - MetroCluster FC configurations.
   - MetroCluster IP configurations running ONTAP 9.4 or earlier.

   The following example shows the successful completion of the command:

   cluster_A::> metrocluster heal root-aggregates
   [Job 937] Job succeeded: Heal Root Aggregates is successful.
3. Verify that healing is complete:

   metrocluster node show

   The following example shows the successful completion of the command:

   cluster_A::> metrocluster node show
   DR                               Configuration  DR
   Group Cluster Node               State          Mirroring Mode
   ----- ------- ------------------ -------------- --------- --------------------
   1     cluster_A
                 node_A_1           configured     enabled   heal roots completed
         cluster_B
                 node_B_2           unreachable    -         switched over
   2 entries were displayed.

   If the automatic healing operation fails for any reason, you must issue the metrocluster heal commands manually, as is done in ONTAP versions earlier than ONTAP 9.5. You can use the metrocluster operation show and metrocluster operation history show -instance commands to monitor the status of healing and determine the cause of a failure.
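   Because healing can take a while, it is convenient to poll metrocluster operation show rather than watch the console. The sketch below shows one way to do that from Python over SSH with the third-party paramiko library; the host, credentials, and timeouts are placeholders, and the 'failed' state string is an assumption to adapt to the states your release actually reports.

   ```python
   import time
   import paramiko  # third-party SSH library: pip install paramiko

   def wait_for_operation(host: str, user: str, password: str,
                          timeout_s: int = 1800, poll_s: int = 30) -> None:
       """Poll 'metrocluster operation show' until its State line reports
       'successful', raising on failure or timeout."""
       ssh = paramiko.SSHClient()
       ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
       ssh.connect(host, username=user, password=password)
       try:
           deadline = time.monotonic() + timeout_s
           while time.monotonic() < deadline:
               _, stdout, _ = ssh.exec_command("metrocluster operation show")
               text = stdout.read().decode()
               if "State: successful" in text:
                   return
               if "State: failed" in text:  # assumed failure wording
                   raise RuntimeError("operation failed:\n" + text)
               time.sleep(poll_s)
           raise TimeoutError("MetroCluster operation did not complete in time")
       finally:
           ssh.close()
   ```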
4. Verify that all aggregates are mirrored:

   storage aggregate show

   The following example shows that all aggregates have a RAID status of mirrored:

   cluster_A::> storage aggregate show
   cluster Aggregates:
   Aggregate     Size Available Used% State   #Vols  Nodes       RAID Status
   --------- -------- --------- ----- ------- ------ ----------- ------------
   data_cluster
               4.19TB    4.13TB    2% online      8 node_A_1    raid_dp,
                                                                mirrored,
                                                                normal
   root_cluster
              715.5GB   212.7GB   70% online      1 node_A_1    raid4,
                                                                mirrored,
                                                                normal
   cluster_B Switched Over Aggregates:
   Aggregate     Size Available Used% State   #Vols  Nodes       RAID Status
   --------- -------- --------- ----- ------- ------ ----------- ------------
   data_cluster_B
               4.19TB    4.11TB    2% online      5 node_A_1    raid_dp,
                                                                mirrored,
                                                                normal
   root_cluster_B
                    -         -     - unknown      - node_A_1   -
5. Check the status of switchback recovery:

   metrocluster node show

   cluster_A::> metrocluster node show
   DR                               Configuration  DR
   Group Cluster Node               State          Mirroring Mode
   ----- ------- ------------------ -------------- --------- --------------------
   1     cluster_A
                 node_A_1           configured     enabled   heal roots completed
         cluster_B
                 node_B_2           configured     enabled   waiting for switchback
                                                             recovery
   2 entries were displayed.
6. Perform the switchback:

   metrocluster switchback

   cluster_A::> metrocluster switchback
   [Job 938] Job succeeded: Switchback is successful.
7. Confirm the status of the nodes:

   metrocluster node show

   cluster_A::> metrocluster node show
   DR                               Configuration  DR
   Group Cluster Node               State          Mirroring Mode
   ----- ------- ------------------ -------------- --------- --------------------
   1     cluster_A
                 node_A_1           configured     enabled   normal
         cluster_B
                 node_B_2           configured     enabled   normal
   2 entries were displayed.
8. Confirm the status of the MetroCluster operation:

   metrocluster operation show

   The output should show a successful state.

   cluster_A::> metrocluster operation show
     Operation: switchback
         State: successful
    Start Time: 2/6/2016 13:54:25
      End Time: 2/6/2016 13:56:15
        Errors: -
Verifying operation after power line disruption
You can test the MetroCluster configuration's response to the failure of a PDU.
The best practice is for each power supply unit (PSU) in a component to be connected to a separate power supply. If both PSUs are connected to the same power distribution unit (PDU) and an electrical disruption occurs, the site could go down or a complete shelf might become unavailable. Failure of one power line is tested to confirm that there is no cabling mismatch that could cause a service disruption.
This test should take about 15 minutes.
This test requires turning off power to all of the left-hand PDUs and then all of the right-hand PDUs on all of the racks containing the MetroCluster components.
This procedure has the following expected results:
- Errors should be generated as the PDUs are disconnected.
- No failover or loss of service should occur.
1. Turn off the power of the PDUs on the left-hand side of the rack containing the MetroCluster components.
2. Monitor the result on the console:

   system environment sensors show -state fault

   storage shelf show -errors

   cluster_A::> system environment sensors show -state fault
   Node Sensor                State Value/Units Crit-Low Warn-Low Warn-Hi Crit-Hi
   ---- --------------------- ------ ----------- -------- -------- ------- -------
   node_A_1
        PSU1                  fault  PSU_OFF
        PSU1 Pwr In OK        fault  FAULT
   node_A_2
        PSU1                  fault  PSU_OFF
        PSU1 Pwr In OK        fault  FAULT
   4 entries were displayed.

   cluster_A::> storage shelf show -errors
       Shelf Name: 1.1
        Shelf UID: 50:0a:09:80:03:6c:44:d5
    Serial Number: SHFHU1443000059

   Error Type          Description
   ------------------  ---------------------------
   Power               Critical condition is detected in storage shelf
                       power supply unit "1". The unit might fail.
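   If you capture the sensor output during the test, a one-line filter is enough to list the faulted sensors programmatically. A minimal sketch, assuming the whitespace-delimited rows shown above:

   ```python
   def fault_rows(sensors_output: str) -> list[str]:
       """Return the rows of 'system environment sensors show -state fault'
       output that are flagged 'fault'."""
       return [line.strip() for line in sensors_output.splitlines()
               if " fault " in f" {line} "]

   # During the PDU test you expect PSU fault rows; after power is
   # restored (steps 3 and 4 below) the list should become empty.
   ```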
3. Turn the power back on to the left-hand PDUs.
4. Make sure that ONTAP clears the error condition (a polling sketch follows these steps).
5. Repeat the preceding steps with the right-hand PDUs.
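For step 4, instead of re-running the sensor command by hand, you can poll until ONTAP reports no faulted sensors. A sketch, assuming a caller-supplied function that executes the command and returns its text; the all-clear test reuses the "no entries matching your query" wording ONTAP prints elsewhere in this document, and the timeout values are illustrative only.

```python
import time
from typing import Callable

def wait_for_fault_clear(get_sensors_output: Callable[[], str],
                         timeout_s: int = 600, poll_s: int = 15) -> None:
    """Poll 'system environment sensors show -state fault' output until
    ONTAP reports no faulted sensors."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        text = get_sensors_output()
        if "no entries matching" in text or not text.strip():
            return  # error condition cleared
        time.sleep(poll_s)
    raise TimeoutError("sensor faults did not clear after power was restored")
```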
Verifying operation after loss of a single storage shelf
You can test the failure of a single storage shelf to verify that there is no single point of failure.
This procedure has the following expected results:

- An error message should be reported by the monitoring software.
- No failover or loss of service should occur.
- Mirror resynchronization starts automatically after the hardware failure is restored.
1. Check the storage failover status:

   storage failover show

   cluster_A::> storage failover show
   Node           Partner        Possible State Description
   -------------- -------------- -------- -------------------------------------
   node_A_1       node_A_2       true     Connected to node_A_2
   node_A_2       node_A_1       true     Connected to node_A_1
   2 entries were displayed.
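   A scripted version of this check only needs to confirm that the Possible column reads true for every node row. A minimal Python sketch based on the column layout above:

   ```python
   def takeover_possible(failover_output: str) -> bool:
       """Return True if every node row of 'storage failover show' reports
       Takeover Possible == true."""
       rows = [line.split() for line in failover_output.splitlines()]
       # Node rows carry 'true' or 'false' in the third column.
       node_rows = [r for r in rows if len(r) >= 3 and r[2] in ("true", "false")]
       return bool(node_rows) and all(r[2] == "true" for r in node_rows)
   ```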
2. Check the aggregate status:

   storage aggregate show

   cluster_A::> storage aggregate show
   cluster Aggregates:
   Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
   --------- -------- --------- ----- ------- ------ ---------------- ------------
   node_A_1data01_mirrored
               4.15TB    3.40TB   18% online      3 node_A_1         raid_dp,
                                                                     mirrored,
                                                                     normal
   node_A_1root
              707.7GB   34.29GB   95% online      1 node_A_1         raid_dp,
                                                                     mirrored,
                                                                     normal
   node_A_2_data01_mirrored
               4.15TB    4.12TB    1% online      2 node_A_2         raid_dp,
                                                                     mirrored,
                                                                     normal
   node_A_2_data02_unmirrored
               2.18TB    2.18TB    0% online      1 node_A_2         raid_dp,
                                                                     normal
   node_A_2_root
              707.7GB   34.27GB   95% online      1 node_A_2         raid_dp,
                                                                     mirrored,
                                                                     normal
3. Verify that all data SVMs and data volumes are online and serving data:

   vserver show -type data

   network interface show -fields is-home false

   volume show !vol0,!MDV*

   cluster_A::> vserver show -type data
                                  Admin      Operational Root
   Vserver     Type    Subtype    State      State       Volume     Aggregate
   ----------- ------- ---------- ---------- ----------- ---------- ----------
   SVM1        data    sync-source           running     SVM1_root  node_A_1_data01_mirrored
   SVM2        data    sync-source           running     SVM2_root  node_A_2_data01_mirrored

   cluster_A::> network interface show -fields is-home false
   There are no entries matching your query.

   cluster_A::> volume show !vol0,!MDV*
   Vserver   Volume        Aggregate                  State   Type  Size Available Used%
   --------- ------------- -------------------------- ------- ---- ----- --------- -----
   SVM1      SVM1_root     node_A_1data01_mirrored    online  RW    10GB    9.50GB    5%
   SVM1      SVM1_data_vol node_A_1data01_mirrored    online  RW    10GB    9.49GB    5%
   SVM2      SVM2_root     node_A_2_data01_mirrored   online  RW    10GB    9.49GB    5%
   SVM2      SVM2_data_vol node_A_2_data02_unmirrored online  RW     1GB   972.6MB    5%
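   These outputs are also easy to check mechanically: the LIF query should return no entries, and every volume row should be online. A sketch, assuming the single-line volume rows shown above:

   ```python
   def all_lifs_home(lif_show_output: str) -> bool:
       """True if 'network interface show -fields is-home false' returns no rows."""
       return "no entries matching" in lif_show_output

   def volumes_online(volume_show_output: str) -> bool:
       """True if every volume row of 'volume show !vol0,!MDV*' is online."""
       rows = [line.split() for line in volume_show_output.splitlines()]
       # Volume rows carry the state in the fourth column.
       vol_rows = [r for r in rows
                   if len(r) >= 8 and r[3] in ("online", "offline", "restricted")]
       return bool(vol_rows) and all(r[3] == "online" for r in vol_rows)
   ```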
4. Identify a shelf in Pool 1 for node node_A_2 to power off to simulate a sudden hardware failure:

   storage aggregate show -r -node node-name !*root

   The shelf you select must contain drives that are in mirrored data aggregates.
   In the following example, shelf ID 31 is selected to fail.

   cluster_A::> storage aggregate show -r -node node_A_2 !*root
   Owner Node: node_A_2
    Aggregate: node_A_2_data01_mirrored (online, raid_dp, mirrored) (block checksums)
     Plex: /node_A_2_data01_mirrored/plex0 (online, normal, active, pool0)
      RAID Group /node_A_2_data01_mirrored/plex0/rg0 (normal, block checksums)
                                                                 Usable Physical
        Position Disk                        Pool Type     RPM     Size     Size Status
        -------- --------------------------- ---- ----- ------ -------- -------- ----------
        dparity  2.30.3                       0   BSAS    7200  827.7GB  828.0GB (normal)
        parity   2.30.4                       0   BSAS    7200  827.7GB  828.0GB (normal)
        data     2.30.6                       0   BSAS    7200  827.7GB  828.0GB (normal)
        data     2.30.8                       0   BSAS    7200  827.7GB  828.0GB (normal)
        data     2.30.5                       0   BSAS    7200  827.7GB  828.0GB (normal)

     Plex: /node_A_2_data01_mirrored/plex4 (online, normal, active, pool1)
      RAID Group /node_A_2_data01_mirrored/plex4/rg0 (normal, block checksums)
                                                                 Usable Physical
        Position Disk                        Pool Type     RPM     Size     Size Status
        -------- --------------------------- ---- ----- ------ -------- -------- ----------
        dparity  1.31.7                       1   BSAS    7200  827.7GB  828.0GB (normal)
        parity   1.31.6                       1   BSAS    7200  827.7GB  828.0GB (normal)
        data     1.31.3                       1   BSAS    7200  827.7GB  828.0GB (normal)
        data     1.31.4                       1   BSAS    7200  827.7GB  828.0GB (normal)
        data     1.31.5                       1   BSAS    7200  827.7GB  828.0GB (normal)

    Aggregate: node_A_2_data02_unmirrored (online, raid_dp) (block checksums)
     Plex: /node_A_2_data02_unmirrored/plex0 (online, normal, active, pool0)
      RAID Group /node_A_2_data02_unmirrored/plex0/rg0 (normal, block checksums)
                                                                 Usable Physical
        Position Disk                        Pool Type     RPM     Size     Size Status
        -------- --------------------------- ---- ----- ------ -------- -------- ----------
        dparity  2.30.12                      0   BSAS    7200  827.7GB  828.0GB (normal)
        parity   2.30.22                      0   BSAS    7200  827.7GB  828.0GB (normal)
        data     2.30.21                      0   BSAS    7200  827.7GB  828.0GB (normal)
        data     2.30.20                      0   BSAS    7200  827.7GB  828.0GB (normal)
        data     2.30.14                      0   BSAS    7200  827.7GB  828.0GB (normal)
   15 entries were displayed.
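   Picking a shelf by eye from this output is error-prone, so you may prefer to extract the shelf IDs hosting pool 1 drives programmatically. The sketch below assumes the stack.shelf.bay disk-naming convention visible above (for example, disk 1.31.7 sits in shelf 31); confirm that the candidate shelf backs only mirrored data aggregates before powering it off.

   ```python
   import re

   def pool1_shelves(agg_show_r_output: str) -> set[int]:
       """Collect shelf IDs of pool 1 drives from
       'storage aggregate show -r' output."""
       shelves = set()
       # Disk names are stack.shelf.bay followed by the pool column.
       for m in re.finditer(r"(\d+)\.(\d+)\.(\d+)\s+1\s+\S+", agg_show_r_output):
           shelves.add(int(m.group(2)))
       return shelves

   # With the example output above, pool1_shelves(...) returns {31}.
   ```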
5. Physically power off the shelf that you selected.
6. Check the aggregate status again:

   storage aggregate show

   storage aggregate show -r -node node_A_2 !*root

   The aggregate with drives on the powered-off shelf should have a "mirror degraded" RAID status, and the drives on the affected plex should have a "failed" status, as shown in the following example:

   cluster_A::> storage aggregate show
   Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
   --------- -------- --------- ----- ------- ------ ---------------- ------------
   node_A_1data01_mirrored
               4.15TB    3.40TB   18% online      3 node_A_1         raid_dp,
                                                                     mirrored,
                                                                     normal
   node_A_1root
              707.7GB   34.29GB   95% online      1 node_A_1         raid_dp,
                                                                     mirrored,
                                                                     normal
   node_A_2_data01_mirrored
               4.15TB    4.12TB    1% online      2 node_A_2         raid_dp,
                                                                     mirror
                                                                     degraded
   node_A_2_data02_unmirrored
               2.18TB    2.18TB    0% online      1 node_A_2         raid_dp,
                                                                     normal
   node_A_2_root
              707.7GB   34.27GB   95% online      1 node_A_2         raid_dp,
                                                                     mirror
                                                                     degraded

   cluster_A::> storage aggregate show -r -node node_A_2 !*root
   Owner Node: node_A_2
    Aggregate: node_A_2_data01_mirrored (online, raid_dp, mirror degraded) (block checksums)
     Plex: /node_A_2_data01_mirrored/plex0 (online, normal, active, pool0)
      RAID Group /node_A_2_data01_mirrored/plex0/rg0 (normal, block checksums)
                                                                 Usable Physical
        Position Disk                        Pool Type     RPM     Size     Size Status
        -------- --------------------------- ---- ----- ------ -------- -------- ----------
        dparity  2.30.3                       0   BSAS    7200  827.7GB  828.0GB (normal)
        parity   2.30.4                       0   BSAS    7200  827.7GB  828.0GB (normal)
        data     2.30.6                       0   BSAS    7200  827.7GB  828.0GB (normal)
        data     2.30.8                       0   BSAS    7200  827.7GB  828.0GB (normal)
        data     2.30.5                       0   BSAS    7200  827.7GB  828.0GB (normal)

     Plex: /node_A_2_data01_mirrored/plex4 (offline, failed, inactive, pool1)
      RAID Group /node_A_2_data01_mirrored/plex4/rg0 (partial, none checksums)
                                                                 Usable Physical
        Position Disk                        Pool Type     RPM     Size     Size Status
        -------- --------------------------- ---- ----- ------ -------- -------- ----------
        dparity  FAILED                       -   -          -  827.7GB        - (failed)
        parity   FAILED                       -   -          -  827.7GB        - (failed)
        data     FAILED                       -   -          -  827.7GB        - (failed)
        data     FAILED                       -   -          -  827.7GB        - (failed)
        data     FAILED                       -   -          -  827.7GB        - (failed)

    Aggregate: node_A_2_data02_unmirrored (online, raid_dp) (block checksums)
     Plex: /node_A_2_data02_unmirrored/plex0 (online, normal, active, pool0)
      RAID Group /node_A_2_data02_unmirrored/plex0/rg0 (normal, block checksums)
                                                                 Usable Physical
        Position Disk                        Pool Type     RPM     Size     Size Status
        -------- --------------------------- ---- ----- ------ -------- -------- ----------
        dparity  2.30.12                      0   BSAS    7200  827.7GB  828.0GB (normal)
        parity   2.30.22                      0   BSAS    7200  827.7GB  828.0GB (normal)
        data     2.30.21                      0   BSAS    7200  827.7GB  828.0GB (normal)
        data     2.30.20                      0   BSAS    7200  827.7GB  828.0GB (normal)
        data     2.30.14                      0   BSAS    7200  827.7GB  828.0GB (normal)
   15 entries were displayed.
7. Verify that the data is being served and that all volumes are still online:

   vserver show -type data

   network interface show -fields is-home false

   volume show !vol0,!MDV*

   cluster_A::> vserver show -type data
                                  Admin      Operational Root
   Vserver     Type    Subtype    State      State       Volume     Aggregate
   ----------- ------- ---------- ---------- ----------- ---------- ----------
   SVM1        data    sync-source           running     SVM1_root  node_A_1_data01_mirrored
   SVM2        data    sync-source           running     SVM2_root  node_A_1_data01_mirrored

   cluster_A::> network interface show -fields is-home false
   There are no entries matching your query.

   cluster_A::> volume show !vol0,!MDV*
   Vserver   Volume        Aggregate                  State   Type  Size Available Used%
   --------- ------------- -------------------------- ------- ---- ----- --------- -----
   SVM1      SVM1_root     node_A_1data01_mirrored    online  RW    10GB    9.50GB    5%
   SVM1      SVM1_data_vol node_A_1data01_mirrored    online  RW    10GB    9.49GB    5%
   SVM2      SVM2_root     node_A_1data01_mirrored    online  RW    10GB    9.49GB    5%
   SVM2      SVM2_data_vol node_A_2_data02_unmirrored online  RW     1GB   972.6MB    5%
8. Physically power the shelf back on.

   Resynchronization starts automatically.
9. Verify that resynchronization has started:

   storage aggregate show

   The RAID status of the affected aggregates should be "resyncing", as shown in the following example:

   cluster_A::> storage aggregate show
   cluster Aggregates:
   Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
   --------- -------- --------- ----- ------- ------ ---------------- ------------
   node_A_1_data01_mirrored
               4.15TB    3.40TB   18% online      3 node_A_1         raid_dp,
                                                                     mirrored,
                                                                     normal
   node_A_1_root
              707.7GB   34.29GB   95% online      1 node_A_1         raid_dp,
                                                                     mirrored,
                                                                     normal
   node_A_2_data01_mirrored
               4.15TB    4.12TB    1% online      2 node_A_2         raid_dp,
                                                                     resyncing
   node_A_2_data02_unmirrored
               2.18TB    2.18TB    0% online      1 node_A_2         raid_dp,
                                                                     normal
   node_A_2_root
              707.7GB   34.27GB   95% online      1 node_A_2         raid_dp,
                                                                     resyncing
10. Monitor the aggregates to confirm that resynchronization is complete:

    storage aggregate show

    The RAID status of the affected aggregates should be "normal", as shown in the following example:

    cluster_A::> storage aggregate show
    cluster Aggregates:
    Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
    --------- -------- --------- ----- ------- ------ ---------------- ------------
    node_A_1data01_mirrored
                4.15TB    3.40TB   18% online      3 node_A_1         raid_dp,
                                                                      mirrored,
                                                                      normal
    node_A_1root
               707.7GB   34.29GB   95% online      1 node_A_1         raid_dp,
                                                                      mirrored,
                                                                      normal
    node_A_2_data01_mirrored
                4.15TB    4.12TB    1% online      2 node_A_2         raid_dp,
                                                                      normal
    node_A_2_data02_unmirrored
                2.18TB    2.18TB    0% online      1 node_A_2         raid_dp,
                                                                      normal
    node_A_2_root
               707.7GB   34.27GB   95% online      1 node_A_2         raid_dp,
                                                                      resyncing
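    Because resynchronization of large aggregates can run for hours, a polling helper saves console-watching. A sketch, assuming a caller-supplied function that runs storage aggregate show and returns its text; the "resyncing" keyword matches the RAID status strings shown above, and the timeout is an arbitrary placeholder.

    ```python
    import time
    from typing import Callable

    def wait_for_resync(get_agg_show_output: Callable[[], str],
                        timeout_s: int = 7200, poll_s: int = 60) -> None:
        """Poll 'storage aggregate show' output until no aggregate reports a
        'resyncing' RAID status."""
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            if "resyncing" not in get_agg_show_output():
                return  # all mirrors are back to normal
            time.sleep(poll_s)
        raise TimeoutError("mirror resynchronization did not finish in time")
    ```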