Migrate from a Cisco cluster switch to an NVIDIA SN2100 cluster switch
You can migrate Cisco cluster switches for an ONTAP cluster to NVIDIA SN2100 cluster switches. This is a nondisruptive procedure.
Review requirements
You must be aware of certain configuration information, port connections, and cabling requirements when you replace certain older Cisco cluster switches with NVIDIA SN2100 cluster switches. See Overview of installation and configuration for NVIDIA SN2100 switches.
The following Cisco cluster switches are supported:
- Nexus 9336C-FX2
- Nexus 92300YC
- Nexus 5596UP
- Nexus 3232C
- Nexus 3132Q-V

For details of supported ports and their configurations, see the Hardware Universe.
Ensure that:
- The existing cluster is properly set up and functioning.
- All cluster ports are in the up state to ensure nondisruptive operations.
- The NVIDIA SN2100 cluster switches are configured and running the proper version of Cumulus Linux with the reference configuration file (RCF) applied (see the verification sketch after these requirements).
- The existing cluster network configuration has the following:
  - A redundant and fully functional NetApp cluster using both older Cisco switches.
  - Management connectivity and console access to both the older Cisco switches and the new switches.
  - All cluster LIFs in the up state, with the cluster LIFs on their home ports.
  - ISL ports enabled and cabled between the older Cisco switches and between the new switches.
- Some of the ports on the NVIDIA SN2100 switches are configured to run at 40 GbE or 100 GbE.
- You have planned, migrated, and documented 40 GbE and 100 GbE connectivity from nodes to the NVIDIA SN2100 cluster switches.
NOTE: If you are changing the port speed of the e0a and e1a cluster ports on AFF A800 or AFF C800 systems, you might observe malformed packets being received after the speed conversion. See Bug 1570339 and the Knowledge Base article CRC errors on T6 ports after converting from 40GbE to 100GbE for guidance.
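To satisfy the switch requirement above, you can confirm the Cumulus Linux release and review the configuration applied on each SN2100 switch before you begin. This is a minimal sketch using NCLU commands; the output depends on your Cumulus Linux release and the RCF in use:

cumulus@sw1:~$ net show system
cumulus@sw1:~$ net show configuration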
Migrate the switches
In this procedure, Cisco Nexus 3232C cluster switches are used for example commands and outputs.
The examples in this procedure use the following switch and node nomenclature:
- The existing Cisco Nexus 3232C cluster switches are c1 and c2.
- The new NVIDIA SN2100 cluster switches are sw1 and sw2.
- The nodes are node1 and node2.
- The cluster LIFs are node1_clus1 and node1_clus2 on node1, and node2_clus1 and node2_clus2 on node2, respectively.
- The cluster1::*> prompt indicates the name of the cluster.
- The cluster ports used in this procedure are e3a and e3b.
- Breakout ports take the format swp[port]s[breakout port 0-3]. For example, four breakout ports on swp1 are swp1s0, swp1s1, swp1s2, and swp1s3. (A breakout configuration sketch follows this list.)
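If your cabling requires breakout ports and they are not already defined by the RCF, you can configure them with NCLU. This is a minimal sketch, assuming swp1 is split into four 25 GbE ports; adjust the port and breakout speed to match your environment and verify against your RCF before committing:

cumulus@sw1:~$ net add interface swp1 breakout 4x25G
cumulus@sw1:~$ net pending
cumulus@sw1:~$ net commit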
This procedure covers the following scenario:
- Switch c2 is replaced by switch sw2 first.
  - Shut down the ports to the cluster nodes. All ports must be shut down simultaneously to avoid cluster instability.
  - Cabling between the nodes and c2 is then disconnected from c2 and reconnected to sw2.
- Switch c1 is then replaced by switch sw1.
  - Shut down the ports to the cluster nodes. All ports must be shut down simultaneously to avoid cluster instability.
  - Cabling between the nodes and c1 is then disconnected from c1 and reconnected to sw1.
Step 1: Prepare for migration
- If AutoSupport is enabled on this cluster, suppress automatic case creation by invoking an AutoSupport message:
  system node autosupport invoke -node * -type all -message MAINT=xh
  where x is the duration of the maintenance window in hours.
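  For example, to suppress automatic case creation for a two-hour maintenance window:
  system node autosupport invoke -node * -type all -message MAINT=2h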
- Change the privilege level to advanced, entering y when prompted to continue:
  set -privilege advanced
  The advanced prompt (*>) appears.
- Disable auto-revert on the cluster LIFs:
  network interface modify -vserver Cluster -lif * -auto-revert false
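  Optionally, you can confirm that auto-revert is now disabled on all cluster LIFs; the auto-revert field should report false for each LIF. A quick check:
  network interface show -vserver Cluster -fields auto-revert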
Step 2: Configure ports and cabling
- Determine the administrative or operational status for each cluster interface. Each port should display up for Link and healthy for Health Status.
  - Display the network port attributes:
    network port show -ipspace Cluster
cluster1::*> network port show -ipspace Cluster

Node: node1
                                                                       Ignore
                                                  Speed(Mbps)   Health Health
Port      IPspace    Broadcast Domain Link MTU    Admin/Oper    Status Status
--------- ---------- ---------------- ---- ----- ------------ -------- ------
e3a       Cluster    Cluster          up   9000   auto/100000  healthy false
e3b       Cluster    Cluster          up   9000   auto/100000  healthy false

Node: node2
                                                                       Ignore
                                                  Speed(Mbps)   Health Health
Port      IPspace    Broadcast Domain Link MTU    Admin/Oper    Status Status
--------- ---------- ---------------- ---- ----- ------------ -------- ------
e3a       Cluster    Cluster          up   9000   auto/100000  healthy false
e3b       Cluster    Cluster          up   9000   auto/100000  healthy false
  - Display information about the logical interfaces and their designated home nodes:
    network interface show -vserver Cluster
    Each LIF should display up/up for Status Admin/Oper and true for Is Home.
cluster1::*> network interface show -vserver Cluster
            Logical     Status     Network            Current     Current Is
Vserver     Interface   Admin/Oper Address/Mask       Node        Port    Home
----------- ----------- ---------- ------------------ ----------- ------- ----
Cluster
            node1_clus1 up/up      169.254.209.69/16  node1       e3a     true
            node1_clus2 up/up      169.254.49.125/16  node1       e3b     true
            node2_clus1 up/up      169.254.47.194/16  node2       e3a     true
            node2_clus2 up/up      169.254.19.183/16  node2       e3b     true
- Verify that the cluster ports on each node are connected to the existing cluster switches in the following way, from the nodes' perspective:
  network device-discovery show -protocol lldp
cluster1::*> network device-discovery show -protocol lldp
Node/       Local  Discovered
Protocol    Port   Device (LLDP: ChassisID)  Interface         Platform
----------- ------ ------------------------- ----------------  ----------------
node1      /lldp
            e3a    c1 (6a:ad:4f:98:3b:3f)    Eth1/1            -
            e3b    c2 (6a:ad:4f:98:4c:a4)    Eth1/1            -
node2      /lldp
            e3a    c1 (6a:ad:4f:98:3b:3f)    Eth1/2            -
            e3b    c2 (6a:ad:4f:98:4c:a4)    Eth1/2            -
- Verify that the cluster ports and switches are connected in the following way, from the switches' perspective:
  show cdp neighbors
c1# show cdp neighbors

Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater,
                  V - VoIP-Phone, D - Remotely-Managed-Device,
                  s - Supports-STP-Dispute

Device-ID          Local Intrfce  Hldtme Capability  Platform      Port ID
node1              Eth1/1         124    H           AFF-A400      e3a
node2              Eth1/2         124    H           AFF-A400      e3a
c2                 Eth1/31        179    S I s       N3K-C3232C    Eth1/31
c2                 Eth1/32        175    S I s       N3K-C3232C    Eth1/32

c2# show cdp neighbors

Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater,
                  V - VoIP-Phone, D - Remotely-Managed-Device,
                  s - Supports-STP-Dispute

Device-ID          Local Intrfce  Hldtme Capability  Platform      Port ID
node1              Eth1/1         124    H           AFF-A400      e3b
node2              Eth1/2         124    H           AFF-A400      e3b
c1                 Eth1/31        175    S I s       N3K-C3232C    Eth1/31
c1                 Eth1/32        175    S I s       N3K-C3232C    Eth1/32
- Verify the connectivity of the remote cluster interfaces.
  You can use the network interface check cluster-connectivity command to start an accessibility check for cluster connectivity and then display the details:
  network interface check cluster-connectivity start
  network interface check cluster-connectivity show

  cluster1::*> network interface check cluster-connectivity start

  NOTE: Wait several seconds before running the show command to display the details.
cluster1::*> network interface check cluster-connectivity show
                                  Source           Destination      Packet
Node   Date                       LIF              LIF              Loss
------ -------------------------- ---------------- ---------------- -----------
node1
       3/5/2022 19:21:18 -06:00   node1_clus2      node2_clus1      none
       3/5/2022 19:21:20 -06:00   node1_clus2      node2_clus2      none
node2
       3/5/2022 19:21:18 -06:00   node2_clus2      node1_clus1      none
       3/5/2022 19:21:20 -06:00   node2_clus2      node1_clus2      none
  For all ONTAP releases, you can also use the cluster ping-cluster command to check the connectivity:
  cluster ping-cluster -node <name>
cluster1::*> cluster ping-cluster -node local
Host is node2
Getting addresses from network interface table...
Cluster node1_clus1 169.254.209.69 node1     e3a
Cluster node1_clus2 169.254.49.125 node1     e3b
Cluster node2_clus1 169.254.47.194 node2     e3a
Cluster node2_clus2 169.254.19.183 node2     e3b
Local = 169.254.47.194 169.254.19.183
Remote = 169.254.209.69 169.254.49.125
Cluster Vserver Id = 4294967293
Ping status:
....
Basic connectivity succeeds on 4 path(s)
Basic connectivity fails on 0 path(s)
................
Detected 9000 byte MTU on 4 path(s):
    Local 169.254.19.183 to Remote 169.254.209.69
    Local 169.254.19.183 to Remote 169.254.49.125
    Local 169.254.47.194 to Remote 169.254.209.69
    Local 169.254.47.194 to Remote 169.254.49.125
Larger than PMTU communication succeeds on 4 path(s)
RPC status:
2 paths up, 0 paths down (tcp check)
2 paths up, 0 paths down (udp check)
- On switch c2, shut down the ports connected to the cluster ports of the nodes in order to fail over the cluster LIFs.
(c2)# configure
Enter configuration commands, one per line. End with CNTL/Z.
(c2)(Config)# interface <interface_list>
(c2)(config-if-range)# shutdown
(c2)(config-if-range)# exit
(c2)(Config)# exit
(c2)#
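For example, assuming the node cluster ports connect to Ethernet 1/1 and 1/2 on c2, as shown in the show cdp neighbors output above, the shutdown might look like this (adjust the interface range to match your cabling):

(c2)# configure
(c2)(Config)# interface Eth1/1-2
(c2)(config-if-range)# shutdown
(c2)(config-if-range)# exit
(c2)(Config)# exit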
- Move the node cluster ports from the old switch c2 to the new switch sw2, using appropriate cabling supported by NVIDIA SN2100.
- Display the network port attributes:
  network port show -ipspace Cluster
cluster1::*> network port show -ipspace Cluster

Node: node1
                                                                       Ignore
                                                  Speed(Mbps)   Health Health
Port      IPspace    Broadcast Domain Link MTU    Admin/Oper    Status Status
--------- ---------- ---------------- ---- ----- ------------ -------- ------
e3a       Cluster    Cluster          up   9000   auto/100000  healthy false
e3b       Cluster    Cluster          up   9000   auto/100000  healthy false

Node: node2
                                                                       Ignore
                                                  Speed(Mbps)   Health Health
Port      IPspace    Broadcast Domain Link MTU    Admin/Oper    Status Status
--------- ---------- ---------------- ---- ----- ------------ -------- ------
e3a       Cluster    Cluster          up   9000   auto/100000  healthy false
e3b       Cluster    Cluster          up   9000   auto/100000  healthy false
- Verify that the cluster ports on each node are now connected to the cluster switches in the following way, from the nodes' perspective:
  network device-discovery show -protocol lldp
cluster1::*> network device-discovery show -protocol lldp
Node/       Local  Discovered
Protocol    Port   Device (LLDP: ChassisID)  Interface         Platform
----------- ------ ------------------------- ----------------  ----------------
node1      /lldp
            e3a    c1 (6a:ad:4f:98:3b:3f)    Eth1/1            -
            e3b    sw2 (b8:ce:f6:19:1a:7e)   swp3              -
node2      /lldp
            e3a    c1 (6a:ad:4f:98:3b:3f)    Eth1/2            -
            e3b    sw2 (b8:ce:f6:19:1b:96)   swp4              -
- On switch sw2, verify that all node cluster ports are up:
  net show interface
cumulus@sw2:~$ net show interface

State  Name         Spd   MTU    Mode        LLDP              Summary
-----  -----------  ----  -----  ----------  ----------------- -----------------------
...
...
UP     swp3         100G  9216   Trunk/L2    e3b               Master: bridge(UP)
UP     swp4         100G  9216   Trunk/L2    e3b               Master: bridge(UP)
UP     swp15        100G  9216   BondMember  sw1 (swp15)       Master: cluster_isl(UP)
UP     swp16        100G  9216   BondMember  sw1 (swp16)       Master: cluster_isl(UP)
- On switch c1, shut down the ports connected to the cluster ports of the nodes in order to fail over the cluster LIFs.
(c1)# configure
Enter configuration commands, one per line. End with CNTL/Z.
(c1)(Config)# interface <interface_list>
(c1)(config-if-range)# shutdown
(c1)(config-if-range)# exit
(c1)(Config)# exit
(c1)#
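As with switch c2, assuming the node cluster ports connect to Ethernet 1/1 and 1/2 on c1 (adjust the range to match your cabling):

(c1)# configure
(c1)(Config)# interface Eth1/1-2
(c1)(config-if-range)# shutdown
(c1)(config-if-range)# exit
(c1)(Config)# exit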
- Move the node cluster ports from the old switch c1 to the new switch sw1, using appropriate cabling supported by NVIDIA SN2100.
- Verify the final configuration of the cluster:
  network port show -ipspace Cluster
  Each port should display up for Link and healthy for Health Status.
cluster1::*> network port show -ipspace Cluster

Node: node1
                                                                       Ignore
                                                  Speed(Mbps)   Health Health
Port      IPspace    Broadcast Domain Link MTU    Admin/Oper    Status Status
--------- ---------- ---------------- ---- ----- ------------ -------- ------
e3a       Cluster    Cluster          up   9000   auto/100000  healthy false
e3b       Cluster    Cluster          up   9000   auto/100000  healthy false

Node: node2
                                                                       Ignore
                                                  Speed(Mbps)   Health Health
Port      IPspace    Broadcast Domain Link MTU    Admin/Oper    Status Status
--------- ---------- ---------------- ---- ----- ------------ -------- ------
e3a       Cluster    Cluster          up   9000   auto/100000  healthy false
e3b       Cluster    Cluster          up   9000   auto/100000  healthy false
- Verify that the cluster ports on each node are now connected to the cluster switches in the following way, from the nodes' perspective:
  network device-discovery show -protocol lldp
cluster1::*> network device-discovery show -protocol lldp
Node/       Local  Discovered
Protocol    Port   Device (LLDP: ChassisID)  Interface       Platform
----------- ------ ------------------------- --------------  ----------------
node1      /lldp
            e3a    sw1 (b8:ce:f6:19:1a:7e)   swp3            -
            e3b    sw2 (b8:ce:f6:19:1b:96)   swp3            -
node2      /lldp
            e3a    sw1 (b8:ce:f6:19:1a:7e)   swp4            -
            e3b    sw2 (b8:ce:f6:19:1b:96)   swp4            -
- On switches sw1 and sw2, verify that all node cluster ports are up:
  net show interface
cumulus@sw1:~$ net show interface

State  Name         Spd   MTU    Mode        LLDP              Summary
-----  -----------  ----  -----  ----------  ----------------- -----------------------
...
...
UP     swp3         100G  9216   Trunk/L2    e3a               Master: bridge(UP)
UP     swp4         100G  9216   Trunk/L2    e3a               Master: bridge(UP)
UP     swp15        100G  9216   BondMember  sw2 (swp15)       Master: cluster_isl(UP)
UP     swp16        100G  9216   BondMember  sw2 (swp16)       Master: cluster_isl(UP)

cumulus@sw2:~$ net show interface

State  Name         Spd   MTU    Mode        LLDP              Summary
-----  -----------  ----  -----  ----------  ----------------- -----------------------
...
...
UP     swp3         100G  9216   Trunk/L2    e3b               Master: bridge(UP)
UP     swp4         100G  9216   Trunk/L2    e3b               Master: bridge(UP)
UP     swp15        100G  9216   BondMember  sw1 (swp15)       Master: cluster_isl(UP)
UP     swp16        100G  9216   BondMember  sw1 (swp16)       Master: cluster_isl(UP)
- Verify that both nodes each have one connection to each switch:
  net show lldp
The following example shows the appropriate results for both switches:
cumulus@sw1:~$ net show lldp

LocalPort  Speed  Mode        RemoteHost          RemotePort
---------  -----  ----------  ------------------  -----------
swp3       100G   Trunk/L2    node1               e3a
swp4       100G   Trunk/L2    node2               e3a
swp15      100G   BondMember  sw2                 swp15
swp16      100G   BondMember  sw2                 swp16

cumulus@sw2:~$ net show lldp

LocalPort  Speed  Mode        RemoteHost          RemotePort
---------  -----  ----------  ------------------  -----------
swp3       100G   Trunk/L2    node1               e3b
swp4       100G   Trunk/L2    node2               e3b
swp15      100G   BondMember  sw1                 swp15
swp16      100G   BondMember  sw1                 swp16
Step 3: Verify the configuration
- Enable auto-revert on the cluster LIFs:
  network interface modify -vserver Cluster -lif * -auto-revert true
- Verify that all cluster network LIFs are back on their home ports:
  network interface show
cluster1::*> network interface show -vserver Cluster
            Logical     Status     Network            Current       Current Is
Vserver     Interface   Admin/Oper Address/Mask       Node          Port    Home
----------- ----------- ---------- ------------------ ------------- ------- ----
Cluster
            node1_clus1 up/up      169.254.209.69/16  node1         e3a     true
            node1_clus2 up/up      169.254.49.125/16  node1         e3b     true
            node2_clus1 up/up      169.254.47.194/16  node2         e3a     true
            node2_clus2 up/up      169.254.19.183/16  node2         e3b     true
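  If any cluster LIF has not returned to its home port, you can revert the cluster LIFs manually. A minimal sketch (you can also revert an individual LIF by name instead of using the wildcard):
  network interface revert -vserver Cluster -lif *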
- Change the privilege level back to admin:
  set -privilege admin
- If you suppressed automatic case creation, re-enable it by invoking an AutoSupport message:
  system node autosupport invoke -node * -type all -message MAINT=END