Migrate CN1610 cluster switches to NVIDIA SN2100 cluster switches
You can migrate NetApp CN1610 cluster switches for an ONTAP cluster to NVIDIA SN2100 cluster switches. This is a nondisruptive procedure.
Review requirements
You must be aware of certain configuration information, port connections, and cabling requirements when replacing NetApp CN1610 cluster switches with NVIDIA SN2100 cluster switches. See Overview of installation and configuration for NVIDIA SN2100 switches.
The following cluster switches are supported:
- NetApp CN1610
- NVIDIA SN2100
For details of supported ports and their configurations, see the Hardware Universe.
Verify that you meet the following requirements for your configuration:
- The existing cluster is correctly set up and functioning.
- All cluster ports are in the up state to ensure nondisruptive operations.
- The NVIDIA SN2100 cluster switches are configured and running the correct version of Cumulus Linux with the reference configuration file (RCF) applied.
- The existing cluster network configuration has the following:
  - A redundant and fully functional NetApp cluster using CN1610 switches.
  - Management connectivity and console access to both the CN1610 switches and the new switches.
  - All cluster LIFs in the up state, with the cluster LIFs on their home ports.
  - ISL ports enabled and cabled between the CN1610 switches and between the new switches.
- Some of the ports on the NVIDIA SN2100 switches are configured to run at 40GbE or 100GbE (a configuration sketch follows this list).
- You have planned, migrated, and documented 40GbE and 100GbE connectivity from the nodes to the NVIDIA SN2100 cluster switches.
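The 40GbE speed and breakout settings on the SN2100 switches are normally applied through the RCF. As a minimal sketch only, assuming the NCLU command set on Cumulus Linux and using swp1 and swp2 as hypothetical ports (not values taken from this procedure), the configuration could look like this:

cumulus@sw1:~$ net add interface swp1 breakout 4x25G    # split swp1 into breakout ports swp1s0-swp1s3
cumulus@sw1:~$ net add interface swp2 link speed 40000  # run swp2 at 40GbE
cumulus@sw1:~$ net pending                              # review the staged changes
cumulus@sw1:~$ net commit                               # apply the changes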
Migrate the switches
The examples in this procedure use the following switch and node nomenclature:
- The existing CN1610 cluster switches are c1 and c2.
- The new NVIDIA SN2100 cluster switches are sw1 and sw2.
- The nodes are node1 and node2.
- The cluster LIFs are node1_clus1 and node1_clus2 on node1, and node2_clus1 and node2_clus2 on node2.
- The cluster1::*> prompt indicates the name of the cluster.
- The cluster ports used in this procedure are e3a and e3b.
- Breakout ports take the format swp[port]s[breakout port 0-3]. For example, four breakout ports on swp1 are swp1s0, swp1s1, swp1s2, and swp1s3.
This procedure covers the following scenario:
- Switch c2 is replaced by switch sw2 first.
  - Shut down the ports to the cluster nodes. All ports must be shut down simultaneously to avoid cluster instability.
  - The cabling between the nodes and c2 is then disconnected from c2 and reconnected to sw2.
- Switch c1 is replaced by switch sw1.
  - Shut down the ports to the cluster nodes. All ports must be shut down simultaneously to avoid cluster instability.
  - The cabling between the nodes and c1 is then disconnected from c1 and reconnected to sw1.
No operational inter-switch link (ISL) is needed during this procedure. This is by design, because RCF version changes can affect ISL connectivity temporarily. To ensure nondisruptive cluster operations, this procedure migrates all of the cluster LIFs to the operational partner switch while performing the steps on the target switch.
Step 1: Prepare for migration
- If AutoSupport is enabled on this cluster, suppress automatic case creation by invoking an AutoSupport message:
  system node autosupport invoke -node * -type all -message MAINT=xh
  where x is the duration of the maintenance window in hours.
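  For example, to suppress case creation for a two-hour maintenance window (the two-hour value is only an assumption for illustration; substitute your own duration for x):

cluster1::*> system node autosupport invoke -node * -type all -message MAINT=2h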
- Change the privilege level to advanced, entering y when prompted to continue:
  set -privilege advanced
  The advanced prompt (*>) appears.
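  For example (the exact warning text can vary between ONTAP releases):

cluster1::> set -privilege advanced

Warning: These advanced commands are potentially dangerous; use them only when directed to do so by NetApp personnel.
Do you want to continue? {y|n}: y

cluster1::*>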
- Disable auto-revert on the cluster LIFs:
  network interface modify -vserver Cluster -lif * -auto-revert false
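  As an optional check that is not part of the original steps, you can confirm that auto-revert is now disabled by listing the field for the cluster LIFs; each LIF should report false:

cluster1::*> network interface show -vserver Cluster -fields auto-revert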
Step 2: Configure ports and cabling
- Determine the administrative or operational status for each cluster interface. Each port should display up for Link and healthy for Health Status.
  - Display the network port attributes:
    network port show -ipspace Cluster

cluster1::*> network port show -ipspace Cluster

Node: node1
                                                                       Ignore
                                                  Speed(Mbps)  Health  Health
Port      IPspace    Broadcast Domain Link  MTU   Admin/Oper   Status  Status
--------- ---------- ---------------- ---- ----- ------------ -------- ------
e3a       Cluster    Cluster          up    9000  auto/100000  healthy  false
e3b       Cluster    Cluster          up    9000  auto/100000  healthy  false

Node: node2
                                                                       Ignore
                                                  Speed(Mbps)  Health  Health
Port      IPspace    Broadcast Domain Link  MTU   Admin/Oper   Status  Status
--------- ---------- ---------------- ---- ----- ------------ -------- ------
e3a       Cluster    Cluster          up    9000  auto/100000  healthy  false
e3b       Cluster    Cluster          up    9000  auto/100000  healthy  false
  - Display information about the LIFs and their designated home nodes:
    network interface show -vserver Cluster
    Each LIF should display up/up for Status Admin/Oper and true for Is Home.

cluster1::*> network interface show -vserver Cluster
            Logical     Status     Network            Current       Current Is
Vserver     Interface   Admin/Oper Address/Mask       Node          Port    Home
----------- ----------- ---------- ------------------ ------------- ------- ----
Cluster
            node1_clus1 up/up      169.254.209.69/16  node1         e3a     true
            node1_clus2 up/up      169.254.49.125/16  node1         e3b     true
            node2_clus1 up/up      169.254.47.194/16  node2         e3a     true
            node2_clus2 up/up      169.254.19.183/16  node2         e3b     true
  - Verify that the cluster ports on each node are connected to the existing cluster switches in the following way (from the nodes' perspective):
    network device-discovery show -protocol cdp

cluster1::*> network device-discovery show -protocol cdp
Node/       Local  Discovered
Protocol    Port   Device (LLDP: ChassisID)  Interface         Platform
----------- ------ ------------------------- ----------------  ----------------
node1      /cdp
             e3a   c1 (6a:ad:4f:98:3b:3f)    0/1               -
             e3b   c2 (6a:ad:4f:98:4c:a4)    0/1               -
node2      /cdp
             e3a   c1 (6a:ad:4f:98:3b:3f)    0/2               -
             e3b   c2 (6a:ad:4f:98:4c:a4)    0/2               -
  - Verify that the cluster ports and switches are connected in the following way (from the switches' perspective):
    show cdp neighbors

c1# show cdp neighbors

Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater,
                  V - VoIP-Phone, D - Remotely-Managed-Device,
                  s - Supports-STP-Dispute

Device-ID          Local Intrfce  Hldtme  Capability  Platform    Port ID
node1              0/1            124     H           AFF-A400    e3a
node2              0/2            124     H           AFF-A400    e3a
c2                 0/13           179     S I s       CN1610      0/13
c2                 0/14           175     S I s       CN1610      0/14
c2                 0/15           179     S I s       CN1610      0/15
c2                 0/16           175     S I s       CN1610      0/16

c2# show cdp neighbors

Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater,
                  V - VoIP-Phone, D - Remotely-Managed-Device,
                  s - Supports-STP-Dispute

Device-ID          Local Intrfce  Hldtme  Capability  Platform    Port ID
node1              0/1            124     H           AFF-A400    e3b
node2              0/2            124     H           AFF-A400    e3b
c1                 0/13           175     S I s       CN1610      0/13
c1                 0/14           175     S I s       CN1610      0/14
c1                 0/15           175     S I s       CN1610      0/15
c1                 0/16           175     S I s       CN1610      0/16
- Verify the connectivity of the remote cluster interfaces:
  You can use the network interface check cluster-connectivity command to start an accessibility check for cluster connectivity and then display the details:
  network interface check cluster-connectivity start
  network interface check cluster-connectivity show

cluster1::*> network interface check cluster-connectivity start

  NOTE: Wait a number of seconds before running the show command to display the details.
cluster1::*> network interface check cluster-connectivity show
                                  Source           Destination      Packet
Node   Date                       LIF              LIF              Loss
------ -------------------------- ---------------- ---------------- -----------
node1
       3/5/2022 19:21:18 -06:00   node1_clus2      node2-clus1      none
       3/5/2022 19:21:20 -06:00   node1_clus2      node2_clus2      none
node2
       3/5/2022 19:21:18 -06:00   node2_clus2      node1_clus1      none
       3/5/2022 19:21:20 -06:00   node2_clus2      node1_clus2      none
  For all ONTAP releases, you can also use the cluster ping-cluster command to check connectivity:
  cluster ping-cluster -node <name>
cluster1::*> cluster ping-cluster -node local
Host is node2
Getting addresses from network interface table...
Cluster node1_clus1 169.254.209.69 node1     e3a
Cluster node1_clus2 169.254.49.125 node1     e3b
Cluster node2_clus1 169.254.47.194 node2     e3a
Cluster node2_clus2 169.254.19.183 node2     e3b
Local = 169.254.47.194 169.254.19.183
Remote = 169.254.209.69 169.254.49.125
Cluster Vserver Id = 4294967293
Ping status:
....
Basic connectivity succeeds on 4 path(s)
Basic connectivity fails on 0 path(s)
................
Detected 9000 byte MTU on 4 path(s):
    Local 169.254.19.183 to Remote 169.254.209.69
    Local 169.254.19.183 to Remote 169.254.49.125
    Local 169.254.47.194 to Remote 169.254.209.69
    Local 169.254.47.194 to Remote 169.254.49.125
Larger than PMTU communication succeeds on 4 path(s)
RPC status:
2 paths up, 0 paths down (tcp check)
2 paths up, 0 paths down (udp check)
- On switch c2, shut down the ports connected to the cluster ports of the nodes in order to fail over the cluster LIFs.

(c2)# configure
(c2)(Config)# interface 0/1-0/12
(c2)(Interface 0/1-0/12)# shutdown
(c2)(Interface 0/1-0/12)# exit
(c2)(Config)# exit
(c2)#
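  Before moving the cables, you can optionally confirm that the cluster LIFs homed on the ports connected to c2 (e3b in this example) have failed over; this check is an addition to the procedure, and the affected LIFs should show false under Is Home until they are reverted later:

cluster1::*> network interface show -vserver Cluster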
- Move the node cluster ports from the old switch c2 to the new switch sw2, using appropriate cabling supported by NVIDIA SN2100.
- Display the network port attributes:
  network port show -ipspace Cluster

cluster1::*> network port show -ipspace Cluster

Node: node1
                                                                       Ignore
                                                  Speed(Mbps)  Health  Health
Port      IPspace    Broadcast Domain Link  MTU   Admin/Oper   Status  Status
--------- ---------- ---------------- ---- ----- ------------ -------- ------
e3a       Cluster    Cluster          up    9000  auto/100000  healthy  false
e3b       Cluster    Cluster          up    9000  auto/100000  healthy  false

Node: node2
                                                                       Ignore
                                                  Speed(Mbps)  Health  Health
Port      IPspace    Broadcast Domain Link  MTU   Admin/Oper   Status  Status
--------- ---------- ---------------- ---- ----- ------------ -------- ------
e3a       Cluster    Cluster          up    9000  auto/100000  healthy  false
e3b       Cluster    Cluster          up    9000  auto/100000  healthy  false
- Verify that the cluster ports on each node are now connected to the cluster switches in the following way, from the nodes' perspective:
  network device-discovery show -protocol lldp

cluster1::*> network device-discovery show -protocol lldp
Node/       Local  Discovered
Protocol    Port   Device (LLDP: ChassisID)  Interface     Platform
----------- ------ ------------------------- ------------- ----------------
node1      /lldp
             e3a   c1 (6a:ad:4f:98:3b:3f)    0/1           -
             e3b   sw2 (b8:ce:f6:19:1a:7e)   swp3          -
node2      /lldp
             e3a   c1 (6a:ad:4f:98:3b:3f)    0/2           -
             e3b   sw2 (b8:ce:f6:19:1b:96)   swp4          -
- On switch sw2, verify that all node cluster ports are up:
  net show interface

cumulus@sw2:~$ net show interface

State  Name         Spd   MTU    Mode        LLDP              Summary
-----  -----------  ----  -----  ----------  ----------------  -----------------------
...
...
UP     swp3         100G  9216   Trunk/L2    e3b               Master: bridge(UP)
UP     swp4         100G  9216   Trunk/L2    e3b               Master: bridge(UP)
UP     swp15        100G  9216   BondMember  sw1 (swp15)       Master: cluster_isl(UP)
UP     swp16        100G  9216   BondMember  sw1 (swp16)       Master: cluster_isl(UP)
- On switch c1, shut down the ports connected to the cluster ports of the nodes in order to fail over the cluster LIFs.

(c1)# configure
(c1)(Config)# interface 0/1-0/12
(c1)(Interface 0/1-0/12)# shutdown
(c1)(Interface 0/1-0/12)# exit
(c1)(Config)# exit
(c1)#
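  As with switch c2, you can optionally confirm that the LIFs homed on the e3a ports have failed over before moving the cables (an added check, not an original step):

cluster1::*> network interface show -vserver Cluster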
- Move the node cluster ports from the old switch c1 to the new switch sw1, using appropriate cabling supported by NVIDIA SN2100.
- Verify the final configuration of the cluster:
  network port show -ipspace Cluster
  Each port should display up for Link and healthy for Health Status.

cluster1::*> network port show -ipspace Cluster

Node: node1
                                                                       Ignore
                                                  Speed(Mbps)  Health  Health
Port      IPspace    Broadcast Domain Link  MTU   Admin/Oper   Status  Status
--------- ---------- ---------------- ---- ----- ------------ -------- ------
e3a       Cluster    Cluster          up    9000  auto/100000  healthy  false
e3b       Cluster    Cluster          up    9000  auto/100000  healthy  false

Node: node2
                                                                       Ignore
                                                  Speed(Mbps)  Health  Health
Port      IPspace    Broadcast Domain Link  MTU   Admin/Oper   Status  Status
--------- ---------- ---------------- ---- ----- ------------ -------- ------
e3a       Cluster    Cluster          up    9000  auto/100000  healthy  false
e3b       Cluster    Cluster          up    9000  auto/100000  healthy  false
- Verify that the cluster ports on each node are now connected to the cluster switches in the following way, from the nodes' perspective:
  network device-discovery show -protocol lldp

cluster1::*> network device-discovery show -protocol lldp
Node/       Local  Discovered
Protocol    Port   Device (LLDP: ChassisID)  Interface     Platform
----------- ------ ------------------------- ------------- ----------------
node1      /lldp
             e3a   sw1 (b8:ce:f6:19:1a:7e)   swp3          -
             e3b   sw2 (b8:ce:f6:19:1b:96)   swp3          -
node2      /lldp
             e3a   sw1 (b8:ce:f6:19:1a:7e)   swp4          -
             e3b   sw2 (b8:ce:f6:19:1b:96)   swp4          -
- On switches sw1 and sw2, verify that all node cluster ports are up:
  net show interface

cumulus@sw1:~$ net show interface

State  Name         Spd   MTU    Mode        LLDP              Summary
-----  -----------  ----  -----  ----------  ----------------  -----------------------
...
...
UP     swp3         100G  9216   Trunk/L2    e3a               Master: bridge(UP)
UP     swp4         100G  9216   Trunk/L2    e3a               Master: bridge(UP)
UP     swp15        100G  9216   BondMember  sw2 (swp15)       Master: cluster_isl(UP)
UP     swp16        100G  9216   BondMember  sw2 (swp16)       Master: cluster_isl(UP)

cumulus@sw2:~$ net show interface

State  Name         Spd   MTU    Mode        LLDP              Summary
-----  -----------  ----  -----  ----------  ----------------  -----------------------
...
...
UP     swp3         100G  9216   Trunk/L2    e3b               Master: bridge(UP)
UP     swp4         100G  9216   Trunk/L2    e3b               Master: bridge(UP)
UP     swp15        100G  9216   BondMember  sw1 (swp15)       Master: cluster_isl(UP)
UP     swp16        100G  9216   BondMember  sw1 (swp16)       Master: cluster_isl(UP)
- Verify that both nodes each have one connection to each switch:
  net show lldp
  The following example shows the appropriate results for both switches:

cumulus@sw1:~$ net show lldp

LocalPort  Speed  Mode        RemoteHost         RemotePort
---------  -----  ----------  -----------------  -----------
swp3       100G   Trunk/L2    node1              e3a
swp4       100G   Trunk/L2    node2              e3a
swp15      100G   BondMember  sw2                swp15
swp16      100G   BondMember  sw2                swp16

cumulus@sw2:~$ net show lldp

LocalPort  Speed  Mode        RemoteHost         RemotePort
---------  -----  ----------  -----------------  -----------
swp3       100G   Trunk/L2    node1              e3b
swp4       100G   Trunk/L2    node2              e3b
swp15      100G   BondMember  sw1                swp15
swp16      100G   BondMember  sw1                swp16
Step 3: Verify the configuration
- Enable auto-revert on the cluster LIFs:
  cluster1::*> network interface modify -vserver Cluster -lif * -auto-revert true
- Verify that all cluster network LIFs are back on their home ports:
  network interface show

cluster1::*> network interface show -vserver Cluster
            Logical     Status     Network            Current       Current Is
Vserver     Interface   Admin/Oper Address/Mask       Node          Port    Home
----------- ----------- ---------- ------------------ ------------- ------- ----
Cluster
            node1_clus1 up/up      169.254.209.69/16  node1         e3a     true
            node1_clus2 up/up      169.254.49.125/16  node1         e3b     true
            node2_clus1 up/up      169.254.47.194/16  node2         e3a     true
            node2_clus2 up/up      169.254.19.183/16  node2         e3b     true
- Change the privilege level back to admin:
  set -privilege admin
- If you suppressed automatic case creation, re-enable it by invoking an AutoSupport message:
  system node autosupport invoke -node * -type all -message MAINT=END
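  As a final optional check that is not part of the original procedure, you can display the AutoSupport configuration to confirm that it is active again:

cluster1::*> system node autosupport show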