Migrate to a two-node switched cluster with NVIDIA SN2100 cluster switches
If you have an existing two-node switchless cluster environment, you can migrate to a two-node switched cluster environment using NVIDIA SN2100 switches to enable you to scale beyond two nodes in the cluster.
The procedure you use depends on whether you have two dedicated cluster-network ports on each controller or a single cluster port on each controller. The process documented works for all nodes using optical or Twinax ports but is not supported on this switch if nodes are using onboard 10GBASE-T RJ45 ports for the cluster-network ports.
Review requirements
Ensure that:
-
The two-node switchless configuration is properly set up and functioning.
-
The nodes are running ONTAP 9.10.1P3 or later.
-
All cluster ports are in the up state.
-
All cluster logical interfaces (LIFs) are in the up state and on their home ports.
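For example, a minimal pre-check from the cluster shell (a sketch, assuming the e3a and e3b cluster ports used later in this procedure) is to confirm the cluster port and cluster LIF state:
cluster1::> network port show -ipspace Cluster
cluster1::> network interface show -vserver Cluster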
Ensure that:
-
Both switches have management network connectivity.
-
There is console access to the cluster switches.
-
Node-to-switch and switch-to-switch connections for the NVIDIA SN2100 switches use Twinax or fiber cables.
See Review cabling and configuration considerations for caveats and further details. The Hardware Universe - Switches also contains more information about cabling.
-
Inter-Switch Link (ISL) cables are connected to ports swp15 and swp16 on both NVIDIA SN2100 switches.
-
Initial customization of both SN2100 switches is complete, so that:
-
SN2100 switches are running the latest version of Cumulus Linux
-
Reference Configuration Files (RCFs) are applied to the switches
-
Any site customization, such as SMTP, SNMP, and SSH, is configured on the new switches.
The Hardware Universe contains the latest information about the actual cluster ports for your platforms.
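As a quick sanity check (a sketch; the exact command depends on the Cumulus Linux release and command set on your switches), you can confirm the running Cumulus Linux version on each switch before cabling the nodes:
# NCLU (net) command set:
cumulus@sw1:~$ net show version
# NVUE (nv) command set:
cumulus@sw1:~$ nv show system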
Migrate the switches
The examples in this procedure use the following cluster switch and node nomenclature:
-
The names of the SN2100 switches are sw1 and sw2.
-
The names of the nodes are node1 and node2.
-
The names of the LIFs are node1_clus1 and node1_clus2 on node1, and node2_clus1 and node2_clus2 on node2, respectively.
-
The cluster1::*> prompt indicates the name of the cluster.
-
The cluster ports used in this procedure are e3a and e3b.
-
Breakout ports take the format: swp[port]s[breakout port 0-3]. For example, four breakout ports on swp1 are swp1s0, swp1s1, swp1s2, and swp1s3.
Step 1: Prepare for migration
-
If AutoSupport is enabled on this cluster, suppress automatic case creation by invoking an AutoSupport message:
system node autosupport invoke -node * -type all -message MAINT=xh
where x is the duration of the maintenance window in hours.
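For example, assuming a two-hour maintenance window:
cluster1::> system node autosupport invoke -node * -type all -message MAINT=2h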
-
Change the privilege level to advanced, entering y when prompted to continue:
set -privilege advanced
The advanced prompt (*>) appears.
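A minimal example of the prompt change (the warning text can vary by ONTAP release):
cluster1::> set -privilege advanced
Warning: These advanced commands are potentially dangerous; use them only when
         directed to do so by NetApp personnel.
Do you want to continue? {y|n}: y
cluster1::*>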
Step 2: Configure ports and cabling
-
Disable all node-facing ports (not ISL ports) on both new cluster switches sw1 and sw2. The switch steps in this procedure appear twice, once with NCLU (net) commands and once with NVUE (nv) commands; perform only the version that matches the command set supported by the Cumulus Linux release running on your switches.
You must not disable the ISL ports.
The following commands disable the node-facing ports on switches sw1 and sw2:
cumulus@sw1:~$ net add interface swp1s0-3, swp2s0-3, swp3-14 link down
cumulus@sw1:~$ net pending
cumulus@sw1:~$ net commit
cumulus@sw2:~$ net add interface swp1s0-3, swp2s0-3, swp3-14 link down
cumulus@sw2:~$ net pending
cumulus@sw2:~$ net commit
-
Verify that the ISL and the physical ports on the ISL between the two SN2100 switches sw1 and sw2 are up on ports swp15 and swp16:
net show interface
The following commands show that the ISL ports are up on switches sw1 and sw2:
cumulus@sw1:~$ net show interface

State  Name       Spd   MTU    Mode        LLDP         Summary
-----  ---------  ----  -----  ----------  -----------  -----------------------
...
...
UP     swp15      100G  9216   BondMember  sw2 (swp15)  Master: cluster_isl(UP)
UP     swp16      100G  9216   BondMember  sw2 (swp16)  Master: cluster_isl(UP)

cumulus@sw2:~$ net show interface

State  Name       Spd   MTU    Mode        LLDP         Summary
-----  ---------  ----  -----  ----------  -----------  -----------------------
...
...
UP     swp15      100G  9216   BondMember  sw1 (swp15)  Master: cluster_isl(UP)
UP     swp16      100G  9216   BondMember  sw1 (swp16)  Master: cluster_isl(UP)
-
Disable all node-facing ports (not ISL ports) on both new cluster switches sw1 and sw2.
You must not disable the ISL ports.
The following commands disable the node-facing ports on switches sw1 and sw2:
cumulus@sw1:~$ nv set interface swp1s0-3,swp2s0-3,swp3-14 link state down
cumulus@sw1:~$ nv config apply
cumulus@sw1:~$ nv config save
cumulus@sw2:~$ nv set interface swp1s0-3,swp2s0-3,swp3-14 link state down
cumulus@sw2:~$ nv config apply
cumulus@sw2:~$ nv config save
-
Verify that the ISL and the physical ports on the ISL between the two SN2100 switches sw1 and sw2 are up on ports swp15 and swp16:
nv show interface
The following examples show that the ISL ports are up on switches sw1 and sw2:
cumulus@sw1:~$ nv show interface

Interface  MTU    Speed  State  Remote Host  Remote Port                          Type  Summary
---------  -----  -----  -----  -----------  -----------------------------------  ----  -------
...
...
+ swp14    9216          down                                                     swp
+ swp15    9216   100G   up     ossg-rcf1    Intra-Cluster Switch ISL Port swp15  swp
+ swp16    9216   100G   up     ossg-rcf2    Intra-Cluster Switch ISL Port swp16  swp

cumulus@sw2:~$ nv show interface

Interface  MTU    Speed  State  Remote Host  Remote Port                          Type  Summary
---------  -----  -----  -----  -----------  -----------------------------------  ----  -------
...
...
+ swp14    9216          down                                                     swp
+ swp15    9216   100G   up     ossg-rcf1    Intra-Cluster Switch ISL Port swp15  swp
+ swp16    9216   100G   up     ossg-rcf2    Intra-Cluster Switch ISL Port swp16  swp
-
Verify that all cluster ports are up:
network port show
Each port should display up for Link and healthy for Health Status.
Show example
cluster1::*> network port show

Node: node1

                                                                       Ignore
                                                  Speed(Mbps)  Health  Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper   Status  Status
--------- ------------ ---------------- ---- ---- ------------ ------- ------
e3a       Cluster      Cluster          up   9000 auto/100000  healthy false
e3b       Cluster      Cluster          up   9000 auto/100000  healthy false

Node: node2

                                                                       Ignore
                                                  Speed(Mbps)  Health  Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper   Status  Status
--------- ------------ ---------------- ---- ---- ------------ ------- ------
e3a       Cluster      Cluster          up   9000 auto/100000  healthy false
e3b       Cluster      Cluster          up   9000 auto/100000  healthy false
-
Verify that all cluster LIFs are up and operational:
network interface show
Each cluster LIF should display true for Is Home and have a Status Admin/Oper of up/up.
Show example
cluster1::*> network interface show -vserver Cluster

            Logical     Status     Network            Current       Current Is
Vserver     Interface   Admin/Oper Address/Mask       Node          Port    Home
----------- ----------  ---------- ------------------ ------------- ------- -----
Cluster
            node1_clus1 up/up      169.254.209.69/16  node1         e3a     true
            node1_clus2 up/up      169.254.49.125/16  node1         e3b     true
            node2_clus1 up/up      169.254.47.194/16  node2         e3a     true
            node2_clus2 up/up      169.254.19.183/16  node2         e3b     true
-
Disable auto-revert on the cluster LIFs:
network interface modify -vserver Cluster -lif * -auto-revert false
Show example
cluster1::*> network interface modify -vserver Cluster -lif * -auto-revert false

          Logical
Vserver   Interface     Auto-revert
--------- ------------- ------------
Cluster
          node1_clus1   false
          node1_clus2   false
          node2_clus1   false
          node2_clus2   false
-
Disconnect the cable from cluster port e3a on node1, and then connect e3a to port 3 on cluster switch sw1, using the appropriate cabling supported by the SN2100 switches.
The Hardware Universe - Switches contains more information about cabling.
-
Disconnect the cable from cluster port e3a on node2, and then connect e3a to port 4 on cluster switch sw1, using the appropriate cabling supported by the SN2100 switches.
-
On switch sw1, verify that all ports are up:
net show interface all
cumulus@sw1:~$ net show interface all

State  Name      Spd   MTU    Mode        LLDP             Summary
-----  --------- ----  -----  ----------  ---------------  --------
...
DN     swp1s0    10G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp1s1    10G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp1s2    10G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp1s3    10G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp2s0    25G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp2s1    25G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp2s2    25G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp2s3    25G   9216   Trunk/L2                     Master: br_default(UP)
UP     swp3      100G  9216   Trunk/L2    node1 (e3a)      Master: br_default(UP)
UP     swp4      100G  9216   Trunk/L2    node2 (e3a)      Master: br_default(UP)
...
...
UP     swp15     100G  9216   BondMember  swp15            Master: cluster_isl(UP)
UP     swp16     100G  9216   BondMember  swp16            Master: cluster_isl(UP)
...
-
On switch sw1, verify that all ports are up:
nv show interface
cumulus@sw1:~$ nv show interface

Interface  State  Speed  MTU    Type  Remote Host     Remote Port  Summary
---------  -----  -----  -----  ----  --------------  -----------  ----------
...
...
swp1s0     up     10G    9216   swp   odq-a300-1a     e0a
swp1s1     up     10G    9216   swp   odq-a300-1b     e0a
swp1s2     down   10G    9216   swp
swp1s3     down   10G    9216   swp
swp2s0     down   25G    9216   swp
swp2s1     down   25G    9216   swp
swp2s2     down   25G    9216   swp
swp2s3     down   25G    9216   swp
swp3       down          9216   swp
swp4       down          9216   swp
...
...
swp14      down          9216   swp
swp15      up     100G   9216   swp   ossg-int-rcf10  swp15
swp16      up     100G   9216   swp   ossg-int-rcf10  swp16
-
Verify that all cluster ports are up:
network port show -ipspace Cluster
Show example
The following example shows that all of the cluster ports are up on node1 and node2:
cluster1::*> network port show -ipspace Cluster

Node: node1

                                                                       Ignore
                                                  Speed(Mbps)  Health  Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper   Status  Status
--------- ------------ ---------------- ---- ---- ------------ ------- ------
e3a       Cluster      Cluster          up   9000 auto/100000  healthy false
e3b       Cluster      Cluster          up   9000 auto/100000  healthy false

Node: node2

                                                                       Ignore
                                                  Speed(Mbps)  Health  Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper   Status  Status
--------- ------------ ---------------- ---- ---- ------------ ------- ------
e3a       Cluster      Cluster          up   9000 auto/100000  healthy false
e3b       Cluster      Cluster          up   9000 auto/100000  healthy false
-
Display information about the status of the nodes in the cluster:
cluster show
Show example
The following example displays information about the health and eligibility of the nodes in the cluster:
cluster1::*> cluster show

Node                 Health  Eligibility   Epsilon
-------------------- ------- ------------  ------------
node1                true    true          false
node2                true    true          false
-
Disconnect the cable from cluster port e3b on node1, and then connect e3b to port 3 on cluster switch sw2, using the appropriate cabling supported by the SN2100 switches.
-
Disconnect the cable from cluster port e3b on node2, and then connect e3b to port 4 on cluster switch sw2, using the appropriate cabling supported by the SN2100 switches.
-
On switch sw2, enable all node-facing ports.
The following commands enable the node-facing ports on switch sw2:
cumulus@sw2:~$ net del interface swp1s0-3, swp2s0-3, swp3-14 link down
cumulus@sw2:~$ net pending
cumulus@sw2:~$ net commit
-
On switch sw2, verify that all ports are up:
net show interface all
cumulus@sw2:~$ net show interface all

State  Name      Spd   MTU    Mode        LLDP             Summary
-----  --------- ----  -----  ----------  ---------------  --------
...
DN     swp1s0    10G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp1s1    10G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp1s2    10G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp1s3    10G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp2s0    25G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp2s1    25G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp2s2    25G   9216   Trunk/L2                     Master: br_default(UP)
DN     swp2s3    25G   9216   Trunk/L2                     Master: br_default(UP)
UP     swp3      100G  9216   Trunk/L2    node1 (e3b)      Master: br_default(UP)
UP     swp4      100G  9216   Trunk/L2    node2 (e3b)      Master: br_default(UP)
...
...
UP     swp15     100G  9216   BondMember  swp15            Master: cluster_isl(UP)
UP     swp16     100G  9216   BondMember  swp16            Master: cluster_isl(UP)
...
-
On both switches sw1 and sw2, verify that both nodes each have one connection to each switch:
net show lldp
The following example shows the appropriate results for both switches sw1 and sw2:

cumulus@sw1:~$ net show lldp

LocalPort  Speed  Mode        RemoteHost         RemotePort
---------  -----  ----------  -----------------  -----------
swp3       100G   Trunk/L2    node1              e3a
swp4       100G   Trunk/L2    node2              e3a
swp15      100G   BondMember  sw2                swp15
swp16      100G   BondMember  sw2                swp16

cumulus@sw2:~$ net show lldp

LocalPort  Speed  Mode        RemoteHost         RemotePort
---------  -----  ----------  -----------------  -----------
swp3       100G   Trunk/L2    node1              e3b
swp4       100G   Trunk/L2    node2              e3b
swp15      100G   BondMember  sw1                swp15
swp16      100G   BondMember  sw1                swp16
-
On switch sw2, enable all node-facing ports.
The following commands enable the node-facing ports on switch sw2:
cumulus@sw2:~$ nv set interface swp1s0-3,swp2s0-3,swp3-14 link state up
cumulus@sw2:~$ nv config apply
cumulus@sw2:~$ nv config save
-
On switch sw2, verify that all ports are up:
nv show interface
cumulus@sw2:~$ nv show interface

Interface  State  Speed  MTU    Type  Remote Host     Remote Port  Summary
---------  -----  -----  -----  ----  --------------  -----------  ----------
...
...
swp1s0     up     10G    9216   swp   odq-a300-1a     e0a
swp1s1     up     10G    9216   swp   odq-a300-1b     e0a
swp1s2     down   10G    9216   swp
swp1s3     down   10G    9216   swp
swp2s0     down   25G    9216   swp
swp2s1     down   25G    9216   swp
swp2s2     down   25G    9216   swp
swp2s3     down   25G    9216   swp
swp3       down          9216   swp
swp4       down          9216   swp
...
...
swp14      down          9216   swp
swp15      up     100G   9216   swp   ossg-int-rcf10  swp15
swp16      up     100G   9216   swp   ossg-int-rcf10  swp16
-
On both switches sw1 and sw2, verify that both nodes each have one connection to each switch:
nv show interface --view=lldp
The following examples show the appropriate results for both switches sw1 and sw2:
cumulus@sw1:~$ nv show interface --view=lldp

Interface  Speed  Type  Remote Host     Remote Port
---------  -----  ----  --------------  -----------
...
...
swp1s0     10G    swp   odq-a300-1a     e0a
swp1s1     10G    swp   odq-a300-1b     e0a
swp1s2     10G    swp
swp1s3     10G    swp
swp2s0     25G    swp
swp2s1     25G    swp
swp2s2     25G    swp
swp2s3     25G    swp
swp3              swp
swp4              swp
...
...
swp14             swp
swp15      100G   swp   ossg-int-rcf10  swp15
swp16      100G   swp   ossg-int-rcf10  swp16

cumulus@sw2:~$ nv show interface --view=lldp

Interface  Speed  Type  Remote Host     Remote Port
---------  -----  ----  --------------  -----------
...
...
swp1s0     10G    swp   odq-a300-1a     e0a
swp1s1     10G    swp   odq-a300-1b     e0a
swp1s2     10G    swp
swp1s3     10G    swp
swp2s0     25G    swp
swp2s1     25G    swp
swp2s2     25G    swp
swp2s3     25G    swp
swp3              swp
swp4              swp
...
...
swp14             swp
swp15      100G   swp   ossg-int-rcf10  swp15
swp16      100G   swp   ossg-int-rcf10  swp16
-
Display information about the discovered network devices in your cluster:
network device-discovery show -protocol lldp
Show example
cluster1::*> network device-discovery show -protocol lldp

Node/       Local  Discovered
Protocol    Port   Device (LLDP: ChassisID)  Interface     Platform
----------- ------ ------------------------- ------------  ----------------
node1      /lldp
            e3a    sw1 (b8:ce:f6:19:1a:7e)   swp3          -
            e3b    sw2 (b8:ce:f6:19:1b:96)   swp3          -
node2      /lldp
            e3a    sw1 (b8:ce:f6:19:1a:7e)   swp4          -
            e3b    sw2 (b8:ce:f6:19:1b:96)   swp4          -
-
Verify that all cluster ports are up:
network port show -ipspace Cluster
Show example
The following example shows that all of the cluster ports are up on node1 and node2:
cluster1::*> network port show -ipspace Cluster

Node: node1

                                                                      Ignore
                                                  Speed(Mbps) Health  Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status  Status
--------- ------------ ---------------- ---- ---- ----------- ------- ------
e3a       Cluster      Cluster          up   9000 auto/10000  healthy false
e3b       Cluster      Cluster          up   9000 auto/10000  healthy false

Node: node2

                                                                      Ignore
                                                  Speed(Mbps) Health  Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status  Status
--------- ------------ ---------------- ---- ---- ----------- ------- ------
e3a       Cluster      Cluster          up   9000 auto/10000  healthy false
e3b       Cluster      Cluster          up   9000 auto/10000  healthy false
Step 3: Verify the configuration
-
Enable auto-revert on all cluster LIFs:
net interface modify -vserver Cluster -lif * -auto-revert true
Show example
cluster1::*> net interface modify -vserver Cluster -lif * -auto-revert true

          Logical
Vserver   Interface     Auto-revert
--------- ------------- ------------
Cluster
          node1_clus1   true
          node1_clus2   true
          node2_clus1   true
          node2_clus2   true
-
Verify that all interfaces display true for Is Home:
net interface show -vserver Cluster
This might take a minute to complete.
Show example
The following example shows that all LIFs are up on node1 and node2 and that Is Home results are true:

cluster1::*> net interface show -vserver Cluster

          Logical      Status     Network            Current    Current Is
Vserver   Interface    Admin/Oper Address/Mask       Node       Port    Home
--------- ------------ ---------- ------------------ ---------- ------- ----
Cluster
          node1_clus1  up/up      169.254.209.69/16  node1      e3a     true
          node1_clus2  up/up      169.254.49.125/16  node1      e3b     true
          node2_clus1  up/up      169.254.47.194/16  node2      e3a     true
          node2_clus2  up/up      169.254.19.183/16  node2      e3b     true
-
Verify that the switchless-cluster option is disabled:
network options switchless-cluster show
Show example
The false output in the following example shows that the switchless-cluster option is disabled:
cluster1::*> network options switchless-cluster show

Enable Switchless Cluster: false
-
Verify the status of the node members in the cluster:
cluster show
Show example
The following example shows information about the health and eligibility of the nodes in the cluster:
cluster1::*> cluster show

Node                 Health  Eligibility   Epsilon
-------------------- ------- ------------  --------
node1                true    true          false
node2                true    true          false
-
Verify the connectivity of the remote cluster interfaces:
You can use the network interface check cluster-connectivity command to start an accessibility check for cluster connectivity and then display the details:
network interface check cluster-connectivity start
network interface check cluster-connectivity show
cluster1::*> network interface check cluster-connectivity start
NOTE: Wait a number of seconds before running the show command to display the details.
cluster1::*> network interface check cluster-connectivity show

                                  Source           Destination      Packet
Node   Date                       LIF              LIF              Loss
------ -------------------------- ---------------- ---------------- -----------
node1
       3/5/2022 19:21:18 -06:00   node1_clus2      node2-clus1      none
       3/5/2022 19:21:20 -06:00   node1_clus2      node2_clus2      none
node2
       3/5/2022 19:21:18 -06:00   node2_clus2      node1_clus1      none
       3/5/2022 19:21:20 -06:00   node2_clus2      node1_clus2      none
For all ONTAP releases, you can also use the cluster ping-cluster command to check connectivity:
cluster ping-cluster -node <name>
cluster1::*> cluster ping-cluster -node local
Host is node1
Getting addresses from network interface table...
Cluster node1_clus1 169.254.209.69 node1     e3a
Cluster node1_clus2 169.254.49.125 node1     e3b
Cluster node2_clus1 169.254.47.194 node2     e3a
Cluster node2_clus2 169.254.19.183 node2     e3b
Local = 169.254.47.194 169.254.19.183
Remote = 169.254.209.69 169.254.49.125
Cluster Vserver Id = 4294967293
Ping status:
Basic connectivity succeeds on 4 path(s)
Basic connectivity fails on 0 path(s)
Detected 9000 byte MTU on 4 path(s):
    Local 169.254.47.194 to Remote 169.254.209.69
    Local 169.254.47.194 to Remote 169.254.49.125
    Local 169.254.19.183 to Remote 169.254.209.69
    Local 169.254.19.183 to Remote 169.254.49.125
Larger than PMTU communication succeeds on 4 path(s)
RPC status:
2 paths up, 0 paths down (tcp check)
2 paths up, 0 paths down (udp check)