Replacing a Cisco Nexus 9336C-FX2 cluster switch

Replacing a defective Nexus 9336C-FX2 switch in a cluster network is a nondisruptive procedure (NDU).

Before you begin

The following conditions must exist before performing the switch replacement in the current environment and on the replacement switch.

About this task

You must execute the command for migrating a cluster LIF from the node where the cluster LIF is hosted.

The examples in this procedure use the following switch and node nomenclature:

Note: The following procedure is based on the following cluster network topology:
cluster1::*> network port show -ipspace Cluster

Node: node1
                                                                       Ignore
                                                  Speed(Mbps) Health   Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status   Status
--------- ------------ ---------------- ---- ---- ----------- -------- ------
e0a       Cluster      Cluster          up   9000  auto/10000 healthy  false
e0b       Cluster      Cluster          up   9000  auto/10000 healthy  false

Node: node2
                                                                       Ignore
                                                  Speed(Mbps) Health   Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status   Status
--------- ------------ ---------------- ---- ---- ----------- -------- ------
e0a       Cluster      Cluster          up   9000  auto/10000 healthy  false
e0b       Cluster      Cluster          up   9000  auto/10000 healthy  false
4 entries were displayed.


 
cluster1::*> network interface show -vserver Cluster
            Logical    Status     Network            Current       Current Is
Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home
----------- ---------- ---------- ------------------ ------------- ------- ----
Cluster
            node1_clus1  up/up    169.254.209.69/16  node1         e0a     true
            node1_clus2  up/up    169.254.49.125/16  node1         e0b     true
            node2_clus1  up/up    169.254.47.194/16  node2         e0a     true
            node2_clus2  up/up    169.254.19.183/16  node2         e0b     true
4 entries were displayed.



cluster1::*> network device-discovery show -protocol cdp
Node/       Local  Discovered
Protocol    Port   Device (LLDP: ChassisID)  Interface         Platform
----------- ------ ------------------------- ----------------  ----------------
node2      /cdp
            e0a    cs1                       Eth1/2            N9K-C9336C
            e0b    cs2                       Eth1/2            N9K-C9336C
node1      /cdp
            e0a    cs1                       Eth1/1            N9K-C9336C
            e0b    cs2                       Eth1/1            N9K-C9336C
4 entries were displayed.



cs1# show cdp neighbors

Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater,
                  V - VoIP-Phone, D - Remotely-Managed-Device,
                  s - Supports-STP-Dispute

Device-ID          Local Intrfce  Hldtme Capability  Platform      Port ID     
node1              Eth1/1         144    H           FAS2980       e0a 
node2              Eth1/2         145    H           FAS2980       e0a           
cs2                Eth1/35        176    R S I s     N9K-C9336C    Eth1/35       
cs2(FDO220329V5)   Eth1/36        176    R S I s     N9K-C9336C    Eth1/36       

Total entries displayed: 4


cs2# show cdp neighbors 

Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater,
                  V - VoIP-Phone, D - Remotely-Managed-Device,
                  s - Supports-STP-Dispute

Device-ID          Local Intrfce  Hldtme Capability  Platform      Port ID      
node1              Eth1/1         139    H           FAS2980       e0b 
node2              Eth1/2         124    H           FAS2980       e0b          
cs1                Eth1/35        178    R S I s     N9K-C9336C    Eth1/35            
cs1                Eth1/36        178    R S I s     N9K-C9336C    Eth1/36              

Total entries displayed: 4

Procedure

  1. If AutoSupport is enabled on this cluster, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=xh
    where x is the duration of the maintenance window in hours.
    Note: The AutoSupport message notifies technical support of this maintenance task so that automatic case creation is suppressed during the maintenance window.
  2. Optional: Install the appropriate RCF and image on the switch, newcs2, and make any necessary site preparations.
    If necessary, verify, download, and install the appropriate versions of the RCF and NX-OS software for the new switch. If you have verified that the new switch is correctly set up and does not need updates to the RCF and NX-OS software, continue to step 2.
    1. Go to the NetApp Cluster and Management Network Switches Reference Configuration File Description Page on the NetApp Support Site.
    2. Click the link for the Cluster Network and Management Network Compatibility Matrix, and then note the required switch software version.
    3. Click your browser's back arrow to return to the Description page, click CONTINUE, accept the license agreement, and then go to the Download page.
    4. Follow the steps on the Download page to download the correct RCF and NX-OS files for the version of ONTAP software you are installing.
  3. On the new switch, log in as admin and shut down all of the ports that will be connected to the node cluster interfaces (ports 1/1 to 1/34).
    If the switch that you are replacing is not functional and is powered down, go to Step 4. The LIFs on the cluster nodes should have already failed over to the other cluster port for each node.
    newcs2# config
    Enter configuration commands, one per line. End with CNTL/Z.
    newcs2(config)# interface e1/1-34
    newcs2(config-if-range)# shutdown
    
  4. Verify that all cluster LIFs have auto-revert enabled: network interface show -vserver Cluster -fields auto-revert
    cluster1::> network interface show -vserver Cluster -fields auto-revert
    
                 Logical
    Vserver      Interface     Auto-revert
    ------------ ------------- -------------
    Cluster      node1_clus1   true
    Cluster      node1_clus2   true
    Cluster      node2_clus1   true
    Cluster      node2_clus2   true
    
    4 entries were displayed.
    
  5. Verify that all the cluster LIFs can communicate: cluster ping-cluster
    cluster1::*> cluster ping-cluster node1
    
    Host is node2
    Getting addresses from network interface table...
    Cluster node1_clus1 169.254.209.69 node1 e0a
    Cluster node1_clus2 169.254.49.125 node1 e0b
    Cluster node2_clus1 169.254.47.194 node2 e0a
    Cluster node2_clus2 169.254.19.183 node2 e0b
    Local = 169.254.47.194 169.254.19.183
    Remote = 169.254.209.69 169.254.49.125
    Cluster Vserver Id = 4294967293
    Ping status:
    ....
    Basic connectivity succeeds on 4 path(s)
    Basic connectivity fails on 0 path(s)
    ................
    Detected 9000 byte MTU on 4 path(s):
    Local 169.254.47.194 to Remote 169.254.209.69
    Local 169.254.47.194 to Remote 169.254.49.125
    Local 169.254.19.183 to Remote 169.254.209.69
    Local 169.254.19.183 to Remote 169.254.49.125
    Larger than PMTU communication succeeds on 4 path(s)
    RPC status:
    2 paths up, 0 paths down (tcp check)
    2 paths up, 0 paths down (udp check)
    
  6. Shut down the ISL ports 1/35 and 1/36 on the Nexus 9336C-FX2 switch cs1:
    cs1# configure 
    Enter configuration commands, one per line. End with CNTL/Z.
    cs1(config)# interface e1/35-36
    cs1(config-if-range)# shutdown
    cs1(config-if-range)#
    
  7. Remove all of the cables from the Nexus 9336C-FX2 cs2 switch, and then connect them to the same ports on the Nexus C9336C-FX2 newcs2 switch.
  8. Bring up the ISLs ports 1/35 and 1/36 between the cs1 and newcs2 switches, and then verify the port channel operation status.
    Port-Channel should indicate Po1(SU) and Member Ports should indicate Eth1/35(P) and Eth1/36(P).

    This example enables ISL ports 1/35 and 1/36 and displays the port channel summary on switch cs1:

    cs1# configure
    Enter configuration commands, one per line. End with CNTL/Z.
    cs1(config)# int e1/35-36
    cs1(config-if-range)# no shutdown
    
    cs1(config-if-range)# show port-channel summary
    Flags:  D - Down        P - Up in port-channel (members)
            I - Individual  H - Hot-standby (LACP only)
            s - Suspended   r - Module-removed
            b - BFD Session Wait
            S - Switched    R - Routed
            U - Up (port-channel)
            p - Up in delay-lacp mode (member)
            M - Not in use. Min-links not met
    --------------------------------------------------------------------------------
    Group Port-       Type     Protocol  Member       Ports
          Channel
    --------------------------------------------------------------------------------
    1     Po1(SU)     Eth      LACP      Eth1/35(P)   Eth1/36(P)
       
    cs1(config-if-range)# 
    
  9. Verify that port e0b is up on all nodes: network port show ipspace Cluster

    The output should be similar to the following:

    cluster1::*> network port show -ipspace Cluster
       
    Node: node1
                                                                            Ignore
                                                       Speed(Mbps) Health   Health
    Port      IPspace      Broadcast Domain Link MTU   Admin/Oper  Status   Status
    --------- ------------ ---------------- ---- ----- ----------- -------- -------
    e0a       Cluster      Cluster          up   9000  auto/10000  healthy  false
    e0b       Cluster      Cluster          up   9000  auto/10000  healthy  false
    
    Node: node2
                                                                            Ignore
                                                       Speed(Mbps) Health   Health
    Port      IPspace      Broadcast Domain Link MTU   Admin/Oper  Status   Status
    --------- ------------ ---------------- ---- ----- ----------- -------- -------
    e0a       Cluster      Cluster          up   9000  auto/10000  healthy  false
    e0b       Cluster      Cluster          up   9000  auto/auto   -        false
    
    4 entries were displayed.
  10. On the same node you used in the previous step, revert the cluster LIF associated with the port in the previous step by using the network interface revert command.
    In this example, LIF node1_clus2 on node1 is successfully reverted if the Home value is true and the port is e0b.

    The following commands return LIF node1_clus2 on node1 to home port e0a and displays information about the LIFs on both nodes. Bringing up the first node is successful if the Is Home column is true for both cluster interfaces and they show the correct port assignments, in this example e0a and e0b on node1.

    cluster1::*> network interface show -vserver Cluster
                                     
                Logical      Status     Network            Current    Current Is
    Vserver     Interface    Admin/Oper Address/Mask       Node       Port    Home
    ----------- ------------ ---------- ------------------ ---------- ------- -----
    Cluster
                node1_clus1  up/up      169.254.209.69/16  node1      e0a     true
                node1_clus2  up/up      169.254.49.125/16  node1      e0b     true
                node2_clus1  up/up      169.254.47.194/16  node2      e0a     true
                node2_clus2  up/up      169.254.19.183/16  node2      e0a     false
    
    4 entries were displayed.
    
  11. Display information about the nodes in a cluster: cluster show

    This example shows that the node health for node1 and node2 in this cluster is true:

    cluster1::*> cluster show
    
    Node          Health  Eligibility   
    ------------- ------- ------------  
    node1         false   true          
    node2         true    true           
    
  12. Verify that all physical cluster ports are up: network port show ipspace Cluster
    cluster1::*> network port show -ipspace Cluster
    
    Node node1                                                               Ignore
                                                        Speed(Mbps) Health   Health
    Port      IPspace     Broadcast Domain  Link  MTU   Admin/Oper  Status   Status
    --------- ----------- ----------------- ----- ----- ----------- -------- ------
    e0a       Cluster     Cluster           up    9000  auto/10000  healthy  false
    e0b       Cluster     Cluster           up    9000  auto/10000  healthy  false
    
    Node: node2
                                                                             Ignore
                                                        Speed(Mbps) Health   Health
    Port      IPspace      Broadcast Domain Link  MTU   Admin/Oper  Status   Status
    --------- ------------ ---------------- ----- ----- ----------- -------- ------
    e0a       Cluster      Cluster          up    9000  auto/10000  healthy  false
    e0b       Cluster      Cluster          up    9000  auto/10000  healthy  false
    
    4 entries were displayed.
    
  13. Verify that all the cluster LIFs can communicate: cluster ping-cluster
    cluster1::*> cluster ping-cluster -node node2
    Host is node2
    Getting addresses from network interface table...
    Cluster node1_clus1 169.254.209.69 node1 e0a
    Cluster node1_clus2 169.254.49.125 node1 e0b
    Cluster node2_clus1 169.254.47.194 node2 e0a
    Cluster node2_clus2 169.254.19.183 node2 e0b
    Local = 169.254.47.194 169.254.19.183
    Remote = 169.254.209.69 169.254.49.125
    Cluster Vserver Id = 4294967293
    Ping status:
    ....
    Basic connectivity succeeds on 4 path(s)
    Basic connectivity fails on 0 path(s)
    ................
    Detected 9000 byte MTU on 4 path(s):
    Local 169.254.47.194 to Remote 169.254.209.69
    Local 169.254.47.194 to Remote 169.254.49.125
    Local 169.254.19.183 to Remote 169.254.209.69
    Local 169.254.19.183 to Remote 169.254.49.125
    Larger than PMTU communication succeeds on 4 path(s)
    RPC status:
    2 paths up, 0 paths down (tcp check)
    2 paths up, 0 paths down (udp check)
  14. Confirm the following cluster network configuration: network port show
    cluster1::*> network port show -ipspace Cluster
    Node: node1
                                                                           Ignore
                                           Speed(Mbps)            Health   Health
    Port      IPspace     Broadcast Domain Link MTU   Admin/Oper  Status   Status
    --------- ----------- ---------------- ---- ----- ----------- -------- ------
    e0a       Cluster     Cluster          up   9000  auto/10000  healthy  false
    e0b       Cluster     Cluster          up   9000  auto/10000  healthy  false
    
    Node: node2
                                                                           Ignore
                                            Speed(Mbps)           Health   Health
    Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status   Status
    --------- ------------ ---------------- ---- ---- ----------- -------- ------
    e0a       Cluster      Cluster          up   9000 auto/10000  healthy  false
    e0b       Cluster      Cluster          up   9000 auto/10000  healthy  false
    
    4 entries were displayed.
    
    
    cluster1::*> network interface show -vserver Cluster
    
                Logical    Status     Network            Current       Current Is
    Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home
    ----------- ---------- ---------- ------------------ ------------- ------- ----
    Cluster
                node1_clus1  up/up    169.254.209.69/16  node1         e0a     true
                node1_clus2  up/up    169.254.49.125/16  node1         e0b     true
                node2_clus1  up/up    169.254.47.194/16  node2         e0a     true
                node2_clus2  up/up    169.254.19.183/16  node2         e0b     true
    
    4 entries were displayed.
    
    cluster1::> network device-discovery show -protocol cdp
    
    Node/       Local  Discovered
    Protocol    Port   Device (LLDP: ChassisID)  Interface         Platform
    ----------- ------ ------------------------- ----------------  ----------------
    node2      /cdp
                e0a    cs1                       0/2               N9K-C9336C
                e0b    newcs2                    0/2               N9K-C9336C
    node1      /cdp
                e0a    cs1                       0/1               N9K-C9336C 
                e0b    newcs2                    0/1               N9K-C9336C 
    
    4 entries were displayed.
    
    
    cs1# show cdp neighbors
    
    Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
                      S - Switch, H - Host, I - IGMP, r - Repeater,
                      V - VoIP-Phone, D - Remotely-Managed-Device,
                      s - Supports-STP-Dispute
    
    Device-ID            Local Intrfce  Hldtme Capability  Platform      Port ID     
    node1                Eth1/1         144    H           FAS2980       e0a 
    node2                Eth1/2         145    H           FAS2980       e0a           
    newcs2               Eth1/35        176    R S I s     N9K-C9336C    Eth1/35       
    newcs2               Eth1/36        176    R S I s     N9K-C9336C    Eth1/36       
    
    Total entries displayed: 4
    
    
    cs2# show cdp neighbors 
    
    Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
                      S - Switch, H - Host, I - IGMP, r - Repeater,
                      V - VoIP-Phone, D - Remotely-Managed-Device,
                      s - Supports-STP-Dispute
    
    Device-ID          Local Intrfce  Hldtme Capability  Platform      Port ID      
    node1              Eth1/1         139    H           FAS2980       e0b 
    node2              Eth1/2         124    H           FAS2980       e0b          
    cs1                Eth1/35        178    R S I s     N9K-C9336C    Eth1/35            
    cs1                Eth1/36        178    R S I s     N9K-C9336C    Eth1/36              
    
    Total entries displayed: 4
    
  15. For ONTAP 9.8 and later, enable the Ethernet switch health monitor log collection feature for collecting switch-related log files, using the commands: system switch ethernet log setup-password system switch ethernet log enable-collection
    cluster1::*> system switch ethernet log setup-password
    Enter the switch name: <return>
    The switch name entered is not recognized.                                     
    Choose from the following list: 
    cs1
    cs2
    
    cluster1::*> system switch ethernet log setup-password 
    
    Enter the switch name: cs1
    RSA key fingerprint is e5:8b:c6:dc:e2:18:18:09:36:63:d9:63:dd:03:d9:cc
    Do you want to continue? {y|n}::[n] y 
    
    Enter the password: <enter switch password>
    Enter the password again: <enter switch password>
    
    cluster1::*> system switch ethernet log setup-password 
    
    Enter the switch name: cs2 
    RSA key fingerprint is 57:49:86:a1:b9:80:6a:61:9a:86:8e:3c:e3:b7:1f:b1
    Do you want to continue? {y|n}:: [n] y
    
    Enter the password: <enter switch password>
    Enter the password again: <enter switch password>
    
    cluster1::*> system  switch ethernet log enable-collection 
    
    Do you want to enable cluster log collection for all nodes in the cluster?
    {y|n}: [n] y 
    
    Enabling cluster switch log collection.                                       
    
    cluster1::*>
    
    Note: If any of these commands return an error, contact NetApp support.
  16. For ONTAP releases 9.5P16, 9.6P12, and 9.7P10 and later patch releases, enable the Ethernet switch health monitor log collection feature for collecting switch-related log files, using the commands: system cluster-switch log setup-password system cluster-switch log enable-collection
    cluster1::*> system cluster-switch log setup-password
    Enter the switch name: <return>
    The switch name entered is not recognized.                                     
    Choose from the following list: 
    cs1
    cs2
    
    cluster1::*> system cluster-switch log setup-password 
    
    Enter the switch name: cs1
    RSA key fingerprint is e5:8b:c6:dc:e2:18:18:09:36:63:d9:63:dd:03:d9:cc
    Do you want to continue? {y|n}::[n] y 
    
    Enter the password: <enter switch password>
    Enter the password again: <enter switch password>
    
    cluster1::*> system cluster-switch log setup-password 
    
    Enter the switch name: cs2 
    RSA key fingerprint is 57:49:86:a1:b9:80:6a:61:9a:86:8e:3c:e3:b7:1f:b1
    Do you want to continue? {y|n}:: [n] y
    
    Enter the password: <enter switch password>
    Enter the password again: <enter switch password>
    
    cluster1::*> system cluster-switch log enable-collection 
    
    Do you want to enable cluster log collection for all nodes in the cluster?
    {y|n}: [n] y 
    
    Enabling cluster switch log collection.                                       
    
    cluster1::*>
    
    Note: If any of these commands return an error, contact NetApp support.
  17. If you suppressed automatic case creation, re-enable it by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=END