Skip to main content
本繁體中文版使用機器翻譯,譯文僅供參考,若與英文版本牴觸,應以英文版本為準。

安裝或升級參考設定檔 (RCF) 腳本

貢獻者 netapp-yvonneo netapp-jolieg

請依照以下步驟安裝或升級 RCF 腳本。

開始之前

在安裝或升級 RCF 腳本之前,請確保交換器上具備以下條件:

  • Cumulus Linux 已安裝。參見 "Hardware Universe"適用於支援的版本。

  • IP 位址、子網路遮罩和預設閘道透過 DHCP 定義或手動設定。

註 除了管理員使用者之外,您還必須在 RCF 中指定專門用於日誌收集的使用者。
客戶配置

可用的參考配置類別如下:

在配置為 4x10GbE 分支的連接埠上,一個連接埠配置為 4x25GbE 分支,其他連接埠配置為 40/100GbE。對於使用共享叢集/HA 連接埠的節點,支援連接埠上的共用叢集/HA 流量。請參閱知識庫文章中的平台表。 "哪些AFF、 ASA和FAS平台使用共用叢集和 HA 乙太網路連接埠?" 。所有連接埠也可以用作專用集群連接埠。

儲存

所有連接埠均配置為 100GbE NVMe 儲存連線。

目前 RCF 腳本版本

叢集和儲存應用可以使用兩種 RCF 腳本。從 "NVIDIA SN2100 軟體下載"頁。每種情況的處理步驟都相同。

  • 叢集:MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP

  • 儲存:MSN2100-RCF-v1.x-儲存

關於範例

以下範例步驟顯示如何下載和套用叢集交換器的 RCF 腳本。

範例指令輸出使用交換器管理 IP 位址 10.233.204.71,子網路遮罩 255.255.254.0 和預設閘道 10.233.204.1。

範例 1. 步驟
Cumulus Linux 4.4.3
  1. 將集群交換器連接到管理網路。

  2. 使用 `ping`用於驗證與託管 Cumulus Linux 和 RCF 的伺服器的連接性的命令。

  3. 顯示每個節點上連接到叢集交換器的叢集連接埠:

    network device-discovery show

  4. 檢查每個叢集連接埠的管理和運作狀態。

    1. 確認叢集所有連接埠均已啟動且狀態正常:

      network port show -role cluster

    2. 確認所有叢集介面(LIF)都位於主連接埠上:

      network interface show -role cluster

    3. 確認集群顯示兩個集群交換器的資訊:

      system cluster-switch show -is-monitoring-enabled-operational true

  5. 停用群集 LIF 的自動回滾功能。叢集 LIF 會故障轉移到夥伴叢集交換機,並在您對目標交換器執行升級程序時保留在該交換器上:

    network interface modify -vserver Cluster -lif * -auto-revert false

  • 如果您要升級 RCF,則必須在此步驟中停用自動回滾功能。

  • 如果您剛剛升級了 Cumulus Linux 版本,則無需在此步驟中停用自動還原功能,因為它已經停用。

  1. 顯示SN2100交換器上的可用介面:

    admin@sw1:mgmt:~$ net show interface all
    
    State  Name   Spd  MTU    Mode         LLDP                Summary
    -----  -----  ---  -----  -----------  ------------------  --------------
    ...
    ...
    ADMDN  swp1   N/A  9216   NotConfigured
    ADMDN  swp2   N/A  9216   NotConfigured
    ADMDN  swp3   N/A  9216   NotConfigured
    ADMDN  swp4   N/A  9216   NotConfigured
    ADMDN  swp5   N/A  9216   NotConfigured
    ADMDN  swp6   N/A  9216   NotConfigured
    ADMDN  swp7   N/A  9216   NotConfigured
    ADMDN  swp8   N/A  9216   NotConfigured
    ADMDN  swp9   N/A  9216   NotConfigured
    ADMDN  swp10  N/A  9216   NotConfigured
    ADMDN  swp11  N/A  9216   NotConfigured
    ADMDN  swp12  N/A  9216   NotConfigured
    ADMDN  swp13  N/A  9216   NotConfigured
    ADMDN  swp14  N/A  9216   NotConfigured
    ADMDN  swp15  N/A  9216   NotConfigured
    ADMDN  swp16  N/A  9216   NotConfigured
  2. 將 RCF Python 腳本複製到交換器。

    cumulus@cumulus:mgmt:~$ cd /tmp
    cumulus@cumulus:mgmt:/tmp$ scp <user>@<host:/<path>/MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP .
    ssologin@10.233.204.71's password:
    MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP         100% 8607   111.2KB/s         00:00
    註 儘管 `scp`如果範例中使用的是這種方式,您可以使用您喜歡的檔案傳輸方式,例如 SFTP、HTTPS 或 FTP。
  3. 應用 RCF python 腳本 MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP

    cumulus@cumulus:mgmt:/tmp$ sudo python3 MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP
    [sudo] password for cumulus:
    ...
    Step 1: Creating the banner file
    Step 2: Registering banner message
    Step 3: Updating the MOTD file
    Step 4: Ensuring passwordless use of cl-support command by admin
    Step 5: Disabling apt-get
    Step 6: Creating the interfaces
    Step 7: Adding the interface config
    Step 8: Disabling cdp
    Step 9: Adding the lldp config
    Step 10: Adding the RoCE base config
    Step 11: Modifying RoCE Config
    Step 12: Configure SNMP
    Step 13: Reboot the switch

    RCF 腳本完成了上面範例中列出的步驟。

    註 在上述步驟 3「更新 MOTD 檔案」中,該指令 `cat /etc/motd`正在運行。這樣,您就可以驗證 RCF 檔案名稱、RCF 版本、要使用的連接埠以及 RCF 橫幅中的其他重要資訊。
    註 如果遇到任何無法解決的 RCF Python 腳本問題,請聯絡我們。 "NetApp支援"尋求幫助。
  4. 將先前對交換器配置所做的任何自訂設定重新套用。請參閱"審查佈線和配置注意事項"有關任何後續變更的詳細資訊。

  5. 重啟後驗證配置:

    admin@sw1:mgmt:~$ net show interface all
    
    State  Name      Spd   MTU    Mode       LLDP              Summary
    -----  --------- ----  -----  ---------- ----------------- --------
    ...
    ...
    DN     swp1s0    N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp1s1    N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp1s2    N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp1s3    N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp2s0    N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp2s1    N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp2s2    N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp2s3    N/A   9216   Trunk/L2                     Master: bridge(UP)
    UP     swp3      100G  9216   Trunk/L2                     Master: bridge(UP)
    UP     swp4      100G  9216   Trunk/L2                     Master: bridge(UP)
    DN     swp5      N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp6      N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp7      N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp8      N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp9      N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp10     N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp11     N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp12     N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp13     N/A   9216   Trunk/L2                     Master: bridge(UP)
    DN     swp14     N/A   9216   Trunk/L2                     Master: bridge(UP)
    UP     swp15     N/A   9216   BondMember                   Master: bond_15_16(UP)
    UP     swp16     N/A   9216   BondMember                   Master: bond_15_16(UP)
    ...
    ...
    
    admin@sw1:mgmt:~$ net show roce config
    RoCE mode.......... lossless
    Congestion Control:
      Enabled SPs.... 0 2 5
      Mode........... ECN
      Min Threshold.. 150 KB
      Max Threshold.. 1500 KB
    PFC:
      Status......... enabled
      Enabled SPs.... 2 5
      Interfaces......... swp10-16,swp1s0-3,swp2s0-3,swp3-9
    
    DSCP                     802.1p  switch-priority
    -----------------------  ------  ---------------
    0 1 2 3 4 5 6 7               0                0
    8 9 10 11 12 13 14 15         1                1
    16 17 18 19 20 21 22 23       2                2
    24 25 26 27 28 29 30 31       3                3
    32 33 34 35 36 37 38 39       4                4
    40 41 42 43 44 45 46 47       5                5
    48 49 50 51 52 53 54 55       6                6
    56 57 58 59 60 61 62 63       7                7
    
    switch-priority  TC  ETS
    ---------------  --  --------
    0 1 3 4 6 7       0  DWRR 28%
    2                 2  DWRR 28%
    5                 5  DWRR 43%
  6. 請核對介面中收發器的資訊:

    admin@sw1:mgmt:~$ net show interface pluggables
    Interface  Identifier     Vendor Name  Vendor PN        Vendor SN       Vendor Rev
    ---------  -------------  -----------  ---------------  --------------  ----------
    swp3       0x11 (QSFP28)  Amphenol     112-00574        APF20379253516  B0
    swp4       0x11 (QSFP28)  AVAGO        332-00440        AF1815GU05Z     A0
    swp15      0x11 (QSFP28)  Amphenol     112-00573        APF21109348001  B0
    swp16      0x11 (QSFP28)  Amphenol     112-00573        APF21109347895  B0
  7. 確認每個節點都與每個交換器有連接:

    admin@sw1:mgmt:~$ net show lldp
    
    LocalPort  Speed  Mode        RemoteHost              RemotePort
    ---------  -----  ----------  ----------------------  -----------
    swp3       100G   Trunk/L2    sw1                     e3a
    swp4       100G   Trunk/L2    sw2                     e3b
    swp15      100G   BondMember  sw13                    swp15
    swp16      100G   BondMember  sw14                    swp16
  8. 檢查叢集上叢集連接埠的運作狀況。

    1. 確認叢集中所有節點的叢集連接埠均已啟動且運作狀況良好:

      cluster1::*> network port show -role cluster
      
      Node: node1
                                                                             Ignore
                                                        Speed(Mbps) Health   Health
      Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status   Status
      --------- ------------ ---------------- ---- ---- ----------- -------- ------
      e3a       Cluster      Cluster          up   9000  auto/10000 healthy  false
      e3b       Cluster      Cluster          up   9000  auto/10000 healthy  false
      
      Node: node2
                                                                             Ignore
                                                        Speed(Mbps) Health   Health
      Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status   Status
      --------- ------------ ---------------- ---- ---- ----------- -------- ------
      e3a       Cluster      Cluster          up   9000  auto/10000 healthy  false
      e3b       Cluster      Cluster          up   9000  auto/10000 healthy  false
    2. 從叢集驗證交換器的健康狀況(這可能不會顯示交換器 sw2,因為 LIF 沒有歸位到 e0d)。

      cluster1::*> network device-discovery show -protocol lldp
      Node/       Local  Discovered
      Protocol    Port   Device (LLDP: ChassisID)  Interface Platform
      ----------- ------ ------------------------- --------- ----------
      node1/lldp
                  e3a    sw1 (b8:ce:f6:19:1a:7e)   swp3      -
                  e3b    sw2 (b8:ce:f6:19:1b:96)   swp3      -
      
      node2/lldp
                  e3a    sw1 (b8:ce:f6:19:1a:7e)   swp4      -
                  e3b    sw2 (b8:ce:f6:19:1b:96)   swp4      -
      
      
      cluster1::*> system switch ethernet show -is-monitoring-enabled-operational true
      Switch                      Type               Address          Model
      --------------------------- ------------------ ---------------- -----
      sw1                         cluster-network    10.233.205.90    MSN2100-CB2RC
           Serial Number: MNXXXXXXGD
            Is Monitored: true
                  Reason: None
        Software Version: Cumulus Linux version 4.4.3 running on Mellanox
                          Technologies Ltd. MSN2100
          Version Source: LLDP
      
      sw2                         cluster-network    10.233.205.91    MSN2100-CB2RC
           Serial Number: MNCXXXXXXGS
            Is Monitored: true
                  Reason: None
        Software Version: Cumulus Linux version 4.4.3 running on Mellanox
                          Technologies Ltd. MSN2100
          Version Source: LLDP
  9. 驗證叢集是否運作正常:

    cluster show

  10. 對第二個開關重複步驟 1 至 14。

  11. 啟用叢集 LIF 的自動回滾功能。

    network interface modify -vserver Cluster -lif * -auto-revert true

  1. 將集群交換器連接到管理網路。

  2. 使用 `ping`用於驗證與託管 Cumulus Linux 和 RCF 的伺服器的連接性的命令。

  3. 顯示每個節點上連接到叢集交換器的叢集連接埠:

    network device-discovery show

  4. 檢查每個叢集連接埠的管理和運作狀態。

    1. 確認叢集所有連接埠均已啟動且狀態正常:

      network port show -role cluster

    2. 確認所有叢集介面(LIF)都位於主連接埠上:

      network interface show -role cluster

    3. 確認集群顯示兩個集群交換器的資訊:

      system cluster-switch show -is-monitoring-enabled-operational true

  5. 停用群集 LIF 的自動回滾功能。叢集 LIF 會故障轉移到夥伴叢集交換機,並在您對目標交換器執行升級程序時保留在該交換器上:

    network interface modify -vserver Cluster -lif * -auto-revert false

  • 如果您要升級 RCF,則必須在此步驟中停用自動回滾功能。

  • 如果您剛剛升級了 Cumulus Linux 版本,則無需在此步驟中停用自動還原功能,因為它已經停用。

  1. 顯示SN2100交換器上的可用介面:

    admin@sw1:mgmt:~$ nv show interface
    Interface     MTU   Speed State Remote Host         Remote Port- Type      Summary
    ------------- ----- ----- ----- ------------------- ------------ --------- -------------
    + cluster_isl 9216  200G  up                                      bond
    + eth0        1500  100M  up    mgmt-sw1            Eth105/1/14   eth       IP Address: 10.231.80 206/22
      eth0                                                                      IP Address: fd20:8b1e:f6ff:fe31:4a0e/64
    + lo          65536       up                                      loopback  IP Address: 127.0.0.1/8
      lo                                                                        IP Address: ::1/128
    + swp1s0      9216 10G    up cluster01                e0b         swp
    .
    .
    .
    + swp15      9216 100G    up sw2                      swp15       swp
    + swp16      9216 100G    up sw2                      swp16       swp
  2. 將 RCF Python 腳本複製到交換器。

    cumulus@cumulus:mgmt:~$ cd /tmp
    cumulus@cumulus:mgmt:/tmp$ scp <user>@<host:/<path>/MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP .
    ssologin@10.233.204.71's password:
    MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP          100% 8607   111.2KB/s         00:00
    註 儘管 `scp`如果範例中使用的是這種方式,您可以使用您喜歡的檔案傳輸方式,例如 SFTP、HTTPS 或 FTP。
  3. 應用 RCF python 腳本 MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP

    cumulus@cumulus:mgmt:/tmp$ sudo python3 MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP
    [sudo] password for cumulus:
    .
    .
    Step 1: Creating the banner file
    Step 2: Registering banner message
    Step 3: Updating the MOTD file
    Step 4: Ensuring passwordless use of cl-support command by admin
    Step 5: Disabling apt-get
    Step 6: Creating the interfaces
    Step 7: Adding the interface config
    Step 8: Disabling cdp
    Step 9: Adding the lldp config
    Step 10: Adding the RoCE base config
    Step 11: Modifying RoCE Config
    Step 12: Configure SNMP
    Step 13: Reboot the switch

    RCF 腳本完成了上面範例中列出的步驟。

    註 在上述步驟 3「更新 MOTD 檔案」中,該指令 `cat /etc/issue.net`正在運行。這樣,您就可以驗證 RCF 檔案名稱、RCF 版本、要使用的連接埠以及 RCF 橫幅中的其他重要資訊。

    例如:

    admin@sw1:mgmt:~$ cat /etc/issue.net
    ******************************************************************************
    *
    * NetApp Reference Configuration File (RCF)
    * Switch       : Mellanox MSN2100
    * Filename     : MSN2100-RCF-1._x_-Cluster-HA-Breakout-LLDP
    * Release Date : 13-02-2023
    * Version      : 1._x_-Cluster-HA-Breakout-LLDP
    *
    * Port Usage:
    * Port 1      : 4x10G Breakout mode for Cluster+HA Ports, swp1s0-3
    * Port 2      : 4x25G Breakout mode for Cluster+HA Ports, swp2s0-3
    * Ports 3-14  : 40/100G for Cluster+HA Ports, swp3-14
    * Ports 15-16 : 100G Cluster ISL Ports, swp15-16
    *
    * NOTE:
    *   RCF manually sets swp1s0-3 link speed to 10000 and
    *   auto-negotiation to off for Intel 10G
    *   RCF manually sets swp2s0-3 link speed to 25000 and
    *   auto-negotiation to off for Chelsio 25G
    *
    *
    * IMPORTANT: Perform the following steps to ensure proper RCF installation:
    * - Copy the RCF file to /tmp
    * - Ensure the file has execute permission
    * - From /tmp run the file as sudo python3 <filename>
    *
    ******************************************************************************
    註 如果遇到任何無法解決的 RCF Python 腳本問題,請聯絡我們。 "NetApp支援"尋求幫助。
  4. 將先前對交換器配置所做的任何自訂設定重新套用。請參閱"審查佈線和配置注意事項"有關任何後續變更的詳細資訊。

  5. 重啟後驗證配置:

    admin@sw1:mgmt:~$ nv show interface
    Interface     MTU   Speed State Remote Host         Remote Port- Type      Summary
    ------------- ----- ----- ----- ------------------- ------------ --------- -------------
    + cluster_isl 9216  200G  up                                      bond
    + eth0        1500  100M  up    mgmt-sw1            Eth105/1/14   eth       IP Address: 10.231.80 206/22
      eth0                                                                      IP Address: fd20:8b1e:f6ff:fe31:4a0e/64
    + lo          65536       up                                      loopback  IP Address: 127.0.0.1/8
      lo                                                                        IP Address: ::1/128
    + swp1s0      9216 10G    up cluster01                e0b         swp
    .
    .
    .
    + swp15      9216 100G    up sw2                      swp15       swp
    + swp16      9216 100G    up sw2                      swp16       swp
    
    admin@sw1:mgmt:~$ nv show qos roce
                       operational  applied   description
    -----------------  -----------  --------- ----------------------------------------
    enable             on                     Turn feature 'on' or 'off'. This feature is disabled by default.
    mode               lossless     lossless  Roce Mode
    congestion-control
      congestion-mode   ECN,RED                Congestion config mode
      enabled-tc        0,2,5                  Congestion config enabled Traffic Class
      max-threshold     195.31 KB              Congestion config max-threshold
      min-threshold     39.06 KB               Congestion config min-threshold
      probability       100
    lldp-app-tlv
      priority          3                      switch-priority of roce
      protocol-id       4791                   L4 port number
      selector          UDP                    L4 protocol
    pfc
      pfc-priority      2, 5                   switch-prio on which PFC is enabled
      rx-enabled        enabled                PFC Rx Enabled status
      tx-enabled        enabled                PFC Tx Enabled status
    trust
      trust-mode        pcp,dscp               Trust Setting on the port for packet classification
    
    RoCE PCP/DSCP->SP mapping configurations
    ===========================================
            pcp  dscp                     switch-prio
        --  ---  -----------------------  -----------
        0   0    0,1,2,3,4,5,6,7          0
        1   1    8,9,10,11,12,13,14,15    1
        2   2    16,17,18,19,20,21,22,23  2
        3   3    24,25,26,27,28,29,30,31  3
        4   4    32,33,34,35,36,37,38,39  4
        5   5    40,41,42,43,44,45,46,47  5
        6   6    48,49,50,51,52,53,54,55  6
        7   7    56,57,58,59,60,61,62,63  7
    
    RoCE SP->TC mapping and ETS configurations
    =============================================
            switch-prio  traffic-class  scheduler-weight
        --  -----------  -------------  ----------------
        0   0            0              DWRR-28%
        1   1            0              DWRR-28%
        2   2            2              DWRR-28%
        3   3            0              DWRR-28%
        4   4            0              DWRR-28%
        5   5            5              DWRR-43%
        6   6            0              DWRR-28%
        7   7            0              DWRR-28%
    
    RoCE pool config
    ===================
            name                   mode     size  switch-priorities  traffic-class
        --  ---------------------  -------  ----  -----------------  -------------
        0   lossy-default-ingress  Dynamic  50%   0,1,3,4,6,7        -
        1   roce-reserved-ingress  Dynamic  50%   2,5                -
        2   lossy-default-egress   Dynamic  50%   -                  0
        3   roce-reserved-egress   Dynamic  inf   -                  2,5
    
    Exception List
    =================
            description
        --  -----------------------------------------------------------------------…
        1   RoCE PFC Priority Mismatch.Expected pfc-priority: 3.
        2   Congestion Config TC Mismatch.Expected enabled-tc: 0,3.
        3   Congestion Config mode Mismatch.Expected congestion-mode: ECN.
        4   Congestion Config min-threshold Mismatch.Expected min-threshold: 150000.
        5   Congestion Config max-threshold Mismatch.Expected max-threshold:
            1500000.
        6   Scheduler config mismatch for traffic-class mapped to switch-prio0.
            Expected scheduler-weight: DWRR-50%.
        7   Scheduler config mismatch for traffic-class mapped to switch-prio1.
            Expected scheduler-weight: DWRR-50%.
        8   Scheduler config mismatch for traffic-class mapped to switch-prio2.
            Expected scheduler-weight: DWRR-50%.
        9   Scheduler config mismatch for traffic-class mapped to switch-prio3.
            Expected scheduler-weight: DWRR-50%.
        10  Scheduler config mismatch for traffic-class mapped to switch-prio4.
            Expected scheduler-weight: DWRR-50%.
        11  Scheduler config mismatch for traffic-class mapped to switch-prio5.
            Expected scheduler-weight: DWRR-50%.
        12  Scheduler config mismatch for traffic-class mapped to switch-prio6.
            Expected scheduler-weight: strict-priority.
        13  Scheduler config mismatch for traffic-class mapped to switch-prio7.
            Expected scheduler-weight: DWRR-50%.
        14  Invalid reserved config for ePort.TC[2].Expected 0 Got 1024
        15  Invalid reserved config for ePort.TC[5].Expected 0 Got 1024
        16  Invalid traffic-class mapping for switch-priority 2.Expected 0 Got 2
        17  Invalid traffic-class mapping for switch-priority 3.Expected 3 Got 0
        18  Invalid traffic-class mapping for switch-priority 5.Expected 0 Got 5
        19  Invalid traffic-class mapping for switch-priority 6.Expected 6 Got 0
    Incomplete Command: set interface swp3-16 link fast-linkupp3-16 link fast-linkup
    Incomplete Command: set interface swp3-16 link fast-linkupp3-16 link fast-linkup
    Incomplete Command: set interface swp3-16 link fast-linkupp3-16 link fast-linkup
    註 所列例外情況不影響效能,可以安全地忽略。
  6. 請核對介面中收發器的資訊:

    admin@sw1:mgmt:~$ nv show interface --view=pluggables
    Interface  Identifier     Vendor Name  Vendor PN        Vendor SN       Vendor Rev
    ---------  -------------  -----------  ---------------  --------------  ----------
    swp1s0     0x00 None
    swp1s1     0x00 None
    swp1s2     0x00 None
    swp1s3     0x00 None
    swp2s0     0x11 (QSFP28)  CISCO-LEONI  L45593-D278-D20  LCC2321GTTJ     00
    swp2s1     0x11 (QSFP28)  CISCO-LEONI  L45593-D278-D20  LCC2321GTTJ     00
    swp2s2     0x11 (QSFP28)  CISCO-LEONI  L45593-D278-D20  LCC2321GTTJ     00
    swp2s3     0x11 (QSFP28)  CISCO-LEONI  L45593-D278-D20  LCC2321GTTJ     00
    swp3       0x00 None
    swp4       0x00 None
    swp5       0x00 None
    swp6       0x00 None
    .
    .
    .
    swp15      0x11 (QSFP28)  Amphenol     112-00595        APF20279210117  B0
    swp16      0x11 (QSFP28)  Amphenol     112-00595        APF20279210166  B0
  7. 確認每個節點都與每個交換器有連接:

    admin@sw1:mgmt:~$ nv show interface --view=lldp
    
    LocalPort  Speed  Mode        RemoteHost               RemotePort
    ---------  -----  ----------  -----------------------  -----------
    eth0       100M   Mgmt        mgmt-sw1                 Eth110/1/29
    swp2s1     25G    Trunk/L2    node1                    e0a
    swp15      100G   BondMember  sw2                      swp15
    swp16      100G   BondMember  sw2                      swp16
  8. 檢查叢集上叢集連接埠的運作狀況。

    1. 確認叢集中所有節點的叢集連接埠均已啟動且運作狀況良好:

      cluster1::*> network port show -role cluster
      
      Node: node1
                                                                             Ignore
                                                        Speed(Mbps) Health   Health
      Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status   Status
      --------- ------------ ---------------- ---- ---- ----------- -------- ------
      e3a       Cluster      Cluster          up   9000  auto/10000 healthy  false
      e3b       Cluster      Cluster          up   9000  auto/10000 healthy  false
      
      Node: node2
                                                                             Ignore
                                                        Speed(Mbps) Health   Health
      Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status   Status
      --------- ------------ ---------------- ---- ---- ----------- -------- ------
      e3a       Cluster      Cluster          up   9000  auto/10000 healthy  false
      e3b       Cluster      Cluster          up   9000  auto/10000 healthy  false
    2. 從叢集驗證交換器的健康狀況(這可能不會顯示交換器 sw2,因為 LIF 沒有歸位到 e0d)。

      cluster1::*> network device-discovery show -protocol lldp
      Node/       Local  Discovered
      Protocol    Port   Device (LLDP: ChassisID)  Interface Platform
      ----------- ------ ------------------------- --------- ----------
      node1/lldp
                  e3a    sw1 (b8:ce:f6:19:1a:7e)   swp3      -
                  e3b    sw2 (b8:ce:f6:19:1b:96)   swp3      -
      
      node2/lldp
                  e3a    sw1 (b8:ce:f6:19:1a:7e)   swp4      -
                  e3b    sw2 (b8:ce:f6:19:1b:96)   swp4      -
      
      
      cluster1::*> system switch ethernet show -is-monitoring-enabled-operational true
      Switch                      Type               Address          Model
      --------------------------- ------------------ ---------------- -----
      sw1                         cluster-network    10.233.205.90    MSN2100-CB2RC
           Serial Number: MNXXXXXXGD
            Is Monitored: true
                  Reason: None
        Software Version: Cumulus Linux version 5.4.0 running on Mellanox
                          Technologies Ltd. MSN2100
          Version Source: LLDP
      
      sw2                         cluster-network    10.233.205.91    MSN2100-CB2RC
           Serial Number: MNCXXXXXXGS
            Is Monitored: true
                  Reason: None
        Software Version: Cumulus Linux version 5.4.0 running on Mellanox
                          Technologies Ltd. MSN2100
          Version Source: LLDP
  9. 驗證叢集是否運作正常:

    cluster show

  10. 對第二個開關重複步驟 1 至 14。

  11. 啟用叢集 LIF 的自動回滾功能。

    network interface modify -vserver Cluster -lif * -auto-revert true

  1. 將集群交換器連接到管理網路。

  2. 使用 `ping`用於驗證與託管 Cumulus Linux 和 RCF 的伺服器的連接性的命令。

  3. 顯示每個節點上連接到叢集交換器的叢集連接埠:

    network device-discovery show

  4. 檢查每個叢集連接埠的管理和運作狀態。

    1. 確認叢集所有連接埠均已啟動且狀態正常:

      network port show -role cluster

    2. 確認所有叢集介面(LIF)都位於主連接埠上:

      network interface show -role cluster

    3. 確認集群顯示兩個集群交換器的資訊:

      system cluster-switch show -is-monitoring-enabled-operational true

  5. 停用群集 LIF 的自動回滾功能。叢集 LIF 會故障轉移到夥伴叢集交換機,並在您對目標交換器執行升級程序時保留在該交換器上:

    network interface modify -vserver Cluster -lif * -auto-revert false

  • 如果您要升級 RCF,則必須在此步驟中停用自動回滾功能。

  • 如果您剛剛升級了 Cumulus Linux 版本,則無需在此步驟中停用自動還原功能,因為它已經停用。

  1. 顯示SN2100交換器上的可用介面:

    admin@sw1:mgmt:~$ nv show interface
    Interface     MTU   Speed State Remote Host         Remote Port- Type      Summary
    ------------- ----- ----- ----- ------------------- ------------ --------- -------------
    + cluster_isl 9216  200G  up                                      bond
    + eth0        1500  100M  up    mgmt-sw1            Eth105/1/14   eth       IP Address: 10.231.80 206/22
      eth0                                                                      IP Address: fd20:8b1e:f6ff:fe31:4a0e/64
    + lo          65536       up                                      loopback  IP Address: 127.0.0.1/8
      lo                                                                        IP Address: ::1/128
    + swp1s0      9216 10G    up cluster01                e0b         swp
    .
    .
    .
    + swp15      9216 100G    up sw2                      swp15       swp
    + swp16      9216 100G    up sw2                      swp16       swp
  2. 將 RCF Python 腳本複製到交換器。

    cumulus@cumulus:mgmt:~$ cd /tmp
    cumulus@cumulus:mgmt:/tmp$ scp <user>@<host:/<path>/MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP .
    ssologin@10.233.204.71's password:
    MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP          100% 8607   111.2KB/s         00:00
    註 雖然 `scp`如果範例中使用的是這種方式,您可以使用您喜歡的檔案傳輸方式,例如 SFTP、HTTPS 或 FTP。
  3. 應用 RCF python 腳本 MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP

    cumulus@cumulus:mgmt:/tmp$ sudo python3 MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP
    [sudo] password for cumulus:
    .
    .
    Step 1: Creating the banner file
    Step 2: Registering banner message
    Step 3: Updating the MOTD file
    Step 4: Ensuring passwordless use of cl-support command by admin
    Step 5: Disabling apt-get
    Step 6: Creating the interfaces
    Step 7: Adding the interface config
    Step 8: Disabling cdp
    Step 9: Adding the lldp config
    Step 10: Adding the RoCE base config
    Step 11: Modifying RoCE Config
    Step 12: Configure SNMP
    Step 13: Reboot the switch

    RCF 腳本完成了上面範例中列出的步驟。

    註 在上述步驟 3 更新 MOTD 檔案 中,執行指令 cat /etc/issue.net。這樣,您就可以驗證 RCF 檔案名稱、RCF 版本、要使用的連接埠以及 RCF 橫幅中的其他重要資訊。

    例如:

    admin@sw1:mgmt:~$ cat /etc/issue.net
    ******************************************************************************
    *
    * NetApp Reference Configuration File (RCF)
    * Switch       : Mellanox MSN2100
    * Filename     : MSN2100-RCF-1._x_-Cluster-HA-Breakout-LLDP
    * Release Date : 13-02-2023
    * Version      : 1._x_-Cluster-HA-Breakout-LLDP
    *
    * Port Usage:
    * Port 1      : 4x10G Breakout mode for Cluster+HA Ports, swp1s0-3
    * Port 2      : 4x25G Breakout mode for Cluster+HA Ports, swp2s0-3
    * Ports 3-14  : 40/100G for Cluster+HA Ports, swp3-14
    * Ports 15-16 : 100G Cluster ISL Ports, swp15-16
    *
    * NOTE:
    *   RCF manually sets swp1s0-3 link speed to 10000 and
    *   auto-negotiation to off for Intel 10G
    *   RCF manually sets swp2s0-3 link speed to 25000 and
    *   auto-negotiation to off for Chelsio 25G
    *
    *
    * IMPORTANT: Perform the following steps to ensure proper RCF installation:
    * - Copy the RCF file to /tmp
    * - Ensure the file has execute permission
    * - From /tmp run the file as sudo python3 <filename>
    *
    ******************************************************************************
    註 如果遇到任何無法解決的 RCF Python 腳本問題,請聯絡我們。 "NetApp支援"尋求幫助。
  4. 將先前對交換器配置所做的任何自訂設定重新套用。請參閱"審查佈線和配置注意事項"有關任何後續變更的詳細資訊。

  5. 重啟後驗證配置:

    admin@sw1:mgmt:~$ nv show interface
    Interface     MTU   Speed State Remote Host         Remote Port- Type      Summary
    ------------- ----- ----- ----- ------------------- ------------ --------- -------------
    + cluster_isl 9216  200G  up                                      bond
    + eth0        1500  100M  up    mgmt-sw1            Eth105/1/14   eth       IP Address: 10.231.80 206/22
      eth0                                                                      IP Address: fd20:8b1e:f6ff:fe31:4a0e/64
    + lo          65536       up                                      loopback  IP Address: 127.0.0.1/8
      lo                                                                        IP Address: ::1/128
    + swp1s0      9216 10G    up cluster01                e0b         swp
    .
    .
    .
    + swp15      9216 100G    up sw2                      swp15       swp
    + swp16      9216 100G    up sw2                      swp16       swp
    
    admin@sw1:mgmt:~$ nv show qos roce
                       operational  applied   description
    -----------------  -----------  --------- ----------------------------------------
    enable             on                     Turn feature 'on' or 'off'. This feature is disabled by default.
    mode               lossless     lossless  Roce Mode
    congestion-control
      congestion-mode   ECN,RED                Congestion config mode
      enabled-tc        0,2,5                  Congestion config enabled Traffic Class
      max-threshold     195.31 KB              Congestion config max-threshold
      min-threshold     39.06 KB               Congestion config min-threshold
      probability       100
    lldp-app-tlv
      priority          3                      switch-priority of roce
      protocol-id       4791                   L4 port number
      selector          UDP                    L4 protocol
    pfc
      pfc-priority      2, 5                   switch-prio on which PFC is enabled
      rx-enabled        enabled                PFC Rx Enabled status
      tx-enabled        enabled                PFC Tx Enabled status
    trust
      trust-mode        pcp,dscp               Trust Setting on the port for packet classification
    
    RoCE PCP/DSCP->SP mapping configurations
    ===========================================
            pcp  dscp                     switch-prio
        --  ---  -----------------------  -----------
        0   0    0,1,2,3,4,5,6,7          0
        1   1    8,9,10,11,12,13,14,15    1
        2   2    16,17,18,19,20,21,22,23  2
        3   3    24,25,26,27,28,29,30,31  3
        4   4    32,33,34,35,36,37,38,39  4
        5   5    40,41,42,43,44,45,46,47  5
        6   6    48,49,50,51,52,53,54,55  6
        7   7    56,57,58,59,60,61,62,63  7
    
    RoCE SP->TC mapping and ETS configurations
    =============================================
            switch-prio  traffic-class  scheduler-weight
        --  -----------  -------------  ----------------
        0   0            0              DWRR-28%
        1   1            0              DWRR-28%
        2   2            2              DWRR-28%
        3   3            0              DWRR-28%
        4   4            0              DWRR-28%
        5   5            5              DWRR-43%
        6   6            0              DWRR-28%
        7   7            0              DWRR-28%
    
    RoCE pool config
    ===================
            name                   mode     size  switch-priorities  traffic-class
        --  ---------------------  -------  ----  -----------------  -------------
        0   lossy-default-ingress  Dynamic  50%   0,1,3,4,6,7        -
        1   roce-reserved-ingress  Dynamic  50%   2,5                -
        2   lossy-default-egress   Dynamic  50%   -                  0
        3   roce-reserved-egress   Dynamic  inf   -                  2,5
    
    Exception List
    =================
            description
        --  -----------------------------------------------------------------------…
        1   RoCE PFC Priority Mismatch.Expected pfc-priority: 3.
        2   Congestion Config TC Mismatch.Expected enabled-tc: 0,3.
        3   Congestion Config mode Mismatch.Expected congestion-mode: ECN.
        4   Congestion Config min-threshold Mismatch.Expected min-threshold: 150000.
        5   Congestion Config max-threshold Mismatch.Expected max-threshold:
            1500000.
        6   Scheduler config mismatch for traffic-class mapped to switch-prio0.
            Expected scheduler-weight: DWRR-50%.
        7   Scheduler config mismatch for traffic-class mapped to switch-prio1.
            Expected scheduler-weight: DWRR-50%.
        8   Scheduler config mismatch for traffic-class mapped to switch-prio2.
            Expected scheduler-weight: DWRR-50%.
        9   Scheduler config mismatch for traffic-class mapped to switch-prio3.
            Expected scheduler-weight: DWRR-50%.
        10  Scheduler config mismatch for traffic-class mapped to switch-prio4.
            Expected scheduler-weight: DWRR-50%.
        11  Scheduler config mismatch for traffic-class mapped to switch-prio5.
            Expected scheduler-weight: DWRR-50%.
        12  Scheduler config mismatch for traffic-class mapped to switch-prio6.
            Expected scheduler-weight: strict-priority.
        13  Scheduler config mismatch for traffic-class mapped to switch-prio7.
            Expected scheduler-weight: DWRR-50%.
        14  Invalid reserved config for ePort.TC[2].Expected 0 Got 1024
        15  Invalid reserved config for ePort.TC[5].Expected 0 Got 1024
        16  Invalid traffic-class mapping for switch-priority 2.Expected 0 Got 2
        17  Invalid traffic-class mapping for switch-priority 3.Expected 3 Got 0
        18  Invalid traffic-class mapping for switch-priority 5.Expected 0 Got 5
        19  Invalid traffic-class mapping for switch-priority 6.Expected 6 Got 0
    Incomplete Command: set interface swp3-16 link fast-linkupp3-16 link fast-linkup
    Incomplete Command: set interface swp3-16 link fast-linkupp3-16 link fast-linkup
    Incomplete Command: set interface swp3-16 link fast-linkupp3-16 link fast-linkup
    註 所列例外情況不影響效能,可以忽略。
  6. 請核對介面中收發器的資訊:

    admin@sw1:mgmt:~$ nv show platform transceiver
    Interface  Identifier     Vendor Name  Vendor PN        Vendor SN       Vendor Rev
    ---------  -------------  -----------  ---------------  --------------  ----------
    swp1s0     0x00 None
    swp1s1     0x00 None
    swp1s2     0x00 None
    swp1s3     0x00 None
    swp2s0     0x11 (QSFP28)  CISCO-LEONI  L45593-D278-D20  LCC2321GTTJ     00
    swp2s1     0x11 (QSFP28)  CISCO-LEONI  L45593-D278-D20  LCC2321GTTJ     00
    swp2s2     0x11 (QSFP28)  CISCO-LEONI  L45593-D278-D20  LCC2321GTTJ     00
    swp2s3     0x11 (QSFP28)  CISCO-LEONI  L45593-D278-D20  LCC2321GTTJ     00
    swp3       0x00 None
    swp4       0x00 None
    swp5       0x00 None
    swp6       0x00 None
    .
    .
    .
    swp15      0x11 (QSFP28)  Amphenol     112-00595        APF20279210117  B0
    swp16      0x11 (QSFP28)  Amphenol     112-00595        APF20279210166  B0
  7. 確認每個節點都與每個交換器有連接:

    admin@sw1:mgmt:~$ nv show interface lldp
    
    LocalPort  Speed  Mode        RemoteHost               RemotePort
    ---------  -----  ----------  -----------------------  -----------
    eth0       100M   Mgmt        mgmt-sw1                 Eth110/1/29
    swp2s1     25G    Trunk/L2    node1                    e0a
    swp15      100G   BondMember  sw2                      swp15
    swp16      100G   BondMember  sw2                      swp16
  8. 檢查叢集上叢集連接埠的運作狀況。

    1. 確認叢集中所有節點的叢集連接埠均已啟動且運作狀況良好:

      cluster1::*> network port show -role cluster
      
      Node: node1
                                                                             Ignore
                                                        Speed(Mbps) Health   Health
      Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status   Status
      --------- ------------ ---------------- ---- ---- ----------- -------- ------
      e3a       Cluster      Cluster          up   9000  auto/10000 healthy  false
      e3b       Cluster      Cluster          up   9000  auto/10000 healthy  false
      
      Node: node2
                                                                             Ignore
                                                        Speed(Mbps) Health   Health
      Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status   Status
      --------- ------------ ---------------- ---- ---- ----------- -------- ------
      e3a       Cluster      Cluster          up   9000  auto/10000 healthy  false
      e3b       Cluster      Cluster          up   9000  auto/10000 healthy  false
    2. 從叢集驗證交換器的健康狀況(這可能不會顯示交換器 sw2,因為 LIF 沒有歸位到 e0d)。

      cluster1::*> network device-discovery show -protocol lldp
      Node/       Local  Discovered
      Protocol    Port   Device (LLDP: ChassisID)  Interface Platform
      ----------- ------ ------------------------- --------- ----------
      node1/lldp
                  e3a    sw1 (b8:ce:f6:19:1a:7e)   swp3      -
                  e3b    sw2 (b8:ce:f6:19:1b:96)   swp3      -
      
      node2/lldp
                  e3a    sw1 (b8:ce:f6:19:1a:7e)   swp4      -
                  e3b    sw2 (b8:ce:f6:19:1b:96)   swp4      -
      
      
      cluster1::*> system switch ethernet show -is-monitoring-enabled-operational true
      Switch                      Type               Address          Model
      --------------------------- ------------------ ---------------- -----
      sw1                         cluster-network    10.233.205.90    MSN2100-CB2RC
           Serial Number: MNXXXXXXGD
            Is Monitored: true
                  Reason: None
        Software Version: Cumulus Linux version 5.4.0 running on Mellanox
                          Technologies Ltd. MSN2100
          Version Source: LLDP
      
      sw2                         cluster-network    10.233.205.91    MSN2100-CB2RC
           Serial Number: MNCXXXXXXGS
            Is Monitored: true
                  Reason: None
        Software Version: Cumulus Linux version 5.4.0 running on Mellanox
                          Technologies Ltd. MSN2100
          Version Source: LLDP
  9. 驗證叢集是否運作正常:

    cluster show

  10. 對第二個開關重複步驟 1 至 14。

  11. 啟用叢集 LIF 的自動回滾功能。

    network interface modify -vserver Cluster -lif * -auto-revert true

下一步是什麼?

安裝 RCF 後,您可以… "安裝 CSHM 文件"