安装参考配置文件(Reference Configuration File、RCF)脚本
按照此操作步骤 安装RCF脚本。
在安装RCF脚本之前、请确保交换机上具有以下配置:
-
安装了Cumulus Linux。请参见 "Hardware Universe" 支持的版本。
-
通过DHCP定义或手动配置的IP地址、子网掩码和默认网关。
除了管理员用户之外、您还必须在RC框架 中指定一个用户、以专门用于收集日志。 |
集群和存储应用程序可以使用两个RC框架 脚本。从下载RCF "此处"。每个的操作步骤 是相同的。
-
集群:* MSN2100-RCP-v1._x—cluster-HA-Breakout—LCDP*
-
存储:* MSN2100-RFP-v1.x-Storage*
以下示例操作步骤 显示了如何下载并应用集群交换机的RCF脚本。
示例命令输出使用交换机管理IP地址10.233.204.71、网络掩码255.255.254.0和默认网关10.233.204.1。
-
显示SN2100交换机上的可用接口:
admin@sw1:mgmt:~$ net show interface all State Name Spd MTU Mode LLDP Summary ----- ----- --- ----- ----------- ------------------ -------------- ... ... ADMDN swp1 N/A 9216 NotConfigured ADMDN swp2 N/A 9216 NotConfigured ADMDN swp3 N/A 9216 NotConfigured ADMDN swp4 N/A 9216 NotConfigured ADMDN swp5 N/A 9216 NotConfigured ADMDN swp6 N/A 9216 NotConfigured ADMDN swp7 N/A 9216 NotConfigured ADMDN swp8 N/A 9216 NotConfigured ADMDN swp9 N/A 9216 NotConfigured ADMDN swp10 N/A 9216 NotConfigured ADMDN swp11 N/A 9216 NotConfigured ADMDN swp12 N/A 9216 NotConfigured ADMDN swp13 N/A 9216 NotConfigured ADMDN swp14 N/A 9216 NotConfigured ADMDN swp15 N/A 9216 NotConfigured ADMDN swp16 N/A 9216 NotConfigured
-
将RCF python脚本复制到交换机。
cumulus@cumulus:mgmt:~$ cd /tmp cumulus@cumulus:mgmt:/tmp$ scp <user>@<host:/<path>/MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP . ssologin@10.233.204.71's password: MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP 100% 8607 111.2KB/s 00:00
同时 scp
在本示例中、您可以使用首选的文件传输方法。 -
应用RCF pyPython脚本*MSN2100-RCP-v1._x—Cluster-HA-Breakout-LCDP*。
cumulus@cumulus:mgmt:/tmp$ sudo python3 MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP [sudo] password for cumulus: ... Step 1: Creating the banner file Step 2: Registering banner message Step 3: Updating the MOTD file Step 4: Ensuring passwordless use of cl-support command by admin Step 5: Disabling apt-get Step 6: Creating the interfaces Step 7: Adding the interface config Step 8: Disabling cdp Step 9: Adding the lldp config Step 10: Adding the RoCE base config Step 11: Modifying RoCE Config Step 12: Configure SNMP Step 13: Reboot the switch
RCF脚本将完成上述示例中列出的步骤。
在步骤3*更新上面的MOTD文件*中,命令 cat /etc/motd
已运行。这样、您可以验证RCV文件名、RCV版本、要使用的端口以及RCV横幅中的其他重要信息。对于无法更正的任何RCF python脚本问题、请联系 "NetApp 支持" 以获得帮助。 -
将先前的所有自定义设置重新应用于交换机配置。"查看布线和配置注意事项"有关所需的任何进一步更改的详细信息、请参见。
-
重新启动后验证配置:
admin@sw1:mgmt:~$ net show interface all State Name Spd MTU Mode LLDP Summary ----- --------- ---- ----- ---------- ----------------- -------- ... ... DN swp1s0 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp1s1 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp1s2 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp1s3 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp2s0 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp2s1 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp2s2 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp2s3 N/A 9216 Trunk/L2 Master: bridge(UP) UP swp3 100G 9216 Trunk/L2 Master: bridge(UP) UP swp4 100G 9216 Trunk/L2 Master: bridge(UP) DN swp5 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp6 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp7 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp8 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp9 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp10 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp11 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp12 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp13 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp14 N/A 9216 Trunk/L2 Master: bridge(UP) UP swp15 N/A 9216 BondMember Master: bond_15_16(UP) UP swp16 N/A 9216 BondMember Master: bond_15_16(UP) ... ... admin@sw1:mgmt:~$ net show roce config RoCE mode.......... lossless Congestion Control: Enabled SPs.... 0 2 5 Mode........... ECN Min Threshold.. 150 KB Max Threshold.. 1500 KB PFC: Status......... enabled Enabled SPs.... 2 5 Interfaces......... swp10-16,swp1s0-3,swp2s0-3,swp3-9 DSCP 802.1p switch-priority ----------------------- ------ --------------- 0 1 2 3 4 5 6 7 0 0 8 9 10 11 12 13 14 15 1 1 16 17 18 19 20 21 22 23 2 2 24 25 26 27 28 29 30 31 3 3 32 33 34 35 36 37 38 39 4 4 40 41 42 43 44 45 46 47 5 5 48 49 50 51 52 53 54 55 6 6 56 57 58 59 60 61 62 63 7 7 switch-priority TC ETS --------------- -- -------- 0 1 3 4 6 7 0 DWRR 28% 2 2 DWRR 28% 5 5 DWRR 43%
-
验证接口中收发器的信息:
admin@sw1:mgmt:~$ net show interface pluggables Interface Identifier Vendor Name Vendor PN Vendor SN Vendor Rev --------- ------------- ----------- --------------- -------------- ---------- swp3 0x11 (QSFP28) Amphenol 112-00574 APF20379253516 B0 swp4 0x11 (QSFP28) AVAGO 332-00440 AF1815GU05Z A0 swp15 0x11 (QSFP28) Amphenol 112-00573 APF21109348001 B0 swp16 0x11 (QSFP28) Amphenol 112-00573 APF21109347895 B0
-
验证每个节点是否都与每个交换机建立了连接:
admin@sw1:mgmt:~$ net show lldp LocalPort Speed Mode RemoteHost RemotePort --------- ----- ---------- ---------------------- ----------- swp3 100G Trunk/L2 sw1 e3a swp4 100G Trunk/L2 sw2 e3b swp15 100G BondMember sw13 swp15 swp16 100G BondMember sw14 swp16
-
验证集群上集群端口的运行状况。
-
验证集群中所有节点上的集群端口是否均已启动且运行正常:
cluster1::*> network port show -role cluster Node: node1 Ignore Speed(Mbps) Health Health Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status --------- ------------ ---------------- ---- ---- ----------- -------- ------ e3a Cluster Cluster up 9000 auto/10000 healthy false e3b Cluster Cluster up 9000 auto/10000 healthy false Node: node2 Ignore Speed(Mbps) Health Health Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status --------- ------------ ---------------- ---- ---- ----------- -------- ------ e3a Cluster Cluster up 9000 auto/10000 healthy false e3b Cluster Cluster up 9000 auto/10000 healthy false
-
从集群验证交换机运行状况(此操作可能不会显示交换机SW2、因为LIF不驻留在e0d上)。
cluster1::*> network device-discovery show -protocol lldp Node/ Local Discovered Protocol Port Device (LLDP: ChassisID) Interface Platform ----------- ------ ------------------------- --------- ---------- node1/lldp e3a sw1 (b8:ce:f6:19:1a:7e) swp3 - e3b sw2 (b8:ce:f6:19:1b:96) swp3 - node2/lldp e3a sw1 (b8:ce:f6:19:1a:7e) swp4 - e3b sw2 (b8:ce:f6:19:1b:96) swp4 - cluster1::*> system switch ethernet show -is-monitoring-enabled-operational true Switch Type Address Model --------------------------- ------------------ ---------------- ----- sw1 cluster-network 10.233.205.90 MSN2100-CB2RC Serial Number: MNXXXXXXGD Is Monitored: true Reason: None Software Version: Cumulus Linux version 4.4.3 running on Mellanox Technologies Ltd. MSN2100 Version Source: LLDP sw2 cluster-network 10.233.205.91 MSN2100-CB2RC Serial Number: MNCXXXXXXGS Is Monitored: true Reason: None Software Version: Cumulus Linux version 4.4.3 running on Mellanox Technologies Ltd. MSN2100 Version Source: LLDP
-
-
显示SN2100交换机上的可用接口:
admin@sw1:mgmt:~$ nv show interface Interface MTU Speed State Remote Host Remote Port- Type Summary ------------- ----- ----- ----- ------------------- ------------ --------- ------------- + cluster_isl 9216 200G up bond + eth0 1500 100M up mgmt-sw1 Eth105/1/14 eth IP Address: 10.231.80 206/22 eth0 IP Address: fd20:8b1e:f6ff:fe31:4a0e/64 + lo 65536 up loopback IP Address: 127.0.0.1/8 lo IP Address: ::1/128 + swp1s0 9216 10G up cluster01 e0b swp . . . + swp15 9216 100G up sw2 swp15 swp + swp16 9216 100G up sw2 swp16 swp
-
将RCF python脚本复制到交换机。
cumulus@cumulus:mgmt:~$ cd /tmp cumulus@cumulus:mgmt:/tmp$ scp <user>@<host:/<path>/MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP . ssologin@10.233.204.71's password: MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP 100% 8607 111.2KB/s 00:00
同时 scp
在本示例中、您可以使用首选的文件传输方法。 -
应用RCF pyPython脚本*MSN2100-RCP-v1._x—Cluster-HA-Breakout-LCDP*。
cumulus@cumulus:mgmt:/tmp$ sudo python3 MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP [sudo] password for cumulus: . . Step 1: Creating the banner file Step 2: Registering banner message Step 3: Updating the MOTD file Step 4: Ensuring passwordless use of cl-support command by admin Step 5: Disabling apt-get Step 6: Creating the interfaces Step 7: Adding the interface config Step 8: Disabling cdp Step 9: Adding the lldp config Step 10: Adding the RoCE base config Step 11: Modifying RoCE Config Step 12: Configure SNMP Step 13: Reboot the switch
RCF脚本将完成上述示例中列出的步骤。
在步骤3*更新上面的MOTD文件*中,命令 cat /etc/issue
已运行。这样、您可以验证RCV文件名、RCV版本、要使用的端口以及RCV横幅中的其他重要信息。例如:
admin@sw1:mgmt:~$ cat /etc/issue ****************************************************************************** * * NetApp Reference Configuration File (RCF) * Switch : Mellanox MSN2100 * Filename : MSN2100-RCF-1._x_-Cluster-HA-Breakout-LLDP * Release Date : 13-02-2023 * Version : 1._x_-Cluster-HA-Breakout-LLDP * * Port Usage: * Port 1 : 4x10G Breakout mode for Cluster+HA Ports, swp1s0-3 * Port 2 : 4x25G Breakout mode for Cluster+HA Ports, swp2s0-3 * Ports 3-14 : 40/100G for Cluster+HA Ports, swp3-14 * Ports 15-16 : 100G Cluster ISL Ports, swp15-16 * * NOTE: * RCF manually sets swp1s0-3 link speed to 10000 and * auto-negotiation to off for Intel 10G * RCF manually sets swp2s0-3 link speed to 25000 and * auto-negotiation to off for Chelsio 25G * * * IMPORTANT: Perform the following steps to ensure proper RCF installation: * - Copy the RCF file to /tmp * - Ensure the file has execute permission * - From /tmp run the file as sudo python3 <filename> * ******************************************************************************
对于无法更正的任何RCF python脚本问题、请联系 "NetApp 支持" 以获得帮助。 -
将先前的所有自定义设置重新应用于交换机配置。"查看布线和配置注意事项"有关所需的任何进一步更改的详细信息、请参见。
-
重新启动后验证配置:
admin@sw1:mgmt:~$ nv show interface Interface MTU Speed State Remote Host Remote Port Type Summary ----------- ----- ----- ----- ----------- ----------- ---- ------------- + cluster_isl 9216 200G up bond + eth0 1500 100M up RTP-LF01-410G38.rtp.eng.netapp.com Eth105/1/14 eth IP Address: 10.231.80.206/22 eth0 IP Address: fd20:8b1e:b255:85a0:bace:f6ff:fe31:4a0e/64 + lo 65536 up loopback IP Address: 127.0.0.1/8 lo IP Address: ::1/128 + swp1s0 9216 10G up cumulus1 e0b swp . . . + swp15 9216 100G up cumulus swp15 swp admin@sw1:mgmt:~$ nv show interface Interface MTU Speed State Remote Host Remote Port- Type Summary ------------- ----- ----- ----- ------------------- ------------ --------- ------------- + cluster_isl 9216 200G up bond + eth0 1500 100M up mgmt-sw1 Eth105/1/14 eth IP Address: 10.231.80 206/22 eth0 IP Address: fd20:8b1e:f6ff:fe31:4a0e/64 + lo 65536 up loopback IP Address: 127.0.0.1/8 lo IP Address: ::1/128 + swp1s0 9216 10G up cluster01 e0b swp . . . + swp15 9216 100G up sw2 swp15 swp + swp16 9216 100G up sw2 swp16 swp admin@sw1:mgmt:~$ nv show qos roce operational applied description ----------------- ----------- --------- ---------------------------------------- enable on Turn feature 'on' or 'off'. This feature is disabled by default. mode lossless lossless Roce Mode congestion-control congestion-mode ECN,RED Congestion config mode enabled-tc 0,2,5 Congestion config enabled Traffic Class max-threshold 195.31 KB Congestion config max-threshold min-threshold 39.06 KB Congestion config min-threshold probability 100 lldp-app-tlv priority 3 switch-priority of roce protocol-id 4791 L4 port number selector UDP L4 protocol pfc pfc-priority 2, 5 switch-prio on which PFC is enabled rx-enabled enabled PFC Rx Enabled status tx-enabled enabled PFC Tx Enabled status trust trust-mode pcp,dscp Trust Setting on the port for packet classification RoCE PCP/DSCP->SP mapping configurations =========================================== pcp dscp switch-prio -- --- ----------------------- ----------- 0 0 0,1,2,3,4,5,6,7 0 1 1 8,9,10,11,12,13,14,15 1 2 2 16,17,18,19,20,21,22,23 2 3 3 24,25,26,27,28,29,30,31 3 4 4 32,33,34,35,36,37,38,39 4 5 5 40,41,42,43,44,45,46,47 5 6 6 48,49,50,51,52,53,54,55 6 7 7 56,57,58,59,60,61,62,63 7 RoCE SP->TC mapping and ETS configurations ============================================= switch-prio traffic-class scheduler-weight -- ----------- ------------- ---------------- 0 0 0 DWRR-28% 1 1 0 DWRR-28% 2 2 2 DWRR-28% 3 3 0 DWRR-28% 4 4 0 DWRR-28% 5 5 5 DWRR-43% 6 6 0 DWRR-28% 7 7 0 DWRR-28% RoCE pool config =================== name mode size switch-priorities traffic-class -- --------------------- ------- ---- ----------------- ------------- 0 lossy-default-ingress Dynamic 50% 0,1,3,4,6,7 - 1 roce-reserved-ingress Dynamic 50% 2,5 - 2 lossy-default-egress Dynamic 50% - 0 3 roce-reserved-egress Dynamic inf - 2,5 Exception List ================= description -- -----------------------------------------------------------------------… 1 RoCE PFC Priority Mismatch.Expected pfc-priority: 3. 2 Congestion Config TC Mismatch.Expected enabled-tc: 0,3. 3 Congestion Config mode Mismatch.Expected congestion-mode: ECN. 4 Congestion Config min-threshold Mismatch.Expected min-threshold: 150000. 5 Congestion Config max-threshold Mismatch.Expected max-threshold: 1500000. 6 Scheduler config mismatch for traffic-class mapped to switch-prio0. Expected scheduler-weight: DWRR-50%. 7 Scheduler config mismatch for traffic-class mapped to switch-prio1. Expected scheduler-weight: DWRR-50%. 8 Scheduler config mismatch for traffic-class mapped to switch-prio2. Expected scheduler-weight: DWRR-50%. 9 Scheduler config mismatch for traffic-class mapped to switch-prio3. Expected scheduler-weight: DWRR-50%. 10 Scheduler config mismatch for traffic-class mapped to switch-prio4. Expected scheduler-weight: DWRR-50%. 11 Scheduler config mismatch for traffic-class mapped to switch-prio5. Expected scheduler-weight: DWRR-50%. 12 Scheduler config mismatch for traffic-class mapped to switch-prio6. Expected scheduler-weight: strict-priority. 13 Scheduler config mismatch for traffic-class mapped to switch-prio7. Expected scheduler-weight: DWRR-50%. 14 Invalid reserved config for ePort.TC[2].Expected 0 Got 1024 15 Invalid reserved config for ePort.TC[5].Expected 0 Got 1024 16 Invalid traffic-class mapping for switch-priority 2.Expected 0 Got 2 17 Invalid traffic-class mapping for switch-priority 3.Expected 3 Got 0 18 Invalid traffic-class mapping for switch-priority 5.Expected 0 Got 5 19 Invalid traffic-class mapping for switch-priority 6.Expected 6 Got 0 Incomplete Command: set interface swp3-16 link fast-linkupp3-16 link fast-linkup Incomplete Command: set interface swp3-16 link fast-linkupp3-16 link fast-linkup Incomplete Command: set interface swp3-16 link fast-linkupp3-16 link fast-linkup
列出的例外不会影响性能、可以放心地忽略。 -
验证接口中收发器的信息:
admin@sw1:mgmt:~$ nv show interface --view=pluggables Interface Identifier Vendor Name Vendor PN Vendor SN Vendor Rev --------- ------------- ----------- --------------- -------------- ---------- swp1s0 0x00 None swp1s1 0x00 None swp1s2 0x00 None swp1s3 0x00 None swp2s0 0x11 (QSFP28) CISCO-LEONI L45593-D278-D20 LCC2321GTTJ 00 swp2s1 0x11 (QSFP28) CISCO-LEONI L45593-D278-D20 LCC2321GTTJ 00 swp2s2 0x11 (QSFP28) CISCO-LEONI L45593-D278-D20 LCC2321GTTJ 00 swp2s3 0x11 (QSFP28) CISCO-LEONI L45593-D278-D20 LCC2321GTTJ 00 swp3 0x00 None swp4 0x00 None swp5 0x00 None swp6 0x00 None . . . swp15 0x11 (QSFP28) Amphenol 112-00595 APF20279210117 B0 swp16 0x11 (QSFP28) Amphenol 112-00595 APF20279210166 B0
-
验证每个节点是否都与每个交换机建立了连接:
admin@sw1:mgmt:~$ nv show interface --view=lldp LocalPort Speed Mode RemoteHost RemotePort --------- ----- ---------- ----------------------- ----------- eth0 100M Mgmt mgmt-sw1 Eth110/1/29 swp2s1 25G Trunk/L2 node1 e0a swp15 100G BondMember sw2 swp15 swp16 100G BondMember sw2 swp16
-
验证集群上集群端口的运行状况。
-
验证集群中所有节点上的集群端口是否均已启动且运行正常:
cluster1::*> network port show -role cluster Node: node1 Ignore Speed(Mbps) Health Health Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status --------- ------------ ---------------- ---- ---- ----------- -------- ------ e3a Cluster Cluster up 9000 auto/10000 healthy false e3b Cluster Cluster up 9000 auto/10000 healthy false Node: node2 Ignore Speed(Mbps) Health Health Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status --------- ------------ ---------------- ---- ---- ----------- -------- ------ e3a Cluster Cluster up 9000 auto/10000 healthy false e3b Cluster Cluster up 9000 auto/10000 healthy false
-
从集群验证交换机运行状况(此操作可能不会显示交换机SW2、因为LIF不驻留在e0d上)。
cluster1::*> network device-discovery show -protocol lldp Node/ Local Discovered Protocol Port Device (LLDP: ChassisID) Interface Platform ----------- ------ ------------------------- --------- ---------- node1/lldp e3a sw1 (b8:ce:f6:19:1a:7e) swp3 - e3b sw2 (b8:ce:f6:19:1b:96) swp3 - node2/lldp e3a sw1 (b8:ce:f6:19:1a:7e) swp4 - e3b sw2 (b8:ce:f6:19:1b:96) swp4 - cluster1::*> system switch ethernet show -is-monitoring-enabled-operational true Switch Type Address Model --------------------------- ------------------ ---------------- ----- sw1 cluster-network 10.233.205.90 MSN2100-CB2RC Serial Number: MNXXXXXXGD Is Monitored: true Reason: None Software Version: Cumulus Linux version 5.4.0 running on Mellanox Technologies Ltd. MSN2100 Version Source: LLDP sw2 cluster-network 10.233.205.91 MSN2100-CB2RC Serial Number: MNCXXXXXXGS Is Monitored: true Reason: None Software Version: Cumulus Linux version 5.4.0 running on Mellanox Technologies Ltd. MSN2100 Version Source: LLDP
-
"安装CSHM文件"(英文)