Instale o script RCF (Reference Configuration File)
Siga este procedimento para instalar o script RCF.
Antes de instalar o script RCF, certifique-se de que o seguinte está disponível no switch:
-
Cumulus Linux está instalado. Consulte "Hardware Universe" para obter as versões suportadas.
-
Endereço IP, máscara de sub-rede e gateway padrão definido via DHCP ou configurado manualmente.
Você deve especificar um usuário no RCF (além do usuário admin) para ser usado especificamente para a coleção de logs. |
Há dois scripts RCF disponíveis para aplicativos de cluster e armazenamento. Baixe RCFs de "aqui". O procedimento para cada um é o mesmo.
-
Cluster: MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP
-
Armazenamento: MSN2100-RCF-v1.x-Storage
O procedimento de exemplo a seguir mostra como baixar e aplicar o script RCF para switches de cluster.
Exemplo de saída de comando usa o endereço IP de gerenciamento de switch 10.233.204.71, máscara de rede 255.255.254.0 e gateway padrão 10.233.204.1.
-
Apresentar as interfaces disponíveis no interrutor SN2100:
admin@sw1:mgmt:~$ net show interface all State Name Spd MTU Mode LLDP Summary ----- ----- --- ----- ----------- ------------------ -------------- ... ... ADMDN swp1 N/A 9216 NotConfigured ADMDN swp2 N/A 9216 NotConfigured ADMDN swp3 N/A 9216 NotConfigured ADMDN swp4 N/A 9216 NotConfigured ADMDN swp5 N/A 9216 NotConfigured ADMDN swp6 N/A 9216 NotConfigured ADMDN swp7 N/A 9216 NotConfigured ADMDN swp8 N/A 9216 NotConfigured ADMDN swp9 N/A 9216 NotConfigured ADMDN swp10 N/A 9216 NotConfigured ADMDN swp11 N/A 9216 NotConfigured ADMDN swp12 N/A 9216 NotConfigured ADMDN swp13 N/A 9216 NotConfigured ADMDN swp14 N/A 9216 NotConfigured ADMDN swp15 N/A 9216 NotConfigured ADMDN swp16 N/A 9216 NotConfigured
-
Copie o script Python do RCF para o switch.
cumulus@cumulus:mgmt:~$ cd /tmp cumulus@cumulus:mgmt:/tmp$ scp <user>@<host:/<path>/MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP . ssologin@10.233.204.71's password: MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP 100% 8607 111.2KB/s 00:00
Enquanto scp
for usado no exemplo, você pode usar o método preferido de transferência de arquivos. -
Aplique o script Python RCF MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP.
cumulus@cumulus:mgmt:/tmp$ sudo python3 MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP [sudo] password for cumulus: ... Step 1: Creating the banner file Step 2: Registering banner message Step 3: Updating the MOTD file Step 4: Ensuring passwordless use of cl-support command by admin Step 5: Disabling apt-get Step 6: Creating the interfaces Step 7: Adding the interface config Step 8: Disabling cdp Step 9: Adding the lldp config Step 10: Adding the RoCE base config Step 11: Modifying RoCE Config Step 12: Configure SNMP Step 13: Reboot the switch
O script RCF completa as etapas listadas no exemplo acima.
No passo 3 Atualizando o arquivo MOTD acima, o comando cat /etc/motd
é executado. Isso permite verificar o nome do arquivo RCF, a versão RCF, as portas a usar e outras informações importantes no banner RCF.Para quaisquer problemas de script Python do RCF que não possam ser corrigidos, entre em Contato "Suporte à NetApp" para obter assistência. -
Reaplique quaisquer personalizações anteriores à configuração do switch. "Analise as considerações sobre cabeamento e configuração"Consulte para obter detalhes sobre quaisquer alterações adicionais necessárias.
-
Verifique a configuração após a reinicialização:
admin@sw1:mgmt:~$ net show interface all State Name Spd MTU Mode LLDP Summary ----- --------- ---- ----- ---------- ----------------- -------- ... ... DN swp1s0 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp1s1 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp1s2 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp1s3 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp2s0 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp2s1 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp2s2 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp2s3 N/A 9216 Trunk/L2 Master: bridge(UP) UP swp3 100G 9216 Trunk/L2 Master: bridge(UP) UP swp4 100G 9216 Trunk/L2 Master: bridge(UP) DN swp5 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp6 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp7 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp8 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp9 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp10 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp11 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp12 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp13 N/A 9216 Trunk/L2 Master: bridge(UP) DN swp14 N/A 9216 Trunk/L2 Master: bridge(UP) UP swp15 N/A 9216 BondMember Master: bond_15_16(UP) UP swp16 N/A 9216 BondMember Master: bond_15_16(UP) ... ... admin@sw1:mgmt:~$ net show roce config RoCE mode.......... lossless Congestion Control: Enabled SPs.... 0 2 5 Mode........... ECN Min Threshold.. 150 KB Max Threshold.. 1500 KB PFC: Status......... enabled Enabled SPs.... 2 5 Interfaces......... swp10-16,swp1s0-3,swp2s0-3,swp3-9 DSCP 802.1p switch-priority ----------------------- ------ --------------- 0 1 2 3 4 5 6 7 0 0 8 9 10 11 12 13 14 15 1 1 16 17 18 19 20 21 22 23 2 2 24 25 26 27 28 29 30 31 3 3 32 33 34 35 36 37 38 39 4 4 40 41 42 43 44 45 46 47 5 5 48 49 50 51 52 53 54 55 6 6 56 57 58 59 60 61 62 63 7 7 switch-priority TC ETS --------------- -- -------- 0 1 3 4 6 7 0 DWRR 28% 2 2 DWRR 28% 5 5 DWRR 43%
-
Verifique as informações do transcetor na interface:
admin@sw1:mgmt:~$ net show interface pluggables Interface Identifier Vendor Name Vendor PN Vendor SN Vendor Rev --------- ------------- ----------- --------------- -------------- ---------- swp3 0x11 (QSFP28) Amphenol 112-00574 APF20379253516 B0 swp4 0x11 (QSFP28) AVAGO 332-00440 AF1815GU05Z A0 swp15 0x11 (QSFP28) Amphenol 112-00573 APF21109348001 B0 swp16 0x11 (QSFP28) Amphenol 112-00573 APF21109347895 B0
-
Verifique se os nós têm uma conexão com cada switch:
admin@sw1:mgmt:~$ net show lldp LocalPort Speed Mode RemoteHost RemotePort --------- ----- ---------- ---------------------- ----------- swp3 100G Trunk/L2 sw1 e3a swp4 100G Trunk/L2 sw2 e3b swp15 100G BondMember sw13 swp15 swp16 100G BondMember sw14 swp16
-
Verifique a integridade das portas de cluster no cluster.
-
Verifique se as portas do cluster estão ativas e íntegras em todos os nós do cluster:
cluster1::*> network port show -role cluster Node: node1 Ignore Speed(Mbps) Health Health Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status --------- ------------ ---------------- ---- ---- ----------- -------- ------ e3a Cluster Cluster up 9000 auto/10000 healthy false e3b Cluster Cluster up 9000 auto/10000 healthy false Node: node2 Ignore Speed(Mbps) Health Health Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status --------- ------------ ---------------- ---- ---- ----------- -------- ------ e3a Cluster Cluster up 9000 auto/10000 healthy false e3b Cluster Cluster up 9000 auto/10000 healthy false
-
Verifique a integridade do switch a partir do cluster (isso pode não mostrar o switch SW2, uma vez que LIFs não são homed em e0d).
cluster1::*> network device-discovery show -protocol lldp Node/ Local Discovered Protocol Port Device (LLDP: ChassisID) Interface Platform ----------- ------ ------------------------- --------- ---------- node1/lldp e3a sw1 (b8:ce:f6:19:1a:7e) swp3 - e3b sw2 (b8:ce:f6:19:1b:96) swp3 - node2/lldp e3a sw1 (b8:ce:f6:19:1a:7e) swp4 - e3b sw2 (b8:ce:f6:19:1b:96) swp4 - cluster1::*> system switch ethernet show -is-monitoring-enabled-operational true Switch Type Address Model --------------------------- ------------------ ---------------- ----- sw1 cluster-network 10.233.205.90 MSN2100-CB2RC Serial Number: MNXXXXXXGD Is Monitored: true Reason: None Software Version: Cumulus Linux version 4.4.3 running on Mellanox Technologies Ltd. MSN2100 Version Source: LLDP sw2 cluster-network 10.233.205.91 MSN2100-CB2RC Serial Number: MNCXXXXXXGS Is Monitored: true Reason: None Software Version: Cumulus Linux version 4.4.3 running on Mellanox Technologies Ltd. MSN2100 Version Source: LLDP
-
-
Apresentar as interfaces disponíveis no interrutor SN2100:
admin@sw1:mgmt:~$ nv show interface Interface MTU Speed State Remote Host Remote Port- Type Summary ------------- ----- ----- ----- ------------------- ------------ --------- ------------- + cluster_isl 9216 200G up bond + eth0 1500 100M up mgmt-sw1 Eth105/1/14 eth IP Address: 10.231.80 206/22 eth0 IP Address: fd20:8b1e:f6ff:fe31:4a0e/64 + lo 65536 up loopback IP Address: 127.0.0.1/8 lo IP Address: ::1/128 + swp1s0 9216 10G up cluster01 e0b swp . . . + swp15 9216 100G up sw2 swp15 swp + swp16 9216 100G up sw2 swp16 swp
-
Copie o script Python do RCF para o switch.
cumulus@cumulus:mgmt:~$ cd /tmp cumulus@cumulus:mgmt:/tmp$ scp <user>@<host:/<path>/MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP . ssologin@10.233.204.71's password: MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP 100% 8607 111.2KB/s 00:00
Enquanto scp
for usado no exemplo, você pode usar o método preferido de transferência de arquivos. -
Aplique o script Python RCF MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP.
cumulus@cumulus:mgmt:/tmp$ sudo python3 MSN2100-RCF-v1.x-Cluster-HA-Breakout-LLDP [sudo] password for cumulus: . . Step 1: Creating the banner file Step 2: Registering banner message Step 3: Updating the MOTD file Step 4: Ensuring passwordless use of cl-support command by admin Step 5: Disabling apt-get Step 6: Creating the interfaces Step 7: Adding the interface config Step 8: Disabling cdp Step 9: Adding the lldp config Step 10: Adding the RoCE base config Step 11: Modifying RoCE Config Step 12: Configure SNMP Step 13: Reboot the switch
O script RCF completa as etapas listadas no exemplo acima.
No passo 3 Atualizando o arquivo MOTD acima, o comando cat /etc/issue
é executado. Isso permite verificar o nome do arquivo RCF, a versão RCF, as portas a usar e outras informações importantes no banner RCF.Por exemplo:
admin@sw1:mgmt:~$ cat /etc/issue ****************************************************************************** * * NetApp Reference Configuration File (RCF) * Switch : Mellanox MSN2100 * Filename : MSN2100-RCF-1._x_-Cluster-HA-Breakout-LLDP * Release Date : 13-02-2023 * Version : 1._x_-Cluster-HA-Breakout-LLDP * * Port Usage: * Port 1 : 4x10G Breakout mode for Cluster+HA Ports, swp1s0-3 * Port 2 : 4x25G Breakout mode for Cluster+HA Ports, swp2s0-3 * Ports 3-14 : 40/100G for Cluster+HA Ports, swp3-14 * Ports 15-16 : 100G Cluster ISL Ports, swp15-16 * * NOTE: * RCF manually sets swp1s0-3 link speed to 10000 and * auto-negotiation to off for Intel 10G * RCF manually sets swp2s0-3 link speed to 25000 and * auto-negotiation to off for Chelsio 25G * * * IMPORTANT: Perform the following steps to ensure proper RCF installation: * - Copy the RCF file to /tmp * - Ensure the file has execute permission * - From /tmp run the file as sudo python3 <filename> * ******************************************************************************
Para quaisquer problemas de script Python do RCF que não possam ser corrigidos, entre em Contato "Suporte à NetApp" para obter assistência. -
Reaplique quaisquer personalizações anteriores à configuração do switch. "Analise as considerações sobre cabeamento e configuração"Consulte para obter detalhes sobre quaisquer alterações adicionais necessárias.
-
Verifique a configuração após a reinicialização:
admin@sw1:mgmt:~$ nv show interface Interface MTU Speed State Remote Host Remote Port Type Summary ----------- ----- ----- ----- ----------- ----------- ---- ------------- + cluster_isl 9216 200G up bond + eth0 1500 100M up RTP-LF01-410G38.rtp.eng.netapp.com Eth105/1/14 eth IP Address: 10.231.80.206/22 eth0 IP Address: fd20:8b1e:b255:85a0:bace:f6ff:fe31:4a0e/64 + lo 65536 up loopback IP Address: 127.0.0.1/8 lo IP Address: ::1/128 + swp1s0 9216 10G up cumulus1 e0b swp . . . + swp15 9216 100G up cumulus swp15 swp admin@sw1:mgmt:~$ nv show interface Interface MTU Speed State Remote Host Remote Port- Type Summary ------------- ----- ----- ----- ------------------- ------------ --------- ------------- + cluster_isl 9216 200G up bond + eth0 1500 100M up mgmt-sw1 Eth105/1/14 eth IP Address: 10.231.80 206/22 eth0 IP Address: fd20:8b1e:f6ff:fe31:4a0e/64 + lo 65536 up loopback IP Address: 127.0.0.1/8 lo IP Address: ::1/128 + swp1s0 9216 10G up cluster01 e0b swp . . . + swp15 9216 100G up sw2 swp15 swp + swp16 9216 100G up sw2 swp16 swp admin@sw1:mgmt:~$ nv show qos roce operational applied description ----------------- ----------- --------- ---------------------------------------- enable on Turn feature 'on' or 'off'. This feature is disabled by default. mode lossless lossless Roce Mode congestion-control congestion-mode ECN,RED Congestion config mode enabled-tc 0,2,5 Congestion config enabled Traffic Class max-threshold 195.31 KB Congestion config max-threshold min-threshold 39.06 KB Congestion config min-threshold probability 100 lldp-app-tlv priority 3 switch-priority of roce protocol-id 4791 L4 port number selector UDP L4 protocol pfc pfc-priority 2, 5 switch-prio on which PFC is enabled rx-enabled enabled PFC Rx Enabled status tx-enabled enabled PFC Tx Enabled status trust trust-mode pcp,dscp Trust Setting on the port for packet classification RoCE PCP/DSCP->SP mapping configurations =========================================== pcp dscp switch-prio -- --- ----------------------- ----------- 0 0 0,1,2,3,4,5,6,7 0 1 1 8,9,10,11,12,13,14,15 1 2 2 16,17,18,19,20,21,22,23 2 3 3 24,25,26,27,28,29,30,31 3 4 4 32,33,34,35,36,37,38,39 4 5 5 40,41,42,43,44,45,46,47 5 6 6 48,49,50,51,52,53,54,55 6 7 7 56,57,58,59,60,61,62,63 7 RoCE SP->TC mapping and ETS configurations ============================================= switch-prio traffic-class scheduler-weight -- ----------- ------------- ---------------- 0 0 0 DWRR-28% 1 1 0 DWRR-28% 2 2 2 DWRR-28% 3 3 0 DWRR-28% 4 4 0 DWRR-28% 5 5 5 DWRR-43% 6 6 0 DWRR-28% 7 7 0 DWRR-28% RoCE pool config =================== name mode size switch-priorities traffic-class -- --------------------- ------- ---- ----------------- ------------- 0 lossy-default-ingress Dynamic 50% 0,1,3,4,6,7 - 1 roce-reserved-ingress Dynamic 50% 2,5 - 2 lossy-default-egress Dynamic 50% - 0 3 roce-reserved-egress Dynamic inf - 2,5 Exception List ================= description -- -----------------------------------------------------------------------… 1 RoCE PFC Priority Mismatch.Expected pfc-priority: 3. 2 Congestion Config TC Mismatch.Expected enabled-tc: 0,3. 3 Congestion Config mode Mismatch.Expected congestion-mode: ECN. 4 Congestion Config min-threshold Mismatch.Expected min-threshold: 150000. 5 Congestion Config max-threshold Mismatch.Expected max-threshold: 1500000. 6 Scheduler config mismatch for traffic-class mapped to switch-prio0. Expected scheduler-weight: DWRR-50%. 7 Scheduler config mismatch for traffic-class mapped to switch-prio1. Expected scheduler-weight: DWRR-50%. 8 Scheduler config mismatch for traffic-class mapped to switch-prio2. Expected scheduler-weight: DWRR-50%. 9 Scheduler config mismatch for traffic-class mapped to switch-prio3. Expected scheduler-weight: DWRR-50%. 10 Scheduler config mismatch for traffic-class mapped to switch-prio4. Expected scheduler-weight: DWRR-50%. 11 Scheduler config mismatch for traffic-class mapped to switch-prio5. Expected scheduler-weight: DWRR-50%. 12 Scheduler config mismatch for traffic-class mapped to switch-prio6. Expected scheduler-weight: strict-priority. 13 Scheduler config mismatch for traffic-class mapped to switch-prio7. Expected scheduler-weight: DWRR-50%. 14 Invalid reserved config for ePort.TC[2].Expected 0 Got 1024 15 Invalid reserved config for ePort.TC[5].Expected 0 Got 1024 16 Invalid traffic-class mapping for switch-priority 2.Expected 0 Got 2 17 Invalid traffic-class mapping for switch-priority 3.Expected 3 Got 0 18 Invalid traffic-class mapping for switch-priority 5.Expected 0 Got 5 19 Invalid traffic-class mapping for switch-priority 6.Expected 6 Got 0 Incomplete Command: set interface swp3-16 link fast-linkupp3-16 link fast-linkup Incomplete Command: set interface swp3-16 link fast-linkupp3-16 link fast-linkup Incomplete Command: set interface swp3-16 link fast-linkupp3-16 link fast-linkup
As exceções listadas não afetam o desempenho e podem ser ignoradas com segurança. -
Verifique as informações do transcetor na interface:
admin@sw1:mgmt:~$ nv show interface --view=pluggables Interface Identifier Vendor Name Vendor PN Vendor SN Vendor Rev --------- ------------- ----------- --------------- -------------- ---------- swp1s0 0x00 None swp1s1 0x00 None swp1s2 0x00 None swp1s3 0x00 None swp2s0 0x11 (QSFP28) CISCO-LEONI L45593-D278-D20 LCC2321GTTJ 00 swp2s1 0x11 (QSFP28) CISCO-LEONI L45593-D278-D20 LCC2321GTTJ 00 swp2s2 0x11 (QSFP28) CISCO-LEONI L45593-D278-D20 LCC2321GTTJ 00 swp2s3 0x11 (QSFP28) CISCO-LEONI L45593-D278-D20 LCC2321GTTJ 00 swp3 0x00 None swp4 0x00 None swp5 0x00 None swp6 0x00 None . . . swp15 0x11 (QSFP28) Amphenol 112-00595 APF20279210117 B0 swp16 0x11 (QSFP28) Amphenol 112-00595 APF20279210166 B0
-
Verifique se os nós têm uma conexão com cada switch:
admin@sw1:mgmt:~$ nv show interface --view=lldp LocalPort Speed Mode RemoteHost RemotePort --------- ----- ---------- ----------------------- ----------- eth0 100M Mgmt mgmt-sw1 Eth110/1/29 swp2s1 25G Trunk/L2 node1 e0a swp15 100G BondMember sw2 swp15 swp16 100G BondMember sw2 swp16
-
Verifique a integridade das portas de cluster no cluster.
-
Verifique se as portas do cluster estão ativas e íntegras em todos os nós do cluster:
cluster1::*> network port show -role cluster Node: node1 Ignore Speed(Mbps) Health Health Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status --------- ------------ ---------------- ---- ---- ----------- -------- ------ e3a Cluster Cluster up 9000 auto/10000 healthy false e3b Cluster Cluster up 9000 auto/10000 healthy false Node: node2 Ignore Speed(Mbps) Health Health Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status --------- ------------ ---------------- ---- ---- ----------- -------- ------ e3a Cluster Cluster up 9000 auto/10000 healthy false e3b Cluster Cluster up 9000 auto/10000 healthy false
-
Verifique a integridade do switch a partir do cluster (isso pode não mostrar o switch SW2, uma vez que LIFs não são homed em e0d).
cluster1::*> network device-discovery show -protocol lldp Node/ Local Discovered Protocol Port Device (LLDP: ChassisID) Interface Platform ----------- ------ ------------------------- --------- ---------- node1/lldp e3a sw1 (b8:ce:f6:19:1a:7e) swp3 - e3b sw2 (b8:ce:f6:19:1b:96) swp3 - node2/lldp e3a sw1 (b8:ce:f6:19:1a:7e) swp4 - e3b sw2 (b8:ce:f6:19:1b:96) swp4 - cluster1::*> system switch ethernet show -is-monitoring-enabled-operational true Switch Type Address Model --------------------------- ------------------ ---------------- ----- sw1 cluster-network 10.233.205.90 MSN2100-CB2RC Serial Number: MNXXXXXXGD Is Monitored: true Reason: None Software Version: Cumulus Linux version 5.4.0 running on Mellanox Technologies Ltd. MSN2100 Version Source: LLDP sw2 cluster-network 10.233.205.91 MSN2100-CB2RC Serial Number: MNCXXXXXXGS Is Monitored: true Reason: None Software Version: Cumulus Linux version 5.4.0 running on Mellanox Technologies Ltd. MSN2100 Version Source: LLDP
-