NVMe-oF Host Configuration for SUSE Linux Enterprise Server 15 SP3 with ONTAP
NVMe over Fabrics or NVMe-oF (including NVMe/FC and other transports) is supported for SUSE Linux Enterprise Server 15 SP3 with ANA (Asymmetric Namespace Access). ANA is the ALUA equivalent in NVMe-oF environments, and is currently implemented with in-kernel NVMe Multipath. Using this procedure, you can enable NVMe-oF with in-kernel NVMe Multipath using ANA on SUSE Linux Enterprise Server 15 SP3 and ONTAP as the target.
Refer to the NetApp Interoperability Matrix for accurate details regarding supported configurations.
Features
-
SUSE Linux Enterprise Server 15 SP3 supports NVMe/FC and other transports.
-
There is no sanlun support for NVMe-oF. Therefore, there is no LUHU support for NVMe-oF on SUSE Linux Enterprise Server 15 SP3. You can rely on the NetApp plug-in included in the native nvme-cli package for NVMe-oF. This should support all NVMe-oF transports.
-
Both NVMe and SCSI traffic can be run on the same co-existent host. In fact, that is expected to be the commonly deployed host config for customers. Therefore, for SCSI, you may configure
dm-multipath
as usual for SCSI LUNs resulting in mpath devices, whereas NVMe multipath might be used to configure NVMe-oF multipath devices on the host.
Known limitations
SAN booting using the NVMe-oF protocol is currently not supported.
Enable in-kernel NVMe Multipath
In-kernel NVMe multipath is already enabled by default on SUSE Linux Enterprise Server hosts, such as SUSE Linux Enterprise Server 15 SP3. Therefore, no additional setting is required here. Refer to the NetApp Interoperability Matrix for accurate details regarding supported configurations.
NVMe-oF initiator packages
Refer to the NetApp Interoperability Matrix for accurate details regarding supported configurations.
-
Verify that you have the requisite kernel & nvme-cli MU packages installed on the SUSE Linux Enterprise Server 15 SP3 MU host.
Example:
# uname -r 5.3.18-59.5-default # rpm -qa|grep nvme-cli nvme-cli-1.13-3.3.1.x86_64
The above nvme-cli MU package now includes the following:
-
NVMe/FC auto-connect scripts - Required for NVMe/FC auto-(re)connect when underlying paths to the namespaces are restored as well as during the host reboot:
# rpm -ql nvme-cli-1.13-3.3.1.x86_64 /etc/nvme /etc/nvme/hostid /etc/nvme/hostnqn /usr/lib/systemd/system/nvmefc-boot-connections.service /usr/lib/systemd/system/nvmefc-connect.target /usr/lib/systemd/system/nvmefc-connect@.service ...
-
ONTAP udev rule - New udev rule to ensure NVMe multipath round-robin loadbalancer default applies to all ONTAP namespaces:
# rpm -ql nvme-cli-1.13-3.3.1.x86_64 /etc/nvme /etc/nvme/hostid /etc/nvme/hostnqn /usr/lib/systemd/system/nvmefc-boot-connections.service /usr/lib/systemd/system/nvmf-autoconnect.service /usr/lib/systemd/system/nvmf-connect.target /usr/lib/systemd/system/nvmf-connect@.service /usr/lib/udev/rules.d/70-nvmf-autoconnect.rules /usr/lib/udev/rules.d/71-nvmf-iopolicy-netapp.rules ... # cat /usr/lib/udev/rules.d/71-nvmf-iopolicy-netapp.rules # Enable round-robin for NetApp ONTAP and NetApp E-Series ACTION=="add", SUBSYSTEM=="nvme-subsystem", ATTR{model}=="NetApp ONTAP Controller", ATTR{iopolicy}="round-robin" ACTION=="add", SUBSYSTEM=="nvme-subsystem", ATTR{model}=="NetApp E-Series", ATTR{iopolicy}="round-robin"
-
NetApp plug-in for ONTAP devices - The existing NetApp plug-in has now been modified to handle ONTAP namespaces as well.
-
-
Check the hostnqn string at
/etc/nvme/hostnqn
on the host and ensure that it properly matches with the hostnqn string for the corresponding subsystem on the ONTAP array. For example,# cat /etc/nvme/hostnqn nqn.2014-08.org.nvmexpress:uuid:3ca559e1-5588-4fc4-b7d6-5ccfb0b9f054 ::> vserver nvme subsystem host show -vserver vs_fcnvme_145 Vserver Subsystem Host NQN ------- --------- ---------------------------------- vs_nvme_145 nvme_145_1 nqn.2014-08.org.nvmexpress:uuid:c7b07b16-a22e-41a6-a1fd-cf8262c8713f nvme_145_2 nqn.2014-08.org.nvmexpress:uuid:c7b07b16-a22e-41a6-a1fd-cf8262c8713f nvme_145_3 nqn.2014-08.org.nvmexpress:uuid:c7b07b16-a22e-41a6-a1fd-cf8262c8713f nvme_145_4 nqn.2014-08.org.nvmexpress:uuid:c7b07b16-a22e-41a6-a1fd-cf8262c8713f nvme_145_5 nqn.2014-08.org.nvmexpress:uuid:c7b07b16-a22e-41a6-a1fd-cf8262c8713f 5 entries were displayed.
Proceed with the below steps depending on the FC adapter being used on the host.
Configure NVMe/FC
Broadcom/Emulex
-
Verify that you have the recommended adapter and firmware versions. For example,
# cat /sys/class/scsi_host/host*/modelname LPe32002-M2 LPe32002-M2 # cat /sys/class/scsi_host/host*/modeldesc Emulex LightPulse LPe32002-M2 2-Port 32Gb Fibre Channel Adapter Emulex LightPulse LPe32002-M2 2-Port 32Gb Fibre Channel Adapter # cat /sys/class/scsi_host/host*/fwrev 12.8.340.8, sli-4:2:c 12.8.840.8, sli-4:2:c
-
The newer lpfc drivers (both inbox and outbox) already have lpfc_enable_fc4_type default set to 3, therefore, you no longer need to set this explicitly in the
/etc/modprobe.d/lpfc.conf
, and recreate theinitrd
. Thelpfc nvme
support is already enabled by default:# cat /sys/module/lpfc/parameters/lpfc_enable_fc4_type 3
-
The existing native inbox lpfc driver is already the latest and compatible with NVMe/FC. Therefore, you do not need to install the lpfc oob driver.
# cat /sys/module/lpfc/version 0:12.8.0.10
-
-
Verify that the initiator ports are up and running:
# cat /sys/class/fc_host/host*/port_name 0x100000109b579d5e 0x100000109b579d5f # cat /sys/class/fc_host/host*/port_state Online Online
-
Verify that the NVMe/FC initiator ports are enabled, you are able to see the target ports, and all ports are up and running.
In the following example, only one initiator port is enabled and connected with two target LIFs:# cat /sys/class/scsi_host/host*/nvme_info NVME Initiator Enabled XRI Dist lpfc0 Total 6144 IO 5894 ELS 250 NVME LPORT lpfc0 WWPN x100000109b579d5e WWNN x200000109b579d5e DID x011c00 ONLINE NVME RPORT WWPN x208400a098dfdd91 WWNN x208100a098dfdd91 DID x011503 TARGET DISCSRVC ONLINE NVME RPORT WWPN x208500a098dfdd91 WWNN x208100a098dfdd91 DID x010003 TARGET DISCSRVC ONLINE NVME Statistics LS: Xmt 0000000e49 Cmpl 0000000e49 Abort 00000000 LS XMIT: Err 00000000 CMPL: xb 00000000 Err 00000000 Total FCP Cmpl 000000003ceb594f Issue 000000003ce65dbe OutIO fffffffffffb046f abort 00000bd2 noxri 00000000 nondlp 00000000 qdepth 00000000 wqerr 00000000 err 00000000 FCP CMPL: xb 000014f4 Err 00012abd NVME Initiator Enabled XRI Dist lpfc1 Total 6144 IO 5894 ELS 250 NVME LPORT lpfc1 WWPN x100000109b579d5f WWNN x200000109b579d5f DID x011b00 ONLINE NVME RPORT WWPN x208300a098dfdd91 WWNN x208100a098dfdd91 DID x010c03 TARGET DISCSRVC ONLINE NVME RPORT WWPN x208200a098dfdd91 WWNN x208100a098dfdd91 DID x012a03 TARGET DISCSRVC ONLINE NVME Statistics LS: Xmt 0000000e50 Cmpl 0000000e50 Abort 00000000 LS XMIT: Err 00000000 CMPL: xb 00000000 Err 00000000 Total FCP Cmpl 000000003c9859ca Issue 000000003c93515e OutIO fffffffffffaf794 abort 00000b73 noxri 00000000 nondlp 00000000 qdepth 00000000 wqerr 00000000 err 00000000 FCP CMPL: xb 0000159d Err 000135c3
-
Reboot the host.
Enable 1MB I/O Size (Optional)
ONTAP reports an MDTS (Max Data Transfer Size) of 8 in the Identify Controller data which means the maximum I/O request size should be up to 1 MB. However, to issue I/O requests of size 1 MB for the Broadcom NVMe/FC host, the lpfc parameter lpfc_sg_seg_cnt
should also be bumped up to 256 from the default value of 64. Use the following instructions to do so:
-
Append the value 256 in the respective
modprobe lpfc.conf
file:# cat /etc/modprobe.d/lpfc.conf options lpfc lpfc_sg_seg_cnt=256
-
Run a
dracut -f
command, and reboot the host. -
After reboot, verify that the above setting has been applied by checking the corresponding sysfs value:
# cat /sys/module/lpfc/parameters/lpfc_sg_seg_cnt 256
Now the Broadcom NVMe/FC host should be able to send up 1MB I/O requests on the ONTAP namespace devices.
Marvell/QLogic
The native inbox qla2xxx driver included in the newer SUSE Linux Enterprise Server 15 SP3 MU kernel has the latest upstream fixes. These fixes are essential for ONTAP support.
-
Verify that you are running the supported adapter driver and firmware versions, for example:
# cat /sys/class/fc_host/host*/symbolic_name QLE2742 FW:v9.06.02 DVR:v10.02.00.106-k QLE2742 FW:v9.06.02 DVR:v10.02.00.106-k
-
Verify
ql2xnvmeenable
is set which enables the Marvell adapter to function as a NVMe/FC initiator:# cat /sys/module/qla2xxx/parameters/ql2xnvmeenable
1
Configure NVMe/TCP
Unlike NVMe/FC, NVMe/TCP has no auto-connect functionality. This manifests two major limitations on the Linux NVMe/TCP host:
-
No auto-reconnect after paths get reinstated NVMe/TCP cannot automatically reconnect to a path that is reinstated beyond the default
ctrl-loss-tmo
timer of 10 minutes following a path down. -
No auto-connect during host bootup NVMe/TCP cannot automatically connect during host bootup as well.
You should set the retry period for failover events to at least 30 minutes to prevent timeouts. You can increase the retry period by increasing the value of the ctrl_loss_tmo timer. Following are the details:
-
Verify whether the initiator port can fetch the discovery log page data across the supported NVMe/TCP LIFs:
# nvme discover -t tcp -w 192.168.1.8 -a 192.168.1.51 Discovery Log Number of Records 10, Generation counter 119 =====Discovery Log Entry 0====== trtype: tcp adrfam: ipv4 subtype: nvme subsystem treq: not specified portid: 0 trsvcid: 4420 subnqn: nqn.1992-08.com.netapp:sn.56e362e9bb4f11ebbaded039ea165abc:subsystem.nvme_118_tcp_1 traddr: 192.168.2.56 sectype: none =====Discovery Log Entry 1====== trtype: tcp adrfam: ipv4 subtype: nvme subsystem treq: not specified portid: 1 trsvcid: 4420 subnqn: nqn.1992-08.com.netapp:sn.56e362e9bb4f11ebbaded039ea165abc:subsystem.nvme_118_tcp_1 traddr: 192.168.1.51 sectype: none =====Discovery Log Entry 2====== trtype: tcp adrfam: ipv4 subtype: nvme subsystem treq: not specified portid: 0 trsvcid: 4420 subnqn: nqn.1992-08.com.netapp:sn.56e362e9bb4f11ebbaded039ea165abc:subsystem.nvme_118_tcp_2 traddr: 192.168.2.56 sectype: none ...
-
Verify that other NVMe/TCP initiator-target LIF combos are able to successfully fetch discovery log page data. For example,
# nvme discover -t tcp -w 192.168.1.8 -a 192.168.1.52 # nvme discover -t tcp -w 192.168.2.9 -a 192.168.2.56 # nvme discover -t tcp -w 192.168.2.9 -a 192.168.2.57
-
Run
nvme connect-all
command across all the supported NVMe/TCP initiator-target LIFs across the nodes. Ensure you set a longerctrl_loss_tmo
timer retry period (for example, 30 minutes, which can be set through-l 1800
) during the connect-all so that it would retry for a longer period of time in the event of a path loss. For example,# nvme connect-all -t tcp -w 192.168.1.8 -a 192.168.1.51 -l 1800 # nvme connect-all -t tcp -w 192.168.1.8 -a 192.168.1.52 -l 1800 # nvme connect-all -t tcp -w 192.168.2.9 -a 192.168.2.56 -l 1800 # nvme connect-all -t tcp -w 192.168.2.9 -a 192.168.2.57 -l 1800
Validate NVMe-oF
-
Verify that in-kernel NVMe multipath is indeed enabled by checking:
# cat /sys/module/nvme_core/parameters/multipath Y
-
Verify that the appropriate NVMe-oF settings (such as,
model
set toNetApp ONTAP Controller
andload balancing iopolicy
set toround-robin
) for the respective ONTAP namespaces properly reflect on the host:# cat /sys/class/nvme-subsystem/nvme-subsys*/model NetApp ONTAP Controller NetApp ONTAP Controller # cat /sys/class/nvme-subsystem/nvme-subsys*/iopolicy round-robin round-robin
-
Verify that the ONTAP namespaces properly reflect on the host. For example,
# nvme list Node SN Model Namespace ------------ --------------------- --------------------------------- /dev/nvme0n1 81CZ5BQuUNfGAAAAAAAB NetApp ONTAP Controller 1 Usage Format FW Rev ------------------- ----------- -------- 85.90 GB / 85.90 GB 4 KiB + 0 B FFFFFFFF
Another example:
# nvme list Node SN Model Namespace ------------ --------------------- --------------------------------- /dev/nvme0n1 81CYrBQuTHQFAAAAAAAC NetApp ONTAP Controller 1 Usage Format FW Rev ------------------- ----------- -------- 85.90 GB / 85.90 GB 4 KiB + 0 B FFFFFFFF
-
Verify that the controller state of each path is live and has proper ANA status. For example,
# nvme list-subsys /dev/nvme1n1 nvme-subsys1 - NQN=nqn.1992-08.com.netapp:sn.04ba0732530911ea8e8300a098dfdd91:subsystem.nvme_145_1 \ +- nvme2 fc traddr=nn-0x208100a098dfdd91:pn-0x208200a098dfdd91 host_traddr=nn-0x200000109b579d5f:pn-0x100000109b579d5f live non-optimized +- nvme3 fc traddr=nn-0x208100a098dfdd91:pn-0x208500a098dfdd91 host_traddr=nn-0x200000109b579d5e:pn-0x100000109b579d5e live non-optimized +- nvme4 fc traddr=nn-0x208100a098dfdd91:pn-0x208400a098dfdd91 host_traddr=nn-0x200000109b579d5e:pn-0x100000109b579d5e live optimized +- nvme6 fc traddr=nn-0x208100a098dfdd91:pn-0x208300a098dfdd91 host_traddr=nn-0x200000109b579d5f:pn-0x100000109b579d5f live optimized
Another example:
#nvme list-subsys /dev/nvme0n1 nvme-subsys0 - NQN=nqn.1992-08.com.netapp:sn.37ba7d9cbfba11eba35dd039ea165514:subsystem.nvme_114_tcp_1 \ +- nvme0 tcp traddr=192.168.2.36 trsvcid=4420 host_traddr=192.168.1.4 live optimized +- nvme1 tcp traddr=192.168.1.31 trsvcid=4420 host_traddr=192.168.1.4 live optimized +- nvme10 tcp traddr=192.168.2.37 trsvcid=4420 host_traddr=192.168.1.4 live non-optimized +- nvme11 tcp traddr=192.168.1.32 trsvcid=4420 host_traddr=192.168.1.4 live non-optimized +- nvme20 tcp traddr=192.168.2.36 trsvcid=4420 host_traddr=192.168.2.5 live optimized +- nvme21 tcp traddr=192.168.1.31 trsvcid=4420 host_traddr=192.168.2.5 live optimized +- nvme30 tcp traddr=192.168.2.37 trsvcid=4420 host_traddr=192.168.2.5 live non-optimized +- nvme31 tcp traddr=192.168.1.32 trsvcid=4420 host_traddr=192.168.2.5 live non-optimized
-
Verify that the NetApp plug-in displays proper values for each ONTAP namespace device. For example,
# nvme netapp ontapdevices -o column Device Vserver Namespace Path --------- ------- -------------------------------------------------- /dev/nvme1n1 vserver_fcnvme_145 /vol/fcnvme_145_vol_1_0_0/fcnvme_145_ns NSID UUID Size ---- ------------------------------ ------ 1 23766b68-e261-444e-b378-2e84dbe0e5e1 85.90GB # nvme netapp ontapdevices -o json { "ONTAPdevices" : [ { "Device" : "/dev/nvme1n1", "Vserver" : "vserver_fcnvme_145", "Namespace_Path" : "/vol/fcnvme_145_vol_1_0_0/fcnvme_145_ns", "NSID" : 1, "UUID" : "23766b68-e261-444e-b378-2e84dbe0e5e1", "Size" : "85.90GB", "LBA_Data_Size" : 4096, "Namespace_Size" : 20971520 } ] }
Another example:
# nvme netapp ontapdevices -o column Device Vserver Namespace Path --------- ------- -------------------------------------------------- /dev/nvme0n1 vs_tcp_114 /vol/tcpnvme_114_1_0_1/tcpnvme_114_ns NSID UUID Size ---- ------------------------------ ------ 1 a6aee036-e12f-4b07-8e79-4d38a9165686 85.90GB # nvme netapp ontapdevices -o json { "ONTAPdevices" : [ { "Device" : "/dev/nvme0n1", "Vserver" : "vs_tcp_114", "Namespace_Path" : "/vol/tcpnvme_114_1_0_1/tcpnvme_114_ns", "NSID" : 1, "UUID" : "a6aee036-e12f-4b07-8e79-4d38a9165686", "Size" : "85.90GB", "LBA_Data_Size" : 4096, "Namespace_Size" : 20971520 } ] }
Known issues
There are no known issues.