Configure subnet manager

Running the subnet manager on an InfiniBand switch might cause unexpected path loss under high load. To avoid path loss, configure the subnet manager on one or more of your hosts using opensm.

Before you begin

Procedure

  1. Use the ibstat -p command to find GUID0 and GUID1 of the HCA ports. For example:
    # ibstat -p
     0x248a070300a80a80
     0x248a070300a80a81
  2. The way that you configure Subnet Manager depends on your configuration:
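    • If you are using a single switch, start and enable the opensm service, then add the HCA port identifier values you found in step 1 to the opensm.conf file. Do this for one port, then repeat for the other port.
      • Edit the /etc/rdma/opensm.conf file to set the guid entry to the identifier for that port. If the file does not exist yet, the opensm -c command writes a configuration file with default values that you can then edit: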
        opensm -c /etc/rdma/opensm.conf
        
        # The port GUID on which the OpenSM is running
        guid 0x248a070300a80a80
        
        
    • If you are using the direct connect method, or if you have multiple switches, enable Subnet Manager on each port of the connected HCA on the host:
      • Add the following two lines to /etc/rc.d/after.local (for SUSE Linux Enterprise Server 12 and SLES 15 service pack ). Substitute the values you found in step 1 for GUID0 and GUID1. For P0 and P1, use the subnet manager priorities, with 1 being the lowest and 15 the highest:
        SLES example
         opensm -B -g GUID0 -p P0 -f /var/log/opensm-ib0.log
         opensm -B -g GUID1 -p P1 -f /var/log/opensm-ib1.log

        An example of the commands with the values substituted:

        # cat /etc/rc.d/after.local
         opensm -B -g 0x248a070300a80a80 -p 15 -f /var/log/opensm-ib0.log
         opensm -B -g 0x248a070300a80a81 -p 1 -f /var/log/opensm-ib1.log
      • Add the following two lines to /etc/rc.d/rc.local (for RHEL 7). Substitute the values you found in step 1 for GUID0 and GUID1. For P0 and P1, use the subnet manager priorities, with 1 being the lowest and 15 the highest:
        RHEL example
         opensm -B -g GUID0 -p P0 -f /var/log/opensm-ib0.log
         opensm -B -g GUID1 -p P1 -f /var/log/opensm-ib1.log

        An example of the commands with the values substituted:

        # cat /etc/rc.d/rc.local
         opensm -B -g 0x248a070300a80a80 -p 15 -f /var/log/opensm-ib0.log
         opensm -B -g 0x248a070300a80a81 -p 1 -f /var/log/opensm-ib1.log
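
The substitution of GUIDs and priorities described above could be scripted. The following is a minimal sketch, not part of the documented procedure: gen_opensm_lines is a hypothetical helper name, and the choice to give the first port priority 15 and every later port priority 1 is an assumption that simply mirrors the examples above. It reads port GUIDs, one per line, in the format that ibstat -p prints, and only prints the opensm lines; it does not start opensm or modify any file.

```shell
#!/bin/sh
# Hypothetical helper: read port GUIDs (one per line) from standard input
# and print one opensm start-up line per port.
gen_opensm_lines() {
    prio=15   # assumption: the first port gets the highest priority
    idx=0     # used to name the per-port log file (ib0, ib1, ...)
    while read -r guid; do
        # Skip empty lines; read already trims surrounding whitespace.
        [ -z "$guid" ] && continue
        printf 'opensm -B -g %s -p %s -f /var/log/opensm-ib%s.log\n' \
            "$guid" "$prio" "$idx"
        idx=$((idx + 1))
        prio=1    # assumption: all remaining ports get the lowest priority
    done
}

# Usage on a host with the HCA installed (review the output before
# copying it into after.local or rc.local):
#   ibstat -p | gen_opensm_lines
```

Review the printed lines before adding them to the startup file for your distribution, and adjust the priorities if your configuration needs a different ordering.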