In this article, you will learn how to test the physical state of the InfiniBand™ connection. The steps below help you check that InfiniBand connections are established between GPUs in a GPU cluster.

Testing the port state

  1. Connect to the VM using SSH.
  2. In the VM’s shell, run the ibstatus command, which displays operational information about InfiniBand network devices. Result:
    Infiniband device 'mlx5_0' port 1 status:
       default gid:    fe80:0000:0000:0000:****:****:****:03c5
       base lid:    0x***
       sm lid:      0x*
       state:       4: ACTIVE
       phys state:  5: LinkUp
       rate:     400 Gb/sec (4X NDR)
       link_layer:  InfiniBand
    
    Infiniband device 'mlx5_1' port 1 status:
       default gid:    fe80:0000:0000:0000:****:****:****:03c6
       base lid:    0x***
       sm lid:      0x*
       state:       4: ACTIVE
       phys state:  5: LinkUp
       rate:     400 Gb/sec (4X NDR)
       link_layer:  InfiniBand
    ...
    
  3. For each device in the result, check the physical state (phys state): it should be LinkUp.
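
With many devices, the check in step 3 can be scripted. The following is a minimal sketch, not an official tool: it assumes the ibstatus output has the layout shown above, and the function name check_phys_state is hypothetical. The function reads ibstatus output from standard input, prints every port whose phys state is not LinkUp, and returns 0 only when all ports are up.

```shell
# Hypothetical helper: reads ibstatus output on stdin.
# Exit status: 0 = all ports LinkUp, 1 = at least one port is not LinkUp,
#              2 = no "phys state" lines were found in the input.
check_phys_state() {
  awk '
    # Remember which device and port the following lines belong to.
    /Infiniband device/ { dev = $3 " port " $5 }
    # Count every port and report those that are not LinkUp.
    /phys state:/ {
      total++
      if ($0 !~ /LinkUp/) { bad++; print "NOT LinkUp: " dev }
    }
    END {
      if (total == 0) { print "no ports found in input"; exit 2 }
      exit (bad > 0 ? 1 : 0)
    }
  '
}
```

On the VM, it could be used as `ibstatus | check_phys_state && echo "all ports are LinkUp"`.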

Testing network performance

You can also emulate network activity by sending data from GPUs on one VM to GPUs on another:
  1. Install the perftest package on each of the test VMs:
    sudo apt install perftest
    
  2. Connect to the first VM using SSH.
  3. Run ib_send_bw --report_gbits. With no host argument, ib_send_bw starts in server mode and waits for a client to connect.
  4. Copy the first VM’s private IP address.
  5. Connect to the second VM using SSH.
  6. Run ib_send_bw <first_VM_IP_address> --report_gbits.
In the command output, you should see non-zero values for the bytes sent, average bandwidth, and average message rate. The peak bandwidth might not reach the theoretical maximum of 400 Gbps. Example:
+--------------------------------------------------------------------------------+
| #bytes   #iterations   #BW peak[Gb/sec]   #BW average[Gb/sec]   #MsgRate[Mpps] |
+--------------------------------------------------------------------------------+
| 65536    1000          360.39             359.91                0.686466       |
+--------------------------------------------------------------------------------+
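
To check the result in a script rather than by eye, the average bandwidth can be pulled out of the results row. The sketch below is an illustration under assumptions: it expects a results row with five numeric columns in the order shown above (with or without the `|` borders), and the function names avg_bw_gbits and bw_above are hypothetical.

```shell
# Hypothetical helper: reads ib_send_bw output on stdin and prints the
# 4th column (BW average) of every row that starts with an integer
# message size and has exactly five columns.
avg_bw_gbits() {
  tr -d '|' | awk '$1 ~ /^[0-9]+$/ && NF == 5 { print $4 }'
}

# Hypothetical helper: succeeds if the last reported average bandwidth
# is strictly greater than the threshold given as the first argument.
bw_above() (
  min=$1
  avg=$(avg_bw_gbits | tail -n 1)
  [ -n "$avg" ] && awk -v a="$avg" -v m="$min" 'BEGIN { exit !(a + 0 > m + 0) }'
)
```

On the second VM, `ib_send_bw <first_VM_IP_address> --report_gbits | bw_above 0` would then succeed only if a non-zero average bandwidth was reported.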

InfiniBand and InfiniBand Trade Association are registered trademarks of the InfiniBand Trade Association.