ibv_get_device_list()
    struct ibv_device **ibv_get_device_list(int *num_devices);
Prerequisite
ibv_fork_init() should be called before calling any other function in libibverbs.
Description
ibv_get_device_list() returns a NULL-terminated array of RDMA devices currently available. The array should be released with ibv_free_device_list().
The array entries shouldn't be accessed directly. Instead, they should be used with the following service verbs: ibv_get_device_name(), ibv_get_device_guid() and ibv_open_device().
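Of the service verbs above, ibv_get_device_guid() returns the node GUID as a 64-bit value in network byte order, so it needs to be handled byte by byte (or converted to host order) before printing. The following is a minimal sketch of such formatting. The helper name format_guid() and its buffer interface are illustrative assumptions, not part of libibverbs, and it takes a plain uint64_t so that it compiles without the verbs headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libibverbs): format a GUID that is
 * stored in network byte order, such as the value returned by
 * ibv_get_device_guid(), as "xxxx:xxxx:xxxx:xxxx".
 * Returns 0 on success, -1 if buf is too small.
 */
int format_guid(uint64_t guid_be, char *buf, size_t len)
{
    unsigned char b[8];
    int n;

    /* Network byte order: the most significant byte comes first in memory. */
    memcpy(b, &guid_be, sizeof(b));

    n = snprintf(buf, len, "%02x%02x:%02x%02x:%02x%02x:%02x%02x",
                 b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7]);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With a device list in hand, this could be called as format_guid(ibv_get_device_guid(dev_list[i]), buf, sizeof(buf)), next to ibv_get_device_name(dev_list[i]) for the device name.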
Parameters
Name | Direction | Description |
---|---|---|
num_devices | out | (optional) If not NULL, it is set to the number of devices returned in the array |
Return Values
On success, ibv_get_device_list() returns the array of available RDMA devices. On failure, it returns NULL and sets errno. If no devices are found, num_devices is set to 0 and a non-NULL (empty) array is returned.
Possible errno values are:
- EPERM - Permission denied.
- ENOMEM - Insufficient memory to complete the operation.
- ENOSYS - No kernel support for RDMA.
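The errno values listed above can be turned into a readable message when the call fails. The sketch below is one possible way to do that; the helper name describe_device_list_error() and the message format are assumptions made for illustration, not part of libibverbs.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libibverbs): describe the errno value
 * left by a failed ibv_get_device_list() call, using the meanings listed
 * in this page. Any other value falls back to strerror().
 * Returns 0 on success, -1 if buf is too small.
 */
int describe_device_list_error(int err, char *buf, size_t len)
{
    const char *reason;
    int n;

    switch (err) {
    case EPERM:
        reason = "Permission denied";
        break;
    case ENOMEM:
        reason = "Insufficient memory to complete the operation";
        break;
    case ENOSYS:
        reason = "No kernel support for RDMA";
        break;
    default:
        reason = strerror(err);
        break;
    }

    n = snprintf(buf, len, "ibv_get_device_list() failed: %s (errno %d)",
                 reason, err);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

When using a helper like this, copy errno into a local variable immediately after ibv_get_device_list() returns NULL, since later library calls may overwrite it.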
Examples
Get device list without a parameter:
    struct ibv_device **dev_list;

    dev_list = ibv_get_device_list(NULL);
    if (!dev_list)
        exit(1);
Get device list with a parameter:
    struct ibv_device **dev_list;
    int num_devices;

    dev_list = ibv_get_device_list(&num_devices);
    if (!dev_list)
        exit(1);
FAQs
I called ibv_get_device_list() and it returned NULL, what does it mean?
This is a basic verb that shouldn't fail. Check whether the ib_uverbs module is loaded.
I called ibv_get_device_list() and it didn't find any RDMA device at all (empty list), what does it mean?
The driver couldn't find any RDMA device.
- Use lspci to check whether there is any RDMA device in your machine
- Use lsmod to check whether the low-level driver for your RDMA device is loaded
- Check dmesg or /var/log/messages for errors
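Both answers above come down to checking whether a kernel module is loaded. lsmod gets its list from /proc/modules, so a program can make the same check itself. The following is a rough sketch under that assumption; the function names module_listed() and kernel_module_loaded() are hypothetical, and a driver that is built into the kernel (rather than loaded as a module) will not appear in /proc/modules.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if `name` appears as a module name in a
 * /proc/modules-style stream (the module name is the first field of each
 * line), 0 otherwise.
 */
int module_listed(FILE *fp, const char *name)
{
    char line[512];
    size_t n = strlen(name);
    int at_line_start = 1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        size_t len = strlen(line);

        /* Match only a whole first field, not a prefix of a longer name. */
        if (at_line_start && strncmp(line, name, n) == 0 &&
            (line[n] == ' ' || line[n] == '\n' || line[n] == '\0'))
            return 1;

        /* A chunk without a trailing newline means the line continues. */
        at_line_start = (len > 0 && line[len - 1] == '\n');
    }
    return 0;
}

/* Returns 1 if listed, 0 if not listed, -1 if /proc/modules is unreadable. */
int kernel_module_loaded(const char *name)
{
    FILE *fp = fopen("/proc/modules", "r");
    int found;

    if (fp == NULL)
        return -1;
    found = module_listed(fp, name);
    fclose(fp);
    return found;
}
```

For example, kernel_module_loaded("ib_uverbs") covers the first answer, and the same call with the name of a low-level driver module (such as mlx4_ib for mlx4 devices) covers the lsmod check in the second.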
Comments
Hi Dotan,
There are two RDMA devices on my machines, but ibv_get_device_list() returns only one of them (the one that is not active). They used to work before; this was observed only today.
The NIC on the devices is:
Mellanox Technologies MT27500 Family [ConnectX-3]
ibstat shows both devices, but ibv_devinfo shows only one.
Here is the output of the relevant Linux commands:
$ ibstat
CA 'mlx4_0'
    CA type: MT26428
    Number of ports: 1
    Firmware version: 2.7.0
    Hardware version: a0
    Node GUID: 0x0002c903000f8ab0
    System image GUID: 0x0002c903000f8ab3
    Port 1:
        State: Down
        Physical state: Polling
        Rate: 10
        Base lid: 1
        LMC: 0
        SM lid: 1
        Capability mask: 0x0251086a
        Port GUID: 0x0002c903000f8ab1
CA 'mlx4_1'
    CA type: MT4099
    Number of ports: 2
    Firmware version: 2.32.5100
    Hardware version: 1
    Node GUID: 0xe41d2d03000b1d50
    System image GUID: 0xe41d2d03000b1d50
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 10
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x0c010000
        Port GUID: 0xe61d2dfffe0b1d50
    Port 2:
        State: Down
        Physical state: Disabled
        Rate: 10
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x0c010000
        Port GUID: 0xe41d2d00010b1d51
$ ibv_devinfo
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs1
hca_id: mlx4_0
    transport:          InfiniBand (0)
    fw_ver:             2.7.000
    node_guid:          0002:c903:000f:8ab0
    sys_image_guid:     0002:c903:000f:8ab3
    vendor_id:          0x02c9
    vendor_part_id:     26428
    hw_ver:             0xA0
    board_id:           MT_0C40110009
    phys_port_cnt:      1
        port: 1
            state:      PORT_DOWN (1)
            max_mtu:    4096 (5)
            active_mtu: 4096 (5)
            sm_lid:     1
            port_lid:   1
            port_lmc:   0x00
$ ibv_devices
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs1
    device              node GUID
    ------          ----------------
    mlx4_0          0002c903000f8ab0
As you can see, port 1 of mlx4_1 is active, but ibv_devices detects only mlx4_0, which is not active. As a result, RDMA operations (read and write) do not work, and no completion is found in the completion queue.
All the modules seem to be loaded as well (As I said, everything worked before).
Any help will be appreciated. Thanks.
Sagar
Hi Sagar.
I would suggest checking /var/log/messages for any error message related to the mlx4 driver.
Can you update the driver/FW of the adapters?
Thanks
Dotan
Thanks Dotan for the reply.
The actual problem was that the port for mlx4_0 was down. After reloading the drivers, RDMA started working. I still don't know the reason for the different outputs of ibv_devinfo and ibstat, though.
Hi.
Restarting the driver can do miracles :)
Anyway, ibstat and ibv_devinfo get their information using different methods; usually their results are the same, unless there is a problem ...
I would suggest searching for error messages in /var/log/messages.
Thanks
Dotan