ibv_create_cq()
struct ibv_cq *ibv_create_cq(struct ibv_context *context, int cqe, void *cq_context, struct ibv_comp_channel *channel, int comp_vector);
Description
ibv_create_cq() creates a Completion Queue (CQ) for an RDMA device context.
When an outstanding Work Request, within a Send or Receive Queue, is completed, a Work Completion is added to the CQ of that Work Queue. This Work Completion indicates that the outstanding Work Request has been completed (and is no longer considered outstanding) and provides details on it (status, direction, opcode, etc.).
A single CQ can be used for sending, for receiving, or for both, and it can be shared across multiple QPs. The Work Completion holds the information needed to identify the QP number and the Queue (Send or Receive) that it came from.
The user can define the minimum size of the CQ. The actual created size can be equal to or higher than this value.
Parameters
Name | Direction | Description |
---|---|---|
context | in | RDMA device context that was returned from ibv_open_device() |
cqe | in | The minimum requested capacity of the CQ, value can be [1..dev_cap.max_cqe] |
cq_context | in | (optional) User defined value which will be available in cq->cq_context. When waiting for Completion Event notification using the verb ibv_get_cq_event(), this value will be returned |
channel | in | (optional) The Completion event channel that will be used to indicate that a new Work Completion was added to this CQ. NULL means that no Completion event channel will be used |
comp_vector | in | MSI-X completion vector that will be used for signaling Completion events. If the IRQ affinity masks of these interrupts have been configured to spread each MSI-X interrupt to be handled by a different core, this parameter can be used to spread the completion workload over multiple cores. Value can be [0..context->num_comp_vectors). |
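The spreading idea described for comp_vector can be sketched as follows. This is only a minimal illustration, assuming the application creates several CQs and numbers them itself; `pick_comp_vector()` is a hypothetical helper and not part of libibverbs.

```c
#include <assert.h>

/* Hypothetical helper (not a libibverbs function): choose a completion
 * vector for the cq_index-th CQ so that CQs are spread round-robin over
 * the vectors reported in context->num_comp_vectors. */
int pick_comp_vector(unsigned int cq_index, int num_comp_vectors)
{
    /* A valid device context reports at least one completion vector. */
    assert(num_comp_vectors > 0);
    return (int)(cq_index % (unsigned int)num_comp_vectors);
}
```

A caller could then pass `pick_comp_vector(i, context->num_comp_vectors)` as the last argument of ibv_create_cq() for its i-th CQ. Whether this actually spreads the load over cores still depends on the IRQ affinity configuration mentioned above.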
Return Values
Value | Description |
---|---|
CQ | A pointer to the newly allocated Completion Queue. This pointer also contains the following fields: |
NULL | On failure, errno indicates the failure reason: |
Examples
1. Create a CQ with 100 entries and destroy it:
    struct ibv_cq *cq;

    cq = ibv_create_cq(context, 100, NULL, NULL, 0);
    if (!cq) {
        fprintf(stderr, "Error, ibv_create_cq() failed\n");
        return -1;
    }

    if (ibv_destroy_cq(cq)) {
        fprintf(stderr, "Error, ibv_destroy_cq() failed\n");
        return -1;
    }
2. Create a CQ with 100 entries associated with a Completion event channel:
(in this example we assume that the Completion event channel was already created):
    struct ibv_cq *cq;
    struct ibv_comp_channel *channel; /* assumed to have been created earlier */

    cq = ibv_create_cq(context, 100, NULL, channel, 0);
    if (!cq) {
        fprintf(stderr, "Error, ibv_create_cq() failed\n");
        return -1;
    }
FAQs
What is a CQ good for, anyway?
A CQ holds a Work Completion for every Work Request that was completed and should produce a Work Completion. The Work Completion provides details on that Work Request.
Can I use different CQs for Send/Receive Queues in the same QP?
Yes. In any QP the CQ of the Send Queue and the CQ of the Receive Queue may be the same or may be different. This is flexible and up to the user to decide.
Can several QPs be associated with the same CQ?
Yes. Several QPs can be associated with the same CQ in their Send Queues, in their Receive Queues, or in both.
What should be the CQ size?
A CQ should have enough room to hold the Work Completions of all the outstanding Work Requests of the Queues that are associated with that CQ, so the CQ size shouldn't be less than the total number of Work Requests that may be outstanding.
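The sizing rule above can be sketched as a small calculation. This is a minimal sketch, assuming each attached QP is described by the Send and Receive Queue depths it was created with; `struct wq_depth` and `required_cqe()` are hypothetical names, and `max_cqe` stands for the device limit reported by ibv_query_device().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of the Work Queues of one QP that report
 * to the CQ being sized. */
struct wq_depth {
    int max_send_wr; /* 0 if this QP's Send Queue uses another CQ */
    int max_recv_wr; /* 0 if this QP's Receive Queue uses another CQ */
};

/* Returns the minimum cqe value to request so that every outstanding
 * Work Request can produce a Work Completion, or -1 if that total
 * exceeds the device limit max_cqe. */
int required_cqe(const struct wq_depth *qps, size_t n, int max_cqe)
{
    long total = 0;

    assert(qps != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += (long)qps[i].max_send_wr + qps[i].max_recv_wr;

    if (total < 1)
        total = 1; /* cqe must be at least 1 */
    if (total > max_cqe)
        return -1;
    return (int)total;
}
```

The result would be passed as the cqe argument of ibv_create_cq(). A return value of -1 suggests splitting the QPs over more than one CQ or reducing the queue depths.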
What will happen if the CQ size that I choose is too small?
If the CQ is full and a new Work Completion is about to be added to it, there will be a CQ overrun. A Completion with error will be added to the CQ and an asynchronous event IBV_EVENT_CQ_ERR will be generated.
Which value can I use as the cq_context?
Since the cq_context is a void *, you can put any value that you wish.
Comments
Hi,
ibv_create_cq is failing when the CQ size is specified as 1024, but the CQ is created successfully when the size is 512. The device limit is much larger than 1024. Memory registration is also failing for a 20MB memory block. Do you have any idea or thought on why this issue is occurring?
-Thanks
and the ENOMEM error occurred when the size was 1024.
Hi.
What is the value of 'ulimit -l'?
I'm sure that it is either 32 or 64, change it to unlimited and then try again...
Thanks
Dotan
yes that was the issue. I forgot to set it to unlimited.
Thanks for your help.
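For readers who hit the same ENOMEM, the limit behind 'ulimit -l' can also be checked from inside the program. The following is a minimal sketch using the POSIX getrlimit() call; `memlock_limit_kb()` is a hypothetical helper name and is not part of libibverbs.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: query the soft locked-memory limit of this
 * process (the value that 'ulimit -l' shows in KB).
 * Returns 1 if the limit is unlimited, 0 if it is finite (the value is
 * stored in *kb_out), and -1 if getrlimit() fails. */
int memlock_limit_kb(unsigned long long *kb_out)
{
    struct rlimit rl;

    assert(kb_out != NULL);
    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;

    /* RLIMIT_MEMLOCK is expressed in bytes. */
    *kb_out = (unsigned long long)rl.rlim_cur / 1024;
    return 0;
}
```

A program could call this before ibv_create_cq() or memory registration and print a warning when the limit is finite and small, which is the situation described in this thread.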
Hello Dotan,
I have a question regarding interrupts.
In RDMA, we have sendcomplhandler(), which is called when an interrupt comes. Within sendcomplhandler() I need to acquire a spin lock. While I am still in sendcomplhandler(), a new interrupt comes in, invokes this function again and tries to acquire the spin lock. In this case, the system will crash due to re-acquiring the lock. In my design, I need to use a spin lock in sendcomplhandler(). I was wondering whether it is a good idea to disable the RDMA device interrupt at the beginning of sendcomplhandler() and enable it at the end. How can I solve this problem? Any suggestion?
All the best
Jack
Hi Jack.
I assume that you are working in the kernel level.
AFAIK, working with Interrupts/polling mode is an attribute of the low-level driver and not of the RDMA code
and most of the low-level drivers don't have an ability to disable interrupt mode
(otherwise, the CPU consumption would have gone to the sky).
I think that you should think about a different approach to this issue.
What are you trying to do in the spinlock? Handle the completion?
Maybe instead of actually handling the Work Completion in that handler, you should initiate a job in a workqueue?
Thanks
Dotan
Hi Dotan,
Is it possible to share a completion queue between two different contexts?
For the first connection:
    rc->rc_comp_channel = IBV_CREATE_COMP_CHANNEL(context1);
    rc->rc_cq = IBV_CREATE_CQ(context1, cq_msgcount, NULL,
                              rc->rc_comp_channel, 0);
For the second connection:
    rc2->rc_cq = rc->rc_cq;
    rc2->rc_comp_channel = rc->rc_comp_channel;
I am trying to do something similar to the above code.
Thanks,
Hi.
What do you mean by context?
Can you specify explicitly the pointer type that you try to share?
Thanks
Dotan
By context I mean two different HCAs. Both are used to communicate with the same application. I just want to know whether two HCAs can share the same completion queue. I think the completion queue is part of the hardware, so it can't be shared.
Hi.
Two different RDMA devices cannot share any RDMA resource;
any resource is an attribute (in most cases, even HW attribute) of the RDMA device.
Thanks
Dotan
Hi Dotan,
When I create a CQ by calling ibv_create_cq, I must then call ibv_post_recv, is that right?
If I don't call ibv_post_recv, what will happen?
Hi.
ibv_post_recv() should be called once you have a QP and messages are expected
(it is highly advised to do so when the QP state is INIT, before transitioning to RTR).
Calling ibv_post_recv() without having any incoming messages won't have any effect.
Thanks
Dotan
Hi Dotan. Is it possible to use a CQ size smaller than the number of outstanding work requests when using unsignalled (i.e., not setting IBV_SEND_SIGNALED) work requests?
Hi.
I would suggest not doing it, for the following reasons:
* It is implementation dependent whether actual CQEs are added to the CQ or not
* In case of an error (for example, in the first WR), Work Completions for more outstanding WRs than expected may be enqueued to the CQ,
which may cause a CQ overrun
Thanks
Dotan
Hi - we've encountered some issues working with kernel 4.5.0.rc3.
We saw that sometimes a work request received multiple completion queue entries and sometimes did not receive any at all.
It didn't happen in kernel 3.10 nor in 4.7.
Thanks
Gili
Hi Gili.
If this is the case, I suggest that you send a mail to the relevant maintainer on the linux-rdma mailing list,
and provide more information for debugging this issue.
Thanks
Dotan
Hi.
When I call ibv_create_cq(), it fails with a segmentation fault.
> ibv_create_cq(ctx, 16, NULL, comp_channel, 0)
I'm using CentOS 7.4.1708
I think the CQ is being allocated in already allocated memory. Can you guess the reason?
Hi.
Is 'ctx' valid?
Was the 'comp_channel' allocated or initialized with NULL?
Sorry, but I don't have enough information to answer
(program source code is required to fully answer this question)
Thanks
Dotan
Hi,
We have two machines, first with ConnectX-3 and the other with ConnectX-5
I get with ibv_query_device() ibv_device_attr.max_cqe = 4,194,303 on both.
When I try to create cq with this value I get ENOMEM on both.
On ConnectX-3 I succeed with value of ~160K
On ConnectX-5 I succeed with value of ~80k
I want to use maximum capability of each machine, is there a way to know maximum value that will not fail on ENOMEM (and will not need to know which HCA is installed)?
Thanks,
Eyal
Hi.
In general, you should be able to create a CQ with the maximum number of entries,
assuming that your environment is configured correctly and you have enough memory in your machine.
I suggest that you check how much memory can be locked by every process in your machine.
Thanks
Dotan
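If the environment cannot be fixed and a program still needs the largest CQ size that works on a given machine, one possible workaround is to probe for it. The following is only a sketch under stated assumptions: it assumes that creation succeeds for every size up to some threshold and fails above it, and it abstracts the actual creation behind a hypothetical callback. `cq_probe_fn`, `largest_creatable_cqe()` and `probe_below_limit()` are illustrative names, not libibverbs APIs. With libibverbs, the callback could wrap ibv_create_cq() followed by ibv_destroy_cq() on success.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns nonzero if a CQ with 'cqe' entries
 * can be created, zero otherwise. */
typedef int (*cq_probe_fn)(int cqe, void *arg);

/* Binary search for the largest cqe in [1, max_cqe] for which probe()
 * succeeds, assuming success is monotonic in cqe.
 * Returns 0 if even a CQ with one entry cannot be created. */
int largest_creatable_cqe(cq_probe_fn probe, void *arg, int max_cqe)
{
    int lo = 0;       /* largest size known to succeed (0 = none yet) */
    int hi = max_cqe; /* largest size still worth trying */

    assert(probe != NULL && max_cqe >= 1);
    while (lo < hi) {
        int mid = lo + (hi - lo + 1) / 2; /* round up so the loop ends */

        if (probe(mid, arg))
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}

/* Stand-in probe for a dry run without hardware: pretends that any
 * size up to the limit pointed to by 'arg' can be created. */
int probe_below_limit(int cqe, void *arg)
{
    return cqe <= *(const int *)arg;
}
```

The search needs about log2(max_cqe) creation attempts, roughly 22 for the max_cqe value quoted above. The result only reflects the memory available at probing time, so it is a hint rather than a guarantee.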
Parameter cqe is specified as the minimum requested capacity of the CQ, user would expect the cq to grow dynamically until it reaches dev_cap.max_cqe . The meaning of that argument is pretty misleading.
Hi.
The dev_cap.max_cqe specifies the maximum number of CQEs a CQ can handle.
Once you create a CQ, you must specify the maximum amount of CQEs it can hold
(since memory must be allocated for that CQ).
The number of CQEs in that CQ will grow to the maximum number of CQEs that CQ can hold,
but the CQ size is a fixed value.
The rationale for this is to allow the user to create small CQs when needed.
Thanks
Dotan
Hi Dotan,
I want to send packets to different receivers (different hosts) from the same sender.
Should I use different RDMA NICs on the sender?
If it can run on the same NIC,
should I open the ibv_device twice and create a completion channel, CQ and QP for each connection?
I have tried sending packets to different receivers simultaneously with the same RDMA NIC on the sender,
opening the ibv_device twice and creating a completion channel, CQ and QP for each connection.
However, when connection 2 starts to send packets after connection 1 has started, connection 1 stops sending packets.
Thanks,
Hi.
You can send messages from the same sender to different receivers from the same NIC.
You can open the ibv_device once, and use a single channel, a single CQ and a different QP for each connection
(assuming that this is a connected transport type).
If you are using a datagram transport type, you can even use a single QP.
All of this can work from the same sender NIC.
If you want, you can use it from different ibv_devices or different local NICs.
Theoretically, the flow you described can work - please check your code.
Thanks
Dotan