ibv_destroy_cq()
int ibv_destroy_cq(struct ibv_cq *cq);
Description
ibv_destroy_cq() destroys a Completion Queue.
The destruction of a CQ will fail if any QP is still associated with it.
If an affiliated asynchronous event on that CQ was read with ibv_get_async_event() but has not yet been acknowledged with ibv_ack_async_event(), the call to ibv_destroy_cq() will block until that event is acknowledged.
A CQ can be destroyed whether it is empty or still contains Work Completions that were not polled with ibv_poll_cq(). Also, it can be destroyed if a request for notification was requested using ibv_req_notify_cq() before any notification event was created.
Parameters
Name | Direction | Description |
---|---|---|
cq | in | CQ that was returned from ibv_create_cq() |
Return Values
Value | Description |
---|---|
0 | On success |
errno | On failure |
EBUSY | One or more Work Queues are still associated with the CQ |
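Since the failure reason comes back as the return value itself, a caller can branch on it directly. The helper below is a minimal sketch of that idea; `describe_destroy_cq_result` is a hypothetical name and not part of libibverbs, and it takes the return value as a plain `int` so that it compiles without the verbs headers.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper (not part of libibverbs): turn the value returned by
 * ibv_destroy_cq() into a human-readable message. */
static const char *describe_destroy_cq_result(int rc)
{
    if (rc == 0)
        return "CQ destroyed";
    if (rc == EBUSY)
        return "CQ is still in use: destroy the associated QPs first";
    return strerror(rc); /* any other errno value */
}
```

A caller would typically store the result of ibv_destroy_cq(cq) in an int and, if it is nonzero, print the string returned by this helper.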
Examples
Create a CQ with 100 entries and destroy it:
    struct ibv_cq *cq;

    cq = ibv_create_cq(context, 100, NULL, NULL, 0);
    if (!cq) {
        fprintf(stderr, "Error, ibv_create_cq() failed\n");
        return -1;
    }

    if (ibv_destroy_cq(cq)) {
        fprintf(stderr, "Error, ibv_destroy_cq() failed\n");
        return -1;
    }
FAQs
ibv_destroy_cq() failed, what will happen to the QPs which are associated with it?
Nothing at all. You can continue working with them without any side-effect.
ibv_destroy_cq() failed, can I know which QPs are associated with it and caused this failure?
No, currently the RDMA stack doesn’t have this capability.
Can I destroy a CQ with Work Completions in it?
Yes, you can. When a CQ is being destroyed, it doesn't matter if it contains Work Completions or not.
I called ibv_req_notify_cq() on a CQ and didn't get any Completion event on it. Can I destroy that CQ?
Yes, you can.
I called ibv_destroy_cq(), but it never ended. What happened?
There is at least one affiliated asynchronous event on that CQ that was read without a proper acknowledgement.
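One way to catch this before calling ibv_destroy_cq() is to count, per CQ, how many events were read and how many were acknowledged. The following is only a sketch of such bookkeeping: the `cq_event_tracker` type and its functions are hypothetical application code, not part of libibverbs, and the real verbs calls appear only in comments.

```c
/* Hypothetical per-CQ bookkeeping (not part of libibverbs). */
struct cq_event_tracker {
    unsigned int events_read;  /* events obtained for this CQ */
    unsigned int events_acked; /* events acknowledged for this CQ */
};

/* Call after each event read for this CQ, e.g. after ibv_get_cq_event()
 * or ibv_get_async_event() returned an event that belongs to it. */
static void tracker_event_read(struct cq_event_tracker *t)
{
    t->events_read++;
}

/* Number of events that were read but not yet acknowledged. */
static unsigned int tracker_unacked(const struct cq_event_tracker *t)
{
    return t->events_read - t->events_acked;
}

/* Call after acknowledging n events for this CQ, e.g. with
 * ibv_ack_cq_events() or ibv_ack_async_event().
 * Returns -1 if more events are acknowledged than were read, which
 * suggests an acknowledgement was credited to the wrong CQ. */
static int tracker_events_acked(struct cq_event_tracker *t, unsigned int n)
{
    if (n > tracker_unacked(t))
        return -1;
    t->events_acked += n;
    return 0;
}
```

With this in place, an application could refuse to call ibv_destroy_cq() (or log a warning) while tracker_unacked() is nonzero, instead of blocking without explanation.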
Comments
Will destroying a CQ which has Work Completions cause a memory leak?
Hi.
Destroying a CQ that still holds Work Completions won't leak any RDMA resources
(the CQ buffer is freed whether or not there are Work Completions in it).
However, if your program needs information from those Work Completions
(for example the wr_id, to know which buffers are in use),
then *your* code will have memory leaks.
Thanks
Dotan
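One possible way to avoid that application-side leak is to keep your own record of posted buffers, so they can be released even if their Work Completions are discarded together with the CQ. The sketch below assumes that each wr_id holds the address of a malloc()ed buffer; the `outstanding_bufs` table and its functions are hypothetical names, not part of libibverbs.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_OUTSTANDING 64

/* Hypothetical application-side table of buffers whose Work Requests
 * are still outstanding. A wr_id of 0 marks a free slot. */
struct outstanding_bufs {
    uint64_t wr_id[MAX_OUTSTANDING];
};

/* Record a buffer when posting it. Returns the value to store in the
 * Work Request's wr_id, or 0 if the table is full. */
static uint64_t bufs_track(struct outstanding_bufs *tbl, void *buf)
{
    for (size_t i = 0; i < MAX_OUTSTANDING; i++) {
        if (tbl->wr_id[i] == 0) {
            tbl->wr_id[i] = (uint64_t)(uintptr_t)buf;
            return tbl->wr_id[i];
        }
    }
    return 0;
}

/* Call for each polled Work Completion: free its buffer and clear the
 * slot. Returns 0 if the wr_id was tracked, -1 otherwise. */
static int bufs_complete(struct outstanding_bufs *tbl, uint64_t wr_id)
{
    for (size_t i = 0; wr_id != 0 && i < MAX_OUTSTANDING; i++) {
        if (tbl->wr_id[i] == wr_id) {
            free((void *)(uintptr_t)wr_id);
            tbl->wr_id[i] = 0;
            return 0;
        }
    }
    return -1;
}

/* Call at teardown, once the QPs are destroyed and the memory is no
 * longer registered: free every buffer whose completion was never
 * polled. Returns the number of buffers freed. */
static size_t bufs_release_all(struct outstanding_bufs *tbl)
{
    size_t freed = 0;
    for (size_t i = 0; i < MAX_OUTSTANDING; i++) {
        if (tbl->wr_id[i] != 0) {
            free((void *)(uintptr_t)tbl->wr_id[i]);
            tbl->wr_id[i] = 0;
            freed++;
        }
    }
    return freed;
}
```

The fixed-size array keeps the sketch short; a real application would more likely use whatever buffer pool or list it already maintains.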
Regarding the following statement:
"Also, it can be destroyed if a request for notification was requested using ibv_req_notify_cq() before any notification event was created."
Can a CQ be destroyed using ibv_destroy_cq() after a notification event was generated as a result of calling ibv_req_notify_cq()?
Will calling ibv_destroy_cq() in this situation never return?
Thanks.
Hi.
The rule is very simple:
You can request events as many times as you like.
However, once you *got* an event, you must acknowledge it; otherwise ibv_destroy_cq() will never return.
Thanks
Dotan
Thanks Dotan for the reply.
For the following scenario:
0. ibv_req_notify_cq() was requested previously for CQ X
A while later there's an event that requires CQ X be destroyed, so we proceed with the following:
1. Non-blocking call to ibv_get_cq_event() returns errno EAGAIN since there's no notification posted. So there is nothing to acknowledge.
2. ibv_destroy_cq() to destroy CQ X
Even though we've already checked at step 1 that there was no notification pending for CQ X, there's a window between steps 1 and 2 in which a new notification for CQ X could be posted before we call ibv_destroy_cq().
In this scenario, will the call to ibv_destroy_cq() never return?
If it causes ibv_destroy_cq() to hang, what would be the best way to handle this scenario?
Thanks again.
Hi.
I thought about it carefully, and here is the answer:
you describe what seems to be a potential race.
However, CQ events are events that are created for Work Completions.
Before a CQ is destroyed, all the associated QPs should be destroyed,
thus no completion will be added to the CQ when destroying it.
So, no race should occur, and destroy CQ should end without any problem.
Does it make sense to you?
Thanks
Dotan
Thanks Dotan for the pointer.
Yes, it does make sense.
After moving the Queue Pair into the Error state and waiting for IBV_EVENT_QP_LAST_WQE_REACHED (i.e. all its WQEs have been flushed, hence no WQE will be consumed anymore) and destroying the associated QPs, then calling ibv_destroy_cq() should end without any problem as you stated.
Thanks.
Hendrik
Great.
Thanks for the update
Dotan
Hi Dotan,
I've implemented the sequence that was discussed previously but the call to ibv_destroy_cq() still never returns.
Here's the sequence:
1. Put Queue Pair (QP) into an error state (IBV_QPS_ERR) using
ibv_modify_qp()
2. Drain the CQ
2.1 Wait for CQ event using ibv_get_cq_event().
2.2 Once CQ event is detected, poll CQ using ibv_poll_cq()
2.3 All posted work requests are returned with status
IBV_WC_WR_FLUSH_ERR
2.4 Ack CQ event using ibv_ack_cq_events()
3. rdma_destroy_qp()
4. ibv_destroy_cq() hangs
At step 2.3 I've verified that all of the work requests I previously posted with ibv_post_recv() are returned. Hence there will not be any more Work Completions on the CQ. The CQ event has also been ack'ed.
Am I missing any step here that causes ibv_destroy_cq() to hang?
Thanks.
Hendrik
Hi.
The described scenario seems right.
Can you share the source code with me,
so I'll review it?
Thanks
Dotan
Hi Dotan,
I've found the root cause of the issue. There was a bug in our I/O code path that could acknowledge an event on the wrong CQ.
After fixing it, the destroy sequence we discussed previously works just fine. ibv_destroy_cq() now returns successfully.
Thanks for your help.
Hendrik
This is great.
I'm working on a set of tools that will help debug this kind of issue faster...
Thanks
Dotan