diff mbox series

[RFC] ib_uverbs: atomically flush and mark closed the comp event queue

Message ID 264e959fc18a4e96fdfc43cfa348f95c3c9ee000.1535730794.git.swise@opengridcomputing.com (mailing list archive)
State Rejected
Headers show
Series [RFC] ib_uverbs: atomically flush and mark closed the comp event queue | expand

Commit Message

Steve Wise Aug. 31, 2018, 2:16 p.m. UTC
Currently a uverbs completion event queue is flushed of events in
ib_uverbs_comp_event_close() with the queue spinlock held and then
released.  Yet setting ev_queue->is_closed is not set until later in
uverbs_hot_unplug_completion_event_file().

In between the time ib_uverbs_comp_event_close() releases the lock
and uverbs_hot_unplug_completion_event_file() acquires the lock, a
completion event can arrive and be inserted into the event queue by
ib_uverbs_comp_handler().

This can cause a "double add" list_add warning or crash depending on
the kernel configuration, or a memory leak because the event is never
dequeued since the queue is already closed down.

So move the setting of ev_queue->is_closed to
ib_uverbs_comp_event_close().

Cc: stable@vger.kernel.org
Fixes: 1e7710f3f656 ("IB/core: Change completion channel to use the reworked objects schema")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
---

I added the RFC tag because I'm not sure removing setting
is_closed in uverbs_hot_unplug_completion_event_file() is correct.
Can uverbs_hot_unplug_completion_event_file() be called w/o
ib_uverbs_comp_event_close() being called first?  Should is_closed be
set in both places?

---
 drivers/infiniband/core/uverbs_main.c      | 1 +
 drivers/infiniband/core/uverbs_std_types.c | 4 ----
 2 files changed, 1 insertion(+), 4 deletions(-)

Comments

Jason Gunthorpe Sept. 1, 2018, 8:45 p.m. UTC | #1
On Fri, Aug 31, 2018 at 07:16:03AM -0700, Steve Wise wrote:
> Currently a uverbs completion event queue is flushed of events in
> ib_uverbs_comp_event_close() with the queue spinlock held and then
> released.  Yet setting ev_queue->is_closed is not set until later in
> uverbs_hot_unplug_completion_event_file().
> 
> In between the time ib_uverbs_comp_event_close() releases the lock
> and uverbs_hot_unplug_completion_event_file() acquires the lock, a
> completion event can arrive and be inserted into the event queue by
> ib_uverbs_comp_handler().
> 
> This can cause a "double add" list_add warning or crash depending on
> the kernel configuration, or a memory leak because the event is never
> dequeued since the queue is already closed down.
> 
> So move the setting of ev_queue->is_closed to
> ib_uverbs_comp_event_close().
> 
> Cc: stable@vger.kernel.org
> Fixes: 1e7710f3f656 ("IB/core: Change completion channel to use the reworked objects schema")
> Signed-off-by: Steve Wise <swise@opengridcomputing.com>
> 
> I added the RFC tag because I'm not sure removing setting
> is_closed in uverbs_hot_unplug_completion_event_file() is correct.
> Can uverbs_hot_unplug_completion_event_file() be called w/o
> ib_uverbs_comp_event_close() being called first?  Should is_closed be
> set in both places?
> 
>  drivers/infiniband/core/uverbs_main.c      | 1 +
>  drivers/infiniband/core/uverbs_std_types.c | 4 ----
>  2 files changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
> index 823beca448e1..b3ef7db68bde 100644
> +++ b/drivers/infiniband/core/uverbs_main.c
> @@ -440,6 +440,7 @@ static int ib_uverbs_comp_event_close(struct inode *inode, struct file *filp)
>  			list_del(&entry->obj_list);
>  		kfree(entry);
>  	}
> +	file->ev_queue.is_closed = 1;
>  	spin_unlock_irq(&file->ev_queue.lock);

Since event_close is essentially the release function of a kref, it is
definately wrong to set is_closed here, as there can be nothing to
read it.

>  	uverbs_close_fd(filp);
> diff --git a/drivers/infiniband/core/uverbs_std_types.c b/drivers/infiniband/core/uverbs_std_types.c
> index 203cc96ac6f5..76fc9b7fec59 100644
> +++ b/drivers/infiniband/core/uverbs_std_types.c
> @@ -199,10 +199,6 @@ static int uverbs_hot_unplug_completion_event_file(struct ib_uobject *uobj,
>  			     uobj);
>  	struct ib_uverbs_event_queue *event_queue = &comp_event_file->ev_queue;
>  
> -	spin_lock_irq(&event_queue->lock);
> -	event_queue->is_closed = 1;
> -	spin_unlock_irq(&event_queue->lock);
> -

Hot unplug and close are totally different things, the is_closed must
remain here to block access to a disassociated device.

Jason
Steve Wise Sept. 10, 2018, 7:24 p.m. UTC | #2
On 9/1/2018 3:45 PM, Jason Gunthorpe wrote:
> On Fri, Aug 31, 2018 at 07:16:03AM -0700, Steve Wise wrote:
>> Currently a uverbs completion event queue is flushed of events in
>> ib_uverbs_comp_event_close() with the queue spinlock held and then
>> released.  Yet setting ev_queue->is_closed is not set until later in
>> uverbs_hot_unplug_completion_event_file().
>>
>> In between the time ib_uverbs_comp_event_close() releases the lock
>> and uverbs_hot_unplug_completion_event_file() acquires the lock, a
>> completion event can arrive and be inserted into the event queue by
>> ib_uverbs_comp_handler().
>>
>> This can cause a "double add" list_add warning or crash depending on
>> the kernel configuration, or a memory leak because the event is never
>> dequeued since the queue is already closed down.
>>
>> So move the setting of ev_queue->is_closed to
>> ib_uverbs_comp_event_close().
>>
>> Cc: stable@vger.kernel.org
>> Fixes: 1e7710f3f656 ("IB/core: Change completion channel to use the reworked objects schema")
>> Signed-off-by: Steve Wise <swise@opengridcomputing.com>
>>
>> I added the RFC tag because I'm not sure removing setting
>> is_closed in uverbs_hot_unplug_completion_event_file() is correct.
>> Can uverbs_hot_unplug_completion_event_file() be called w/o
>> ib_uverbs_comp_event_close() being called first?  Should is_closed be
>> set in both places?
>>
>>  drivers/infiniband/core/uverbs_main.c      | 1 +
>>  drivers/infiniband/core/uverbs_std_types.c | 4 ----
>>  2 files changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
>> index 823beca448e1..b3ef7db68bde 100644
>> +++ b/drivers/infiniband/core/uverbs_main.c
>> @@ -440,6 +440,7 @@ static int ib_uverbs_comp_event_close(struct inode *inode, struct file *filp)
>>  			list_del(&entry->obj_list);
>>  		kfree(entry);
>>  	}
>> +	file->ev_queue.is_closed = 1;
>>  	spin_unlock_irq(&file->ev_queue.lock);
> 
> Since event_close is essentially the release function of a kref, it is
> definately wrong to set is_closed here, as there can be nothing to
> read it.

Hey Jason,

Question:  Why is there similar logic in ib_uverbs_async_event_close()
to set ev_queue.is_closed for the async ev_queue?:

...
        mutex_lock(&uverbs_file->device->lists_mutex);
        spin_lock_irq(&file->ev_queue.lock);
        closed_already = file->ev_queue.is_closed;
        file->ev_queue.is_closed = 1;
        list_for_each_entry_safe(entry, tmp, &file->ev_queue.event_list,
list) {
                if (entry->counter)
                        list_del(&entry->obj_list);
                kfree(entry);
        }
        spin_unlock_irq(&file->ev_queue.lock);
        if (!closed_already) {
                list_del(&file->list);
                ib_unregister_event_handler(&uverbs_file->event_handler);
        }
        mutex_unlock(&uverbs_file->device->lists_mutex);
...


Thanks,

Steve.
Jason Gunthorpe Sept. 10, 2018, 8:19 p.m. UTC | #3
On Mon, Sep 10, 2018 at 02:24:03PM -0500, Steve Wise wrote:
> 
> 
> On 9/1/2018 3:45 PM, Jason Gunthorpe wrote:
> > On Fri, Aug 31, 2018 at 07:16:03AM -0700, Steve Wise wrote:
> >> Currently a uverbs completion event queue is flushed of events in
> >> ib_uverbs_comp_event_close() with the queue spinlock held and then
> >> released.  Yet setting ev_queue->is_closed is not set until later in
> >> uverbs_hot_unplug_completion_event_file().
> >>
> >> In between the time ib_uverbs_comp_event_close() releases the lock
> >> and uverbs_hot_unplug_completion_event_file() acquires the lock, a
> >> completion event can arrive and be inserted into the event queue by
> >> ib_uverbs_comp_handler().
> >>
> >> This can cause a "double add" list_add warning or crash depending on
> >> the kernel configuration, or a memory leak because the event is never
> >> dequeued since the queue is already closed down.
> >>
> >> So move the setting of ev_queue->is_closed to
> >> ib_uverbs_comp_event_close().
> >>
> >> Cc: stable@vger.kernel.org
> >> Fixes: 1e7710f3f656 ("IB/core: Change completion channel to use the reworked objects schema")
> >> Signed-off-by: Steve Wise <swise@opengridcomputing.com>
> >>
> >> I added the RFC tag because I'm not sure removing setting
> >> is_closed in uverbs_hot_unplug_completion_event_file() is correct.
> >> Can uverbs_hot_unplug_completion_event_file() be called w/o
> >> ib_uverbs_comp_event_close() being called first?  Should is_closed be
> >> set in both places?
> >>
> >>  drivers/infiniband/core/uverbs_main.c      | 1 +
> >>  drivers/infiniband/core/uverbs_std_types.c | 4 ----
> >>  2 files changed, 1 insertion(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
> >> index 823beca448e1..b3ef7db68bde 100644
> >> +++ b/drivers/infiniband/core/uverbs_main.c
> >> @@ -440,6 +440,7 @@ static int ib_uverbs_comp_event_close(struct inode *inode, struct file *filp)
> >>  			list_del(&entry->obj_list);
> >>  		kfree(entry);
> >>  	}
> >> +	file->ev_queue.is_closed = 1;
> >>  	spin_unlock_irq(&file->ev_queue.lock);
> > 
> > Since event_close is essentially the release function of a kref, it is
> > definately wrong to set is_closed here, as there can be nothing to
> > read it.
> 
> Hey Jason,
> 
> Question:  Why is there similar logic in ib_uverbs_async_event_close()
> to set ev_queue.is_closed for the async ev_queue?:

Presumably because those functions use a different refcounting scheme
for their event queue..

Oh, but they don't! ib_uverbs_lookup_comp_file() drops the kref on the
file * and just retains on the uobject! Why!? 

That creates the possibility of a half destructed uobject!!

Everytime I look at this stuff I end up in tears :P

So, yes, it does look helpfull to move the is_closed like this..

But better would be to just not have something so insane in the first
place :(

Jason
Steve Wise Sept. 10, 2018, 9:58 p.m. UTC | #4
On 9/10/2018 3:19 PM, Jason Gunthorpe wrote:
> On Mon, Sep 10, 2018 at 02:24:03PM -0500, Steve Wise wrote:
>>
>>
>> On 9/1/2018 3:45 PM, Jason Gunthorpe wrote:
>>> On Fri, Aug 31, 2018 at 07:16:03AM -0700, Steve Wise wrote:
>>>> Currently a uverbs completion event queue is flushed of events in
>>>> ib_uverbs_comp_event_close() with the queue spinlock held and then
>>>> released.  Yet setting ev_queue->is_closed is not set until later in
>>>> uverbs_hot_unplug_completion_event_file().
>>>>
>>>> In between the time ib_uverbs_comp_event_close() releases the lock
>>>> and uverbs_hot_unplug_completion_event_file() acquires the lock, a
>>>> completion event can arrive and be inserted into the event queue by
>>>> ib_uverbs_comp_handler().
>>>>
>>>> This can cause a "double add" list_add warning or crash depending on
>>>> the kernel configuration, or a memory leak because the event is never
>>>> dequeued since the queue is already closed down.
>>>>
>>>> So move the setting of ev_queue->is_closed to
>>>> ib_uverbs_comp_event_close().
>>>>
>>>> Cc: stable@vger.kernel.org
>>>> Fixes: 1e7710f3f656 ("IB/core: Change completion channel to use the reworked objects schema")
>>>> Signed-off-by: Steve Wise <swise@opengridcomputing.com>
>>>>
>>>> I added the RFC tag because I'm not sure removing setting
>>>> is_closed in uverbs_hot_unplug_completion_event_file() is correct.
>>>> Can uverbs_hot_unplug_completion_event_file() be called w/o
>>>> ib_uverbs_comp_event_close() being called first?  Should is_closed be
>>>> set in both places?
>>>>
>>>>  drivers/infiniband/core/uverbs_main.c      | 1 +
>>>>  drivers/infiniband/core/uverbs_std_types.c | 4 ----
>>>>  2 files changed, 1 insertion(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
>>>> index 823beca448e1..b3ef7db68bde 100644
>>>> +++ b/drivers/infiniband/core/uverbs_main.c
>>>> @@ -440,6 +440,7 @@ static int ib_uverbs_comp_event_close(struct inode *inode, struct file *filp)
>>>>  			list_del(&entry->obj_list);
>>>>  		kfree(entry);
>>>>  	}
>>>> +	file->ev_queue.is_closed = 1;
>>>>  	spin_unlock_irq(&file->ev_queue.lock);
>>>
>>> Since event_close is essentially the release function of a kref, it is
>>> definately wrong to set is_closed here, as there can be nothing to
>>> read it.
>>
>> Hey Jason,
>>
>> Question:  Why is there similar logic in ib_uverbs_async_event_close()
>> to set ev_queue.is_closed for the async ev_queue?:
> 
> Presumably because those functions use a different refcounting scheme
> for their event queue..
> 
> Oh, but they don't! ib_uverbs_lookup_comp_file() drops the kref on the
> file * and just retains on the uobject! Why!? 
> 
> That creates the possibility of a half destructed uobject!!
> 
> Everytime I look at this stuff I end up in tears :P
> 
> So, yes, it does look helpfull to move the is_closed like this..
> 
> But better would be to just not have something so insane in the first
> place :(
> 
> Jason
> 

So do you need another patch from me for this?  You previously stated
the device removal path also needed setting is_closed to 1.  Should I
resend a v2 patch, with just adding setting is_closed to 1 in
ib_uverbs_comp_event_close()?

Thanks!

Steve.
diff mbox series

Patch

diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 823beca448e1..b3ef7db68bde 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -440,6 +440,7 @@  static int ib_uverbs_comp_event_close(struct inode *inode, struct file *filp)
 			list_del(&entry->obj_list);
 		kfree(entry);
 	}
+	file->ev_queue.is_closed = 1;
 	spin_unlock_irq(&file->ev_queue.lock);
 
 	uverbs_close_fd(filp);
diff --git a/drivers/infiniband/core/uverbs_std_types.c b/drivers/infiniband/core/uverbs_std_types.c
index 203cc96ac6f5..76fc9b7fec59 100644
--- a/drivers/infiniband/core/uverbs_std_types.c
+++ b/drivers/infiniband/core/uverbs_std_types.c
@@ -199,10 +199,6 @@  static int uverbs_hot_unplug_completion_event_file(struct ib_uobject *uobj,
 			     uobj);
 	struct ib_uverbs_event_queue *event_queue = &comp_event_file->ev_queue;
 
-	spin_lock_irq(&event_queue->lock);
-	event_queue->is_closed = 1;
-	spin_unlock_irq(&event_queue->lock);
-
 	if (why == RDMA_REMOVE_DRIVER_REMOVE) {
 		wake_up_interruptible(&event_queue->poll_wait);
 		kill_fasync(&event_queue->async_queue, SIGIO, POLL_IN);