[1/3] drm/i915: Only update the current userptr worker

Message ID 1435683333-17844-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson June 30, 2015, 4:55 p.m. UTC
The userptr worker allows for a slight race condition whereupon there
may be two or more threads calling get_user_pages for the same object. When
we have the array of pages, then we serialise the update of the object.
However, the worker should only overwrite the obj->userptr.work pointer
if and only if it is the active one. Currently we clear it for a
secondary worker with the effect that we may rarely force a second
lookup.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_userptr.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

Comments

Tvrtko Ursulin July 1, 2015, 9:48 a.m. UTC | #1
On 06/30/2015 05:55 PM, Chris Wilson wrote:
> The userptr worker allows for a slight race condition where upon there
> may two or more threads calling get_user_pages for the same object. When
> we have the array of pages, then we serialise the update of the object.
> However, the worker should only overwrite the obj->userptr.work pointer
> if and only if it is the active one. Currently we clear it for a
> secondary worker with the effect that we may rarely force a second
> lookup.

Secondary worker can fire only if invalidate clears the current one, no? 
(if (obj->userptr.work == NULL && ...))

It then "cancels" the worker so that the st_set_pages path is avoided.

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_gem_userptr.c | 16 ++++++++--------
>   1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 7a5242cd5ea5..cb367d9f7909 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -581,17 +581,17 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
>   	}
>
>   	mutex_lock(&dev->struct_mutex);
> -	if (obj->userptr.work != &work->work) {
> -		ret = 0;
> -	} else if (pinned == num_pages) {
> -		ret = st_set_pages(&obj->pages, pvec, num_pages);
> -		if (ret == 0) {
> -			list_add_tail(&obj->global_list, &to_i915(dev)->mm.unbound_list);
> -			pinned = 0;
> +	if (obj->userptr.work == &work->work) {
> +		if (pinned == num_pages) {
> +			ret = st_set_pages(&obj->pages, pvec, num_pages);
> +			if (ret == 0) {
> +				list_add_tail(&obj->global_list, &to_i915(dev)->mm.unbound_list);
> +				pinned = 0;
> +			}
>   		}
> +		obj->userptr.work = ERR_PTR(ret);
>   	}
>
> -	obj->userptr.work = ERR_PTR(ret);
>   	obj->userptr.workers--;
>   	drm_gem_object_unreference(&obj->base);
>   	mutex_unlock(&dev->struct_mutex);

Previously the canceled worker would allow another worker to be created 
in case it failed (obj->userptr.work != &work->work; ret = 0;) and now 
it still does since obj->userptr.work remains at NULL from cancellation.

Both seem wrong, am I missing the change?

Regards,

Tvrtko
Chris Wilson July 1, 2015, 9:59 a.m. UTC | #2
On Wed, Jul 01, 2015 at 10:48:59AM +0100, Tvrtko Ursulin wrote:
> 
> On 06/30/2015 05:55 PM, Chris Wilson wrote:
> >The userptr worker allows for a slight race condition where upon there
> >may two or more threads calling get_user_pages for the same object. When
> >we have the array of pages, then we serialise the update of the object.
> >However, the worker should only overwrite the obj->userptr.work pointer
> >if and only if it is the active one. Currently we clear it for a
> >secondary worker with the effect that we may rarely force a second
> >lookup.
> 
> Secondary worker can fire only if invalidate clears the current one,
> no? (if (obj->userptr.work == NULL && ...))
> 
> It then "cancels" the worker so that the st_set_pages path is avoided.

I may have overegged the changelog, but what I did not like here was
that we would touch obj->userptr.work when we clearly had lost ownership
of that field.
 
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >---
> >  drivers/gpu/drm/i915/i915_gem_userptr.c | 16 ++++++++--------
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> >index 7a5242cd5ea5..cb367d9f7909 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> >@@ -581,17 +581,17 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
> >  	}
> >
> >  	mutex_lock(&dev->struct_mutex);
> >-	if (obj->userptr.work != &work->work) {
> >-		ret = 0;
> >-	} else if (pinned == num_pages) {
> >-		ret = st_set_pages(&obj->pages, pvec, num_pages);
> >-		if (ret == 0) {
> >-			list_add_tail(&obj->global_list, &to_i915(dev)->mm.unbound_list);
> >-			pinned = 0;
> >+	if (obj->userptr.work == &work->work) {
> >+		if (pinned == num_pages) {
> >+			ret = st_set_pages(&obj->pages, pvec, num_pages);
> >+			if (ret == 0) {
> >+				list_add_tail(&obj->global_list, &to_i915(dev)->mm.unbound_list);
> >+				pinned = 0;
> >+			}
> >  		}
> >+		obj->userptr.work = ERR_PTR(ret);
> >  	}
> >
> >-	obj->userptr.work = ERR_PTR(ret);
> >  	obj->userptr.workers--;
> >  	drm_gem_object_unreference(&obj->base);
> >  	mutex_unlock(&dev->struct_mutex);
> 
> Previously the canceled worker would allow another worker to be
> created in case it failed (obj->userptr.work != &work->work; ret =
> 0;) and now it still does since obj->userptr.work remains at NULL
> from cancellation.
> 
> Both seem wrong, am I missing the change?

No, the obj->userptr.work must remain NULL until a new get_pages()
because we don't actually know if this worker's gup was before or after
the cancellation  - mmap_sem vs struct_mutex ordering.
-Chris
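
Editor's sketch: the patch stores ERR_PTR(ret) in obj->userptr.work, so the one field appears to carry three states: NULL (nothing pending, including after a cancellation), a live work pointer, or an encoded errno. The userspace code below re-implements the kernel helpers from include/linux/err.h in lowercase so it can be compiled outside the kernel. The enum and classify() are hypothetical names used only for illustration, and the reading of the three states is an assumption drawn from this thread, not from the driver source.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace copies of the kernel's ERR_PTR/PTR_ERR/IS_ERR idiom. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int is_err(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical labels for what the work field can hold. */
enum work_state { WORK_IDLE, WORK_ACTIVE, WORK_FAILED };

static enum work_state classify(const void *work)
{
	if (work == NULL)
		return WORK_IDLE;	/* cancelled, or completed with ret == 0 */
	if (is_err(work))
		return WORK_FAILED;	/* holds a negative errno */
	return WORK_ACTIVE;		/* points at the queued worker */
}
```

Note that err_ptr(0) is NULL, so a successful completion and a cancellation leave the field in the same state; only the presence of pages on the object would tell them apart.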
Tvrtko Ursulin July 1, 2015, 10:58 a.m. UTC | #3
On 07/01/2015 10:59 AM, Chris Wilson wrote:
> On Wed, Jul 01, 2015 at 10:48:59AM +0100, Tvrtko Ursulin wrote:
>>
>> On 06/30/2015 05:55 PM, Chris Wilson wrote:
>>> The userptr worker allows for a slight race condition where upon there
>>> may two or more threads calling get_user_pages for the same object. When
>>> we have the array of pages, then we serialise the update of the object.
>>> However, the worker should only overwrite the obj->userptr.work pointer
>>> if and only if it is the active one. Currently we clear it for a
>>> secondary worker with the effect that we may rarely force a second
>>> lookup.
>>
>> Secondary worker can fire only if invalidate clears the current one,
>> no? (if (obj->userptr.work == NULL && ...))
>>
>> It then "cancels" the worker so that the st_set_pages path is avoided.
>
> I may have overegged the changelog, but what I did not like here was
> that we would touch obj->userptr.work when we clearly had lost ownership
> of that field.

Yes that part makes sense.

>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>   drivers/gpu/drm/i915/i915_gem_userptr.c | 16 ++++++++--------
>>>   1 file changed, 8 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
>>> index 7a5242cd5ea5..cb367d9f7909 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
>>> @@ -581,17 +581,17 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
>>>   	}
>>>
>>>   	mutex_lock(&dev->struct_mutex);
>>> -	if (obj->userptr.work != &work->work) {
>>> -		ret = 0;
>>> -	} else if (pinned == num_pages) {
>>> -		ret = st_set_pages(&obj->pages, pvec, num_pages);
>>> -		if (ret == 0) {
>>> -			list_add_tail(&obj->global_list, &to_i915(dev)->mm.unbound_list);
>>> -			pinned = 0;
>>> +	if (obj->userptr.work == &work->work) {
>>> +		if (pinned == num_pages) {
>>> +			ret = st_set_pages(&obj->pages, pvec, num_pages);
>>> +			if (ret == 0) {
>>> +				list_add_tail(&obj->global_list, &to_i915(dev)->mm.unbound_list);
>>> +				pinned = 0;
>>> +			}
>>>   		}
>>> +		obj->userptr.work = ERR_PTR(ret);
>>>   	}
>>>
>>> -	obj->userptr.work = ERR_PTR(ret);
>>>   	obj->userptr.workers--;
>>>   	drm_gem_object_unreference(&obj->base);
>>>   	mutex_unlock(&dev->struct_mutex);
>>
>> Previously the canceled worker would allow another worker to be
>> created in case it failed (obj->userptr.work != &work->work; ret =
>> 0;) and now it still does since obj->userptr.work remains at NULL
>> from cancellation.
>>
>> Both seem wrong, am I missing the change?
>
> No, the obj->userptr.work must remain NULL until a new get_pages()
> because we don't actually know if this worker's gup was before or after
> the cancellation  - mmap_sem vs struct_mutex ordering.

No one is not wrong, or no I was not missing the change?

I am thinking more and more that we should just mark it canceled forever 
and not allow get_pages to succeed ever since.

Regards,

Tvrtko
Chris Wilson July 1, 2015, 11:09 a.m. UTC | #4
On Wed, Jul 01, 2015 at 11:58:46AM +0100, Tvrtko Ursulin wrote:
> On 07/01/2015 10:59 AM, Chris Wilson wrote:
> >On Wed, Jul 01, 2015 at 10:48:59AM +0100, Tvrtko Ursulin wrote:
> >>Previously the canceled worker would allow another worker to be
> >>created in case it failed (obj->userptr.work != &work->work; ret =
> >>0;) and now it still does since obj->userptr.work remains at NULL
> >>from cancellation.
> >>
> >>Both seem wrong, am I missing the change?
> >
> >No, the obj->userptr.work must remain NULL until a new get_pages()
> >because we don't actually know if this worker's gup was before or after
> >the cancellation  - mmap_sem vs struct_mutex ordering.
> 
> No one is not wrong, or no I was not missing the change?

The only change is that we don't change the value of userptr.work if it
is set to something else. The only time it should be different was if it
had been cancelled and so NULL. The patch just makes it so that a coding
error is less damaging - and I think easier to read because of that.
 
> I am thinking more and more that we should just mark it canceled
> forever and not allow get_pages to succeed ever since.

Yes, I toyed with that yesterday in response to you being able to alias
a GTT mmap address with the userptr after munmap(userptr.ptr). The
problem is that cancel_userptr() is called for any change in the CPU
PTEs, including mprotect() or cow after forking. Both of those are
valid situations where we want to keep the userptr around, but with a
new gup.

It's tricky to know what the right thing to do is. For example, another
quirk is that we can recover a failed get_pages() by repeatedly invoking
it after a new aliasing. Again, I'm not sure if the current behaviour is
a little too lax.
-Chris
Tvrtko Ursulin July 1, 2015, 12:26 p.m. UTC | #5
On 07/01/2015 12:09 PM, Chris Wilson wrote:
> On Wed, Jul 01, 2015 at 11:58:46AM +0100, Tvrtko Ursulin wrote:
>> On 07/01/2015 10:59 AM, Chris Wilson wrote:
>>> On Wed, Jul 01, 2015 at 10:48:59AM +0100, Tvrtko Ursulin wrote:
>>>> Previously the canceled worker would allow another worker to be
>>>> created in case it failed (obj->userptr.work != &work->work; ret =
>>>> 0;) and now it still does since obj->userptr.work remains at NULL
>>>> from cancellation.
>>>>
>>>> Both seem wrong, am I missing the change?
>>>
>>> No, the obj->userptr.work must remain NULL until a new get_pages()
>>> because we don't actually know if this worker's gup was before or after
>>> the cancellation  - mmap_sem vs struct_mutex ordering.
>>
>> No one is not wrong, or no I was not missing the change?
>
> The only change is that we don't change the value of userptr.work if it
> is set to something else. The only time it should be different was if it
> had been cancelled and so NULL. The patch just makes it so that a coding
> error is less damaging - and I think easier to read because of that.
>
>> I am thinking more and more that we should just mark it canceled
>> forever and not allow get_pages to succeed ever since.
>
> Yes, I toyed with that yesterday in response to you being able to alias
> a GTT mmap address with the userptr after munmap(userptr.ptr). The
> problem is that cancel_userptr() is caller for any change in the CPU
> PTE's, including mprotect() or cow after forking. Both of those are
> valid situations where we want to keep the userptr around, but with a
> new gup.

Why do we want that? I would be surprised if someone is using it like 
that. How would it be defined on the GEM handle level even?

Regards,

Tvrtko
Chris Wilson July 1, 2015, 1:11 p.m. UTC | #6
On Wed, Jul 01, 2015 at 01:26:59PM +0100, Tvrtko Ursulin wrote:
> 
> On 07/01/2015 12:09 PM, Chris Wilson wrote:
> >On Wed, Jul 01, 2015 at 11:58:46AM +0100, Tvrtko Ursulin wrote:
> >>On 07/01/2015 10:59 AM, Chris Wilson wrote:
> >>>On Wed, Jul 01, 2015 at 10:48:59AM +0100, Tvrtko Ursulin wrote:
> >>>>Previously the canceled worker would allow another worker to be
> >>>>created in case it failed (obj->userptr.work != &work->work; ret =
> >>>>0;) and now it still does since obj->userptr.work remains at NULL
> >>>>from cancellation.
> >>>>
> >>>>Both seem wrong, am I missing the change?
> >>>
> >>>No, the obj->userptr.work must remain NULL until a new get_pages()
> >>>because we don't actually know if this worker's gup was before or after
> >>>the cancellation  - mmap_sem vs struct_mutex ordering.
> >>
> >>No one is not wrong, or no I was not missing the change?
> >
> >The only change is that we don't change the value of userptr.work if it
> >is set to something else. The only time it should be different was if it
> >had been cancelled and so NULL. The patch just makes it so that a coding
> >error is less damaging - and I think easier to read because of that.
> >
> >>I am thinking more and more that we should just mark it canceled
> >>forever and not allow get_pages to succeed ever since.
> >
> >Yes, I toyed with that yesterday in response to you being able to alias
> >a GTT mmap address with the userptr after munmap(userptr.ptr). The
> >problem is that cancel_userptr() is caller for any change in the CPU
> >PTE's, including mprotect() or cow after forking. Both of those are
> >valid situations where we want to keep the userptr around, but with a
> >new gup.
> 
> Why do we want that? I would be surprised if someone is using it
> like that. How would it be defined on the GEM handle level even?

I would be surprised as well, but it is a race condition we can handle
correctly and succinctly.

The race is just
	bo = userptr(ptr, size);
	set-to-domain(bo);
	mremap(ptr, newptr, size);
	set-to-domain(bo); // or exec(bo);
-Chris
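
Editor's sketch: the mremap() step of the sequence above can be reproduced in plain userspace on Linux, without any GPU involvement. The helper below (move_mapping is a hypothetical name) moves an anonymous mapping to a freshly reserved address, which is the kind of CPU PTE change that would leave a previously pinned page array describing an address that no longer backs the data. It illustrates only the address-space side of the race; the userptr and set-to-domain calls are not modelled.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Move the mapping at ptr to a different address.
 * Returns the new address, or MAP_FAILED on error.
 */
static void *move_mapping(void *ptr, size_t size)
{
	/* Reserve a destination range so the move cannot land on ptr. */
	void *dst = mmap(NULL, size, PROT_NONE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (dst == MAP_FAILED)
		return MAP_FAILED;

	/* MREMAP_FIXED replaces the reservation; the old range is unmapped. */
	return mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_FIXED, dst);
}
```

After the move the contents are reachable only through the returned address, so any lookup done against the old ptr has to be repeated.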
Michał Winiarski July 3, 2015, 10:48 a.m. UTC | #7
On Tue, Jun 30, 2015 at 05:55:31PM +0100, Chris Wilson wrote:
> The userptr worker allows for a slight race condition where upon there
> may two or more threads calling get_user_pages for the same object. When
> we have the array of pages, then we serialise the update of the object.
> However, the worker should only overwrite the obj->userptr.work pointer
> if and only if it is the active one. Currently we clear it for a
> secondary worker with the effect that we may rarely force a second
> lookup.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Whole series:
Tested-by: Michał Winiarski <michal.winiarski@intel.com>

> ---
>  drivers/gpu/drm/i915/i915_gem_userptr.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 7a5242cd5ea5..cb367d9f7909 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -581,17 +581,17 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
>  	}
>  
>  	mutex_lock(&dev->struct_mutex);
> -	if (obj->userptr.work != &work->work) {
> -		ret = 0;
> -	} else if (pinned == num_pages) {
> -		ret = st_set_pages(&obj->pages, pvec, num_pages);
> -		if (ret == 0) {
> -			list_add_tail(&obj->global_list, &to_i915(dev)->mm.unbound_list);
> -			pinned = 0;
> +	if (obj->userptr.work == &work->work) {
> +		if (pinned == num_pages) {
> +			ret = st_set_pages(&obj->pages, pvec, num_pages);
> +			if (ret == 0) {
> +				list_add_tail(&obj->global_list, &to_i915(dev)->mm.unbound_list);
> +				pinned = 0;
> +			}
>  		}
> +		obj->userptr.work = ERR_PTR(ret);
>  	}
>  
> -	obj->userptr.work = ERR_PTR(ret);
>  	obj->userptr.workers--;
>  	drm_gem_object_unreference(&obj->base);
>  	mutex_unlock(&dev->struct_mutex);
> -- 
> 2.1.4
> 
Chris Wilson July 3, 2015, 10:53 a.m. UTC | #8
On Fri, Jul 03, 2015 at 12:48:03PM +0200, Michał Winiarski wrote:
> On Tue, Jun 30, 2015 at 05:55:31PM +0100, Chris Wilson wrote:
> > The userptr worker allows for a slight race condition where upon there
> > may two or more threads calling get_user_pages for the same object. When
> > we have the array of pages, then we serialise the update of the object.
> > However, the worker should only overwrite the obj->userptr.work pointer
> > if and only if it is the active one. Currently we clear it for a
> > secondary worker with the effect that we may rarely force a second
> > lookup.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Whole series:
> Tested-by: Michał Winiarski <michal.winiarski@intel.com>

That reminds me there was a refleak in patch 3 if a second
invalidate-range notification arrived before the first's worker had run
(we would take the ref for the active mo, but since the worker was
queued, it would still only run once and not drop our new ref.)
-Chris

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 7a5242cd5ea5..cb367d9f7909 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -581,17 +581,17 @@  __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
 	}
 
 	mutex_lock(&dev->struct_mutex);
-	if (obj->userptr.work != &work->work) {
-		ret = 0;
-	} else if (pinned == num_pages) {
-		ret = st_set_pages(&obj->pages, pvec, num_pages);
-		if (ret == 0) {
-			list_add_tail(&obj->global_list, &to_i915(dev)->mm.unbound_list);
-			pinned = 0;
+	if (obj->userptr.work == &work->work) {
+		if (pinned == num_pages) {
+			ret = st_set_pages(&obj->pages, pvec, num_pages);
+			if (ret == 0) {
+				list_add_tail(&obj->global_list, &to_i915(dev)->mm.unbound_list);
+				pinned = 0;
+			}
 		}
+		obj->userptr.work = ERR_PTR(ret);
 	}
 
-	obj->userptr.work = ERR_PTR(ret);
 	obj->userptr.workers--;
 	drm_gem_object_unreference(&obj->base);
 	mutex_unlock(&dev->struct_mutex);
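
Editor's sketch: the ownership rule in the patch can be modelled in userspace as below. A finishing worker publishes its result into the shared field only if the field still points at that worker; otherwise it just drops its worker count. All names here (fake_obj, fake_work, worker_complete, cancel_active_worker) are hypothetical stand-ins, a pthread mutex plays the role of struct_mutex, and page pinning is reduced to a flag. This is a minimal model under those assumptions, not the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

struct fake_work { int id; };

struct fake_obj {
	pthread_mutex_t lock;	/* stands in for struct_mutex */
	void *work;		/* NULL, the active worker, or an encoded errno */
	int workers;		/* workers still in flight */
	int has_pages;		/* set once a lookup has been published */
};

static void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static void fake_obj_init(struct fake_obj *obj)
{
	pthread_mutex_init(&obj->lock, NULL);
	obj->work = NULL;
	obj->workers = 0;
	obj->has_pages = 0;
}

/* Completion step: touch obj->work only while this worker still owns it. */
static void worker_complete(struct fake_obj *obj, struct fake_work *work, int ret)
{
	pthread_mutex_lock(&obj->lock);
	if (obj->work == work) {
		if (ret == 0)
			obj->has_pages = 1;
		obj->work = err_ptr(ret);	/* err_ptr(0) is NULL */
	}
	obj->workers--;
	pthread_mutex_unlock(&obj->lock);
}

/* Invalidation: clear the field so the in-flight result is ignored. */
static void cancel_active_worker(struct fake_obj *obj)
{
	pthread_mutex_lock(&obj->lock);
	obj->work = NULL;
	obj->has_pages = 0;
	pthread_mutex_unlock(&obj->lock);
}
```

With this shape, a stale worker that finishes after a cancellation and after a newer worker has been installed leaves the newer worker's pointer intact, whereas the pre-patch code would have overwritten it with ERR_PTR(0), which is NULL.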