diff mbox series

[2/2] gpu/i915: use HMM mirror for userptr buffer object.

Message ID 20180910005736.5805-3-jglisse@redhat.com (mailing list archive)
State New, archived
Headers show
Series Getting rid of GUP and use HMM for user ptr features. | expand

Commit Message

Jerome Glisse Sept. 10, 2018, 12:57 a.m. UTC
From: Jérôme Glisse <jglisse@redhat.com>

This replace existing code that rely on get_user_page() aka GUP with
code that now use HMM mirror to mirror a range of virtual address as
a buffer object accessible by the GPU. There is no functional changes
from userspace point of view.

From kernel point of view we no longer pin pages for userptr buffer
object which is a welcome change (i am assuming that everyone dislike
page pin as i do).

Another change, from kernel point of view, is that it does no longer
have a fast path with get_user_pages_fast() this can eventually added
back through HMM.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
---
 drivers/gpu/drm/i915/i915_gem_userptr.c | 206 ++++++++++++------------
 1 file changed, 102 insertions(+), 104 deletions(-)

Comments

Joonas Lahtinen Nov. 22, 2018, 1:59 p.m. UTC | #1
Hi Jerome,

Bit late reply, but here goes :)

We're working quite hard to avoid pinning any pages unless they're in
the GPU page tables. And when they are in the GPU page tables, they must
be pinned for whole of that duration, for the reason that our GPUs can
not take a fault. And to avoid thrashing GPU page tables, we do leave
objects in page tables with the expectation that smart userspace
recycles buffers.

So what I understand of your proposal, it wouldn't really make a
difference for us in the amount of pinned pages (which I agree,
we'd love to see going down). When we're unable to take a fault,
the first use effectively forces us to pin any pages and keep them
pinned to avoid thrashing GPU page tables.

So from i915 perspective, it just seems to be mostly an exchange of
an API to an another for getting the pages. You already mentioned
the fast path is being worked on, which is an obvious difference.
But is there some other improvement one would be expecting, beyond
the page pinning?

Also, is the requirement for a single non-file-backed VMA in the
plans of being eliminated or is that inherent restriction of the
HMM_MIRROR feature? We're currently not imposing such a limitation.

Regards, Joonas

Quoting jglisse@redhat.com (2018-09-10 03:57:36)
> From: Jérôme Glisse <jglisse@redhat.com>
> 
> This replace existing code that rely on get_user_page() aka GUP with
> code that now use HMM mirror to mirror a range of virtual address as
> a buffer object accessible by the GPU. There is no functional changes
> from userspace point of view.
> 
> From kernel point of view we no longer pin pages for userptr buffer
> object which is a welcome change (i am assuming that everyone dislike
> page pin as i do).
> 
> Another change, from kernel point of view, is that it does no longer
> have a fast path with get_user_pages_fast() this can eventually added
> back through HMM.
> 
> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: intel-gfx@lists.freedesktop.org
Daniel Vetter Nov. 22, 2018, 2:31 p.m. UTC | #2
On Thu, Nov 22, 2018 at 03:59:50PM +0200, Joonas Lahtinen wrote:
> Hi Jerome,
> 
> Bit late reply, but here goes :)
> 
> We're working quite hard to avoid pinning any pages unless they're in
> the GPU page tables. And when they are in the GPU page tables, they must
> be pinned for whole of that duration, for the reason that our GPUs can
> not take a fault. And to avoid thrashing GPU page tables, we do leave
> objects in page tables with the expectation that smart userspace
> recycles buffers.
> 
> So what I understand of your proposal, it wouldn't really make a
> difference for us in the amount of pinned pages (which I agree,
> we'd love to see going down). When we're unable to take a fault,
> the first use effectively forces us to pin any pages and keep them
> pinned to avoid thrashing GPU page tables.
> 
> So from i915 perspective, it just seems to be mostly an exchange of
> an API to an another for getting the pages. You already mentioned
> the fast path is being worked on, which is an obvious difference.
> But is there some other improvement one would be expecting, beyond
> the page pinning?
> 
> Also, is the requirement for a single non-file-backed VMA in the
> plans of being eliminated or is that inherent restriction of the
> HMM_MIRROR feature? We're currently not imposing such a limitation.

I think a clear plus for HMM would be if this helps us fix the deadlocks
and races we're seeing. But I have no idea whether this gets us any closer
here or not.
-Daniel

> 
> Regards, Joonas
> 
> Quoting jglisse@redhat.com (2018-09-10 03:57:36)
> > From: Jérôme Glisse <jglisse@redhat.com>
> > 
> > This replace existing code that rely on get_user_page() aka GUP with
> > code that now use HMM mirror to mirror a range of virtual address as
> > a buffer object accessible by the GPU. There is no functional changes
> > from userspace point of view.
> > 
> > From kernel point of view we no longer pin pages for userptr buffer
> > object which is a welcome change (i am assuming that everyone dislike
> > page pin as i do).
> > 
> > Another change, from kernel point of view, is that it does no longer
> > have a fast path with get_user_pages_fast() this can eventually added
> > back through HMM.
> > 
> > Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> > Cc: Jani Nikula <jani.nikula@linux.intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Cc: intel-gfx@lists.freedesktop.org
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 5e09b654b5ad..378aab438ebd 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -464,7 +464,7 @@  __i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
 
 static int
 __i915_gem_userptr_set_active(struct drm_i915_gem_object *obj,
-			      bool value)
+			      struct hmm_range *range)
 {
 	int ret = 0;
 
@@ -486,86 +486,120 @@  __i915_gem_userptr_set_active(struct drm_i915_gem_object *obj,
 	/* In order to serialise get_pages with an outstanding
 	 * cancel_userptr, we must drop the struct_mutex and try again.
 	 */
-	if (!value)
+	if (range) {
+		if (!hmm_vma_range_done(range))
+			ret = -EAGAIN;
+		else
+			add_object(obj->userptr.mmu_object);
+	} else
 		del_object(obj->userptr.mmu_object);
-	else if (!work_pending(&obj->userptr.mmu_object->work))
-		add_object(obj->userptr.mmu_object);
-	else
-		ret = -EAGAIN;
 	spin_unlock(&obj->userptr.mmu_object->mirror->lock);
 #endif
 
 	return ret;
 }
 
-static void
-__i915_gem_userptr_get_pages_worker(struct work_struct *_work)
+static int
+i915_gem_userptr_map(struct drm_i915_gem_object *obj, bool try)
 {
-	struct get_pages_work *work = container_of(_work, typeof(*work), work);
-	struct drm_i915_gem_object *obj = work->obj;
-	const int npages = obj->base.size >> PAGE_SHIFT;
+#if defined(CONFIG_HMM_MIRROR)
+	static const uint64_t i915_range_flags[HMM_PFN_FLAG_MAX] = {
+		(1 << 0), /* HMM_PFN_VALID */
+		(1 << 1), /* HMM_PFN_WRITE */
+		0 /* HMM_PFN_DEVICE_PRIVATE */
+	};
+	static const uint64_t i915_range_values[HMM_PFN_VALUE_MAX] = {
+		0xfffffffffffffffeUL, /* HMM_PFN_ERROR */
+		0, /* HMM_PFN_NONE */
+		0xfffffffffffffffcUL /* HMM_PFN_SPECIAL */
+	};
+
+	const unsigned long npages = obj->base.size >> PAGE_SHIFT;
+	struct mm_struct *mm = obj->userptr.mm->mm;
+	struct sg_table *pages;
+	struct hmm_range range;
 	struct page **pvec;
-	int pinned, ret;
-
-	ret = -ENOMEM;
-	pinned = 0;
-
-	pvec = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
-	if (pvec != NULL) {
-		struct mm_struct *mm = obj->userptr.mm->mm;
-		unsigned int flags = 0;
-
-		if (!i915_gem_object_is_readonly(obj))
-			flags |= FOLL_WRITE;
-
-		ret = -EFAULT;
-		if (mmget_not_zero(mm)) {
-			down_read(&mm->mmap_sem);
-			while (pinned < npages) {
-				ret = get_user_pages_remote
-					(work->task, mm,
-					 obj->userptr.ptr + pinned * PAGE_SIZE,
-					 npages - pinned,
-					 flags,
-					 pvec + pinned, NULL, NULL);
-				if (ret < 0)
-					break;
-
-				pinned += ret;
-			}
-			up_read(&mm->mmap_sem);
-			mmput(mm);
-		}
+	unsigned long i;
+	bool write = !i915_gem_object_is_readonly(obj);
+	int err;
+
+	range.pfns = kvmalloc_array(npages, sizeof(uint64_t),
+				    try ? GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN : GFP_KERNEL);
+	if (range.pfns == NULL)
+		return try ? -EAGAIN : -ENOMEM;
+
+	range.pfn_shift = 12;
+	range.start = obj->userptr.ptr;
+	range.flags = i915_range_flags;
+	range.values = i915_range_values;
+	range.end = range.start + obj->base.size;
+
+	range.vma = find_vma(mm, range.start);
+	if (!range.vma || range.vma->vm_file || range.vma->vm_end < range.end) {
+		kvfree(range.pfns);
+		return -EFAULT;
 	}
 
-	mutex_lock(&obj->mm.lock);
-	if (obj->userptr.work == &work->work) {
-		struct sg_table *pages = ERR_PTR(ret);
-
-		if (pinned == npages) {
-			pages = __i915_gem_userptr_alloc_pages(obj, pvec,
-							       npages);
-			if (!IS_ERR(pages)) {
-				pinned = 0;
-				pages = NULL;
+again:
+	for (i = 0; i < npages; ++i) {
+		range.pfns[i] = range.flags[HMM_PFN_VALID];
+		range.pfns[i] |= write ? range.flags[HMM_PFN_WRITE] : 0;
+	}
+
+	err = hmm_vma_fault(&range, true);
+	if (err) {
+		kvfree(range.pfns);
+		return err;
+	}
+
+	for (i = 0, pvec = (void *)range.pfns; i < npages; ++i) {
+		struct page *page = hmm_pfn_to_page(&range, range.pfns[i]);
+
+		if (page == NULL) {
+			hmm_vma_range_done(&range);
+
+			if (try) {
+				kvfree(range.pfns);
+				return -EAGAIN;
 			}
+			goto again;
 		}
-
-		obj->userptr.work = ERR_CAST(pages);
-		if (IS_ERR(pages))
-			__i915_gem_userptr_set_active(obj, false);
+		pvec[i] = page;
 	}
-	mutex_unlock(&obj->mm.lock);
 
-	release_pages(pvec, pinned);
-	kvfree(pvec);
+	if (!try)
+		mutex_lock(&obj->mm.lock);
+	pages = __i915_gem_userptr_alloc_pages(obj, pvec, npages);
+	if (!IS_ERR(pages))
+		__i915_gem_userptr_set_active(obj, &range);
+	else
+		hmm_vma_range_done(&range);
+	if (!try)
+		mutex_unlock(&obj->mm.lock);
+
+	kvfree(range.pfns);
+	return PTR_ERR_OR_ZERO(pages);
+#else /* CONFIG_HMM_MIRROR */
+	return -EFAULT;
+#endif
+}
+
+static void
+__i915_gem_userptr_get_pages_worker(struct work_struct *_work)
+{
+	struct get_pages_work *work = container_of(_work, typeof(*work), work);
+	struct drm_i915_gem_object *obj = work->obj;
+	int ret;
+
+	ret = i915_gem_userptr_map(obj, false);
+	obj->userptr.work = ERR_PTR(ret);
 
 	i915_gem_object_put(obj);
 	put_task_struct(work->task);
 	kfree(work);
 }
 
-static struct sg_table *
+static int
 __i915_gem_userptr_get_pages_schedule(struct drm_i915_gem_object *obj)
 {
 	struct get_pages_work *work;
@@ -591,7 +625,7 @@  __i915_gem_userptr_get_pages_schedule(struct drm_i915_gem_object *obj)
 	 */
 	work = kmalloc(sizeof(*work), GFP_KERNEL);
 	if (work == NULL)
-		return ERR_PTR(-ENOMEM);
+		return -ENOMEM;
 
 	obj->userptr.work = &work->work;
 
@@ -603,17 +637,12 @@  __i915_gem_userptr_get_pages_schedule(struct drm_i915_gem_object *obj)
 	INIT_WORK(&work->work, __i915_gem_userptr_get_pages_worker);
 	queue_work(to_i915(obj->base.dev)->mm.userptr_wq, &work->work);
 
-	return ERR_PTR(-EAGAIN);
+	return -EAGAIN;
 }
 
 static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
 {
-	const int num_pages = obj->base.size >> PAGE_SHIFT;
-	struct mm_struct *mm = obj->userptr.mm->mm;
-	struct page **pvec;
-	struct sg_table *pages;
-	bool active;
-	int pinned;
+	int ret;
 
 	/* If userspace should engineer that these pages are replaced in
 	 * the vma between us binding this page into the GTT and completion
@@ -640,40 +669,10 @@  static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
 			return -EAGAIN;
 	}
 
-	pvec = NULL;
-	pinned = 0;
-
-	if (mm == current->mm) {
-		pvec = kvmalloc_array(num_pages, sizeof(struct page *),
-				      GFP_KERNEL |
-				      __GFP_NORETRY |
-				      __GFP_NOWARN);
-		if (pvec) /* defer to worker if malloc fails */
-			pinned = __get_user_pages_fast(obj->userptr.ptr,
-						       num_pages,
-						       !i915_gem_object_is_readonly(obj),
-						       pvec);
-	}
-
-	active = false;
-	if (pinned < 0) {
-		pages = ERR_PTR(pinned);
-		pinned = 0;
-	} else if (pinned < num_pages) {
-		pages = __i915_gem_userptr_get_pages_schedule(obj);
-		active = pages == ERR_PTR(-EAGAIN);
-	} else {
-		pages = __i915_gem_userptr_alloc_pages(obj, pvec, num_pages);
-		active = !IS_ERR(pages);
-	}
-	if (active)
-		__i915_gem_userptr_set_active(obj, true);
-
-	if (IS_ERR(pages))
-		release_pages(pvec, pinned);
-	kvfree(pvec);
-
-	return PTR_ERR_OR_ZERO(pages);
+	ret = i915_gem_userptr_map(obj, true);
+	if (ret == -EAGAIN)
+		ret = __i915_gem_userptr_get_pages_schedule(obj);
+	return ret;
 }
 
 static void
@@ -684,7 +683,7 @@  i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
 	struct page *page;
 
 	BUG_ON(obj->userptr.work != NULL);
-	__i915_gem_userptr_set_active(obj, false);
+	__i915_gem_userptr_set_active(obj, NULL);
 
 	if (obj->mm.madv != I915_MADV_WILLNEED)
 		obj->mm.dirty = false;
@@ -696,7 +695,6 @@  i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
 			set_page_dirty(page);
 
 		mark_page_accessed(page);
-		put_page(page);
 	}
 	obj->mm.dirty = false;