Message ID | 20191105090148.30269-1-daniel.vetter@ffwll.ch (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915: Switch obj->mm.lock lockdep annotations on its head |
On Tue, 5 Nov 2019 at 09:01, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> The trouble with having a plain nesting flag for locks which do not
> naturally nest (unlike block devices and their partitions, which is
> the original motivation for nesting levels) is that lockdep will
> never spot a true deadlock if you screw up.
>
> This patch is an attempt at trying better, by highlighting a bit more
> the actual nature of the nesting that's going on. Essentially we have
> two kinds of objects:
>
> - objects without pages allocated, which cannot be on any lru and are
>   hence inaccessible to the shrinker.
>
> - objects which have pages allocated, which are on an lru, and which
>   the shrinker can decide to throw out.
>
> For the former type of object, memory allcoations while holding
> obj->mm.lock are permissible. For the latter they are not. And
> get/put_pages transitions between the two types of objects.
>
> This is still not entirely fool-proof since the rules might chance.
> But as long as we run such a code ever at runtime lockdep should be
> able to observe the inconsistency and complain (like with any other
> lockdep class that we've split up in multiple classes). But there are
> a few clear benefits:
>
> - We can drop the nesting flag parameter from
>   __i915_gem_object_put_pages, because that function by definition is
>   never going allocate memory, and calling it on an object which
>   doesn't have its pages allocated would be a bug.
>
> - We strictly catch more bugs, since there's not only one place in the
>   entire tree which is annotated with the special class. All the
>   other places that had explicit lockdep nesting annotations we're now
>   going to leave up to lockdep again.
>
> - Specifically this catches stuff like calling get_pages from
>   put_pages (which isn't really a good idea, if we can call get_pages
>   so could the shrinker). I've seen patches do exactly that.
>
> Of course I fully expect CI will show me for the fool I am with this
> one here :-)
>
> v2: There can only be one (lockdep only has a cache for the first
> subclass, not for deeper ones, and we don't want to make these locks
> even slower). Still separate enums for better documentation.
>
> Real fix: don forget about phys objs and pin_map(), and fix the
> shrinker to have the right annotations ... silly me.
>
> v3: Forgot usertptr too ...
>
> v4: Improve comment for pages_pin_count, drop the IMPORTANT comment
> and instead prime lockdep (Chris).
>
> v5: Appease checkpatch, no double empty lines (Chris)
>
> v6: More rebasing over selftest changes. Also somehow I forgot to
> push this patch :-/
>
> Also format comments consistently while at it.
>
> v7: Fix typo in commit message (Joonas)
>
> Also drop the priming, with the lmem merge we now have allocations
> while holding the lmem lock, which wreaks the generic priming I've
> done in earlier patches. Should probably be resurrected when lmem is
> fixed. See
>
> commit 232a6ebae419193f5b8da4fa869ae5089ab105c2
> Author: Matthew Auld <matthew.auld@intel.com>
> Date:   Tue Oct 8 17:01:14 2019 +0100
>
>     drm/i915: introduce intel_memory_region
>
> I'm keeping the priming patch locally so it wont get lost.

Any idea how we can fix this? AFAIK for something like LMEM, its
objects are always marked as !shrinkable, and so shouldn't be
accessible from the shrinker.
On Tue, Nov 05, 2019 at 10:49:41AM +0000, Matthew Auld wrote:
> On Tue, 5 Nov 2019 at 09:01, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > [commit message snipped]
> >
> > I'm keeping the priming patch locally so it wont get lost.
>
> Any idea how we can fix this? AFAIK for something like LMEM, its
> objects are always marked as !shrinkable, and so shouldn't be
> accessible from the shrinker.

On one hand I don't think you need to fix this, since it works. Otoh I
think it's generally good practice to not allocate memory (or at least
be very conscious about it) when holding memory manager locks. Because
sooner or later you somehow create a dependency from one memory manager
to the next, or something else, and then you end up with the shrinker in
your dependencies.

In the locking rules for the new lmem locking we've discussed this a
bit, and agreed to just encode it as best practice. Including lockdep
priming (i.e. tell lockdep that we might get at mm_lock from fs_reclaim,
to make sure no one can allocate anything while holding mm_lock).

Wrt fixing: preallocate, then take lock, is the standard pattern.
-Daniel
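[To make the priming idiom concrete: a minimal sketch, assuming a kernel
built with CONFIG_LOCKDEP. prime_lockdep_reclaim() and its mm_lock
parameter are illustrative names, not existing i915 code, though
fs_reclaim_acquire()/fs_reclaim_release() are the real annotation hooks
from <linux/sched/mm.h>. Run once at init, this records the
fs_reclaim -> mm_lock dependency, so any later allocation attempted
while mm_lock is held becomes a lockdep splat instead of a rare runtime
deadlock.]

#include <linux/gfp.h>
#include <linux/mutex.h>
#include <linux/sched/mm.h>

/*
 * Hypothetical helper: teach lockdep that @mm_lock may be taken from
 * within direct reclaim. The lock/unlock pair under the fake
 * fs_reclaim context exists only to record the dependency.
 */
static void prime_lockdep_reclaim(struct mutex *mm_lock)
{
	if (IS_ENABLED(CONFIG_LOCKDEP)) {
		fs_reclaim_acquire(GFP_KERNEL);
		mutex_lock(mm_lock);
		mutex_unlock(mm_lock);
		fs_reclaim_release(GFP_KERNEL);
	}
}

[The "preallocate, then take lock" alternative needs no annotation at
all: the allocation simply happens before the lock is acquired, so the
lock can never end up nested inside fs_reclaim.]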
Just some nits/typos that made this a little difficult for me to read.
I am still trying to understand what is going on, so unfortunately I
have no comments on the patch.

>-----Original Message-----
>From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Daniel Vetter
>Sent: Tuesday, November 5, 2019 4:02 AM
>To: Intel Graphics Development <intel-gfx@lists.freedesktop.org>
>Cc: Daniel Vetter <daniel.vetter@ffwll.ch>; Auld, Matthew <matthew.auld@intel.com>; Vetter, Daniel <daniel.vetter@intel.com>
>Subject: [Intel-gfx] [PATCH] drm/i915: Switch obj->mm.lock lockdep annotations on its head
>
>The trouble with having a plain nesting flag for locks which do not
>naturally nest (unlike block devices and their partitions, which is
>the original motivation for nesting levels) is that lockdep will
>never spot a true deadlock if you screw up.
>
>This patch is an attempt at trying better, by highlighting a bit more

"a bit more of the"

>the actual nature of the nesting that's going on. Essentially we have
>two kinds of objects:
>
>- objects without pages allocated, which cannot be on any lru and are
>  hence inaccessible to the shrinker.
>
>- objects which have pages allocated, which are on an lru, and which
>  the shrinker can decide to throw out.
>
>For the former type of object, memory allcoations while holding

s/allcoations/allocations

>obj->mm.lock are permissible. For the latter they are not. And
>get/put_pages transitions between the two types of objects.

I am not sure what the sentence, "And get/put_page transitions between
the two types of objects." means. Can you clarify?

>
>This is still not entirely fool-proof since the rules might chance.

s/chance/change/

>But as long as we run such a code ever at runtime lockdep should be
>able to observe the inconsistency and complain (like with any other

I am having difficulty with "But as long as we run such a code ever
at". Should this be, "With this code, runtime lockdep should be able
to..."?

>lockdep class that we've split up in multiple classes). But there are
>a few clear benefits:
>
>- We can drop the nesting flag parameter from
>  __i915_gem_object_put_pages, because that function by definition is
>  never going allocate memory, and calling it on an object which
>  doesn't have its pages allocated would be a bug.
>
>- We strictly catch more bugs, since there's not only one place in the
>  entire tree which is annotated with the special class. All the
>  other places that had explicit lockdep nesting annotations we're now
>  going to leave up to lockdep again.
>
>- Specifically this catches stuff like calling get_pages from
>  put_pages (which isn't really a good idea, if we can call get_pages
>  so could the shrinker). I've seen patches do exactly that.
>
>Of course I fully expect CI will show me for the fool I am with this
>one here :-)
>
>v2: There can only be one (lockdep only has a cache for the first
>subclass, not for deeper ones, and we don't want to make these locks
>even slower). Still separate enums for better documentation.
>
>Real fix: don forget about phys objs and pin_map(), and fix the
>shrinker to have the right annotations ... silly me.

s/don/don't/

Thanks,
Mike
> -----Original Message-----
> From: Daniel Vetter <daniel.vetter@ffwll.ch>
> Sent: Tuesday, November 5, 2019 2:02 AM
> To: Intel Graphics Development <intel-gfx@lists.freedesktop.org>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>; Auld, Matthew <matthew.auld@intel.com>; Chris Wilson <chris@chris-wilson.co.uk>; Tang, CQ <cq.tang@intel.com>; Ursulin, Tvrtko <tvrtko.ursulin@intel.com>; Joonas Lahtinen <joonas.lahtinen@linux.intel.com>; Vetter, Daniel <daniel.vetter@intel.com>
> Subject: [PATCH] drm/i915: Switch obj->mm.lock lockdep annotations on its head
>
> [commit message snipped]
>
> - Specifically this catches stuff like calling get_pages from
>   put_pages (which isn't really a good idea, if we can call get_pages
>   so could the shrinker). I've seen patches do exactly that.

If we don't allow get_pages from put_pages, then we need to think of a
new way to swap the pages freed by put_pages. In the lmem swapping case,
put_pages can't just free the pages, it needs to save the pages to
somewhere else.

The saving operation requires calling get_pages because we need temp
objects for the blitter engine to do the copying.

Can we use another thread to do the async copying?

--CQ
On Tue, Nov 5, 2019 at 7:38 PM Tang, CQ <cq.tang@intel.com> wrote:
> > - Specifically this catches stuff like calling get_pages from
> >   put_pages (which isn't really a good idea, if we can call get_pages
> >   so could the shrinker). I've seen patches do exactly that.
>
> If we don't allow get_pages from put_pages, then we need to think of a
> new way to swap the pages freed by put_pages. In the lmem swapping
> case, put_pages can't just free the pages, it needs to save the pages
> to somewhere else.
>
> The saving operation requires calling get_pages because we need temp
> objects for the blitter engine to do the copying.
>
> Can we use another thread to do the async copying?

Nah, it's a lot simpler:
- roll out dma_resv locking
- remove the obj->mm.lock locking for lmem

With ww_mutex you can nest however you want to, as long as there's no
other locks in between it's all going to work out. Some of the recently
added new locks (like the vma->mutex or whatever it was exactly) might
also need to be switched over. But that's details we still need to
figure out.

Only downside to all this is it's a lot of work to switch the locking
around.
-Daniel
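[A sketch of the dma_resv/ww_mutex pattern Daniel is pointing at,
assuming each object's lock becomes its dma_resv lock; lock_resv_pair()
is an illustrative helper, not existing i915 code. Instead of a fixed
nesting hierarchy with lockdep subclass annotations, a ww_acquire_ctx
lets any number of object locks be taken in any order: -EDEADLK tells
the losing context to back off and retry.]

#include <linux/dma-resv.h>

/*
 * Illustrative only: lock the dma_resv locks of two objects, e.g. the
 * source object and a temporary blitter object for lmem swap-out.
 */
static int lock_resv_pair(struct dma_resv *a, struct dma_resv *b)
{
	struct ww_acquire_ctx ctx;
	int err;

	ww_acquire_init(&ctx, &reservation_ww_class);

	err = dma_resv_lock(a, &ctx);
	if (err)
		goto out_fini;

	err = dma_resv_lock(b, &ctx);
	if (err == -EDEADLK) {
		/*
		 * An older context holds b: drop a, sleep on b via the
		 * slow path, then retake a. A complete implementation
		 * loops until no -EDEADLK is returned.
		 */
		dma_resv_unlock(a);
		dma_resv_lock_slow(b, &ctx);
		err = dma_resv_lock(a, &ctx);
		if (err) {
			dma_resv_unlock(b);
			goto out_fini;
		}
	} else if (err) {
		dma_resv_unlock(a);
		goto out_fini;
	}
	ww_acquire_done(&ctx);

	/* ... both objects locked: copy/swap pages here ... */

	dma_resv_unlock(b);
	dma_resv_unlock(a);
out_fini:
	ww_acquire_fini(&ctx);
	return err;
}

[This is roughly why put_pages needing get_pages stops being a layering
problem under dma_resv: both ends of the copy are locked inside one
acquire context, rather than one lock nesting inside the other.]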
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index a50296cce0d8..db103d3c8760 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -22,6 +22,8 @@
  *
  */
 
+#include <linux/sched/mm.h>
+
 #include "display/intel_frontbuffer.h"
 #include "gt/intel_gt.h"
 #include "i915_drv.h"
@@ -186,7 +188,7 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 		GEM_BUG_ON(!list_empty(&obj->lut_list));
 
 		atomic_set(&obj->mm.pages_pin_count, 0);
-		__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+		__i915_gem_object_put_pages(obj);
 		GEM_BUG_ON(i915_gem_object_has_pages(obj));
 		bitmap_free(obj->bit_17);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 458cd51331f1..edaf7126a84d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -319,11 +319,22 @@ i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 
 enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock/struct_mutex */
 	I915_MM_NORMAL = 0,
-	I915_MM_SHRINKER /* called "recursively" from direct-reclaim-esque */
+	/*
+	 * Only used by struct_mutex, when called "recursively" from
+	 * direct-reclaim-esque. Safe because there is only every one
+	 * struct_mutex in the entire system.
+	 */
+	I915_MM_SHRINKER = 1,
+	/*
+	 * Used for obj->mm.lock when allocating pages. Safe because the object
+	 * isn't yet on any LRU, and therefore the shrinker can't deadlock on
+	 * it. As soon as the object has pages, obj->mm.lock nests within
+	 * fs_reclaim.
+	 */
+	I915_MM_GET_PAGES = 1,
 };
 
-int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
-				enum i915_mm_subclass subclass);
+int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
 void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
 void i915_gem_object_writeback(struct drm_i915_gem_object *obj);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 96008374a412..15f8297dc34e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -162,7 +162,11 @@ struct drm_i915_gem_object {
 	atomic_t bind_count;
 
 	struct {
-		struct mutex lock; /* protects the pages and their use */
+		/*
+		 * Protects the pages and their use. Do not use directly, but
+		 * instead go through the pin/unpin interfaces.
+		 */
+		struct mutex lock;
 		atomic_t pages_pin_count;
 		atomic_t shrink_pin;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 29f4c2850745..f402c2c415c2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -106,7 +106,7 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 {
 	int err;
 
-	err = mutex_lock_interruptible(&obj->mm.lock);
+	err = mutex_lock_interruptible_nested(&obj->mm.lock, I915_MM_GET_PAGES);
 	if (err)
 		return err;
 
@@ -190,8 +190,7 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
 	return pages;
 }
 
-int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
-				enum i915_mm_subclass subclass)
+int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 {
 	struct sg_table *pages;
 	int err;
@@ -202,7 +201,7 @@ int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
 	GEM_BUG_ON(atomic_read(&obj->bind_count));
 
 	/* May be called by shrinker from within get_pages() (on another bo) */
-	mutex_lock_nested(&obj->mm.lock, subclass);
+	mutex_lock(&obj->mm.lock);
 	if (unlikely(atomic_read(&obj->mm.pages_pin_count))) {
 		err = -EBUSY;
 		goto unlock;
@@ -308,7 +307,7 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
 	if (!i915_gem_object_type_has(obj, flags))
 		return ERR_PTR(-ENXIO);
 
-	err = mutex_lock_interruptible(&obj->mm.lock);
+	err = mutex_lock_interruptible_nested(&obj->mm.lock, I915_MM_GET_PAGES);
 	if (err)
 		return ERR_PTR(err);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 8043ff63d73f..b1b7c1b3038a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -164,7 +164,7 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 	if (err)
 		return err;
 
-	mutex_lock(&obj->mm.lock);
+	mutex_lock_nested(&obj->mm.lock, I915_MM_GET_PAGES);
 
 	if (obj->mm.madv != I915_MADV_WILLNEED) {
 		err = -EFAULT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index fd3ce6da8497..066b3df677e8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -57,7 +57,7 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj,
 		flags = I915_GEM_OBJECT_UNBIND_ACTIVE;
 
 	if (i915_gem_object_unbind(obj, flags) == 0)
-		__i915_gem_object_put_pages(obj, I915_MM_SHRINKER);
+		__i915_gem_object_put_pages(obj);
 
 	return !i915_gem_object_has_pages(obj);
 }
@@ -209,8 +209,7 @@ i915_gem_shrink(struct drm_i915_private *i915,
 
 			if (unsafe_drop_pages(obj, shrink)) {
 				/* May arrive from get_pages on another bo */
-				mutex_lock_nested(&obj->mm.lock,
-						  I915_MM_SHRINKER);
+				mutex_lock(&obj->mm.lock);
 				if (!i915_gem_object_has_pages(obj)) {
 					try_to_writeback(obj, shrink);
 					count += obj->base.size >> PAGE_SHIFT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 1e045c337044..ee65c6acf0e2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -131,7 +131,7 @@ userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
 		ret = i915_gem_object_unbind(obj,
 					     I915_GEM_OBJECT_UNBIND_ACTIVE);
 		if (ret == 0)
-			ret = __i915_gem_object_put_pages(obj, I915_MM_SHRINKER);
+			ret = __i915_gem_object_put_pages(obj);
 		i915_gem_object_put(obj);
 		if (ret)
 			return ret;
@@ -483,7 +483,7 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
 		}
 	}
 
-	mutex_lock(&obj->mm.lock);
+	mutex_lock_nested(&obj->mm.lock, I915_MM_GET_PAGES);
 	if (obj->userptr.work == &work->work) {
 		struct sg_table *pages = ERR_PTR(ret);
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 688c49a24f32..5c9583349077 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -517,7 +517,7 @@ static int igt_mock_memory_region_huge_pages(void *arg)
 			i915_vma_unpin(vma);
 			i915_vma_close(vma);
 
-			__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+			__i915_gem_object_put_pages(obj);
 			i915_gem_object_put(obj);
 		}
 	}
@@ -650,7 +650,7 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg)
 		i915_vma_close(vma);
 
 		i915_gem_object_unpin_pages(obj);
-		__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+		__i915_gem_object_put_pages(obj);
 		i915_gem_object_put(obj);
 	}
 
@@ -678,7 +678,7 @@ static void close_object_list(struct list_head *objects,
 
 		list_del(&obj->st_link);
 		i915_gem_object_unpin_pages(obj);
-		__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+		__i915_gem_object_put_pages(obj);
 		i915_gem_object_put(obj);
 	}
 }
@@ -948,7 +948,7 @@ static int igt_mock_ppgtt_64K(void *arg)
 			i915_vma_close(vma);
 
 			i915_gem_object_unpin_pages(obj);
-			__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+			__i915_gem_object_put_pages(obj);
 			i915_gem_object_put(obj);
 		}
 	}
@@ -1301,7 +1301,7 @@ static int igt_ppgtt_exhaust_huge(void *arg)
 			}
 
 			i915_gem_object_unpin_pages(obj);
-			__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+			__i915_gem_object_put_pages(obj);
 			i915_gem_object_put(obj);
 		}
 	}
@@ -1442,7 +1442,7 @@ static int igt_ppgtt_smoke_huge(void *arg)
 	}
 out_unpin:
 	i915_gem_object_unpin_pages(obj);
-	__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+	__i915_gem_object_put_pages(obj);
 out_put:
 	i915_gem_object_put(obj);
 
@@ -1530,7 +1530,7 @@ static int igt_ppgtt_sanity_check(void *arg)
 		err = igt_write_huge(ctx, obj);
 
 		i915_gem_object_unpin_pages(obj);
-		__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+		__i915_gem_object_put_pages(obj);
 		i915_gem_object_put(obj);
 
 		if (err) {
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index 19e1cca8f143..95d609abd39b 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -32,7 +32,7 @@ static void close_objects(struct intel_memory_region *mem,
 		if (i915_gem_object_has_pinned_pages(obj))
 			i915_gem_object_unpin_pages(obj);
 		/* No polluting the memory region between tests */
-		__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+		__i915_gem_object_put_pages(obj);
 		list_del(&obj->st_link);
 		i915_gem_object_put(obj);
 	}
@@ -122,7 +122,7 @@ igt_object_create(struct intel_memory_region *mem,
 static void igt_object_release(struct drm_i915_gem_object *obj)
 {
 	i915_gem_object_unpin_pages(obj);
-	__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+	__i915_gem_object_put_pages(obj);
 	list_del(&obj->st_link);
 	i915_gem_object_put(obj);
 }