diff mbox series

drm/i915/ttm: Fix illegal addition to shrinker list

Message ID 20211110085527.1033475-1-thomas.hellstrom@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series drm/i915/ttm: Fix illegal addition to shrinker list | expand

Commit Message

Thomas Hellström Nov. 10, 2021, 8:55 a.m. UTC
There's a small window of opportunity during which the adjust_lru()
function can be called with a GEM refcount of zero from the TTM
eviction code. This results in a kernel BUG().

Ensure that we don't attempt to modify the GEM shrinker lists unless
we have a GEM refcount.

Fixes: ebd4a8ec7799 ("drm/i915/ttm: move shrinker management into adjust_lru")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 31 +++++++++++++++++--------
 1 file changed, 21 insertions(+), 10 deletions(-)

Comments

Matthew Auld Nov. 10, 2021, 10:29 a.m. UTC | #1
On 10/11/2021 08:55, Thomas Hellström wrote:
> There's a small window of opportunity during which the adjust_lru()
> function can be called with a GEM refcount of zero from the TTM
> eviction code. This results in a kernel BUG().
> 
> Ensure that we don't attempt to modify the GEM shrinker lists unless
> we have a GEM refcount.
> 
> Fixes: ebd4a8ec7799 ("drm/i915/ttm: move shrinker management into adjust_lru")
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Thomas Hellström Nov. 10, 2021, 10:31 a.m. UTC | #2
On 11/10/21 11:29, Matthew Auld wrote:
> On 10/11/2021 08:55, Thomas Hellström wrote:
>> There's a small window of opportunity during which the adjust_lru()
>> function can be called with a GEM refcount of zero from the TTM
>> eviction code. This results in a kernel BUG().
>>
>> Ensure that we don't attempt to modify the GEM shrinker lists unless
>> we have a GEM refcount.
>>
>> Fixes: ebd4a8ec7799 ("drm/i915/ttm: move shrinker management into 
>> adjust_lru")
>> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>


Thanks for reviewing, Matt.

/Thomas
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index e98503c0830b..68cfe6e9ceab 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -771,18 +771,29 @@  void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
 	 *
 	 * TODO: consider maybe also bumping the shrinker list here when we have
 	 * already unpinned it, which should give us something more like an LRU.
+	 *
+	 * TODO: There is a small window of opportunity for this function to
+	 * get called from eviction after we've dropped the last GEM refcount,
+	 * but before the TTM deleted flag is set on the object. Avoid
+	 * adjusting the shrinker list in such cases, since the object is
+	 * not available to the shrinker anyway due to its zero refcount.
+	 * To fix this properly we should move to a TTM shrinker LRU list for
+	 * these objects.
 	 */
-	if (shrinkable != obj->mm.ttm_shrinkable) {
-		if (shrinkable) {
-			if (obj->mm.madv == I915_MADV_WILLNEED)
-				__i915_gem_object_make_shrinkable(obj);
-			else
-				__i915_gem_object_make_purgeable(obj);
-		} else {
-			i915_gem_object_make_unshrinkable(obj);
+	if (kref_get_unless_zero(&obj->base.refcount)) {
+		if (shrinkable != obj->mm.ttm_shrinkable) {
+			if (shrinkable) {
+				if (obj->mm.madv == I915_MADV_WILLNEED)
+					__i915_gem_object_make_shrinkable(obj);
+				else
+					__i915_gem_object_make_purgeable(obj);
+			} else {
+				i915_gem_object_make_unshrinkable(obj);
+			}
+
+			obj->mm.ttm_shrinkable = shrinkable;
 		}
-
-		obj->mm.ttm_shrinkable = shrinkable;
+		i915_gem_object_put(obj);
 	}
 
 	/*