From patchwork Wed Jan 9 14:12:46 2019
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 10754249
From: Tvrtko Ursulin
To: Intel-gfx@lists.freedesktop.org
Date: Wed, 9 Jan 2019 14:12:46 +0000
Message-Id: <20190109141247.32166-1-tvrtko.ursulin@linux.intel.com>
Subject: [Intel-gfx] [PATCH 1/2] drm/i915: Reduce recursive mutex locking from the shrinker

In two codepaths internal to the shrinker we know we will end up taking
the recursive mutex path. It feels more elegant to avoid this altogether
and not call mutex_trylock_recursive in those cases.

We achieve this by extracting the shrinking part into __i915_gem_shrink,
wrapped by a struct_mutex-taking i915_gem_shrink. At the same time, move
the taking of the runtime pm reference so it always happens in the usual
order, before struct_mutex.

v2:
 * Don't use flags but split i915_gem_shrink into locked and unlocked.

v3:
 * Whitespace and checkpatch reported errors.

v4:
 * Rebase.

Signed-off-by: Tvrtko Ursulin
---
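Note for reviewers who want the locking pattern in isolation: the patch
follows the common "locked helper plus locking wrapper" split. Below is a
minimal, hypothetical userspace sketch of that pattern using pthreads --
none of these names exist in the driver, and the real code uses
struct_mutex with lockdep_assert_held() as shown in the diff further down.

#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long pool_pages = 1024;

/* Caller must hold @lock; stands in for __i915_gem_shrink(). */
static unsigned long __shrink_locked(unsigned long target)
{
	unsigned long freed = target < pool_pages ? target : pool_pages;

	pool_pages -= freed;
	return freed;
}

/* Public entry point; stands in for i915_gem_shrink(). */
static unsigned long shrink(unsigned long target)
{
	unsigned long freed;

	pthread_mutex_lock(&lock);
	freed = __shrink_locked(target);
	pthread_mutex_unlock(&lock);

	return freed;
}

int main(void)
{
	/* External callers take the lock through the wrapper... */
	assert(shrink(100) == 100);

	/*
	 * ...while internal callers that already hold the lock call the
	 * helper directly and never hit the recursive-locking path.
	 */
	pthread_mutex_lock(&lock);
	assert(__shrink_locked(50) == 50);
	pthread_mutex_unlock(&lock);

	return 0;
}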
 drivers/gpu/drm/i915/i915_drv.h          |   2 +-
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 152 +++++++++++++----------
 2 files changed, 84 insertions(+), 70 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d57438d87bc0..677a9af3771b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3178,7 +3178,7 @@ i915_gem_object_create_internal(struct drm_i915_private *dev_priv,
 unsigned long i915_gem_shrink(struct drm_i915_private *i915,
			      unsigned long target,
			      unsigned long *nr_scanned,
-			      unsigned flags);
+			      unsigned int flags);
 #define I915_SHRINK_PURGEABLE 0x1
 #define I915_SHRINK_UNBOUND 0x2
 #define I915_SHRINK_BOUND 0x4
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index 72d6ea0cac7e..ce539d38461c 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -115,36 +115,11 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj)
 	return !i915_gem_object_has_pages(obj);
 }
 
-/**
- * i915_gem_shrink - Shrink buffer object caches
- * @i915: i915 device
- * @target: amount of memory to make available, in pages
- * @nr_scanned: optional output for number of pages scanned (incremental)
- * @flags: control flags for selecting cache types
- *
- * This function is the main interface to the shrinker. It will try to release
- * up to @target pages of main memory backing storage from buffer objects.
- * Selection of the specific caches can be done with @flags. This is e.g. useful
- * when purgeable objects should be removed from caches preferentially.
- *
- * Note that it's not guaranteed that released amount is actually available as
- * free system memory - the pages might still be in-used to due to other reasons
- * (like cpu mmaps) or the mm core has reused them before we could grab them.
- * Therefore code that needs to explicitly shrink buffer objects caches (e.g. to
- * avoid deadlocks in memory reclaim) must fall back to i915_gem_shrink_all().
- *
- * Also note that any kind of pinning (both per-vma address space pins and
- * backing storage pins at the buffer object level) result in the shrinker code
- * having to skip the object.
- *
- * Returns:
- * The number of pages of backing storage actually released.
- */
-unsigned long
-i915_gem_shrink(struct drm_i915_private *i915,
-		unsigned long target,
-		unsigned long *nr_scanned,
-		unsigned flags)
+static unsigned long
+__i915_gem_shrink(struct drm_i915_private *i915,
+		  unsigned long target,
+		  unsigned long *nr_scanned,
+		  unsigned int flags)
 {
 	const struct {
 		struct list_head *list;
@@ -156,10 +131,8 @@ i915_gem_shrink(struct drm_i915_private *i915,
 	}, *phase;
 	unsigned long count = 0;
 	unsigned long scanned = 0;
-	bool unlock;
 
-	if (!shrinker_lock(i915, flags, &unlock))
-		return 0;
+	lockdep_assert_held(&i915->drm.struct_mutex);
 
 	/*
 	 * When shrinking the active list, also consider active contexts.
@@ -175,18 +148,8 @@ i915_gem_shrink(struct drm_i915_private *i915,
				       I915_WAIT_LOCKED,
				       MAX_SCHEDULE_TIMEOUT);
 
-	trace_i915_gem_shrink(i915, target, flags);
 	i915_retire_requests(i915);
 
-	/*
-	 * Unbinding of objects will require HW access; Let us not wake the
-	 * device just to recover a little memory. If absolutely necessary,
-	 * we will force the wake during oom-notifier.
-	 */
-	if ((flags & I915_SHRINK_BOUND) &&
-	    !intel_runtime_pm_get_if_in_use(i915))
-		flags &= ~I915_SHRINK_BOUND;
-
 	/*
 	 * As we may completely rewrite the (un)bound list whilst unbinding
 	 * (due to retiring requests) we have to strictly process only
@@ -265,15 +228,70 @@ i915_gem_shrink(struct drm_i915_private *i915,
 		spin_unlock(&i915->mm.obj_lock);
 	}
 
-	if (flags & I915_SHRINK_BOUND)
-		intel_runtime_pm_put(i915);
-
 	i915_retire_requests(i915);
 
-	shrinker_unlock(i915, unlock);
-
 	if (nr_scanned)
 		*nr_scanned += scanned;
+
+	return count;
+}
+
+/**
+ * i915_gem_shrink - Shrink buffer object caches
+ * @i915: i915 device
+ * @target: amount of memory to make available, in pages
+ * @nr_scanned: optional output for number of pages scanned (incremental)
+ * @flags: control flags for selecting cache types
+ *
+ * This function is the main interface to the shrinker. It will try to release
+ * up to @target pages of main memory backing storage from buffer objects.
+ * Selection of the specific caches can be done with @flags. This is e.g. useful
+ * when purgeable objects should be removed from caches preferentially.
+ *
+ * Note that it's not guaranteed that released amount is actually available as
+ * free system memory - the pages might still be in-used to due to other reasons
+ * (like cpu mmaps) or the mm core has reused them before we could grab them.
+ * Therefore code that needs to explicitly shrink buffer objects caches (e.g. to
+ * avoid deadlocks in memory reclaim) must fall back to i915_gem_shrink_all().
+ *
+ * Also note that any kind of pinning (both per-vma address space pins and
+ * backing storage pins at the buffer object level) result in the shrinker code
+ * having to skip the object.
+ *
+ * Returns:
+ * The number of pages of backing storage actually released.
+ */
+unsigned long
+i915_gem_shrink(struct drm_i915_private *i915,
+		unsigned long target,
+		unsigned long *nr_scanned,
+		unsigned int flags)
+{
+	unsigned long count = 0;
+	bool unlock;
+
+	trace_i915_gem_shrink(i915, target, flags);
+
+	/*
+	 * Unbinding of objects will require HW access; Let us not wake the
+	 * device just to recover a little memory. If absolutely necessary,
+	 * we will force the wake during oom-notifier.
+	 */
+	if ((flags & I915_SHRINK_BOUND) &&
+	    !intel_runtime_pm_get_if_in_use(i915))
+		flags &= ~I915_SHRINK_BOUND;
+
+	if (!shrinker_lock(i915, flags, &unlock))
+		goto out;
+
+	count = __i915_gem_shrink(i915, target, nr_scanned, flags);
+
+	shrinker_unlock(i915, unlock);
+
+out:
+	if (flags & I915_SHRINK_BOUND)
+		intel_runtime_pm_put(i915);
+
 	return count;
 }
 
@@ -350,6 +368,7 @@ i915_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 {
 	struct drm_i915_private *i915 =
		container_of(shrinker, struct drm_i915_private, mm.shrinker);
+	const unsigned int flags = I915_SHRINK_BOUND | I915_SHRINK_UNBOUND;
 	unsigned long freed;
 	bool unlock;
 
@@ -358,26 +377,21 @@ i915_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 	if (!shrinker_lock(i915, 0, &unlock))
 		return SHRINK_STOP;
 
-	freed = i915_gem_shrink(i915,
-				sc->nr_to_scan,
-				&sc->nr_scanned,
-				I915_SHRINK_BOUND |
-				I915_SHRINK_UNBOUND |
-				I915_SHRINK_PURGEABLE);
+	freed = __i915_gem_shrink(i915,
+				  sc->nr_to_scan,
+				  &sc->nr_scanned,
+				  flags | I915_SHRINK_PURGEABLE);
 	if (sc->nr_scanned < sc->nr_to_scan)
-		freed += i915_gem_shrink(i915,
-					 sc->nr_to_scan - sc->nr_scanned,
-					 &sc->nr_scanned,
-					 I915_SHRINK_BOUND |
-					 I915_SHRINK_UNBOUND);
+		freed += __i915_gem_shrink(i915,
+					   sc->nr_to_scan - sc->nr_scanned,
+					   &sc->nr_scanned,
+					   flags);
 	if (sc->nr_scanned < sc->nr_to_scan && current_is_kswapd()) {
 		intel_runtime_pm_get(i915);
-		freed += i915_gem_shrink(i915,
-					 sc->nr_to_scan - sc->nr_scanned,
-					 &sc->nr_scanned,
-					 I915_SHRINK_ACTIVE |
-					 I915_SHRINK_BOUND |
-					 I915_SHRINK_UNBOUND);
+		freed += __i915_gem_shrink(i915,
+					   sc->nr_to_scan - sc->nr_scanned,
+					   &sc->nr_scanned,
+					   flags | I915_SHRINK_ACTIVE);
 		intel_runtime_pm_put(i915);
 	}
 
@@ -475,10 +489,10 @@ i915_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr)
 		goto out;
 
 	intel_runtime_pm_get(i915);
-	freed_pages += i915_gem_shrink(i915, -1UL, NULL,
-				       I915_SHRINK_BOUND |
-				       I915_SHRINK_UNBOUND |
-				       I915_SHRINK_VMAPS);
+	freed_pages += __i915_gem_shrink(i915, -1UL, NULL,
+					 I915_SHRINK_BOUND |
+					 I915_SHRINK_UNBOUND |
+					 I915_SHRINK_VMAPS);
 	intel_runtime_pm_put(i915);
 
 	/* We also want to clear any cached iomaps as they wrap vmap */
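A closing illustration of the staged reclaim in i915_gem_shrinker_scan()
above: each pass widens the set of eligible objects only when the previous
pass did not scan enough. The sketch below is a hypothetical standalone C
program; do_shrink() is a stand-in for __i915_gem_shrink() and the flag
values are illustrative, not the driver's.

#include <stdio.h>

#define SHRINK_PURGEABLE (1u << 0)
#define SHRINK_BOUND     (1u << 1)
#define SHRINK_UNBOUND   (1u << 2)
#define SHRINK_ACTIVE    (1u << 3)

/* Pretend each pass can scan (and free) at most 10 pages. */
static unsigned long do_shrink(unsigned long target,
			       unsigned long *nr_scanned,
			       unsigned int flags)
{
	unsigned long scanned = target < 10 ? target : 10;

	(void)flags;		/* selection logic elided in this sketch */
	*nr_scanned += scanned;
	return scanned;
}

int main(void)
{
	const unsigned int flags = SHRINK_BOUND | SHRINK_UNBOUND;
	unsigned long nr_to_scan = 25, nr_scanned = 0, freed;

	/* Pass 1: cheapest victims first - purgeable objects only. */
	freed = do_shrink(nr_to_scan, &nr_scanned,
			  flags | SHRINK_PURGEABLE);

	/* Pass 2: widen to all bound/unbound objects if still short. */
	if (nr_scanned < nr_to_scan)
		freed += do_shrink(nr_to_scan - nr_scanned, &nr_scanned,
				   flags);

	/*
	 * Pass 3: last resort (kswapd in the driver) - shrink active
	 * objects too, which may require waking the device.
	 */
	if (nr_scanned < nr_to_scan)
		freed += do_shrink(nr_to_scan - nr_scanned, &nr_scanned,
				   flags | SHRINK_ACTIVE);

	printf("freed %lu pages after scanning %lu\n", freed, nr_scanned);
	return 0;
}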