From patchwork Tue Dec 17 23:28:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11299069 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B507E1580 for ; Tue, 17 Dec 2019 23:28:54 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9396521582 for ; Tue, 17 Dec 2019 23:28:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9396521582 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6D8E56E1CF; Tue, 17 Dec 2019 23:28:52 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2E2DF6E1CF for ; Tue, 17 Dec 2019 23:28:51 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 19615199-1500050 for multiple; Tue, 17 Dec 2019 23:28:39 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Tue, 17 Dec 2019 23:28:37 +0000 Message-Id: <20191217232837.3411664-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20191217230740.3399493-1-chris@chris-wilson.co.uk> References: <20191217230740.3399493-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH] drm/i915: Ratelimit i915_globals_park X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" When doing our global park, we like to be a good citizen and shrink our slab caches (of which we have quite a few now), but each kmem_cache_shrink() incurs a stop_machine() and so ends up being quite expensive, causing machine-wide stalls. While ideally we would like to throw away unused pages in our slab caches whenever it appears that we are idling, doing so will require a much cheaper mechanism. In the meantime use a delayed worked to impose a rate-limit that means we have to have been idle for more than 2 seconds before we start shrinking. References: https://gitlab.freedesktop.org/drm/intel/issues/848 Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_globals.c | 31 +++++++++++++++++++++++------ 1 file changed, 25 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_globals.c b/drivers/gpu/drm/i915/i915_globals.c index be127cd28931..06d2be2daca2 100644 --- a/drivers/gpu/drm/i915/i915_globals.c +++ b/drivers/gpu/drm/i915/i915_globals.c @@ -20,7 +20,8 @@ static LIST_HEAD(globals); static atomic_t active; static atomic_t epoch; static struct park_work { - struct rcu_work work; + struct delayed_work work; + struct rcu_head rcu; int epoch; } park; @@ -37,11 +38,27 @@ static void i915_globals_shrink(void) global->shrink(); } +static void __i915_globals_grace(struct rcu_head *rcu) +{ + /* Ratelimit parking as shrinking is quite slow */ + mod_delayed_work(system_wq, &park.work, + round_jiffies_up_relative(2 * HZ)); +} + +static void __i915_globals_queue_rcu(void) +{ + park.epoch = atomic_inc_return(&epoch); + if (!atomic_read(&active)) + call_rcu(&park.rcu, __i915_globals_grace); +} + static void __i915_globals_park(struct work_struct *work) { /* Confirm nothing woke up in the last grace period */ if (park.epoch == atomic_read(&epoch)) i915_globals_shrink(); + else + __i915_globals_queue_rcu(); } void __init i915_global_register(struct i915_global *global) @@ -85,7 +102,7 @@ int __init i915_globals_init(void) } } - INIT_RCU_WORK(&park.work, __i915_globals_park); + INIT_DELAYED_WORK(&park.work, __i915_globals_park); return 0; } @@ -103,8 +120,9 @@ void i915_globals_park(void) if (!atomic_dec_and_test(&active)) return; - park.epoch = atomic_inc_return(&epoch); - queue_rcu_work(system_wq, &park.work); + /* Queue cleanup after the next RCU grace period has freed slabs */ + if (!work_pending(&park.work.work)) + __i915_globals_queue_rcu(); } void i915_globals_unpark(void) @@ -116,8 +134,9 @@ void i915_globals_unpark(void) void __exit i915_globals_exit(void) { /* Flush any residual park_work */ - atomic_inc(&epoch); - flush_rcu_work(&park.work); + atomic_inc(&epoch); /* skip shrinking */ + rcu_barrier(); /* wait for the work to be queued */ + flush_delayed_work(&park.work); __i915_globals_cleanup();