From patchwork Wed Jul 27 12:29:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Mauro Carvalho Chehab X-Patchwork-Id: 12930405 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3511FC04A68 for ; Wed, 27 Jul 2022 12:30:13 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 46B85C5CF0; Wed, 27 Jul 2022 12:30:03 +0000 (UTC) Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by gabe.freedesktop.org (Postfix) with ESMTPS id AB81EC5D67; Wed, 27 Jul 2022 12:30:01 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 216C06118C; Wed, 27 Jul 2022 12:30:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7A9BDC433B5; Wed, 27 Jul 2022 12:30:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1658925000; bh=+RB6a52AG9Wuuej9xwxTPJsBVg63GQuYFqKk14nxsL8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=kFS31LTGWtu6UsyHZFhoWgN+gAxEp6uDD8BAlgj113QA/7KmExjepcsOVktSS+Du4 fT4I9aZU/g/TPNMUcpD89DD7h4M3ejCXieWgAGMUqKlt0TuOvSpCip8RRJcw/YhOMH 3o8TKjS0W1K5vmOF7Utyn22jlfBZfUKapGw+AJKD2vMPw9j70vYOd09Y3U1Z4gUsw7 LP93q7TDhe4KkSs7h+mBOBaCcq/96RgFxqRwm2J+0HKqPvyb+PlNkzIPBASXOiHIp+ n3gi0Vsp3fqQqarQHLlbu0H7e9AB1JhvSim4CmR8LdR+gJuu/ZG1n+IKBtEoH1KWdD LXY6TuKwwibnA== Received: from mchehab by mail.kernel.org with local (Exim 4.95) (envelope-from ) id 1oGgAo-003xm9-6i; Wed, 27 Jul 2022 14:29:58 +0200 From: Mauro Carvalho Chehab To: Subject: [PATCH v3 1/6] drm/i915/gt: Ignore TLB invalidations on idle engines Date: Wed, 27 Jul 2022 14:29:51 +0200 Message-Id: <278a57a672edac75683f0818b292e95da583a5fe.1658924372.git.mchehab@kernel.org> X-Mailer: git-send-email 2.36.1 In-Reply-To: References: MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: David Airlie , dri-devel@lists.freedesktop.org, Daniele Ceraolo Spurio , Fei Yang , Matthew Brost , Chris Wilson , Matthew Auld , Andi Shyti , Dave Airlie , =?utf-8?q?Thomas_Hellstr=C3=B6m?= , Lucas De Marchi , intel-gfx@lists.freedesktop.org, Rodrigo Vivi , Mauro Carvalho Chehab , Tvrtko Ursulin , Tvrtko Ursulin , linux-kernel@vger.kernel.org, stable@vger.kernel.org, John Harrison Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Chris Wilson Check if the device is powered down prior to any engine activity, as, on such cases, all the TLBs were already invalidated, so an explicit TLB invalidation is not needed, thus reducing the performance regression impact due to it. This becomes more significant with GuC, as it can only do so when the connection to the GuC is awake. Cc: stable@vger.kernel.org Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store") Signed-off-by: Chris Wilson Cc: Fei Yang Reviewed-by: Andi Shyti Acked-by: Thomas Hellström Acked-by: Tvrtko Ursulin Signed-off-by: Mauro Carvalho Chehab --- To avoid mailbombing on a large number of people, only mailing lists were C/C on the cover. See [PATCH v3 0/6] at: https://lore.kernel.org/all/cover.1658924372.git.mchehab@kernel.org/ drivers/gpu/drm/i915/gem/i915_gem_pages.c | 10 ++++++---- drivers/gpu/drm/i915/gt/intel_gt.c | 17 ++++++++++------- drivers/gpu/drm/i915/gt/intel_gt_pm.h | 3 +++ 3 files changed, 19 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index 97c820eee115..6835279943df 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -6,14 +6,15 @@ #include +#include "gt/intel_gt.h" +#include "gt/intel_gt_pm.h" + #include "i915_drv.h" #include "i915_gem_object.h" #include "i915_scatterlist.h" #include "i915_gem_lmem.h" #include "i915_gem_mman.h" -#include "gt/intel_gt.h" - void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, struct sg_table *pages, unsigned int sg_page_sizes) @@ -217,10 +218,11 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj) if (test_and_clear_bit(I915_BO_WAS_BOUND_BIT, &obj->flags)) { struct drm_i915_private *i915 = to_i915(obj->base.dev); + struct intel_gt *gt = to_gt(i915); intel_wakeref_t wakeref; - with_intel_runtime_pm_if_active(&i915->runtime_pm, wakeref) - intel_gt_invalidate_tlbs(to_gt(i915)); + with_intel_gt_pm_if_awake(gt, wakeref) + intel_gt_invalidate_tlbs(gt); } return pages; diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 68c2b0d8f187..c4d43da84d8e 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -12,6 +12,7 @@ #include "i915_drv.h" #include "intel_context.h" +#include "intel_engine_pm.h" #include "intel_engine_regs.h" #include "intel_ggtt_gmch.h" #include "intel_gt.h" @@ -924,6 +925,7 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt) struct drm_i915_private *i915 = gt->i915; struct intel_uncore *uncore = gt->uncore; struct intel_engine_cs *engine; + intel_engine_mask_t awake, tmp; enum intel_engine_id id; const i915_reg_t *regs; unsigned int num = 0; @@ -947,26 +949,31 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt) GEM_TRACE("\n"); - assert_rpm_wakelock_held(&i915->runtime_pm); - mutex_lock(>->tlb_invalidate_lock); intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL); spin_lock_irq(&uncore->lock); /* serialise invalidate with GT reset */ + awake = 0; for_each_engine(engine, gt, id) { struct reg_and_bit rb; + if (!intel_engine_pm_is_awake(engine)) + continue; + rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); if (!i915_mmio_reg_offset(rb.reg)) continue; intel_uncore_write_fw(uncore, rb.reg, rb.bit); + awake |= engine->mask; } spin_unlock_irq(&uncore->lock); - for_each_engine(engine, gt, id) { + for_each_engine_masked(engine, gt, awake, tmp) { + struct reg_and_bit rb; + /* * HW architecture suggest typical invalidation time at 40us, * with pessimistic cases up to 100us and a recommendation to @@ -974,12 +981,8 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt) */ const unsigned int timeout_us = 100; const unsigned int timeout_ms = 4; - struct reg_and_bit rb; rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); - if (!i915_mmio_reg_offset(rb.reg)) - continue; - if (__intel_wait_for_register_fw(uncore, rb.reg, rb.bit, 0, timeout_us, timeout_ms, diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h b/drivers/gpu/drm/i915/gt/intel_gt_pm.h index bc898df7a48c..a334787a4939 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h @@ -55,6 +55,9 @@ static inline void intel_gt_pm_might_put(struct intel_gt *gt) for (tmp = 1, intel_gt_pm_get(gt); tmp; \ intel_gt_pm_put(gt), tmp = 0) +#define with_intel_gt_pm_if_awake(gt, wf) \ + for (wf = intel_gt_pm_get_if_awake(gt); wf; intel_gt_pm_put_async(gt), wf = 0) + static inline int intel_gt_pm_wait_for_idle(struct intel_gt *gt) { return intel_wakeref_wait_for_idle(>->wakeref); From patchwork Wed Jul 27 12:29:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mauro Carvalho Chehab X-Patchwork-Id: 12930406 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 51F91C04A68 for ; Wed, 27 Jul 2022 12:30:17 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EC6C6C5D75; Wed, 27 Jul 2022 12:30:03 +0000 (UTC) Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by gabe.freedesktop.org (Postfix) with ESMTPS id B3709C5D75; Wed, 27 Jul 2022 12:30:01 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 21BB66118D; Wed, 27 Jul 2022 12:30:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 86642C433D6; Wed, 27 Jul 2022 12:30:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1658925000; bh=Nm93lIHfAfepbN4O0hsAfnZomLbvcVLQp4MPg4Aq9vI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ZAQlO772UyaZbJ0IE7U/AQqndZjAc71FhOxikHLF0dl3+8qpOkq9pqpqcYv+aUHoP 1uGIFHvwntdcuWwU8Hf7Ua2oFi0DT71upyfhRFK6JsUcFq2pUNBylUZpmnPml8Y3Lk emf1OK/GWPEY27uCUP/08xvJagBq+5g+UGqCNJiDyC2hSof88mX+AZDTyu7SFdpybS +MeIYQRz9atGjmIUAeqWZ2EukM5SkKpATLUNy2A800Uqr6MitD3Jov/6mWQoEA8kZf yc99KNDgFqqZF1wW3kGYlaMU4NlZluBCnst0KfABicZFPgqWoFVpSXk3eN8SX16vhx uCw6kiJXQAsuQ== Received: from mchehab by mail.kernel.org with local (Exim 4.95) (envelope-from ) id 1oGgAo-003xmC-8F; Wed, 27 Jul 2022 14:29:58 +0200 From: Mauro Carvalho Chehab To: Subject: [PATCH v3 2/6] drm/i915/gt: document with_intel_gt_pm_if_awake() Date: Wed, 27 Jul 2022 14:29:52 +0200 Message-Id: X-Mailer: git-send-email 2.36.1 In-Reply-To: References: MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Matthew Brost , Tvrtko Ursulin , Tvrtko Ursulin , David Airlie , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, Chris Wilson , Rodrigo Vivi , Mauro Carvalho Chehab , intel-gfx@lists.freedesktop.org, John Harrison Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add a kernel-doc markup to document this new macro. Reviewed-by: Tvrtko Ursulin Signed-off-by: Mauro Carvalho Chehab Reviewed-by: Andi Shyti --- To avoid mailbombing on a large number of people, only mailing lists were C/C on the cover. See [PATCH v3 0/6] at: https://lore.kernel.org/all/cover.1658924372.git.mchehab@kernel.org/ drivers/gpu/drm/i915/gt/intel_gt_pm.h | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h b/drivers/gpu/drm/i915/gt/intel_gt_pm.h index a334787a4939..6c9a46452364 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h @@ -55,6 +55,14 @@ static inline void intel_gt_pm_might_put(struct intel_gt *gt) for (tmp = 1, intel_gt_pm_get(gt); tmp; \ intel_gt_pm_put(gt), tmp = 0) +/** + * with_intel_gt_pm_if_awake - if GT is PM awake, get a reference to prevent + * it to sleep, run some code and then asynchrously put the reference + * away. + * + * @gt: pointer to the gt + * @wf: pointer to a temporary wakeref. + */ #define with_intel_gt_pm_if_awake(gt, wf) \ for (wf = intel_gt_pm_get_if_awake(gt); wf; intel_gt_pm_put_async(gt), wf = 0) From patchwork Wed Jul 27 12:29:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Mauro Carvalho Chehab X-Patchwork-Id: 12930407 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BDA2CC19F2B for ; Wed, 27 Jul 2022 12:30:21 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C4DC2C5E3F; Wed, 27 Jul 2022 12:30:06 +0000 (UTC) Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by gabe.freedesktop.org (Postfix) with ESMTPS id B565EC5D4F; Wed, 27 Jul 2022 12:30:03 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id DDE42B82078; Wed, 27 Jul 2022 12:30:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7AF7FC43470; Wed, 27 Jul 2022 12:30:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1658925000; bh=lOE1MGL9bb8yddRydxM1fpZY6tycwyxv4aHoWOD7OZQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ESBCJw8vK8aZGD73cSuSmAif9H+WC+AkurHUu18BuZR1aeJbhJxsymz/0keBKJ9pR VHSD0tmiKMhV6hTvDoyO63rDqE4UleB2ZYJ/KCpEo1B28+091qu/qpXZcyPJmWkKWI G3KD0wU/CVkY8RSXRTolnSLky3yxMoFdReL0PeSandZ5rv3U5o3nd16h3RbjvIpOMX 1yLwu4UudpUcHHyn9K+uikbxV3NLRBisCclgzR0CDvqpLpQyKakvGgJey/tveSM+7/ ftQZUhHca7Zb81zIHQgHS7yMz4qxf/EOnIeSlXblk1wTX+E3KuUs/pqb685wu3p27X lY1caumfDn5AA== Received: from mchehab by mail.kernel.org with local (Exim 4.95) (envelope-from ) id 1oGgAo-003xmH-9n; Wed, 27 Jul 2022 14:29:58 +0200 From: Mauro Carvalho Chehab To: Subject: [PATCH v3 3/6] drm/i915/gt: Invalidate TLB of the OA unit at TLB invalidations Date: Wed, 27 Jul 2022 14:29:53 +0200 Message-Id: <59724d9f5cf1e93b1620d01b8332ac991555283d.1658924372.git.mchehab@kernel.org> X-Mailer: git-send-email 2.36.1 In-Reply-To: References: MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tvrtko Ursulin , =?utf-8?q?Thomas_Hellst?= =?utf-8?q?r=C3=B6m?= , Andi Shyti , Tvrtko Ursulin , David Airlie , dri-devel@lists.freedesktop.org, Lucas De Marchi , linux-kernel@vger.kernel.org, Chris Wilson , Daniele Ceraolo Spurio , Rodrigo Vivi , Dave Airlie , stable@vger.kernel.org, Mauro Carvalho Chehab , intel-gfx@lists.freedesktop.org, Fei Yang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Chris Wilson Ensure that the TLB of the OA unit is also invalidated on gen12 HW, as just invalidating the TLB of an engine is not enough. Cc: stable@vger.kernel.org Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store") Signed-off-by: Chris Wilson Cc: Fei Yang Reviewed-by: Andi Shyti Acked-by: Tvrtko Ursulin Acked-by: Thomas Hellström Signed-off-by: Mauro Carvalho Chehab --- To avoid mailbombing on a large number of people, only mailing lists were C/C on the cover. See [PATCH v3 0/6] at: https://lore.kernel.org/all/cover.1658924372.git.mchehab@kernel.org/ drivers/gpu/drm/i915/gt/intel_gt.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index c4d43da84d8e..1d84418e8676 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -11,6 +11,7 @@ #include "pxp/intel_pxp.h" #include "i915_drv.h" +#include "i915_perf_oa_regs.h" #include "intel_context.h" #include "intel_engine_pm.h" #include "intel_engine_regs.h" @@ -969,6 +970,15 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt) awake |= engine->mask; } + /* Wa_2207587034:tgl,dg1,rkl,adl-s,adl-p */ + if (awake && + (IS_TIGERLAKE(i915) || + IS_DG1(i915) || + IS_ROCKETLAKE(i915) || + IS_ALDERLAKE_S(i915) || + IS_ALDERLAKE_P(i915))) + intel_uncore_write_fw(uncore, GEN12_OA_TLB_INV_CR, 1); + spin_unlock_irq(&uncore->lock); for_each_engine_masked(engine, gt, awake, tmp) { From patchwork Wed Jul 27 12:29:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Mauro Carvalho Chehab X-Patchwork-Id: 12930404 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F0534C19F28 for ; Wed, 27 Jul 2022 12:30:05 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 61D3AC5D19; Wed, 27 Jul 2022 12:30:03 +0000 (UTC) Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3AEFBC5CBC; Wed, 27 Jul 2022 12:30:02 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id AC9CE61222; Wed, 27 Jul 2022 12:30:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 89ECCC4347C; Wed, 27 Jul 2022 12:30:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1658925000; bh=GQnU144pkObOuuSOogSq2lICybfwG6TF/c3AiwyFiLU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=iIUU+xwD3iVDACp1ZnSRc544KHgche8Xy0PlTjUYVBQ/hPu2sI9nV01zGlvR6HMmN 8hWPFzbcpPnfY/3wZ2V7Z7G/7sjFRxgnD51Z6gRBz+UDLaO5d7eAfieCMrIfB52K8t 0Rd0BnFe/tiNEdb/wpqyEwv8/B16/JtJbAS5Z18pEA5dsUo6Ca9/iYfYV87On/WBBZ EE+ZjOenfcBGZuIQA2CZVC+84I/4jyT/So3w9uUT8WJe7pZDJxo5wZ2e5epuQOwATl deUoe+Waz2xy4n/MqIsQDoa92AY78A3piVR7d9TPGNqbwD6iPDZYZRmd50WJQhjNCA jYVmNJCH3NoBA== Received: from mchehab by mail.kernel.org with local (Exim 4.95) (envelope-from ) id 1oGgAo-003xmL-B2; Wed, 27 Jul 2022 14:29:58 +0200 From: Mauro Carvalho Chehab To: Subject: [PATCH v3 4/6] drm/i915/gt: Skip TLB invalidations once wedged Date: Wed, 27 Jul 2022 14:29:54 +0200 Message-Id: <5aa86564b9ec5fe7fe605c1dd7de76855401ed73.1658924372.git.mchehab@kernel.org> X-Mailer: git-send-email 2.36.1 In-Reply-To: References: MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tvrtko Ursulin , =?utf-8?q?Thomas_Hellst?= =?utf-8?q?r=C3=B6m?= , Andi Shyti , Tvrtko Ursulin , David Airlie , dri-devel@lists.freedesktop.org, Lucas De Marchi , linux-kernel@vger.kernel.org, Chris Wilson , Daniele Ceraolo Spurio , Rodrigo Vivi , Dave Airlie , stable@vger.kernel.org, Mauro Carvalho Chehab , intel-gfx@lists.freedesktop.org, Fei Yang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Chris Wilson Skip all further TLB invalidations once the device is wedged and had been reset, as, on such cases, it can no longer process instructions on the GPU and the user no longer has access to the TLB's in each engine. So, an attempt to do a TLB cache invalidation will produce a timeout. That helps to reduce the performance regression introduced by TLB invalidate logic. Cc: stable@vger.kernel.org Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store") Signed-off-by: Chris Wilson Cc: Fei Yang Cc: Tvrtko Ursulin Reviewed-by: Andi Shyti Acked-by: Thomas Hellström Signed-off-by: Mauro Carvalho Chehab --- To avoid mailbombing on a large number of people, only mailing lists were C/C on the cover. See [PATCH v3 0/6] at: https://lore.kernel.org/all/cover.1658924372.git.mchehab@kernel.org/ drivers/gpu/drm/i915/gt/intel_gt.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 1d84418e8676..5c55a90672f4 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -934,6 +934,9 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt) if (I915_SELFTEST_ONLY(gt->awake == -ENODEV)) return; + if (intel_gt_is_wedged(gt)) + return; + if (GRAPHICS_VER(i915) == 12) { regs = gen12_regs; num = ARRAY_SIZE(gen12_regs); From patchwork Wed Jul 27 12:29:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mauro Carvalho Chehab X-Patchwork-Id: 12930408 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EC862C04A68 for ; Wed, 27 Jul 2022 12:30:25 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C2A76C5E42; Wed, 27 Jul 2022 12:30:07 +0000 (UTC) Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by gabe.freedesktop.org (Postfix) with ESMTPS id 99D2DC5CCC; Wed, 27 Jul 2022 12:30:04 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id CC74AB8207C; Wed, 27 Jul 2022 12:30:02 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8D780C43141; Wed, 27 Jul 2022 12:30:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1658925001; bh=c7qOC1C12r1LYkOoQUuN/lHfMMwKwIf4vSK+nBo3JKs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=GXhCf6Z4+ufJE9p/Te7ARaH9Fhp7F1CT+IRa1XWKVztKn2U8p9IUAlXX/vLGPuM81 /kxR2ahK3nTo0y+KdeBZc0RSmmca8E+c2eO5u+9QfdCLJp8D8Yk5Rj9SEH3o4vwfOy hPBZK8b4jDcrBqEUJUGno9g7WWMzkPMpSgws+Pmiaj0EKf72THAbMne5CkaZzghRV1 csN2GSKecogdRIHb5fULY0FaSvPGsNIVvelkP1kuOqGB8KSCjcLbiryxvQ0hjno+XV IRbpCUY0qBiZHyOc+ODbPtNLv0JOlcUhR04oTQIQue3NAtp6sW/qGYXRgekTd1bppz 3v5lxmZr3Tc4w== Received: from mchehab by mail.kernel.org with local (Exim 4.95) (envelope-from ) id 1oGgAo-003xmP-CW; Wed, 27 Jul 2022 14:29:58 +0200 From: Mauro Carvalho Chehab To: Subject: [PATCH v3 5/6] drm/i915/gt: Batch TLB invalidations Date: Wed, 27 Jul 2022 14:29:55 +0200 Message-Id: <4e97ef5deb6739cadaaf40aa45620547e9c4ec06.1658924372.git.mchehab@kernel.org> X-Mailer: git-send-email 2.36.1 In-Reply-To: References: MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Fei Yang , David Airlie , dri-devel@lists.freedesktop.org, Daniele Ceraolo Spurio , Andrzej Hajda , Sumit Semwal , Michael Cheng , Chris Wilson , Ayaz A Siddiqui , Andi Shyti , Dave Airlie , Tomas Winkler , Matthew Auld , =?utf-8?q?Thomas_Hellstr=C3=B6m?= , Casey Bowman , Lucas De Marchi , intel-gfx@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, Nirmoy Das , Rodrigo Vivi , Mauro Carvalho Chehab , Tvrtko Ursulin , =?utf-8?q?Micha=C5=82_Wini?= =?utf-8?q?arski?= , Tvrtko Ursulin , linux-kernel@vger.kernel.org, stable@vger.kernel.org, Ashutosh Dixit , linux-media@vger.kernel.org, =?utf-8?q?Christian_K=C3=B6nig?= Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Chris Wilson Invalidate TLB in batches, in order to reduce performance regressions. Currently, every caller performs a full barrier around a TLB invalidation, ignoring all other invalidations that may have already removed their PTEs from the cache. As this is a synchronous operation and can be quite slow, we cause multiple threads to contend on the TLB invalidate mutex blocking userspace. We only need to invalidate the TLB once after replacing our PTE to ensure that there is no possible continued access to the physical address before releasing our pages. By tracking a seqno for each full TLB invalidate we can quickly determine if one has been performed since rewriting the PTE, and only if necessary trigger one for ourselves. That helps to reduce the performance regression introduced by TLB invalidate logic. [mchehab: rebased to not require moving the code to a separate file] Cc: stable@vger.kernel.org Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store") Suggested-by: Tvrtko Ursulin Signed-off-by: Chris Wilson Cc: Fei Yang Signed-off-by: Mauro Carvalho Chehab --- To avoid mailbombing on a large number of people, only mailing lists were C/C on the cover. See [PATCH v3 0/6] at: https://lore.kernel.org/all/cover.1658924372.git.mchehab@kernel.org/ .../gpu/drm/i915/gem/i915_gem_object_types.h | 3 +- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 21 +++++--- drivers/gpu/drm/i915/gt/intel_gt.c | 53 ++++++++++++++----- drivers/gpu/drm/i915/gt/intel_gt.h | 12 ++++- drivers/gpu/drm/i915/gt/intel_gt_types.h | 18 ++++++- drivers/gpu/drm/i915/gt/intel_ppgtt.c | 8 ++- drivers/gpu/drm/i915/i915_vma.c | 33 +++++++++--- drivers/gpu/drm/i915/i915_vma.h | 1 + drivers/gpu/drm/i915/i915_vma_resource.c | 5 +- drivers/gpu/drm/i915/i915_vma_resource.h | 6 ++- 10 files changed, 125 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index 5cf36a130061..9f6b14ec189a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -335,7 +335,6 @@ struct drm_i915_gem_object { #define I915_BO_READONLY BIT(7) #define I915_TILING_QUIRK_BIT 8 /* unknown swizzling; do not release! */ #define I915_BO_PROTECTED BIT(9) -#define I915_BO_WAS_BOUND_BIT 10 /** * @mem_flags - Mutable placement-related flags * @@ -616,6 +615,8 @@ struct drm_i915_gem_object { * pages were last acquired. */ bool dirty:1; + + u32 tlb; } mm; struct { diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index 6835279943df..8357dbdcab5c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -191,6 +191,18 @@ static void unmap_object(struct drm_i915_gem_object *obj, void *ptr) vunmap(ptr); } +static void flush_tlb_invalidate(struct drm_i915_gem_object *obj) +{ + struct drm_i915_private *i915 = to_i915(obj->base.dev); + struct intel_gt *gt = to_gt(i915); + + if (!obj->mm.tlb) + return; + + intel_gt_invalidate_tlb(gt, obj->mm.tlb); + obj->mm.tlb = 0; +} + struct sg_table * __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj) { @@ -216,14 +228,7 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj) __i915_gem_object_reset_page_iter(obj); obj->mm.page_sizes.phys = obj->mm.page_sizes.sg = 0; - if (test_and_clear_bit(I915_BO_WAS_BOUND_BIT, &obj->flags)) { - struct drm_i915_private *i915 = to_i915(obj->base.dev); - struct intel_gt *gt = to_gt(i915); - intel_wakeref_t wakeref; - - with_intel_gt_pm_if_awake(gt, wakeref) - intel_gt_invalidate_tlbs(gt); - } + flush_tlb_invalidate(obj); return pages; } diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 5c55a90672f4..f435e06125aa 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -38,8 +38,6 @@ static void __intel_gt_init_early(struct intel_gt *gt) { spin_lock_init(>->irq_lock); - mutex_init(>->tlb_invalidate_lock); - INIT_LIST_HEAD(>->closed_vma); spin_lock_init(>->closed_lock); @@ -50,6 +48,8 @@ static void __intel_gt_init_early(struct intel_gt *gt) intel_gt_init_reset(gt); intel_gt_init_requests(gt); intel_gt_init_timelines(gt); + mutex_init(>->tlb.invalidate_lock); + seqcount_mutex_init(>->tlb.seqno, >->tlb.invalidate_lock); intel_gt_pm_init_early(gt); intel_uc_init_early(>->uc); @@ -770,6 +770,7 @@ void intel_gt_driver_late_release_all(struct drm_i915_private *i915) intel_gt_fini_requests(gt); intel_gt_fini_reset(gt); intel_gt_fini_timelines(gt); + mutex_destroy(>->tlb.invalidate_lock); intel_engines_free(gt); } } @@ -908,7 +909,7 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8, return rb; } -void intel_gt_invalidate_tlbs(struct intel_gt *gt) +static void mmio_invalidate_full(struct intel_gt *gt) { static const i915_reg_t gen8_regs[] = { [RENDER_CLASS] = GEN8_RTCR, @@ -931,12 +932,6 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt) const i915_reg_t *regs; unsigned int num = 0; - if (I915_SELFTEST_ONLY(gt->awake == -ENODEV)) - return; - - if (intel_gt_is_wedged(gt)) - return; - if (GRAPHICS_VER(i915) == 12) { regs = gen12_regs; num = ARRAY_SIZE(gen12_regs); @@ -951,9 +946,6 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt) "Platform does not implement TLB invalidation!")) return; - GEM_TRACE("\n"); - - mutex_lock(>->tlb_invalidate_lock); intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL); spin_lock_irq(&uncore->lock); /* serialise invalidate with GT reset */ @@ -973,6 +965,8 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt) awake |= engine->mask; } + GT_TRACE(gt, "invalidated engines %08x\n", awake); + /* Wa_2207587034:tgl,dg1,rkl,adl-s,adl-p */ if (awake && (IS_TIGERLAKE(i915) || @@ -1012,5 +1006,38 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt) * transitions. */ intel_uncore_forcewake_put_delayed(uncore, FORCEWAKE_ALL); - mutex_unlock(>->tlb_invalidate_lock); +} + +static bool tlb_seqno_passed(const struct intel_gt *gt, u32 seqno) +{ + u32 cur = intel_gt_tlb_seqno(gt); + + /* Only skip if a *full* TLB invalidate barrier has passed */ + return (s32)(cur - ALIGN(seqno, 2)) > 0; +} + +void intel_gt_invalidate_tlb(struct intel_gt *gt, u32 seqno) +{ + intel_wakeref_t wakeref; + + if (I915_SELFTEST_ONLY(gt->awake == -ENODEV)) + return; + + if (intel_gt_is_wedged(gt)) + return; + + if (tlb_seqno_passed(gt, seqno)) + return; + + with_intel_gt_pm_if_awake(gt, wakeref) { + mutex_lock(>->tlb.invalidate_lock); + if (tlb_seqno_passed(gt, seqno)) + goto unlock; + + mmio_invalidate_full(gt); + + write_seqcount_invalidate(>->tlb.seqno); +unlock: + mutex_unlock(>->tlb.invalidate_lock); + } } diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h b/drivers/gpu/drm/i915/gt/intel_gt.h index 82d6f248d876..40b06adf509a 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.h +++ b/drivers/gpu/drm/i915/gt/intel_gt.h @@ -101,6 +101,16 @@ void intel_gt_info_print(const struct intel_gt_info *info, void intel_gt_watchdog_work(struct work_struct *work); -void intel_gt_invalidate_tlbs(struct intel_gt *gt); +static inline u32 intel_gt_tlb_seqno(const struct intel_gt *gt) +{ + return seqprop_sequence(>->tlb.seqno); +} + +static inline u32 intel_gt_next_invalidate_tlb_full(const struct intel_gt *gt) +{ + return intel_gt_tlb_seqno(gt) | 1; +} + +void intel_gt_invalidate_tlb(struct intel_gt *gt, u32 seqno); #endif /* __INTEL_GT_H__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index df708802889d..3804a583382b 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -83,7 +84,22 @@ struct intel_gt { struct intel_uc uc; struct intel_gsc gsc; - struct mutex tlb_invalidate_lock; + struct { + /* Serialize global tlb invalidations */ + struct mutex invalidate_lock; + + /* + * Batch TLB invalidations + * + * After unbinding the PTE, we need to ensure the TLB + * are invalidated prior to releasing the physical pages. + * But we only need one such invalidation for all unbinds, + * so we track how many TLB invalidations have been + * performed since unbind the PTE and only emit an extra + * invalidate if no full barrier has been passed. + */ + seqcount_mutex_t seqno; + } tlb; struct i915_wa_list wa_list; diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c index d8b94d638559..2da6c82a8bd2 100644 --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c @@ -206,8 +206,12 @@ void ppgtt_bind_vma(struct i915_address_space *vm, void ppgtt_unbind_vma(struct i915_address_space *vm, struct i915_vma_resource *vma_res) { - if (vma_res->allocated) - vm->clear_range(vm, vma_res->start, vma_res->vma_size); + if (!vma_res->allocated) + return; + + vm->clear_range(vm, vma_res->start, vma_res->vma_size); + if (vma_res->tlb) + vma_invalidate_tlb(vm, *vma_res->tlb); } static unsigned long pd_count(u64 size, int shift) diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index ef3b04c7e153..84a9ccbc5fc5 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -538,8 +538,6 @@ int i915_vma_bind(struct i915_vma *vma, bind_flags); } - set_bit(I915_BO_WAS_BOUND_BIT, &vma->obj->flags); - atomic_or(bind_flags, &vma->flags); return 0; } @@ -1310,6 +1308,19 @@ I915_SELFTEST_EXPORT int i915_vma_get_pages(struct i915_vma *vma) return err; } +void vma_invalidate_tlb(struct i915_address_space *vm, u32 tlb) +{ + /* + * Before we release the pages that were bound by this vma, we + * must invalidate all the TLBs that may still have a reference + * back to our physical address. It only needs to be done once, + * so after updating the PTE to point away from the pages, record + * the most recent TLB invalidation seqno, and if we have not yet + * flushed the TLBs upon release, perform a full invalidation. + */ + WRITE_ONCE(tlb, intel_gt_next_invalidate_tlb_full(vm->gt)); +} + static void __vma_put_pages(struct i915_vma *vma, unsigned int count) { /* We allocate under vma_get_pages, so beware the shrinker */ @@ -1941,7 +1952,12 @@ struct dma_fence *__i915_vma_evict(struct i915_vma *vma, bool async) vma->vm->skip_pte_rewrite; trace_i915_vma_unbind(vma); - unbind_fence = i915_vma_resource_unbind(vma_res); + if (async) + unbind_fence = i915_vma_resource_unbind(vma_res, + &vma->obj->mm.tlb); + else + unbind_fence = i915_vma_resource_unbind(vma_res, NULL); + vma->resource = NULL; atomic_and(~(I915_VMA_BIND_MASK | I915_VMA_ERROR | I915_VMA_GGTT_WRITE), @@ -1949,10 +1965,13 @@ struct dma_fence *__i915_vma_evict(struct i915_vma *vma, bool async) i915_vma_detach(vma); - if (!async && unbind_fence) { - dma_fence_wait(unbind_fence, false); - dma_fence_put(unbind_fence); - unbind_fence = NULL; + if (!async) { + if (unbind_fence) { + dma_fence_wait(unbind_fence, false); + dma_fence_put(unbind_fence); + unbind_fence = NULL; + } + vma_invalidate_tlb(vma->vm, vma->obj->mm.tlb); } /* diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 88ca0bd9c900..5048eed536da 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -213,6 +213,7 @@ bool i915_vma_misplaced(const struct i915_vma *vma, u64 size, u64 alignment, u64 flags); void __i915_vma_set_map_and_fenceable(struct i915_vma *vma); void i915_vma_revoke_mmap(struct i915_vma *vma); +void vma_invalidate_tlb(struct i915_address_space *vm, u32 tlb); struct dma_fence *__i915_vma_evict(struct i915_vma *vma, bool async); int __i915_vma_unbind(struct i915_vma *vma); int __must_check i915_vma_unbind(struct i915_vma *vma); diff --git a/drivers/gpu/drm/i915/i915_vma_resource.c b/drivers/gpu/drm/i915/i915_vma_resource.c index 27c55027387a..5a67995ea5fe 100644 --- a/drivers/gpu/drm/i915/i915_vma_resource.c +++ b/drivers/gpu/drm/i915/i915_vma_resource.c @@ -223,10 +223,13 @@ i915_vma_resource_fence_notify(struct i915_sw_fence *fence, * Return: A refcounted pointer to a dma-fence that signals when unbinding is * complete. */ -struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res) +struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res, + u32 *tlb) { struct i915_address_space *vm = vma_res->vm; + vma_res->tlb = tlb; + /* Reference for the sw fence */ i915_vma_resource_get(vma_res); diff --git a/drivers/gpu/drm/i915/i915_vma_resource.h b/drivers/gpu/drm/i915/i915_vma_resource.h index 5d8427caa2ba..06923d1816e7 100644 --- a/drivers/gpu/drm/i915/i915_vma_resource.h +++ b/drivers/gpu/drm/i915/i915_vma_resource.h @@ -67,6 +67,7 @@ struct i915_page_sizes { * taken when the unbind is scheduled. * @skip_pte_rewrite: During ggtt suspend and vm takedown pte rewriting * needs to be skipped for unbind. + * @tlb: pointer for obj->mm.tlb, if async unbind. Otherwise, NULL * * The lifetime of a struct i915_vma_resource is from a binding request to * the actual possible asynchronous unbind has completed. @@ -119,6 +120,8 @@ struct i915_vma_resource { bool immediate_unbind:1; bool needs_wakeref:1; bool skip_pte_rewrite:1; + + u32 *tlb; }; bool i915_vma_resource_hold(struct i915_vma_resource *vma_res, @@ -131,7 +134,8 @@ struct i915_vma_resource *i915_vma_resource_alloc(void); void i915_vma_resource_free(struct i915_vma_resource *vma_res); -struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res); +struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res, + u32 *tlb); void __i915_vma_resource_init(struct i915_vma_resource *vma_res); From patchwork Wed Jul 27 12:29:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mauro Carvalho Chehab X-Patchwork-Id: 12930409 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E5B5DC04A68 for ; Wed, 27 Jul 2022 12:30:28 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CF1A2C5E40; Wed, 27 Jul 2022 12:30:05 +0000 (UTC) Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by gabe.freedesktop.org (Postfix) with ESMTPS id C1E4EC5D67; Wed, 27 Jul 2022 12:30:03 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 56EBCB82079; Wed, 27 Jul 2022 12:30:02 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9734DC43143; Wed, 27 Jul 2022 12:30:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1658925000; bh=Bnw95kKCUQrDufKC0VVSFnBSR2tCZOWA2za105c45Go=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VdF5FWJ87oU8m4GpcVcn5AQmVa6fnLlbv/G8KNY1ksm9ejz0QGgPF02zIRHkwiyDM s8XuQX17mD4u7aCJv6VFFSum9WoSb6AmaBQ3th2rqNNOWXJuZAF58QvBlV/kh6j4Yb BuBu/YX9hNaoNz/wq0GC9/YwuPV+8zCmLQmsmPJTDprbMXz24x44zlUPLwR+GqPad1 bMHz64HE6H8HUkBmBFr/Mr6g59ClxjdudJ6ZV0SQMftFtjObfQcKVJ56kzhMPqZ/UD Uy+AEwnaGjueKifx4GVnbs21YmVxyuv6aFt18Fo9IgZz7zjlzgZh04+Jki3MMEo8/2 Bgdza87u2BqCg== Received: from mchehab by mail.kernel.org with local (Exim 4.95) (envelope-from ) id 1oGgAo-003xmT-Du; Wed, 27 Jul 2022 14:29:58 +0200 From: Mauro Carvalho Chehab To: Subject: [PATCH v3 6/6] drm/i915/gt: describe the new tlb parameter at i915_vma_resource Date: Wed, 27 Jul 2022 14:29:56 +0200 Message-Id: X-Mailer: git-send-email 2.36.1 In-Reply-To: References: MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tvrtko Ursulin , David Airlie , intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, Rodrigo Vivi , Mauro Carvalho Chehab Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" TLB cache invalidation can happen on two different situations: 1. synchronously, at __vma_put_pages(); 2. asynchronously. On the first case, TLB cache invalidation happens inside __vma_put_pages(). So, no need to do it later on. However, on the second case, the pages will keep in memory until __i915_vma_evict() is called. So, we need to store the TLB data at struct i915_vma_resource, in order to do a TLB cache invalidation before allowing userspace to re-use the same memory. So, i915_vma_resource_unbind() has gained a new parameter in order to store the TLB data at the second case. Document it. Signed-off-by: Mauro Carvalho Chehab Reviewed-by: Andi Shyti --- To avoid mailbombing on a large number of people, only mailing lists were C/C on the cover. See [PATCH v3 0/6] at: https://lore.kernel.org/all/cover.1658924372.git.mchehab@kernel.org/ drivers/gpu/drm/i915/i915_vma_resource.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_vma_resource.c b/drivers/gpu/drm/i915/i915_vma_resource.c index 5a67995ea5fe..4fe09ea0a825 100644 --- a/drivers/gpu/drm/i915/i915_vma_resource.c +++ b/drivers/gpu/drm/i915/i915_vma_resource.c @@ -216,6 +216,10 @@ i915_vma_resource_fence_notify(struct i915_sw_fence *fence, /** * i915_vma_resource_unbind - Unbind a vma resource * @vma_res: The vma resource to unbind. + * @tlb: pointer to vma->obj->mm.tlb associated with the resource + * to be stored at vma_res->tlb. When not-NULL, it will be used + * to do TLB cache invalidation before freeing a VMA resource. + * used only for async unbind. * * At this point this function does little more than publish a fence that * signals immediately unless signaling is held back.