From patchwork Mon Mar 16 11:42:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440267 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A8F7F6CA for ; Mon, 16 Mar 2020 11:43:14 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 910D020724 for ; Mon, 16 Mar 2020 11:43:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 910D020724 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5CF2A6E437; Mon, 16 Mar 2020 11:43:13 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id A893F6E428 for ; Mon, 16 Mar 2020 11:43:09 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574768-1500050 for multiple; Mon, 16 Mar 2020 11:42:36 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:23 +0000 Message-Id: <20200316114237.5436-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 01/15] drm/i915: Move GGTT fence registers under gt/ X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Since the fence registers control HW detiling through the GGTT aperture, make them a part of the intel_ggtt under gt/ Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/Makefile | 2 +- drivers/gpu/drm/i915/gt/intel_ggtt.c | 2 +- .../intel_ggtt_fencing.c} | 27 +++++++------------ .../intel_ggtt_fencing.h} | 9 +++---- drivers/gpu/drm/i915/gt/intel_gtt.h | 2 +- drivers/gpu/drm/i915/gt/intel_reset.c | 2 +- drivers/gpu/drm/i915/gvt/aperture_gm.c | 2 +- drivers/gpu/drm/i915/i915_drv.c | 6 ++--- drivers/gpu/drm/i915/i915_drv.h | 1 - drivers/gpu/drm/i915/i915_gem.c | 2 +- drivers/gpu/drm/i915/i915_vma.h | 1 - drivers/gpu/drm/i915/selftests/i915_gem.c | 2 +- 12 files changed, 24 insertions(+), 34 deletions(-) rename drivers/gpu/drm/i915/{i915_gem_fence_reg.c => gt/intel_ggtt_fencing.c} (97%) rename drivers/gpu/drm/i915/{i915_gem_fence_reg.h => gt/intel_ggtt_fencing.h} (92%) diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 9f887a86e555..1b2ed963179c 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -92,6 +92,7 @@ gt-y += \ gt/intel_engine_pool.o \ gt/intel_engine_user.o \ gt/intel_ggtt.o \ + gt/intel_ggtt_fencing.o \ gt/intel_gt.o \ gt/intel_gt_irq.o \ gt/intel_gt_pm.o \ @@ -153,7 +154,6 @@ i915-y += \ 
i915_buddy.o \ i915_cmd_parser.o \ i915_gem_evict.o \ - i915_gem_fence_reg.o \ i915_gem_gtt.o \ i915_gem.o \ i915_globals.o \ diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index aed498a0d032..a7b72fa569a7 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -65,7 +65,7 @@ static int ggtt_init_hw(struct i915_ggtt *ggtt) ggtt->mappable_end); } - i915_ggtt_init_fences(ggtt); + intel_ggtt_init_fences(ggtt); return 0; } diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c similarity index 97% rename from drivers/gpu/drm/i915/i915_gem_fence_reg.c rename to drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c index d152b648c73c..94af75673a58 100644 --- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c @@ -233,16 +233,9 @@ static int fence_update(struct i915_fence_reg *fence, int ret; if (vma) { - if (!i915_vma_is_map_and_fenceable(vma)) - return -EINVAL; - - if (drm_WARN(&uncore->i915->drm, - !i915_gem_object_get_stride(vma->obj) || - !i915_gem_object_get_tiling(vma->obj), - "bogus fence setup with stride: 0x%x, tiling mode: %i\n", - i915_gem_object_get_stride(vma->obj), - i915_gem_object_get_tiling(vma->obj))) - return -EINVAL; + GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma)); + GEM_BUG_ON(!i915_gem_object_get_stride(vma->obj) || + !i915_gem_object_get_tiling(vma->obj)); ret = i915_vma_sync(vma); if (ret) @@ -276,7 +269,7 @@ static int fence_update(struct i915_fence_reg *fence, /* * We only need to update the register itself if the device is awake. * If the device is currently powered down, we will defer the write - * to the runtime resume, see i915_gem_restore_fences(). + * to the runtime resume, see intel_ggtt_restore_fences(). * * This only works for removing the fence register, on acquisition * the caller must hold the rpm wakeref. The fence register must @@ -487,14 +480,14 @@ void i915_unreserve_fence(struct i915_fence_reg *fence) } /** - * i915_gem_restore_fences - restore fence state + * intel_ggtt_restore_fences - restore fence state * @ggtt: Global GTT * * Restore the hw fence state to match the software tracking again, to be called * after a gpu reset and on resume. Note that on runtime suspend we only cancel * the fences, to be reacquired by the user later. */ -void i915_gem_restore_fences(struct i915_ggtt *ggtt) +void intel_ggtt_restore_fences(struct i915_ggtt *ggtt) { int i; @@ -746,7 +739,7 @@ static void detect_bit_6_swizzle(struct i915_ggtt *ggtt) * bit 17 of its physical address and therefore being interpreted differently * by the GPU. 
*/ -static void i915_gem_swizzle_page(struct page *page) +static void swizzle_page(struct page *page) { char temp[64]; char *vaddr; @@ -791,7 +784,7 @@ i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj, for_each_sgt_page(page, sgt_iter, pages) { char new_bit_17 = page_to_phys(page) >> 17; if ((new_bit_17 & 0x1) != (test_bit(i, obj->bit_17) != 0)) { - i915_gem_swizzle_page(page); + swizzle_page(page); set_page_dirty(page); } i++; @@ -836,7 +829,7 @@ i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj, } } -void i915_ggtt_init_fences(struct i915_ggtt *ggtt) +void intel_ggtt_init_fences(struct i915_ggtt *ggtt) { struct drm_i915_private *i915 = ggtt->vm.i915; struct intel_uncore *uncore = ggtt->vm.gt->uncore; @@ -875,7 +868,7 @@ void i915_ggtt_init_fences(struct i915_ggtt *ggtt) } ggtt->num_fences = num_fences; - i915_gem_restore_fences(ggtt); + intel_ggtt_restore_fences(ggtt); } void intel_gt_init_swizzling(struct intel_gt *gt) diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.h b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h similarity index 92% rename from drivers/gpu/drm/i915/i915_gem_fence_reg.h rename to drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h index 7bd521cd7cd7..3b3eb5bf1b75 100644 --- a/drivers/gpu/drm/i915/i915_gem_fence_reg.h +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h @@ -22,8 +22,8 @@ * */ -#ifndef __I915_FENCE_REG_H__ -#define __I915_FENCE_REG_H__ +#ifndef __INTEL_GGTT_FENCING_H__ +#define __INTEL_GGTT_FENCING_H__ #include #include @@ -53,18 +53,17 @@ struct i915_fence_reg { bool dirty; }; -/* i915_gem_fence_reg.c */ struct i915_fence_reg *i915_reserve_fence(struct i915_ggtt *ggtt); void i915_unreserve_fence(struct i915_fence_reg *fence); -void i915_gem_restore_fences(struct i915_ggtt *ggtt); +void intel_ggtt_restore_fences(struct i915_ggtt *ggtt); void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj, struct sg_table *pages); void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj, struct sg_table *pages); -void i915_ggtt_init_fences(struct i915_ggtt *ggtt); +void intel_ggtt_init_fences(struct i915_ggtt *ggtt); void intel_gt_init_swizzling(struct intel_gt *gt); diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index b3116fe8d180..ce6ff9d3a350 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -26,7 +26,7 @@ #include #include "gt/intel_reset.h" -#include "i915_gem_fence_reg.h" +#include "gt/intel_ggtt_fencing.h" #include "i915_selftest.h" #include "i915_vma_types.h" diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index 8b170c1876b3..9a15bdf31c7f 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -750,7 +750,7 @@ static int gt_reset(struct intel_gt *gt, intel_engine_mask_t stalled_mask) for_each_engine(engine, gt, id) __intel_engine_reset(engine, stalled_mask & engine->mask); - i915_gem_restore_fences(gt->ggtt); + intel_ggtt_restore_fences(gt->ggtt); return err; } diff --git a/drivers/gpu/drm/i915/gvt/aperture_gm.c b/drivers/gpu/drm/i915/gvt/aperture_gm.c index 8b13f091cee2..0d6d59871308 100644 --- a/drivers/gpu/drm/i915/gvt/aperture_gm.c +++ b/drivers/gpu/drm/i915/gvt/aperture_gm.c @@ -35,7 +35,7 @@ */ #include "i915_drv.h" -#include "i915_gem_fence_reg.h" +#include "gt/intel_ggtt_fencing.h" #include "gvt.h" static int alloc_gm(struct intel_vgpu *vgpu, bool high_gm) diff --git a/drivers/gpu/drm/i915/i915_drv.c 
b/drivers/gpu/drm/i915/i915_drv.c index 82d9df15b22b..832140f4ea3d 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1288,7 +1288,7 @@ static int i915_drm_resume(struct drm_device *dev) drm_err(&dev_priv->drm, "failed to re-enable GGTT\n"); i915_ggtt_resume(&dev_priv->ggtt); - i915_gem_restore_fences(&dev_priv->ggtt); + intel_ggtt_restore_fences(&dev_priv->ggtt); intel_csr_ucode_resume(dev_priv); @@ -1606,7 +1606,7 @@ static int intel_runtime_suspend(struct device *kdev) intel_gt_runtime_resume(&dev_priv->gt); - i915_gem_restore_fences(&dev_priv->ggtt); + intel_ggtt_restore_fences(&dev_priv->ggtt); enable_rpm_wakeref_asserts(rpm); @@ -1687,7 +1687,7 @@ static int intel_runtime_resume(struct device *kdev) * we can do is to hope that things will still work (and disable RPM). */ intel_gt_runtime_resume(&dev_priv->gt); - i915_gem_restore_fences(&dev_priv->ggtt); + intel_ggtt_restore_fences(&dev_priv->ggtt); /* * On VLV/CHV display interrupts are part of the display diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 1f5b9a584f71..ddd5b40cbbbc 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -92,7 +92,6 @@ #include "intel_wopcm.h" #include "i915_gem.h" -#include "i915_gem_fence_reg.h" #include "i915_gem_gtt.h" #include "i915_gpu_error.h" #include "i915_perf_types.h" diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index ca5420012a22..2c53be0bd9fd 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1156,7 +1156,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv) /* Minimal basic recovery for KMS */ ret = i915_ggtt_enable_hw(dev_priv); i915_ggtt_resume(&dev_priv->ggtt); - i915_gem_restore_fences(&dev_priv->ggtt); + intel_ggtt_restore_fences(&dev_priv->ggtt); intel_init_clock_gating(dev_priv); } diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index e1ced1df13e1..2764c277326f 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -33,7 +33,6 @@ #include "gem/i915_gem_object.h" #include "i915_gem_gtt.h" -#include "i915_gem_fence_reg.h" #include "i915_active.h" #include "i915_request.h" diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c index 623759b73bb4..7ea517a21e0b 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem.c @@ -125,7 +125,7 @@ static void pm_resume(struct drm_i915_private *i915) */ with_intel_runtime_pm(&i915->runtime_pm, wakeref) { i915_ggtt_resume(&i915->ggtt); - i915_gem_restore_fences(&i915->ggtt); + intel_ggtt_restore_fences(&i915->ggtt); i915_gem_resume(i915); } From patchwork Mon Mar 16 11:42:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440255 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DC7826CA for ; Mon, 16 Mar 2020 11:43:05 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C3E4C20738 for ; Mon, 16 Mar 2020 11:43:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C3E4C20738 Authentication-Results: mail.kernel.org; 
dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 539486E424; Mon, 16 Mar 2020 11:43:05 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8F61A6E421 for ; Mon, 16 Mar 2020 11:43:01 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574769-1500050 for multiple; Mon, 16 Mar 2020 11:42:37 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:24 +0000 Message-Id: <20200316114237.5436-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 02/15] drm/i915/gt: Pull restoration of GGTT fences underneath the GT X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Make the GT responsible for restoring its fence when it wakes up from suspend. Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/gt/intel_ggtt.c | 2 ++ drivers/gpu/drm/i915/gt/intel_gt_pm.c | 1 + drivers/gpu/drm/i915/i915_drv.c | 4 ---- drivers/gpu/drm/i915/i915_gem.c | 1 - drivers/gpu/drm/i915/selftests/i915_gem.c | 2 -- 5 files changed, 3 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index a7b72fa569a7..bde4f64a41f7 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -1195,6 +1195,8 @@ void i915_ggtt_resume(struct i915_ggtt *ggtt) if (INTEL_GEN(ggtt->vm.i915) >= 8) setup_private_pat(ggtt->vm.gt->uncore); + + intel_ggtt_restore_fences(ggtt); } static struct scatterlist * diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c index 8b653c0f5e5f..2e40400d1ecd 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c @@ -324,6 +324,7 @@ int intel_gt_runtime_resume(struct intel_gt *gt) { GT_TRACE(gt, "\n"); intel_gt_init_swizzling(gt); + intel_ggtt_restore_fences(gt->ggtt); return intel_uc_runtime_resume(>->uc); } diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 832140f4ea3d..48ba37e35bea 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1288,7 +1288,6 @@ static int i915_drm_resume(struct drm_device *dev) drm_err(&dev_priv->drm, "failed to re-enable GGTT\n"); i915_ggtt_resume(&dev_priv->ggtt); - intel_ggtt_restore_fences(&dev_priv->ggtt); intel_csr_ucode_resume(dev_priv); @@ -1606,8 +1605,6 @@ static int intel_runtime_suspend(struct device *kdev) intel_gt_runtime_resume(&dev_priv->gt); - intel_ggtt_restore_fences(&dev_priv->ggtt); - enable_rpm_wakeref_asserts(rpm); return ret; @@ -1687,7 +1684,6 @@ static int intel_runtime_resume(struct device 
*kdev) * we can do is to hope that things will still work (and disable RPM). */ intel_gt_runtime_resume(&dev_priv->gt); - intel_ggtt_restore_fences(&dev_priv->ggtt); /* * On VLV/CHV display interrupts are part of the display diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 2c53be0bd9fd..762b50b08d73 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1156,7 +1156,6 @@ int i915_gem_init(struct drm_i915_private *dev_priv) /* Minimal basic recovery for KMS */ ret = i915_ggtt_enable_hw(dev_priv); i915_ggtt_resume(&dev_priv->ggtt); - intel_ggtt_restore_fences(&dev_priv->ggtt); intel_init_clock_gating(dev_priv); } diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c index 7ea517a21e0b..88d400b9df88 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem.c @@ -125,8 +125,6 @@ static void pm_resume(struct drm_i915_private *i915) */ with_intel_runtime_pm(&i915->runtime_pm, wakeref) { i915_ggtt_resume(&i915->ggtt); - intel_ggtt_restore_fences(&i915->ggtt); - i915_gem_resume(i915); } } From patchwork Mon Mar 16 11:42:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440261 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4D805913 for ; Mon, 16 Mar 2020 11:43:09 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 35D8A20738 for ; Mon, 16 Mar 2020 11:43:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 35D8A20738 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B9F926E42E; Mon, 16 Mar 2020 11:43:08 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 283436E419 for ; Mon, 16 Mar 2020 11:42:51 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574770-1500050 for multiple; Mon, 16 Mar 2020 11:42:37 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:25 +0000 Message-Id: <20200316114237.5436-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 03/15] drm/i915: Remove manual save/resume of fence register state X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Since we always 
reload the fence register state on runtime resume, having it explicitly in the S0ix resume code is redundant. Indeed, it is not even being used! Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/i915_drv.h | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index ddd5b40cbbbc..a7ea1d855359 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -539,7 +539,6 @@ struct i915_suspend_saved_registers { u32 saveSWF0[16]; u32 saveSWF1[16]; u32 saveSWF3[3]; - u64 saveFENCE[I915_MAX_NUM_FENCES]; u32 savePCH_PORT_HOTPLUG; u16 saveGCDGMBUS; }; From patchwork Mon Mar 16 11:42:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440263 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 62F59913 for ; Mon, 16 Mar 2020 11:43:13 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4B0D820738 for ; Mon, 16 Mar 2020 11:43:13 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4B0D820738 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C6B896E42F; Mon, 16 Mar 2020 11:43:12 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 25E5F6E418 for ; Mon, 16 Mar 2020 11:42:51 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574771-1500050 for multiple; Mon, 16 Mar 2020 11:42:37 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:26 +0000 Message-Id: <20200316114237.5436-4-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 04/15] drm/i915/gt: Allocate i915_fence_reg array X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Since the number of fence regs can vary dramatically between platforms, allocate the array on demand so we don't waste as much space.
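In outline, intel_ggtt_init_fences() sizes the array to the fence count probed per platform (and per vGPU) and degrades gracefully on allocation failure; a condensed sketch of the init path, not the verbatim diff below:

	/* Size the fence array to the probed num_fences for this platform. */
	ggtt->fence_regs = kcalloc(num_fences,
				   sizeof(*ggtt->fence_regs),
				   GFP_KERNEL);
	if (!ggtt->fence_regs)
		num_fences = 0; /* carry on without fences rather than fail */

The fixed fence_regs[I915_MAX_NUM_FENCES] array in struct i915_ggtt becomes a pointer, and the new intel_ggtt_fini_fences() kfree()s it from i915_ggtt_driver_release().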
Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/gt/intel_ggtt.c | 6 ++++-- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 10 ++++++++++ drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h | 1 + drivers/gpu/drm/i915/gt/intel_gtt.h | 5 +++-- drivers/gpu/drm/i915/i915_vma.h | 1 + 5 files changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index bde4f64a41f7..8fcf14372d7a 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -698,11 +698,13 @@ static void ggtt_cleanup_hw(struct i915_ggtt *ggtt) */ void i915_ggtt_driver_release(struct drm_i915_private *i915) { + struct i915_ggtt *ggtt = &i915->ggtt; struct pagevec *pvec; - fini_aliasing_ppgtt(&i915->ggtt); + fini_aliasing_ppgtt(ggtt); - ggtt_cleanup_hw(&i915->ggtt); + intel_ggtt_fini_fences(ggtt); + ggtt_cleanup_hw(ggtt); pvec = &i915->mm.wc_stash.pvec; if (pvec->nr) { diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c index 94af75673a58..b6ba68c42546 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c @@ -857,6 +857,11 @@ void intel_ggtt_init_fences(struct i915_ggtt *ggtt) if (intel_vgpu_active(i915)) num_fences = intel_uncore_read(uncore, vgtif_reg(avail_rs.fence_num)); + ggtt->fence_regs = kcalloc(num_fences, + sizeof(*ggtt->fence_regs), + GFP_KERNEL); + if (!ggtt->fence_regs) + num_fences = 0; /* Initialize fence registers to zero */ for (i = 0; i < num_fences; i++) { @@ -871,6 +876,11 @@ void intel_ggtt_init_fences(struct i915_ggtt *ggtt) intel_ggtt_restore_fences(ggtt); } +void intel_ggtt_fini_fences(struct i915_ggtt *ggtt) +{ + kfree(ggtt->fence_regs); +} + void intel_gt_init_swizzling(struct intel_gt *gt) { struct drm_i915_private *i915 = gt->i915; diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h index 3b3eb5bf1b75..9850f6a85d2a 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h @@ -64,6 +64,7 @@ void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj, struct sg_table *pages); void intel_ggtt_init_fences(struct i915_ggtt *ggtt); +void intel_ggtt_fini_fences(struct i915_ggtt *ggtt); void intel_gt_init_swizzling(struct intel_gt *gt); diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index ce6ff9d3a350..d93ebdf3fa0e 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -26,7 +26,6 @@ #include #include "gt/intel_reset.h" -#include "gt/intel_ggtt_fencing.h" #include "i915_selftest.h" #include "i915_vma_types.h" @@ -135,6 +134,8 @@ typedef u64 gen8_pte_t; #define GEN8_PDE_IPS_64K BIT(11) #define GEN8_PDE_PS_2M BIT(7) +struct i915_fence_reg; + #define for_each_sgt_daddr(__dp, __iter, __sgt) \ __for_each_sgt_daddr(__dp, __iter, __sgt, I915_GTT_PAGE_SIZE) @@ -333,7 +334,7 @@ struct i915_ggtt { u32 pin_bias; unsigned int num_fences; - struct i915_fence_reg fence_regs[I915_MAX_NUM_FENCES]; + struct i915_fence_reg *fence_regs; struct list_head fence_list; /** diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 2764c277326f..b958ad07f212 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -30,6 +30,7 @@ #include +#include "gt/intel_ggtt_fencing.h" #include "gem/i915_gem_object.h" #include "i915_gem_gtt.h" From patchwork Mon Mar 
16 11:42:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440241 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2A2F76CA for ; Mon, 16 Mar 2020 11:42:52 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1269F20738 for ; Mon, 16 Mar 2020 11:42:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1269F20738 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 461816E41B; Mon, 16 Mar 2020 11:42:51 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 613826E417 for ; Mon, 16 Mar 2020 11:42:49 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574772-1500050 for multiple; Mon, 16 Mar 2020 11:42:37 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:27 +0000 Message-Id: <20200316114237.5436-5-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 05/15] drm/i915/gt: Only wait for GPU activity before unbinding a GGTT fence X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Only GPU activity via the GGTT fence is asynchronous; since we control the CPU access directly, we only need to wait for the GPU to stop using the fence before we relinquish it.
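In outline, each fence register gains an i915_active tracker: requests that need the fence mark it busy, and fence_update() then waits on that tracker alone. A condensed sketch of the two hunks below (the diff is authoritative):

	/* i915_vma_move_to_active(): record GPU use of the fence */
	if (flags & EXEC_OBJECT_NEEDS_FENCE && vma->fence)
		i915_active_add_request(&vma->fence->active, rq);

	/* fence_update(): wait only for outstanding GPU activity,
	 * no longer a full i915_vma_sync() on the old vma
	 */
	ret = i915_active_wait(&fence->active);

The one exception is gen2/3, whose implicit 'unfenced' GPU blits still require the full i915_vma_sync() on the incoming vma.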
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 12 ++++++++---- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h | 3 +++ drivers/gpu/drm/i915/i915_vma.c | 4 ++++ 3 files changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c index b6ba68c42546..495ad8b2cab0 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c @@ -237,15 +237,18 @@ static int fence_update(struct i915_fence_reg *fence, GEM_BUG_ON(!i915_gem_object_get_stride(vma->obj) || !i915_gem_object_get_tiling(vma->obj)); - ret = i915_vma_sync(vma); - if (ret) - return ret; + if (INTEL_GEN(fence_to_i915(fence)) < 4) { + /* implicit 'unfenced' GPU blits */ + ret = i915_vma_sync(vma); + if (ret) + return ret; + } } old = xchg(&fence->vma, NULL); if (old) { /* XXX Ideally we would move the waiting to outside the mutex */ - ret = i915_vma_sync(old); + ret = i915_active_wait(&fence->active); if (ret) { fence->vma = old; return ret; @@ -867,6 +870,7 @@ void intel_ggtt_init_fences(struct i915_ggtt *ggtt) for (i = 0; i < num_fences; i++) { struct i915_fence_reg *fence = &ggtt->fence_regs[i]; + i915_active_init(&fence->active, NULL, NULL); fence->ggtt = ggtt; fence->id = i; list_add_tail(&fence->link, &ggtt->fence_list); diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h index 9850f6a85d2a..08c6bb667581 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h @@ -28,6 +28,8 @@ #include #include +#include "i915_active.h" + struct drm_i915_gem_object; struct i915_ggtt; struct i915_vma; @@ -41,6 +43,7 @@ struct i915_fence_reg { struct i915_ggtt *ggtt; struct i915_vma *vma; atomic_t pin_count; + struct i915_active active; int id; /** * Whether the tiling parameters for the currently diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 5b3efb43a8ef..aedbd056fd45 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -1214,6 +1214,10 @@ int i915_vma_move_to_active(struct i915_vma *vma, dma_resv_add_shared_fence(vma->resv, &rq->fence); obj->write_domain = 0; } + + if (flags & EXEC_OBJECT_NEEDS_FENCE && vma->fence) + i915_active_add_request(&vma->fence->active, rq); + obj->read_domains |= I915_GEM_GPU_DOMAINS; obj->mm.dirty = true; From patchwork Mon Mar 16 11:42:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440249 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 247816CA for ; Mon, 16 Mar 2020 11:42:56 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0C46A20724 for ; Mon, 16 Mar 2020 11:42:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0C46A20724 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0B2D66E420; Mon, 16 Mar 2020 
11:42:52 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 643C26E419 for ; Mon, 16 Mar 2020 11:42:49 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574773-1500050 for multiple; Mon, 16 Mar 2020 11:42:37 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:28 +0000 Message-Id: <20200316114237.5436-6-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 06/15] drm/i915/gt: Store the fence details on the fence X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Make a copy of the object tiling parameters at the point of grabbing the fence. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 93 +++++++------------- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h | 4 + 2 files changed, 37 insertions(+), 60 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c index 495ad8b2cab0..1b9adae1a7a8 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c @@ -68,8 +68,7 @@ static struct intel_uncore *fence_to_uncore(struct i915_fence_reg *fence) return fence->ggtt->vm.gt->uncore; } -static void i965_write_fence_reg(struct i915_fence_reg *fence, - struct i915_vma *vma) +static void i965_write_fence_reg(struct i915_fence_reg *fence) { i915_reg_t fence_reg_lo, fence_reg_hi; int fence_pitch_shift; @@ -87,18 +86,16 @@ static void i965_write_fence_reg(struct i915_fence_reg *fence, } val = 0; - if (vma) { - unsigned int stride = i915_gem_object_get_stride(vma->obj); + if (fence->tiling) { + unsigned int stride = fence->stride; - GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma)); - GEM_BUG_ON(!IS_ALIGNED(vma->node.start, I965_FENCE_PAGE)); - GEM_BUG_ON(!IS_ALIGNED(vma->fence_size, I965_FENCE_PAGE)); GEM_BUG_ON(!IS_ALIGNED(stride, 128)); - val = (vma->node.start + vma->fence_size - I965_FENCE_PAGE) << 32; - val |= vma->node.start; + val = fence->start + fence->size - I965_FENCE_PAGE; + val <<= 32; + val |= fence->start; val |= (u64)((stride / 128) - 1) << fence_pitch_shift; - if (i915_gem_object_get_tiling(vma->obj) == I915_TILING_Y) + if (fence->tiling == I915_TILING_Y) val |= BIT(I965_FENCE_TILING_Y_SHIFT); val |= I965_FENCE_REG_VALID; } @@ -125,21 +122,15 @@ static void i965_write_fence_reg(struct i915_fence_reg *fence, } } -static void i915_write_fence_reg(struct i915_fence_reg *fence, - struct i915_vma *vma) +static void i915_write_fence_reg(struct i915_fence_reg *fence) { u32 val; val = 0; - if (vma) { - unsigned int tiling = i915_gem_object_get_tiling(vma->obj); + if (fence->tiling) { + unsigned int stride = fence->stride; + unsigned int tiling = fence->tiling; bool is_y_tiled = tiling == I915_TILING_Y; - unsigned int stride = i915_gem_object_get_stride(vma->obj); - - 
GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma)); - GEM_BUG_ON(vma->node.start & ~I915_FENCE_START_MASK); - GEM_BUG_ON(!is_power_of_2(vma->fence_size)); - GEM_BUG_ON(!IS_ALIGNED(vma->node.start, vma->fence_size)); if (is_y_tiled && HAS_128_BYTE_Y_TILING(fence_to_i915(fence))) stride /= 128; @@ -147,10 +138,10 @@ static void i915_write_fence_reg(struct i915_fence_reg *fence, stride /= 512; GEM_BUG_ON(!is_power_of_2(stride)); - val = vma->node.start; + val = fence->start; if (is_y_tiled) val |= BIT(I830_FENCE_TILING_Y_SHIFT); - val |= I915_FENCE_SIZE_BITS(vma->fence_size); + val |= I915_FENCE_SIZE_BITS(fence->size); val |= ilog2(stride) << I830_FENCE_PITCH_SHIFT; val |= I830_FENCE_REG_VALID; @@ -165,25 +156,18 @@ static void i915_write_fence_reg(struct i915_fence_reg *fence, } } -static void i830_write_fence_reg(struct i915_fence_reg *fence, - struct i915_vma *vma) +static void i830_write_fence_reg(struct i915_fence_reg *fence) { u32 val; val = 0; - if (vma) { - unsigned int stride = i915_gem_object_get_stride(vma->obj); - - GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma)); - GEM_BUG_ON(vma->node.start & ~I830_FENCE_START_MASK); - GEM_BUG_ON(!is_power_of_2(vma->fence_size)); - GEM_BUG_ON(!is_power_of_2(stride / 128)); - GEM_BUG_ON(!IS_ALIGNED(vma->node.start, vma->fence_size)); + if (fence->tiling) { + unsigned int stride = fence->stride; - val = vma->node.start; - if (i915_gem_object_get_tiling(vma->obj) == I915_TILING_Y) + val = fence->start; + if (fence->tiling == I915_TILING_Y) val |= BIT(I830_FENCE_TILING_Y_SHIFT); - val |= I830_FENCE_SIZE_BITS(vma->fence_size); + val |= I830_FENCE_SIZE_BITS(fence->size); val |= ilog2(stride / 128) << I830_FENCE_PITCH_SHIFT; val |= I830_FENCE_REG_VALID; } @@ -197,8 +181,7 @@ static void i830_write_fence_reg(struct i915_fence_reg *fence, } } -static void fence_write(struct i915_fence_reg *fence, - struct i915_vma *vma) +static void fence_write(struct i915_fence_reg *fence) { struct drm_i915_private *i915 = fence_to_i915(fence); @@ -209,18 +192,16 @@ static void fence_write(struct i915_fence_reg *fence, */ if (IS_GEN(i915, 2)) - i830_write_fence_reg(fence, vma); + i830_write_fence_reg(fence); else if (IS_GEN(i915, 3)) - i915_write_fence_reg(fence, vma); + i915_write_fence_reg(fence); else - i965_write_fence_reg(fence, vma); + i965_write_fence_reg(fence); /* * Access through the fenced region afterwards is * ordered by the posting reads whilst writing the registers. 
*/ - - fence->dirty = false; } static int fence_update(struct i915_fence_reg *fence, @@ -232,6 +213,7 @@ static int fence_update(struct i915_fence_reg *fence, struct i915_vma *old; int ret; + fence->tiling = 0; if (vma) { GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma)); GEM_BUG_ON(!i915_gem_object_get_stride(vma->obj) || @@ -243,7 +225,13 @@ static int fence_update(struct i915_fence_reg *fence, if (ret) return ret; } + + fence->start = vma->node.start; + fence->size = vma->fence_size; + fence->stride = i915_gem_object_get_stride(vma->obj); + fence->tiling = i915_gem_object_get_tiling(vma->obj); } + WRITE_ONCE(fence->dirty, false); old = xchg(&fence->vma, NULL); if (old) { @@ -286,7 +274,7 @@ static int fence_update(struct i915_fence_reg *fence, } WRITE_ONCE(fence->vma, vma); - fence_write(fence, vma); + fence_write(fence); if (vma) { vma->fence = fence; @@ -494,23 +482,8 @@ void intel_ggtt_restore_fences(struct i915_ggtt *ggtt) { int i; - rcu_read_lock(); /* keep obj alive as we dereference */ - for (i = 0; i < ggtt->num_fences; i++) { - struct i915_fence_reg *reg = &ggtt->fence_regs[i]; - struct i915_vma *vma = READ_ONCE(reg->vma); - - GEM_BUG_ON(vma && vma->fence != reg); - - /* - * Commit delayed tiling changes if we have an object still - * attached to the fence, otherwise just clear the fence. - */ - if (vma && !i915_gem_object_is_tiled(vma->obj)) - vma = NULL; - - fence_write(reg, vma); - } - rcu_read_unlock(); + for (i = 0; i < ggtt->num_fences; i++) + fence_write(&ggtt->fence_regs[i]); } /** diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h index 08c6bb667581..9eef679e1311 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h @@ -54,6 +54,10 @@ struct i915_fence_reg { * command (such as BLT on gen2/3), as a "fence". 
*/ bool dirty; + u32 start; + u32 size; + u32 tiling; + u32 stride; }; struct i915_fence_reg *i915_reserve_fence(struct i915_ggtt *ggtt); From patchwork Mon Mar 16 11:42:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440253 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9E4BD913 for ; Mon, 16 Mar 2020 11:42:57 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8693720724 for ; Mon, 16 Mar 2020 11:42:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8693720724 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 070E66E422; Mon, 16 Mar 2020 11:42:53 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8789B6E41D for ; Mon, 16 Mar 2020 11:42:49 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574774-1500050 for multiple; Mon, 16 Mar 2020 11:42:37 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:29 +0000 Message-Id: <20200316114237.5436-7-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 07/15] drm/i915/gt: Make fence revocation unequivocal X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" If we must revoke the fence because the VMA is no longer present, or because the fence no longer applies, ensure that we do and convert it into an error if we try but cannot. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 21 +++++++++++--------- drivers/gpu/drm/i915/i915_gem.c | 12 +++++------ drivers/gpu/drm/i915/i915_vma.c | 4 +--- drivers/gpu/drm/i915/i915_vma.h | 2 +- 4 files changed, 19 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c index 1b9adae1a7a8..97659a1249fd 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c @@ -291,23 +291,26 @@ static int fence_update(struct i915_fence_reg *fence, * * This function force-removes any fence from the given object, which is useful * if the kernel wants to do untiled GTT access. - * - * Returns: - * - * 0 on success, negative error code on failure. 
*/ -int i915_vma_revoke_fence(struct i915_vma *vma) +void i915_vma_revoke_fence(struct i915_vma *vma) { struct i915_fence_reg *fence = vma->fence; + intel_wakeref_t wakeref; lockdep_assert_held(&vma->vm->mutex); if (!fence) - return 0; + return; - if (atomic_read(&fence->pin_count)) - return -EBUSY; + GEM_BUG_ON(fence->vma != vma); + GEM_BUG_ON(!i915_active_is_idle(&fence->active)); + GEM_BUG_ON(atomic_read(&fence->pin_count)); + + fence->tiling = 0; + WRITE_ONCE(fence->vma, NULL); + vma->fence = NULL; - return fence_update(fence, NULL); + with_intel_runtime_pm_if_in_use(fence_to_uncore(fence)->rpm, wakeref) + fence_write(fence); } static struct i915_fence_reg *fence_find(struct i915_ggtt *ggtt) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 762b50b08d73..b0836fc47ae6 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -993,18 +993,16 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, return ERR_PTR(ret); } + ret = i915_vma_pin(vma, size, alignment, flags | PIN_GLOBAL); + if (ret) + return ERR_PTR(ret); + if (vma->fence && !i915_gem_object_is_tiled(obj)) { mutex_lock(&ggtt->vm.mutex); - ret = i915_vma_revoke_fence(vma); + i915_vma_revoke_fence(vma); mutex_unlock(&ggtt->vm.mutex); - if (ret) - return ERR_PTR(ret); } - ret = i915_vma_pin(vma, size, alignment, flags | PIN_GLOBAL); - if (ret) - return ERR_PTR(ret); - ret = i915_vma_wait_for_bind(vma); if (ret) { i915_vma_unpin(vma); diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index aedbd056fd45..df197b07ac99 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -1280,9 +1280,7 @@ int __i915_vma_unbind(struct i915_vma *vma) i915_vma_flush_writes(vma); /* release the fence reg _after_ flushing */ - ret = i915_vma_revoke_fence(vma); - if (ret) - return ret; + i915_vma_revoke_fence(vma); /* Force a pagefault for domain tracking on next user access */ i915_vma_revoke_mmap(vma); diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index b958ad07f212..8ad1daabcd58 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -326,7 +326,7 @@ static inline struct page *i915_vma_first_page(struct i915_vma *vma) * True if the vma has a fence, false otherwise. 
*/ int __must_check i915_vma_pin_fence(struct i915_vma *vma); -int __must_check i915_vma_revoke_fence(struct i915_vma *vma); +void i915_vma_revoke_fence(struct i915_vma *vma); int __i915_vma_pin_fence(struct i915_vma *vma); From patchwork Mon Mar 16 11:42:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440247 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 584AF913 for ; Mon, 16 Mar 2020 11:42:55 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4097320724 for ; Mon, 16 Mar 2020 11:42:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4097320724 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id AFF376E417; Mon, 16 Mar 2020 11:42:51 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 618716E418 for ; Mon, 16 Mar 2020 11:42:49 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574775-1500050 for multiple; Mon, 16 Mar 2020 11:42:38 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:30 +0000 Message-Id: <20200316114237.5436-8-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 08/15] drm/i915/gem: Drop cached obj->bind_count X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" We cached the number of vmas bound to the object in order to speed up shrinker decisions. This has been superseded by being more proactive in removing objects we cannot shrink from the shrinker lists, and so we can drop the clumsy attempt at atomically counting the bind count and comparing it to the number of pinned mappings of the object. This will only get clumsier with asynchronous binding and unbinding.
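In outline, the object's vma list becomes the source of truth; a condensed sketch of the replacement checks (the diff below is authoritative):

	/* i915_gem_object_unbind(): nothing bound, nothing to do */
	if (list_empty(&obj->vma.list))
		return 0;

	/* new I915_GEM_OBJECT_UNBIND_TEST: report instead of unbinding */
	if (flags & I915_GEM_OBJECT_UNBIND_TEST) {
		ret = -EBUSY;
		break;
	}

The shrinker sets the TEST flag when I915_SHRINK_BOUND is not permitted, so encountering a bound vma bails out of the walk with -EBUSY rather than unbinding it.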
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_domain.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_object.c | 1 - .../gpu/drm/i915/gem/i915_gem_object_types.h | 3 --- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 2 -- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 18 ++----------- .../drm/i915/gem/selftests/i915_gem_mman.c | 4 --- drivers/gpu/drm/i915/i915_debugfs.c | 7 ++--- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gem.c | 7 ++++- drivers/gpu/drm/i915/i915_vma.c | 24 ----------------- .../gpu/drm/i915/selftests/i915_gem_evict.c | 26 +------------------ drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 1 - 12 files changed, 13 insertions(+), 83 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c index 0cc40e77bbd2..af43e82f45c7 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c @@ -369,7 +369,7 @@ static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object *obj) struct i915_vma *vma; GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj)); - if (!atomic_read(&obj->bind_count)) + if (list_empty(&obj->vma.list)) return; mutex_lock(&i915->ggtt.vm.mutex); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 5da9f9e534b9..3f01cdd1a39b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -206,7 +206,6 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915, } obj->mmo.offsets = RB_ROOT; - GEM_BUG_ON(atomic_read(&obj->bind_count)); GEM_BUG_ON(obj->userfault_count); GEM_BUG_ON(!list_empty(&obj->lut_list)); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index a0b10bcd8d8a..54ee658bb168 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -179,9 +179,6 @@ struct drm_i915_gem_object { #define TILING_MASK (FENCE_MINIMUM_STRIDE - 1) #define STRIDE_MASK (~TILING_MASK) - /** Count of VMA actually bound by this object */ - atomic_t bind_count; - struct { /* * Protects the pages and their use. Do not use directly, but diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index 24f4cadea114..5d855fcd5c0f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -199,8 +199,6 @@ int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj) if (i915_gem_object_has_pinned_pages(obj)) return -EBUSY; - GEM_BUG_ON(atomic_read(&obj->bind_count)); - /* May be called by shrinker from within get_pages() (on another bo) */ mutex_lock(&obj->mm.lock); if (unlikely(atomic_read(&obj->mm.pages_pin_count))) { diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index 03e5eb4c99d1..5b65ce738b16 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -26,18 +26,6 @@ static bool can_release_pages(struct drm_i915_gem_object *obj) if (!i915_gem_object_is_shrinkable(obj)) return false; - /* - * Only report true if by unbinding the object and putting its pages - * we can actually make forward progress towards freeing physical - * pages. 
- * - * If the pages are pinned for any other reason than being bound - * to the GPU, simply unbinding from the GPU is not going to succeed - * in releasing our pin count on the pages themselves. - */ - if (atomic_read(&obj->mm.pages_pin_count) > atomic_read(&obj->bind_count)) - return false; - /* * We can only return physical pages to the system if we can either * discard the contents (because the user has marked them as being @@ -54,6 +42,8 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj, flags = 0; if (shrink & I915_SHRINK_ACTIVE) flags = I915_GEM_OBJECT_UNBIND_ACTIVE; + if (!(shrink & I915_SHRINK_BOUND)) + flags = I915_GEM_OBJECT_UNBIND_TEST; if (i915_gem_object_unbind(obj, flags) == 0) __i915_gem_object_put_pages(obj); @@ -194,10 +184,6 @@ i915_gem_shrink(struct drm_i915_private *i915, i915_gem_object_is_framebuffer(obj)) continue; - if (!(shrink & I915_SHRINK_BOUND) && - atomic_read(&obj->bind_count)) - continue; - if (!can_release_pages(obj)) continue; diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c index 43912e9b683d..ef7abcb3f4ee 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c @@ -1156,9 +1156,6 @@ static int __igt_mmap_revoke(struct drm_i915_private *i915, if (err) goto out_unmap; - GEM_BUG_ON(mmo->mmap_type == I915_MMAP_TYPE_GTT && - !atomic_read(&obj->bind_count)); - err = check_present(addr, obj->base.size); if (err) { pr_err("%s: was not present\n", obj->mm.region->name); @@ -1175,7 +1172,6 @@ static int __igt_mmap_revoke(struct drm_i915_private *i915, pr_err("Failed to unbind object!\n"); goto out_unmap; } - GEM_BUG_ON(atomic_read(&obj->bind_count)); if (type != I915_MMAP_TYPE_GTT) { __i915_gem_object_put_pages(obj); diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 6ca797128aa1..9881940eaefd 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -218,7 +218,7 @@ i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) struct file_stats { struct i915_address_space *vm; unsigned long count; - u64 total, unbound; + u64 total; u64 active, inactive; u64 closed; }; @@ -234,8 +234,6 @@ static int per_file_stats(int id, void *ptr, void *data) stats->count++; stats->total += obj->base.size; - if (!atomic_read(&obj->bind_count)) - stats->unbound += obj->base.size; spin_lock(&obj->vma.lock); if (!stats->vm) { @@ -285,13 +283,12 @@ static int per_file_stats(int id, void *ptr, void *data) #define print_file_stats(m, name, stats) do { \ if (stats.count) \ - seq_printf(m, "%s: %lu objects, %llu bytes (%llu active, %llu inactive, %llu unbound, %llu closed)\n", \ + seq_printf(m, "%s: %lu objects, %llu bytes (%llu active, %llu inactive, %llu closed)\n", \ name, \ stats.count, \ stats.total, \ stats.active, \ stats.inactive, \ - stats.unbound, \ stats.closed); \ } while (0) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a7ea1d855359..09ecf0526441 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1736,6 +1736,7 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj, unsigned long flags); #define I915_GEM_OBJECT_UNBIND_ACTIVE BIT(0) #define I915_GEM_OBJECT_UNBIND_BARRIER BIT(1) +#define I915_GEM_OBJECT_UNBIND_TEST BIT(2) void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/i915_gem.c 
b/drivers/gpu/drm/i915/i915_gem.c index b0836fc47ae6..0cbcb9f54e7d 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -118,7 +118,7 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj, struct i915_vma *vma; int ret; - if (!atomic_read(&obj->bind_count)) + if (list_empty(&obj->vma.list)) return 0; /* @@ -141,6 +141,11 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj, if (!i915_vma_is_bound(vma, I915_VMA_BIND_MASK)) continue; + if (flags & I915_GEM_OBJECT_UNBIND_TEST) { + ret = -EBUSY; + break; + } + ret = -EAGAIN; if (!i915_vm_tryopen(vm)) break; diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index df197b07ac99..73dafaafb3e5 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -608,18 +608,6 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long color) return true; } -static void assert_bind_count(const struct drm_i915_gem_object *obj) -{ - /* - * Combine the assertion that the object is bound and that we have - * pinned its pages. But we should never have bound the object - * more than we have pinned its pages. (For complete accuracy, we - * assume that no else is pinning the pages, but as a rough assertion - * that we will not run into problems later, this will do!) - */ - GEM_BUG_ON(atomic_read(&obj->mm.pages_pin_count) < atomic_read(&obj->bind_count)); -} - /** * i915_vma_insert - finds a slot for the vma in its address space * @vma: the vma @@ -738,12 +726,6 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) GEM_BUG_ON(!drm_mm_node_allocated(&vma->node)); GEM_BUG_ON(!i915_gem_valid_gtt_space(vma, color)); - if (vma->obj) { - struct drm_i915_gem_object *obj = vma->obj; - - atomic_inc(&obj->bind_count); - assert_bind_count(obj); - } list_add_tail(&vma->vm_link, &vma->vm->bound_list); return 0; @@ -761,12 +743,6 @@ i915_vma_detach(struct i915_vma *vma) * it to be reaped by the shrinker. 
*/ list_del(&vma->vm_link); - if (vma->obj) { - struct drm_i915_gem_object *obj = vma->obj; - - assert_bind_count(obj); - atomic_dec(&obj->bind_count); - } } static bool try_qad_pin(struct i915_vma *vma, unsigned int flags) diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c index 06ef88510209..028baae9631f 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c @@ -45,8 +45,8 @@ static void quirk_add(struct drm_i915_gem_object *obj, static int populate_ggtt(struct i915_ggtt *ggtt, struct list_head *objects) { - unsigned long unbound, bound, count; struct drm_i915_gem_object *obj; + unsigned long count; count = 0; do { @@ -72,30 +72,6 @@ static int populate_ggtt(struct i915_ggtt *ggtt, struct list_head *objects) pr_debug("Filled GGTT with %lu pages [%llu total]\n", count, ggtt->vm.total / PAGE_SIZE); - bound = 0; - unbound = 0; - list_for_each_entry(obj, objects, st_link) { - GEM_BUG_ON(!obj->mm.quirked); - - if (atomic_read(&obj->bind_count)) - bound++; - else - unbound++; - } - GEM_BUG_ON(bound + unbound != count); - - if (unbound) { - pr_err("%s: Found %lu objects unbound, expected %u!\n", - __func__, unbound, 0); - return -EINVAL; - } - - if (bound != count) { - pr_err("%s: Found %lu objects bound, expected %lu!\n", - __func__, bound, count); - return -EINVAL; - } - if (list_empty(&ggtt->vm.bound_list)) { pr_err("No objects on the GGTT inactive list!\n"); return -EINVAL; diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c index b342bef5e7c9..5d2a02fcf595 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c @@ -1229,7 +1229,6 @@ static void track_vma_bind(struct i915_vma *vma) { struct drm_i915_gem_object *obj = vma->obj; - atomic_inc(&obj->bind_count); /* track for eviction later */ __i915_gem_object_pin_pages(obj); GEM_BUG_ON(vma->pages); From patchwork Mon Mar 16 11:42:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440243 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7BF996CA for ; Mon, 16 Mar 2020 11:42:53 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 644A520738 for ; Mon, 16 Mar 2020 11:42:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 644A520738 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B1F846E41D; Mon, 16 Mar 2020 11:42:51 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 73B366E41D for ; Mon, 16 Mar 2020 11:42:50 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by 
fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574776-1500050 for multiple; Mon, 16 Mar 2020 11:42:38 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:31 +0000 Message-Id: <20200316114237.5436-9-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 09/15] drm/i915: Immediately execute the fenced work X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" If the caller allows and we do not have to wait for any signals, immediately execute the work within the caller's process. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/i915_sw_fence_work.c | 6 +++++- drivers/gpu/drm/i915/i915_sw_fence_work.h | 7 +++++++ drivers/gpu/drm/i915/i915_vma.c | 2 +- 4 files changed, 14 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index d3f4f28e9468..f647f98a9bc6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1784,7 +1784,7 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb, dma_resv_add_excl_fence(shadow->resv, &pw->base.dma); dma_resv_unlock(shadow->resv); - dma_fence_work_commit(&pw->base); + dma_fence_work_commit_imm(&pw->base); return 0; err_batch_unlock: diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c b/drivers/gpu/drm/i915/i915_sw_fence_work.c index 997b2998f1f2..cb2858d369cf 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence_work.c +++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c @@ -38,7 +38,10 @@ fence_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) if (!f->dma.error) { dma_fence_get(&f->dma); - queue_work(system_unbound_wq, &f->work); + if (f->immediate) + fence_work(&f->work); + else + queue_work(system_unbound_wq, &f->work); } else { fence_complete(f); } @@ -88,6 +91,7 @@ void dma_fence_work_init(struct dma_fence_work *f, dma_fence_init(&f->dma, &fence_ops, &f->lock, 0, 0); i915_sw_fence_init(&f->chain, fence_notify); INIT_WORK(&f->work, fence_work); + f->immediate = false; } int dma_fence_work_chain(struct dma_fence_work *f, struct dma_fence *signal) diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.h b/drivers/gpu/drm/i915/i915_sw_fence_work.h index 3a22b287e201..c258a2b2e6f4 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence_work.h +++ b/drivers/gpu/drm/i915/i915_sw_fence_work.h @@ -28,6 +28,7 @@ struct dma_fence_work { struct i915_sw_fence chain; struct i915_sw_dma_fence_cb cb; + bool immediate; struct work_struct work; const struct dma_fence_work_ops *ops; }; @@ -41,4 +42,10 @@ static inline void dma_fence_work_commit(struct dma_fence_work *f) i915_sw_fence_commit(&f->chain); } +static inline void dma_fence_work_commit_imm(struct dma_fence_work *f) +{ + f->immediate = atomic_read(&f->chain.pending) <= 1; + dma_fence_work_commit(f); +} + #endif /* I915_SW_FENCE_WORK_H */ diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 73dafaafb3e5..880e18f5b71b 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -956,7 +956,7 @@ int 
i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) mutex_unlock(&vma->vm->mutex); err_fence: if (work) - dma_fence_work_commit(&work->base); + dma_fence_work_commit_imm(&work->base); if (wakeref) intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref); err_pages: From patchwork Mon Mar 16 11:42:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440239 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5149C913 for ; Mon, 16 Mar 2020 11:42:51 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3924720738 for ; Mon, 16 Mar 2020 11:42:51 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3924720738 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B35906E41F; Mon, 16 Mar 2020 11:42:50 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 742766E41B for ; Mon, 16 Mar 2020 11:42:49 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574777-1500050 for multiple; Mon, 16 Mar 2020 11:42:38 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:32 +0000 Message-Id: <20200316114237.5436-10-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 10/15] drm/i915/gem: Assign context id for async work X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Allocate a few dma fence context ids that we can use to associate async work [for the CPU] launched on behalf of this context. For extra fun, we allow a configurable concurrency width.
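[Illustrative aside, not part of the patch: the diff below rounds the online CPU count down to a power of two with rounddown_pow_of_two() and then decrements it into a mask, so i915_gem_context_async_id() can hand out one of the reserved fence contexts round-robin with a single atomic increment and an AND, no modulo required. A minimal userspace sketch of the same arithmetic, where stub_context_alloc() is a stand-in for dma_fence_context_alloc() and the width of 4 is an assumed example value:]

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for dma_fence_context_alloc(): reserves a block of ids and
 * returns the first id of the block. */
static uint64_t stub_context_alloc(unsigned int num)
{
	static _Atomic uint64_t next = 1;
	return atomic_fetch_add(&next, num);
}

int main(void)
{
	unsigned int width = 4;		/* rounddown_pow_of_two(num_online_cpus()) */
	uint64_t base = stub_context_alloc(width);
	unsigned int mask = width - 1;	/* the patch's width-- */
	_Atomic unsigned int cur = 0;
	int i;

	/* Each call lands on one of 'width' contexts, round-robin. */
	for (i = 0; i < 8; i++)
		printf("async id %llu\n", (unsigned long long)
		       (base + (atomic_fetch_add(&cur, 1) & mask)));
	return 0;
}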
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 4 ++++ drivers/gpu/drm/i915/gem/i915_gem_context.h | 6 ++++++ drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 6 ++++++ 3 files changed, 16 insertions(+) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 026999b34abd..ade01be32a4a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -721,6 +721,10 @@ __create_context(struct drm_i915_private *i915) ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_NORMAL); mutex_init(&ctx->mutex); + ctx->async.width = rounddown_pow_of_two(num_online_cpus()); + ctx->async.context = dma_fence_context_alloc(ctx->async.width); + ctx->async.width--; + spin_lock_init(&ctx->stale.lock); INIT_LIST_HEAD(&ctx->stale.engines); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h index 57b7ae2893e1..0919694f4514 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h @@ -134,6 +134,12 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data, struct drm_file *file); +static inline u64 i915_gem_context_async_id(struct i915_gem_context *ctx) +{ + return (ctx->async.context + + (atomic_fetch_inc(&ctx->async.cur) & ctx->async.width)); +} + static inline struct i915_gem_context * i915_gem_context_get(struct i915_gem_context *ctx) { diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h index 28760bd03265..5f5cfa3a3e9b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h @@ -85,6 +85,12 @@ struct i915_gem_context { struct intel_timeline *timeline; + struct { + u64 context; + atomic_t cur; + unsigned int width; + } async; + /** * @vm: unique address space (GTT) * From patchwork Mon Mar 16 11:42:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440257 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 407AA1668 for ; Mon, 16 Mar 2020 11:43:06 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 28BA320738 for ; Mon, 16 Mar 2020 11:43:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 28BA320738 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 77B9C6E425; Mon, 16 Mar 2020 11:43:05 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 561346E41B for ; Mon, 16 Mar 2020 11:42:50 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified 
[78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574778-1500050 for multiple; Mon, 16 Mar 2020 11:42:38 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:33 +0000 Message-Id: <20200316114237.5436-11-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 11/15] drm/i915: Export a preallocate variant of i915_active_acquire() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Sometimes we have to be very careful not to allocate underneath a mutex (or spinlock) and yet still want to track activity. Enter i915_active_acquire_for_context(). This raises the activity counter on i915_active prior to use and ensures that the fence-tree contains a slot for the context. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_active.c | 107 ++++++++++++++++++++++++++--- drivers/gpu/drm/i915/i915_active.h | 5 ++ 2 files changed, 103 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index c4048628188a..6bb09f3020e2 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -217,11 +217,10 @@ excl_retire(struct dma_fence *fence, struct dma_fence_cb *cb) } static struct i915_active_fence * -active_instance(struct i915_active *ref, struct intel_timeline *tl) +active_instance(struct i915_active *ref, u64 idx) { struct active_node *node, *prealloc; struct rb_node **p, *parent; - u64 idx = tl->fence_context; /* * We track the most recently used timeline to skip a rbtree search @@ -367,7 +366,7 @@ int i915_active_ref(struct i915_active *ref, if (err) return err; - active = active_instance(ref, tl); + active = active_instance(ref, tl->fence_context); if (!active) { err = -ENOMEM; goto out; @@ -384,32 +383,104 @@ int i915_active_ref(struct i915_active *ref, atomic_dec(&ref->count); } if (!__i915_active_fence_set(active, fence)) - atomic_inc(&ref->count); + __i915_active_acquire(ref); out: i915_active_release(ref); return err; } -struct dma_fence * -i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f) +static struct dma_fence * +__i915_active_set_fence(struct i915_active *ref, + struct i915_active_fence *active, + struct dma_fence *fence) { struct dma_fence *prev; /* We expect the caller to manage the exclusive timeline ordering */ GEM_BUG_ON(i915_active_is_idle(ref)); + if (is_barrier(active)) { /* proto-node used by our idle barrier */ + /* + * This request is on the kernel_context timeline, and so + * we can use it to substitute for the pending idle-barrier + * request that we want to emit on the kernel_context.
+ */ + __active_del_barrier(ref, node_from_active(active)); + RCU_INIT_POINTER(active->fence, NULL); + atomic_dec(&ref->count); + } + rcu_read_lock(); - prev = __i915_active_fence_set(&ref->excl, f); + prev = __i915_active_fence_set(active, fence); if (prev) prev = dma_fence_get_rcu(prev); else - atomic_inc(&ref->count); + __i915_active_acquire(ref); rcu_read_unlock(); return prev; } +static struct i915_active_fence * +__active_lookup(struct i915_active *ref, u64 idx) +{ + struct active_node *node; + struct rb_node *p; + + /* Like active_instance() but with no malloc */ + + node = READ_ONCE(ref->cache); + if (node && node->timeline == idx) + return &node->base; + + spin_lock_irq(&ref->tree_lock); + GEM_BUG_ON(i915_active_is_idle(ref)); + + p = ref->tree.rb_node; + while (p) { + node = rb_entry(p, struct active_node, node); + if (node->timeline == idx) { + ref->cache = node; + spin_unlock_irq(&ref->tree_lock); + return &node->base; + } + + if (node->timeline < idx) + p = p->rb_right; + else + p = p->rb_left; + } + + spin_unlock_irq(&ref->tree_lock); + + return NULL; +} + +struct dma_fence * +__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence) +{ + struct dma_fence *prev = ERR_PTR(-ENOENT); + struct i915_active_fence *active; + + if (!i915_active_acquire_if_busy(ref)) + return ERR_PTR(-EINVAL); + + active = __active_lookup(ref, idx); + if (active) + prev = __i915_active_set_fence(ref, active, fence); + + i915_active_release(ref); + return prev; +} + +struct dma_fence * +i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f) +{ + /* We expect the caller to manage the exclusive timeline ordering */ + return __i915_active_set_fence(ref, &ref->excl, f); +} + bool i915_active_acquire_if_busy(struct i915_active *ref) { debug_active_assert(ref); @@ -443,6 +514,24 @@ int i915_active_acquire(struct i915_active *ref) return err; } +int i915_active_acquire_for_context(struct i915_active *ref, u64 idx) +{ + struct i915_active_fence *active; + int err; + + err = i915_active_acquire(ref); + if (err) + return err; + + active = active_instance(ref, idx); + if (!active) { + i915_active_release(ref); + return -ENOMEM; + } + + return 0; /* return with active ref */ +} + void i915_active_release(struct i915_active *ref) { debug_active_assert(ref); @@ -748,7 +837,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, */ RCU_INIT_POINTER(node->base.fence, ERR_PTR(-EAGAIN)); node->base.cb.node.prev = (void *)engine; - atomic_inc(&ref->count); + __i915_active_acquire(ref); } GEM_BUG_ON(rcu_access_pointer(node->base.fence) != ERR_PTR(-EAGAIN)); diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index b3282ae7913c..65a9c560483b 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -163,6 +163,9 @@ void __i915_active_init(struct i915_active *ref, __i915_active_init(ref, active, retire, &__mkey, &__wkey); \ } while (0) +struct dma_fence * +__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence); + int i915_active_ref(struct i915_active *ref, struct intel_timeline *tl, struct dma_fence *fence); @@ -192,7 +195,9 @@ int i915_request_await_active(struct i915_request *rq, #define I915_ACTIVE_AWAIT_ALL BIT(0) int i915_active_acquire(struct i915_active *ref); +int i915_active_acquire_for_context(struct i915_active *ref, u64 idx); bool i915_active_acquire_if_busy(struct i915_active *ref); + void i915_active_release(struct i915_active *ref); static inline void 
__i915_active_acquire(struct i915_active *ref) From patchwork Mon Mar 16 11:42:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440265 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E27B01668 for ; Mon, 16 Mar 2020 11:43:13 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CAC3720738 for ; Mon, 16 Mar 2020 11:43:13 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CAC3720738 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id F268E6E431; Mon, 16 Mar 2020 11:43:12 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 463856E419 for ; Mon, 16 Mar 2020 11:42:50 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574779-1500050 for multiple; Mon, 16 Mar 2020 11:42:38 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:34 +0000 Message-Id: <20200316114237.5436-12-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 12/15] drm/i915/gem: Split eb_vma into its own allocation X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Use a separate array allocation for the execbuf vmas, so that we can track their lifetime independently from the copy of the user arguments. With luck, this has a secondary benefit of splitting the malloc size to within reason and avoiding vmalloc.
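[Illustrative aside, not part of the patch: the eb_vma_array introduced below is an instance of the common kernel idiom of a kref-counted flexible-array allocation whose populated entries end at a NULL sentinel. A stripped-down sketch of the idiom follows; the element type and names are invented for illustration:]

#include <linux/kref.h>
#include <linux/mm.h>
#include <linux/overflow.h>
#include <linux/slab.h>

struct item {
	void *payload;	/* NULL payload terminates the populated entries */
};

struct item_array {
	struct kref kref;
	struct item items[];
};

static struct item_array *item_array_create(unsigned int count)
{
	struct item_array *arr;

	/* struct_size() guards the count * sizeof() overflow; kvmalloc()
	 * stays with kmalloc for modest sizes and only falls back to
	 * vmalloc for large requests. */
	arr = kvmalloc(struct_size(arr, items, count + 1),
		       GFP_KERNEL | __GFP_NOWARN);
	if (!arr)
		return NULL;

	kref_init(&arr->kref);
	arr->items[0].payload = NULL;	/* empty until populated */
	return arr;
}

static void item_array_destroy(struct kref *kref)
{
	struct item_array *arr = container_of(kref, struct item_array, kref);
	struct item *it;

	/* Walk to the sentinel, releasing each entry, then free the array,
	 * mirroring what eb_vma_array_destroy() does for its vma. */
	for (it = arr->items; it->payload; it++)
		kfree(it->payload);

	kvfree(arr);
}

static inline void item_array_put(struct item_array *arr)
{
	kref_put(&arr->kref, item_array_destroy);
}

[The split also leaves each allocation roughly half the size of the old combined user-argument + eb_vma buffer, which is what makes the kmalloc path more likely in practice.]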
Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 132 ++++++++++-------- 1 file changed, 74 insertions(+), 58 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index f647f98a9bc6..d3d0d0baa0ec 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -40,6 +40,11 @@ struct eb_vma { u32 handle; }; +struct eb_vma_array { + struct kref kref; + struct eb_vma vma[]; +}; + enum { FORCE_CPU_RELOC = 1, FORCE_GTT_RELOC, @@ -52,7 +57,6 @@ enum { #define __EXEC_OBJECT_NEEDS_MAP BIT(29) #define __EXEC_OBJECT_NEEDS_BIAS BIT(28) #define __EXEC_OBJECT_INTERNAL_FLAGS (~0u << 28) /* all of the above */ -#define __EXEC_OBJECT_RESERVED (__EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_FENCE) #define __EXEC_HAS_RELOC BIT(31) #define __EXEC_INTERNAL_FLAGS (~0u << 31) @@ -283,6 +287,7 @@ struct i915_execbuffer { */ int lut_size; struct hlist_head *buckets; /** ht for relocation handles */ + struct eb_vma_array *array; }; static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb) @@ -292,8 +297,62 @@ static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb) eb->args->batch_len); } +static struct eb_vma_array *eb_vma_array_create(unsigned int count) +{ + struct eb_vma_array *arr; + + arr = kvmalloc(struct_size(arr, vma, count), GFP_KERNEL | __GFP_NOWARN); + if (!arr) + return NULL; + + kref_init(&arr->kref); + arr->vma[0].vma = NULL; + + return arr; +} + +static inline void eb_unreserve_vma(struct eb_vma *ev) +{ + struct i915_vma *vma = ev->vma; + + if (unlikely(ev->flags & __EXEC_OBJECT_HAS_FENCE)) + __i915_vma_unpin_fence(vma); + + if (ev->flags & __EXEC_OBJECT_HAS_PIN) + __i915_vma_unpin(vma); + + ev->flags &= ~(__EXEC_OBJECT_HAS_PIN | + __EXEC_OBJECT_HAS_FENCE); +} + +static void eb_vma_array_destroy(struct kref *kref) +{ + struct eb_vma_array *arr = container_of(kref, typeof(*arr), kref); + struct eb_vma *ev = arr->vma; + + while (ev->vma) { + eb_unreserve_vma(ev); + i915_vma_put(ev->vma); + ev++; + } + + kvfree(arr); +} + +static void eb_vma_array_put(struct eb_vma_array *arr) +{ + kref_put(&arr->kref, eb_vma_array_destroy); +} + static int eb_create(struct i915_execbuffer *eb) { + /* Allocate an extra slot for use by the command parser + sentinel */ + eb->array = eb_vma_array_create(eb->buffer_count + 2); + if (!eb->array) + return -ENOMEM; + + eb->vma = eb->array->vma; + if (!(eb->args->flags & I915_EXEC_HANDLE_LUT)) { unsigned int size = 1 + ilog2(eb->buffer_count); @@ -327,8 +386,10 @@ static int eb_create(struct i915_execbuffer *eb) break; } while (--size); - if (unlikely(!size)) + if (unlikely(!size)) { + eb_vma_array_put(eb->array); return -ENOMEM; + } eb->lut_size = size; } else { @@ -402,26 +463,6 @@ eb_pin_vma(struct i915_execbuffer *eb, return !eb_vma_misplaced(entry, vma, ev->flags); } -static inline void __eb_unreserve_vma(struct i915_vma *vma, unsigned int flags) -{ - GEM_BUG_ON(!(flags & __EXEC_OBJECT_HAS_PIN)); - - if (unlikely(flags & __EXEC_OBJECT_HAS_FENCE)) - __i915_vma_unpin_fence(vma); - - __i915_vma_unpin(vma); -} - -static inline void -eb_unreserve_vma(struct eb_vma *ev) -{ - if (!(ev->flags & __EXEC_OBJECT_HAS_PIN)) - return; - - __eb_unreserve_vma(ev->vma, ev->flags); - ev->flags &= ~__EXEC_OBJECT_RESERVED; -} - static int eb_validate_vma(struct i915_execbuffer *eb, struct drm_i915_gem_exec_object2 *entry, @@ -794,6 +835,7 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb) eb_add_vma(eb, i, batch, vma); } 
+ eb->vma[i].vma = NULL; return 0; err_obj: @@ -823,31 +865,13 @@ eb_get_vma(const struct i915_execbuffer *eb, unsigned long handle) } } -static void eb_release_vmas(const struct i915_execbuffer *eb) -{ - const unsigned int count = eb->buffer_count; - unsigned int i; - - for (i = 0; i < count; i++) { - struct eb_vma *ev = &eb->vma[i]; - struct i915_vma *vma = ev->vma; - - if (!vma) - break; - - eb->vma[i].vma = NULL; - - if (ev->flags & __EXEC_OBJECT_HAS_PIN) - __eb_unreserve_vma(vma, ev->flags); - - i915_vma_put(vma); - } -} - static void eb_destroy(const struct i915_execbuffer *eb) { GEM_BUG_ON(eb->reloc_cache.rq); + if (eb->array) + eb_vma_array_put(eb->array); + if (eb->lut_size > 0) kfree(eb->buckets); } @@ -1597,19 +1621,15 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) err = i915_vma_move_to_active(vma, eb->request, flags); i915_vma_unlock(vma); - - __eb_unreserve_vma(vma, flags); - i915_vma_put(vma); - - ev->vma = NULL; + eb_unreserve_vma(ev); } ww_acquire_fini(&acquire); + eb_vma_array_put(fetch_and_zero(&eb->array)); + if (unlikely(err)) goto err_skip; - eb->exec = NULL; - /* Unconditionally flush any chipset caches (for streaming writes). */ intel_gt_chipset_flush(eb->engine->gt); return 0; @@ -1861,6 +1881,7 @@ static int eb_parse(struct i915_execbuffer *eb) eb->vma[eb->buffer_count].vma = i915_vma_get(shadow); eb->vma[eb->buffer_count].flags = __EXEC_OBJECT_HAS_PIN; eb->batch = &eb->vma[eb->buffer_count++]; + eb->vma[eb->buffer_count].vma = NULL; eb->trampoline = trampoline; eb->batch_start_offset = 0; @@ -2386,8 +2407,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, args->flags |= __EXEC_HAS_RELOC; eb.exec = exec; - eb.vma = (struct eb_vma *)(exec + args->buffer_count + 1); - eb.vma[0].vma = NULL; eb.invalid_flags = __EXEC_OBJECT_UNKNOWN_FLAGS; reloc_cache_init(&eb.reloc_cache, eb.i915); @@ -2594,8 +2613,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, if (batch->private) intel_engine_pool_put(batch->private); err_vma: - if (eb.exec) - eb_release_vmas(&eb); if (eb.trampoline) i915_vma_unpin(eb.trampoline); eb_unpin_engine(&eb); @@ -2615,7 +2632,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, static size_t eb_element_size(void) { - return sizeof(struct drm_i915_gem_exec_object2) + sizeof(struct eb_vma); + return sizeof(struct drm_i915_gem_exec_object2); } static bool check_buffer_count(size_t count) @@ -2671,7 +2688,7 @@ i915_gem_execbuffer_ioctl(struct drm_device *dev, void *data, /* Copy in the exec list from userland */ exec_list = kvmalloc_array(count, sizeof(*exec_list), __GFP_NOWARN | GFP_KERNEL); - exec2_list = kvmalloc_array(count + 1, eb_element_size(), + exec2_list = kvmalloc_array(count, eb_element_size(), __GFP_NOWARN | GFP_KERNEL); if (exec_list == NULL || exec2_list == NULL) { drm_dbg(&i915->drm, @@ -2749,8 +2766,7 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, if (err) return err; - /* Allocate an extra slot for use by the command parser */ - exec2_list = kvmalloc_array(count + 1, eb_element_size(), + exec2_list = kvmalloc_array(count, eb_element_size(), __GFP_NOWARN | GFP_KERNEL); if (exec2_list == NULL) { drm_dbg(&i915->drm, "Failed to allocate exec list for %zd buffers\n", From patchwork Mon Mar 16 11:42:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440245 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP 
id 55B001668 for ; Mon, 16 Mar 2020 11:42:54 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3E3D620724 for ; Mon, 16 Mar 2020 11:42:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3E3D620724 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 071B36E419; Mon, 16 Mar 2020 11:42:52 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 446396E418 for ; Mon, 16 Mar 2020 11:42:50 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574781-1500050 for multiple; Mon, 16 Mar 2020 11:42:38 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:35 +0000 Message-Id: <20200316114237.5436-13-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 13/15] drm/i915/gem: Separate the ww_mutex walker into its own list X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" In preparation for making eb_vma bigger and heavier to run in parallel, we need to stop applying an in-place swap() to reorder around ww_mutex deadlocks. Keep the array intact and reorder the locks using a dedicated list.
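[Illustrative aside, not part of the patch: the -EDEADLK handling being reworked here is the standard ww_mutex acquire-all/backoff pattern described in the kernel's ww_mutex documentation. Sketched over a list instead of an array, the post-deadlock reordering becomes a list_move() to the head rather than the swap() the diff below removes. The struct and function names are invented, and the caller is assumed to have done ww_acquire_init() on ctx with the appropriate ww_class:]

#include <linux/list.h>
#include <linux/ww_mutex.h>

struct locked_obj {
	struct ww_mutex lock;
	struct list_head link;
};

static int lock_all(struct list_head *objs, struct ww_acquire_ctx *ctx)
{
	struct locked_obj *obj, *contended = NULL;
	int err;

retry:
	list_for_each_entry(obj, objs, link) {
		if (obj == contended) {	/* already held via the slow path */
			contended = NULL;
			continue;
		}

		err = ww_mutex_lock_interruptible(&obj->lock, ctx);
		if (err == -EDEADLK) {
			struct locked_obj *unlock = obj;

			/* Back off: drop every lock taken so far ... */
			list_for_each_entry_continue_reverse(unlock, objs, link)
				ww_mutex_unlock(&unlock->lock);

			/* ... then sleep on the contended lock ... */
			contended = obj;
			err = ww_mutex_lock_slow_interruptible(&obj->lock, ctx);
			if (err)
				return err;

			/* ... and retry with it first in the lock order. */
			list_move(&obj->link, objs);
			goto retry;
		}
		if (err)
			goto unwind;
	}

	ww_acquire_done(ctx);
	return 0;

unwind:
	list_for_each_entry_continue_reverse(obj, objs, link)
		ww_mutex_unlock(&obj->lock);
	return err;
}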
Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 48 +++++++++++-------- drivers/gpu/drm/i915/i915_utils.h | 6 +++ 2 files changed, 34 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index d3d0d0baa0ec..50b2000a72de 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -35,6 +35,7 @@ struct eb_vma { struct drm_i915_gem_exec_object2 *exec; struct list_head bind_link; struct list_head reloc_link; + struct list_head lock_link; struct hlist_node node; u32 handle; @@ -253,6 +254,8 @@ struct i915_execbuffer { /** list of vma that have execobj.relocation_count */ struct list_head relocs; + struct list_head lock; + /** * Track the most recently used object for relocations, as we * frequently have to perform multiple relocations within the same @@ -396,6 +399,10 @@ static int eb_create(struct i915_execbuffer *eb) eb->lut_size = -eb->buffer_count; } + INIT_LIST_HEAD(&eb->relocs); + INIT_LIST_HEAD(&eb->unbound); + INIT_LIST_HEAD(&eb->lock); + return 0; } @@ -564,6 +571,8 @@ eb_add_vma(struct i915_execbuffer *eb, eb_unreserve_vma(ev); list_add_tail(&ev->bind_link, &eb->unbound); } + + list_add_tail(&ev->lock_link, &eb->lock); } static inline int use_cpu_reloc(const struct reloc_cache *cache, @@ -779,9 +788,6 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb) if (unlikely(i915_gem_context_is_closed(eb->gem_context))) return -ENOENT; - INIT_LIST_HEAD(&eb->relocs); - INIT_LIST_HEAD(&eb->unbound); - batch = eb_batch_index(eb); for (i = 0; i < eb->buffer_count; i++) { @@ -1546,38 +1552,39 @@ static int eb_relocate(struct i915_execbuffer *eb) static int eb_move_to_gpu(struct i915_execbuffer *eb) { - const unsigned int count = eb->buffer_count; struct ww_acquire_ctx acquire; - unsigned int i; + struct eb_vma *ev; int err = 0; ww_acquire_init(&acquire, &reservation_ww_class); - for (i = 0; i < count; i++) { - struct eb_vma *ev = &eb->vma[i]; + list_for_each_entry(ev, &eb->lock, lock_link) { struct i915_vma *vma = ev->vma; err = ww_mutex_lock_interruptible(&vma->resv->lock, &acquire); if (err == -EDEADLK) { - GEM_BUG_ON(i == 0); - do { - int j = i - 1; + struct eb_vma *unlock = ev, *en; - ww_mutex_unlock(&eb->vma[j].vma->resv->lock); - - swap(eb->vma[i], eb->vma[j]); - } while (--i); + list_for_each_entry_safe_continue_reverse(unlock, en, &eb->lock, lock_link) { + ww_mutex_unlock(&unlock->vma->resv->lock); + list_move_tail(&unlock->lock_link, &eb->lock); + } + GEM_BUG_ON(!list_is_first(&ev->lock_link, &eb->lock)); err = ww_mutex_lock_slow_interruptible(&vma->resv->lock, &acquire); } - if (err) - break; + if (err) { + list_for_each_entry_continue_reverse(ev, &eb->lock, lock_link) + ww_mutex_unlock(&ev->vma->resv->lock); + + ww_acquire_fini(&acquire); + goto err_skip; + } } ww_acquire_done(&acquire); - while (i--) { - struct eb_vma *ev = &eb->vma[i]; + list_for_each_entry(ev, &eb->lock, lock_link) { struct i915_vma *vma = ev->vma; unsigned int flags = ev->flags; struct drm_i915_gem_object *obj = vma->obj; @@ -1878,9 +1885,10 @@ static int eb_parse(struct i915_execbuffer *eb) if (err) goto err_trampoline; - eb->vma[eb->buffer_count].vma = i915_vma_get(shadow); - eb->vma[eb->buffer_count].flags = __EXEC_OBJECT_HAS_PIN; eb->batch = &eb->vma[eb->buffer_count++]; + eb->batch->vma = i915_vma_get(shadow); + eb->batch->flags = __EXEC_OBJECT_HAS_PIN; + list_add_tail(&eb->batch->lock_link, &eb->lock); eb->vma[eb->buffer_count].vma = NULL; 
eb->trampoline = trampoline; diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h index 03a73d2bd50d..28813806bc19 100644 --- a/drivers/gpu/drm/i915/i915_utils.h +++ b/drivers/gpu/drm/i915/i915_utils.h @@ -266,6 +266,12 @@ static inline int list_is_last_rcu(const struct list_head *list, return READ_ONCE(list->next) == head; } +#define list_for_each_entry_safe_continue_reverse(pos, n, head, member) \ + for (pos = list_prev_entry(pos, member), \ + n = list_prev_entry(pos, member); \ + &pos->member != (head); \ + pos = n, n = list_prev_entry(n, member)) + /* * Wait until the work is finally complete, even if it tries to postpone * by requeueing itself. Note, that if the worker never cancels itself, From patchwork Mon Mar 16 11:42:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440259 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BBD3A6CA for ; Mon, 16 Mar 2020 11:43:08 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A374720738 for ; Mon, 16 Mar 2020 11:43:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A374720738 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 433896E427; Mon, 16 Mar 2020 11:43:08 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8B3FB6E421 for ; Mon, 16 Mar 2020 11:43:00 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 20574782-1500050 for multiple; Mon, 16 Mar 2020 11:42:39 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:36 +0000 Message-Id: <20200316114237.5436-14-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 14/15] drm/i915/gem: Asynchronous GTT unbinding X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Matthew Auld Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" It is reasonably common for userspace (even modern drivers like iris) to reuse an active address for a new buffer. This would cause the application to stall under its mutex (originally struct_mutex) until the old batches were idle and it could synchronously remove the stale PTE. 
However, we can queue up a job that waits on the signals for the old nodes to complete and, upon those signals, removes the old nodes, replacing them with the new ones for the batch. This is still CPU driven, but in theory we can do the GTT patching from the GPU. The job itself has a completion signal allowing the execbuf to wait upon the rebinding, and also other observers to coordinate with the common VM activity. Letting userspace queue up more work lets it do more without blocking other clients. In turn, we take care not to let it queue up too much concurrent work, creating a small number of queues for each context to limit the number of concurrent tasks. The implementation relies on only scheduling one unbind operation per vma as we use the unbound vma->node location to track the stale PTE. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1402 Signed-off-by: Chris Wilson Cc: Matthew Auld Cc: Andi Shyti --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 673 ++++++++++++++++-- drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 1 + drivers/gpu/drm/i915/gt/intel_ggtt.c | 3 +- drivers/gpu/drm/i915/gt/intel_gtt.c | 4 + drivers/gpu/drm/i915/gt/intel_gtt.h | 2 + drivers/gpu/drm/i915/gt/intel_ppgtt.c | 3 +- drivers/gpu/drm/i915/i915_vma.c | 134 ++-- drivers/gpu/drm/i915/i915_vma.h | 17 + 8 files changed, 701 insertions(+), 136 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 50b2000a72de..7fb47ff185a3 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -18,6 +18,7 @@ #include "gt/intel_engine_pool.h" #include "gt/intel_gt.h" #include "gt/intel_gt_pm.h" +#include "gt/intel_gt_requests.h" #include "gt/intel_ring.h" #include "i915_drv.h" @@ -31,6 +32,9 @@ struct eb_vma { struct i915_vma *vma; unsigned int flags; + struct drm_mm_node hole; + unsigned int bind_flags; + /** This vma's place in the execbuf reservation list */ struct drm_i915_gem_exec_object2 *exec; struct list_head bind_link; @@ -57,7 +61,8 @@ enum { #define __EXEC_OBJECT_HAS_FENCE BIT(30) #define __EXEC_OBJECT_NEEDS_MAP BIT(29) #define __EXEC_OBJECT_NEEDS_BIAS BIT(28) -#define __EXEC_OBJECT_INTERNAL_FLAGS (~0u << 28) /* all of the above */ +#define __EXEC_OBJECT_HAS_PAGES BIT(27) +#define __EXEC_OBJECT_INTERNAL_FLAGS (~0u << 27) /* all of the above */ #define __EXEC_HAS_RELOC BIT(31) #define __EXEC_INTERNAL_FLAGS (~0u << 31) @@ -314,6 +319,12 @@ static struct eb_vma_array *eb_vma_array_create(unsigned int count) return arr; } +static struct eb_vma_array *eb_vma_array_get(struct eb_vma_array *arr) +{ + kref_get(&arr->kref); + return arr; +} + static inline void eb_unreserve_vma(struct eb_vma *ev) { struct i915_vma *vma = ev->vma; @@ -324,8 +335,12 @@ static inline void eb_unreserve_vma(struct eb_vma *ev) if (ev->flags & __EXEC_OBJECT_HAS_PIN) __i915_vma_unpin(vma); + if (ev->flags & __EXEC_OBJECT_HAS_PAGES) + i915_vma_put_pages(vma); + ev->flags &= ~(__EXEC_OBJECT_HAS_PIN | - __EXEC_OBJECT_HAS_FENCE); + __EXEC_OBJECT_HAS_FENCE | + __EXEC_OBJECT_HAS_PAGES); } static void eb_vma_array_destroy(struct kref *kref) @@ -529,6 +544,7 @@ eb_add_vma(struct i915_execbuffer *eb, GEM_BUG_ON(i915_vma_is_closed(vma)); + memset(&ev->hole, 0, sizeof(ev->hole)); ev->vma = i915_vma_get(vma); ev->exec = entry; ev->flags = entry->flags; @@ -592,64 +608,545 @@ static inline int use_cpu_reloc(const struct reloc_cache *cache, obj->cache_level != I915_CACHE_NONE); } -static int eb_reserve_vma(const struct i915_execbuffer *eb, -
struct eb_vma *ev, - u64 pin_flags) +struct eb_vm_work { + struct dma_fence_work base; + struct list_head unbound; + struct eb_vma_array *array; + struct i915_address_space *vm; + struct list_head eviction_list; + u64 *p_flags; + u64 active; +}; + +static inline u64 node_end(const struct drm_mm_node *node) +{ + return node->start + node->size; +} + +static int set_bind_fence(struct i915_vma *vma, struct eb_vm_work *work) +{ + struct dma_fence *prev; + int err = 0; + + prev = i915_active_set_exclusive(&vma->active, &work->base.dma); + if (unlikely(prev)) { + err = i915_sw_fence_await_dma_fence(&work->base.chain, prev, 0, + GFP_NOWAIT | __GFP_NOWARN); + dma_fence_put(prev); + } + + return err < 0 ? err : 0; +} + +static bool await_vma(struct eb_vm_work *work, struct i915_vma *vma) +{ + bool result = false; + + /* Wait for all other previous activity */ + if (i915_sw_fence_await_active(&work->base.chain, + &vma->active, + I915_ACTIVE_AWAIT_ALL)) + return false; + + /* Insert into queue along the exclusive vm->mutex timeline */ + if (i915_active_acquire(&vma->active) == 0) { + result = set_bind_fence(vma, work) == 0; + i915_active_release(&vma->active); + } + + return result; +} + +static bool evict_for_node(struct eb_vm_work *work, + struct eb_vma *const target, + unsigned int flags) +{ + struct i915_address_space *vm = target->vma->vm; + const unsigned long color = target->vma->node.color; + const u64 start = target->vma->node.start; + const u64 end = start + target->vma->node.size; + u64 hole_start = start, hole_end = end; + struct drm_mm_node *node; + LIST_HEAD(eviction_list); + struct i915_vma *vma; + + lockdep_assert_held(&vm->mutex); + GEM_BUG_ON(drm_mm_node_allocated(&target->vma->node)); + GEM_BUG_ON(!IS_ALIGNED(start, I915_GTT_PAGE_SIZE)); + GEM_BUG_ON(!IS_ALIGNED(end, I915_GTT_PAGE_SIZE)); + + if (i915_vm_has_cache_coloring(vm)) { + /* Expand search to cover neighbouring guard pages (or lack!) */ + if (hole_start) + hole_start -= I915_GTT_PAGE_SIZE; + + /* Always look at the page afterwards to avoid the end-of-GTT */ + hole_end += I915_GTT_PAGE_SIZE; + } + GEM_BUG_ON(hole_start >= hole_end); + + drm_mm_for_each_node_in_range(node, &vm->mm, hole_start, hole_end) { + GEM_BUG_ON(node == &target->vma->node); + + /* If we find any non-objects (!vma), we cannot evict them */ + if (node->color == I915_COLOR_UNEVICTABLE) + goto err; + + GEM_BUG_ON(!drm_mm_node_allocated(node)); + vma = container_of(node, typeof(*vma), node); + + /* + * If we are using coloring to insert guard pages between + * different cache domains within the address space, we have + * to check whether the objects on either side of our range + * abut and conflict. If they are in conflict, then we evict + * those as well to make room for our guard pages. + */ + if (i915_vm_has_cache_coloring(vm)) { + if (node_end(node) == start && node->color == color) + continue; + + if (node->start == end && node->color == color) + continue; + } + + if (i915_vma_is_pinned(vma)) + goto err; + + if (flags & PIN_NONBLOCK && i915_vma_is_active(vma)) + goto err; + + if (!await_vma(work, vma)) + goto err; + + list_move(&vma->vm_link, &eviction_list); + } + + /* No overlapping nodes to evict, claim the slot for ourselves! */ + if (list_empty(&eviction_list)) + return drm_mm_reserve_node(&vm->mm, &target->vma->node) == 0; + + /* + * Mark this range as reserved. + * + * We have not yet removed the PTEs for the old evicted nodes, so + * must prevent this range from being reused for anything else.
The + * PTE will be cleared when the range is idle (during the rebind + * phase in the worker). + */ + target->hole.color = I915_COLOR_UNEVICTABLE; + target->hole.start = start; + target->hole.size = end; + + list_for_each_entry(vma, &eviction_list, vm_link) { + target->hole.start = + min(target->hole.start, vma->node.start); + target->hole.size = + max(target->hole.size, node_end(&vma->node)); + + GEM_BUG_ON(vma->node.mm != &vm->mm); + drm_mm_remove_node(&vma->node); + atomic_and(~I915_VMA_BIND_MASK, &vma->flags); + GEM_BUG_ON(i915_vma_is_pinned(vma)); + } + list_splice(&eviction_list, &work->eviction_list); + + target->hole.size -= target->hole.start; + + return drm_mm_reserve_node(&vm->mm, &target->hole) == 0; + +err: + list_splice(&eviction_list, &vm->bound_list); + return false; +} + +static bool +evict_vma(struct eb_vm_work *work, + struct eb_vma * const target, + struct i915_vma *vma) +{ + if (!await_vma(work, vma)) + return false; + + drm_mm_remove_node(&vma->node); + atomic_and(~I915_VMA_BIND_MASK, &vma->flags); + list_move(&vma->vm_link, &work->eviction_list); + + target->hole.start = min(target->hole.start, vma->node.start); + target->hole.size = max(target->hole.size, node_end(&vma->node)); + + return true; +} + +static int +evict_for_range(struct eb_vm_work *work, + struct eb_vma * const target, + u64 start, u64 end, u64 align) +{ + struct i915_address_space *vm = target->vma->vm; + struct i915_vma *vma, *next; + struct drm_mm_scan scan; + LIST_HEAD(eviction_list); + struct drm_mm_node *node; + + lockdep_assert_held(&vm->mutex); + + drm_mm_scan_init_with_range(&scan, &vm->mm, + target->vma->node.size, + align, + target->vma->node.color, + start, end, + DRM_MM_INSERT_BEST); + + list_for_each_entry(vma, &vm->bound_list, vm_link) { + if (i915_vma_is_pinned(vma)) + continue; + + list_move(&vma->evict_link, &eviction_list); + if (drm_mm_scan_add_block(&scan, &vma->node)) + goto found; + } + +err: + list_for_each_entry(vma, &eviction_list, evict_link) + drm_mm_scan_remove_block(&scan, &vma->node); + list_splice(&eviction_list, &vm->bound_list); + return false; + +found: + target->hole.color = I915_COLOR_UNEVICTABLE; + target->hole.start = U64_MAX; + target->hole.size = 0; + + list_for_each_entry_safe(vma, next, &eviction_list, evict_link) { + if (!drm_mm_scan_remove_block(&scan, &vma->node)) { + list_move(&vma->vm_link, &vm->bound_list); + continue; + } + + if (!evict_vma(work, target, vma)) + goto err; + } + + while ((node = drm_mm_scan_color_evict(&scan))) { + vma = container_of(node, struct i915_vma, node); + + if (i915_vma_is_pinned(vma)) + goto err; + + if (!evict_vma(work, target, vma)) + goto err; + } + + target->hole.size -= target->hole.start; + return drm_mm_reserve_node(&vm->mm, &target->hole) == 0; +} + +static u64 random_offset(u64 start, u64 end, u64 len, u64 align) +{ + u64 range, addr; + + GEM_BUG_ON(range_overflows(start, len, end)); + GEM_BUG_ON(round_up(start, align) > round_down(end - len, align)); + + range = round_down(end - len, align) - round_up(start, align); + if (range) { + if (sizeof(unsigned long) == sizeof(u64)) { + addr = get_random_long(); + } else { + addr = get_random_int(); + if (range > U32_MAX) { + addr <<= 32; + addr |= get_random_int(); + } + } + div64_u64_rem(addr, range, &addr); + start += addr; + } + + return round_up(start, align); +} + +static int eb_reserve_vma(struct eb_vm_work *work, struct eb_vma *ev) { struct drm_i915_gem_exec_object2 *entry = ev->exec; - unsigned int exec_flags = ev->flags; + const unsigned int exec_flags = 
ev->flags; struct i915_vma *vma = ev->vma; + struct i915_address_space *vm = vma->vm; + u64 start = 0, end = vm->total; + u64 align = entry->alignment ?: I915_GTT_MIN_ALIGNMENT; + unsigned int bind_flags; int err; + bind_flags = PIN_USER; if (exec_flags & EXEC_OBJECT_NEEDS_GTT) - pin_flags |= PIN_GLOBAL; + bind_flags |= PIN_GLOBAL; + + if (drm_mm_node_allocated(&vma->node)) + goto pin; + + GEM_BUG_ON(i915_vma_is_bound(vma, bind_flags)); + GEM_BUG_ON(i915_active_fence_isset(&vma->active.excl)); + + vma->node.start = entry->offset & PIN_OFFSET_MASK; + vma->node.size = max(entry->pad_to_size, vma->size); + vma->node.color = 0; + if (i915_vm_has_cache_coloring(vm)) + vma->node.color = vma->obj->cache_level; /* * Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset, * limit address to the first 4GBs for unflagged objects. */ if (!(exec_flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS)) - pin_flags |= PIN_ZONE_4G; + end = min_t(u64, end, (1ULL << 32) - I915_GTT_PAGE_SIZE); + + align = max(align, vma->display_alignment); + if (exec_flags & __EXEC_OBJECT_NEEDS_MAP) { + vma->node.size = max_t(u64, vma->node.size, vma->fence_size); + end = min_t(u64, end, i915_vm_to_ggtt(vm)->mappable_end); + align = max_t(u64, align, vma->fence_alignment); + } + + if (exec_flags & __EXEC_OBJECT_NEEDS_BIAS) + start = BATCH_OFFSET_BIAS; + + if (vma->node.size > end - start) + return -E2BIG; + + if (vma->node.start >= start && + IS_ALIGNED(vma->node.start, align) && + !range_overflows(vma->node.start, vma->node.size, end)) { + unsigned int pin_flags; + + if (drm_mm_reserve_node(&vm->mm, &vma->node) == 0) + goto pin; - if (exec_flags & __EXEC_OBJECT_NEEDS_MAP) - pin_flags |= PIN_MAPPABLE; + pin_flags = 0; + if (!(exec_flags & EXEC_OBJECT_PINNED)) + pin_flags = PIN_NONBLOCK; + + if (evict_for_node(work, ev, pin_flags)) + goto pin; + } if (exec_flags & EXEC_OBJECT_PINNED) - pin_flags |= entry->offset | PIN_OFFSET_FIXED; - else if (exec_flags & __EXEC_OBJECT_NEEDS_BIAS) - pin_flags |= BATCH_OFFSET_BIAS | PIN_OFFSET_BIAS; + return -ENOSPC; - if (drm_mm_node_allocated(&vma->node) && - eb_vma_misplaced(entry, vma, ev->flags)) { - err = i915_vma_unbind(vma); - if (err) + vma->node.start = random_offset(start, end, vma->node.size, align); + if (evict_for_node(work, ev, PIN_NONBLOCK)) + goto pin; + + if (drm_mm_insert_node_in_range(&vm->mm, &vma->node, + vma->node.size, + align <= I915_GTT_MIN_ALIGNMENT ? 
0 : align, + vma->node.color, + start, end, 0) == 0) + goto pin; + + vma->node.start = random_offset(start, end, vma->node.size, align); + if (evict_for_node(work, ev, 0)) + goto pin; + + if (evict_for_range(work, ev, start, end, align)) + goto pin; + + return -ENOSPC; + +pin: + if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) { + err = __i915_vma_pin_fence(vma); /* XXX no waiting */ + if (unlikely(err)) return err; + + if (vma->fence) + ev->flags |= __EXEC_OBJECT_HAS_FENCE; } - err = i915_vma_pin(vma, - entry->pad_to_size, entry->alignment, - pin_flags); - if (err) - return err; + ev->bind_flags = bind_flags & ~atomic_read(&vma->flags); + if (ev->bind_flags) { + err = set_bind_fence(vma, work); + if (unlikely(err)) + return err; + + atomic_add(I915_VMA_PAGES_ACTIVE, &vma->pages_count); + atomic_or(bind_flags, &vma->flags); + list_move_tail(&vma->vm_link, &vm->bound_list); + } + + __i915_vma_pin(vma); + + if (ev->bind_flags && i915_vma_is_ggtt(vma)) + __i915_vma_set_map_and_fenceable(vma); + + GEM_BUG_ON(!ev->bind_flags && !drm_mm_node_allocated(&vma->node)); + GEM_BUG_ON(!(drm_mm_node_allocated(&vma->node) ^ + drm_mm_node_allocated(&ev->hole))); if (entry->offset != vma->node.start) { entry->offset = vma->node.start | UPDATE; - eb->args->flags |= __EXEC_HAS_RELOC; + *work->p_flags |= __EXEC_HAS_RELOC; } - if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) { - err = i915_vma_pin_fence(vma); - if (unlikely(err)) { - i915_vma_unpin(vma); - return err; + ev->flags |= __EXEC_OBJECT_HAS_PIN; + GEM_BUG_ON(eb_vma_misplaced(entry, vma, ev->flags)); + + return 0; +} + +static int eb_bind_vma(struct dma_fence_work *base) +{ + struct eb_vm_work *work = container_of(base, typeof(*work), base); + struct i915_address_space *vm = work->vm; + struct eb_vma *ev; + int err = 0; + + /* + * We have to wait until the stale nodes are completely idle before + * we can remove their PTE and unbind their pages. Hence, after + * claiming their slot in the drm_mm, we defer their removal to + * after the fences are signaled. + */ + if (!list_empty(&work->eviction_list)) { + struct i915_vma *vma, *vn; + + mutex_lock(&vm->mutex); + list_for_each_entry_safe(vma, vn, + &work->eviction_list, vm_link) { + GEM_BUG_ON(i915_vma_is_bound(vma, I915_VMA_BIND_MASK)); + GEM_BUG_ON(vma->vm != vm); + + __i915_vma_evict(vma); } + mutex_unlock(&vm->mutex); + } - if (vma->fence - exec_flags |= __EXEC_OBJECT_HAS_FENCE; + /* + * Now the nodes we require in drm_mm are idle, we can replace + * the PTE in those ranges with our own.
+ */ + list_for_each_entry(ev, &work->unbound, bind_link) { + struct i915_vma *vma = ev->vma; + + GEM_BUG_ON(vma->vm != vm); + + if (!ev->bind_flags) + goto put; + + GEM_BUG_ON(!i915_vma_is_active(vma)); + + if (err == 0) + err = vma->ops->bind_vma(vma, + vma->obj->cache_level, + ev->bind_flags | + I915_VMA_ALLOC); + + GEM_BUG_ON(!i915_vma_is_bound(vma, I915_VMA_LOCAL_BIND)); + if (drm_mm_node_allocated(&ev->hole)) { + mutex_lock(&vm->mutex); + GEM_BUG_ON(ev->hole.mm != &vm->mm); + GEM_BUG_ON(drm_mm_node_allocated(&vma->node)); + drm_mm_remove_node(&ev->hole); + drm_mm_reserve_node(&vm->mm, &vma->node); + GEM_BUG_ON(!drm_mm_node_allocated(&vma->node)); + mutex_unlock(&vm->mutex); + } + if (err) { + atomic_and(~ev->bind_flags, &vma->flags); + set_bit(I915_VMA_ERROR_BIT, __i915_vma_flags(vma)); + } + +put: + GEM_BUG_ON(drm_mm_node_allocated(&ev->hole)); } - ev->flags = exec_flags | __EXEC_OBJECT_HAS_PIN; - GEM_BUG_ON(eb_vma_misplaced(entry, vma, ev->flags)); + return err; +} + +static void eb_vma_work_release(struct dma_fence_work *base) +{ + struct eb_vm_work *work = container_of(base, typeof(*work), base); + + if (work->active) + i915_active_release(&work->vm->active); + + eb_vma_array_put(work->array); +} + +static const struct dma_fence_work_ops eb_bind_ops = { + .name = "eb_bind", + .work = eb_bind_vma, + .release = eb_vma_work_release, +}; + +static struct eb_vm_work *eb_vm_work(struct i915_execbuffer *eb) +{ + struct eb_vm_work *work; + + work = kzalloc(sizeof(*work), GFP_KERNEL); + if (!work) + return NULL; + + dma_fence_work_init(&work->base, &eb_bind_ops); + list_replace_init(&eb->unbound, &work->unbound); + work->array = eb_vma_array_get(eb->array); + work->p_flags = &eb->args->flags; + work->vm = eb->context->vm; + + /* Preallocate our slot in vm->active, outside of vm->mutex */ + work->active = i915_gem_context_async_id(eb->gem_context); + if (i915_active_acquire_for_context(&work->vm->active, work->active)) { + work->active = 0; + work->base.dma.error = -ENOMEM; + dma_fence_work_commit(&work->base); + return NULL; + } + + INIT_LIST_HEAD(&work->eviction_list); + + GEM_BUG_ON(list_empty(&work->unbound)); + GEM_BUG_ON(!list_empty(&eb->unbound)); + + return work; +} + +static int eb_vm_throttle(struct eb_vm_work *work) +{ + struct dma_fence *p; + int err; + + /* Keep async work queued per context */ + p = __i915_active_ref(&work->vm->active, work->active, &work->base.dma); + if (IS_ERR_OR_NULL(p)) + return PTR_ERR(p); + + err = i915_sw_fence_await_dma_fence(&work->base.chain, p, 0, + GFP_NOWAIT | __GFP_NOWARN); + dma_fence_put(p); + + return err < 0 ? err : 0; +} + +static int eb_prepare_vma(struct eb_vma *ev) +{ + struct i915_vma *vma = ev->vma; + int err; + + GEM_BUG_ON(drm_mm_node_allocated(&ev->hole)); + ev->bind_flags = 0; + + /* Wait for a previous unbind to avoid reusing an active vma->node */ + err = i915_vma_wait_for_unbind(vma); + if (err) + return err; + + if (!(ev->flags & __EXEC_OBJECT_HAS_PAGES)) { + err = i915_vma_get_pages(vma); + if (err) + return err; + + ev->flags |= __EXEC_OBJECT_HAS_PAGES; + } return 0; } @@ -657,39 +1154,82 @@ static int eb_reserve_vma(const struct i915_execbuffer *eb, static int eb_reserve(struct i915_execbuffer *eb) { const unsigned int count = eb->buffer_count; - unsigned int pin_flags = PIN_USER | PIN_NONBLOCK; + struct i915_address_space *vm = eb->context->vm; struct list_head last; - struct eb_vma *ev; unsigned int i, pass; + struct eb_vma *ev; int err = 0; - /* - * Attempt to pin all of the buffers into the GTT. 
- * This is done in 3 phases: - * - * 1a. Unbind all objects that do not match the GTT constraints for - * the execbuffer (fenceable, mappable, alignment etc). - * 1b. Increment pin count for already bound objects. - * 2. Bind new objects. - * 3. Decrement pin count. - * - * This avoid unnecessary unbinding of later objects in order to make - * room for the earlier objects *unless* we need to defragment. - */ - - if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex)) - return -EINTR; - pass = 0; do { + struct eb_vm_work *work; + list_for_each_entry(ev, &eb->unbound, bind_link) { - err = eb_reserve_vma(eb, ev, pin_flags); + err = eb_prepare_vma(ev); + switch (err) { + case 0: + break; + case -EAGAIN: + goto retry; + default: + return err; + } + } + + work = eb_vm_work(eb); + if (!work) + return -ENOMEM; + + if (mutex_lock_interruptible(&vm->mutex)) { + work->base.dma.error = -EINTR; + dma_fence_work_commit(&work->base); + return -EINTR; + } + + err = eb_vm_throttle(work); + if (err) { + mutex_unlock(&vm->mutex); + work->base.dma.error = err; + dma_fence_work_commit(&work->base); + return err; + } + + list_for_each_entry(ev, &work->unbound, bind_link) { + struct i915_vma *vma = ev->vma; + + if (drm_mm_node_allocated(&vma->node) && + eb_vma_misplaced(ev->exec, vma, ev->flags)) { + err = __i915_vma_unbind(vma); + if (err) + break; + } + + if (!drm_mm_node_allocated(&vma->node) && + i915_vma_is_active(vma)) { + err = -ENOSPC; + break; + } + + err = i915_active_acquire(&vma->active); + if (!err) { + err = eb_reserve_vma(work, ev); + i915_active_release(&vma->active); + } if (err) break; } - if (!(err == -ENOSPC || err == -EAGAIN)) - break; + mutex_unlock(&vm->mutex); + + dma_fence_get(&work->base.dma); + dma_fence_work_commit_imm(&work->base); + if (err == -ENOSPC && dma_fence_wait(&work->base.dma, true)) + err = -EINTR; + dma_fence_put(&work->base.dma); + if (err != -ENOSPC) + return err; + +retry: /* Resort *all* the objects into priority order */ INIT_LIST_HEAD(&eb->unbound); INIT_LIST_HEAD(&last); @@ -719,9 +1259,7 @@ static int eb_reserve(struct i915_execbuffer *eb) list_splice_tail(&last, &eb->unbound); if (err == -EAGAIN) { - mutex_unlock(&eb->i915->drm.struct_mutex); flush_workqueue(eb->i915->mm.userptr_wq); - mutex_lock(&eb->i915->drm.struct_mutex); continue; } @@ -731,24 +1269,20 @@ static int eb_reserve(struct i915_execbuffer *eb) case 1: /* Too fragmented, unbind everything and retry */ - mutex_lock(&eb->context->vm->mutex); - err = i915_gem_evict_vm(eb->context->vm); - mutex_unlock(&eb->context->vm->mutex); - if (err) - goto unlock; + if (i915_active_wait(&vm->active)) + return -EINTR; + break; + + case 2: + if (intel_gt_wait_for_idle(vm->gt, + MAX_SCHEDULE_TIMEOUT)) + return -EINTR; break; default: - err = -ENOSPC; - goto unlock; + return err; } - - pin_flags = PIN_USER; } while (1); - -unlock: - mutex_unlock(&eb->i915->drm.struct_mutex); - return err; } static unsigned int eb_batch_index(const struct i915_execbuffer *eb) @@ -1628,7 +2162,6 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) err = i915_vma_move_to_active(vma, eb->request, flags); i915_vma_unlock(vma); - eb_unreserve_vma(ev); } ww_acquire_fini(&acquire); diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c index f4fec7eb4064..2c5ac598ade2 100644 --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c @@ -370,6 +370,7 @@ static struct i915_vma *pd_vma_create(struct gen6_ppgtt *ppgtt, int size) atomic_set(&vma->flags, I915_VMA_GGTT); 
vma->ggtt_view.type = I915_GGTT_VIEW_ROTATED; /* prevent fencing */ + INIT_LIST_HEAD(&vma->vm_link); INIT_LIST_HEAD(&vma->obj_link); INIT_LIST_HEAD(&vma->closed_link); diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index 8fcf14372d7a..e1981515f2e6 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -551,7 +551,8 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma, if (flags & I915_VMA_LOCAL_BIND) { struct i915_ppgtt *alias = i915_vm_to_ggtt(vma->vm)->alias; - if (flags & I915_VMA_ALLOC) { + if (flags & I915_VMA_ALLOC && + !test_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma))) { ret = alias->vm.allocate_va_range(&alias->vm, vma->node.start, vma->size); diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index 2a72cce63fd9..82d4f943c346 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -194,6 +194,8 @@ void __i915_vm_close(struct i915_address_space *vm) void i915_address_space_fini(struct i915_address_space *vm) { + i915_active_fini(&vm->active); + spin_lock(&vm->free_pages.lock); if (pagevec_count(&vm->free_pages.pvec)) vm_free_pages_release(vm, true); @@ -246,6 +248,8 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass) drm_mm_init(&vm->mm, 0, vm->total); vm->mm.head_node.color = I915_COLOR_UNEVICTABLE; + i915_active_init(&vm->active, NULL, NULL); + stash_init(&vm->free_pages); INIT_LIST_HEAD(&vm->bound_list); diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index d93ebdf3fa0e..773fc76dfa1b 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -263,6 +263,8 @@ struct i915_address_space { */ struct list_head bound_list; + struct i915_active active; + struct pagestash free_pages; /* Global GTT */ diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c index f86f7e68ce5e..ecdd58f4b993 100644 --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c @@ -162,7 +162,8 @@ static int ppgtt_bind_vma(struct i915_vma *vma, u32 pte_flags; int err; - if (flags & I915_VMA_ALLOC) { + if (flags & I915_VMA_ALLOC && + !test_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma))) { err = vma->vm->allocate_va_range(vma->vm, vma->node.start, vma->size); if (err) diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 880e18f5b71b..e29cc9bb5a6d 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -132,6 +132,7 @@ vma_create(struct drm_i915_gem_object *obj, fs_reclaim_release(GFP_KERNEL); } + INIT_LIST_HEAD(&vma->vm_link); INIT_LIST_HEAD(&vma->closed_link); if (view && view->type != I915_GGTT_VIEW_NORMAL) { @@ -340,25 +341,32 @@ struct i915_vma_work *i915_vma_work(void) return vw; } -int i915_vma_wait_for_bind(struct i915_vma *vma) +static int __i915_vma_wait_excl(struct i915_vma *vma, bool bound) { + struct dma_fence *fence; int err = 0; - if (rcu_access_pointer(vma->active.excl.fence)) { - struct dma_fence *fence; + fence = i915_active_fence_get(&vma->active.excl); + if (!fence) + return 0; - rcu_read_lock(); - fence = dma_fence_get_rcu_safe(&vma->active.excl.fence); - rcu_read_unlock(); - if (fence) { - err = dma_fence_wait(fence, MAX_SCHEDULE_TIMEOUT); - dma_fence_put(fence); - } - } + if (drm_mm_node_allocated(&vma->node) == bound) + err = dma_fence_wait(fence, true); + dma_fence_put(fence); return err; } +int 
i915_vma_wait_for_bind(struct i915_vma *vma) +{ + return __i915_vma_wait_excl(vma, true); +} + +int i915_vma_wait_for_unbind(struct i915_vma *vma) +{ + return __i915_vma_wait_excl(vma, false); +} + /** * i915_vma_bind - Sets up PTEs for an VMA in it's corresponding address space. * @vma: VMA to map @@ -726,7 +734,7 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) GEM_BUG_ON(!drm_mm_node_allocated(&vma->node)); GEM_BUG_ON(!i915_gem_valid_gtt_space(vma, color)); - list_add_tail(&vma->vm_link, &vma->vm->bound_list); + list_move_tail(&vma->vm_link, &vma->vm->bound_list); return 0; } @@ -742,7 +750,7 @@ i915_vma_detach(struct i915_vma *vma) * vma, we can drop its hold on the backing storage and allow * it to be reaped by the shrinker. */ - list_del(&vma->vm_link); + list_del_init(&vma->vm_link); } static bool try_qad_pin(struct i915_vma *vma, unsigned int flags) @@ -788,7 +796,7 @@ static bool try_qad_pin(struct i915_vma *vma, unsigned int flags) return pinned; } -static int vma_get_pages(struct i915_vma *vma) +int i915_vma_get_pages(struct i915_vma *vma) { int err = 0; @@ -835,7 +843,7 @@ static void __vma_put_pages(struct i915_vma *vma, unsigned int count) mutex_unlock(&vma->pages_mutex); } -static void vma_put_pages(struct i915_vma *vma) +void i915_vma_put_pages(struct i915_vma *vma) { if (atomic_add_unless(&vma->pages_count, -1, 1)) return; @@ -843,7 +851,7 @@ static void vma_put_pages(struct i915_vma *vma) __vma_put_pages(vma, 1); } -static void vma_unbind_pages(struct i915_vma *vma) +void i915_vma_unbind_pages(struct i915_vma *vma) { unsigned int count; @@ -874,10 +882,14 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) if (try_qad_pin(vma, flags & I915_VMA_BIND_MASK)) return 0; - err = vma_get_pages(vma); + err = i915_vma_get_pages(vma); if (err) return err; + err = i915_vma_wait_for_unbind(vma); + if (err) + goto err_pages; + if (flags & vma->vm->bind_async_flags) { work = i915_vma_work(); if (!work) { @@ -960,7 +972,7 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) if (wakeref) intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref); err_pages: - vma_put_pages(vma); + i915_vma_put_pages(vma); return err; } @@ -1108,17 +1120,6 @@ void i915_vma_parked(struct intel_gt *gt) spin_unlock_irq(>->closed_lock); } -static void __i915_vma_iounmap(struct i915_vma *vma) -{ - GEM_BUG_ON(i915_vma_is_pinned(vma)); - - if (vma->iomap == NULL) - return; - - io_mapping_unmap(vma->iomap); - vma->iomap = NULL; -} - void i915_vma_revoke_mmap(struct i915_vma *vma) { struct drm_vma_offset_node *node; @@ -1201,41 +1202,8 @@ int i915_vma_move_to_active(struct i915_vma *vma, return 0; } -int __i915_vma_unbind(struct i915_vma *vma) +void __i915_vma_evict(struct i915_vma *vma) { - int ret; - - lockdep_assert_held(&vma->vm->mutex); - - /* - * First wait upon any activity as retiring the request may - * have side-effects such as unpinning or even unbinding this vma. - * - * XXX Actually waiting under the vm->mutex is a hinderance and - * should be pipelined wherever possible. In cases where that is - * unavoidable, we should lift the wait to before the mutex. - */ - ret = i915_vma_sync(vma); - if (ret) - return ret; - - if (i915_vma_is_pinned(vma)) { - vma_print_allocator(vma, "is pinned"); - return -EAGAIN; - } - - /* - * After confirming that no one else is pinning this vma, wait for - * any laggards who may have crept in during the wait (through - * a residual pin skipping the vm->mutex) to complete. 
- */ - ret = i915_vma_sync(vma); - if (ret) - return ret; - - if (!drm_mm_node_allocated(&vma->node)) - return 0; - GEM_BUG_ON(i915_vma_is_pinned(vma)); GEM_BUG_ON(i915_vma_is_active(vma)); @@ -1275,7 +1243,45 @@ int __i915_vma_unbind(struct i915_vma *vma) &vma->flags); i915_vma_detach(vma); - vma_unbind_pages(vma); + i915_vma_unbind_pages(vma); +} + +int __i915_vma_unbind(struct i915_vma *vma) +{ + int ret; + + lockdep_assert_held(&vma->vm->mutex); + + /* + * First wait upon any activity as retiring the request may + * have side-effects such as unpinning or even unbinding this vma. + * + * XXX Actually waiting under the vm->mutex is a hinderance and + * should be pipelined wherever possible. In cases where that is + * unavoidable, we should lift the wait to before the mutex. + */ + ret = i915_vma_sync(vma); + if (ret) + return ret; + + if (i915_vma_is_pinned(vma)) { + vma_print_allocator(vma, "is pinned"); + return -EAGAIN; + } + + /* + * After confirming that no one else is pinning this vma, wait for + * any laggards who may have crept in during the wait (through + * a residual pin skipping the vm->mutex) to complete. + */ + ret = i915_vma_sync(vma); + if (ret) + return ret; + + if (!drm_mm_node_allocated(&vma->node)) + return 0; + + __i915_vma_evict(vma); drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */ return 0; diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 8ad1daabcd58..d72847550d1a 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -203,6 +203,7 @@ bool i915_vma_misplaced(const struct i915_vma *vma, u64 size, u64 alignment, u64 flags); void __i915_vma_set_map_and_fenceable(struct i915_vma *vma); void i915_vma_revoke_mmap(struct i915_vma *vma); +void __i915_vma_evict(struct i915_vma *vma); int __i915_vma_unbind(struct i915_vma *vma); int __must_check i915_vma_unbind(struct i915_vma *vma); void i915_vma_unlink_ctx(struct i915_vma *vma); @@ -239,6 +240,10 @@ int __must_check i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags); int i915_ggtt_pin(struct i915_vma *vma, u32 align, unsigned int flags); +int i915_vma_get_pages(struct i915_vma *vma); +void i915_vma_put_pages(struct i915_vma *vma); +void i915_vma_unbind_pages(struct i915_vma *vma); + static inline int i915_vma_pin_count(const struct i915_vma *vma) { return atomic_read(&vma->flags) & I915_VMA_PIN_MASK; @@ -293,6 +298,17 @@ static inline bool i915_node_color_differs(const struct drm_mm_node *node, void __iomem *i915_vma_pin_iomap(struct i915_vma *vma); #define IO_ERR_PTR(x) ((void __iomem *)ERR_PTR(x)) +static inline void __i915_vma_iounmap(struct i915_vma *vma) +{ + GEM_BUG_ON(i915_vma_is_pinned(vma)); + + if (!vma->iomap) + return; + + io_mapping_unmap(vma->iomap); + vma->iomap = NULL; +} + /** * i915_vma_unpin_iomap - unpins the mapping returned from i915_vma_iomap * @vma: VMA to unpin @@ -376,6 +392,7 @@ void i915_vma_make_shrinkable(struct i915_vma *vma); void i915_vma_make_purgeable(struct i915_vma *vma); int i915_vma_wait_for_bind(struct i915_vma *vma); +int i915_vma_wait_for_unbind(struct i915_vma *vma); static inline int i915_vma_sync(struct i915_vma *vma) { From patchwork Mon Mar 16 11:42:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11440251 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 
DE931913 for ; Mon, 16 Mar 2020 11:42:56 +0000 (UTC) From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Mar 2020 11:42:37 +0000 Message-Id: <20200316114237.5436-15-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200316114237.5436-1-chris@chris-wilson.co.uk> References: <20200316114237.5436-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 15/15] drm/i915/gem: Bind the fence async for execbuf Sender: "Intel-gfx" It is illegal to wait on another vma while holding the vm->mutex, as that easily leads to ABBA deadlocks (we wait on a second vma that waits on us to release the vm->mutex). So while the vm->mutex exists, move the waiting outside of the lock into the async binding pipeline.
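To see the shape of the problem, here is a minimal userspace model of the deadlock (purely illustrative: the thread names, and the semaphore standing in for the busy vma's dma_fence, are inventions of the sketch, not driver code):

/* deadlock-sketch.c: build with cc deadlock-sketch.c -lpthread */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t vm_mutex = PTHREAD_MUTEX_INITIALIZER;
static sem_t vma_idle; /* stands in for the busy vma's dma_fence */

/* execbuf: holds vm_mutex while waiting for the old vma to go idle */
static void *execbuf_thread(void *arg)
{
	pthread_mutex_lock(&vm_mutex);
	sem_wait(&vma_idle); /* blocks forever: retire needs vm_mutex */
	pthread_mutex_unlock(&vm_mutex);
	return NULL;
}

/* retire: needs vm_mutex before it can make the vma idle */
static void *retire_thread(void *arg)
{
	pthread_mutex_lock(&vm_mutex); /* never acquired */
	sem_post(&vma_idle);
	pthread_mutex_unlock(&vm_mutex);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	sem_init(&vma_idle, 0, 0);
	pthread_create(&a, NULL, execbuf_thread, NULL);
	sleep(1); /* let execbuf take the lock first */
	pthread_create(&b, NULL, retire_thread, NULL);
	sleep(2);
	printf("deadlock: execbuf waits with the lock held, retire waits for the lock\n");
	return 0; /* demo exits, leaving the deadlocked threads behind */
}

Moving the wait into the async binding pipeline breaks this cycle: the equivalent of the sem_wait() above is performed by the worker's fence chain after vm->mutex has been dropped, so retirement can always make forward progress.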
Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 21 +++-- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 87 ++++++++++++++++++- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h | 5 ++ 3 files changed, 103 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 7fb47ff185a3..db5ad6a0df28 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -472,13 +472,19 @@ eb_pin_vma(struct i915_execbuffer *eb, return false; if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_FENCE)) { - if (unlikely(i915_vma_pin_fence(vma))) { - i915_vma_unpin(vma); - return false; - } + struct i915_fence_reg *reg = vma->fence; - if (vma->fence) + /* Avoid waiting to change the fence; defer to async worker */ + if (reg) { + if (READ_ONCE(reg->dirty)) + return false; + + atomic_inc(®->pin_count); ev->flags |= __EXEC_OBJECT_HAS_FENCE; + } else { + if (i915_gem_object_is_tiled(vma->obj)) + return false; + } } ev->flags |= __EXEC_OBJECT_HAS_PIN; @@ -955,7 +961,7 @@ static int eb_reserve_vma(struct eb_vm_work *work, struct eb_vma *ev) pin: if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) { - err = __i915_vma_pin_fence(vma); /* XXX no waiting */ + err = __i915_vma_pin_fence_async(vma, &work->base); if (unlikely(err)) return err; @@ -1030,6 +1036,9 @@ static int eb_bind_vma(struct dma_fence_work *base) GEM_BUG_ON(vma->vm != vm); + if (ev->flags & __EXEC_OBJECT_HAS_FENCE) + __i915_vma_apply_fence_async(vma); + if (!ev->bind_flags) goto put; diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c index 97659a1249fd..d99d57be3505 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c @@ -21,10 +21,13 @@ * IN THE SOFTWARE. */ +#include "i915_active.h" #include "i915_drv.h" #include "i915_scatterlist.h" +#include "i915_sw_fence_work.h" #include "i915_pvinfo.h" #include "i915_vgpu.h" +#include "i915_vma.h" /** * DOC: fence register handling @@ -336,16 +339,14 @@ static struct i915_fence_reg *fence_find(struct i915_ggtt *ggtt) int __i915_vma_pin_fence(struct i915_vma *vma) { struct i915_ggtt *ggtt = i915_vm_to_ggtt(vma->vm); - struct i915_fence_reg *fence; + struct i915_fence_reg *fence = vma->fence; struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL; int err; lockdep_assert_held(&vma->vm->mutex); /* Just update our place in the LRU if our fence is getting reused. */ - if (vma->fence) { - fence = vma->fence; - GEM_BUG_ON(fence->vma != vma); + if (fence && fence->vma == vma) { atomic_inc(&fence->pin_count); if (!fence->dirty) { list_move_tail(&fence->link, &ggtt->fence_list); @@ -377,6 +378,84 @@ int __i915_vma_pin_fence(struct i915_vma *vma) return err; } +int __i915_vma_pin_fence_async(struct i915_vma *vma, + struct dma_fence_work *work) +{ + struct i915_ggtt *ggtt = i915_vm_to_ggtt(vma->vm); + struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL; + struct i915_fence_reg *fence = vma->fence; + int err; + + lockdep_assert_held(&vma->vm->mutex); + + /* Just update our place in the LRU if our fence is getting reused. 
*/ + if (fence && fence->vma == vma) { + } else if (set) { + fence = fence_find(ggtt); + if (IS_ERR(fence)) + return PTR_ERR(fence); + + GEM_BUG_ON(atomic_read(&fence->pin_count)); + fence->dirty = true; + } else { + return 0; + } + + atomic_inc(&fence->pin_count); + if (!fence->dirty) { + list_move_tail(&fence->link, &ggtt->fence_list); + return 0; + } + + fence->tiling = 0; + if (set) { + if (INTEL_GEN(fence_to_i915(fence)) < 4) { + /* implicit 'unfenced' GPU blits */ + err = i915_sw_fence_await_active(&work->chain, + &vma->active, + I915_ACTIVE_AWAIT_ALL); + if (err) { + atomic_dec(&fence->pin_count); + return err; + } + } + + fence->start = vma->node.start; + fence->size = vma->fence_size; + fence->stride = i915_gem_object_get_stride(vma->obj); + fence->tiling = i915_gem_object_get_tiling(vma->obj); + } + + set = xchg(&fence->vma, vma); + if (set) { + err = i915_sw_fence_await_active(&work->chain, + &fence->active, + I915_ACTIVE_AWAIT_ALL); + if (err) { + fence->vma = set; + atomic_dec(&fence->pin_count); + return err; + } + + if (set != vma) { + GEM_BUG_ON(set->fence != fence); + i915_vma_revoke_mmap(set); + set->fence = NULL; + } + } + + vma->fence = fence; + return 0; +} + +void __i915_vma_apply_fence_async(struct i915_vma *vma) +{ + struct i915_fence_reg *fence = vma->fence; + + if (fence->dirty) + fence_write(fence); +} + /** * i915_vma_pin_fence - set up fencing for a vma * @vma: vma to map through a fence reg diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h index 9eef679e1311..d306ac14d47e 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h @@ -30,6 +30,7 @@ #include "i915_active.h" +struct dma_fence_work; struct drm_i915_gem_object; struct i915_ggtt; struct i915_vma; @@ -70,6 +71,10 @@ void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj, void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj, struct sg_table *pages); +int __i915_vma_pin_fence_async(struct i915_vma *vma, + struct dma_fence_work *work); +void __i915_vma_apply_fence_async(struct i915_vma *vma); + void intel_ggtt_init_fences(struct i915_ggtt *ggtt); void intel_ggtt_fini_fences(struct i915_ggtt *ggtt);
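To summarise the flow added above: __i915_vma_pin_fence_async() only claims the fence register and records the new detiling parameters while under vm->mutex, chaining any conflicting user into the worker with i915_sw_fence_await_active() rather than waiting on it, and __i915_vma_apply_fence_async() then performs the deferred fence_write() from eb_bind_vma() once those dependencies have signalled. The same reserve-now/write-later split can be modelled in userspace (a schematic sketch; all names, and the semaphore standing in for the awaited activity, are illustrative rather than driver code):

/* fence-async-sketch.c: build with cc fence-async-sketch.c -lpthread */
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

struct fence_slot {
	sem_t old_user_idle;	/* stands in for the awaited activity/fences */
	bool dirty;		/* register contents no longer match the owner */
	unsigned long start;	/* desired detiling parameters */
};

/* "pin" under the lock: claim the slot and record state, never block */
static void pin_fence_async(struct fence_slot *slot, unsigned long start)
{
	slot->start = start;
	slot->dirty = true;	/* the worker must rewrite the register */
}

/* worker: runs once the old user signals, then writes the register */
static void *bind_worker(void *arg)
{
	struct fence_slot *slot = arg;

	sem_wait(&slot->old_user_idle);	/* await the old user's activity */
	if (slot->dirty) {
		printf("fence_write: start=%#lx\n", slot->start); /* stand-in MMIO */
		slot->dirty = false;
	}
	return NULL;
}

int main(void)
{
	struct fence_slot slot = { .dirty = false, .start = 0 };
	pthread_t worker;

	sem_init(&slot.old_user_idle, 0, 0);
	pin_fence_async(&slot, 0x10000);	/* synchronous, non-blocking */
	pthread_create(&worker, NULL, bind_worker, &slot);

	sleep(1);				/* the old user finishes... */
	sem_post(&slot.old_user_idle);		/* ...and signals idle */

	pthread_join(worker, NULL);
	return 0;
}

The property that matters is that the claiming step never blocks, so it is safe to run under the mutex, while anything that may wait runs on the worker side.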