From patchwork Thu Jun 27 06:30:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11018893 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8959D1580 for ; Thu, 27 Jun 2019 06:31:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7A91128A1B for ; Thu, 27 Jun 2019 06:31:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6DC6428A26; Thu, 27 Jun 2019 06:31:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 1944A28A1B for ; Thu, 27 Jun 2019 06:31:25 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id ABD476E5B1; Thu, 27 Jun 2019 06:31:24 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 085056E5B1 for ; Thu, 27 Jun 2019 06:31:21 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17038744-1500050 for multiple; Thu, 27 Jun 2019 07:30:47 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 27 Jun 2019 07:30:38 +0100 Message-Id: <20190627063044.11858-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 1/7] drm/i915: Teach execbuffer to take the engine wakeref not GT X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.auld@intel.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP In the next patch, we would like to couple into the engine wakeref to free the batch pool on idling. The caveat here is that we therefore want to track the engine wakeref more precisely and to hold it instead of the broader GT wakeref as we process the ioctl. Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 36 ++++++++++++------- drivers/gpu/drm/i915/gt/intel_context.h | 7 ++++ 2 files changed, 31 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 1c5dfbfad71b..f43eaaa5db5f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2143,13 +2143,35 @@ static int eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce) if (err) return err; + /* + * Take a local wakeref for preparing to dispatch the execbuf as + * we expect to access the hardware fairly frequently in the + * process. Upon first dispatch, we acquire another prolonged + * wakeref that we hold until the GPU has been idle for at least + * 100ms. + */ + err = intel_context_timeline_lock(ce); + if (err) + goto err_unpin; + + intel_context_enter(ce); + intel_context_timeline_unlock(ce); + eb->engine = ce->engine; eb->context = ce; return 0; + +err_unpin: + intel_context_unpin(ce); + return err; } static void eb_unpin_context(struct i915_execbuffer *eb) { + __intel_context_timeline_lock(eb->context); + intel_context_exit(eb->context); + intel_context_timeline_unlock(eb->context); + intel_context_unpin(eb->context); } @@ -2430,18 +2452,9 @@ i915_gem_do_execbuffer(struct drm_device *dev, if (unlikely(err)) goto err_destroy; - /* - * Take a local wakeref for preparing to dispatch the execbuf as - * we expect to access the hardware fairly frequently in the - * process. Upon first dispatch, we acquire another prolonged - * wakeref that we hold until the GPU has been idle for at least - * 100ms. - */ - intel_gt_pm_get(&eb.i915->gt); - err = i915_mutex_lock_interruptible(dev); if (err) - goto err_rpm; + goto err_context; err = eb_select_engine(&eb, file, args); if (unlikely(err)) @@ -2606,8 +2619,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, eb_unpin_context(&eb); err_unlock: mutex_unlock(&dev->struct_mutex); -err_rpm: - intel_gt_pm_put(&eb.i915->gt); +err_context: i915_gem_context_put(eb.gem_context); err_destroy: eb_destroy(&eb); diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h index 40cd8320fcc3..065ba4ac4e87 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.h +++ b/drivers/gpu/drm/i915/gt/intel_context.h @@ -126,6 +126,13 @@ static inline void intel_context_put(struct intel_context *ce) kref_put(&ce->ref, ce->ops->destroy); } +static inline void +__intel_context_timeline_lock(struct intel_context *ce) + __acquires(&ce->ring->timeline->mutex) +{ + mutex_lock(&ce->ring->timeline->mutex); +} + static inline int __must_check intel_context_timeline_lock(struct intel_context *ce) __acquires(&ce->ring->timeline->mutex) From patchwork Thu Jun 27 06:30:39 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11018891 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 967FB14E5 for ; Thu, 27 Jun 2019 06:31:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8645E28A1B for ; Thu, 27 Jun 2019 06:31:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7ADC528A26; Thu, 27 Jun 2019 06:31:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id E0ADA28A1B for ; Thu, 27 Jun 2019 06:31:21 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4EF1D6E5B0; Thu, 27 Jun 2019 06:31:21 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id E05FE6E5AF for ; Thu, 27 Jun 2019 06:31:13 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17038745-1500050 for multiple; Thu, 27 Jun 2019 07:30:48 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 27 Jun 2019 07:30:39 +0100 Message-Id: <20190627063044.11858-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190627063044.11858-1-chris@chris-wilson.co.uk> References: <20190627063044.11858-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 2/7] drm/i915/gt: Track timeline activeness in enter/exit X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.auld@intel.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Lift moving the timeline to/from the active_list on enter/exit in order to shorten the active tracking span in comparison to the existing pin/unpin. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_pm.c | 1 - drivers/gpu/drm/i915/gt/intel_context.c | 2 + drivers/gpu/drm/i915/gt/intel_engine_pm.c | 1 + drivers/gpu/drm/i915/gt/intel_lrc.c | 4 + drivers/gpu/drm/i915/gt/intel_timeline.c | 98 +++++++------------ drivers/gpu/drm/i915/gt/intel_timeline.h | 3 +- .../gpu/drm/i915/gt/intel_timeline_types.h | 1 + drivers/gpu/drm/i915/gt/selftest_timeline.c | 2 - 8 files changed, 46 insertions(+), 66 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c index 4d774376f5b8..93d188526457 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c @@ -38,7 +38,6 @@ static void i915_gem_park(struct drm_i915_private *i915) i915_gem_batch_pool_fini(&engine->batch_pool); } - intel_timelines_park(i915); i915_vma_parked(i915); i915_globals_park(); diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 1110fc8f657a..0111a18c1f02 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -232,10 +232,12 @@ int __init i915_global_context_init(void) void intel_context_enter_engine(struct intel_context *ce) { intel_engine_pm_get(ce->engine); + intel_timeline_enter(ce->ring->timeline); } void intel_context_exit_engine(struct intel_context *ce) { + intel_timeline_exit(ce->ring->timeline); intel_engine_pm_put(ce->engine); } diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index 84e432abe8e0..9751a02d86bc 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -88,6 +88,7 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine) /* Check again on the next retirement. */ engine->wakeref_serial = engine->serial + 1; + intel_timeline_enter(rq->timeline); i915_request_add_barriers(rq); __i915_request_commit(rq); diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 471e134de186..2bc25e3f83e6 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -3166,6 +3166,8 @@ static void virtual_context_enter(struct intel_context *ce) for (n = 0; n < ve->num_siblings; n++) intel_engine_pm_get(ve->siblings[n]); + + intel_timeline_enter(ce->ring->timeline); } static void virtual_context_exit(struct intel_context *ce) @@ -3173,6 +3175,8 @@ static void virtual_context_exit(struct intel_context *ce) struct virtual_engine *ve = container_of(ce, typeof(*ve), context); unsigned int n; + intel_timeline_exit(ce->ring->timeline); + for (n = 0; n < ve->num_siblings; n++) intel_engine_pm_put(ve->siblings[n]); } diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index 6daa9eb59e19..4af0b9801d91 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -278,64 +278,11 @@ void intel_timelines_init(struct drm_i915_private *i915) timelines_init(&i915->gt); } -static void timeline_add_to_active(struct intel_timeline *tl) -{ - struct intel_gt_timelines *gt = &tl->gt->timelines; - - mutex_lock(>->mutex); - list_add(&tl->link, >->active_list); - mutex_unlock(>->mutex); -} - -static void timeline_remove_from_active(struct intel_timeline *tl) -{ - struct intel_gt_timelines *gt = &tl->gt->timelines; - - mutex_lock(>->mutex); - list_del(&tl->link); - mutex_unlock(>->mutex); -} - -static void timelines_park(struct intel_gt *gt) -{ - struct intel_gt_timelines *timelines = >->timelines; - struct intel_timeline *timeline; - - mutex_lock(&timelines->mutex); - list_for_each_entry(timeline, &timelines->active_list, link) { - /* - * All known fences are completed so we can scrap - * the current sync point tracking and start afresh, - * any attempt to wait upon a previous sync point - * will be skipped as the fence was signaled. - */ - i915_syncmap_free(&timeline->sync); - } - mutex_unlock(&timelines->mutex); -} - -/** - * intel_timelines_park - called when the driver idles - * @i915: the drm_i915_private device - * - * When the driver is completely idle, we know that all of our sync points - * have been signaled and our tracking is then entirely redundant. Any request - * to wait upon an older sync point will be completed instantly as we know - * the fence is signaled and therefore we will not even look them up in the - * sync point map. - */ -void intel_timelines_park(struct drm_i915_private *i915) -{ - timelines_park(&i915->gt); -} - void intel_timeline_fini(struct intel_timeline *timeline) { GEM_BUG_ON(timeline->pin_count); GEM_BUG_ON(!list_empty(&timeline->requests)); - i915_syncmap_free(&timeline->sync); - if (timeline->hwsp_cacheline) cacheline_free(timeline->hwsp_cacheline); else @@ -370,6 +317,7 @@ int intel_timeline_pin(struct intel_timeline *tl) if (tl->pin_count++) return 0; GEM_BUG_ON(!tl->pin_count); + GEM_BUG_ON(tl->active_count); err = i915_vma_pin(tl->hwsp_ggtt, 0, 0, PIN_GLOBAL | PIN_HIGH); if (err) @@ -380,7 +328,6 @@ int intel_timeline_pin(struct intel_timeline *tl) offset_in_page(tl->hwsp_offset); cacheline_acquire(tl->hwsp_cacheline); - timeline_add_to_active(tl); return 0; @@ -389,6 +336,40 @@ int intel_timeline_pin(struct intel_timeline *tl) return err; } +void intel_timeline_enter(struct intel_timeline *tl) +{ + struct intel_gt_timelines *timelines = &tl->gt->timelines; + + GEM_BUG_ON(!tl->pin_count); + if (tl->active_count++) + return; + GEM_BUG_ON(!tl->active_count); /* overflow? */ + + mutex_lock(&timelines->mutex); + list_add(&tl->link, &timelines->active_list); + mutex_unlock(&timelines->mutex); +} + +void intel_timeline_exit(struct intel_timeline *tl) +{ + struct intel_gt_timelines *timelines = &tl->gt->timelines; + + GEM_BUG_ON(!tl->active_count); + if (--tl->active_count) + return; + + mutex_lock(&timelines->mutex); + list_del(&tl->link); + mutex_unlock(&timelines->mutex); + + /* + * Since this timeline is idle, all bariers upon which we were waiting + * must also be complete and so we can discard the last used barriers + * without loss of information. + */ + i915_syncmap_free(&tl->sync); +} + static u32 timeline_advance(struct intel_timeline *tl) { GEM_BUG_ON(!tl->pin_count); @@ -546,16 +527,9 @@ void intel_timeline_unpin(struct intel_timeline *tl) if (--tl->pin_count) return; - timeline_remove_from_active(tl); + GEM_BUG_ON(tl->active_count); cacheline_release(tl->hwsp_cacheline); - /* - * Since this timeline is idle, all bariers upon which we were waiting - * must also be complete and so we can discard the last used barriers - * without loss of information. - */ - i915_syncmap_free(&tl->sync); - __i915_vma_unpin(tl->hwsp_ggtt); } diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.h b/drivers/gpu/drm/i915/gt/intel_timeline.h index e08cebf64833..f583af1ba18d 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.h +++ b/drivers/gpu/drm/i915/gt/intel_timeline.h @@ -77,9 +77,11 @@ static inline bool intel_timeline_sync_is_later(struct intel_timeline *tl, } int intel_timeline_pin(struct intel_timeline *tl); +void intel_timeline_enter(struct intel_timeline *tl); int intel_timeline_get_seqno(struct intel_timeline *tl, struct i915_request *rq, u32 *seqno); +void intel_timeline_exit(struct intel_timeline *tl); void intel_timeline_unpin(struct intel_timeline *tl); int intel_timeline_read_hwsp(struct i915_request *from, @@ -87,7 +89,6 @@ int intel_timeline_read_hwsp(struct i915_request *from, u32 *hwsp_offset); void intel_timelines_init(struct drm_i915_private *i915); -void intel_timelines_park(struct drm_i915_private *i915); void intel_timelines_fini(struct drm_i915_private *i915); #endif diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h index 9a71aea7a338..b820ee76b7f5 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h +++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h @@ -58,6 +58,7 @@ struct intel_timeline { */ struct i915_syncmap *sync; + unsigned int active_count; struct list_head link; struct intel_gt *gt; diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c index 193cc564ade2..bc7ed242db3e 100644 --- a/drivers/gpu/drm/i915/gt/selftest_timeline.c +++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c @@ -815,8 +815,6 @@ static int live_hwsp_recycle(void *arg) if (err) goto out; - - intel_timelines_park(i915); /* Encourage recycling! */ } while (!__igt_timeout(end_time, NULL)); } From patchwork Thu Jun 27 06:30:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11018887 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E64CC14E5 for ; Thu, 27 Jun 2019 06:31:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CD75328A1B for ; Thu, 27 Jun 2019 06:31:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C1C4F28A2D; Thu, 27 Jun 2019 06:31:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id D902C28A1B for ; Thu, 27 Jun 2019 06:31:15 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C64CE6E5AD; Thu, 27 Jun 2019 06:31:13 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id E3EBF89CAF for ; Thu, 27 Jun 2019 06:31:11 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17038746-1500050 for multiple; Thu, 27 Jun 2019 07:30:48 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 27 Jun 2019 07:30:40 +0100 Message-Id: <20190627063044.11858-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190627063044.11858-1-chris@chris-wilson.co.uk> References: <20190627063044.11858-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 3/7] drm/i915/gt: Convert timeline tracking to spinlock X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.auld@intel.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Convert the list manipulation of active to use spinlocks so that we can perform the updates from underneath a quick interrupt callback. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_gt_types.h | 2 +- drivers/gpu/drm/i915/gt/intel_reset.c | 13 ++++++++++--- drivers/gpu/drm/i915/gt/intel_timeline.c | 12 +++++------- drivers/gpu/drm/i915/i915_gem.c | 20 ++++++++++---------- 4 files changed, 26 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index c03e56628ee2..cfd41e6c54e1 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -26,7 +26,7 @@ struct intel_gt { struct i915_ggtt *ggtt; struct intel_gt_timelines { - struct mutex mutex; /* protects list */ + spinlock_t lock; /* protects active_list */ struct list_head active_list; /* Pack multiple timelines' seqnos into the same page */ diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index adfdb908587f..72002c0f9698 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -858,6 +858,7 @@ void i915_gem_set_wedged(struct drm_i915_private *i915) static bool __i915_gem_unset_wedged(struct drm_i915_private *i915) { struct i915_gpu_error *error = &i915->gpu_error; + struct intel_gt_timelines *timelines = &i915->gt.timelines; struct intel_timeline *tl; if (!test_bit(I915_WEDGED, &error->flags)) @@ -878,14 +879,16 @@ static bool __i915_gem_unset_wedged(struct drm_i915_private *i915) * * No more can be submitted until we reset the wedged bit. */ - mutex_lock(&i915->gt.timelines.mutex); - list_for_each_entry(tl, &i915->gt.timelines.active_list, link) { + spin_lock(&timelines->lock); + list_for_each_entry(tl, &timelines->active_list, link) { struct i915_request *rq; rq = i915_active_request_get_unlocked(&tl->last_request); if (!rq) continue; + spin_unlock(&timelines->lock); + /* * All internal dependencies (i915_requests) will have * been flushed by the set-wedge, but we may be stuck waiting @@ -895,8 +898,12 @@ static bool __i915_gem_unset_wedged(struct drm_i915_private *i915) */ dma_fence_default_wait(&rq->fence, false, MAX_SCHEDULE_TIMEOUT); i915_request_put(rq); + + /* Restart iteration after droping lock */ + spin_lock(&timelines->lock); + tl = list_entry(&timelines->active_list, typeof(*tl), link); } - mutex_unlock(&i915->gt.timelines.mutex); + spin_unlock(&timelines->lock); intel_gt_sanitize(&i915->gt, false); diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index 4af0b9801d91..355dfc52c804 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -266,7 +266,7 @@ static void timelines_init(struct intel_gt *gt) { struct intel_gt_timelines *timelines = >->timelines; - mutex_init(&timelines->mutex); + spin_lock_init(&timelines->lock); INIT_LIST_HEAD(&timelines->active_list); spin_lock_init(&timelines->hwsp_lock); @@ -345,9 +345,9 @@ void intel_timeline_enter(struct intel_timeline *tl) return; GEM_BUG_ON(!tl->active_count); /* overflow? */ - mutex_lock(&timelines->mutex); + spin_lock(&timelines->lock); list_add(&tl->link, &timelines->active_list); - mutex_unlock(&timelines->mutex); + spin_unlock(&timelines->lock); } void intel_timeline_exit(struct intel_timeline *tl) @@ -358,9 +358,9 @@ void intel_timeline_exit(struct intel_timeline *tl) if (--tl->active_count) return; - mutex_lock(&timelines->mutex); + spin_lock(&timelines->lock); list_del(&tl->link); - mutex_unlock(&timelines->mutex); + spin_unlock(&timelines->lock); /* * Since this timeline is idle, all bariers upon which we were waiting @@ -548,8 +548,6 @@ static void timelines_fini(struct intel_gt *gt) GEM_BUG_ON(!list_empty(&timelines->active_list)); GEM_BUG_ON(!list_empty(&timelines->hwsp_free_list)); - - mutex_destroy(&timelines->mutex); } void intel_timelines_fini(struct drm_i915_private *i915) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index b7f290b77f8f..4d2cf149ffe6 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -905,20 +905,20 @@ static int wait_for_engines(struct drm_i915_private *i915) static long wait_for_timelines(struct drm_i915_private *i915, - unsigned int flags, long timeout) + unsigned int wait, long timeout) { - struct intel_gt_timelines *gt = &i915->gt.timelines; + struct intel_gt_timelines *timelines = &i915->gt.timelines; struct intel_timeline *tl; - mutex_lock(>->mutex); - list_for_each_entry(tl, >->active_list, link) { + spin_lock(&timelines->lock); + list_for_each_entry(tl, &timelines->active_list, link) { struct i915_request *rq; rq = i915_active_request_get_unlocked(&tl->last_request); if (!rq) continue; - mutex_unlock(>->mutex); + spin_unlock(&timelines->lock); /* * "Race-to-idle". @@ -929,19 +929,19 @@ wait_for_timelines(struct drm_i915_private *i915, * want to complete as quickly as possible to avoid prolonged * stalls, so allow the gpu to boost to maximum clocks. */ - if (flags & I915_WAIT_FOR_IDLE_BOOST) + if (wait & I915_WAIT_FOR_IDLE_BOOST) gen6_rps_boost(rq); - timeout = i915_request_wait(rq, flags, timeout); + timeout = i915_request_wait(rq, wait, timeout); i915_request_put(rq); if (timeout < 0) return timeout; /* restart after reacquiring the lock */ - mutex_lock(>->mutex); - tl = list_entry(>->active_list, typeof(*tl), link); + spin_lock(&timelines->lock); + tl = list_entry(&timelines->active_list, typeof(*tl), link); } - mutex_unlock(>->mutex); + spin_unlock(&timelines->lock); return timeout; } From patchwork Thu Jun 27 06:30:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11018899 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6EF5214E5 for ; Thu, 27 Jun 2019 06:32:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5FF9D28A1F for ; Thu, 27 Jun 2019 06:32:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5483928A2D; Thu, 27 Jun 2019 06:32:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id F1C7928A1F for ; Thu, 27 Jun 2019 06:32:46 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E2AC86E5AE; Thu, 27 Jun 2019 06:32:45 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5D57A89CAF for ; Thu, 27 Jun 2019 06:32:40 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17038747-1500050 for multiple; Thu, 27 Jun 2019 07:30:48 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 27 Jun 2019 07:30:41 +0100 Message-Id: <20190627063044.11858-4-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190627063044.11858-1-chris@chris-wilson.co.uk> References: <20190627063044.11858-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 4/7] drm/i915/gt: Guard timeline pinning with its own mutex X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.auld@intel.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP In preparation for removing struct_mutex from around context retirement, we need to make timeline pinning safe. Since multiple engines/contexts can share a single timeline, it needs to be protected by a mutex. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_timeline.c | 27 +++++++++---------- .../gpu/drm/i915/gt/intel_timeline_types.h | 2 +- drivers/gpu/drm/i915/gt/mock_engine.c | 6 ++--- 3 files changed, 16 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index 355dfc52c804..7b476cd55dac 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -211,9 +211,9 @@ int intel_timeline_init(struct intel_timeline *timeline, void *vaddr; kref_init(&timeline->kref); + atomic_set(&timeline->pin_count, 0); timeline->gt = gt; - timeline->pin_count = 0; timeline->has_initial_breadcrumb = !hwsp; timeline->hwsp_cacheline = NULL; @@ -280,7 +280,7 @@ void intel_timelines_init(struct drm_i915_private *i915) void intel_timeline_fini(struct intel_timeline *timeline) { - GEM_BUG_ON(timeline->pin_count); + GEM_BUG_ON(atomic_read(&timeline->pin_count)); GEM_BUG_ON(!list_empty(&timeline->requests)); if (timeline->hwsp_cacheline) @@ -314,33 +314,31 @@ int intel_timeline_pin(struct intel_timeline *tl) { int err; - if (tl->pin_count++) + if (atomic_add_unless(&tl->pin_count, 1, 0)) return 0; - GEM_BUG_ON(!tl->pin_count); - GEM_BUG_ON(tl->active_count); err = i915_vma_pin(tl->hwsp_ggtt, 0, 0, PIN_GLOBAL | PIN_HIGH); if (err) - goto unpin; + return err; tl->hwsp_offset = i915_ggtt_offset(tl->hwsp_ggtt) + offset_in_page(tl->hwsp_offset); cacheline_acquire(tl->hwsp_cacheline); + if (atomic_fetch_inc(&tl->pin_count)) { + cacheline_release(tl->hwsp_cacheline); + __i915_vma_unpin(tl->hwsp_ggtt); + } return 0; - -unpin: - tl->pin_count = 0; - return err; } void intel_timeline_enter(struct intel_timeline *tl) { struct intel_gt_timelines *timelines = &tl->gt->timelines; - GEM_BUG_ON(!tl->pin_count); + GEM_BUG_ON(!atomic_read(&tl->pin_count)); if (tl->active_count++) return; GEM_BUG_ON(!tl->active_count); /* overflow? */ @@ -372,7 +370,7 @@ void intel_timeline_exit(struct intel_timeline *tl) static u32 timeline_advance(struct intel_timeline *tl) { - GEM_BUG_ON(!tl->pin_count); + GEM_BUG_ON(!atomic_read(&tl->pin_count)); GEM_BUG_ON(tl->seqno & tl->has_initial_breadcrumb); return tl->seqno += 1 + tl->has_initial_breadcrumb; @@ -523,11 +521,10 @@ int intel_timeline_read_hwsp(struct i915_request *from, void intel_timeline_unpin(struct intel_timeline *tl) { - GEM_BUG_ON(!tl->pin_count); - if (--tl->pin_count) + GEM_BUG_ON(!atomic_read(&tl->pin_count)); + if (!atomic_dec_and_test(&tl->pin_count)) return; - GEM_BUG_ON(tl->active_count); cacheline_release(tl->hwsp_cacheline); __i915_vma_unpin(tl->hwsp_ggtt); diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h index b820ee76b7f5..8dd14a2b8781 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h +++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h @@ -25,7 +25,7 @@ struct intel_timeline { struct mutex mutex; /* protects the flow of requests */ - unsigned int pin_count; + atomic_t pin_count; const u32 *hwsp_seqno; struct i915_vma *hwsp_ggtt; u32 hwsp_offset; diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index 490ebd121f4c..a48b36d31e65 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -38,13 +38,13 @@ struct mock_ring { static void mock_timeline_pin(struct intel_timeline *tl) { - tl->pin_count++; + atomic_inc(&tl->pin_count); } static void mock_timeline_unpin(struct intel_timeline *tl) { - GEM_BUG_ON(!tl->pin_count); - tl->pin_count--; + GEM_BUG_ON(!atomic_read(&tl->pin_count)); + atomic_dec(&tl->pin_count); } static struct intel_ring *mock_ring(struct intel_engine_cs *engine) From patchwork Thu Jun 27 06:30:42 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11018897 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D74F314E5 for ; Thu, 27 Jun 2019 06:32:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C869828A1B for ; Thu, 27 Jun 2019 06:32:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BCB1428A26; Thu, 27 Jun 2019 06:32:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 617F428A1B for ; Thu, 27 Jun 2019 06:32:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E860E89CAF; Thu, 27 Jun 2019 06:32:41 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 62E916E5AE for ; Thu, 27 Jun 2019 06:32:40 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17038748-1500050 for multiple; Thu, 27 Jun 2019 07:30:48 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 27 Jun 2019 07:30:42 +0100 Message-Id: <20190627063044.11858-5-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190627063044.11858-1-chris@chris-wilson.co.uk> References: <20190627063044.11858-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 5/7] drm/i915: Include the breadcrumb when asserting request completion X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.auld@intel.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP As we explode, include the breadcrumb results in the trace for why we choose to explode on the request completion assertion failure. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_lrc.c | 2 +- drivers/gpu/drm/i915/i915_request.c | 2 +- drivers/gpu/drm/i915/i915_request.h | 12 ++++++++++++ 3 files changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 2bc25e3f83e6..ff8eb3fa3fc6 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1381,7 +1381,7 @@ static void process_csb(struct intel_engine_cs *engine) * coherent (visible from the CPU) before the * user interrupt and CSB is processed. */ - GEM_BUG_ON(!i915_request_completed(rq)); + assert_request_complete(rq); execlists_schedule_out(rq); GEM_BUG_ON(execlists->active - execlists->inflight > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 5ff87c4a0cd5..5b1c638d3754 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -329,7 +329,7 @@ void i915_request_retire_upto(struct i915_request *rq) hwsp_seqno(rq)); lockdep_assert_held(&rq->i915->drm.struct_mutex); - GEM_BUG_ON(!i915_request_completed(rq)); + assert_request_complete(rq); if (list_empty(&rq->ring_link)) return; diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index b58ceef92e20..a41791637076 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -432,4 +432,16 @@ static inline void i915_request_mark_complete(struct i915_request *rq) bool i915_retire_requests(struct drm_i915_private *i915); +static inline void assert_request_complete(const struct i915_request *rq) +{ + u32 hwsp = hwsp_seqno(rq); + + if (i915_seqno_passed(hwsp, rq->fence.seqno)) + return; + + GEM_TRACE("fence %llx:%llu incomplete, current %u\n", + rq->fence.context, rq->fence.seqno, hwsp); + GEM_BUG_ON(!i915_seqno_passed(hwsp, rq->fence.seqno)); +} + #endif /* I915_REQUEST_H */ From patchwork Thu Jun 27 06:30:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11018901 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F03E71708 for ; Thu, 27 Jun 2019 06:32:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DEAC028A1B for ; Thu, 27 Jun 2019 06:32:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D299E28A26; Thu, 27 Jun 2019 06:32:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 872A228A1B for ; Thu, 27 Jun 2019 06:32:46 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E61296E5AF; Thu, 27 Jun 2019 06:32:45 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 31B2389CAF for ; Thu, 27 Jun 2019 06:32:41 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17038749-1500050 for multiple; Thu, 27 Jun 2019 07:30:48 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 27 Jun 2019 07:30:43 +0100 Message-Id: <20190627063044.11858-6-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190627063044.11858-1-chris@chris-wilson.co.uk> References: <20190627063044.11858-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 6/7] drm/i915: Protect request retirement with timeline->mutex X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.auld@intel.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Forgo the struct_mutex requirement for request retirement as we have been transitioning over to only using the timeline->mutex for controlling the lifetime of a request on that timeline. Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 192 ++++++++++-------- drivers/gpu/drm/i915/gt/intel_context.h | 25 +-- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 1 - drivers/gpu/drm/i915/gt/intel_engine_types.h | 2 - drivers/gpu/drm/i915/gt/intel_gt.c | 1 - drivers/gpu/drm/i915/gt/intel_gt_types.h | 2 - drivers/gpu/drm/i915/gt/intel_lrc.c | 1 + drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 13 +- drivers/gpu/drm/i915/gt/mock_engine.c | 1 - drivers/gpu/drm/i915/i915_request.c | 151 +++++++------- drivers/gpu/drm/i915/i915_request.h | 3 - 11 files changed, 203 insertions(+), 189 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index f43eaaa5db5f..80c9c57a302f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -739,63 +739,6 @@ static int eb_select_context(struct i915_execbuffer *eb) return 0; } -static struct i915_request *__eb_wait_for_ring(struct intel_ring *ring) -{ - struct i915_request *rq; - - /* - * Completely unscientific finger-in-the-air estimates for suitable - * maximum user request size (to avoid blocking) and then backoff. - */ - if (intel_ring_update_space(ring) >= PAGE_SIZE) - return NULL; - - /* - * Find a request that after waiting upon, there will be at least half - * the ring available. The hysteresis allows us to compete for the - * shared ring and should mean that we sleep less often prior to - * claiming our resources, but not so long that the ring completely - * drains before we can submit our next request. - */ - list_for_each_entry(rq, &ring->request_list, ring_link) { - if (__intel_ring_space(rq->postfix, - ring->emit, ring->size) > ring->size / 2) - break; - } - if (&rq->ring_link == &ring->request_list) - return NULL; /* weird, we will check again later for real */ - - return i915_request_get(rq); -} - -static int eb_wait_for_ring(const struct i915_execbuffer *eb) -{ - struct i915_request *rq; - int ret = 0; - - /* - * Apply a light amount of backpressure to prevent excessive hogs - * from blocking waiting for space whilst holding struct_mutex and - * keeping all of their resources pinned. - */ - - rq = __eb_wait_for_ring(eb->context->ring); - if (rq) { - mutex_unlock(&eb->i915->drm.struct_mutex); - - if (i915_request_wait(rq, - I915_WAIT_INTERRUPTIBLE, - MAX_SCHEDULE_TIMEOUT) < 0) - ret = -EINTR; - - i915_request_put(rq); - - mutex_lock(&eb->i915->drm.struct_mutex); - } - - return ret; -} - static int eb_lookup_vmas(struct i915_execbuffer *eb) { struct radix_tree_root *handles_vma = &eb->gem_context->handles_vma; @@ -2122,10 +2065,75 @@ static const enum intel_engine_id user_ring_map[] = { [I915_EXEC_VEBOX] = VECS0 }; -static int eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce) +static struct i915_request *eb_throttle(struct intel_context *ce) +{ + struct intel_ring *ring = ce->ring; + struct intel_timeline *tl = ring->timeline; + struct i915_request *rq; + + /* + * Completely unscientific finger-in-the-air estimates for suitable + * maximum user request size (to avoid blocking) and then backoff. + */ + if (intel_ring_update_space(ring) >= PAGE_SIZE) + return NULL; + + /* + * Find a request that after waiting upon, there will be at least half + * the ring available. The hysteresis allows us to compete for the + * shared ring and should mean that we sleep less often prior to + * claiming our resources, but not so long that the ring completely + * drains before we can submit our next request. + */ + list_for_each_entry(rq, &tl->requests, link) { + if (rq->ring != ring) + continue; + + if (__intel_ring_space(rq->postfix, + ring->emit, ring->size) > ring->size / 2) + break; + } + if (&rq->link == &tl->requests) + return NULL; /* weird, we will check again later for real */ + + return i915_request_get(rq); +} + +static int +__eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce) { int err; + if (likely(atomic_inc_not_zero(&ce->pin_count))) + return 0; + + err = mutex_lock_interruptible(&eb->i915->drm.struct_mutex); + if (err) + return err; + + err = __intel_context_do_pin(ce); + mutex_unlock(&eb->i915->drm.struct_mutex); + + return err; +} + +static void +__eb_unpin_context(struct i915_execbuffer *eb, struct intel_context *ce) +{ + if (likely(atomic_add_unless(&ce->pin_count, -1, 1))) + return; + + mutex_lock(&eb->i915->drm.struct_mutex); + intel_context_unpin(ce); + mutex_unlock(&eb->i915->drm.struct_mutex); +} + +static int __eb_pin_engine(struct i915_execbuffer *eb, struct intel_context *ce) +{ + struct intel_timeline *tl; + struct i915_request *rq; + int err; + /* * ABI: Before userspace accesses the GPU (e.g. execbuffer), report * EIO if the GPU is already wedged. @@ -2139,7 +2147,7 @@ static int eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce) * GGTT space, so do this first before we reserve a seqno for * ourselves. */ - err = intel_context_pin(ce); + err = __eb_pin_context(eb, ce); if (err) return err; @@ -2150,29 +2158,52 @@ static int eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce) * wakeref that we hold until the GPU has been idle for at least * 100ms. */ - err = intel_context_timeline_lock(ce); - if (err) + tl = intel_context_timeline_lock(ce); + if (IS_ERR(tl)) { + err = PTR_ERR(tl); goto err_unpin; + } intel_context_enter(ce); - intel_context_timeline_unlock(ce); + rq = eb_throttle(ce); + + intel_context_timeline_unlock(tl); + + if (rq) { + if (i915_request_wait(rq, + I915_WAIT_INTERRUPTIBLE, + MAX_SCHEDULE_TIMEOUT) < 0) { + i915_request_put(rq); + err = -EINTR; + goto err_exit; + } + + i915_request_put(rq); + } eb->engine = ce->engine; eb->context = ce; return 0; +err_exit: + mutex_lock(&tl->mutex); + intel_context_exit(ce); + intel_context_timeline_unlock(tl); err_unpin: - intel_context_unpin(ce); + __eb_unpin_context(eb, ce); return err; } -static void eb_unpin_context(struct i915_execbuffer *eb) +static void eb_unpin_engine(struct i915_execbuffer *eb) { - __intel_context_timeline_lock(eb->context); - intel_context_exit(eb->context); - intel_context_timeline_unlock(eb->context); + struct intel_context *ce = eb->context; + struct intel_timeline *tl = ce->ring->timeline; + + mutex_lock(&tl->mutex); + intel_context_exit(ce); + intel_context_timeline_unlock(tl); - intel_context_unpin(eb->context); + __eb_unpin_context(eb, ce); } static unsigned int @@ -2217,9 +2248,9 @@ eb_select_legacy_ring(struct i915_execbuffer *eb, } static int -eb_select_engine(struct i915_execbuffer *eb, - struct drm_file *file, - struct drm_i915_gem_execbuffer2 *args) +eb_pin_engine(struct i915_execbuffer *eb, + struct drm_file *file, + struct drm_i915_gem_execbuffer2 *args) { struct intel_context *ce; unsigned int idx; @@ -2234,7 +2265,7 @@ eb_select_engine(struct i915_execbuffer *eb, if (IS_ERR(ce)) return PTR_ERR(ce); - err = eb_pin_context(eb, ce); + err = __eb_pin_engine(eb, ce); intel_context_put(ce); return err; @@ -2452,16 +2483,12 @@ i915_gem_do_execbuffer(struct drm_device *dev, if (unlikely(err)) goto err_destroy; - err = i915_mutex_lock_interruptible(dev); - if (err) - goto err_context; - - err = eb_select_engine(&eb, file, args); + err = eb_pin_engine(&eb, file, args); if (unlikely(err)) - goto err_unlock; + goto err_context; - err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */ - if (unlikely(err)) + err = i915_mutex_lock_interruptible(dev); + if (err) goto err_engine; err = eb_relocate(&eb); @@ -2615,10 +2642,9 @@ i915_gem_do_execbuffer(struct drm_device *dev, err_vma: if (eb.exec) eb_release_vmas(&eb); -err_engine: - eb_unpin_context(&eb); -err_unlock: mutex_unlock(&dev->struct_mutex); +err_engine: + eb_unpin_engine(&eb); err_context: i915_gem_context_put(eb.gem_context); err_destroy: diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h index 065ba4ac4e87..38b60cbf2592 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.h +++ b/drivers/gpu/drm/i915/gt/intel_context.h @@ -12,6 +12,7 @@ #include "i915_active.h" #include "intel_context_types.h" #include "intel_engine_types.h" +#include "intel_timeline_types.h" void intel_context_init(struct intel_context *ce, struct i915_gem_context *ctx, @@ -126,24 +127,24 @@ static inline void intel_context_put(struct intel_context *ce) kref_put(&ce->ref, ce->ops->destroy); } -static inline void -__intel_context_timeline_lock(struct intel_context *ce) - __acquires(&ce->ring->timeline->mutex) -{ - mutex_lock(&ce->ring->timeline->mutex); -} - -static inline int __must_check +static inline struct intel_timeline *__must_check intel_context_timeline_lock(struct intel_context *ce) __acquires(&ce->ring->timeline->mutex) { - return mutex_lock_interruptible(&ce->ring->timeline->mutex); + struct intel_timeline *tl = ce->ring->timeline; + int err; + + err = mutex_lock_interruptible(&tl->mutex); + if (err) + return ERR_PTR(err); + + return tl; } -static inline void intel_context_timeline_unlock(struct intel_context *ce) - __releases(&ce->ring->timeline->mutex) +static inline void intel_context_timeline_unlock(struct intel_timeline *tl) + __releases(&tl->mutex) { - mutex_unlock(&ce->ring->timeline->mutex); + mutex_unlock(&tl->mutex); } struct i915_request *intel_context_create_request(struct intel_context *ce); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index d1508f0b4c84..b27fc555fe09 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -745,7 +745,6 @@ static int measure_breadcrumb_dw(struct intel_engine_cs *engine) engine->status_page.vma)) goto out_frame; - INIT_LIST_HEAD(&frame->ring.request_list); frame->ring.timeline = &frame->timeline; frame->ring.vaddr = frame->cs; frame->ring.size = sizeof(frame->cs); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 7e056114344e..0dde7e04b102 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -69,8 +69,6 @@ struct intel_ring { void *vaddr; struct intel_timeline *timeline; - struct list_head request_list; - struct list_head active_link; /* * As we have two types of rings, one global to the engine used diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 8cca6b22b386..46d24d9d62ac 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -14,7 +14,6 @@ void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915) gt->i915 = i915; gt->uncore = &i915->uncore; - INIT_LIST_HEAD(>->active_rings); INIT_LIST_HEAD(>->closed_vma); spin_lock_init(>->closed_lock); diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index cfd41e6c54e1..f43ea830b1e8 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -34,8 +34,6 @@ struct intel_gt { struct list_head hwsp_free_list; } timelines; - struct list_head active_rings; - struct intel_wakeref wakeref; struct list_head closed_vma; diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index ff8eb3fa3fc6..2f4c9c786d44 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1511,6 +1511,7 @@ static void execlists_context_unpin(struct intel_context *ce) { i915_gem_context_unpin_hw_id(ce->gem_context); i915_gem_object_unpin_map(ce->state->obj); + intel_ring_reset(ce->ring, ce->ring->tail); } static void diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c index 81f9b0422e6a..b771170eb56a 100644 --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c @@ -1227,7 +1227,7 @@ void intel_ring_unpin(struct intel_ring *ring) GEM_TRACE("ring:%llx unpin\n", ring->timeline->fence_context); /* Discard any unused bytes beyond that submitted to hw. */ - intel_ring_reset(ring, ring->tail); + intel_ring_reset(ring, ring->emit); GEM_BUG_ON(!ring->vma); i915_vma_unset_ggtt_write(ring->vma); @@ -1293,7 +1293,6 @@ intel_engine_create_ring(struct intel_engine_cs *engine, return ERR_PTR(-ENOMEM); kref_init(&ring->ref); - INIT_LIST_HEAD(&ring->request_list); ring->timeline = intel_timeline_get(timeline); ring->size = size; @@ -1817,21 +1816,25 @@ static int ring_request_alloc(struct i915_request *request) static noinline int wait_for_space(struct intel_ring *ring, unsigned int bytes) { + struct intel_timeline *tl = ring->timeline; struct i915_request *target; long timeout; if (intel_ring_update_space(ring) >= bytes) return 0; - GEM_BUG_ON(list_empty(&ring->request_list)); - list_for_each_entry(target, &ring->request_list, ring_link) { + GEM_BUG_ON(list_empty(&tl->requests)); + list_for_each_entry(target, &tl->requests, link) { + if (target->ring != ring) + continue; + /* Would completion of this request free enough space? */ if (bytes <= __intel_ring_space(target->postfix, ring->emit, ring->size)) break; } - if (WARN_ON(&target->ring_link == &ring->request_list)) + if (GEM_WARN_ON(&target->link == &tl->requests)) return -ENOSPC; timeout = i915_request_wait(target, diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index a48b36d31e65..5bcb461b8372 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -68,7 +68,6 @@ static struct intel_ring *mock_ring(struct intel_engine_cs *engine) ring->base.timeline = &ring->timeline; atomic_set(&ring->base.pin_count, 1); - INIT_LIST_HEAD(&ring->base.request_list); intel_ring_update_space(&ring->base); return &ring->base; diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 5b1c638d3754..3f33a72085f6 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -180,40 +180,6 @@ i915_request_remove_from_client(struct i915_request *request) spin_unlock(&file_priv->mm.lock); } -static void advance_ring(struct i915_request *request) -{ - struct intel_ring *ring = request->ring; - unsigned int tail; - - /* - * We know the GPU must have read the request to have - * sent us the seqno + interrupt, so use the position - * of tail of the request to update the last known position - * of the GPU head. - * - * Note this requires that we are always called in request - * completion order. - */ - GEM_BUG_ON(!list_is_first(&request->ring_link, &ring->request_list)); - if (list_is_last(&request->ring_link, &ring->request_list)) { - /* - * We may race here with execlists resubmitting this request - * as we retire it. The resubmission will move the ring->tail - * forwards (to request->wa_tail). We either read the - * current value that was written to hw, or the value that - * is just about to be. Either works, if we miss the last two - * noops - they are safe to be replayed on a reset. - */ - tail = READ_ONCE(request->tail); - list_del(&ring->active_link); - } else { - tail = request->postfix; - } - list_del_init(&request->ring_link); - - ring->head = tail; -} - static void free_capture_list(struct i915_request *request) { struct i915_capture_list *capture; @@ -231,7 +197,7 @@ static bool i915_request_retire(struct i915_request *rq) { struct i915_active_request *active, *next; - lockdep_assert_held(&rq->i915->drm.struct_mutex); + lockdep_assert_held(&rq->timeline->mutex); if (!i915_request_completed(rq)) return false; @@ -243,7 +209,17 @@ static bool i915_request_retire(struct i915_request *rq) GEM_BUG_ON(!i915_sw_fence_signaled(&rq->submit)); trace_i915_request_retire(rq); - advance_ring(rq); + /* + * We know the GPU must have read the request to have + * sent us the seqno + interrupt, so use the position + * of tail of the request to update the last known position + * of the GPU head. + * + * Note this requires that we are always called in request + * completion order. + */ + GEM_BUG_ON(!list_is_first(&rq->link, &rq->timeline->requests)); + rq->ring->head = rq->postfix; /* * Walk through the active list, calling retire on each. This allows @@ -320,7 +296,7 @@ static bool i915_request_retire(struct i915_request *rq) void i915_request_retire_upto(struct i915_request *rq) { - struct intel_ring *ring = rq->ring; + struct intel_timeline * const tl = rq->timeline; struct i915_request *tmp; GEM_TRACE("%s fence %llx:%lld, current %d\n", @@ -328,15 +304,11 @@ void i915_request_retire_upto(struct i915_request *rq) rq->fence.context, rq->fence.seqno, hwsp_seqno(rq)); - lockdep_assert_held(&rq->i915->drm.struct_mutex); + lockdep_assert_held(&tl->mutex); assert_request_complete(rq); - if (list_empty(&rq->ring_link)) - return; - do { - tmp = list_first_entry(&ring->request_list, - typeof(*tmp), ring_link); + tmp = list_first_entry(&tl->requests, typeof(*tmp), link); } while (i915_request_retire(tmp) && tmp != rq); } @@ -563,29 +535,28 @@ semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) return NOTIFY_DONE; } -static void ring_retire_requests(struct intel_ring *ring) +static void retire_requests(struct intel_timeline *tl) { struct i915_request *rq, *rn; - list_for_each_entry_safe(rq, rn, &ring->request_list, ring_link) + list_for_each_entry_safe(rq, rn, &tl->requests, link) if (!i915_request_retire(rq)) break; } static noinline struct i915_request * -request_alloc_slow(struct intel_context *ce, gfp_t gfp) +request_alloc_slow(struct intel_timeline *tl, gfp_t gfp) { - struct intel_ring *ring = ce->ring; struct i915_request *rq; - if (list_empty(&ring->request_list)) + if (list_empty(&tl->requests)) goto out; if (!gfpflags_allow_blocking(gfp)) goto out; /* Move our oldest request to the slab-cache (if not in use!) */ - rq = list_first_entry(&ring->request_list, typeof(*rq), ring_link); + rq = list_first_entry(&tl->requests, typeof(*rq), link); i915_request_retire(rq); rq = kmem_cache_alloc(global.slab_requests, @@ -594,11 +565,11 @@ request_alloc_slow(struct intel_context *ce, gfp_t gfp) return rq; /* Ratelimit ourselves to prevent oom from malicious clients */ - rq = list_last_entry(&ring->request_list, typeof(*rq), ring_link); + rq = list_last_entry(&tl->requests, typeof(*rq), link); cond_synchronize_rcu(rq->rcustate); /* Retire our old requests in the hope that we free some */ - ring_retire_requests(ring); + retire_requests(tl); out: return kmem_cache_alloc(global.slab_requests, gfp); @@ -649,7 +620,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp) rq = kmem_cache_alloc(global.slab_requests, gfp | __GFP_RETRY_MAYFAIL | __GFP_NOWARN); if (unlikely(!rq)) { - rq = request_alloc_slow(ce, gfp); + rq = request_alloc_slow(tl, gfp); if (!rq) { ret = -ENOMEM; goto err_unreserve; @@ -741,15 +712,15 @@ struct i915_request * i915_request_create(struct intel_context *ce) { struct i915_request *rq; - int err; + struct intel_timeline *tl; - err = intel_context_timeline_lock(ce); - if (err) - return ERR_PTR(err); + tl = intel_context_timeline_lock(ce); + if (IS_ERR(tl)) + return ERR_CAST(tl); /* Move our oldest request to the slab-cache (if not in use!) */ - rq = list_first_entry(&ce->ring->request_list, typeof(*rq), ring_link); - if (!list_is_last(&rq->ring_link, &ce->ring->request_list)) + rq = list_first_entry(&tl->requests, typeof(*rq), link); + if (!list_is_last(&rq->link, &tl->requests)) i915_request_retire(rq); intel_context_enter(ce); @@ -759,22 +730,22 @@ i915_request_create(struct intel_context *ce) goto err_unlock; /* Check that we do not interrupt ourselves with a new request */ - rq->cookie = lockdep_pin_lock(&ce->ring->timeline->mutex); + rq->cookie = lockdep_pin_lock(&tl->mutex); return rq; err_unlock: - intel_context_timeline_unlock(ce); + intel_context_timeline_unlock(tl); return rq; } static int i915_request_await_start(struct i915_request *rq, struct i915_request *signal) { - if (list_is_first(&signal->ring_link, &signal->ring->request_list)) + if (list_is_first(&signal->link, &signal->ring->timeline->requests)) return 0; - signal = list_prev_entry(signal, ring_link); + signal = list_prev_entry(signal, link); if (intel_timeline_sync_is_later(rq->timeline, &signal->fence)) return 0; @@ -1167,6 +1138,7 @@ struct i915_request *__i915_request_commit(struct i915_request *rq) */ GEM_BUG_ON(rq->reserved_space > ring->space); rq->reserved_space = 0; + rq->emitted_jiffies = jiffies; /* * Record the position of the start of the breadcrumb so that @@ -1180,11 +1152,6 @@ struct i915_request *__i915_request_commit(struct i915_request *rq) prev = __i915_request_add_to_timeline(rq); - list_add_tail(&rq->ring_link, &ring->request_list); - if (list_is_first(&rq->ring_link, &ring->request_list)) - list_add(&ring->active_link, &rq->i915->gt.active_rings); - rq->emitted_jiffies = jiffies; - /* * Let the backend know a new request has arrived that may need * to adjust the existing execution schedule due to a high priority @@ -1237,10 +1204,11 @@ struct i915_request *__i915_request_commit(struct i915_request *rq) void i915_request_add(struct i915_request *rq) { + struct intel_timeline * const tl = rq->timeline; struct i915_request *prev; - lockdep_assert_held(&rq->timeline->mutex); - lockdep_unpin_lock(&rq->timeline->mutex, rq->cookie); + lockdep_assert_held(&tl->mutex); + lockdep_unpin_lock(&tl->mutex, rq->cookie); trace_i915_request_add(rq); @@ -1263,10 +1231,10 @@ void i915_request_add(struct i915_request *rq) * work on behalf of others -- but instead we should benefit from * improved resource management. (Well, that's the theory at least.) */ - if (prev && i915_request_completed(prev)) + if (prev && i915_request_completed(prev) && prev->timeline == tl) i915_request_retire_upto(prev); - mutex_unlock(&rq->timeline->mutex); + mutex_unlock(&tl->mutex); } static unsigned long local_clock_us(unsigned int *cpu) @@ -1487,18 +1455,43 @@ long i915_request_wait(struct i915_request *rq, bool i915_retire_requests(struct drm_i915_private *i915) { - struct intel_ring *ring, *tmp; + struct intel_gt_timelines *timelines = &i915->gt.timelines; + struct intel_timeline *tl, *tn; + LIST_HEAD(free); + + spin_lock(&timelines->lock); + list_for_each_entry_safe(tl, tn, &timelines->active_list, link) { + if (!mutex_trylock(&tl->mutex)) + continue; + + intel_timeline_get(tl); + GEM_BUG_ON(!tl->active_count); + tl->active_count++; /* pin the list element */ + spin_unlock(&timelines->lock); - lockdep_assert_held(&i915->drm.struct_mutex); + retire_requests(tl); - list_for_each_entry_safe(ring, tmp, - &i915->gt.active_rings, active_link) { - intel_ring_get(ring); /* last rq holds reference! */ - ring_retire_requests(ring); - intel_ring_put(ring); + spin_lock(&timelines->lock); + + /* Restart iteration after dropping lock */ + list_safe_reset_next(tl, tn, link); + if (!--tl->active_count) + list_del(&tl->link); + + mutex_unlock(&tl->mutex); + + /* Defer the final release to after the spinlock */ + if (refcount_dec_and_test(&tl->kref.refcount)) { + GEM_BUG_ON(tl->active_count); + list_add(&tl->link, &free); + } } + spin_unlock(&timelines->lock); + + list_for_each_entry_safe(tl, tn, &free, link) + __intel_timeline_free(&tl->kref); - return !list_empty(&i915->gt.active_rings); + return !list_empty(&timelines->active_list); } #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index a41791637076..75453474dddb 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -221,9 +221,6 @@ struct i915_request { /** timeline->request entry for this request */ struct list_head link; - /** ring->request_list entry for this request */ - struct list_head ring_link; - struct drm_i915_file_private *file_priv; /** file_priv list entry for this request */ struct list_head client_link; From patchwork Thu Jun 27 06:30:44 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11018889 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C154F1708 for ; Thu, 27 Jun 2019 06:31:17 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AF0FC28A1B for ; Thu, 27 Jun 2019 06:31:17 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A33D328A26; Thu, 27 Jun 2019 06:31:17 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 2CD4E28A1F for ; Thu, 27 Jun 2019 06:31:16 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CB6056E5AE; Thu, 27 Jun 2019 06:31:13 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id E81BB6E5AE for ; Thu, 27 Jun 2019 06:31:11 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 17038750-1500050 for multiple; Thu, 27 Jun 2019 07:30:49 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 27 Jun 2019 07:30:44 +0100 Message-Id: <20190627063044.11858-7-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190627063044.11858-1-chris@chris-wilson.co.uk> References: <20190627063044.11858-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 7/7] drm/i915: Replace struct_mutex for batch pool serialisation X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.auld@intel.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Switch to tracking activity via i915_active on individual nodes, only keeping a list of retired objects in the cache, and reaping the cache when the engine itself idles. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/Makefile | 2 +- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 58 +++--- drivers/gpu/drm/i915/gem/i915_gem_object.c | 1 - .../gpu/drm/i915/gem/i915_gem_object_types.h | 1 - drivers/gpu/drm/i915/gem/i915_gem_pm.c | 4 +- drivers/gpu/drm/i915/gt/intel_engine.h | 1 - drivers/gpu/drm/i915/gt/intel_engine_cs.c | 11 +- drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 + drivers/gpu/drm/i915/gt/intel_engine_pool.c | 166 ++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_engine_pool.h | 34 ++++ .../gpu/drm/i915/gt/intel_engine_pool_types.h | 29 +++ drivers/gpu/drm/i915/gt/intel_engine_types.h | 6 +- drivers/gpu/drm/i915/gt/mock_engine.c | 3 + drivers/gpu/drm/i915/i915_debugfs.c | 68 ------- drivers/gpu/drm/i915/i915_gem_batch_pool.c | 132 -------------- drivers/gpu/drm/i915/i915_gem_batch_pool.h | 26 --- 16 files changed, 279 insertions(+), 265 deletions(-) create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_pool.c create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_pool.h create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_pool_types.h delete mode 100644 drivers/gpu/drm/i915/i915_gem_batch_pool.c delete mode 100644 drivers/gpu/drm/i915/i915_gem_batch_pool.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 0fbc3506b5fc..b972afa517f8 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -72,6 +72,7 @@ obj-y += gt/ gt-y += \ gt/intel_breadcrumbs.o \ gt/intel_context.o \ + gt/intel_engine_pool.o \ gt/intel_engine_cs.o \ gt/intel_engine_pm.o \ gt/intel_gt.o \ @@ -118,7 +119,6 @@ i915-y += \ $(gem-y) \ i915_active.o \ i915_cmd_parser.o \ - i915_gem_batch_pool.o \ i915_gem_evict.o \ i915_gem_fence_reg.o \ i915_gem_gtt.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 80c9c57a302f..0ea2d49bc8b9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -16,6 +16,7 @@ #include "gem/i915_gem_ioctls.h" #include "gt/intel_context.h" +#include "gt/intel_engine_pool.h" #include "gt/intel_gt.h" #include "gt/intel_gt_pm.h" @@ -1145,25 +1146,26 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, unsigned int len) { struct reloc_cache *cache = &eb->reloc_cache; - struct drm_i915_gem_object *obj; + struct intel_engine_pool_node *pool; struct i915_request *rq; struct i915_vma *batch; u32 *cmd; int err; - obj = i915_gem_batch_pool_get(&eb->engine->batch_pool, PAGE_SIZE); - if (IS_ERR(obj)) - return PTR_ERR(obj); + pool = intel_engine_pool_get(&eb->engine->pool, PAGE_SIZE); + if (IS_ERR(pool)) + return PTR_ERR(pool); - cmd = i915_gem_object_pin_map(obj, + cmd = i915_gem_object_pin_map(pool->obj, cache->has_llc ? I915_MAP_FORCE_WB : I915_MAP_FORCE_WC); - i915_gem_object_unpin_pages(obj); - if (IS_ERR(cmd)) - return PTR_ERR(cmd); + if (IS_ERR(cmd)) { + err = PTR_ERR(cmd); + goto out_pool; + } - batch = i915_vma_instance(obj, vma->vm, NULL); + batch = i915_vma_instance(pool->obj, vma->vm, NULL); if (IS_ERR(batch)) { err = PTR_ERR(batch); goto err_unmap; @@ -1179,6 +1181,10 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, goto err_unpin; } + err = intel_engine_pool_mark_active(pool, rq); + if (err) + goto err_request; + err = reloc_move_to_gpu(rq, vma); if (err) goto err_request; @@ -1204,7 +1210,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, cache->rq_size = 0; /* Return with batch mapping (cmd) still pinned */ - return 0; + goto out_pool; skip_request: i915_request_skip(rq, err); @@ -1213,7 +1219,9 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, err_unpin: i915_vma_unpin(batch); err_unmap: - i915_gem_object_unpin_map(obj); + i915_gem_object_unpin_map(pool->obj); +out_pool: + intel_engine_pool_put(pool); return err; } @@ -1957,18 +1965,17 @@ static int i915_reset_gen7_sol_offsets(struct i915_request *rq) static struct i915_vma *eb_parse(struct i915_execbuffer *eb, bool is_master) { - struct drm_i915_gem_object *shadow_batch_obj; + struct intel_engine_pool_node *pool; struct i915_vma *vma; int err; - shadow_batch_obj = i915_gem_batch_pool_get(&eb->engine->batch_pool, - PAGE_ALIGN(eb->batch_len)); - if (IS_ERR(shadow_batch_obj)) - return ERR_CAST(shadow_batch_obj); + pool = intel_engine_pool_get(&eb->engine->pool, eb->batch_len); + if (IS_ERR(pool)) + return ERR_CAST(pool); err = intel_engine_cmd_parser(eb->engine, eb->batch->obj, - shadow_batch_obj, + pool->obj, eb->batch_start_offset, eb->batch_len, is_master); @@ -1977,12 +1984,12 @@ static struct i915_vma *eb_parse(struct i915_execbuffer *eb, bool is_master) vma = NULL; else vma = ERR_PTR(err); - goto out; + goto err; } - vma = i915_gem_object_ggtt_pin(shadow_batch_obj, NULL, 0, 0, 0); + vma = i915_gem_object_ggtt_pin(pool->obj, NULL, 0, 0, 0); if (IS_ERR(vma)) - goto out; + goto err; eb->vma[eb->buffer_count] = i915_vma_get(vma); eb->flags[eb->buffer_count] = @@ -1990,8 +1997,11 @@ static struct i915_vma *eb_parse(struct i915_execbuffer *eb, bool is_master) vma->exec_flags = &eb->flags[eb->buffer_count]; eb->buffer_count++; -out: - i915_gem_object_unpin_pages(shadow_batch_obj); + vma->private = pool; + return vma; + +err: + intel_engine_pool_put(pool); return vma; } @@ -2615,6 +2625,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, * to explicitly hold another reference here. */ eb.request->batch = eb.batch; + if (eb.batch->private) + intel_engine_pool_mark_active(eb.batch->private, eb.request); trace_i915_request_queue(eb.request, eb.batch_flags); err = eb_submit(&eb); @@ -2639,6 +2651,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, err_batch_unpin: if (eb.batch_flags & I915_DISPATCH_SECURE) i915_vma_unpin(eb.batch); + if (eb.batch->private) + intel_engine_pool_put(eb.batch->private); err_vma: if (eb.exec) eb_release_vmas(&eb); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 43194fbcbc2e..3260377ac021 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -64,7 +64,6 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj, INIT_LIST_HEAD(&obj->vma.list); INIT_LIST_HEAD(&obj->lut_list); - INIT_LIST_HEAD(&obj->batch_pool_link); init_rcu_head(&obj->rcu); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index 34b51fad02de..d474c6ac4100 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -114,7 +114,6 @@ struct drm_i915_gem_object { unsigned int userfault_count; struct list_head userfault_link; - struct list_head batch_pool_link; I915_SELFTEST_DECLARE(struct list_head st_link); /* diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c index 93d188526457..bf085b0cb7c6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c @@ -33,10 +33,8 @@ static void i915_gem_park(struct drm_i915_private *i915) lockdep_assert_held(&i915->drm.struct_mutex); - for_each_engine(engine, i915, id) { + for_each_engine(engine, i915, id) call_idle_barriers(engine); /* cleanup after wedging */ - i915_gem_batch_pool_fini(&engine->batch_pool); - } i915_vma_parked(i915); diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index 557b08b13feb..6375d6111b15 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -9,7 +9,6 @@ #include #include -#include "i915_gem_batch_pool.h" #include "i915_pmu.h" #include "i915_reg.h" #include "i915_request.h" diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index b27fc555fe09..49439cf2fd1f 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -32,6 +32,7 @@ #include "intel_engine.h" #include "intel_engine_pm.h" +#include "intel_engine_pool.h" #include "intel_context.h" #include "intel_lrc.h" #include "intel_reset.h" @@ -498,11 +499,6 @@ int intel_engines_init(struct drm_i915_private *i915) return err; } -static void intel_engine_init_batch_pool(struct intel_engine_cs *engine) -{ - i915_gem_batch_pool_init(&engine->batch_pool, engine); -} - void intel_engine_init_execlists(struct intel_engine_cs *engine) { struct intel_engine_execlists * const execlists = &engine->execlists; @@ -628,10 +624,11 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine) intel_engine_init_breadcrumbs(engine); intel_engine_init_execlists(engine); intel_engine_init_hangcheck(engine); - intel_engine_init_batch_pool(engine); intel_engine_init_cmd_parser(engine); intel_engine_init__pm(engine); + intel_engine_pool_init(&engine->pool); + /* Use the whole device by default */ engine->sseu = intel_sseu_from_device_info(&RUNTIME_INFO(engine->i915)->sseu); @@ -880,9 +877,9 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine) cleanup_status_page(engine); + intel_engine_pool_fini(&engine->pool); intel_engine_fini_breadcrumbs(engine); intel_engine_cleanup_cmd_parser(engine); - i915_gem_batch_pool_fini(&engine->batch_pool); if (engine->default_state) i915_gem_object_put(engine->default_state); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index 9751a02d86bc..fe9f9eaffe88 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -7,6 +7,7 @@ #include "i915_drv.h" #include "intel_engine.h" +#include "intel_engine_pool.h" #include "intel_engine_pm.h" #include "intel_gt_pm.h" @@ -116,6 +117,7 @@ static int __engine_park(struct intel_wakeref *wf) GEM_TRACE("%s\n", engine->name); intel_engine_disarm_breadcrumbs(engine); + intel_engine_pool_park(&engine->pool); /* Must be reset upon idling, or we may miss the busy wakeup. */ GEM_BUG_ON(engine->execlists.queue_priority_hint != INT_MIN); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool.c b/drivers/gpu/drm/i915/gt/intel_engine_pool.c new file mode 100644 index 000000000000..32688ca379ef --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_engine_pool.c @@ -0,0 +1,166 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2014-2018 Intel Corporation + */ + +#include "gem/i915_gem_object.h" + +#include "i915_drv.h" +#include "intel_engine_pm.h" +#include "intel_engine_pool.h" + +static struct intel_engine_cs *to_engine(struct intel_engine_pool *pool) +{ + return container_of(pool, struct intel_engine_cs, pool); +} + +static struct list_head * +bucket_for_size(struct intel_engine_pool *pool, size_t sz) +{ + int n; + + /* + * Compute a power-of-two bucket, but throw everything greater than + * 16KiB into the same bucket: i.e. the buckets hold objects of + * (1 page, 2 pages, 4 pages, 8+ pages). + */ + n = fls(sz >> PAGE_SHIFT) - 1; + if (n >= ARRAY_SIZE(pool->cache_list)) + n = ARRAY_SIZE(pool->cache_list) - 1; + + return &pool->cache_list[n]; +} + +static void node_free(struct intel_engine_pool_node *node) +{ + i915_gem_object_put(node->obj); + i915_active_fini(&node->active); + kfree(node); +} + +static int pool_active(struct i915_active *ref) +{ + struct intel_engine_pool_node *node = + container_of(ref, typeof(*node), active); + struct reservation_object *resv = node->obj->base.resv; + + if (reservation_object_trylock(resv)) { + reservation_object_add_excl_fence(resv, NULL); + reservation_object_unlock(resv); + } + + return i915_gem_object_pin_pages(node->obj); +} + +static void pool_retire(struct i915_active *ref) +{ + struct intel_engine_pool_node *node = + container_of(ref, typeof(*node), active); + struct intel_engine_pool *pool = node->pool; + struct list_head *list = bucket_for_size(pool, node->obj->base.size); + unsigned long flags; + + GEM_BUG_ON(!intel_engine_pm_is_awake(to_engine(pool))); + + i915_gem_object_unpin_pages(node->obj); + + spin_lock_irqsave(&pool->lock, flags); + list_add(&node->link, list); + spin_unlock_irqrestore(&pool->lock, flags); +} + +static struct intel_engine_pool_node * +node_create(struct intel_engine_pool *pool, size_t sz) +{ + struct intel_engine_cs *engine = to_engine(pool); + struct intel_engine_pool_node *node; + struct drm_i915_gem_object *obj; + + node = kmalloc(sizeof(*node), + GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN); + if (!node) + return ERR_PTR(-ENOMEM); + + node->pool = pool; + i915_active_init(engine->i915, &node->active, pool_active, pool_retire); + + obj = i915_gem_object_create_internal(engine->i915, sz); + if (IS_ERR(obj)) { + i915_active_fini(&node->active); + kfree(node); + return ERR_CAST(obj); + } + + node->obj = obj; + return node; +} + +struct intel_engine_pool_node * +intel_engine_pool_get(struct intel_engine_pool *pool, size_t size) +{ + struct intel_engine_pool_node *node; + struct list_head *list; + unsigned long flags; + int ret; + + GEM_BUG_ON(!intel_engine_pm_is_awake(to_engine(pool))); + + size = PAGE_ALIGN(size); + list = bucket_for_size(pool, size); + + spin_lock_irqsave(&pool->lock, flags); + list_for_each_entry(node, list, link) { + if (node->obj->base.size < size) + continue; + list_del(&node->link); + break; + } + spin_unlock_irqrestore(&pool->lock, flags); + + if (&node->link == list) { + node = node_create(pool, size); + if (IS_ERR(node)) + return node; + } + + ret = i915_active_acquire(&node->active); + if (ret) { + node_free(node); + return ERR_PTR(ret); + } + + return node; +} + +void intel_engine_pool_init(struct intel_engine_pool *pool) +{ + int n; + + spin_lock_init(&pool->lock); + for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++) + INIT_LIST_HEAD(&pool->cache_list[n]); +} + +void intel_engine_pool_park(struct intel_engine_pool *pool) +{ + int n; + + for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++) { + struct list_head *list = &pool->cache_list[n]; + struct intel_engine_pool_node *node, *nn; + + list_for_each_entry_safe(node, nn, list, link) + node_free(node); + + INIT_LIST_HEAD(list); + } +} + +void intel_engine_pool_fini(struct intel_engine_pool *pool) +{ + int n; + + for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++) + GEM_BUG_ON(!list_empty(&pool->cache_list[n])); +} diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool.h b/drivers/gpu/drm/i915/gt/intel_engine_pool.h new file mode 100644 index 000000000000..f7a0a660c1c9 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_engine_pool.h @@ -0,0 +1,34 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2014-2018 Intel Corporation + */ + +#ifndef INTEL_ENGINE_POOL_H +#define INTEL_ENGINE_POOL_H + +#include "intel_engine_pool_types.h" +#include "i915_active.h" +#include "i915_request.h" + +struct intel_engine_pool_node * +intel_engine_pool_get(struct intel_engine_pool *pool, size_t size); + +static inline int +intel_engine_pool_mark_active(struct intel_engine_pool_node *node, + struct i915_request *rq) +{ + return i915_active_ref(&node->active, rq->fence.context, rq); +} + +static inline void +intel_engine_pool_put(struct intel_engine_pool_node *node) +{ + i915_active_release(&node->active); +} + +void intel_engine_pool_init(struct intel_engine_pool *pool); +void intel_engine_pool_park(struct intel_engine_pool *pool); +void intel_engine_pool_fini(struct intel_engine_pool *pool); + +#endif /* INTEL_ENGINE_POOL_H */ diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool_types.h b/drivers/gpu/drm/i915/gt/intel_engine_pool_types.h new file mode 100644 index 000000000000..e31ee361b76f --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_engine_pool_types.h @@ -0,0 +1,29 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2014-2018 Intel Corporation + */ + +#ifndef INTEL_ENGINE_POOL_TYPES_H +#define INTEL_ENGINE_POOL_TYPES_H + +#include +#include + +#include "i915_active_types.h" + +struct drm_i915_gem_object; + +struct intel_engine_pool { + spinlock_t lock; + struct list_head cache_list[4]; +}; + +struct intel_engine_pool_node { + struct i915_active active; + struct drm_i915_gem_object *obj; + struct list_head link; + struct intel_engine_pool *pool; +}; + +#endif /* INTEL_ENGINE_POOL_TYPES_H */ diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 0dde7e04b102..6d2f3e11da1c 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -16,12 +16,12 @@ #include #include "i915_gem.h" -#include "i915_gem_batch_pool.h" #include "i915_pmu.h" #include "i915_priolist_types.h" #include "i915_selftest.h" -#include "gt/intel_timeline_types.h" +#include "intel_engine_pool_types.h" #include "intel_sseu.h" +#include "intel_timeline_types.h" #include "intel_wakeref.h" #include "intel_workarounds_types.h" @@ -353,7 +353,7 @@ struct intel_engine_cs { * when the command parser is enabled. Prevents the client from * modifying the batch contents after software parsing. */ - struct i915_gem_batch_pool batch_pool; + struct intel_engine_pool pool; struct intel_hw_status_page status_page; struct i915_ctx_workarounds wa_ctx; diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index 5bcb461b8372..b94d57bf2c48 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -27,6 +27,7 @@ #include "i915_drv.h" #include "intel_context.h" #include "intel_engine_pm.h" +#include "intel_engine_pool.h" #include "mock_engine.h" #include "selftests/mock_request.h" @@ -291,6 +292,8 @@ int mock_engine_init(struct intel_engine_cs *engine) intel_engine_init_execlists(engine); intel_engine_init__pm(engine); + intel_engine_pool_init(&engine->pool); + engine->kernel_context = i915_gem_context_get_engine(i915->kernel_context, engine->id); if (IS_ERR(engine->kernel_context)) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index eeecdad0e3ca..253e86868061 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -295,27 +295,6 @@ static int per_file_stats(int id, void *ptr, void *data) stats.closed); \ } while (0) -static void print_batch_pool_stats(struct seq_file *m, - struct drm_i915_private *dev_priv) -{ - struct drm_i915_gem_object *obj; - struct intel_engine_cs *engine; - struct file_stats stats = {}; - enum intel_engine_id id; - int j; - - for_each_engine(engine, dev_priv, id) { - for (j = 0; j < ARRAY_SIZE(engine->batch_pool.cache_list); j++) { - list_for_each_entry(obj, - &engine->batch_pool.cache_list[j], - batch_pool_link) - per_file_stats(0, obj, &stats); - } - } - - print_file_stats(m, "[k]batch pool", stats); -} - static void print_context_stats(struct seq_file *m, struct drm_i915_private *i915) { @@ -373,58 +352,12 @@ static int i915_gem_object_info(struct seq_file *m, void *data) if (ret) return ret; - print_batch_pool_stats(m, i915); print_context_stats(m, i915); mutex_unlock(&i915->drm.struct_mutex); return 0; } -static int i915_gem_batch_pool_info(struct seq_file *m, void *data) -{ - struct drm_i915_private *dev_priv = node_to_i915(m->private); - struct drm_device *dev = &dev_priv->drm; - struct drm_i915_gem_object *obj; - struct intel_engine_cs *engine; - enum intel_engine_id id; - int total = 0; - int ret, j; - - ret = mutex_lock_interruptible(&dev->struct_mutex); - if (ret) - return ret; - - for_each_engine(engine, dev_priv, id) { - for (j = 0; j < ARRAY_SIZE(engine->batch_pool.cache_list); j++) { - int count; - - count = 0; - list_for_each_entry(obj, - &engine->batch_pool.cache_list[j], - batch_pool_link) - count++; - seq_printf(m, "%s cache[%d]: %d objects\n", - engine->name, j, count); - - list_for_each_entry(obj, - &engine->batch_pool.cache_list[j], - batch_pool_link) { - seq_puts(m, " "); - describe_obj(m, obj); - seq_putc(m, '\n'); - } - - total += count; - } - } - - seq_printf(m, "total: %d\n", total); - - mutex_unlock(&dev->struct_mutex); - - return 0; -} - static void gen8_display_interrupt_info(struct seq_file *m) { struct drm_i915_private *dev_priv = node_to_i915(m->private); @@ -4364,7 +4297,6 @@ static const struct drm_info_list i915_debugfs_list[] = { {"i915_gem_objects", i915_gem_object_info, 0}, {"i915_gem_fence_regs", i915_gem_fence_regs_info, 0}, {"i915_gem_interrupt", i915_interrupt_info, 0}, - {"i915_gem_batch_pool", i915_gem_batch_pool_info, 0}, {"i915_guc_info", i915_guc_info, 0}, {"i915_guc_load_status", i915_guc_load_status_info, 0}, {"i915_guc_log_dump", i915_guc_log_dump, 0}, diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.c b/drivers/gpu/drm/i915/i915_gem_batch_pool.c deleted file mode 100644 index b17f23991253..000000000000 --- a/drivers/gpu/drm/i915/i915_gem_batch_pool.c +++ /dev/null @@ -1,132 +0,0 @@ -/* - * SPDX-License-Identifier: MIT - * - * Copyright © 2014-2018 Intel Corporation - */ - -#include "i915_gem_batch_pool.h" -#include "i915_drv.h" - -/** - * DOC: batch pool - * - * In order to submit batch buffers as 'secure', the software command parser - * must ensure that a batch buffer cannot be modified after parsing. It does - * this by copying the user provided batch buffer contents to a kernel owned - * buffer from which the hardware will actually execute, and by carefully - * managing the address space bindings for such buffers. - * - * The batch pool framework provides a mechanism for the driver to manage a - * set of scratch buffers to use for this purpose. The framework can be - * extended to support other uses cases should they arise. - */ - -/** - * i915_gem_batch_pool_init() - initialize a batch buffer pool - * @pool: the batch buffer pool - * @engine: the associated request submission engine - */ -void i915_gem_batch_pool_init(struct i915_gem_batch_pool *pool, - struct intel_engine_cs *engine) -{ - int n; - - pool->engine = engine; - - for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++) - INIT_LIST_HEAD(&pool->cache_list[n]); -} - -/** - * i915_gem_batch_pool_fini() - clean up a batch buffer pool - * @pool: the pool to clean up - * - * Note: Callers must hold the struct_mutex. - */ -void i915_gem_batch_pool_fini(struct i915_gem_batch_pool *pool) -{ - int n; - - lockdep_assert_held(&pool->engine->i915->drm.struct_mutex); - - for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++) { - struct drm_i915_gem_object *obj, *next; - - list_for_each_entry_safe(obj, next, - &pool->cache_list[n], - batch_pool_link) - i915_gem_object_put(obj); - - INIT_LIST_HEAD(&pool->cache_list[n]); - } -} - -/** - * i915_gem_batch_pool_get() - allocate a buffer from the pool - * @pool: the batch buffer pool - * @size: the minimum desired size of the returned buffer - * - * Returns an inactive buffer from @pool with at least @size bytes, - * with the pages pinned. The caller must i915_gem_object_unpin_pages() - * on the returned object. - * - * Note: Callers must hold the struct_mutex - * - * Return: the buffer object or an error pointer - */ -struct drm_i915_gem_object * -i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool, - size_t size) -{ - struct drm_i915_gem_object *obj; - struct list_head *list; - int n, ret; - - lockdep_assert_held(&pool->engine->i915->drm.struct_mutex); - - /* Compute a power-of-two bucket, but throw everything greater than - * 16KiB into the same bucket: i.e. the the buckets hold objects of - * (1 page, 2 pages, 4 pages, 8+ pages). - */ - n = fls(size >> PAGE_SHIFT) - 1; - if (n >= ARRAY_SIZE(pool->cache_list)) - n = ARRAY_SIZE(pool->cache_list) - 1; - list = &pool->cache_list[n]; - - list_for_each_entry(obj, list, batch_pool_link) { - struct reservation_object *resv = obj->base.resv; - - /* The batches are strictly LRU ordered */ - if (!reservation_object_test_signaled_rcu(resv, true)) - break; - - /* - * The object is now idle, clear the array of shared - * fences before we add a new request. Although, we - * remain on the same engine, we may be on a different - * timeline and so may continually grow the array, - * trapping a reference to all the old fences, rather - * than replace the existing fence. - */ - if (rcu_access_pointer(resv->fence)) { - reservation_object_lock(resv, NULL); - reservation_object_add_excl_fence(resv, NULL); - reservation_object_unlock(resv); - } - - if (obj->base.size >= size) - goto found; - } - - obj = i915_gem_object_create_internal(pool->engine->i915, size); - if (IS_ERR(obj)) - return obj; - -found: - ret = i915_gem_object_pin_pages(obj); - if (ret) - return ERR_PTR(ret); - - list_move_tail(&obj->batch_pool_link, list); - return obj; -} diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.h b/drivers/gpu/drm/i915/i915_gem_batch_pool.h deleted file mode 100644 index feeeeeaa54d8..000000000000 --- a/drivers/gpu/drm/i915/i915_gem_batch_pool.h +++ /dev/null @@ -1,26 +0,0 @@ -/* - * SPDX-License-Identifier: MIT - * - * Copyright © 2014-2018 Intel Corporation - */ - -#ifndef I915_GEM_BATCH_POOL_H -#define I915_GEM_BATCH_POOL_H - -#include - -struct drm_i915_gem_object; -struct intel_engine_cs; - -struct i915_gem_batch_pool { - struct intel_engine_cs *engine; - struct list_head cache_list[4]; -}; - -void i915_gem_batch_pool_init(struct i915_gem_batch_pool *pool, - struct intel_engine_cs *engine); -void i915_gem_batch_pool_fini(struct i915_gem_batch_pool *pool); -struct drm_i915_gem_object * -i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool, size_t size); - -#endif /* I915_GEM_BATCH_POOL_H */