From patchwork Fri Jul 8 14:20:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Karolina Drobnik X-Patchwork-Id: 12911216 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id ECF38C433EF for ; Fri, 8 Jul 2022 14:20:26 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 133DA10E1DA; Fri, 8 Jul 2022 14:20:26 +0000 (UTC) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4F00710E1DA for ; Fri, 8 Jul 2022 14:20:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657290025; x=1688826025; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=tUS8vVbFiJ5TqeECWfq0SeFOMgXCwmCx7SNhOgphXpQ=; b=PiwROihOqAO50U72SOMOjTYGKI/zPEptnoYwidSumldShKDk1LNqei1v P5yK6hUxpodCH9tym6O+m/IKH2T9ivLJN+83jjf24xaqlUQfASPicnjA9 QXLvJeaY6aEe0t2sKZuJ4ZijJw9FQ5CuEk5r8aemvK/oYdW8vUNeRM0q2 JCXlv9iaExklb22W9RrSeSb4qIbLliIKQzUiKmIAE9DYZjZq2cDXqTdca YRGcqqb69Bm7iJeeVDm3LAJI1L5+wPXB3J+WBKHx4dW6MfbBwA1c6BL6+ KFshGuH5ZLV02sE2IMcve3rnoRh7jtD4nvBne1mlvnoR5mOC0vgwA22Mp A==; X-IronPort-AV: E=McAfee;i="6400,9594,10401"; a="267321198" X-IronPort-AV: E=Sophos;i="5.92,255,1650956400"; d="scan'208";a="267321198" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jul 2022 07:20:25 -0700 X-IronPort-AV: E=Sophos;i="5.92,255,1650956400"; d="scan'208";a="621232097" Received: from dgrynia-mobl.ger.corp.intel.com (HELO kdrobnik-desk.toya.net.pl) ([10.213.5.81]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jul 2022 07:20:22 -0700 From: Karolina Drobnik To: intel-gfx@lists.freedesktop.org Date: Fri, 8 Jul 2022 16:20:11 +0200 Message-Id: <07e05518d9f6620d20cc1101ec1849203fe973f9.1657289332.git.karolina.drobnik@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v2 1/3] drm/i915/gem: Look for waitboosting across the whole object prior to individual waits X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Thomas Voegtle , Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Chris Wilson We employ a "waitboost" heuristic to detect when userspace is stalled waiting for results from earlier execution. Under latency sensitive work mixed between the gpu/cpu, the GPU is typically under-utilised and so RPS sees that low utilisation as a reason to downclock the frequency, causing longer stalls and lower throughput. The user left waiting for the results is not impressed. On applying commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround") it was observed that deinterlacing h264 on Haswell performance dropped by 2-5x. The reason being that the natural workload was not intense enough to trigger RPS (using HW evaluation intervals) to upclock, and so it was depending on waitboosting for the throughput. Commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround") changes the composition of dma-resv from keeping a single write fence + multiple read fences, to a single array of multiple write and read fences (a maximum of one pair of write/read fences per context). The iteration order was also changed implicitly from all-read fences then the single write fence, to a mix of write fences followed by read fences. It is that ordering change that belied the fragility of waitboosting. Currently, a waitboost is inspected at the point of waiting on an outstanding fence. If the GPU is backlogged such that we haven't yet stated the request we need to wait on, we force the GPU to upclock until the completion of that request. By changing the order in which we waited upon requests, we ended up waiting on those requests in sequence and as such we saw that each request was already started and so not a suitable candidate for waitboosting. Instead of asking whether to boost each fence in turn, we can look at whether boosting is required for the dma-resv ensemble prior to waiting on any fence, making the heuristic more robust to the order in which fences are stored in the dma-resv. Reported-by: Thomas Voegtle Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6284 Fixes: 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround") Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Signed-off-by: Karolina Drobnik Tested-by: Thomas Voegtle Reviewed-by: Andi Shyti Acked-by: Rodrigo Vivi --- drivers/gpu/drm/i915/gem/i915_gem_wait.c | 34 ++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c index 319936f91ac5..e6e01c2a74a6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c @@ -9,6 +9,7 @@ #include #include "gt/intel_engine.h" +#include "gt/intel_rps.h" #include "i915_gem_ioctls.h" #include "i915_gem_object.h" @@ -31,6 +32,37 @@ i915_gem_object_wait_fence(struct dma_fence *fence, timeout); } +static void +i915_gem_object_boost(struct dma_resv *resv, unsigned int flags) +{ + struct dma_resv_iter cursor; + struct dma_fence *fence; + + /* + * Prescan all fences for potential boosting before we begin waiting. + * + * When we wait, we wait on outstanding fences serially. If the + * dma-resv contains a sequence such as 1:1, 1:2 instead of a reduced + * form 1:2, then as we look at each wait in turn we see that each + * request is currently executing and not worthy of boosting. But if + * we only happen to look at the final fence in the sequence (because + * of request coalescing or splitting between read/write arrays by + * the iterator), then we would boost. As such our decision to boost + * or not is delicately balanced on the order we wait on fences. + * + * So instead of looking for boosts sequentially, look for all boosts + * upfront and then wait on the outstanding fences. + */ + + dma_resv_iter_begin(&cursor, resv, + dma_resv_usage_rw(flags & I915_WAIT_ALL)); + dma_resv_for_each_fence_unlocked(&cursor, fence) + if (dma_fence_is_i915(fence) && + !i915_request_started(to_request(fence))) + intel_rps_boost(to_request(fence)); + dma_resv_iter_end(&cursor); +} + static long i915_gem_object_wait_reservation(struct dma_resv *resv, unsigned int flags, @@ -40,6 +72,8 @@ i915_gem_object_wait_reservation(struct dma_resv *resv, struct dma_fence *fence; long ret = timeout ?: 1; + i915_gem_object_boost(resv, flags); + dma_resv_iter_begin(&cursor, resv, dma_resv_usage_rw(flags & I915_WAIT_ALL)); dma_resv_for_each_fence_unlocked(&cursor, fence) { From patchwork Fri Jul 8 14:20:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Karolina Drobnik X-Patchwork-Id: 12911217 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 08CCBC433EF for ; Fri, 8 Jul 2022 14:20:32 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6816F10E5EA; Fri, 8 Jul 2022 14:20:31 +0000 (UTC) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by gabe.freedesktop.org (Postfix) with ESMTPS id E4D6110E5DA for ; Fri, 8 Jul 2022 14:20:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657290029; x=1688826029; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8lmJekwD4F2E+Y3Bh+B1Upv8Zq0gIGsA36K4HdviIns=; b=m48Kczs+GgOQlNzhAOajvGiRc7wWBXoA4j6kYeVCS170m7PZIm3fmE2n h8c5RIq4S9GPuWo271sX3LLpxOnivnMKLaMcM4IxbYbdCbh2D2hE8P2i7 2uFJtjFYQi4pL4EuA27sIYrfqF0u9crNJ5Z4NKGqhiG1QcWq/tfZH/66W uB3VvsS70+eeZL4ZgMPiQh0rEo3knV7RUO0McQ44ewnHpcW3VOJRuFHTA NKgb9Siv6fMX5Vd68m7Vq6vERdS7KaM2sUV1jeBFffiTwMOydOZZ0t9Mw jAlS2kvAvLTl+C+PZwxcNg+5t+EjtkgZphpJLGNukFe12//TbL2nu2jrL w==; X-IronPort-AV: E=McAfee;i="6400,9594,10401"; a="267321218" X-IronPort-AV: E=Sophos;i="5.92,255,1650956400"; d="scan'208";a="267321218" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jul 2022 07:20:29 -0700 X-IronPort-AV: E=Sophos;i="5.92,255,1650956400"; d="scan'208";a="621232134" Received: from dgrynia-mobl.ger.corp.intel.com (HELO kdrobnik-desk.toya.net.pl) ([10.213.5.81]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jul 2022 07:20:26 -0700 From: Karolina Drobnik To: intel-gfx@lists.freedesktop.org Date: Fri, 8 Jul 2022 16:20:12 +0200 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v2 2/3] drm/i915: Bump GT idling delay to 2 jiffies X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Rodrigo Vivi , Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Chris Wilson In monitoring a transcode pipeline that is latency sensitive (it waits between submitting frames, and each frame requires work on rcs/vcs/vecs engines), it is found that it took longer than a single jiffy for it to sustain its workload. Allowing an extra jiffy headroom for the userspace prevents us from prematurely parking and having to exit powersaving immediately. Link: https://gitlab.freedesktop.org/drm/intel/-/issues/6284 Signed-off-by: Chris Wilson Signed-off-by: Karolina Drobnik Reviewed-by: Rodrigo Vivi Reviewed-by: Andi Shyti --- drivers/gpu/drm/i915/i915_active.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index ee2b3a375362..7412abf166a8 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -974,7 +974,7 @@ void i915_active_acquire_barrier(struct i915_active *ref) GEM_BUG_ON(!intel_engine_pm_is_awake(engine)); llist_add(barrier_to_ll(node), &engine->barrier_tasks); - intel_engine_pm_put_delay(engine, 1); + intel_engine_pm_put_delay(engine, 2); } } From patchwork Fri Jul 8 14:20:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Karolina Drobnik X-Patchwork-Id: 12911218 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D97B1C43334 for ; Fri, 8 Jul 2022 14:20:34 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4166510E5FE; Fri, 8 Jul 2022 14:20:34 +0000 (UTC) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by gabe.freedesktop.org (Postfix) with ESMTPS id E217D10E5DA for ; Fri, 8 Jul 2022 14:20:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1657290032; x=1688826032; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=AD8zp/y/hDnsGxO0za+xZ/k9vt2OWI4EhU4E5B13MgY=; b=j09U26WFKG+HiLw+OERKZpPLCozffkxivoG9hzgaXyHZNFzxHuyFMldT JSQ4ClkALkxvPzD1a1ZcA02MXnM8pU+qwp04qoNhOKgGQc5uxfgGfAlr1 MqJNthbo7UiljKPlT2e1SN6mCEynqDlfiKO4TCOkmi93EgFfawHgbBrYo GFDuYBi1HjtmzD3Ka7rVx2gFSJP1JDQaylRccwmMJLqYGp9n7F/5tlkk7 Sj+70XqYPSncMlZdpcJKRxItD3t0/00oPvIs4Ey5ISrBuy8+ow5HQ1zlN oTkZta/3ER/o3J47JwW+luXuUTv+jHedD6xTueaRQRPxOmSbRnz2QFSz0 w==; X-IronPort-AV: E=McAfee;i="6400,9594,10401"; a="267321236" X-IronPort-AV: E=Sophos;i="5.92,255,1650956400"; d="scan'208";a="267321236" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jul 2022 07:20:32 -0700 X-IronPort-AV: E=Sophos;i="5.92,255,1650956400"; d="scan'208";a="621232178" Received: from dgrynia-mobl.ger.corp.intel.com (HELO kdrobnik-desk.toya.net.pl) ([10.213.5.81]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jul 2022 07:20:30 -0700 From: Karolina Drobnik To: intel-gfx@lists.freedesktop.org Date: Fri, 8 Jul 2022 16:20:13 +0200 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v2 3/3] drm/i915/gt: Only kick the signal worker if there's been an update X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Chris Wilson One impact of commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround") is that it stores many, many more fences. Whereas adding an exclusive fence used to remove the shared fence list, that list is now preserved and the write fences included into the list. Not just a single write fence, but now a write/read fence per context. That causes us to have to track more fences than before (albeit half of those are redundant), and we trigger more interrupts for multi-engine workloads. As part of reducing the impact from handling more signaling, we observe we only need to kick the signal worker after adding a fence iff we have good cause to believe that there is work to be done in processing the fence i.e. we either need to enable the interrupt or the request is already complete but we don't know if we saw the interrupt and so need to check signaling. References: 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround") Signed-off-by: Chris Wilson Signed-off-by: Karolina Drobnik Reviewed-by: Andi Shyti --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index 9dc9dccf7b09..ecc990ec1b95 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -399,7 +399,8 @@ static void insert_breadcrumb(struct i915_request *rq) * the request as it may have completed and raised the interrupt as * we were attaching it into the lists. */ - irq_work_queue(&b->irq_work); + if (!b->irq_armed || __i915_request_is_complete(rq)) + irq_work_queue(&b->irq_work); } bool i915_request_enable_breadcrumb(struct i915_request *rq)