From patchwork Sat Jul 28 16:46:19 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10548027 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F1F3F112E for ; Sat, 28 Jul 2018 16:46:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E1C072B106 for ; Sat, 28 Jul 2018 16:46:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D62422B11C; Sat, 28 Jul 2018 16:46:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5380A2B106 for ; Sat, 28 Jul 2018 16:46:50 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 519386E233; Sat, 28 Jul 2018 16:46:47 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0C4B86E21C for ; Sat, 28 Jul 2018 16:46:43 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 12491852-1500050 for multiple; Sat, 28 Jul 2018 17:46:21 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Sat, 28 Jul 2018 17:46:19 +0100 Message-Id: <20180728164623.10613-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.18.0 MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 1/5] drm/i915: Expose the busyspin durations for i915_wait_request X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Ben Widawsky , Eero Tamminen Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP An interesting discussion regarding "hybrid interrupt polling" for NVMe came to the conclusion that the ideal busyspin before sleeping was half of the expected request latency (and better if it was already halfway through that request). This suggested that we too should look again at our tradeoff between spinning and waiting. Currently, our spin simply tries to hide the cost of enabling the interrupt, which is good to avoid penalising nop requests (i.e. test throughput) and not much else. Studying real world workloads suggests that a spin of upto 500us can dramatically boost performance, but the suggestion is that this is not from avoiding interrupt latency per-se, but from secondary effects of sleeping such as allowing the CPU reduce cstate and context switch away. In a truly hybrid interrupt polling scheme, we would aim to sleep until just before the request completed and then wake up in advance of the interrupt and do a quick poll to handle completion. This is tricky for ourselves at the moment as we are not recording request times, and since we allow preemption, our requests are not on as a nicely ordered timeline as IO. However, the idea is interesting, for it will certainly help us decide when busyspinning is worthwhile. v2: Expose the spin setting via Kconfig options for easier adjustment and testing. v3: Don't get caught sneaking in a change to the busyspin parameters. v4: Explain more about the "hybrid interrupt polling" scheme that we want to migrate towards. Suggested-by: Sagar Kamble References: http://events.linuxfoundation.org/sites/events/files/slides/lemoal-nvme-polling-vault-2017-final_0.pdf Signed-off-by: Chris Wilson Cc: Sagar Kamble Cc: Eero Tamminen Cc: Tvrtko Ursulin Cc: Ben Widawsky Cc: Joonas Lahtinen Cc: Michał Winiarski Reviewed-by: Sagar Kamble Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/Kconfig | 6 +++++ drivers/gpu/drm/i915/Kconfig.profile | 26 +++++++++++++++++++ drivers/gpu/drm/i915/i915_request.c | 39 +++++++++++++++++++++++++--- 3 files changed, 67 insertions(+), 4 deletions(-) create mode 100644 drivers/gpu/drm/i915/Kconfig.profile diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 5c607f2c707b..387613f29cb0 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -132,3 +132,9 @@ depends on DRM_I915 depends on EXPERT source drivers/gpu/drm/i915/Kconfig.debug endmenu + +menu "drm/i915 Profile Guided Optimisation" + visible if EXPERT + depends on DRM_I915 + source drivers/gpu/drm/i915/Kconfig.profile +endmenu diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile new file mode 100644 index 000000000000..8a230eeb98df --- /dev/null +++ b/drivers/gpu/drm/i915/Kconfig.profile @@ -0,0 +1,26 @@ +config DRM_I915_SPIN_REQUEST_IRQ + int + default 5 # microseconds + help + Before sleeping waiting for a request (GPU operation) to complete, + we may spend some time polling for its completion. As the IRQ may + take a non-negligible time to setup, we do a short spin first to + check if the request will complete in the time it would have taken + us to enable the interrupt. + + May be 0 to disable the initial spin. In practice, we estimate + the cost of enabling the interrupt (if currently disabled) to be + a few microseconds. + +config DRM_I915_SPIN_REQUEST_CS + int + default 2 # microseconds + help + After sleeping for a request (GPU operation) to complete, we will + be woken up on the completion of every request prior to the one + being waited on. For very short requests, going back to sleep and + be woken up again may add considerably to the wakeup latency. To + avoid incurring extra latency from the scheduler, we may choose to + spin prior to sleeping again. + + May be 0 to disable spinning after being woken. diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 5c2c93cbab12..ca25c53f8daa 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -1336,8 +1336,32 @@ long i915_request_wait(struct i915_request *rq, GEM_BUG_ON(!intel_wait_has_seqno(&wait)); GEM_BUG_ON(!i915_sw_fence_signaled(&rq->submit)); - /* Optimistic short spin before touching IRQs */ - if (__i915_spin_request(rq, wait.seqno, state, 5)) + /* + * Optimistic spin before touching IRQs. + * + * We may use a rather large value here to offset the penalty of + * switching away from the active task. Frequently, the client will + * wait upon an old swapbuffer to throttle itself to remain within a + * frame of the gpu. If the client is running in lockstep with the gpu, + * then it should not be waiting long at all, and a sleep now will incur + * extra scheduler latency in producing the next frame. To try to + * avoid adding the cost of enabling/disabling the interrupt to the + * short wait, we first spin to see if the request would have completed + * in the time taken to setup the interrupt. + * + * We need upto 5us to enable the irq, and upto 20us to hide the + * scheduler latency of a context switch, ignoring the secondary + * impacts from a context switch such as cache eviction. + * + * The scheme used for low-latency IO is called "hybrid interrupt + * polling". The suggestion there is to sleep until just before you + * expect to be woken by the device interrupt and then poll for its + * completion. That requires having a good predictor for the request + * duration, which we currently lack. + */ + if (CONFIG_DRM_I915_SPIN_REQUEST_IRQ && + __i915_spin_request(rq, wait.seqno, state, + CONFIG_DRM_I915_SPIN_REQUEST_IRQ)) goto complete; set_current_state(state); @@ -1396,8 +1420,15 @@ long i915_request_wait(struct i915_request *rq, __i915_wait_request_check_and_reset(rq)) continue; - /* Only spin if we know the GPU is processing this request */ - if (__i915_spin_request(rq, wait.seqno, state, 2)) + /* + * A quick spin now we are on the CPU to offset the cost of + * context switching away (and so spin for roughly the same as + * the scheduler latency). We only spin if we know the GPU is + * processing this request, and so likely to finish shortly. + */ + if (CONFIG_DRM_I915_SPIN_REQUEST_CS && + __i915_spin_request(rq, wait.seqno, state, + CONFIG_DRM_I915_SPIN_REQUEST_CS)) break; if (!intel_wait_check_request(&wait, rq)) { From patchwork Sat Jul 28 16:46:20 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10548019 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C39DE112E for ; Sat, 28 Jul 2018 16:46:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B3C7A2B104 for ; Sat, 28 Jul 2018 16:46:40 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A5F302B107; Sat, 28 Jul 2018 16:46:40 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 45F632B104 for ; Sat, 28 Jul 2018 16:46:40 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2F4B16E202; Sat, 28 Jul 2018 16:46:39 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2573B6E1EB for ; Sat, 28 Jul 2018 16:46:36 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 12491853-1500050 for multiple; Sat, 28 Jul 2018 17:46:21 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Sat, 28 Jul 2018 17:46:20 +0100 Message-Id: <20180728164623.10613-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180728164623.10613-1-chris@chris-wilson.co.uk> References: <20180728164623.10613-1-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH 2/5] drm/i915: Expose idle delays to Kconfig X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP We want to expose the parameters for controlling how long it takes for us to notice and park the GPU after a stream of requests in order to try and tune the optimal power-efficiency vs latency of a mostly idle system. v2: retire_work has two scheduling points, update them both Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/Kconfig.profile | 23 +++++++++++++++++++++++ drivers/gpu/drm/i915/i915_gem.c | 8 +++++--- drivers/gpu/drm/i915/i915_gem.h | 13 +++++++++++++ 3 files changed, 41 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile index 8a230eeb98df..63cb744d920d 100644 --- a/drivers/gpu/drm/i915/Kconfig.profile +++ b/drivers/gpu/drm/i915/Kconfig.profile @@ -24,3 +24,26 @@ config DRM_I915_SPIN_REQUEST_CS spin prior to sleeping again. May be 0 to disable spinning after being woken. + +config DRM_I915_GEM_RETIRE_DELAY + int + default 1000 # milliseconds + help + We maintain a background job whose purpose is to keep cleaning up + after userspace, and may be the first to spot an idle system. This + parameter determines the interval between execution of the worker. + + To reduce the number of timer wakeups, large delays (of greater than + a second) are rounded to the next walltime second to allow coalescing + of multiple system timers into a single wakeup. + +config DRM_I915_GEM_PARK_DELAY + int + default 100 # milliseconds + help + Before parking the engines and the GPU after the final request is + retired, we may wait for a small delay to reduce the frequecy of + having to park/unpark and so the latency in executing a new request. + + May be 0 to immediately start parking the engines after the last + request. diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 460f256114f7..5b538a92631d 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -192,7 +192,9 @@ void i915_gem_park(struct drm_i915_private *i915) return; /* Defer the actual call to __i915_gem_park() to prevent ping-pongs */ - mod_delayed_work(i915->wq, &i915->gt.idle_work, msecs_to_jiffies(100)); + mod_delayed_work(i915->wq, + &i915->gt.idle_work, + msecs_to_jiffies(CONFIG_DRM_I915_GEM_PARK_DELAY)); } void i915_gem_unpark(struct drm_i915_private *i915) @@ -236,7 +238,7 @@ void i915_gem_unpark(struct drm_i915_private *i915) queue_delayed_work(i915->wq, &i915->gt.retire_work, - round_jiffies_up_relative(HZ)); + i915_gem_retire_delay()); } int @@ -3466,7 +3468,7 @@ i915_gem_retire_work_handler(struct work_struct *work) if (READ_ONCE(dev_priv->gt.awake)) queue_delayed_work(dev_priv->wq, &dev_priv->gt.retire_work, - round_jiffies_up_relative(HZ)); + i915_gem_retire_delay()); } static void shrink_caches(struct drm_i915_private *i915) diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h index e46592956872..c6b4971435dc 100644 --- a/drivers/gpu/drm/i915/i915_gem.h +++ b/drivers/gpu/drm/i915/i915_gem.h @@ -27,6 +27,8 @@ #include #include +#include +#include struct drm_i915_private; @@ -76,6 +78,17 @@ struct drm_i915_private; void i915_gem_park(struct drm_i915_private *i915); void i915_gem_unpark(struct drm_i915_private *i915); +static inline unsigned long i915_gem_retire_delay(void) +{ + const unsigned long delay = + msecs_to_jiffies(CONFIG_DRM_I915_GEM_RETIRE_DELAY); + + if (CONFIG_DRM_I915_GEM_RETIRE_DELAY >= 1000) + return round_jiffies_up_relative(delay); + else + return delay; +} + static inline void __tasklet_disable_sync_once(struct tasklet_struct *t) { if (atomic_inc_return(&t->count) == 1) From patchwork Sat Jul 28 16:46:21 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10548025 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AAE4B14E0 for ; Sat, 28 Jul 2018 16:46:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9A1172B104 for ; Sat, 28 Jul 2018 16:46:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8C88B2B107; Sat, 28 Jul 2018 16:46:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 35DA62B104 for ; Sat, 28 Jul 2018 16:46:49 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 38BC76E21C; Sat, 28 Jul 2018 16:46:47 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0B94C6E218 for ; Sat, 28 Jul 2018 16:46:43 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 12491854-1500050 for multiple; Sat, 28 Jul 2018 17:46:21 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Sat, 28 Jul 2018 17:46:21 +0100 Message-Id: <20180728164623.10613-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180728164623.10613-1-chris@chris-wilson.co.uk> References: <20180728164623.10613-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 3/5] drm/i915: Increase busyspin limit before a context-switch X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Ben Widawsky , Eero Tamminen Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Looking at the distribution of i915_wait_request for a set of GL benchmarks, we see: broadwell# python bcc/tools/funclatency.py -u i915_wait_request usecs : count distribution 0 -> 1 : 29184 |****************************************| 2 -> 3 : 5767 |******* | 4 -> 7 : 3000 |**** | 8 -> 15 : 491 | | 16 -> 31 : 140 | | 32 -> 63 : 203 | | 64 -> 127 : 543 | | 128 -> 255 : 881 |* | 256 -> 511 : 1209 |* | 512 -> 1023 : 1739 |** | 1024 -> 2047 : 22855 |******************************* | 2048 -> 4095 : 1725 |** | 4096 -> 8191 : 5813 |******* | 8192 -> 16383 : 5348 |******* | 16384 -> 32767 : 1000 |* | 32768 -> 65535 : 4400 |****** | 65536 -> 131071 : 296 | | 131072 -> 262143 : 225 | | 262144 -> 524287 : 4 | | 524288 -> 1048575 : 1 | | 1048576 -> 2097151 : 1 | | 2097152 -> 4194303 : 1 | | broxton# python bcc/tools/funclatency.py -u i915_wait_request usecs : count distribution 0 -> 1 : 5523 |************************************* | 2 -> 3 : 1340 |********* | 4 -> 7 : 2100 |************** | 8 -> 15 : 755 |***** | 16 -> 31 : 211 |* | 32 -> 63 : 53 | | 64 -> 127 : 71 | | 128 -> 255 : 113 | | 256 -> 511 : 262 |* | 512 -> 1023 : 358 |** | 1024 -> 2047 : 1105 |******* | 2048 -> 4095 : 848 |***** | 4096 -> 8191 : 1295 |******** | 8192 -> 16383 : 5894 |****************************************| 16384 -> 32767 : 4270 |**************************** | 32768 -> 65535 : 5622 |************************************** | 65536 -> 131071 : 306 |** | 131072 -> 262143 : 50 | | 262144 -> 524287 : 76 | | 524288 -> 1048575 : 34 | | 1048576 -> 2097151 : 0 | | 2097152 -> 4194303 : 1 | | Picking 20us for the context-switch busyspin has the dual advantage of catching most frequent short waits while avoiding the cost of a context switch. 20us is a typical latency of 2 context-switches, i.e. the cost of taking the sleep, without the secondary effects of cache flushing. Signed-off-by: Chris Wilson Cc: Sagar Kamble Cc: Eero Tamminen Cc: Tvrtko Ursulin Cc: Ben Widawsky Cc: Joonas Lahtinen Cc: Michał Winiarski --- drivers/gpu/drm/i915/Kconfig.profile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile index 63cb744d920d..de394dea4a14 100644 --- a/drivers/gpu/drm/i915/Kconfig.profile +++ b/drivers/gpu/drm/i915/Kconfig.profile @@ -14,7 +14,7 @@ config DRM_I915_SPIN_REQUEST_IRQ config DRM_I915_SPIN_REQUEST_CS int - default 2 # microseconds + default 20 # microseconds help After sleeping for a request (GPU operation) to complete, we will be woken up on the completion of every request prior to the one From patchwork Sat Jul 28 16:46:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10548023 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6870D14E0 for ; Sat, 28 Jul 2018 16:46:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 595F72B104 for ; Sat, 28 Jul 2018 16:46:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4DA3C2B107; Sat, 28 Jul 2018 16:46:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C45ED2B104 for ; Sat, 28 Jul 2018 16:46:47 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 018FB6E218; Sat, 28 Jul 2018 16:46:47 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1386E6E218 for ; Sat, 28 Jul 2018 16:46:45 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 12491855-1500050 for multiple; Sat, 28 Jul 2018 17:46:22 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Sat, 28 Jul 2018 17:46:22 +0100 Message-Id: <20180728164623.10613-4-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180728164623.10613-1-chris@chris-wilson.co.uk> References: <20180728164623.10613-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 4/5] drm/i915: Increase initial busyspin limit X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Ben Widawsky , Eero Tamminen Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Here we bump the busyspin for the current request to avoid sleeping and the cost of both idling and downclocking the CPU. Signed-off-by: Chris Wilson Cc: Sagar Kamble Cc: Eero Tamminen Cc: Francisco Jerez Cc: Tvrtko Ursulin Cc: Ben Widawsky Cc: Joonas Lahtinen Cc: Michał Winiarski --- drivers/gpu/drm/i915/Kconfig.profile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile index de394dea4a14..54e0ba4051e7 100644 --- a/drivers/gpu/drm/i915/Kconfig.profile +++ b/drivers/gpu/drm/i915/Kconfig.profile @@ -1,6 +1,6 @@ config DRM_I915_SPIN_REQUEST_IRQ int - default 5 # microseconds + default 250 # microseconds help Before sleeping waiting for a request (GPU operation) to complete, we may spend some time polling for its completion. As the IRQ may From patchwork Sat Jul 28 16:46:23 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10548021 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 711E3112E for ; Sat, 28 Jul 2018 16:46:45 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 612E12B104 for ; Sat, 28 Jul 2018 16:46:45 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 55B242B107; Sat, 28 Jul 2018 16:46:45 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 0E5D92B104 for ; Sat, 28 Jul 2018 16:46:45 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 211B66E220; Sat, 28 Jul 2018 16:46:44 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id F0B146E1EB for ; Sat, 28 Jul 2018 16:46:38 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 12491856-1500050 for multiple; Sat, 28 Jul 2018 17:46:22 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Sat, 28 Jul 2018 17:46:23 +0100 Message-Id: <20180728164623.10613-5-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180728164623.10613-1-chris@chris-wilson.co.uk> References: <20180728164623.10613-1-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH 5/5] drm/i915: Do not use iowait while waiting for the GPU X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Eero Tamminen MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP A recent trend for cpufreq is to boost the CPU frequencies for iowaiters, in particularly to benefit high frequency I/O. We do the same and boost the GPU clocks to try and minimise time spent waiting for the GPU. However, as the igfx and CPU share the same TDP, boosting the CPU frequency will result in the GPU being throttled and its frequency being reduced. Thus declaring iowait negatively impacts on GPU throughput. v2: Both sleeps! Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107410 References: 52ccc4314293 ("cpufreq: intel_pstate: HWP boost performance on IO wakeup") Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: Joonas Lahtinen Cc: Eero Tamminen Cc: Francisco Jerez --- drivers/gpu/drm/i915/i915_request.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index ca25c53f8daa..694269e5b761 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -1330,7 +1330,7 @@ long i915_request_wait(struct i915_request *rq, goto complete; } - timeout = io_schedule_timeout(timeout); + timeout = schedule_timeout(timeout); } while (1); GEM_BUG_ON(!intel_wait_has_seqno(&wait)); @@ -1387,7 +1387,7 @@ long i915_request_wait(struct i915_request *rq, break; } - timeout = io_schedule_timeout(timeout); + timeout = schedule_timeout(timeout); if (intel_wait_complete(&wait) && intel_wait_check_request(&wait, rq))