From patchwork Mon Sep 3 08:33:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10585609 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2810913BB for ; Mon, 3 Sep 2018 08:34:20 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1A9CF290BF for ; Mon, 3 Sep 2018 08:34:20 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0EFFD290FE; Mon, 3 Sep 2018 08:34:20 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id CA5B5290BF for ; Mon, 3 Sep 2018 08:34:19 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2D24F6E1AA; Mon, 3 Sep 2018 08:34:17 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id ECAFB6E1AB for ; Mon, 3 Sep 2018 08:34:14 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 13512241-1500050 for multiple; Mon, 03 Sep 2018 09:33:38 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 3 Sep 2018 09:33:33 +0100 Message-Id: <20180903083337.13134-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.19.0.rc1 MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 1/5] drm/i915: Do a full device reset after being wedged X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP We only call unset_wedged on the global reset path (since it's a global operation), so if we are terminally wedged and wish to reset, take the full device reset path rather than the quicker individual engine resets. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_irq.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index e31093ce871c..10f28a2ee2e6 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -3309,7 +3309,8 @@ void i915_handle_error(struct drm_i915_private *dev_priv, * Try engine reset when available. We fall back to full reset if * single reset fails. */ - if (intel_has_reset_engine(dev_priv)) { + if (intel_has_reset_engine(dev_priv) && + !i915_terminally_wedged(&dev_priv->gpu_error)) { for_each_engine_masked(engine, dev_priv, engine_mask, tmp) { BUILD_BUG_ON(I915_RESET_MODESET >= I915_RESET_ENGINE); if (test_and_set_bit(I915_RESET_ENGINE + engine->id, From patchwork Mon Sep 3 08:33:34 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10585603 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7AE4113BB for ; Mon, 3 Sep 2018 08:34:18 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6C9DB29153 for ; Mon, 3 Sep 2018 08:34:18 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 613002919D; Mon, 3 Sep 2018 08:34:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 0248C290FE for ; Mon, 3 Sep 2018 08:34:18 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 91C676E1A8; Mon, 3 Sep 2018 08:34:16 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id ECC1A6E1B0 for ; Mon, 3 Sep 2018 08:34:14 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 13512242-1500050 for multiple; Mon, 03 Sep 2018 09:33:38 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 3 Sep 2018 09:33:34 +0100 Message-Id: <20180903083337.13134-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.19.0.rc1 In-Reply-To: <20180903083337.13134-1-chris@chris-wilson.co.uk> References: <20180903083337.13134-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 2/5] drm/i915: Flag any possible writes for a GTT fault X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP We do not explicitly mark the PTE for the user's GTT mmap as being wrprotect, so we don't get a refault when we would need to change a read-only mmapping into read-write. As such, we must presume that if the vma has PROT_WRITE it may be written to, although this is supposed to be indicated by set-domain there are cases (e.g. after swap) where userspace may not be aware of the implicit domain change. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 7b7bbfe59697..625e07c56fe2 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2018,7 +2018,7 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf) struct drm_device *dev = obj->base.dev; struct drm_i915_private *dev_priv = to_i915(dev); struct i915_ggtt *ggtt = &dev_priv->ggtt; - bool write = !!(vmf->flags & FAULT_FLAG_WRITE); + bool write = area->vm_flags & VM_WRITE; struct i915_vma *vma; pgoff_t page_offset; int ret; From patchwork Mon Sep 3 08:33:35 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10585611 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 91B6B14BD for ; Mon, 3 Sep 2018 08:34:21 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 848EE290FE for ; Mon, 3 Sep 2018 08:34:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 793E629189; Mon, 3 Sep 2018 08:34:21 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 38CF0290DC for ; Mon, 3 Sep 2018 08:34:21 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 79D896E1AB; Mon, 3 Sep 2018 08:34:20 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id C628C6E1AA for ; Mon, 3 Sep 2018 08:34:15 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 13512244-1500050 for multiple; Mon, 03 Sep 2018 09:33:38 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 3 Sep 2018 09:33:35 +0100 Message-Id: <20180903083337.13134-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.19.0.rc1 In-Reply-To: <20180903083337.13134-1-chris@chris-wilson.co.uk> References: <20180903083337.13134-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 3/5] drm/i915: Force the slow path after a user-write error X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP If we fail to write the user relocation back when it is changed, force ourselves to take the slow relocation path where we can handle faults in the write path. There is still an element of dubiousness as having patched up the batch to use the correct offset, it no longer matches the presumed_offset in the relocation, so a second pass may miss any changes in layout. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index a926d7d47183..931be2651f01 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1486,8 +1486,10 @@ static int eb_relocate_vma(struct i915_execbuffer *eb, struct i915_vma *vma) * can read from this userspace address. */ offset = gen8_canonical_addr(offset & ~UPDATE); - __put_user(offset, - &urelocs[r-stack].presumed_offset); + if (unlikely(__put_user(offset, &urelocs[r-stack].presumed_offset))) { + remain = -EFAULT; + goto out; + } } } while (r++, --count); urelocs += ARRAY_SIZE(stack); @@ -1572,7 +1574,6 @@ static int eb_copy_relocations(const struct i915_execbuffer *eb) relocs = kvmalloc_array(size, 1, GFP_KERNEL); if (!relocs) { - kvfree(relocs); err = -ENOMEM; goto err; } @@ -1586,6 +1587,7 @@ static int eb_copy_relocations(const struct i915_execbuffer *eb) if (__copy_from_user((char *)relocs + copied, (char __user *)urelocs + copied, len)) { +end_user: kvfree(relocs); err = -EFAULT; goto err; @@ -1609,7 +1611,6 @@ static int eb_copy_relocations(const struct i915_execbuffer *eb) unsafe_put_user(-1, &urelocs[copied].presumed_offset, end_user); -end_user: user_access_end(); eb->exec[i].relocs_ptr = (uintptr_t)relocs; From patchwork Mon Sep 3 08:33:36 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10585607 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4E79013BB for ; Mon, 3 Sep 2018 08:34:19 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3D0E7290BF for ; Mon, 3 Sep 2018 08:34:19 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 312D8290FE; Mon, 3 Sep 2018 08:34:19 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id CB022290BF for ; Mon, 3 Sep 2018 08:34:18 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EA3A56E1A9; Mon, 3 Sep 2018 08:34:16 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id EC9F36E1AA for ; Mon, 3 Sep 2018 08:34:14 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 13512245-1500050 for multiple; Mon, 03 Sep 2018 09:33:38 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 3 Sep 2018 09:33:36 +0100 Message-Id: <20180903083337.13134-4-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.19.0.rc1 In-Reply-To: <20180903083337.13134-1-chris@chris-wilson.co.uk> References: <20180903083337.13134-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 4/5] drm/i915: Early rejection of buffer allocations larger than RAM X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP We currently try to pin and allocate the whole buffer at a time. If that object is larger than RAM, we will try to pin the whole of physical memory, force the machine into oom, and then still fail the allocation. If the request is obviously too large, error out early. We opt to do this in the backend to make it easy to use alternate paths that do not require the entire object pinned, or may easily handle proxy objects that are larger than physical memory. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem.c | 25 +++++++++++++++++++------ 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 625e07c56fe2..89834ce19acd 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2533,13 +2533,21 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) gfp_t noreclaim; int ret; - /* Assert that the object is not currently in any GPU domain. As it + /* + * Assert that the object is not currently in any GPU domain. As it * wasn't in the GTT, there shouldn't be any way it could have been in * a GPU cache */ GEM_BUG_ON(obj->read_domains & I915_GEM_GPU_DOMAINS); GEM_BUG_ON(obj->write_domain & I915_GEM_GPU_DOMAINS); + /* + * If there's no chance of allocating enough pages for the whole + * object, bail early. + */ + if (page_count > totalram_pages) + return -ENOMEM; + st = kmalloc(sizeof(*st), GFP_KERNEL); if (st == NULL) return -ENOMEM; @@ -2550,7 +2558,8 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) return -ENOMEM; } - /* Get the list of pages out of our struct file. They'll be pinned + /* + * Get the list of pages out of our struct file. They'll be pinned * at this point until we release them. * * Fail silently without starting the shrinker @@ -2582,7 +2591,8 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) i915_gem_shrink(dev_priv, 2 * page_count, NULL, *s++); cond_resched(); - /* We've tried hard to allocate the memory by reaping + /* + * We've tried hard to allocate the memory by reaping * our own buffer, now let the real VM do its job and * go down in flames if truly OOM. * @@ -2594,7 +2604,8 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) /* reclaim and warn, but no oom */ gfp = mapping_gfp_mask(mapping); - /* Our bo are always dirty and so we require + /* + * Our bo are always dirty and so we require * kswapd to reclaim our pages (direct reclaim * does not effectively begin pageout of our * buffers on its own). However, direct reclaim @@ -2638,7 +2649,8 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) ret = i915_gem_gtt_prepare_pages(obj, st); if (ret) { - /* DMA remapping failed? One possible cause is that + /* + * DMA remapping failed? One possible cause is that * it could not reserve enough large entries, asking * for PAGE_SIZE chunks instead may be helpful. */ @@ -2672,7 +2684,8 @@ static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) sg_free_table(st); kfree(st); - /* shmemfs first checks if there is enough memory to allocate the page + /* + * shmemfs first checks if there is enough memory to allocate the page * and reports ENOSPC should there be insufficient, along with the usual * ENOMEM for a genuine allocation failure. * From patchwork Mon Sep 3 08:33:37 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10585605 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9CC93175A for ; Mon, 3 Sep 2018 08:34:18 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8D973290BF for ; Mon, 3 Sep 2018 08:34:18 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 818B629139; Mon, 3 Sep 2018 08:34:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id E08F0290BF for ; Mon, 3 Sep 2018 08:34:17 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 88A7189F07; Mon, 3 Sep 2018 08:34:16 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id EC2576E1A9 for ; Mon, 3 Sep 2018 08:34:14 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 13512246-1500050 for multiple; Mon, 03 Sep 2018 09:33:39 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 3 Sep 2018 09:33:37 +0100 Message-Id: <20180903083337.13134-5-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.19.0.rc1 In-Reply-To: <20180903083337.13134-1-chris@chris-wilson.co.uk> References: <20180903083337.13134-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 5/5] drm/i915: Forcibly flush unwanted requests in drop-caches X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Add a mode to debugfs/drop-caches to flush unwanted requests off the GPU (by wedging the device and resetting). This is very useful if a test terminated leaving a long queue of hanging batches that would ordinarily require a round trip through hangcheck for each. It reduces the inter-test operation to just a write into drop-caches to reset driver/GPU state between tests. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c | 52 ++++++++++++++++++++--------- 1 file changed, 36 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index a5265c236a33..4ad0e2ed8610 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -4131,13 +4131,17 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_ring_test_irq_fops, #define DROP_FREED BIT(4) #define DROP_SHRINK_ALL BIT(5) #define DROP_IDLE BIT(6) +#define DROP_RESET_ACTIVE BIT(7) +#define DROP_RESET_SEQNO BIT(8) #define DROP_ALL (DROP_UNBOUND | \ DROP_BOUND | \ DROP_RETIRE | \ DROP_ACTIVE | \ DROP_FREED | \ DROP_SHRINK_ALL |\ - DROP_IDLE) + DROP_IDLE | \ + DROP_RESET_ACTIVE | \ + DROP_RESET_SEQNO) static int i915_drop_caches_get(void *data, u64 *val) { @@ -4149,53 +4153,69 @@ i915_drop_caches_get(void *data, u64 *val) static int i915_drop_caches_set(void *data, u64 val) { - struct drm_i915_private *dev_priv = data; - struct drm_device *dev = &dev_priv->drm; + struct drm_i915_private *i915 = data; int ret = 0; DRM_DEBUG("Dropping caches: 0x%08llx [0x%08llx]\n", val, val & DROP_ALL); + if (val & DROP_RESET_ACTIVE && !intel_engines_are_idle(i915)) + i915_gem_set_wedged(i915); + /* No need to check and wait for gpu resets, only libdrm auto-restarts * on ioctls on -EAGAIN. */ - if (val & (DROP_ACTIVE | DROP_RETIRE)) { - ret = mutex_lock_interruptible(&dev->struct_mutex); + if (val & (DROP_ACTIVE | DROP_RETIRE | DROP_RESET_SEQNO)) { + ret = mutex_lock_interruptible(&i915->drm.struct_mutex); if (ret) return ret; if (val & DROP_ACTIVE) - ret = i915_gem_wait_for_idle(dev_priv, + ret = i915_gem_wait_for_idle(i915, I915_WAIT_INTERRUPTIBLE | I915_WAIT_LOCKED, MAX_SCHEDULE_TIMEOUT); + if (val & DROP_RESET_SEQNO) { + intel_runtime_pm_get(i915); + ret = i915_gem_set_global_seqno(&i915->drm, 1); + intel_runtime_pm_put(i915); + } + if (val & DROP_RETIRE) - i915_retire_requests(dev_priv); + i915_retire_requests(i915); - mutex_unlock(&dev->struct_mutex); + mutex_unlock(&i915->drm.struct_mutex); + } + + if (val & DROP_RESET_ACTIVE && + i915_terminally_wedged(&i915->gpu_error)) { + i915_handle_error(i915, ALL_ENGINES, 0, NULL); + wait_on_bit(&i915->gpu_error.flags, + I915_RESET_HANDOFF, + TASK_UNINTERRUPTIBLE); } fs_reclaim_acquire(GFP_KERNEL); if (val & DROP_BOUND) - i915_gem_shrink(dev_priv, LONG_MAX, NULL, I915_SHRINK_BOUND); + i915_gem_shrink(i915, LONG_MAX, NULL, I915_SHRINK_BOUND); if (val & DROP_UNBOUND) - i915_gem_shrink(dev_priv, LONG_MAX, NULL, I915_SHRINK_UNBOUND); + i915_gem_shrink(i915, LONG_MAX, NULL, I915_SHRINK_UNBOUND); if (val & DROP_SHRINK_ALL) - i915_gem_shrink_all(dev_priv); + i915_gem_shrink_all(i915); fs_reclaim_release(GFP_KERNEL); if (val & DROP_IDLE) { do { - if (READ_ONCE(dev_priv->gt.active_requests)) - flush_delayed_work(&dev_priv->gt.retire_work); - drain_delayed_work(&dev_priv->gt.idle_work); - } while (READ_ONCE(dev_priv->gt.awake)); + if (READ_ONCE(i915->gt.active_requests)) + flush_delayed_work(&i915->gt.retire_work); + drain_delayed_work(&i915->gt.idle_work); + } while (READ_ONCE(i915->gt.awake)); } if (val & DROP_FREED) - i915_gem_drain_freed_objects(dev_priv); + i915_gem_drain_freed_objects(i915); return ret; }