From patchwork Thu Jul 26 08:50:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10545437 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A027814E0 for ; Thu, 26 Jul 2018 08:50:53 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8FA172AC74 for ; Thu, 26 Jul 2018 08:50:53 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 836E32AC6F; Thu, 26 Jul 2018 08:50:53 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 7199E2AC6F for ; Thu, 26 Jul 2018 08:50:52 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A9F786E652; Thu, 26 Jul 2018 08:50:51 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1DFCA6E47D for ; Thu, 26 Jul 2018 08:50:45 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 12466495-1500050 for multiple; Thu, 26 Jul 2018 09:50:33 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 26 Jul 2018 09:50:31 +0100 Message-Id: <20180726085033.4044-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.18.0 MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 1/3] drm/i915: Protect guc_fini_wq() against module load abort X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Prevent [ 397.873143] general protection fault: 0000 [#1] PREEMPT SMP PTI [ 397.873154] CPU: 4 PID: 4799 Comm: drv_module_relo Tainted: G U 4.18.0-rc6-CI-CI_DRM_4534+ #1 [ 397.873162] Hardware name: Micro-Star International Co., Ltd. MS-7B54/Z370M MORTAR (MS-7B54), BIOS 1.10 12/28/2017 [ 397.873175] RIP: 0010:__lock_acquire+0xf6/0x1b50 [ 397.873179] Code: 85 c0 4c 8b 9d 40 ff ff ff 8b 8d 38 ff ff ff 44 8b 8d 30 ff ff ff 4c 8b 85 28 ff ff ff 44 8b 95 24 ff ff ff 0f 84 54 03 00 00 ff 80 38 01 00 00 8b 15 45 8c 59 02 45 8b bc 24 70 08 00 00 85 [ 397.873240] RSP: 0018:ffffc90000497b40 EFLAGS: 00010002 [ 397.873246] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000001 RCX: 0000000000000000 [ 397.873252] RDX: 0000000000000046 RSI: 0000000000000000 RDI: 0000000000000000 [ 397.873258] RBP: ffffc90000497c20 R08: ffffffff810a25e9 R09: 0000000000000000 [ 397.873264] R10: 0000000000000000 R11: ffff880255c63c28 R12: ffff8801093b2840 [ 397.873270] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000246 [ 397.873277] FS: 00007faf88d71980(0000) GS:ffff880266300000(0000) knlGS:0000000000000000 [ 397.873284] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 397.873289] CR2: 000055d866c9ca10 CR3: 000000025472e006 CR4: 00000000003606e0 [ 397.873295] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 397.873301] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 397.873308] Call Trace: [ 397.873318] ? lock_acquire+0xa6/0x210 [ 397.873323] lock_acquire+0xa6/0x210 [ 397.873331] ? drain_workqueue+0x19/0x180 [ 397.873339] __mutex_lock+0x89/0x980 [ 397.873346] ? drain_workqueue+0x19/0x180 [ 397.873352] ? _raw_spin_unlock_irqrestore+0x4c/0x60 [ 397.873359] ? trace_hardirqs_on_caller+0xe0/0x1b0 [ 397.873365] ? drain_workqueue+0x19/0x180 [ 397.873373] ? debug_object_active_state+0x127/0x150 [ 397.873381] ? drain_workqueue+0x19/0x180 [ 397.873387] drain_workqueue+0x19/0x180 [ 397.873395] destroy_workqueue+0x12/0x1f0 [ 397.873476] intel_guc_fini_misc+0x36/0x90 [i915] [ 397.873540] i915_gem_fini+0x91/0x100 [i915] [ 397.873588] i915_driver_unload+0xd2/0x110 [i915] [ 397.873638] i915_pci_remove+0x19/0x30 [i915] [ 397.873646] pci_device_remove+0x36/0xb0 [ 397.873653] device_release_driver_internal+0x185/0x250 [ 397.873660] driver_detach+0x35/0x70 [ 397.873668] bus_remove_driver+0x53/0xd0 [ 397.873675] pci_unregister_driver+0x25/0xa0 [ 397.873683] __se_sys_delete_module+0x162/0x210 [ 397.873691] ? do_syscall_64+0xd/0x190 [ 397.873697] do_syscall_64+0x55/0x190 [ 397.873704] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 397.873710] RIP: 0033:0x7faf884231b7 [ 397.873714] Code: 73 01 c3 48 8b 0d d1 8c 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a1 8c 2c 00 f7 d8 64 89 01 48 [ 397.873775] RSP: 002b:00007ffda4e98cf8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 [ 397.873784] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007faf884231b7 [ 397.873790] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055fbb18f1bd8 [ 397.873796] RBP: 000055fbb18f1b70 R08: 000055fbb18f1bdc R09: 00007ffda4e98d38 [ 397.873802] R10: 00007ffda4e97cf4 R11: 0000000000000206 R12: 000055fbb0d32470 [ 397.873808] R13: 00007ffda4e992e0 R14: 0000000000000000 R15: 0000000000000000 v2: It's use-after-free; not a NULL pointer. Testcase: igt/drv_module_reload/basic-reload-inject Signed-off-by: Chris Wilson Cc: Michał Winiarski Cc: Michal Wajdeczko Reviewed-by: Michał Winiarski --- drivers/gpu/drm/i915/intel_guc.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c index 846d693ecb53..3082d7670f05 100644 --- a/drivers/gpu/drm/i915/intel_guc.c +++ b/drivers/gpu/drm/i915/intel_guc.c @@ -128,13 +128,15 @@ static int guc_init_wq(struct intel_guc *guc) static void guc_fini_wq(struct intel_guc *guc) { - struct drm_i915_private *dev_priv = guc_to_i915(guc); + struct workqueue_struct *wq; - if (HAS_LOGICAL_RING_PREEMPTION(dev_priv) && - USES_GUC_SUBMISSION(dev_priv)) - destroy_workqueue(guc->preempt_wq); + wq = fetch_and_zero(&guc->preempt_wq); + if (wq) + destroy_workqueue(wq); - destroy_workqueue(guc->log.relay.flush_wq); + wq = fetch_and_zero(&guc->log.relay.flush_wq); + if (wq) + destroy_workqueue(wq); } int intel_guc_init_misc(struct intel_guc *guc) From patchwork Thu Jul 26 08:50:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10545435 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E7FB0112E for ; Thu, 26 Jul 2018 08:50:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D8D5D2AB84 for ; Thu, 26 Jul 2018 08:50:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CBEAD2ADE7; Thu, 26 Jul 2018 08:50:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 8AD5C2AC6F for ; Thu, 26 Jul 2018 08:50:47 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D1D0D6E47D; Thu, 26 Jul 2018 08:50:46 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 813B36E109 for ; Thu, 26 Jul 2018 08:50:42 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 12466496-1500050 for multiple; Thu, 26 Jul 2018 09:50:33 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 26 Jul 2018 09:50:32 +0100 Message-Id: <20180726085033.4044-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180726085033.4044-1-chris@chris-wilson.co.uk> References: <20180726085033.4044-1-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH 2/3] drm/i915: Restore sane defaults for KMS on GEM error load X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP If we fail during GEM initialisation, we scrub the HW state by performing a device level GPU resuet. However, we want to leave the system in a usable state (with functioning KMS but no GEM) so after scrubbing the HW state, we need to restore some sane defaults and re-enable the low-level common parts of the GPU (such as the GMCH). v2: Restore GTT entries. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index a4031fab57b0..d9cc4820d224 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -5598,6 +5598,8 @@ int i915_gem_init(struct drm_i915_private *dev_priv) i915_gem_cleanup_userptr(dev_priv); if (ret == -EIO) { + mutex_lock(&dev_priv->drm.struct_mutex); + /* * Allow engine initialisation to fail by marking the GPU as * wedged. But we only want to do this where the GPU is angry, @@ -5608,7 +5610,14 @@ int i915_gem_init(struct drm_i915_private *dev_priv) "Failed to initialize GPU, declaring it wedged!\n"); i915_gem_set_wedged(dev_priv); } - ret = 0; + + /* Minimal basic recovery for KMS */ + ret = i915_ggtt_enable_hw(dev_priv); + i915_gem_restore_gtt_mappings(dev_priv); + i915_gem_restore_fences(dev_priv); + intel_init_clock_gating(dev_priv); + + mutex_unlock(&dev_priv->drm.struct_mutex); } i915_gem_drain_freed_objects(dev_priv); From patchwork Thu Jul 26 08:50:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10545433 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4E31B112E for ; Thu, 26 Jul 2018 08:50:45 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3E1162AB84 for ; Thu, 26 Jul 2018 08:50:45 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3241E2AC6F; Thu, 26 Jul 2018 08:50:45 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id CA7A32AB84 for ; Thu, 26 Jul 2018 08:50:44 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E70066E109; Thu, 26 Jul 2018 08:50:43 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8142C6E463 for ; Thu, 26 Jul 2018 08:50:42 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 12466497-1500050 for multiple; Thu, 26 Jul 2018 09:50:33 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 26 Jul 2018 09:50:33 +0100 Message-Id: <20180726085033.4044-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180726085033.4044-1-chris@chris-wilson.co.uk> References: <20180726085033.4044-1-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH 3/3] drm/i915: Don't disable the GPU for older gen on wedging X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP If we issue a device level GPU reset on the older gen, it will disable key components of the GMCH and the display engine. The purpose of wedging is to simply prevent further GEM usage without disabling KMS, so we need to be careful when we do issue the reset on wedging. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index d9cc4820d224..1c82d5a61c7e 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3329,7 +3329,8 @@ void i915_gem_set_wedged(struct drm_i915_private *i915) i915->caps.scheduler = 0; /* Even if the GPU reset fails, it should still stop the engines */ - intel_gpu_reset(i915, ALL_ENGINES); + if (INTEL_GEN(i915) >= 5) + intel_gpu_reset(i915, ALL_ENGINES); /* * Make sure no one is running the old callback before we proceed with