From patchwork Fri Sep 16 09:24:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Janusz Krzysztofik X-Patchwork-Id: 12978329 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F2E52C54EE9 for ; Fri, 16 Sep 2022 09:24:51 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0814D10E3EC; Fri, 16 Sep 2022 09:24:43 +0000 (UTC) Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by gabe.freedesktop.org (Postfix) with ESMTPS id CCAA410E2CE; Fri, 16 Sep 2022 09:24:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1663320278; x=1694856278; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vV6j1rx86g5hJHxWDoqFdng4mIpPMQDNCjE9yuGTSic=; b=ZrZ+LDDitCE8S7xar7BQ0WgakQx0aeOmf+jDHXhXWE/KJvKrNO4+6wC7 uM2QmvYsoVkgF3vWFaXZ27x1IdfAkSKygwPMTwVDG74CIBRwCBpfpEkIG 3ZhR/VdoZ/h9mzbPSyKQ1ARUlPlw5GRrvy50vzSCpqaQmnecwOjnAZuko /WB1E9hD+AUZfF9Mr0pYKJOoApkAUE2X1+kT+3e60B1lz1h4S+YbYU60V jOz/V0VNUi5gDT/Z+I8LhrPwZ842S1szEgy1wZeEhKwowUtS/TdGqt0Q+ QQQPrspFisElcxWICn8sAXlRAzEl+kKFutOy0m5+UBkp4OoR7whSrq7cQ Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10471"; a="296540770" X-IronPort-AV: E=Sophos;i="5.93,320,1654585200"; d="scan'208";a="296540770" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Sep 2022 02:24:38 -0700 X-IronPort-AV: E=Sophos;i="5.93,320,1654585200"; d="scan'208";a="743277646" Received: from jkrzyszt-mobl1.ger.corp.intel.com (HELO jkrzyszt-mobl1.lan) ([10.213.25.54]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Sep 2022 02:24:35 -0700 From: Janusz Krzysztofik To: intel-gfx@lists.freedesktop.org Date: Fri, 16 Sep 2022 11:24:02 +0200 Message-Id: <20220916092403.201355-2-janusz.krzysztofik@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220916092403.201355-1-janusz.krzysztofik@linux.intel.com> References: <20220916092403.201355-1-janusz.krzysztofik@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 1/2] drm/i915/gem: Flush contexts on driver release X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Daniel Vetter , dri-devel@lists.freedesktop.org, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Due to i915_perf assuming that it can use the i915_gem_context reference to protect its i915->gem.contexts.list iteration, we need to defer removal of the context from the list until last reference to the context is put. However, there is a risk of triggering kernel warning on contexts list not empty at driver release time if we deleagate that task to a worker for i915_gem_context_release_work(), unless that work is flushed first. Unfortunately, it is not flushed on driver release. Fix it. Instead of additionally calling flush_workqueue(), either directly or via a new dedicated wrapper around it, replace last call to i915_gem_drain_freed_objects() with existing i915_gem_drain_workqueue() that performs both tasks. Fixes: 75eefd82581f ("drm/i915: Release i915_gem_context from a worker") Suggested-by: Chris Wilson Signed-off-by: Janusz Krzysztofik Reviewed-by: Andi Shyti Cc: stable@kernel.org # v5.16+ Reviewed-by: Gwan-gyeong Mun --- drivers/gpu/drm/i915/i915_gem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index c8e14ed9c2a96..172c29a15f118 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1195,7 +1195,8 @@ void i915_gem_driver_release(struct drm_i915_private *dev_priv) intel_uc_cleanup_firmwares(&to_gt(dev_priv)->uc); - i915_gem_drain_freed_objects(dev_priv); + /* Flush any outstanding work, including i915_gem_context.release_work. */ + i915_gem_drain_workqueue(dev_priv); drm_WARN_ON(&dev_priv->drm, !list_empty(&dev_priv->gem.contexts.list)); } From patchwork Fri Sep 16 09:24:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Janusz Krzysztofik X-Patchwork-Id: 12978330 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DCA5EECAAD8 for ; Fri, 16 Sep 2022 09:24:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2143C10ECE1; Fri, 16 Sep 2022 09:24:47 +0000 (UTC) Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by gabe.freedesktop.org (Postfix) with ESMTPS id 97FF810E2F1; Fri, 16 Sep 2022 09:24:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1663320283; x=1694856283; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=B3h4fHpwL5Jj/jiDKqC/OxOLjlomLvmvq5mwChyq3Co=; b=ac+yXvHq4abUNU/nv1PSX+fRFTfnvoWgoIgEACTVDW63H94w9yyWLdDe N7s/MXtaXG8x4gNX6Hk32ax7PXMKkpuRm9ZkH+YaVqGtEE/vwS7CpM0El BvKdkEqurHoPQO4t4q17QeqDhoO9pqqMRuZLsdZqY8lNVClSqYgZsWrT5 69hGeEvJyVe03nfuY+k1L0XauXjn/o0M4opTrOXfnnZhUWAEfk8dvgfhT CmbYYxzXubB/zc7vmWBJ+J6/I8Rgg6dZ1JuHVdfKflFUQHoI8lBZ5us0V MFK0z61REsPbHsRHZNfoXgsIWf3soGiMCCbI+CoPlZNx5EpWFB1f2YtY2 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10471"; a="296540779" X-IronPort-AV: E=Sophos;i="5.93,320,1654585200"; d="scan'208";a="296540779" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Sep 2022 02:24:42 -0700 X-IronPort-AV: E=Sophos;i="5.93,320,1654585200"; d="scan'208";a="743277673" Received: from jkrzyszt-mobl1.ger.corp.intel.com (HELO jkrzyszt-mobl1.lan) ([10.213.25.54]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Sep 2022 02:24:39 -0700 From: Janusz Krzysztofik To: intel-gfx@lists.freedesktop.org Date: Fri, 16 Sep 2022 11:24:03 +0200 Message-Id: <20220916092403.201355-3-janusz.krzysztofik@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220916092403.201355-1-janusz.krzysztofik@linux.intel.com> References: <20220916092403.201355-1-janusz.krzysztofik@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 2/2] drm/i915/gem: Really move i915_gem_context.link under ref protection X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Daniel Vetter , dri-devel@lists.freedesktop.org, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Chris Wilson i915_perf assumes that it can use the i915_gem_context reference to protect its i915->gem.contexts.list iteration. However, this requires that we do not remove the context from the list until after we drop the final reference and release the struct. If, as currently, we remove the context from the list during context_close(), the link.next pointer may be poisoned while we are holding the context reference and cause a GPF: [ 4070.573157] i915 0000:00:02.0: [drm:i915_perf_open_ioctl [i915]] filtering on ctx_id=0x1fffff ctx_id_mask=0x1fffff [ 4070.574881] general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] PREEMPT SMP [ 4070.574897] CPU: 1 PID: 284392 Comm: amd_performance Tainted: G E 5.17.9 #180 [ 4070.574903] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017 [ 4070.574907] RIP: 0010:oa_configure_all_contexts.isra.0+0x222/0x350 [i915] [ 4070.574982] Code: 08 e8 32 6e 10 e1 4d 8b 6d 50 b8 ff ff ff ff 49 83 ed 50 f0 41 0f c1 04 24 83 f8 01 0f 84 e3 00 00 00 85 c0 0f 8e fa 00 00 00 <49> 8b 45 50 48 8d 70 b0 49 8d 45 50 48 39 44 24 10 0f 85 34 fe ff [ 4070.574990] RSP: 0018:ffffc90002077b78 EFLAGS: 00010202 [ 4070.574995] RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000000 [ 4070.575000] RDX: 0000000000000001 RSI: ffffc90002077b20 RDI: ffff88810ddc7c68 [ 4070.575004] RBP: 0000000000000001 R08: ffff888103242648 R09: fffffffffffffffc [ 4070.575008] R10: ffffffff82c50bc0 R11: 0000000000025c80 R12: ffff888101bf1860 [ 4070.575012] R13: dead0000000000b0 R14: ffffc90002077c04 R15: ffff88810be5cabc [ 4070.575016] FS: 00007f1ed50c0780(0000) GS:ffff88885ec80000(0000) knlGS:0000000000000000 [ 4070.575021] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4070.575025] CR2: 00007f1ed5590280 CR3: 000000010ef6f005 CR4: 00000000003706e0 [ 4070.575029] Call Trace: [ 4070.575033] [ 4070.575037] lrc_configure_all_contexts+0x13e/0x150 [i915] [ 4070.575103] gen8_enable_metric_set+0x4d/0x90 [i915] [ 4070.575164] i915_perf_open_ioctl+0xbc0/0x1500 [i915] [ 4070.575224] ? asm_common_interrupt+0x1e/0x40 [ 4070.575232] ? i915_oa_init_reg_state+0x110/0x110 [i915] [ 4070.575290] drm_ioctl_kernel+0x85/0x110 [ 4070.575296] ? update_load_avg+0x5f/0x5e0 [ 4070.575302] drm_ioctl+0x1d3/0x370 [ 4070.575307] ? i915_oa_init_reg_state+0x110/0x110 [i915] [ 4070.575382] ? gen8_gt_irq_handler+0x46/0x130 [i915] [ 4070.575445] __x64_sys_ioctl+0x3c4/0x8d0 [ 4070.575451] ? __do_softirq+0xaa/0x1d2 [ 4070.575456] do_syscall_64+0x35/0x80 [ 4070.575461] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 4070.575467] RIP: 0033:0x7f1ed5c10397 [ 4070.575471] Code: 3c 1c e8 1c ff ff ff 85 c0 79 87 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 da 0d 00 f7 d8 64 89 01 48 [ 4070.575478] RSP: 002b:00007ffd65c8d7a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 4070.575484] RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f1ed5c10397 [ 4070.575488] RDX: 00007ffd65c8d7c0 RSI: 0000000040106476 RDI: 0000000000000006 [ 4070.575492] RBP: 00005620972f9c60 R08: 000000000000000a R09: 0000000000000005 [ 4070.575496] R10: 000000000000000d R11: 0000000000000246 R12: 000000000000000a [ 4070.575500] R13: 000000000000000d R14: 0000000000000000 R15: 00007ffd65c8d7c0 [ 4070.575505] [ 4070.575507] Modules linked in: nls_ascii(E) nls_cp437(E) vfat(E) fat(E) i915(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) aesni_intel(E) crypto_simd(E) intel_gtt(E) cryptd(E) ttm(E) rapl(E) intel_cstate(E) drm_kms_helper(E) cfbfillrect(E) syscopyarea(E) cfbimgblt(E) intel_uncore(E) sysfillrect(E) mei_me(E) sysimgblt(E) i2c_i801(E) fb_sys_fops(E) mei(E) intel_pch_thermal(E) i2c_smbus(E) cfbcopyarea(E) video(E) button(E) efivarfs(E) autofs4(E) [ 4070.575549] ---[ end trace 0000000000000000 ]--- v3: fix incorrect syntax of spin_lock() replacing spin_lock_irqsave() v2: irqsave not required in a worker, neither conversion to irq safe elsewhere (Tvrtko), - perf: it's safe to call gen8_configure_context() even if context has been closed, no need to check, - drop unrelated cleanup (Andi, Tvrtko) Reported-by: Mark Janes Closes: https://gitlab.freedesktop.org/drm/intel/issues/6222 References: a4e7ccdac38e ("drm/i915: Move context management under GEM") Fixes: f8246cf4d9a9 ("drm/i915/gem: Drop free_work for GEM contexts") Signed-off-by: Chris Wilson Reviewed-by: Andi Shyti Signed-off-by: Andi Shyti Signed-off-by: Janusz Krzysztofik Cc: Tvrtko Ursulin Cc: # v5.12+ Reviewed-by: Gwan-gyeong Mun --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index dabdfe09f5e51..0bcde53c50c61 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1269,6 +1269,10 @@ static void i915_gem_context_release_work(struct work_struct *work) trace_i915_context_free(ctx); GEM_BUG_ON(!i915_gem_context_is_closed(ctx)); + spin_lock(&ctx->i915->gem.contexts.lock); + list_del(&ctx->link); + spin_unlock(&ctx->i915->gem.contexts.lock); + if (ctx->syncobj) drm_syncobj_put(ctx->syncobj); @@ -1521,10 +1525,6 @@ static void context_close(struct i915_gem_context *ctx) ctx->file_priv = ERR_PTR(-EBADF); - spin_lock(&ctx->i915->gem.contexts.lock); - list_del(&ctx->link); - spin_unlock(&ctx->i915->gem.contexts.lock); - client = ctx->client; if (client) { spin_lock(&client->ctx_lock);