From patchwork Thu Jun 4 08:12:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587339 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0AD9A14E3 for ; Thu, 4 Jun 2020 08:12:51 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DD7E52075B for ; Thu, 4 Jun 2020 08:12:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="E/pgUYI2" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DD7E52075B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C65656E2B2; Thu, 4 Jun 2020 08:12:36 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by gabe.freedesktop.org (Postfix) with ESMTPS id D714B6E2B2 for ; Thu, 4 Jun 2020 08:12:34 +0000 (UTC) Received: by mail-wr1-x441.google.com with SMTP id l10so5015087wrr.10 for ; Thu, 04 Jun 2020 01:12:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=CdFnY1g4TiO7TC07GhFyr/JbIyM/VV8gIgkQFmLlHDQ=; b=E/pgUYI2SY0xaGSewMfnWU5m8fnHDao7z+E7QBf31znuNz2UmnVw8X4hz5J+Pz1dU+ aLdb+YCMIJuRPgJ3X1a4V5egheABOcDbYfV7ZXg66Rhzcni1oFNwjMZqwC2tNLCdA8d2 X6YoGl1DDs7upjvieXVFaK7T0OJ5RiE807N3M= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=CdFnY1g4TiO7TC07GhFyr/JbIyM/VV8gIgkQFmLlHDQ=; b=R0CJio+538bK3rHH0N9xhPzEfPjWjBApgpbiE9GNVaohqhRtRBKOYzVJEvMB2tsZ2A GzSXwOTsiGSV54cW6LsRAfzv7HtrJjsPn9WzyxuEZvjV3Sy2tnxLcvh61iWpgh5v5wPp 0RaiIfSNrFX6/tyhQ9wB6BcMC/5bcLyRrcWZ+A5ILbs8FyoYhWZNFTz35rOmOg6N10Fm WfF5YDFhsAPUwJJuGP3WtLDrENmKsRrEtn0KG7YKD3TzZ1yyQ2tdk92gnLrPbBCsk09s PE10h65LuRJn/IdrS7vvx0TuRh1THlk+ytDJHiod18fQcaG0/fx/adakgaSqWasIeHq0 JEZA== X-Gm-Message-State: AOAM533WQH9/3NLwh17KHI8tZMi7wVU6Ts3E1QXuip9y7mkPGO4dzecv ObkH5FwIklsSQhWUAzq4TUUTYQ== X-Google-Smtp-Source: ABdhPJzIRvuD8WMXGKIijFhhfc+ctkeCbHFIpWW5UM3euiKyXNUz7hZchjZUK8wCqpLkRmBOh919xg== X-Received: by 2002:a5d:5389:: with SMTP id d9mr3447032wrv.77.1591258353390; Thu, 04 Jun 2020 01:12:33 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:32 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:07 +0200 Message-Id: <20200604081224.863494-2-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 01/18] mm: Track mmu notifiers in fs_reclaim_acquire/release X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, linux-mm@kvack.org, Jason Gunthorpe , Daniel Vetter , Andrew Morton , =?utf-8?q?Christian_K=C3=B6nig?= Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" fs_reclaim_acquire/release nicely catch recursion issues when allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend to use to keep the excessive caches in check). For mmu notifier recursions we do have lockdep annotations since 23b68395c7c7 ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end"). But these only fire if a path actually results in some pte invalidation - for most small allocations that's very rarely the case. The other trouble is that pte invalidation can happen any time when __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe choice, GFP_NOIO isn't good enough to avoid potential mmu notifier recursion. I was pondering whether we should just do the general annotation, but there's always the risk for false positives. Plus I'm assuming that the core fs and io code is a lot better reviewed and tested than random mmu notifier code in drivers. Hence why I decide to only annotate for that specific case. Furthermore even if we'd create a lockdep map for direct reclaim, we'd still need to explicit pull in the mmu notifier map - there's a lot more places that do pte invalidation than just direct reclaim, these two contexts arent the same. Note that the mmu notifiers needing their own independent lockdep map is also the reason we can't hold them from fs_reclaim_acquire to fs_reclaim_release - it would nest with the acquistion in the pte invalidation code, causing a lockdep splat. And we can't remove the annotations from pte invalidation and all the other places since they're called from many other places than page reclaim. Hence we can only do the equivalent of might_lock, but on the raw lockdep map. With this we can also remove the lockdep priming added in 66204f1d2d1b ("mm/mmu_notifiers: prime lockdep") since the new annotations are strictly more powerful. Cc: Andrew Morton Cc: Jason Gunthorpe Cc: linux-mm@kvack.org Cc: linux-rdma@vger.kernel.org Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- This is part of a gpu lockdep annotation series simply because it really helps to catch issues where gpu subsystem locks and primitives can deadlock with themselves through allocations and mmu notifiers. But aside from that motivation it should be completely free-standing, and can land through -mm/-rdma/-hmm or any other tree really whenever. -Daniel --- mm/mmu_notifier.c | 7 ------- mm/page_alloc.c | 23 ++++++++++++++--------- 2 files changed, 14 insertions(+), 16 deletions(-) diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c index 06852b896fa6..5d578b9122f8 100644 --- a/mm/mmu_notifier.c +++ b/mm/mmu_notifier.c @@ -612,13 +612,6 @@ int __mmu_notifier_register(struct mmu_notifier *subscription, lockdep_assert_held_write(&mm->mmap_sem); BUG_ON(atomic_read(&mm->mm_users) <= 0); - if (IS_ENABLED(CONFIG_LOCKDEP)) { - fs_reclaim_acquire(GFP_KERNEL); - lock_map_acquire(&__mmu_notifier_invalidate_range_start_map); - lock_map_release(&__mmu_notifier_invalidate_range_start_map); - fs_reclaim_release(GFP_KERNEL); - } - if (!mm->notifier_subscriptions) { /* * kmalloc cannot be called under mm_take_all_locks(), but we diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 13cc653122b7..f8a222db4a53 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -57,6 +57,7 @@ #include #include #include +#include #include #include #include @@ -4124,7 +4125,7 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla static struct lockdep_map __fs_reclaim_map = STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map); -static bool __need_fs_reclaim(gfp_t gfp_mask) +static bool __need_reclaim(gfp_t gfp_mask) { gfp_mask = current_gfp_context(gfp_mask); @@ -4136,10 +4137,6 @@ static bool __need_fs_reclaim(gfp_t gfp_mask) if (current->flags & PF_MEMALLOC) return false; - /* We're only interested __GFP_FS allocations for now */ - if (!(gfp_mask & __GFP_FS)) - return false; - if (gfp_mask & __GFP_NOLOCKDEP) return false; @@ -4158,15 +4155,23 @@ void __fs_reclaim_release(void) void fs_reclaim_acquire(gfp_t gfp_mask) { - if (__need_fs_reclaim(gfp_mask)) - __fs_reclaim_acquire(); + if (__need_reclaim(gfp_mask)) { + if (!(gfp_mask & __GFP_FS)) + __fs_reclaim_acquire(); + + lock_map_acquire(&__mmu_notifier_invalidate_range_start_map); + lock_map_release(&__mmu_notifier_invalidate_range_start_map); + + } } EXPORT_SYMBOL_GPL(fs_reclaim_acquire); void fs_reclaim_release(gfp_t gfp_mask) { - if (__need_fs_reclaim(gfp_mask)) - __fs_reclaim_release(); + if (__need_reclaim(gfp_mask)) { + if (!(gfp_mask & __GFP_FS)) + __fs_reclaim_release(); + } } EXPORT_SYMBOL_GPL(fs_reclaim_release); #endif From patchwork Thu Jun 4 08:12:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587349 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B613490 for ; Thu, 4 Jun 2020 08:12:56 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 94810207DA for ; Thu, 4 Jun 2020 08:12:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="RUvuRoaA" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 94810207DA Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6D6996E2CC; Thu, 4 Jun 2020 08:12:37 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by gabe.freedesktop.org (Postfix) with ESMTPS id C50BD6E2B1 for ; Thu, 4 Jun 2020 08:12:35 +0000 (UTC) Received: by mail-wr1-x444.google.com with SMTP id x13so5042298wrv.4 for ; Thu, 04 Jun 2020 01:12:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Ys1lEmo1OrglME8cWbIXSVFR7KWfyvCPNKiL9IEbmhc=; b=RUvuRoaAUx3zoTg0dekOOqbI1T8Sgso/qeY9AgIY+69YkzMVoxVd2zcJLUgcBipdk7 bfLnFFHUCXlCtl7HOikMZEY9KbeA3ZJCmbGvpYaKPSFaizoR8g1G0iiiWixkm2fyo6ve uHh1ScOQIMEnMrZouE9W6fbm6RZuu2BrIaMFw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Ys1lEmo1OrglME8cWbIXSVFR7KWfyvCPNKiL9IEbmhc=; b=RNfRaHCPgPGc0PAKLWLIEfDnxrPcW2QJbqGWYy3q9NQ1oJitIaNlbajczZi/OyDnSB Nvk+5wvY4LK4Ifn90A+lW98EtoZK5A7TBOq2mcYiiuPeV4OLMD1wCwcxK7tjKfmhnhii 3Wt1Pjg57Dxur6nlr+RpSyd9fXvDnqOf5E+PM9euUjFxrT1MinIvIiRNKV2ph+Tqi/rH tri2zGon4Y4UDBHZugh4IBc9k44TXZYEopcXuDfM8MyzDWtrXag1GElKo5ApdwjAvRz5 CpWdfO/FJG9lKye5gjDva+uqMe1XCpd7wpzgPvVxK2AsiozqUHCVWbX8O19jCj+vRkp/ eKcw== X-Gm-Message-State: AOAM5304+T5cB0inNhjUy00KPKOkmEXeuaJs3+24DC2FFUX1eqPMo+15 1OUjleKc1SJUJ2ZhWAMVki8zUA== X-Google-Smtp-Source: ABdhPJyMtbbr4YZr5XgYcavaaABocIWgVVWO+6bIVkLH86w/gWfWScj44NFGJ8WhcZD+YmTAw14Znw== X-Received: by 2002:a5d:518b:: with SMTP id k11mr3495849wrv.58.1591258354483; Thu, 04 Jun 2020 01:12:34 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:33 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:08 +0200 Message-Id: <20200604081224.863494-3-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 02/18] dma-buf: minor doc touch-ups X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Daniel Vetter Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Just some tiny edits: - fix link to struct dma_fence - give slightly more meaningful title - the polling here is about implicit fences, explicit fences (in sync_file or drm_syncobj) also have their own polling Signed-off-by: Daniel Vetter Reviewed-by: Thomas Hellstrom --- drivers/dma-buf/dma-buf.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 01ce125f8e8d..e018ef80451e 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -161,11 +161,11 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence) } /** - * DOC: fence polling + * DOC: implicit fence polling * * To support cross-device and cross-driver synchronization of buffer access - * implicit fences (represented internally in the kernel with &struct fence) can - * be attached to a &dma_buf. The glue for that and a few related things are + * implicit fences (represented internally in the kernel with &struct dma_fence) + * can be attached to a &dma_buf. The glue for that and a few related things are * provided in the &dma_resv structure. * * Userspace can query the state of these implicitly tracked fences using poll() From patchwork Thu Jun 4 08:12:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587357 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B3CE590 for ; Thu, 4 Jun 2020 08:13:01 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 91ED1207DA for ; Thu, 4 Jun 2020 08:13:01 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="KloVwGUn" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 91ED1207DA Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7E06D6E2C8; Thu, 4 Jun 2020 08:12:39 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by gabe.freedesktop.org (Postfix) with ESMTPS id 60B906E2C8 for ; Thu, 4 Jun 2020 08:12:37 +0000 (UTC) Received: by mail-wm1-x343.google.com with SMTP id l26so4289071wme.3 for ; Thu, 04 Jun 2020 01:12:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=YKD4NNqujWI53pi0LNrySV2UBs5yZLD++4cyp/PY0U8=; b=KloVwGUnrk1AZS2fyAW7VkTcr0xsn1gBEZxXNjuTlFJYSBx4FNYYSV73aliKcvyAbv K7ooC389JAuCHMleI+0MlNTd7/5urNhu74242bQZws+RrNE+xi0DaGTj0pyLrVWI1O1Q pBiYqlc1p9W/cWgpEnw7f8ka58DoyVPPnPBfA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=YKD4NNqujWI53pi0LNrySV2UBs5yZLD++4cyp/PY0U8=; b=eGfpGrIi/IjC7RHbQbXbeczc2O9W54TF1fxS2jayK4LEe8n4R95pl9NFbbXwzew7ub c5DjayTD8AKR77c2XhOTO97sEspkhXzuSIOa9BZpilUlt+vafEfO70P0qC2CUBH0RpT3 OrDEVfiFTbkNnUfTXa2pcImwH+VesSzLsI2l+B/k+OiSHBF6WOGatrT2tt85tDejy3Qd qIFKi9wGleNexHNgbIO17ZYvQt1er+WcfpnQbgYnkVmzxd0fHhKEZIc3MEEBaxUgTqfd xqlcCwQG7hWnW1RDqmqvJ66YjiuAYgpZ1Ca75Kkizf5byXuqTJSPheF11d1e0Fsi0Ttd Yi1A== X-Gm-Message-State: AOAM533j4IZNJsw5PJKGgAQQ40bxyMmYsKWbYPVZ27fxWhhGW7sFQgLe f8fN8o+qzlPlAU9XMAto4sr61A== X-Google-Smtp-Source: ABdhPJxHK6l0kVjpgG+FrvTB0VM/qeHWLvWMTWS/NdNmuLScATdBDLk0+bIq3tZxeUtZennvxLIRdA== X-Received: by 2002:a1c:2082:: with SMTP id g124mr2984892wmg.21.1591258355927; Thu, 04 Jun 2020 01:12:35 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:35 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:09 +0200 Message-Id: <20200604081224.863494-4-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Thomas Hellstrom , Daniel Vetter , linux-media@vger.kernel.org, =?utf-8?q?Christian_K=C3=B6nig?= , Mika Kuoppala Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Design is similar to the lockdep annotations for workers, but with some twists: - We use a read-lock for the execution/worker/completion side, so that this explicit annotation can be more liberally sprinkled around. With read locks lockdep isn't going to complain if the read-side isn't nested the same way under all circumstances, so ABBA deadlocks are ok. Which they are, since this is an annotation only. - We're using non-recursive lockdep read lock mode, since in recursive read lock mode lockdep does not catch read side hazards. And we _very_ much want read side hazards to be caught. For full details of this limitation see commit e91498589746065e3ae95d9a00b068e525eec34f Author: Peter Zijlstra Date: Wed Aug 23 13:13:11 2017 +0200 locking/lockdep/selftests: Add mixed read-write ABBA tests - To allow nesting of the read-side explicit annotations we explicitly keep track of the nesting. lock_is_held() allows us to do that. - The wait-side annotation is a write lock, and entirely done within dma_fence_wait() for everyone by default. - To be able to freely annotate helper functions I want to make it ok to call dma_fence_begin/end_signalling from soft/hardirq context. First attempt was using the hardirq locking context for the write side in lockdep, but this forces all normal spinlocks nested within dma_fence_begin/end_signalling to be spinlocks. That bollocks. The approach now is to simple check in_atomic(), and for these cases entirely rely on the might_sleep() check in dma_fence_wait(). That will catch any wrong nesting against spinlocks from soft/hardirq contexts. The idea here is that every code path that's critical for eventually signalling a dma_fence should be annotated with dma_fence_begin/end_signalling. The annotation ideally starts right after a dma_fence is published (added to a dma_resv, exposed as a sync_file fd, attached to a drm_syncobj fd, or anything else that makes the dma_fence visible to other kernel threads), up to and including the dma_fence_wait(). Examples are irq handlers, the scheduler rt threads, the tail of execbuf (after the corresponding fences are visible), any workers that end up signalling dma_fences and really anything else. Not annotated should be code paths that only complete fences opportunistically as the gpu progresses, like e.g. shrinker/eviction code. The main class of deadlocks this is supposed to catch are: Thread A: mutex_lock(A); mutex_unlock(A); dma_fence_signal(); Thread B: mutex_lock(A); dma_fence_wait(); mutex_unlock(A); Thread B is blocked on A signalling the fence, but A never gets around to that because it cannot acquire the lock A. Note that dma_fence_wait() is allowed to be nested within dma_fence_begin/end_signalling sections. To allow this to happen the read lock needs to be upgraded to a write lock, which means that any other lock is acquired between the dma_fence_begin_signalling() call and the call to dma_fence_wait(), and still held, this will result in an immediate lockdep complaint. The only other option would be to not annotate such calls, defeating the point. Therefore these annotations cannot be sprinkled over the code entirely mindless to avoid false positives. v2: handle soft/hardirq ctx better against write side and dont forget EXPORT_SYMBOL, drivers can't use this otherwise. v3: Kerneldoc. v4: Some spelling fixes from Mika Cc: Mika Kuoppala Cc: Thomas Hellstrom Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter Nacked-by: Chris Wilson --- Documentation/driver-api/dma-buf.rst | 12 +- drivers/dma-buf/dma-fence.c | 161 +++++++++++++++++++++++++++ include/linux/dma-fence.h | 12 ++ 3 files changed, 182 insertions(+), 3 deletions(-) diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst index 63dec76d1d8d..05d856131140 100644 --- a/Documentation/driver-api/dma-buf.rst +++ b/Documentation/driver-api/dma-buf.rst @@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects .. kernel-doc:: drivers/dma-buf/dma-buf.c :doc: cpu access -Fence Poll Support -~~~~~~~~~~~~~~~~~~ +Implicit Fence Poll Support +~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. kernel-doc:: drivers/dma-buf/dma-buf.c - :doc: fence polling + :doc: implicit fence polling Kernel Functions and Structures Reference ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -133,6 +133,12 @@ DMA Fences .. kernel-doc:: drivers/dma-buf/dma-fence.c :doc: DMA fences overview +DMA Fence Signalling Annotations +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. kernel-doc:: drivers/dma-buf/dma-fence.c + :doc: fence signalling annotation + DMA Fences Functions Reference ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 656e9ac2d028..0005bc002529 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -110,6 +110,160 @@ u64 dma_fence_context_alloc(unsigned num) } EXPORT_SYMBOL(dma_fence_context_alloc); +/** + * DOC: fence signalling annotation + * + * Proving correctness of all the kernel code around &dma_fence through code + * review and testing is tricky for a few reasons: + * + * * It is a cross-driver contract, and therefore all drivers must follow the + * same rules for lock nesting order, calling contexts for various functions + * and anything else significant for in-kernel interfaces. But it is also + * impossible to test all drivers in a single machine, hence brute-force N vs. + * N testing of all combinations is impossible. Even just limiting to the + * possible combinations is infeasible. + * + * * There is an enormous amount of driver code involved. For render drivers + * there's the tail of command submission, after fences are published, + * scheduler code, interrupt and workers to process job completion, + * and timeout, gpu reset and gpu hang recovery code. Plus for integration + * with core mm with have &mmu_notifier, respectively &mmu_interval_notifier, + * and &shrinker. For modesetting drivers there's the commit tail functions + * between when fences for an atomic modeset are published, and when the + * corresponding vblank completes, including any interrupt processing and + * related workers. Auditing all that code, across all drivers, is not + * feasible. + * + * * Due to how many other subsystems are involved and the locking hierarchies + * this pulls in there is extremely thin wiggle-room for driver-specific + * differences. &dma_fence interacts with almost all of the core memory + * handling through page fault handlers via &dma_resv, dma_resv_lock() and + * dma_resv_unlock(). On the other side it also interacts through all + * allocation sites through &mmu_notifier and &shrinker. + * + * Furthermore lockdep does not handle cross-release dependencies, which means + * any deadlocks between dma_fence_wait() and dma_fence_signal() can't be caught + * at runtime with some quick testing. The simplest example is one thread + * waiting on a &dma_fence while holding a lock:: + * + * lock(A); + * dma_fence_wait(B); + * unlock(A); + * + * while the other thread is stuck trying to acquire the same lock, which + * prevents it from signalling the fence the previous thread is stuck waiting + * on:: + * + * lock(A); + * unlock(A); + * dma_fence_signal(B); + * + * By manually annotating all code relevant to signalling a &dma_fence we can + * teach lockdep about these dependencies, which also helps with the validation + * headache since now lockdep can check all the rules for us:: + * + * cookie = dma_fence_begin_signalling(); + * lock(A); + * unlock(A); + * dma_fence_signal(B); + * dma_fence_end_signalling(cookie); + * + * For using dma_fence_begin_signalling() and dma_fence_end_signalling() to + * annotate critical sections the following rules need to be observed: + * + * * All code necessary to complete a &dma_fence must be annotated, from the + * point where a fence is accessible to other threads, to the point where + * dma_fence_signal() is called. Un-annotated code can contain deadlock issues, + * and due to the very strict rules and many corner cases it is infeasible to + * catch these just with review or normal stress testing. + * + * * &struct dma_resv deserves a special note, since the readers are only + * protected by rcu. This means the signalling critical section starts as soon + * as the new fences are installed, even before dma_resv_unlock() is called. + * + * * The only exception are fast paths and opportunistic signalling code, which + * calls dma_fence_signal() purely as an optimization, but is not required to + * guarantee completion of a &dma_fence. The usual example is a wait IOCTL + * which calls dma_fence_signal(), while the mandatory completion path goes + * through a hardware interrupt and possible job completion worker. + * + * * To aid composability of code, the annotations can be freely nested, as long + * as the overall locking hierarchy is consistent. The annotations also work + * both in interrupt and process context. Due to implementation details this + * requires that callers pass an opaque cookie from + * dma_fence_begin_signalling() to dma_fence_end_signalling(). + * + * * Validation against the cross driver contract is implemented by priming + * lockdep with the relevant hierarchy at boot-up. This means even just + * testing with a single device is enough to validate a driver, at least as + * far as deadlocks with dma_fence_wait() against dma_fence_signal() are + * concerned. + */ +#ifdef CONFIG_LOCKDEP +struct lockdep_map dma_fence_lockdep_map = { + .name = "dma_fence_map" +}; + +/** + * dma_fence_begin_signalling - begin a critical DMA fence signalling section + * + * Drivers should use this to annotate the beginning of any code section + * required to eventually complete &dma_fence by calling dma_fence_signal(). + * + * The end of these critical sections are annotated with + * dma_fence_end_signalling(). + * + * Returns: + * + * Opaque cookie needed by the implementation, which needs to be passed to + * dma_fence_end_signalling(). + */ +bool dma_fence_begin_signalling(void) +{ + /* explicitly nesting ... */ + if (lock_is_held_type(&dma_fence_lockdep_map, 1)) + return true; + + /* rely on might_sleep check for soft/hardirq locks */ + if (in_atomic()) + return true; + + /* ... and non-recursive readlock */ + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_); + + return false; +} +EXPORT_SYMBOL(dma_fence_begin_signalling); + +/** + * dma_fence_end_signalling - end a critical DMA fence signalling section + * + * Closes a critical section annotation opened by dma_fence_begin_signalling(). + */ +void dma_fence_end_signalling(bool cookie) +{ + if (cookie) + return; + + lock_release(&dma_fence_lockdep_map, _RET_IP_); +} +EXPORT_SYMBOL(dma_fence_end_signalling); + +void __dma_fence_might_wait(void) +{ + bool tmp; + + tmp = lock_is_held_type(&dma_fence_lockdep_map, 1); + if (tmp) + lock_release(&dma_fence_lockdep_map, _THIS_IP_); + lock_map_acquire(&dma_fence_lockdep_map); + lock_map_release(&dma_fence_lockdep_map); + if (tmp) + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_); +} +#endif + + /** * dma_fence_signal_locked - signal completion of a fence * @fence: the fence to signal @@ -170,14 +324,19 @@ int dma_fence_signal(struct dma_fence *fence) { unsigned long flags; int ret; + bool tmp; if (!fence) return -EINVAL; + tmp = dma_fence_begin_signalling(); + spin_lock_irqsave(fence->lock, flags); ret = dma_fence_signal_locked(fence); spin_unlock_irqrestore(fence->lock, flags); + dma_fence_end_signalling(tmp); + return ret; } EXPORT_SYMBOL(dma_fence_signal); @@ -210,6 +369,8 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout) might_sleep(); + __dma_fence_might_wait(); + trace_dma_fence_wait_start(fence); if (fence->ops->wait) ret = fence->ops->wait(fence, intr, timeout); diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index 3347c54f3a87..3f288f7db2ef 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -357,6 +357,18 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep) } while (1); } +#ifdef CONFIG_LOCKDEP +bool dma_fence_begin_signalling(void); +void dma_fence_end_signalling(bool cookie); +#else +static inline bool dma_fence_begin_signalling(void) +{ + return true; +} +static inline void dma_fence_end_signalling(bool cookie) {} +static inline void __dma_fence_might_wait(void) {} +#endif + int dma_fence_signal(struct dma_fence *fence); int dma_fence_signal_locked(struct dma_fence *fence); signed long dma_fence_default_wait(struct dma_fence *fence, From patchwork Thu Jun 4 08:12:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587363 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9A047138C for ; Thu, 4 Jun 2020 08:13:07 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 783FB207DA for ; Thu, 4 Jun 2020 08:13:07 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="ZCDuFtoO" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 783FB207DA Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4BF946E2DA; Thu, 4 Jun 2020 08:12:40 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9EA9E6E2CF for ; Thu, 4 Jun 2020 08:12:38 +0000 (UTC) Received: by mail-wm1-x341.google.com with SMTP id f5so4624175wmh.2 for ; Thu, 04 Jun 2020 01:12:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=iX8yvgIySAam94sgW57RQXUmTB7Ou3XcFF4UUEubWkk=; b=ZCDuFtoO+JO6z1AsWRdVcfb4gkFPZIWpzbyZoQgItzGv+2X9uOvWQIfJl2CXVLRyS7 jmrwsv7OJneJtVtuGdLOTI+9nAN12c7tfvxEoB5QUeVSsKzLICOoE2KZwOuoG0NrxtB0 Y5WBOXg98M8mm1tRfoWbjuaEkFXGTwncTzP4U= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=iX8yvgIySAam94sgW57RQXUmTB7Ou3XcFF4UUEubWkk=; b=V2cmWmFu7jaQGLkK7ns+r6xva2LfLlJTrlN3snb3lTcRMJqUyoNTSc++1M/XYiCdWS c9mNCmWHl0bhHxCC4WjhmjV3mDNDWyrhD5z8P/8rZsWcLFU3Vv1PveUyoJ8cxQfHVai3 67wthI7XVjnAk+oKLIaRSWNLMNN8YLmpFngH3AL2WN40XTmNFccpQ8KEDtqb1tsUPYGH 0nSi+39hSz08MfyKbGOGRHny8nDqRLoq/Ggm5dt811plfPhDwMKlOIndXOxXXQgLpFBD QFyN+zWlYqduvFqnCSQbABjL/fVdRUyGwhjCPh6Mh0sp6rHFe6ay4zLVLXL7HdfRItU9 Ghtg== X-Gm-Message-State: AOAM530/Ntycv0jQC0fdoqR00QXINZREoQbvvJbThbhJR4L+PfTZz5Y1 VsT874YczWHwc7DQgzyRbhmiOw== X-Google-Smtp-Source: ABdhPJxE++sNFKLfONHraOPjYXQYwoR5UNfEopf+/1eDf4bACkMBIaDiWDCJt9Y4c08FHgIuU/EGmg== X-Received: by 2002:a7b:c145:: with SMTP id z5mr3006166wmi.6.1591258357250; Thu, 04 Jun 2020 01:12:37 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:36 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:10 +0200 Message-Id: <20200604081224.863494-5-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 04/18] dma-fence: prime lockdep annotations X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Thomas Hellstrom , Daniel Vetter , linux-media@vger.kernel.org, =?utf-8?q?Christian_K=C3=B6nig?= , Mika Kuoppala Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Two in one go: - it is allowed to call dma_fence_wait() while holding a dma_resv_lock(). This is fundamental to how eviction works with ttm, so required. - it is allowed to call dma_fence_wait() from memory reclaim contexts, specifically from shrinker callbacks (which i915 does), and from mmu notifier callbacks (which amdgpu does, and which i915 sometimes also does, and probably always should, but that's kinda a debate). Also for stuff like HMM we really need to be able to do this, or things get real dicey. Consequence is that any critical path necessary to get to a dma_fence_signal for a fence must never a) call dma_resv_lock nor b) allocate memory with GFP_KERNEL. Also by implication of dma_resv_lock(), no userspace faulting allowed. That's some supremely obnoxious limitations, which is why we need to sprinkle the right annotations to all relevant paths. The one big locking context we're leaving out here is mmu notifiers, added in commit 23b68395c7c78a764e8963fc15a7cfd318bf187f Author: Daniel Vetter Date: Mon Aug 26 22:14:21 2019 +0200 mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end that one covers a lot of other callsites, and it's also allowed to wait on dma-fences from mmu notifiers. But there's no ready-made functions exposed to prime this, so I've left it out for now. v2: Also track against mmu notifier context. v3: kerneldoc to spec the cross-driver contract. Note that currently i915 throws in a hard-coded 10s timeout on foreign fences (not sure why that was done, but it's there), which is why that rule is worded with SHOULD instead of MUST. Also some of the mmu_notifier/shrinker rules might surprise SoC drivers, I haven't fully audited them all. Which is infeasible anyway, we'll need to run them with lockdep and dma-fence annotations and see what goes boom. v4: A spelling fix from Mika Cc: Mika Kuoppala Cc: Thomas Hellstrom Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter Reviewed-by: Thomas Hellström --- Documentation/driver-api/dma-buf.rst | 6 ++++ drivers/dma-buf/dma-fence.c | 41 ++++++++++++++++++++++++++++ drivers/dma-buf/dma-resv.c | 4 +++ include/linux/dma-fence.h | 1 + 4 files changed, 52 insertions(+) diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst index 05d856131140..f8f6decde359 100644 --- a/Documentation/driver-api/dma-buf.rst +++ b/Documentation/driver-api/dma-buf.rst @@ -133,6 +133,12 @@ DMA Fences .. kernel-doc:: drivers/dma-buf/dma-fence.c :doc: DMA fences overview +DMA Fence Cross-Driver Contract +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. kernel-doc:: drivers/dma-buf/dma-fence.c + :doc: fence cross-driver contract + DMA Fence Signalling Annotations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 0005bc002529..754e6fb84fb7 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -64,6 +64,47 @@ static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(1); * &dma_buf.resv pointer. */ +/** + * DOC: fence cross-driver contract + * + * Since &dma_fence provide a cross driver contract, all drivers must follow the + * same rules: + * + * * Fences must complete in a reasonable time. Fences which represent kernels + * and shaders submitted by userspace, which could run forever, must be backed + * up by timeout and gpu hang recovery code. Minimally that code must prevent + * further command submission and force complete all in-flight fences, e.g. + * when the driver or hardware do not support gpu reset, or if the gpu reset + * failed for some reason. Ideally the driver supports gpu recovery which only + * affects the offending userspace context, and no other userspace + * submissions. + * + * * Drivers may have different ideas of what completion within a reasonable + * time means. Some hang recovery code uses a fixed timeout, others a mix + * between observing forward progress and increasingly strict timeouts. + * Drivers should not try to second guess timeout handling of fences from + * other drivers. + * + * * To ensure there's no deadlocks of dma_fence_wait() against other locks + * drivers should annotate all code required to reach dma_fence_signal(), + * which completes the fences, with dma_fence_begin_signalling() and + * dma_fence_end_signalling(). + * + * * Drivers are allowed to call dma_fence_wait() while holding dma_resv_lock(). + * This means any code required for fence completion cannot acquire a + * &dma_resv lock. Note that this also pulls in the entire established + * locking hierarchy around dma_resv_lock() and dma_resv_unlock(). + * + * * Drivers are allowed to call dma_fence_wait() from their &shrinker + * callbacks. This means any code required for fence completion cannot + * allocate memory with GFP_KERNEL. + * + * * Drivers are allowed to call dma_fence_wait() from their &mmu_notifier + * respectively &mmu_interval_notifier callbacks. This means any code required + * for fence completeion cannot allocate memory with GFP_NOFS or GFP_NOIO. + * Only GFP_ATOMIC is permissible, which might fail. + */ + static const char *dma_fence_stub_get_name(struct dma_fence *fence) { return "stub"; diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index 99c0a33c918d..c223f32425c4 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -35,6 +35,7 @@ #include #include #include +#include /** * DOC: Reservation Object Overview @@ -115,6 +116,9 @@ static int __init dma_resv_lockdep(void) if (ret == -EDEADLK) dma_resv_lock_slow(&obj, &ctx); fs_reclaim_acquire(GFP_KERNEL); + lock_map_acquire(&__mmu_notifier_invalidate_range_start_map); + __dma_fence_might_wait(); + lock_map_release(&__mmu_notifier_invalidate_range_start_map); fs_reclaim_release(GFP_KERNEL); ww_mutex_unlock(&obj.lock); ww_acquire_fini(&ctx); diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index 3f288f7db2ef..09e23adb351d 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -360,6 +360,7 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep) #ifdef CONFIG_LOCKDEP bool dma_fence_begin_signalling(void); void dma_fence_end_signalling(bool cookie); +void __dma_fence_might_wait(void); #else static inline bool dma_fence_begin_signalling(void) { From patchwork Thu Jun 4 08:12:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587365 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3485B90 for ; Thu, 4 Jun 2020 08:13:08 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 136D1207DA for ; Thu, 4 Jun 2020 08:13:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="IqxGBAw7" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 136D1207DA Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id F25D76E2DF; Thu, 4 Jun 2020 08:12:41 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by gabe.freedesktop.org (Postfix) with ESMTPS id D163D6E2D1 for ; Thu, 4 Jun 2020 08:12:39 +0000 (UTC) Received: by mail-wm1-x342.google.com with SMTP id q25so4635197wmj.0 for ; Thu, 04 Jun 2020 01:12:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Z86jDMPjYosEVmpvz8g87itNg+tEFmiX7DLzwmjjruY=; b=IqxGBAw7Ux8mFxKbZqFfhWHBvUThVxHGPJeIMG6V8JAySneDzg1MMfr9dhIgUYw2Fn OTAZ0xbCrYCL4eXPkPqfd9hDPavzifvRAdCmfeUmSdOIDkGg1FY43BlyAFoqsAL1182F slvaQBsul976QOmZFtX03fE52IJl/0XI4phXM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Z86jDMPjYosEVmpvz8g87itNg+tEFmiX7DLzwmjjruY=; b=j7S54/H4e9gF/RV2fsC8zXEm4NvaIFdzRplfiRbcSngDYLAoxMdO3G0pDvaZ876vqS b9dgfPBmjMBa2Ht5WTGyLT2PsaJigzxxf4Pbz1YZac5rX3C2/Ktx105UOIBhIdXR4Hg4 Zi2V/SYjONLEblz8zcU9D+65bL6MqXK1wpHSgpM20o5M+5fjys88d49cFi0VZq/+2u+i OUn8ek6YMo3bgI9wYNa88y2ZZDMbBxlSmm3GEJCjjInX2wqlMYuMjEhgtqpE8GzOPJBC X7/wE6T03aZSGiToG5QO6gWYjD4LQ8axHLW4y0TXTLGzGBo+8zj3L/6WwLLpyNgNY6vq VcCQ== X-Gm-Message-State: AOAM530tlcod195C69eM9YalThrQNeaeV7cot80Jg83lnjPpPYRoeRdR hkSbKl3BNl/86d75ExhJBpAFaw== X-Google-Smtp-Source: ABdhPJyKQbuv5baACXqVPivWSRwfBW+Y73i/xXeG533ND6BCj6YxPKujbnsSL0XPuqc9zCvklUc7sg== X-Received: by 2002:a7b:c385:: with SMTP id s5mr3041108wmj.121.1591258358519; Thu, 04 Jun 2020 01:12:38 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:37 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:11 +0200 Message-Id: <20200604081224.863494-6-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 05/18] drm/vkms: Annotate vblank timer X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Haneen Mohammed , Rodrigo Siqueira , linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Daniel Vetter , =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This is needed to signal the fences from page flips, annotate it accordingly. We need to annotate entire timer callback since if we get stuck anywhere in there, then the timer stops, and hence fences stop. Just annotating the top part that does the vblank handling isn't enough. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter Cc: Rodrigo Siqueira Cc: Haneen Mohammed Cc: Daniel Vetter --- drivers/gpu/drm/vkms/vkms_crtc.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c index ac85e17428f8..a53a40848a72 100644 --- a/drivers/gpu/drm/vkms/vkms_crtc.c +++ b/drivers/gpu/drm/vkms/vkms_crtc.c @@ -1,5 +1,7 @@ // SPDX-License-Identifier: GPL-2.0+ +#include + #include #include #include @@ -14,7 +16,9 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer) struct drm_crtc *crtc = &output->crtc; struct vkms_crtc_state *state; u64 ret_overrun; - bool ret; + bool ret, fence_cookie; + + fence_cookie = dma_fence_begin_signalling(); ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer, output->period_ns); @@ -49,6 +53,8 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer) DRM_DEBUG_DRIVER("Composer worker already queued\n"); } + dma_fence_end_signalling(fence_cookie); + return HRTIMER_RESTART; } From patchwork Thu Jun 4 08:12:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587369 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B77B314E3 for ; Thu, 4 Jun 2020 08:13:08 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 96058207F9 for ; Thu, 4 Jun 2020 08:13:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="gE8sqqdh" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 96058207F9 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 958A36E2E0; Thu, 4 Jun 2020 08:12:42 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by gabe.freedesktop.org (Postfix) with ESMTPS id 219B26E2DE for ; Thu, 4 Jun 2020 08:12:41 +0000 (UTC) Received: by mail-wr1-x444.google.com with SMTP id j10so5023167wrw.8 for ; Thu, 04 Jun 2020 01:12:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=+tjBPalwpvfaI2TJsBD2Z0Z3ZUY/HUiWjYFdKsCw83k=; b=gE8sqqdhi+ZYNpSwpwhah5KEVl5ccYRDpPqBoxZhhBcosuxdUUBujJlkR2DqFrVMMb SHhTPvm1cFszlQjqvdRcNdZIe628+AaiwXuD9C5umUghpDpDj2u5g6Zm/nnu8nPXkam7 hk6PoZYcpGdN71cpcxHDgPkUd9FGT9iRwxbMY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+tjBPalwpvfaI2TJsBD2Z0Z3ZUY/HUiWjYFdKsCw83k=; b=kAyjCyXpW/IugkGnj6COmTABdmpU+AZtCWK9BfUkiG7PWwnvu/x9caZT78zxq7GsRv a1Xd6165HNMy3QX+YV6mEl/poAUNDONBW4HDxlVRV9HNr3+7t68fTXpbQN70xY3GcpH2 KPY7DXd5dBk/kR+Qqs2PcES3Etvfelbt7ZOCx4oHB0A4vjESYulBU8KLLbC2/N3d7C6f YQm618w26e+9D3rpkgGmuNjnwVMrHy34UKJAyedpLRMcRl2BBHFDo37P/bd10LcdyvGr JazYHrTffGAC/mJ2Db93skWnHsA6T0r9XHGlhxBQEwfmwP3BkSTKa0ILkRILZv0W+WpD RWow== X-Gm-Message-State: AOAM530HEuq7BQqSuIZ11/pbPkLRJZcni9EqLOLSl+0dF1gFfOJqREab zuDDA7V7rCm7L9Zn3PLOWnA27FVRMHQ= X-Google-Smtp-Source: ABdhPJwyNPXbNijhsqAaAQsM5v0tMhjUWaBKp4ap7N+trIcZ19JD/P6xXjcKeG1ExDIrDg6JX8QemA== X-Received: by 2002:a5d:6144:: with SMTP id y4mr3357789wrt.185.1591258359734; Thu, 04 Jun 2020 01:12:39 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:39 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:12 +0200 Message-Id: <20200604081224.863494-7-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 06/18] drm/vblank: Annotate with dma-fence signalling section X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Daniel Vetter , =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This is rather overkill since currently all drivers call this from hardirq (or at least timers). But maybe in the future we're going to have thread irq handlers and what not, doesn't hurt to be prepared. Plus this is an easy start for sprinkling these fence annotations into shared code. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/drm_vblank.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c index 85e5f2db1608..93a5bba5f665 100644 --- a/drivers/gpu/drm/drm_vblank.c +++ b/drivers/gpu/drm/drm_vblank.c @@ -24,6 +24,7 @@ * OTHER DEALINGS IN THE SOFTWARE. */ +#include #include #include @@ -1908,7 +1909,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe) { struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; unsigned long irqflags; - bool disable_irq; + bool disable_irq, fence_cookie; if (drm_WARN_ON_ONCE(dev, !drm_dev_has_vblank(dev))) return false; @@ -1916,6 +1917,8 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe) if (drm_WARN_ON(dev, pipe >= dev->num_crtcs)) return false; + fence_cookie = dma_fence_begin_signalling(); + spin_lock_irqsave(&dev->event_lock, irqflags); /* Need timestamp lock to prevent concurrent execution with @@ -1928,6 +1931,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe) if (!vblank->enabled) { spin_unlock(&dev->vblank_time_lock); spin_unlock_irqrestore(&dev->event_lock, irqflags); + dma_fence_end_signalling(fence_cookie); return false; } @@ -1953,6 +1957,8 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe) if (disable_irq) vblank_disable_fn(&vblank->disable_timer); + dma_fence_end_signalling(fence_cookie); + return true; } EXPORT_SYMBOL(drm_handle_vblank); From patchwork Thu Jun 4 08:12:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587387 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E72A014E3 for ; Thu, 4 Jun 2020 08:13:12 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C5DDF207DA for ; Thu, 4 Jun 2020 08:13:12 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="A/nDmxB1" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C5DDF207DA Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 911C36E2E1; Thu, 4 Jun 2020 08:12:45 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by gabe.freedesktop.org (Postfix) with ESMTPS id 653ED6E2D7 for ; Thu, 4 Jun 2020 08:12:42 +0000 (UTC) Received: by mail-wm1-x344.google.com with SMTP id d128so4619615wmc.1 for ; Thu, 04 Jun 2020 01:12:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=nP9MOB+NRJcIZvAbkEER1wMU4XBwsFVpcXFNKANe0X8=; b=A/nDmxB1z0+GjCIm4Zq3jaNGjJ23cPpfIRlR5gm4EnJUVo5qYiGFagWY005kpAT8FA ArHPraQTxuq4p9JNPAua05uIHltLnWCPHrN4LJm936hW5tG5uT72nfIWJAuitOyRxRAc kB2mcQ2NolCjuilnD8ztQG9UfFpafIoWQgtOQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=nP9MOB+NRJcIZvAbkEER1wMU4XBwsFVpcXFNKANe0X8=; b=REc3LOgAQfn/UrfedXcgGN7kECbPHTERcFyE6yapes90me/R0XPAF3IDtYMy0xDLIB 4MG5DUA/TxwdqqaAmTdY/98zrDhWcWQi7qq2dt0Cbx8vNDALiprVE8ZWR7ReX00IXem5 YVKBcDbEtLe0WY7NO6UYfshv9aqN8oY9VtYfJRDWP6ErtA3bJSEJYzTkW4eQFMqM0dBT ib1Eu0qQ44fxZefgs+SaPqVow55H0zUc7Rb/kzcXSanGDUHSEZixfUZ87wDTFPqmeimb 8hnay/3rQZA+j1tvQopqc50m0gRIxtrHntdlAduEfjJKQ3arHsTxhJIXGZydrcbfK234 EJcA== X-Gm-Message-State: AOAM5302DukpcCOBvMEW+6eCVGZeNISANh9bVHmxmM7bcjtB+pHq5xK3 FlZUH72a/d7WZFmY3K5GAvLXog== X-Google-Smtp-Source: ABdhPJykIL9KNzrWb72K3NVL+V2g/6mdC3SH7CAQEoT09C7y7QbvAegOLkcv21ECmp0YTrkrfLSkzQ== X-Received: by 2002:a1c:9e13:: with SMTP id h19mr2939010wme.107.1591258361040; Thu, 04 Jun 2020 01:12:41 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:40 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:13 +0200 Message-Id: <20200604081224.863494-8-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 07/18] drm/atomic-helper: Add dma-fence annotations X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Daniel Vetter , =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This is a bit disappointing since we need to split the annotations over all the different parts. I was considering just leaking the critical section into the ->atomic_commit_tail callback of each driver. But that would mean we need to pass the fence_cookie into each driver (there's a total of 13 implementations of this hook right now), so bad flag day. And also a bit leaky abstraction. Hence just do it function-by-function. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/drm_atomic_helper.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index 7cd7fe0d57b4..bfcc7857a9a1 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -1549,6 +1549,7 @@ EXPORT_SYMBOL(drm_atomic_helper_wait_for_flip_done); void drm_atomic_helper_commit_tail(struct drm_atomic_state *old_state) { struct drm_device *dev = old_state->dev; + bool fence_cookie = dma_fence_begin_signalling(); drm_atomic_helper_commit_modeset_disables(dev, old_state); @@ -1560,6 +1561,8 @@ void drm_atomic_helper_commit_tail(struct drm_atomic_state *old_state) drm_atomic_helper_commit_hw_done(old_state); + dma_fence_end_signalling(fence_cookie); + drm_atomic_helper_wait_for_vblanks(dev, old_state); drm_atomic_helper_cleanup_planes(dev, old_state); @@ -1579,6 +1582,7 @@ EXPORT_SYMBOL(drm_atomic_helper_commit_tail); void drm_atomic_helper_commit_tail_rpm(struct drm_atomic_state *old_state) { struct drm_device *dev = old_state->dev; + bool fence_cookie = dma_fence_begin_signalling(); drm_atomic_helper_commit_modeset_disables(dev, old_state); @@ -1591,6 +1595,8 @@ void drm_atomic_helper_commit_tail_rpm(struct drm_atomic_state *old_state) drm_atomic_helper_commit_hw_done(old_state); + dma_fence_end_signalling(fence_cookie); + drm_atomic_helper_wait_for_vblanks(dev, old_state); drm_atomic_helper_cleanup_planes(dev, old_state); @@ -1606,6 +1612,9 @@ static void commit_tail(struct drm_atomic_state *old_state) ktime_t start; s64 commit_time_ms; unsigned int i, new_self_refresh_mask = 0; + bool fence_cookie; + + fence_cookie = dma_fence_begin_signalling(); funcs = dev->mode_config.helper_private; @@ -1634,6 +1643,8 @@ static void commit_tail(struct drm_atomic_state *old_state) if (new_crtc_state->self_refresh_active) new_self_refresh_mask |= BIT(i); + dma_fence_end_signalling(fence_cookie); + if (funcs && funcs->atomic_commit_tail) funcs->atomic_commit_tail(old_state); else @@ -1789,6 +1800,7 @@ int drm_atomic_helper_commit(struct drm_device *dev, bool nonblock) { int ret; + bool fence_cookie; if (state->async_update) { ret = drm_atomic_helper_prepare_planes(dev, state); @@ -1811,6 +1823,8 @@ int drm_atomic_helper_commit(struct drm_device *dev, if (ret) return ret; + fence_cookie = dma_fence_begin_signalling(); + if (!nonblock) { ret = drm_atomic_helper_wait_for_fences(dev, state, true); if (ret) @@ -1848,6 +1862,7 @@ int drm_atomic_helper_commit(struct drm_device *dev, */ drm_atomic_state_get(state); + dma_fence_end_signalling(fence_cookie); if (nonblock) queue_work(system_unbound_wq, &state->commit_work); else @@ -1856,6 +1871,7 @@ int drm_atomic_helper_commit(struct drm_device *dev, return 0; err: + dma_fence_end_signalling(fence_cookie); drm_atomic_helper_cleanup_planes(dev, state); return ret; } From patchwork Thu Jun 4 08:12:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587399 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1EA0E138C for ; Thu, 4 Jun 2020 08:13:16 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F166C207ED for ; Thu, 4 Jun 2020 08:13:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="WQ76x/1R" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F166C207ED Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D80006E2E5; Thu, 4 Jun 2020 08:12:45 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by gabe.freedesktop.org (Postfix) with ESMTPS id AF7FC6E2E4 for ; Thu, 4 Jun 2020 08:12:43 +0000 (UTC) Received: by mail-wm1-x343.google.com with SMTP id j198so6255468wmj.0 for ; Thu, 04 Jun 2020 01:12:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=RU/4ty0eEugeyrPrVEiQiwkrKqeJ+uZPrr4qGpCy8Ns=; b=WQ76x/1RuTy1JkP9aezKfVnljVf6mBF2k34u6CEzkCHMD5rjzO23Ol0qswF1Uqnn3r NZfMgpfPaWZjBnzc3b4R0o6daICNWxE3wsC5YBUn7XLwXhFfZLN5ZIKJ1vC7UEvOJvnQ jpY5agvtrUOJPNYvXTokA+XG56fPkvmyjcTDY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=RU/4ty0eEugeyrPrVEiQiwkrKqeJ+uZPrr4qGpCy8Ns=; b=ocXYk5gOAnGfU4DTzL0c3wVJCm7HpugqSe1Cmd3Ay8PeQ9Bg/BId6K95N7imcUnKLn yZw3I3Nt8UgEYF8hjxu4GVmtvSujAnSzmIytlpoQMiGUD2Zd7NGLLe8PP/WROw/uDzy5 kCG3+G6Giumdanex+TM5vU6Z/ZM8seYXLBQ7OSU4BAmA0xzbQ2RtQRFh0gx+JTmUidkv aSKX8ULcxs5IuPdrRvqDFAe62oF8HN26F70QJp6NOCK0R89I/43/8MJsPbTiYARqkcCz gS89BJW/uFhT/5JACF1pS/H008SFr2VkaB0M3Rb5Y1fL2ouCgH5xCmo8SXieKw+Yd1I2 utKg== X-Gm-Message-State: AOAM530BmRHwwxa5bUOOOwepPiYjGRuB6DiSSp5CgJY7YccaE6zEg8Hp 7DK/ilamylp842xQ0+qG6S7nNg== X-Google-Smtp-Source: ABdhPJzyEEdLMFWp1SPIqCkQHe7o3lL3hDIhqzkna+QlwTAI0JRQsL2kauEYBN5PgJQKziIXdiNizg== X-Received: by 2002:a1c:4008:: with SMTP id n8mr3030923wma.118.1591258362288; Thu, 04 Jun 2020 01:12:42 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:41 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:14 +0200 Message-Id: <20200604081224.863494-9-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 08/18] drm/amdgpu: add dma-fence annotations to atomic commit path X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Daniel Vetter , =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" I need a canary in a ttm-based atomic driver to make sure the dma_fence_begin/end_signalling annotations actually work. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index bdba0bfd6df1..adabfa929f42 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -57,6 +57,7 @@ #include "ivsrcid/ivsrcid_vislands30.h" +#include #include #include #include @@ -7320,6 +7321,9 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) struct drm_connector_state *old_con_state, *new_con_state; struct dm_crtc_state *dm_old_crtc_state, *dm_new_crtc_state; int crtc_disable_count = 0; + bool fence_cookie; + + fence_cookie = dma_fence_begin_signalling(); drm_atomic_helper_update_legacy_modeset_state(dev, state); @@ -7600,6 +7604,8 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) /* Signal HW programming completion */ drm_atomic_helper_commit_hw_done(state); + dma_fence_end_signalling(fence_cookie); + if (wait_for_vblank) drm_atomic_helper_wait_for_flip_done(dev, state); From patchwork Thu Jun 4 08:12:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587395 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7D929138C for ; Thu, 4 Jun 2020 08:13:14 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5C30B207DA for ; Thu, 4 Jun 2020 08:13:14 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="IEeSF84M" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5C30B207DA Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 61B836E2E6; Thu, 4 Jun 2020 08:12:46 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by gabe.freedesktop.org (Postfix) with ESMTPS id E891F6E2E1 for ; Thu, 4 Jun 2020 08:12:44 +0000 (UTC) Received: by mail-wm1-x343.google.com with SMTP id r15so4611006wmh.5 for ; Thu, 04 Jun 2020 01:12:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=QpRcEBLBFti7+uwIPhFpaG5e28opFzcZsYVebGatC0I=; b=IEeSF84MYGmRcfj4sMCd4J+zYZKwLXaJxMSpi3fUTEKdsfossRgSDuWbe+zZ21xpAx 3y3tAVuXjHWRsIi+zJrg554cTrk73ezXwhTX3+vOBsrV//Okft/By/0MmGxexiInifje DtXukw743V5Gql830qv9EKpT7CFp/fzcgsUTM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=QpRcEBLBFti7+uwIPhFpaG5e28opFzcZsYVebGatC0I=; b=qwlr6+BBznAYXBfjx1AcIe26UBdGq+jsARjO95W4B5QqZ9EqzFoQwyLzdWfDXuTLcv Cb3xbGkEWnSk2xMgDW7gowSVm11Hvpp91AkidgCmvVdDG0DfXk0+d19RknXqhGimRRjm XJV3jYUMcEQNsvjgA6qgk/RH5kNYPQApxvaFbl8sLgngc23CJerFNpUzqqGEzvHhAXm3 MS1sGo83IcKNLZgj3Pf42jGzh41dwaWylnAW3vT/09RE1AsOKQRuQ0jnraJ9WsBHKUe0 TYGWqb71H61aPUwiYbx9h8KFGvXTeREzWNs/uvgMkjf3ab3ZrdKpqbS0ldoFhVzc7cYd ISnQ== X-Gm-Message-State: AOAM532twqVgFIA/xjSbj54zwer2KdnYbWTyXDlmIShr0zDsuNlyKXAC ZS3EzXm19Mt8RiZgPUfcGD9SLg== X-Google-Smtp-Source: ABdhPJzK094CFf7KReF2ex18c6m+K0h2mhk+2HMb53cWYgjLEFwQL7H9L4hoB3ty2HpwgAOoCcUgiQ== X-Received: by 2002:a1c:5502:: with SMTP id j2mr3075373wmb.56.1591258363614; Thu, 04 Jun 2020 01:12:43 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:42 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:15 +0200 Message-Id: <20200604081224.863494-10-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 09/18] drm/scheduler: use dma-fence annotations in main thread X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Daniel Vetter , =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" If the scheduler rt thread gets stuck on a mutex that we're holding while waiting for gpu workloads to complete, we have a problem. Add dma-fence annotations so that lockdep can check this for us. I've tried to quite carefully review this, and I think it's at the right spot. But obviosly no expert on drm scheduler. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/scheduler/sched_main.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 2f319102ae9f..06a736e506ad 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -763,9 +763,12 @@ static int drm_sched_main(void *param) struct sched_param sparam = {.sched_priority = 1}; struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param; int r; + bool fence_cookie; sched_setscheduler(current, SCHED_FIFO, &sparam); + fence_cookie = dma_fence_begin_signalling(); + while (!kthread_should_stop()) { struct drm_sched_entity *entity = NULL; struct drm_sched_fence *s_fence; @@ -823,6 +826,9 @@ static int drm_sched_main(void *param) wake_up(&sched->job_scheduled); } + + dma_fence_end_signalling(fence_cookie); + return 0; } From patchwork Thu Jun 4 08:12:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587403 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AE18490 for ; Thu, 4 Jun 2020 08:13:18 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8CB2A208B6 for ; Thu, 4 Jun 2020 08:13:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="lnZAfT6Y" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8CB2A208B6 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 259996E303; Thu, 4 Jun 2020 08:12:48 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by gabe.freedesktop.org (Postfix) with ESMTPS id 47F156E2E9 for ; Thu, 4 Jun 2020 08:12:46 +0000 (UTC) Received: by mail-wm1-x343.google.com with SMTP id k26so4623606wmi.4 for ; Thu, 04 Jun 2020 01:12:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=JO6albsu9ail3aAxe1jjEWcSoPjULvK53jZ9O0zdFWo=; b=lnZAfT6Yff+IJyU+6wsUrqsOgP2Gj+BMlZtMrYghbDpPPGSv/zoqP2grdoO4AOw05E uLOJb/YvDpIETJrOA8RWtcCTW3/8iagYeIGfCzBgEwAJEkP3SyZ5oE2wCQuq2Ve1iyQR bh1mKlhPqlAZFzfugxhI6xRlys9QcfRU1N110= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=JO6albsu9ail3aAxe1jjEWcSoPjULvK53jZ9O0zdFWo=; b=gecMgVxgkn2CuXJYGut6Z/fmQmJZ4wwJW3H/inVyoxQhlaJtdgKPn2IHMiLbYz2xY1 ggHwe2TuuVc7LV7F+caQCpOs72VmqwWypnDl9DrA+blznJ7736gMdIfTu5NWr6OgB5W+ 4pyGQGi1xlvyfYgomyJpxOIREzYpXMn0+iquGKK9snF0siXBS+IuCs+hxb/CVNvfpyYU /f7cHEp0eYYbzzYqMyxMCRFTdH1bLN9FsSWHBAcXP3J22bvFHlOWe3dGgfrSDkWLldtJ wCEuwssJN8+jMibFKU4iCcZ0S6IT/mHiAeqPklSUuoz2y0iFSVdbAakHSBJpg9LXeyZy xQag== X-Gm-Message-State: AOAM530s6L56Qaa3zsXiNn+JNm8cRGyism8enInERQR1lKv/H+U4uw6u 0h8uYBNLTeK+qOTclfLkz/IHK7DAVUI= X-Google-Smtp-Source: ABdhPJyn8WC4htH2TFhAJzXT32NdRtoD972Y+p3loDxohFz/ONHD+IEVEXE5SDKPFz9SSfLivB0maw== X-Received: by 2002:a7b:c385:: with SMTP id s5mr3041559wmj.121.1591258364877; Thu, 04 Jun 2020 01:12:44 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:44 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:16 +0200 Message-Id: <20200604081224.863494-11-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 10/18] drm/amdgpu: use dma-fence annotations in cs_submit() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Daniel Vetter , =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This is a bit tricky, since ->notifier_lock is held while calling dma_fence_wait we must ensure that also the read side (i.e. dma_fence_begin_signalling) is on the same side. If we mix this up lockdep complaints, and that's again why we want to have these annotations. A nice side effect of this is that because of the fs_reclaim priming for dma_fence_enable lockdep now automatically checks for us that nothing in here allocates memory, without even running any userptr workloads. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index a25fb59c127c..e109666aec14 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1212,6 +1212,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, struct amdgpu_job *job; uint64_t seq; int r; + bool fence_cookie; job = p->job; p->job = NULL; @@ -1226,6 +1227,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, */ mutex_lock(&p->adev->notifier_lock); + fence_cookie = dma_fence_begin_signalling(); + /* If userptr are invalidated after amdgpu_cs_parser_bos(), return * -EAGAIN, drmIoctl in libdrm will restart the amdgpu_cs_ioctl. */ @@ -1262,12 +1265,14 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm); ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence); + dma_fence_end_signalling(fence_cookie); mutex_unlock(&p->adev->notifier_lock); return 0; error_abort: drm_sched_job_cleanup(&job->base); + dma_fence_end_signalling(fence_cookie); mutex_unlock(&p->adev->notifier_lock); error_unlock: From patchwork Thu Jun 4 08:12:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587411 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6E4B2138C for ; Thu, 4 Jun 2020 08:13:22 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4D48E206C3 for ; Thu, 4 Jun 2020 08:13:22 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="dPeG6Vwr" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4D48E206C3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E10326E312; Thu, 4 Jun 2020 08:12:49 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9E48D6E30C for ; Thu, 4 Jun 2020 08:12:47 +0000 (UTC) Received: by mail-wm1-x344.google.com with SMTP id c71so4284214wmd.5 for ; Thu, 04 Jun 2020 01:12:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=r5PRcG/JtprEwCOUf2/ZPMLqKS9RbW1APrGVc2im3Nw=; b=dPeG6VwrGWYvNwNN0ZEMIw4vWjkVci2qTgMSxope6UETE3fRl01e8viI9oFa4083Jj DdYccEo90yhy8XH5cdnxNbrh1jbM/izjkui8sSYKCs2y4PK9kM6I2sbEqQBA/eIPT+Ca 9IeXeC1lYJxLhEUT92aKE6kgkpD4NJkeUxG+k= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=r5PRcG/JtprEwCOUf2/ZPMLqKS9RbW1APrGVc2im3Nw=; b=RXNeowbaj4DrC8cSc2CN5PyCBA1de7nmBZ1rW0BC/HovPHmk8hHrc99xNhM1W64uds dkmAt+182gqUUVbSTxHNPRUsV4D4geXyOqV2lMsSPrSeXn1A779C8i7upSZ+337/wqqh vS+9faJjbbLBsS63oMvSJkhC0G6cj3dLET/4r0ufMZwSEF5TXiG+s6qYWK0FkRW5J1J7 lltbkq88k9ZU1dKAZIfZxPnOgvL6EWWJJZImzF+JjLRLXxF+1BwV8aQT4LRsJ/Y1aard ibFgoV1mWdh36orHVINvKuuC0lO0Ddjf/cffS9kGINsjqjx0knSVwWfu5SZmNaD/MFDc Wapw== X-Gm-Message-State: AOAM533NWUc66tGcsMKHULdMI+fJpoDmGyoQysicbwOvWedN11T2IevO FHZenN96/xtHoyEgzRzQaa5Y5C/Y/+A= X-Google-Smtp-Source: ABdhPJxlvHiwLUKpMc/fibX83rHdeLrkjFY5wrTostf2iwbBSmbJPCuso4jxxSuY+d7sEL6BpjrzbQ== X-Received: by 2002:a1c:7305:: with SMTP id d5mr2952331wmb.85.1591258366300; Thu, 04 Jun 2020 01:12:46 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:45 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:17 +0200 Message-Id: <20200604081224.863494-12-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 11/18] drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Daniel Vetter , =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" My dma-fence lockdep annotations caught an inversion because we allocate memory where we really shouldn't: kmem_cache_alloc+0x2b/0x6d0 amdgpu_fence_emit+0x30/0x330 [amdgpu] amdgpu_ib_schedule+0x306/0x550 [amdgpu] amdgpu_job_run+0x10f/0x260 [amdgpu] drm_sched_main+0x1b9/0x490 [gpu_sched] kthread+0x12e/0x150 Trouble right now is that lockdep only validates against GFP_FS, which would be good enough for shrinkers. But for mmu_notifiers we actually need !GFP_ATOMIC, since they can be called from any page laundering, even if GFP_NOFS or GFP_NOIO are set. I guess we should improve the lockdep annotations for fs_reclaim_acquire/release. Ofc real fix is to properly preallocate this fence and stuff it into the amdgpu job structure. But GFP_ATOMIC gets the lockdep splat out of the way. v2: Two more allocations in scheduler paths. Frist one: __kmalloc+0x58/0x720 amdgpu_vmid_grab+0x100/0xca0 [amdgpu] amdgpu_job_dependency+0xf9/0x120 [amdgpu] drm_sched_entity_pop_job+0x3f/0x440 [gpu_sched] drm_sched_main+0xf9/0x490 [gpu_sched] Second one: kmem_cache_alloc+0x2b/0x6d0 amdgpu_sync_fence+0x7e/0x110 [amdgpu] amdgpu_vmid_grab+0x86b/0xca0 [amdgpu] amdgpu_job_dependency+0xf9/0x120 [amdgpu] drm_sched_entity_pop_job+0x3f/0x440 [gpu_sched] drm_sched_main+0xf9/0x490 [gpu_sched] Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index d878fe7fee51..055b47241bb1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -143,7 +143,7 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f, uint32_t seq; int r; - fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_KERNEL); + fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_ATOMIC); if (fence == NULL) return -ENOMEM; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c index fe92dcd94d4a..fdcd6659f5ad 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c @@ -208,7 +208,7 @@ static int amdgpu_vmid_grab_idle(struct amdgpu_vm *vm, if (ring->vmid_wait && !dma_fence_is_signaled(ring->vmid_wait)) return amdgpu_sync_fence(sync, ring->vmid_wait, false); - fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_KERNEL); + fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_ATOMIC); if (!fences) return -ENOMEM; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c index b87ca171986a..330476cc0c86 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c @@ -168,7 +168,7 @@ int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f, if (amdgpu_sync_add_later(sync, f, explicit)) return 0; - e = kmem_cache_alloc(amdgpu_sync_slab, GFP_KERNEL); + e = kmem_cache_alloc(amdgpu_sync_slab, GFP_ATOMIC); if (!e) return -ENOMEM; From patchwork Thu Jun 4 08:12:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587415 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CA183138C for ; Thu, 4 Jun 2020 08:13:23 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A94F62074B for ; Thu, 4 Jun 2020 08:13:23 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="VVvxH1bH" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A94F62074B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4CA4689322; Thu, 4 Jun 2020 08:12:52 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by gabe.freedesktop.org (Postfix) with ESMTPS id E7FDC6E2F9 for ; Thu, 4 Jun 2020 08:12:48 +0000 (UTC) Received: by mail-wm1-x341.google.com with SMTP id f5so4624677wmh.2 for ; Thu, 04 Jun 2020 01:12:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=OK1ZbpoTNfccVSYFcsXj8JWZN2H8wb6NBSYBSiY0g+w=; b=VVvxH1bHV9p97ank/8LpaorTHKGsJwZyd+JirwJXy0+qZG4TB8H7Oydr9fEe+RFC1/ k1imVqgdJmEqRkZpgw7mIlLcSS5DA0RAvO2zX5M6vZPa7o9dsbw1v3gmCBgImxZgGaXs civc/FbkkhZ5D3KeU5JL9OxvPlSRme8vDYfYA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=OK1ZbpoTNfccVSYFcsXj8JWZN2H8wb6NBSYBSiY0g+w=; b=CUOyBYd4nt9dNxMMjGNza4qqzAVBuX8uaA0jEUUCRGMObq65LMFC3HyrZEoGLnRsDo 7Gu9GNBgPvV8vKWt2duRbRHfdULoWNER8GSU0Lq5M8wg/UxyCwRGgHDq95FdX88zex4r hEkLZgDJgrD1GsTYw2dz5457cZqpcO8etNbnskWcC5xM8lQpqPkJDeAIxIubWMJcYhj+ D9v/kd0+vz3j4qNn4AXytKkxqaIcwriSSyE1hi2yDON1OGwhbNazxj/Tja08wwT6j/Cp 3ZiE/WGZWuEoyeHrAR7t0FfBGiD4GQGu4e68Kw3cjC6JdX4ToXUJtzqv10xLOX94aQy8 jNUQ== X-Gm-Message-State: AOAM531v2UbINUDVmY2fgI7KJUWklx+H00mLeBOby81OiOCLQptlqYKt ALdjEWjq7FeUcaQYXao1IG9KKg== X-Google-Smtp-Source: ABdhPJxSRgyDL97CLiAzJiTWnOVc3+E0KQ864iKHEkQpN+0FvKvYO/V8cL0dvx5C878kh5RZWIw8RA== X-Received: by 2002:a1c:df57:: with SMTP id w84mr3077102wmg.52.1591258367556; Thu, 04 Jun 2020 01:12:47 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:46 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:18 +0200 Message-Id: <20200604081224.863494-13-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 12/18] drm/amdgpu: DC also loves to allocate stuff where it shouldn't X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Daniel Vetter , =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Not going to bother with a complete&pretty commit message, just offending backtrace: kvmalloc_node+0x47/0x80 dc_create_state+0x1f/0x60 [amdgpu] dc_commit_state+0xcb/0x9b0 [amdgpu] amdgpu_dm_atomic_commit_tail+0xd31/0x2010 [amdgpu] commit_tail+0xa4/0x140 [drm_kms_helper] drm_atomic_helper_commit+0x152/0x180 [drm_kms_helper] drm_client_modeset_commit_atomic+0x1ea/0x250 [drm] drm_client_modeset_commit_locked+0x55/0x190 [drm] drm_client_modeset_commit+0x24/0x40 [drm] v2: Found more in DC code, I'm just going to pile them all up. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/amdgpu/atom.c | 2 +- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- drivers/gpu/drm/amd/display/dc/core/dc.c | 4 +++- 3 files changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/atom.c b/drivers/gpu/drm/amd/amdgpu/atom.c index 4cfc786699c7..1b0c674fab25 100644 --- a/drivers/gpu/drm/amd/amdgpu/atom.c +++ b/drivers/gpu/drm/amd/amdgpu/atom.c @@ -1226,7 +1226,7 @@ static int amdgpu_atom_execute_table_locked(struct atom_context *ctx, int index, ectx.abort = false; ectx.last_jump = 0; if (ws) - ectx.ws = kcalloc(4, ws, GFP_KERNEL); + ectx.ws = kcalloc(4, ws, GFP_ATOMIC); else ectx.ws = NULL; diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index adabfa929f42..c575e7394d03 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -6833,7 +6833,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state, struct dc_stream_update stream_update; } *bundle; - bundle = kzalloc(sizeof(*bundle), GFP_KERNEL); + bundle = kzalloc(sizeof(*bundle), GFP_ATOMIC); if (!bundle) { dm_error("Failed to allocate update bundle\n"); diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index 45cfb7c45566..9a8e321a7a15 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -1416,8 +1416,10 @@ bool dc_post_update_surfaces_to_stream(struct dc *dc) struct dc_state *dc_create_state(struct dc *dc) { + /* No you really cant allocate random crap here this late in + * atomic_commit_tail. */ struct dc_state *context = kvzalloc(sizeof(struct dc_state), - GFP_KERNEL); + GFP_ATOMIC); if (!context) return NULL; From patchwork Thu Jun 4 08:12:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587423 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 465FF90 for ; Thu, 4 Jun 2020 08:13:27 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 24F6E206C3 for ; Thu, 4 Jun 2020 08:13:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="ifotk2PJ" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 24F6E206C3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2A48289298; Thu, 4 Jun 2020 08:12:53 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by gabe.freedesktop.org (Postfix) with ESMTPS id 267D76E2D5 for ; Thu, 4 Jun 2020 08:12:50 +0000 (UTC) Received: by mail-wr1-x444.google.com with SMTP id x13so5043075wrv.4 for ; Thu, 04 Jun 2020 01:12:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=zHLq9yPZgKKERQ/k46N5wAtq1tEa3e90dS7EqKAAVV8=; b=ifotk2PJMfPqXT/+gwn4eR7/3IWQV+W9DCcY7OyrRdaAZFQwKftRJdQXo0rpUM3/yB gFJrkbneOCQ2QX77TBZL8U/VuKVxgzOxX4tlg9WC2CbjGOvgEb6AVwmiQ15EpDYBd7lG Tzy0RTnpPG+HiMGAwz3SGe+WNdJdbJrdEf6D0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=zHLq9yPZgKKERQ/k46N5wAtq1tEa3e90dS7EqKAAVV8=; b=qjfb7V9fwpXkNpZ7qKqRtxWQuZXHv39C9QSA2+cTkc2z7O7OlKEIpsJV/h2fzz4UMh H/2jD5WTjGP9vk3mHIT/z0j69S1QJQ0muAh168X8TmxrcYOnKoLAvt4juNQVWgAGbfAb aKuHgKtpbqQjCPvjS/yMeeHjiAR//AYAGFWqVCyl9RkLqhIcTPLtq1AimTkMqIu81UN9 XtiSGbWQiY/o9u+3Qm0EXxBz5VIINMfCxMQ35rBIt4aicRjl26RsOeoYZYzw4xOEjI0A tvFrMJADpqy1G8N8iKyb9oFmTDdm67grV/ZM4xhT6HxY60zBdcZU5ltA76PFHvIt4Q+o 9+dg== X-Gm-Message-State: AOAM531cUyj+2joXs+MApEVbyPQyLzUhTcS63NrJgviz5vZDH/fBO12Z s+UlOXS2FaKuYXd/fXyzVZF8AA== X-Google-Smtp-Source: ABdhPJwmXgxb0rRlbz9RyKT6KVGNFYaugD+jseyFJKYlXk9wNmaSF3eMQq44tjneQVcPWbVrYiOUJw== X-Received: by 2002:adf:fb0f:: with SMTP id c15mr3424740wrr.410.1591258368775; Thu, 04 Jun 2020 01:12:48 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:48 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:19 +0200 Message-Id: <20200604081224.863494-14-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 13/18] drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Daniel Vetter , =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Trying to grab dma_resv_lock while in commit_tail before we've done all the code that leads to the eventual signalling of the vblank event (which can be a dma_fence) is deadlock-y. Don't do that. Here the solution is easy because just grabbing locks to read something races anyway. We don't need to bother, READ_ONCE is equivalent. And avoids the locking issue. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index c575e7394d03..04c11443b9ca 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -6910,7 +6910,11 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state, * explicitly on fences instead * and in general should be called for * blocking commit to as per framework helpers + * + * Yes, this deadlocks, since you're calling dma_resv_lock in a + * path that leads to a dma_fence_signal(). Don't do that. */ +#if 0 r = amdgpu_bo_reserve(abo, true); if (unlikely(r != 0)) DRM_ERROR("failed to reserve buffer before flip\n"); @@ -6920,6 +6924,12 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state, tmz_surface = amdgpu_bo_encrypted(abo); amdgpu_bo_unreserve(abo); +#endif + /* + * this races anyway, so READ_ONCE isn't any better or worse + * than the stuff above. Except the stuff above can deadlock. + */ + tiling_flags = READ_ONCE(abo->tiling_flags); fill_dc_plane_info_and_addr( dm->adev, new_plane_state, tiling_flags, From patchwork Thu Jun 4 08:12:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587435 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 38C2A90 for ; Thu, 4 Jun 2020 08:13:34 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 178362075B for ; Thu, 4 Jun 2020 08:13:34 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="MQtZQlsV" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 178362075B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 386F66E30F; Thu, 4 Jun 2020 08:12:56 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by gabe.freedesktop.org (Postfix) with ESMTPS id 72EC16E316 for ; Thu, 4 Jun 2020 08:12:51 +0000 (UTC) Received: by mail-wr1-x441.google.com with SMTP id r7so5067842wro.1 for ; Thu, 04 Jun 2020 01:12:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=cSDqH5ij1BPhN02yjRk7X7Ch94EKjTTeY1hANr+YPIg=; b=MQtZQlsVT4D6M/TZ/TTIidOW50d1o6Q0AiFZHAR30zUPft45srmFqn7Q/chHCQgEwb QuUQyDP7PvsPZgFWkvGTZI4QJxfIwFPjWqa5WQ26vsMQBqacHIfA9jXi1YXDD4rvkBs/ wmbeqc0eo5WnTY22+0zRQvessA2Qhc3AShHpw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=cSDqH5ij1BPhN02yjRk7X7Ch94EKjTTeY1hANr+YPIg=; b=INQZQBko2ViLVLoIVCb5gsQihe175qYJOX8fW09aUV+GZEzSsUrscuxu2I83gZ8Fx3 NKrdWDNnyWF7jzwPwvwJytAdgPS1QgsTwDarwg/RZzxtMe0i2id1bXSLVlRkAd2AM5r2 5kUeMkQvBCJKgQLO/qByDGgtTS5bh0HGUA3w7I+0I7l3rtH5AjPNdAUO4qjpIy4ZMxKh LukClEFucloFPJG7UifrUHAZz0bqBGfjys+Q46I9h5L2ynmB9ueMUrqxFsj7U3AUzfdh wTSj0m2Q5KZsjEPnrGST3rbwVwuiy/F+5fCty9UGPDlczph9ZpRUtDFGSYbNsBtqF/RA LwVg== X-Gm-Message-State: AOAM530MK113gDCRCHqQ75DI19JIKHSfiTIUYH/L3nrOF8vGcSok+ssE t/J1ASdAtwEnR/30YaMGlJCAnA== X-Google-Smtp-Source: ABdhPJwTJtq4ZMbXkz9yUomRGTcjfw0qYxpotGG6g0Tb3unDYU1epqhHkACAq8ejGSpX2fZsvYsoJQ== X-Received: by 2002:a05:6000:7:: with SMTP id h7mr3471787wrx.55.1591258370059; Thu, 04 Jun 2020 01:12:50 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:49 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:20 +0200 Message-Id: <20200604081224.863494-15-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 14/18] drm/scheduler: use dma-fence annotations in tdr work X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Daniel Vetter , =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" In the face of unpriviledged userspace being able to submit bogus gpu workloads the kernel needs gpu timeout and reset (tdr) to guarantee that dma_fences actually complete. Annotate this worker to make sure we don't have any accidental locking inversions or other problems lurking. Originally this was part of the overall scheduler annotation patch. But amdgpu has some glorious inversions here: - grabs console_lock - does a full modeset, which grabs all kinds of locks (drm_modeset_lock, dma_resv_lock) which can deadlock with dma_fence_wait held inside them. - almost minor at that point, but the modeset code also allocates memory These all look like they'll be very hard to fix properly, the hardware seems to require a full display reset with any gpu recovery. Hence split out as a seperate patch. Since amdgpu isn't the only hardware driver that needs to reset the display (at least gen2/3 on intel have the same problem) we need a generic solution for this. There's two tricks we could still from drm/i915 and lift to dma-fence: - The big whack, aka force-complete all fences. i915 does this for all pending jobs if the reset is somehow stuck. Trouble is we'd need to do this for all fences in the entire system, and just the book-keeping for that will be fun. Plus lots of drivers use fences for all kinds of internal stuff like memory management, so unconditionally resetting all of them doesn't work. I'm also hoping that with these fence annotations we could enlist lockdep in finding the last offenders causing deadlocks, and we could remove this get-out-of-jail trick. - The more feasible approach (across drivers at least as part of the dma_fence contract) is what drm/i915 does for gen2/3: When we need to reset the display we wake up all dma_fence_wait_interruptible calls, or well at least the equivalent of those in i915 internally. Relying on ioctl restart we force all other threads to release their locks, which means the tdr thread is guaranteed to be able to get them. I think we could implement this at the dma_fence level, including proper lockdep annotations. dma_fence_begin_tdr(): - must be nested within a dma_fence_begin/end_signalling section - will wake up all interruptible (but not the non-interruptible) dma_fence_wait() calls and force them to complete with a -ERESTARTSYS errno code. All new interrupitble calls to dma_fence_wait() will immeidately fail with the same error code. dma_fence_end_trdr(): - this will convert dma_fence_wait() calls back to normal. Of course interrupting dma_fence_wait is only ok if the caller specified that, which means we need to split the annotations into interruptible and non-interruptible version. If we then make sure that we only use interruptible dma_fence_wait() calls while holding drm_modeset_lock we can grab them in tdr code, and allow display resets. Doing the same for dma_resv_lock might be a lot harder, so buffer updates must be avoided. What's worse, we're not going to be able to make the dma_fence_wait calls in mmu-notifiers interruptible, that doesn't work. So allocating memory still wont' be allowed, even in tdr sections. Plus obviously we can use this trick only in tdr, it is rather intrusive. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/scheduler/sched_main.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 06a736e506ad..e34a44376e87 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -279,9 +279,12 @@ static void drm_sched_job_timedout(struct work_struct *work) { struct drm_gpu_scheduler *sched; struct drm_sched_job *job; + bool fence_cookie; sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); + fence_cookie = dma_fence_begin_signalling(); + /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ spin_lock(&sched->job_list_lock); job = list_first_entry_or_null(&sched->ring_mirror_list, @@ -313,6 +316,8 @@ static void drm_sched_job_timedout(struct work_struct *work) spin_lock(&sched->job_list_lock); drm_sched_start_timeout(sched); spin_unlock(&sched->job_list_lock); + + dma_fence_end_signalling(fence_cookie); } /** From patchwork Thu Jun 4 08:12:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587419 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7EB3790 for ; Thu, 4 Jun 2020 08:13:25 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5D47F206C3 for ; Thu, 4 Jun 2020 08:13:25 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="Cw6DE8ip" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5D47F206C3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A3A9689803; Thu, 4 Jun 2020 08:12:53 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by gabe.freedesktop.org (Postfix) with ESMTPS id A1E9089452 for ; Thu, 4 Jun 2020 08:12:52 +0000 (UTC) Received: by mail-wr1-x444.google.com with SMTP id p5so5015785wrw.9 for ; Thu, 04 Jun 2020 01:12:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=mPAnyNEtS0uY+bUuM8GxOFOVsLMIhBxwz7ItIj7lR3A=; b=Cw6DE8iprZ3uQg5gII/9i5sihculizd8rViFXjzCXSdVxul9CqWdFgOfY71gVkGiuV juGxO+Tm75A9A2e/W5n2HP7py+rVnr9u2LxOwSRBXRkhwJHpkeuFJDFWdS3KfjcSOmQG RdZmXaDOTVBzfxFPq9ypg/Dpc5CyQyRF0HX2w= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=mPAnyNEtS0uY+bUuM8GxOFOVsLMIhBxwz7ItIj7lR3A=; b=Izyqu8x4xIawq0f2EnD12mCsfsPy0cEQ/SX38Oo24WqVvygJzUXce643FtHOOeIu50 gM5XWkUg1/MVkhSp5+yHFiLxuvEpE82Mt/DkhAZ47gTvlnlivETSrSr+a6JW9Zmod0/0 2nYsEVw/e3FPiOagesEeUXxnV7JTqZ2cvo3KK88LulxSyIbwz6dH6NIpEcC/vzGrpBr/ 3JqxBz5dHWNZOp9Q4R9AIayojEOSUfwJ1FF73s6ENhwGG4gZ8c72a4Y+9WVRFn0iM2xY tajf2lHu1FQMt9h0yok4os5bJor24yxfLJ039XhGoUt0irUs71fDXJ/MAHn43G0Qxgfz 8DAw== X-Gm-Message-State: AOAM531hmO2wHLOfOz/fIHIroFJxtVFbaRsj3Z/1npnxQgzDH+SaciW5 EZheUFb6EaYwYF6QULZmOCcr7Q== X-Google-Smtp-Source: ABdhPJzqUtEw7Ja3sjtBGx+terae6/AGLT+5Iphc3r7rUYdY/P0UYh8UASXP+HWl66kBwvknWv+Dcg== X-Received: by 2002:a5d:6391:: with SMTP id p17mr3458296wru.118.1591258371351; Thu, 04 Jun 2020 01:12:51 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:50 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:21 +0200 Message-Id: <20200604081224.863494-16-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 15/18] drm/amdgpu: use dma-fence annotations for gpu reset code X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" To improve coverage also annotate the gpu reset code itself, since that's called from other places than drm/scheduler (which is already annotated). Annotations nests, so this doesn't break anything, and allows easier testing. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index a027a8f7b281..ac0286a5f2fc 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4215,6 +4215,9 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) ? true : false; bool audio_suspended = false; + bool fence_cookie; + + fence_cookie = dma_fence_begin_signalling(); /* * Flush RAM to disk so that after reboot @@ -4243,6 +4246,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another already in progress", job ? job->base.id : -1, hive->hive_id); mutex_unlock(&hive->hive_lock); + dma_fence_end_signalling(fence_cookie); return 0; } @@ -4253,8 +4257,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, */ INIT_LIST_HEAD(&device_list); if (adev->gmc.xgmi.num_physical_nodes > 1) { - if (!hive) + if (!hive) { + dma_fence_end_signalling(fence_cookie); return -ENODEV; + } if (!list_is_first(&adev->gmc.xgmi.head, &hive->device_list)) list_rotate_to_front(&adev->gmc.xgmi.head, &hive->device_list); device_list_handle = &hive->device_list; @@ -4269,6 +4275,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, DRM_INFO("Bailing on TDR for s_job:%llx, as another already in progress", job ? job->base.id : -1); mutex_unlock(&hive->hive_lock); + dma_fence_end_signalling(fence_cookie); return 0; } @@ -4409,6 +4416,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, if (r) dev_info(adev->dev, "GPU reset end with ret = %d\n", r); + dma_fence_end_signalling(fence_cookie); return r; } From patchwork Thu Jun 4 08:12:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587425 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A4995138C for ; Thu, 4 Jun 2020 08:13:28 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8340D206C3 for ; Thu, 4 Jun 2020 08:13:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="g2y6+38x" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8340D206C3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 26578899B5; Thu, 4 Jun 2020 08:12:55 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by gabe.freedesktop.org (Postfix) with ESMTPS id D4A2E8984F for ; Thu, 4 Jun 2020 08:12:53 +0000 (UTC) Received: by mail-wr1-x442.google.com with SMTP id e1so5053535wrt.5 for ; Thu, 04 Jun 2020 01:12:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=PAzY52AYv0RSDf9knhitt5KJNjp0ADK9OcL/la6MhG4=; b=g2y6+38xV996zbWQlHRfYfwmdoYrQtMSB9AuhumQ8EuoFThzOqHcB0+rQlar3nhlyH mzJYMNcNs5v0zXm1taJdTlBWcRFGmIG8H58vFVYh2UJyXnrrsWGD/zJxugQTr50yWENI OrKBQ9/cJ2DiSb8hsPkl8FtzjA9HerNQ7h37o= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=PAzY52AYv0RSDf9knhitt5KJNjp0ADK9OcL/la6MhG4=; b=okbeaqqr/AiL08yNd6gHu5c9l8cbbr03uxqrm0ehXyWWnORjaVfguZUJalZspkV0+v RIu8gkWS1uua0JoX9KpIm4gijTZflji8hvfHpN/ePjw4hlvcya1c6ngj+lpcsukvg1M8 Bw7PefonymlSOot435zI6CgqcHQnPiRXQyhJef9M238PnbDYf3/pieKvsMdZxvzFRTLM 06X2WPppHkF/x25ZWlGmw/nKCA5VjZ1GilmA0DG/XdeYr5lQk+hWrnp6vXVjBktwLO0Y cloAG7XnUX6RetNtP3DN9nHR0Oh22py1G0Mxj09PUX60BekwtUw/tzC7mJOEuPrJKgMq yJkA== X-Gm-Message-State: AOAM532h9RdN4p+XhuNjSd9CUUji47BIfAQwbNOKOF8xes4wqYvcIVRy 2dC5gdeT+A1vhiJJisqxn8EesA== X-Google-Smtp-Source: ABdhPJws7BZsTbZqn46ygWPxsi69ZQ9EdQZlg8FdQb2SX5+xMmbZiTCisZQm8VfjwqondDqR2/ssFg== X-Received: by 2002:a5d:4d89:: with SMTP id b9mr3497599wru.210.1591258372493; Thu, 04 Jun 2020 01:12:52 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:51 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:22 +0200 Message-Id: <20200604081224.863494-17-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 16/18] Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset" X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Daniel Vetter , =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This is one from the department of "maybe play lottery if you hit this, karma compensation might work". Or at least lockdep ftw! This reverts commit 565d1941557756a584ac357d945bc374d5fcd1d0. It's not quite as low-risk as the commit message claims, because this grabs console_lock, which might be held when we allocate memory, which might never happen because the dma_fence_wait() is stuck waiting on our gpu reset: [ 136.763714] ====================================================== [ 136.763714] WARNING: possible circular locking dependency detected [ 136.763715] 5.7.0-rc3+ #346 Tainted: G W [ 136.763716] ------------------------------------------------------ [ 136.763716] kworker/2:3/682 is trying to acquire lock: [ 136.763716] ffffffff8226f140 (console_lock){+.+.}-{0:0}, at: drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper] [ 136.763723] but task is already holding lock: [ 136.763724] ffffffff82318c80 (dma_fence_map){++++}-{0:0}, at: drm_sched_job_timedout+0x25/0xf0 [gpu_sched] [ 136.763726] which lock already depends on the new lock. [ 136.763726] the existing dependency chain (in reverse order) is: [ 136.763727] -> #2 (dma_fence_map){++++}-{0:0}: [ 136.763730] __dma_fence_might_wait+0x41/0xb0 [ 136.763732] dma_resv_lockdep+0x171/0x202 [ 136.763734] do_one_initcall+0x5d/0x2f0 [ 136.763736] kernel_init_freeable+0x20d/0x26d [ 136.763738] kernel_init+0xa/0xfb [ 136.763740] ret_from_fork+0x27/0x50 [ 136.763740] -> #1 (fs_reclaim){+.+.}-{0:0}: [ 136.763743] fs_reclaim_acquire.part.0+0x25/0x30 [ 136.763745] kmem_cache_alloc_trace+0x2e/0x6e0 [ 136.763747] device_create_groups_vargs+0x52/0xf0 [ 136.763747] device_create+0x49/0x60 [ 136.763749] fb_console_init+0x25/0x145 [ 136.763750] fbmem_init+0xcc/0xe2 [ 136.763750] do_one_initcall+0x5d/0x2f0 [ 136.763751] kernel_init_freeable+0x20d/0x26d [ 136.763752] kernel_init+0xa/0xfb [ 136.763753] ret_from_fork+0x27/0x50 [ 136.763753] -> #0 (console_lock){+.+.}-{0:0}: [ 136.763755] __lock_acquire+0x1241/0x23f0 [ 136.763756] lock_acquire+0xad/0x370 [ 136.763757] console_lock+0x47/0x70 [ 136.763761] drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper] [ 136.763809] amdgpu_device_gpu_recover.cold+0x21e/0xe7b [amdgpu] [ 136.763850] amdgpu_job_timedout+0xfb/0x150 [amdgpu] [ 136.763851] drm_sched_job_timedout+0x8a/0xf0 [gpu_sched] [ 136.763852] process_one_work+0x23c/0x580 [ 136.763853] worker_thread+0x50/0x3b0 [ 136.763854] kthread+0x12e/0x150 [ 136.763855] ret_from_fork+0x27/0x50 [ 136.763855] other info that might help us debug this: [ 136.763856] Chain exists of: console_lock --> fs_reclaim --> dma_fence_map [ 136.763857] Possible unsafe locking scenario: [ 136.763857] CPU0 CPU1 [ 136.763857] ---- ---- [ 136.763857] lock(dma_fence_map); [ 136.763858] lock(fs_reclaim); [ 136.763858] lock(dma_fence_map); [ 136.763858] lock(console_lock); [ 136.763859] *** DEADLOCK *** [ 136.763860] 4 locks held by kworker/2:3/682: [ 136.763860] #0: ffff8887fb81c938 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1bc/0x580 [ 136.763862] #1: ffffc90000cafe58 ((work_completion)(&(&sched->work_tdr)->work)){+.+.}-{0:0}, at: process_one_work+0x1bc/0x580 [ 136.763863] #2: ffffffff82318c80 (dma_fence_map){++++}-{0:0}, at: drm_sched_job_timedout+0x25/0xf0 [gpu_sched] [ 136.763865] #3: ffff8887ab621748 (&adev->lock_reset){+.+.}-{3:3}, at: amdgpu_device_gpu_recover.cold+0x5ab/0xe7b [amdgpu] [ 136.763914] stack backtrace: [ 136.763915] CPU: 2 PID: 682 Comm: kworker/2:3 Tainted: G W 5.7.0-rc3+ #346 [ 136.763916] Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 4011 04/19/2018 [ 136.763918] Workqueue: events drm_sched_job_timedout [gpu_sched] [ 136.763919] Call Trace: [ 136.763922] dump_stack+0x8f/0xd0 [ 136.763924] check_noncircular+0x162/0x180 [ 136.763926] __lock_acquire+0x1241/0x23f0 [ 136.763927] lock_acquire+0xad/0x370 [ 136.763932] ? drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper] [ 136.763933] ? mark_held_locks+0x2d/0x80 [ 136.763934] ? _raw_spin_unlock_irqrestore+0x46/0x60 [ 136.763936] console_lock+0x47/0x70 [ 136.763940] ? drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper] [ 136.763944] drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper] [ 136.763993] amdgpu_device_gpu_recover.cold+0x21e/0xe7b [amdgpu] [ 136.764036] amdgpu_job_timedout+0xfb/0x150 [amdgpu] [ 136.764038] drm_sched_job_timedout+0x8a/0xf0 [gpu_sched] [ 136.764040] process_one_work+0x23c/0x580 [ 136.764041] worker_thread+0x50/0x3b0 [ 136.764042] ? process_one_work+0x580/0x580 [ 136.764044] kthread+0x12e/0x150 [ 136.764045] ? kthread_create_worker_on_cpu+0x70/0x70 [ 136.764046] ret_from_fork+0x27/0x50 Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index ac0286a5f2fc..4c4492de670c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4063,8 +4063,6 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive, if (r) goto out; - amdgpu_fbdev_set_suspend(tmp_adev, 0); - /* must succeed. */ amdgpu_ras_resume(tmp_adev); @@ -4305,8 +4303,6 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, */ amdgpu_unregister_gpu_instance(tmp_adev); - amdgpu_fbdev_set_suspend(tmp_adev, 1); - /* disable ras on ALL IPs */ if (!(in_ras_intr && !use_baco) && amdgpu_device_ip_need_full_reset(tmp_adev)) From patchwork Thu Jun 4 08:12:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587437 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0A49F90 for ; Thu, 4 Jun 2020 08:13:35 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DB0422075B for ; Thu, 4 Jun 2020 08:13:34 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="hFJHKrGc" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DB0422075B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7CC7A6E2F9; Thu, 4 Jun 2020 08:12:57 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by gabe.freedesktop.org (Postfix) with ESMTPS id 30F5489A88 for ; Thu, 4 Jun 2020 08:12:55 +0000 (UTC) Received: by mail-wr1-x444.google.com with SMTP id l10so5016094wrr.10 for ; Thu, 04 Jun 2020 01:12:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=xBHH8HV/ynkGTblaQCBVPIasEigLpWOOzD+/aDDngeE=; b=hFJHKrGcuuTLGVLvJHPMkbscDyUX8o+8t48i8gjHX/wrNO8TtoqmHu26IkBnYNObr4 KrI3LjBc0l2LHjqOn1T5MjPDcrAq4+mlQUnp02d+7XQ707QBLNKeZkrBDQuMdNYSE4rB ZVSbprOpIkob9k4ZgEk5Bvg5TCd7X96OYidgU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=xBHH8HV/ynkGTblaQCBVPIasEigLpWOOzD+/aDDngeE=; b=B9FkoRZYhOkAaONY3b8FUCgIyfcs/WzI4+DLWWJIwrOzrBG1q+I7qEeqrkRkSolM5P HmSTf648yDNYuaUcQ3EfqV2nDxL6eFrRLoEwLeptTHfXLYvt2LPyQB7qYiAPZ9/uGiuT hYL6Y0I0OBaM5+a7oyNmClfBMoqGOgHab5YtMHFr+SRiCzJrNj7eYZ7fHnMagRTCuOTL 5xJmEvF2l97Y3bFEKS4zMjmQY3D8dduBKDubogBSc9XeixY2bvXK1i1VxL7+/+IxmcMR sAjKYFBTOuSYApNMwKs05HpCE2f6axddJ6HYtMuBAxam/SvOqN3eGBSZZ625C5eUsV0J 78nA== X-Gm-Message-State: AOAM530rdxIgpeSHKNrsDaLoznQwC0WmCAho9SQjU1Nla3PBS4RzuCnM sH+w62dZxdeUedTmPaSS6p4J8Q== X-Google-Smtp-Source: ABdhPJyObyzep5ImU/AAS5w8hAKuTWNCEAJ1espIeo8JSFze8n8cPO+9cK+VeTUgMxSDryXkAOC4zw== X-Received: by 2002:adf:a4dd:: with SMTP id h29mr3422543wrb.372.1591258373746; Thu, 04 Jun 2020 01:12:53 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:53 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:23 +0200 Message-Id: <20200604081224.863494-18-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 17/18] drm/amdgpu: gpu recovery does full modesets X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Daniel Vetter , =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" ... I think it's time to stop this little exercise. The lockdep splat, for the record: [ 132.583381] ====================================================== [ 132.584091] WARNING: possible circular locking dependency detected [ 132.584775] 5.7.0-rc3+ #346 Tainted: G W [ 132.585461] ------------------------------------------------------ [ 132.586184] kworker/2:3/865 is trying to acquire lock: [ 132.586857] ffffc90000677c70 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper] [ 132.587569] but task is already holding lock: [ 132.589044] ffffffff82318c80 (dma_fence_map){++++}-{0:0}, at: drm_sched_job_timedout+0x25/0xf0 [gpu_sched] [ 132.589803] which lock already depends on the new lock. [ 132.592009] the existing dependency chain (in reverse order) is: [ 132.593507] -> #2 (dma_fence_map){++++}-{0:0}: [ 132.595019] dma_fence_begin_signalling+0x50/0x60 [ 132.595767] drm_atomic_helper_commit+0xa1/0x180 [drm_kms_helper] [ 132.596567] drm_client_modeset_commit_atomic+0x1ea/0x250 [drm] [ 132.597420] drm_client_modeset_commit_locked+0x55/0x190 [drm] [ 132.598178] drm_client_modeset_commit+0x24/0x40 [drm] [ 132.598948] drm_fb_helper_restore_fbdev_mode_unlocked+0x4b/0xa0 [drm_kms_helper] [ 132.599738] drm_fb_helper_set_par+0x30/0x40 [drm_kms_helper] [ 132.600539] fbcon_init+0x2e8/0x660 [ 132.601344] visual_init+0xce/0x130 [ 132.602156] do_bind_con_driver+0x1bc/0x2b0 [ 132.602970] do_take_over_console+0x115/0x180 [ 132.603763] do_fbcon_takeover+0x58/0xb0 [ 132.604564] register_framebuffer+0x1ee/0x300 [ 132.605369] __drm_fb_helper_initial_config_and_unlock+0x36e/0x520 [drm_kms_helper] [ 132.606187] amdgpu_fbdev_init+0xb3/0xf0 [amdgpu] [ 132.607032] amdgpu_device_init.cold+0xe90/0x1677 [amdgpu] [ 132.607862] amdgpu_driver_load_kms+0x5a/0x200 [amdgpu] [ 132.608697] amdgpu_pci_probe+0xf7/0x180 [amdgpu] [ 132.609511] local_pci_probe+0x42/0x80 [ 132.610324] pci_device_probe+0x104/0x1a0 [ 132.611130] really_probe+0x147/0x3c0 [ 132.611939] driver_probe_device+0xb6/0x100 [ 132.612766] device_driver_attach+0x53/0x60 [ 132.613593] __driver_attach+0x8c/0x150 [ 132.614419] bus_for_each_dev+0x7b/0xc0 [ 132.615249] bus_add_driver+0x14c/0x1f0 [ 132.616071] driver_register+0x6c/0xc0 [ 132.616902] do_one_initcall+0x5d/0x2f0 [ 132.617731] do_init_module+0x5c/0x230 [ 132.618560] load_module+0x2981/0x2bc0 [ 132.619391] __do_sys_finit_module+0xaa/0x110 [ 132.620228] do_syscall_64+0x5a/0x250 [ 132.621064] entry_SYSCALL_64_after_hwframe+0x49/0xb3 [ 132.621903] -> #1 (crtc_ww_class_mutex){+.+.}-{3:3}: [ 132.623587] __ww_mutex_lock.constprop.0+0xcc/0x10c0 [ 132.624448] ww_mutex_lock+0x43/0xb0 [ 132.625315] drm_modeset_lock+0x44/0x120 [drm] [ 132.626184] drmm_mode_config_init+0x2db/0x8b0 [drm] [ 132.627098] amdgpu_device_init.cold+0xbd1/0x1677 [amdgpu] [ 132.628007] amdgpu_driver_load_kms+0x5a/0x200 [amdgpu] [ 132.628920] amdgpu_pci_probe+0xf7/0x180 [amdgpu] [ 132.629804] local_pci_probe+0x42/0x80 [ 132.630690] pci_device_probe+0x104/0x1a0 [ 132.631583] really_probe+0x147/0x3c0 [ 132.632479] driver_probe_device+0xb6/0x100 [ 132.633379] device_driver_attach+0x53/0x60 [ 132.634275] __driver_attach+0x8c/0x150 [ 132.635170] bus_for_each_dev+0x7b/0xc0 [ 132.636069] bus_add_driver+0x14c/0x1f0 [ 132.636974] driver_register+0x6c/0xc0 [ 132.637870] do_one_initcall+0x5d/0x2f0 [ 132.638765] do_init_module+0x5c/0x230 [ 132.639654] load_module+0x2981/0x2bc0 [ 132.640522] __do_sys_finit_module+0xaa/0x110 [ 132.641372] do_syscall_64+0x5a/0x250 [ 132.642203] entry_SYSCALL_64_after_hwframe+0x49/0xb3 [ 132.643022] -> #0 (crtc_ww_class_acquire){+.+.}-{0:0}: [ 132.644643] __lock_acquire+0x1241/0x23f0 [ 132.645469] lock_acquire+0xad/0x370 [ 132.646274] drm_modeset_acquire_init+0xd2/0x100 [drm] [ 132.647071] drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper] [ 132.647902] dm_suspend+0x1c/0x60 [amdgpu] [ 132.648698] amdgpu_device_ip_suspend_phase1+0x83/0xe0 [amdgpu] [ 132.649498] amdgpu_device_ip_suspend+0x1c/0x60 [amdgpu] [ 132.650300] amdgpu_device_gpu_recover.cold+0x4e6/0xe64 [amdgpu] [ 132.651084] amdgpu_job_timedout+0xfb/0x150 [amdgpu] [ 132.651825] drm_sched_job_timedout+0x8a/0xf0 [gpu_sched] [ 132.652594] process_one_work+0x23c/0x580 [ 132.653402] worker_thread+0x50/0x3b0 [ 132.654139] kthread+0x12e/0x150 [ 132.654868] ret_from_fork+0x27/0x50 [ 132.655598] other info that might help us debug this: [ 132.657739] Chain exists of: crtc_ww_class_acquire --> crtc_ww_class_mutex --> dma_fence_map [ 132.659877] Possible unsafe locking scenario: [ 132.661416] CPU0 CPU1 [ 132.662126] ---- ---- [ 132.662847] lock(dma_fence_map); [ 132.663574] lock(crtc_ww_class_mutex); [ 132.664319] lock(dma_fence_map); [ 132.665063] lock(crtc_ww_class_acquire); [ 132.665799] *** DEADLOCK *** [ 132.667965] 4 locks held by kworker/2:3/865: [ 132.668701] #0: ffff8887fb81c938 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1bc/0x580 [ 132.669462] #1: ffffc90000677e58 ((work_completion)(&(&sched->work_tdr)->work)){+.+.}-{0:0}, at: process_one_work+0x1bc/0x580 [ 132.670242] #2: ffffffff82318c80 (dma_fence_map){++++}-{0:0}, at: drm_sched_job_timedout+0x25/0xf0 [gpu_sched] [ 132.671039] #3: ffff8887b84a1748 (&adev->lock_reset){+.+.}-{3:3}, at: amdgpu_device_gpu_recover.cold+0x59e/0xe64 [amdgpu] [ 132.671902] stack backtrace: [ 132.673515] CPU: 2 PID: 865 Comm: kworker/2:3 Tainted: G W 5.7.0-rc3+ #346 [ 132.674347] Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 4011 04/19/2018 [ 132.675194] Workqueue: events drm_sched_job_timedout [gpu_sched] [ 132.676046] Call Trace: [ 132.676897] dump_stack+0x8f/0xd0 [ 132.677748] check_noncircular+0x162/0x180 [ 132.678604] ? stack_trace_save+0x4b/0x70 [ 132.679459] __lock_acquire+0x1241/0x23f0 [ 132.680311] lock_acquire+0xad/0x370 [ 132.681163] ? drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper] [ 132.682021] ? cpumask_next+0x16/0x20 [ 132.682880] ? module_assert_mutex_or_preempt+0x14/0x40 [ 132.683737] ? __module_address+0x28/0xf0 [ 132.684601] drm_modeset_acquire_init+0xd2/0x100 [drm] [ 132.685466] ? drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper] [ 132.686335] drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper] [ 132.687255] dm_suspend+0x1c/0x60 [amdgpu] [ 132.688152] amdgpu_device_ip_suspend_phase1+0x83/0xe0 [amdgpu] [ 132.689057] ? amdgpu_fence_process+0x4c/0x150 [amdgpu] [ 132.689963] amdgpu_device_ip_suspend+0x1c/0x60 [amdgpu] [ 132.690893] amdgpu_device_gpu_recover.cold+0x4e6/0xe64 [amdgpu] [ 132.691818] amdgpu_job_timedout+0xfb/0x150 [amdgpu] [ 132.692707] drm_sched_job_timedout+0x8a/0xf0 [gpu_sched] [ 132.693597] process_one_work+0x23c/0x580 [ 132.694487] worker_thread+0x50/0x3b0 [ 132.695373] ? process_one_work+0x580/0x580 [ 132.696264] kthread+0x12e/0x150 [ 132.697154] ? kthread_create_worker_on_cpu+0x70/0x70 [ 132.698057] ret_from_fork+0x27/0x50 Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 4c4492de670c..3ea4b9258fb0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2441,6 +2441,14 @@ static int amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev) /* displays are handled separately */ if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_DCE) { /* XXX handle errors */ + + /* + * This is dm_suspend, which calls modeset locks, and + * that a pretty good inversion against dma_fence_signal + * which gpu recovery is supposed to guarantee. + * + * Dont ask me how to fix this. + */ r = adev->ip_blocks[i].version->funcs->suspend(adev); /* XXX handle errors */ if (r) { From patchwork Thu Jun 4 08:12:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11587445 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C9DF8138C for ; Thu, 4 Jun 2020 08:13:38 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A897720738 for ; Thu, 4 Jun 2020 08:13:38 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="MV7FJtoE" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A897720738 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id F30316E2DD; Thu, 4 Jun 2020 08:13:00 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by gabe.freedesktop.org (Postfix) with ESMTPS id 37FF46E30C for ; Thu, 4 Jun 2020 08:12:56 +0000 (UTC) Received: by mail-wm1-x343.google.com with SMTP id f185so4619922wmf.3 for ; Thu, 04 Jun 2020 01:12:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=tFPcYYVPEczCJNqu565ZZzRL31Ro7iqoJLud+V9/W6s=; b=MV7FJtoEKLyy0r+Z0KoFPEtyhRXEwQ/IbBpYLu8LO9Uc78Ax/2RYA4vzBMjtiJcK6p AdKwrHzr6dzh+N3ms51V1I6le1IjLWDF+LcV9lCl9LStGwUptD8GwQ/Htx2fZJN8DVly sAfPeJLEgUIdNH8S/XB9b4lCyMgUL5KbMtjSc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=tFPcYYVPEczCJNqu565ZZzRL31Ro7iqoJLud+V9/W6s=; b=oOADczGH3dNsgMoHDA2cZ0r0MvGB5VBD6DX0lvFiWht3n+Qg/c+yzfoXpTjG/EOnN6 G56bO4uWMulVHHR2NffI9TTkq1usDyu+LoRsPPYzTIcr4G4VtHvaXDKNb9LgsT0/iLLU jMw0CmJxziDOlLz+dxzxnJZJAdNvNhlw+8WxGi5uy4eRfRnNSUZJM8GQS4B8kTZnG9xn V9KEWxA1m7+oToV7dRwqmTCNr6MDqU+Q79ylcKGnnHZ0Eyn5QDwSDzFpPig844+pJ5TY OJEwKqPdkiWo200iLq5mtxPt7bgLq6tuLmYaT+u1N5Z0ayxBWjgpp6QdM2JCoJAhldtC nHtw== X-Gm-Message-State: AOAM5337B0ZR6clyF1YtiEQg8WlXJNwO9qRjG0XNtfYN1FWzqbSLh1ZR VbS5Js7io5YIirYfcCCk8vX7pA== X-Google-Smtp-Source: ABdhPJwyTSsHObvqYxt5vjWFqYKxgNRr/Tg50mOhhgXdXRLp1pp0LpRagZgRAkv0R2QBG8+IZNWOoQ== X-Received: by 2002:a7b:c18a:: with SMTP id y10mr3020133wmi.73.1591258374887; Thu, 04 Jun 2020 01:12:54 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id f11sm6873305wrj.2.2020.06.04.01.12.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jun 2020 01:12:54 -0700 (PDT) From: Daniel Vetter To: DRI Development Date: Thu, 4 Jun 2020 10:12:24 +0200 Message-Id: <20200604081224.863494-19-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200604081224.863494-1-daniel.vetter@ffwll.ch> References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 18/18] drm/i915: Annotate dma_fence_work X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-rdma@vger.kernel.org, Daniel Vetter , Intel Graphics Development , LKML , amd-gfx@lists.freedesktop.org, Chris Wilson , linaro-mm-sig@lists.linaro.org, Daniel Vetter , =?utf-8?q?Christian_K=C3=B6nig?= , linux-media@vger.kernel.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" i915 does tons of allocations from this worker, which lockdep catches. Also generic infrastructure like this with big potential for how dma_fence or other cross driver contracts work, really should be reviewed on dri-devel. Implementing custom wheels for everything within the driver is a classic case of "platform problem" [1]. Which in upstream we really shouldn't have. Since there's no quick way to solve these splats (dma_fence_work is used a bunch in basic buffer management and command submission) like for amdgpu, I'm giving up at this point here. Annotating i915 scheduler and gpu reset could would be interesting, but since lockdep is one-shot we can't see what surprises would lurk there. 1: https://lwn.net/Articles/443531/ Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/i915/i915_sw_fence_work.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c b/drivers/gpu/drm/i915/i915_sw_fence_work.c index a3a81bb8f2c3..5b74acadaef5 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence_work.c +++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c @@ -17,12 +17,15 @@ static void fence_work(struct work_struct *work) { struct dma_fence_work *f = container_of(work, typeof(*f), work); int err; + bool fence_cookie; + fence_cookie = dma_fence_begin_signalling(); err = f->ops->work(f); if (err) dma_fence_set_error(&f->dma, err); fence_complete(f); + dma_fence_end_signalling(fence_cookie); dma_fence_put(&f->dma); }