From patchwork Fri Oct 23 12:21:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11852871 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 177FE1580 for ; Fri, 23 Oct 2020 12:22:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E098C21527 for ; Fri, 23 Oct 2020 12:22:31 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="SBvkme8o" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S463588AbgJWMW2 (ORCPT ); Fri, 23 Oct 2020 08:22:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37862 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S463585AbgJWMW2 (ORCPT ); Fri, 23 Oct 2020 08:22:28 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CE60BC0613D4 for ; Fri, 23 Oct 2020 05:22:25 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id n15so1596451wrq.2 for ; Fri, 23 Oct 2020 05:22:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=IgQVmlN66kTzdCBU8ZoKnJMQUq1PyY1GSJDM/XX0ojY=; b=SBvkme8oy7kGO+RfY6ceOp15GpaJIvk6RoLhvV/WueIm6Pg+oG/7v+xKkGaetsoFKT 4Jrw4O5tvC4Old6EBSgoM8lbYnG5Co58mGGq/XlAbL+NJ+J1nzTnHFdqYbcE/fjum1hb sSY+nGawYtOErvna7ZMT/xH6GPplKKq36FBvY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=IgQVmlN66kTzdCBU8ZoKnJMQUq1PyY1GSJDM/XX0ojY=; b=b3+Ze/HlNqXCGWx+TjVOp9rD/Z36SvntVSX49QoUMtnZLhcRoT4a1goeC2nlWQg78C I9UT4v2jxGR2qyITCydDV/ZyinQyrWBaM26dWAr63kf+F7AQTTKIyecaTuAjitCicdxb smCgcrQ+NfVrm6JPFzrKOfQtyz7m3BweZ3pF+g/qNi2/WK7obAssPMhj8AM95X5xlaOZ dXL4tX+kF1xAVOozKVUpBxhY4+04E3vlTWk8G+OaBvZmJ+NIqEEsRJ1HcGg+jbSHJkwA oaPlpWPCccJFmoojLcpca7OVki+WQajsiQOLadhTXlVx/tB4lU1ARe7eq7ThBaWrs766 0nog== X-Gm-Message-State: AOAM530VIthJvhLRNKSBVthGWgQUBPQHCWJGVRJtGjYWV/SsXOvxS96q Hnpk8gL8PAGsLvGiHdhuzUFoiHY8bT3Uw+gP X-Google-Smtp-Source: ABdhPJznoMnt3wkQ258dPqEZNcn4LIp7cxpxLk7m7dixQUbSTt/QJMXc3d+QHpQiaKAR3urrcsOBqA== X-Received: by 2002:adf:e8c7:: with SMTP id k7mr2365623wrn.102.1603455744516; Fri, 23 Oct 2020 05:22:24 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id y4sm3056484wrp.74.2020.10.23.05.22.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Oct 2020 05:22:23 -0700 (PDT) From: Daniel Vetter To: DRI Development Cc: Intel Graphics Development , Daniel Vetter , linux-fsdevel@vger.kernel.org, Dave Chinner , Qian Cai , linux-xfs@vger.kernel.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= , Andrew Morton , Jason Gunthorpe , linux-mm@kvack.org, linux-rdma@vger.kernel.org, Maarten Lankhorst , =?utf-8?q?Christian_?= =?utf-8?q?K=C3=B6nig?= , Daniel Vetter Subject: [PATCH 03/65] mm: Track mmu notifiers in fs_reclaim_acquire/release Date: Fri, 23 Oct 2020 14:21:14 +0200 Message-Id: <20201023122216.2373294-3-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201023122216.2373294-1-daniel.vetter@ffwll.ch> References: <20201021163242.1458885-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org fs_reclaim_acquire/release nicely catch recursion issues when allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend to use to keep the excessive caches in check). For mmu notifier recursions we do have lockdep annotations since 23b68395c7c7 ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end"). But these only fire if a path actually results in some pte invalidation - for most small allocations that's very rarely the case. The other trouble is that pte invalidation can happen any time when __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe choice, GFP_NOIO isn't good enough to avoid potential mmu notifier recursion. I was pondering whether we should just do the general annotation, but there's always the risk for false positives. Plus I'm assuming that the core fs and io code is a lot better reviewed and tested than random mmu notifier code in drivers. Hence why I decide to only annotate for that specific case. Furthermore even if we'd create a lockdep map for direct reclaim, we'd still need to explicit pull in the mmu notifier map - there's a lot more places that do pte invalidation than just direct reclaim, these two contexts arent the same. Note that the mmu notifiers needing their own independent lockdep map is also the reason we can't hold them from fs_reclaim_acquire to fs_reclaim_release - it would nest with the acquistion in the pte invalidation code, causing a lockdep splat. And we can't remove the annotations from pte invalidation and all the other places since they're called from many other places than page reclaim. Hence we can only do the equivalent of might_lock, but on the raw lockdep map. With this we can also remove the lockdep priming added in 66204f1d2d1b ("mm/mmu_notifiers: prime lockdep") since the new annotations are strictly more powerful. v2: Review from Thomas Hellstrom: - unbotch the fs_reclaim context check, I accidentally inverted it, but it didn't blow up because I inverted it immediately - fix compiling for !CONFIG_MMU_NOTIFIER v3: Unbreak the PF_MEMALLOC_ context flags. Thanks to Qian for the report and Dave for explaining what I failed to see. Cc: linux-fsdevel@vger.kernel.org Cc: Dave Chinner Cc: Qian Cai Cc: linux-xfs@vger.kernel.org Cc: Thomas Hellström (Intel) Cc: Andrew Morton Cc: Jason Gunthorpe Cc: linux-mm@kvack.org Cc: linux-rdma@vger.kernel.org Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- mm/mmu_notifier.c | 7 ------- mm/page_alloc.c | 31 ++++++++++++++++++++----------- 2 files changed, 20 insertions(+), 18 deletions(-) diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c index 4fc918163dd3..6bf798373eb0 100644 --- a/mm/mmu_notifier.c +++ b/mm/mmu_notifier.c @@ -612,13 +612,6 @@ int __mmu_notifier_register(struct mmu_notifier *subscription, mmap_assert_write_locked(mm); BUG_ON(atomic_read(&mm->mm_users) <= 0); - if (IS_ENABLED(CONFIG_LOCKDEP)) { - fs_reclaim_acquire(GFP_KERNEL); - lock_map_acquire(&__mmu_notifier_invalidate_range_start_map); - lock_map_release(&__mmu_notifier_invalidate_range_start_map); - fs_reclaim_release(GFP_KERNEL); - } - if (!mm->notifier_subscriptions) { /* * kmalloc cannot be called under mm_take_all_locks(), but we diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 780c8f023b28..2edd3fd447fa 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -57,6 +57,7 @@ #include #include #include +#include #include #include #include @@ -4207,10 +4208,8 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla static struct lockdep_map __fs_reclaim_map = STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map); -static bool __need_fs_reclaim(gfp_t gfp_mask) +static bool __need_reclaim(gfp_t gfp_mask) { - gfp_mask = current_gfp_context(gfp_mask); - /* no reclaim without waiting on it */ if (!(gfp_mask & __GFP_DIRECT_RECLAIM)) return false; @@ -4219,10 +4218,6 @@ static bool __need_fs_reclaim(gfp_t gfp_mask) if (current->flags & PF_MEMALLOC) return false; - /* We're only interested __GFP_FS allocations for now */ - if (!(gfp_mask & __GFP_FS)) - return false; - if (gfp_mask & __GFP_NOLOCKDEP) return false; @@ -4241,15 +4236,29 @@ void __fs_reclaim_release(void) void fs_reclaim_acquire(gfp_t gfp_mask) { - if (__need_fs_reclaim(gfp_mask)) - __fs_reclaim_acquire(); + gfp_mask = current_gfp_context(gfp_mask); + + if (__need_reclaim(gfp_mask)) { + if (gfp_mask & __GFP_FS) + __fs_reclaim_acquire(); + +#ifdef CONFIG_MMU_NOTIFIER + lock_map_acquire(&__mmu_notifier_invalidate_range_start_map); + lock_map_release(&__mmu_notifier_invalidate_range_start_map); +#endif + + } } EXPORT_SYMBOL_GPL(fs_reclaim_acquire); void fs_reclaim_release(gfp_t gfp_mask) { - if (__need_fs_reclaim(gfp_mask)) - __fs_reclaim_release(); + gfp_mask = current_gfp_context(gfp_mask); + + if (__need_reclaim(gfp_mask)) { + if (gfp_mask & __GFP_FS) + __fs_reclaim_release(); + } } EXPORT_SYMBOL_GPL(fs_reclaim_release); #endif From patchwork Fri Oct 23 12:21:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11852873 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4E37B92C for ; Fri, 23 Oct 2020 12:22:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 296812168B for ; Fri, 23 Oct 2020 12:22:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="HhyQAz2t" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S463592AbgJWMWa (ORCPT ); Fri, 23 Oct 2020 08:22:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37878 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S369869AbgJWMW3 (ORCPT ); Fri, 23 Oct 2020 08:22:29 -0400 Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C232FC0613D2 for ; Fri, 23 Oct 2020 05:22:28 -0700 (PDT) Received: by mail-wm1-x344.google.com with SMTP id d78so1227952wmd.3 for ; Fri, 23 Oct 2020 05:22:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=l68XO7xQj33o6JWbWMWZkGONENy/DxFvUnFeVCTeGi8=; b=HhyQAz2t2mLzwyAZiSSqkGeFXWhJ/QAO3Ky4yYkOswG8UL5+IJIekmejZtr/Eu7jli 7O76qRpeBjYlXMgT5JKngM4+wE6qjPXobuph0BlMxT+HEJnO3dlg5uKhDEa8NIDEmcJC 1+J+Tolig6BVzwsRsAXFS1UU+lSADV+e+2DBs= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=l68XO7xQj33o6JWbWMWZkGONENy/DxFvUnFeVCTeGi8=; b=d1r2M1hiCOft+vOnlXilpsu+dQpEwN3UyetOY5jlTNbP549cVIt/dMgErlzgrMjydM ugK4dcCKqqKLyWUyfbv9pCZ3QYaSRw1H73fjpecNtuzBxbtMqpG8KeadxVzIoAOwjdp4 opam4xPlo2UCp8BO7TeBLl7jritlFUryWdMYHVemkdFnZUOun5vwK3k8MzxMPkbUdDzl aPAiz8zrHL3Oz1Fdd/g5H2bl2109D/ojyYHWovi3flTmTrrTrydb2OL19YbZ3OvfSmop z4/9fjAIilda9neRIWgX3mQo0RAXOAjwswSEMt5fKcZBFRkoQnyww+n3LD/oVML4H1YJ X66g== X-Gm-Message-State: AOAM531ha2P6x4nPTaGHk6LfpDZHmIkkFm649mYASQX8wt5jKDsWgLnq 2ucr2Muhuef6uL3qbGrzzgljVA== X-Google-Smtp-Source: ABdhPJwPwU1hLA+MfVAzbSi/5oxo75UPCs85Tq0o08JbHiqr38YS1fGUKKAsWwR4dLGwDNigNva1dA== X-Received: by 2002:a7b:c183:: with SMTP id y3mr2049208wmi.84.1603455747583; Fri, 23 Oct 2020 05:22:27 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id y4sm3056484wrp.74.2020.10.23.05.22.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Oct 2020 05:22:26 -0700 (PDT) From: Daniel Vetter To: DRI Development Cc: Intel Graphics Development , Daniel Vetter , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, linux-rdma@vger.kernel.org, amd-gfx@lists.freedesktop.org, Chris Wilson , Maarten Lankhorst , =?utf-8?q?Christian_?= =?utf-8?q?K=C3=B6nig?= , Daniel Vetter Subject: [PATCH 05/65] drm/atomic-helper: Add dma-fence annotations Date: Fri, 23 Oct 2020 14:21:16 +0200 Message-Id: <20201023122216.2373294-5-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201023122216.2373294-1-daniel.vetter@ffwll.ch> References: <20201021163242.1458885-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org This is a bit disappointing since we need to split the annotations over all the different parts. I was considering just leaking the critical section into the ->atomic_commit_tail callback of each driver. But that would mean we need to pass the fence_cookie into each driver (there's a total of 13 implementations of this hook right now), so bad flag day. And also a bit leaky abstraction. Hence just do it function-by-function. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/drm_atomic_helper.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index 549a31e6042c..23013209d4bf 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -1567,6 +1567,7 @@ EXPORT_SYMBOL(drm_atomic_helper_wait_for_flip_done); void drm_atomic_helper_commit_tail(struct drm_atomic_state *old_state) { struct drm_device *dev = old_state->dev; + bool fence_cookie = dma_fence_begin_signalling(); drm_atomic_helper_commit_modeset_disables(dev, old_state); @@ -1578,6 +1579,8 @@ void drm_atomic_helper_commit_tail(struct drm_atomic_state *old_state) drm_atomic_helper_commit_hw_done(old_state); + dma_fence_end_signalling(fence_cookie); + drm_atomic_helper_wait_for_vblanks(dev, old_state); drm_atomic_helper_cleanup_planes(dev, old_state); @@ -1597,6 +1600,7 @@ EXPORT_SYMBOL(drm_atomic_helper_commit_tail); void drm_atomic_helper_commit_tail_rpm(struct drm_atomic_state *old_state) { struct drm_device *dev = old_state->dev; + bool fence_cookie = dma_fence_begin_signalling(); drm_atomic_helper_commit_modeset_disables(dev, old_state); @@ -1609,6 +1613,8 @@ void drm_atomic_helper_commit_tail_rpm(struct drm_atomic_state *old_state) drm_atomic_helper_commit_hw_done(old_state); + dma_fence_end_signalling(fence_cookie); + drm_atomic_helper_wait_for_vblanks(dev, old_state); drm_atomic_helper_cleanup_planes(dev, old_state); @@ -1624,6 +1630,9 @@ static void commit_tail(struct drm_atomic_state *old_state) ktime_t start; s64 commit_time_ms; unsigned int i, new_self_refresh_mask = 0; + bool fence_cookie; + + fence_cookie = dma_fence_begin_signalling(); funcs = dev->mode_config.helper_private; @@ -1652,6 +1661,8 @@ static void commit_tail(struct drm_atomic_state *old_state) if (new_crtc_state->self_refresh_active) new_self_refresh_mask |= BIT(i); + dma_fence_end_signalling(fence_cookie); + if (funcs && funcs->atomic_commit_tail) funcs->atomic_commit_tail(old_state); else @@ -1810,6 +1821,7 @@ int drm_atomic_helper_commit(struct drm_device *dev, bool nonblock) { int ret; + bool fence_cookie; if (state->async_update) { ret = drm_atomic_helper_prepare_planes(dev, state); @@ -1832,6 +1844,8 @@ int drm_atomic_helper_commit(struct drm_device *dev, if (ret) return ret; + fence_cookie = dma_fence_begin_signalling(); + if (!nonblock) { ret = drm_atomic_helper_wait_for_fences(dev, state, true); if (ret) @@ -1869,6 +1883,7 @@ int drm_atomic_helper_commit(struct drm_device *dev, */ drm_atomic_state_get(state); + dma_fence_end_signalling(fence_cookie); if (nonblock) queue_work(system_unbound_wq, &state->commit_work); else @@ -1877,6 +1892,7 @@ int drm_atomic_helper_commit(struct drm_device *dev, return 0; err: + dma_fence_end_signalling(fence_cookie); drm_atomic_helper_cleanup_planes(dev, state); return ret; } From patchwork Fri Oct 23 12:21:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11852889 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 98D1492C for ; Fri, 23 Oct 2020 12:22:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 717EB21527 for ; Fri, 23 Oct 2020 12:22:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="h2UcRa9U" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S463590AbgJWMWc (ORCPT ); Fri, 23 Oct 2020 08:22:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37882 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S463594AbgJWMWb (ORCPT ); Fri, 23 Oct 2020 08:22:31 -0400 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1F316C0613CE for ; Fri, 23 Oct 2020 05:22:30 -0700 (PDT) Received: by mail-wm1-x343.google.com with SMTP id c16so1298354wmd.2 for ; Fri, 23 Oct 2020 05:22:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=+fkOMfIAhHnIvKJblgrObApVFyijBiqdZwYQpautVWE=; b=h2UcRa9UQoJClDthSwfwclF7bhL3oBVRME7c7/5zSRqhORAazkbgsGc4/vw6kIZtIK fhou2mbGdVANXK1CAFqIvL046sqBvxJGq5gDLkYCxZejl9o44OcJnB+RHw52tVnqbiHm JEDd6QHk5g7ytqeIZjF9fLUc3+7QOoDG7T1g8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+fkOMfIAhHnIvKJblgrObApVFyijBiqdZwYQpautVWE=; b=sWiGKDzZJmuHdWVU+jTOEGKXmSD2rGgeD2hZPrlmHJXpyHx/4eHR5lEnf+LfAYYyeI 6J+M7sWZpLZ7ubTKRl+h8RFbXTh10TCMQv7glkKOQxKds/MWo6ixixOeW1Vj0zYcq1Bn hqnoqe/8Oo7FvRFtNyE78K3pg5DiC0ub0++orgDsY+MV0H7auIXxyyfiz5UoS/bwVSAs ef2TcWyW5aiGGHdJYSUgIm+7tFdS3+P5z2+V2KG6PZ144EgEeRcEM9UUnYLgjY43b/3h YNGDGcY4gQ8USENwQjnChbsTEhaqhb2l0Eoi64BDbzkK/ro5XlKs09cal1FUbjFi5RhU S1GA== X-Gm-Message-State: AOAM533vbiXVv7Kskynz1i0W30edPDNdRqFg+Vls52BGvWpI6Yt5h5ZO vaMylu4TtEJ3+gGoWbr8SNwDjA== X-Google-Smtp-Source: ABdhPJxH4xe48wzE/NSho9ypMHFrSsNZaDWF+abLcTHBzoHtv3gPf22+aRfQRDawp5z3s1pKd7nXCw== X-Received: by 2002:a1c:4445:: with SMTP id r66mr2000497wma.140.1603455748879; Fri, 23 Oct 2020 05:22:28 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id y4sm3056484wrp.74.2020.10.23.05.22.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Oct 2020 05:22:28 -0700 (PDT) From: Daniel Vetter To: DRI Development Cc: Intel Graphics Development , Daniel Vetter , Melissa Wen , Rodrigo Siqueira , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, linux-rdma@vger.kernel.org, amd-gfx@lists.freedesktop.org, Chris Wilson , Maarten Lankhorst , =?utf-8?q?Christian_?= =?utf-8?q?K=C3=B6nig?= , Daniel Vetter , Haneen Mohammed , Daniel Vetter Subject: [PATCH 06/65] drm/vkms: Annotate vblank timer Date: Fri, 23 Oct 2020 14:21:17 +0200 Message-Id: <20201023122216.2373294-6-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201023122216.2373294-1-daniel.vetter@ffwll.ch> References: <20201021163242.1458885-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org This is needed to signal the fences from page flips, annotate it accordingly. We need to annotate entire timer callback since if we get stuck anywhere in there, then the timer stops, and hence fences stop. Just annotating the top part that does the vblank handling isn't enough. Tested-by: Melissa Wen Reviewed-by: Rodrigo Siqueira Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter Cc: Rodrigo Siqueira Cc: Haneen Mohammed Cc: Daniel Vetter --- drivers/gpu/drm/vkms/vkms_crtc.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c index e43e4e1b268a..8124d8f2ee15 100644 --- a/drivers/gpu/drm/vkms/vkms_crtc.c +++ b/drivers/gpu/drm/vkms/vkms_crtc.c @@ -1,5 +1,7 @@ // SPDX-License-Identifier: GPL-2.0+ +#include + #include #include #include @@ -14,7 +16,9 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer) struct drm_crtc *crtc = &output->crtc; struct vkms_crtc_state *state; u64 ret_overrun; - bool ret; + bool ret, fence_cookie; + + fence_cookie = dma_fence_begin_signalling(); ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer, output->period_ns); @@ -49,6 +53,8 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer) DRM_DEBUG_DRIVER("Composer worker already queued\n"); } + dma_fence_end_signalling(fence_cookie); + return HRTIMER_RESTART; } From patchwork Fri Oct 23 12:21:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11852879 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BDE221580 for ; Fri, 23 Oct 2020 12:22:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 955992225F for ; Fri, 23 Oct 2020 12:22:35 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="fiTRFOwo" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S463606AbgJWMWc (ORCPT ); Fri, 23 Oct 2020 08:22:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37894 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S463593AbgJWMWb (ORCPT ); Fri, 23 Oct 2020 08:22:31 -0400 Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 48571C0613D5 for ; Fri, 23 Oct 2020 05:22:31 -0700 (PDT) Received: by mail-wr1-x442.google.com with SMTP id j7so1561451wrt.9 for ; Fri, 23 Oct 2020 05:22:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=EDTVofbCMc0egYLRlqXIN6LpopRSTfzGEie/Zsh4fwY=; b=fiTRFOwoZwB1hxJNZHI2zlhvHk6752nqcuemPo2p7RB3dgJfMrL8zJG1C7b96wXT0o CIuclNq70jhEjpjWLbUyPf7ujefkEfl/ix1Rp2p+jmflfFn3BtP6l7y4oTaoorGxqss9 F7MzjoJPPKYCHdVP2dcH2TfnEkXmMA3UKzRks= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=EDTVofbCMc0egYLRlqXIN6LpopRSTfzGEie/Zsh4fwY=; b=FUu+g0DF/Ls6HJ0/jWR0KixD7hU+kE6HvruZzA6xyXUccRDpyt/DbW5H+dsr1WCoB6 8HHl3kwU41dwZuc25cMMu5gCcInfSaGPV2xecqjokIDOv3VhE+6mziIZrLVPkVk0TIa5 iO1Sr7s3ZPXuslh4aIW0KHrjyhNOeKMIBpC2IG8UEQXwgD3Jaij+6JXa0GGuz2WG3kj6 4OzrhQTvWYvkVYubPww4bg8PWiMxBWJdzJVnp3+wm+k9IJHZK9WjQo36LgQhDC533m33 a6i48cgiXao44AW3Br6CJHssGnzTKsOLznJhaWng3NLTplJQDr4WqQcVZDbA1O/MwNlZ FNFw== X-Gm-Message-State: AOAM533HHnMuZAkS0L4zSCq+ojUJ6WZOBdMmOgN5qlbJL7ZGiQNcvcsk PvScXRglYv03+zKkzf88YZ7KzA== X-Google-Smtp-Source: ABdhPJy744SH+1F0vP1OFiZleMBBw7T3HuaGHdSIESqwHpxlqcTXhfEAzxhZQrMJs3qkcrYSV/lWWg== X-Received: by 2002:a5d:4802:: with SMTP id l2mr2300863wrq.282.1603455750031; Fri, 23 Oct 2020 05:22:30 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id y4sm3056484wrp.74.2020.10.23.05.22.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Oct 2020 05:22:29 -0700 (PDT) From: Daniel Vetter To: DRI Development Cc: Intel Graphics Development , Daniel Vetter , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, linux-rdma@vger.kernel.org, amd-gfx@lists.freedesktop.org, Chris Wilson , Maarten Lankhorst , =?utf-8?q?Christian_?= =?utf-8?q?K=C3=B6nig?= , Daniel Vetter Subject: [PATCH 07/65] drm/vblank: Annotate with dma-fence signalling section Date: Fri, 23 Oct 2020 14:21:18 +0200 Message-Id: <20201023122216.2373294-7-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201023122216.2373294-1-daniel.vetter@ffwll.ch> References: <20201021163242.1458885-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org This is rather overkill since currently all drivers call this from hardirq (or at least timers). But maybe in the future we're going to have thread irq handlers and what not, doesn't hurt to be prepared. Plus this is an easy start for sprinkling these fence annotations into shared code. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/drm_vblank.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c index f135b79593dd..ba7e741764aa 100644 --- a/drivers/gpu/drm/drm_vblank.c +++ b/drivers/gpu/drm/drm_vblank.c @@ -24,6 +24,7 @@ * OTHER DEALINGS IN THE SOFTWARE. */ +#include #include #include #include @@ -1913,7 +1914,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe) { struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; unsigned long irqflags; - bool disable_irq; + bool disable_irq, fence_cookie; if (drm_WARN_ON_ONCE(dev, !drm_dev_has_vblank(dev))) return false; @@ -1921,6 +1922,8 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe) if (drm_WARN_ON(dev, pipe >= dev->num_crtcs)) return false; + fence_cookie = dma_fence_begin_signalling(); + spin_lock_irqsave(&dev->event_lock, irqflags); /* Need timestamp lock to prevent concurrent execution with @@ -1933,6 +1936,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe) if (!vblank->enabled) { spin_unlock(&dev->vblank_time_lock); spin_unlock_irqrestore(&dev->event_lock, irqflags); + dma_fence_end_signalling(fence_cookie); return false; } @@ -1959,6 +1963,8 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe) if (disable_irq) vblank_disable_fn(&vblank->disable_timer); + dma_fence_end_signalling(fence_cookie); + return true; } EXPORT_SYMBOL(drm_handle_vblank); From patchwork Fri Oct 23 12:21:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11852881 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C8E691580 for ; Fri, 23 Oct 2020 12:22:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A29B32168B for ; Fri, 23 Oct 2020 12:22:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="QL0X80IR" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S463591AbgJWMWg (ORCPT ); Fri, 23 Oct 2020 08:22:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37894 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S463609AbgJWMWc (ORCPT ); Fri, 23 Oct 2020 08:22:32 -0400 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 857DEC0613CE for ; Fri, 23 Oct 2020 05:22:32 -0700 (PDT) Received: by mail-wr1-x443.google.com with SMTP id j7so1561515wrt.9 for ; Fri, 23 Oct 2020 05:22:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=9/R15F1LsgpmrfF3NNCzaFiRsuDDTsIACKqWmG0SISo=; b=QL0X80IRynXXMUZCjYkwF4TVxPMv6OvJXEja8o1OkGXOMBO4FW8qKjIlQHiTaMX6lC Vted5qkM0/57rDsZq+bhda1RMP0UNp2QiZQKvEGxG3qdHN+b54wJSOKQ91pPGKKIz2Ar +nKGcH1SKaXqQODHi7K9iRIwpw1YNdCyzAKLI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=9/R15F1LsgpmrfF3NNCzaFiRsuDDTsIACKqWmG0SISo=; b=R7VOBzx+MeFwzTB4D7eUWZF/fmHfs2pV914Vatu6tKxtzPf5rN5BiK0kf89C5iq5bw FnmgwhgKYI0nLJx4Co4vX0SEUsQs2rbx8qBmyiJvRJkcPynjGzUy116PM+qY/keLljJG waTbwub7i7dkZ1M5lPWd0z8RRjDg6c1OHYyYHv3lkAnQECBzGnkgHn3W7ZriOh1FrKu8 Gyg6EnUvtZmFIdjBECQm0pdWk/C2JBrt23wHb1nXyvU2tFqXoOi4fIWI5J8L9Avv376q ExzDJFDEQJsyqE12/qP4vFAT+4Ia0h2j7DPazWj7S8porYp5IxRqQtaroRM2BN71nBVd xacw== X-Gm-Message-State: AOAM531r9ll0zm/n8FaNmGax/BN4mkF3Jtvq6czGwj0ZqHnceqnlPS0g IeECv4JKCYNGfxsU7a7I7KbQVA== X-Google-Smtp-Source: ABdhPJzjwHFFX0eYKKTO5kCI7ZhG3p97yKYVzt0PU/hjOwBpbNrm4zEFSBi8+JEVc4oqTj3X2egovA== X-Received: by 2002:adf:a1cb:: with SMTP id v11mr2261703wrv.86.1603455751311; Fri, 23 Oct 2020 05:22:31 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id y4sm3056484wrp.74.2020.10.23.05.22.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Oct 2020 05:22:30 -0700 (PDT) From: Daniel Vetter To: DRI Development Cc: Intel Graphics Development , Daniel Vetter , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, linux-rdma@vger.kernel.org, amd-gfx@lists.freedesktop.org, Chris Wilson , Maarten Lankhorst , =?utf-8?q?Christian_?= =?utf-8?q?K=C3=B6nig?= , Daniel Vetter Subject: [PATCH 08/65] drm/amdgpu: add dma-fence annotations to atomic commit path Date: Fri, 23 Oct 2020 14:21:19 +0200 Message-Id: <20201023122216.2373294-8-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201023122216.2373294-1-daniel.vetter@ffwll.ch> References: <20201021163242.1458885-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org I need a canary in a ttm-based atomic driver to make sure the dma_fence_begin/end_signalling annotations actually work. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index bb1bc7f5d149..b05fecf06f25 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -57,6 +57,7 @@ #include "ivsrcid/ivsrcid_vislands30.h" +#include #include #include #include @@ -7492,6 +7493,9 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) struct dm_crtc_state *dm_old_crtc_state, *dm_new_crtc_state; int crtc_disable_count = 0; bool mode_set_reset_required = false; + bool fence_cookie; + + fence_cookie = dma_fence_begin_signalling(); drm_atomic_helper_update_legacy_modeset_state(dev, state); drm_atomic_helper_calc_timestamping_constants(state); @@ -7816,6 +7820,8 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) /* Signal HW programming completion */ drm_atomic_helper_commit_hw_done(state); + dma_fence_end_signalling(fence_cookie); + if (wait_for_vblank) drm_atomic_helper_wait_for_flip_done(dev, state); From patchwork Fri Oct 23 12:21:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11852893 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A434E1580 for ; Fri, 23 Oct 2020 12:22:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7EF3821527 for ; Fri, 23 Oct 2020 12:22:44 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="QRe2K4ND" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S463619AbgJWMWn (ORCPT ); Fri, 23 Oct 2020 08:22:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37932 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S463616AbgJWMWn (ORCPT ); Fri, 23 Oct 2020 08:22:43 -0400 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C691AC0613D2 for ; Fri, 23 Oct 2020 05:22:42 -0700 (PDT) Received: by mail-wm1-x342.google.com with SMTP id e2so1303797wme.1 for ; Fri, 23 Oct 2020 05:22:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Oo+WCqNcRY4hz8ARHU44AzglCk3LcvfnVh+Q+d4r9RU=; b=QRe2K4NDzMa3XvVHnklPYEQBWwaTYYKSqPpSchH+IzdO7uoZMOJmLrYKxFyPuFedgt hBGRQ8jSDsXFVPkf4tno5kbL04YE1J5uETPopMp2haTMemX81cxn8GM2SG8tLS8LW1Tq 3bmfMo7wGggEbrwFcrXY/XzlIA2yGcA0FzNrU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Oo+WCqNcRY4hz8ARHU44AzglCk3LcvfnVh+Q+d4r9RU=; b=SeHkQeL5rblOSGLIuTUkYviPikB/WDBp+42rzYoRSBV4pYNX5AAn+bBYZ71nyXqjKy yQFTAc807S96rCOSP+4gpJgyKiPeWdtg+HHWLyXxz1iDYPFH6DOCIp1q/FWGgfNR+0bn zjrRwHBwZftmX0ZO5YRNDYWrfLMQMnT/MuKuSpjZd814EqxWC6GxEmNaD5w9mRsmfgwX egWBmVebfjyYviEhePryBNUe2ZJckHmMF7r/LZn0zjW8IRabqBAs2XKtorhgvr3hwCfe qJd7CIpQC8exWBaZnYmpHylW4fe4ALNFIPi1fyJ4sC9lVYW7jNvIkN5a3uIwJ1u1wvhX UlSg== X-Gm-Message-State: AOAM532h3/ipdJZ4Lg5lJ0YUBp9ebfpZ56+U4PbQfQJKKDr3FMfBPFek 3uYfWNXN1T6+gXW0IpLj74Ld9De90R9QaoAn X-Google-Smtp-Source: ABdhPJxmIrYyotlMUykKvZ8yFRoXhmXv4OuqQiJQL/V4ipGjIXJX6Lq0IMzK4dvfWtQS6XzUEHIJNg== X-Received: by 2002:a7b:c401:: with SMTP id k1mr2023294wmi.120.1603455761598; Fri, 23 Oct 2020 05:22:41 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id y4sm3056484wrp.74.2020.10.23.05.22.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Oct 2020 05:22:41 -0700 (PDT) From: Daniel Vetter To: DRI Development Cc: Intel Graphics Development , Daniel Vetter , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, linux-rdma@vger.kernel.org, amd-gfx@lists.freedesktop.org, Chris Wilson , Maarten Lankhorst , =?utf-8?q?Christian_?= =?utf-8?q?K=C3=B6nig?= , Daniel Vetter Subject: [PATCH 17/65] drm/scheduler: use dma-fence annotations in main thread Date: Fri, 23 Oct 2020 14:21:28 +0200 Message-Id: <20201023122216.2373294-17-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201023122216.2373294-1-daniel.vetter@ffwll.ch> References: <20201021163242.1458885-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org If the scheduler rt thread gets stuck on a mutex that we're holding while waiting for gpu workloads to complete, we have a problem. Add dma-fence annotations so that lockdep can check this for us. I've tried to quite carefully review this, and I think it's at the right spot. But obviosly no expert on drm scheduler. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/scheduler/sched_main.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 9a0d77a68018..f69abc4e70d3 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -764,9 +764,12 @@ static int drm_sched_main(void *param) { struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param; int r; + bool fence_cookie; sched_set_fifo_low(current); + fence_cookie = dma_fence_begin_signalling(); + while (!kthread_should_stop()) { struct drm_sched_entity *entity = NULL; struct drm_sched_fence *s_fence; @@ -824,6 +827,9 @@ static int drm_sched_main(void *param) wake_up(&sched->job_scheduled); } + + dma_fence_end_signalling(fence_cookie); + return 0; } From patchwork Fri Oct 23 12:21:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11852897 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DD03A92C for ; Fri, 23 Oct 2020 12:22:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B811221527 for ; Fri, 23 Oct 2020 12:22:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="da0uNPQT" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S463616AbgJWMWo (ORCPT ); Fri, 23 Oct 2020 08:22:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37938 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S463615AbgJWMWo (ORCPT ); Fri, 23 Oct 2020 08:22:44 -0400 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E4576C0613D2 for ; Fri, 23 Oct 2020 05:22:43 -0700 (PDT) Received: by mail-wr1-x443.google.com with SMTP id n6so1553027wrm.13 for ; Fri, 23 Oct 2020 05:22:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=KOyGCTKuuJziCpWzWdCo/QK3Mayinv6OMW6Q5kQhnyk=; b=da0uNPQTF9iafrGvjttCTo8KaCRGJJ0losxnfEnQ+ctYYMk40w+JM+UOvxXYJaJDlG xnhBhyiiG2KhEERw/wSo3xnYtahtTOoeS3KP8SHjgUOynK4D05IBt6vq+GpjyIyyqglj o++XAyFlbf9CE2K6mHWsMlao1Qc55VGiPnPJM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KOyGCTKuuJziCpWzWdCo/QK3Mayinv6OMW6Q5kQhnyk=; b=f4GEKGVqivDIjfQgIB0MQE6kD1vRKkyUoUqh4Q0xgU6q3DVIJfGQF0Wh8/D2aTb5DD 6FRbfQmiiZYaaFVhVFX02g3GWUWVfHfxOqRMzEpxeKOtxlJGGLotKzO2rIlvo9fc7/WN s6AzJ9II5ojoZMAhIEYYbLEiHmXyaTHMH9c4haNb+Uv0lpLUZeu85DqXv0wMvEfHfkSU ZiLWd2lKY8UfjV9BI8crEUpAoxfNEX5isxsXNao6O9mgFSQqCwZSrtt9wiM3kfYK6PMM gNFWUVchcU6dtwH0TyxK5INhBzyWdnLbh8k0fzZWLAJiV9X3our40a4w32LePsuMyZ7Q onWQ== X-Gm-Message-State: AOAM5329Zq12W458bWx4ghO1HcEBYV7wLzbbRZXWnFG4J21v3wnqnqLL 9k2lKJXOQyAS/dKMABOxI+cAKA== X-Google-Smtp-Source: ABdhPJxrOX8/B2rmy6TerQUriY/hvWhKXRqCPO7bzGJK2shyHhd4WjRC4UHiZqehTAyO+ARcrDOJSA== X-Received: by 2002:adf:e8d0:: with SMTP id k16mr2304784wrn.362.1603455762699; Fri, 23 Oct 2020 05:22:42 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id y4sm3056484wrp.74.2020.10.23.05.22.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Oct 2020 05:22:42 -0700 (PDT) From: Daniel Vetter To: DRI Development Cc: Intel Graphics Development , Daniel Vetter , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, linux-rdma@vger.kernel.org, amd-gfx@lists.freedesktop.org, Chris Wilson , Maarten Lankhorst , =?utf-8?q?Christian_?= =?utf-8?q?K=C3=B6nig?= , Daniel Vetter Subject: [PATCH 18/65] drm/amdgpu: use dma-fence annotations in cs_submit() Date: Fri, 23 Oct 2020 14:21:29 +0200 Message-Id: <20201023122216.2373294-18-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201023122216.2373294-1-daniel.vetter@ffwll.ch> References: <20201021163242.1458885-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org This is a bit tricky, since ->notifier_lock is held while calling dma_fence_wait we must ensure that also the read side (i.e. dma_fence_begin_signalling) is on the same side. If we mix this up lockdep complaints, and that's again why we want to have these annotations. A nice side effect of this is that because of the fs_reclaim priming for dma_fence_enable lockdep now automatically checks for us that nothing in here allocates memory, without even running any userptr workloads. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index d50b63a93d37..3b3999225e31 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1212,6 +1212,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, struct amdgpu_job *job; uint64_t seq; int r; + bool fence_cookie; job = p->job; p->job = NULL; @@ -1226,6 +1227,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, */ mutex_lock(&p->adev->notifier_lock); + fence_cookie = dma_fence_begin_signalling(); + /* If userptr are invalidated after amdgpu_cs_parser_bos(), return * -EAGAIN, drmIoctl in libdrm will restart the amdgpu_cs_ioctl. */ @@ -1262,12 +1265,14 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm); ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence); + dma_fence_end_signalling(fence_cookie); mutex_unlock(&p->adev->notifier_lock); return 0; error_abort: drm_sched_job_cleanup(&job->base); + dma_fence_end_signalling(fence_cookie); mutex_unlock(&p->adev->notifier_lock); error_unlock: From patchwork Fri Oct 23 12:21:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11852905 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 716B792C for ; Fri, 23 Oct 2020 12:22:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4FE1B20795 for ; Fri, 23 Oct 2020 12:22:49 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="NXRWzX+f" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S463629AbgJWMWr (ORCPT ); Fri, 23 Oct 2020 08:22:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37946 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S463627AbgJWMWq (ORCPT ); Fri, 23 Oct 2020 08:22:46 -0400 Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0DD71C0613D4 for ; Fri, 23 Oct 2020 05:22:45 -0700 (PDT) Received: by mail-wr1-x444.google.com with SMTP id g12so1562121wrp.10 for ; Fri, 23 Oct 2020 05:22:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XJ633h4wxlhAhAPHZ0cgOENHyn8crds9DoKAsmETQdY=; b=NXRWzX+f27c5mLWSBbXv3t/s+HOpVOJMSfj43lC7wtwyQ0sNjxQe/5gdW8X+eqKEih sTztwRjWJVIq89lcKZxO3E4hXo85pvlb4M1L91yA5wLRe1PHPncXn7uGl3/ZxammiFHv brSTcO8r0ks2c6CuwTrumQKrDrUClLq4lH/Yw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XJ633h4wxlhAhAPHZ0cgOENHyn8crds9DoKAsmETQdY=; b=RhRmDOwMQEiWMuEK8gZzGy/D0HIvaPb2+uOGD8nEDwnPxHcQW75WIfek2DKoYpCuix WHB0jNahOmS2clhyL7ubZmP0mUR9NtDayvlIWNP20N6qJhGjf8Fq4CCj3YBDtHK8Z5Mx KrP8DGeQH0xUtPGHTMMATeGArdnZ8eozGwYvceVmBj3HhsPnvDeLn9hprT0r+1KjcoCp m1VS4gsC2mYYo9DGVo6LBfPM6t+JLk9fX8GjNBqHS8A9nmpcS2oLQ2NaYanihadm+VA/ idq2rEDnJYQRlf+bI5/vCaRMF8NTdTMd3ckfRqgclqQL4GBqz3pyFyZCHmmZW1W/tIXR kEZQ== X-Gm-Message-State: AOAM532WxvVPTdsWnFDRzYSO/ORbVCqWtObttTKyOWtDlbX5lJo9W2Zq VdRv6eYv/996VnVFGvJC2oaQ1Q== X-Google-Smtp-Source: ABdhPJwcF2clDMCt70ko8y66TM4vU0KRO3am4a0vWyGSDSz/G+LltJn7Wxasz4t10w4zbpKLQ7S82Q== X-Received: by 2002:a05:6000:10cd:: with SMTP id b13mr2256558wrx.4.1603455763781; Fri, 23 Oct 2020 05:22:43 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id y4sm3056484wrp.74.2020.10.23.05.22.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Oct 2020 05:22:43 -0700 (PDT) From: Daniel Vetter To: DRI Development Cc: Intel Graphics Development , Daniel Vetter , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, linux-rdma@vger.kernel.org, amd-gfx@lists.freedesktop.org, Chris Wilson , Maarten Lankhorst , =?utf-8?q?Christian_?= =?utf-8?q?K=C3=B6nig?= , Daniel Vetter Subject: [PATCH 19/65] drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code Date: Fri, 23 Oct 2020 14:21:30 +0200 Message-Id: <20201023122216.2373294-19-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201023122216.2373294-1-daniel.vetter@ffwll.ch> References: <20201021163242.1458885-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org My dma-fence lockdep annotations caught an inversion because we allocate memory where we really shouldn't: kmem_cache_alloc+0x2b/0x6d0 amdgpu_fence_emit+0x30/0x330 [amdgpu] amdgpu_ib_schedule+0x306/0x550 [amdgpu] amdgpu_job_run+0x10f/0x260 [amdgpu] drm_sched_main+0x1b9/0x490 [gpu_sched] kthread+0x12e/0x150 Trouble right now is that lockdep only validates against GFP_FS, which would be good enough for shrinkers. But for mmu_notifiers we actually need !GFP_ATOMIC, since they can be called from any page laundering, even if GFP_NOFS or GFP_NOIO are set. I guess we should improve the lockdep annotations for fs_reclaim_acquire/release. Ofc real fix is to properly preallocate this fence and stuff it into the amdgpu job structure. But GFP_ATOMIC gets the lockdep splat out of the way. v2: Two more allocations in scheduler paths. Frist one: __kmalloc+0x58/0x720 amdgpu_vmid_grab+0x100/0xca0 [amdgpu] amdgpu_job_dependency+0xf9/0x120 [amdgpu] drm_sched_entity_pop_job+0x3f/0x440 [gpu_sched] drm_sched_main+0xf9/0x490 [gpu_sched] Second one: kmem_cache_alloc+0x2b/0x6d0 amdgpu_sync_fence+0x7e/0x110 [amdgpu] amdgpu_vmid_grab+0x86b/0xca0 [amdgpu] amdgpu_job_dependency+0xf9/0x120 [amdgpu] drm_sched_entity_pop_job+0x3f/0x440 [gpu_sched] drm_sched_main+0xf9/0x490 [gpu_sched] Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index fe2d495d08ab..09614b325b5f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -143,7 +143,7 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f, uint32_t seq; int r; - fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_KERNEL); + fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_ATOMIC); if (fence == NULL) return -ENOMEM; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c index 7521f4ab55de..2a4cde7cd746 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c @@ -208,7 +208,7 @@ static int amdgpu_vmid_grab_idle(struct amdgpu_vm *vm, if (ring->vmid_wait && !dma_fence_is_signaled(ring->vmid_wait)) return amdgpu_sync_fence(sync, ring->vmid_wait); - fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_KERNEL); + fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_ATOMIC); if (!fences) return -ENOMEM; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c index 8ea6c49529e7..af22b526cec9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c @@ -160,7 +160,7 @@ int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f) if (amdgpu_sync_add_later(sync, f)) return 0; - e = kmem_cache_alloc(amdgpu_sync_slab, GFP_KERNEL); + e = kmem_cache_alloc(amdgpu_sync_slab, GFP_ATOMIC); if (!e) return -ENOMEM; From patchwork Fri Oct 23 12:21:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11852907 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 39D4692C for ; Fri, 23 Oct 2020 12:22:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 138F3221F9 for ; Fri, 23 Oct 2020 12:22:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="Edq/tGIR" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S463633AbgJWMWs (ORCPT ); Fri, 23 Oct 2020 08:22:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37944 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S463632AbgJWMWr (ORCPT ); Fri, 23 Oct 2020 08:22:47 -0400 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 277A1C0613CE for ; Fri, 23 Oct 2020 05:22:46 -0700 (PDT) Received: by mail-wr1-x443.google.com with SMTP id h5so1574166wrv.7 for ; Fri, 23 Oct 2020 05:22:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=fHzwpOVHrIbQLdmsuDmh6xvbrH+DqYzNXJD+pE0pvSU=; b=Edq/tGIRKQ88HrDhuJq2i7oTlejAOjJ2LX7qUq3NrAb27IYCN8NuptMkKr0dm/Z2VB TdWm4oN9A72isl4LC2F7iKYqPLd2Cmmtbz5kTmVHXWXyiGCWYR2PYPeTtVfm/xHi+K2Q YmBw7hL8GMCn4q/q9NBEdTONzn/v6ye+CqNio= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=fHzwpOVHrIbQLdmsuDmh6xvbrH+DqYzNXJD+pE0pvSU=; b=OG4+WQ/JZ/xceDwdxwh3b8qpU5kgxstJH158t4lFXZrka+nd5omhTapxxB7ee5+b88 DJuehqF3nMypzorQYu/ROVzoelHtZa64pOj98InX2DyVPqe7vnoGEn9kORMrSBiKP2ts B+OHEp79iQYQvq2dF6XmeNIH79MuqigMUfPXsZn27Myn/9wx3yuN0IPm+fNHJ2CZO7UX MwST7UJgey2KZsNOy2KxykNHzaOjS0BAOENeiGARkhzcJOArDaAqQcfe15Z961dsVrut JbPAR2Es83uCKBTOk98AGaRWEVT2xaKUL6eDDWJGBTNYfa/mhT6S88HvdHuN9f8nR6fK Zipw== X-Gm-Message-State: AOAM531NKun+Q+hbGcvi+oNeYIFNb5VXMdczE2CJ0hwzRTVwNpUfdS0s eM5AoWF/dYPFlON3j0z/U2yMbg== X-Google-Smtp-Source: ABdhPJx8O/4FjcZWjwjAl63soPWTKchHeUQbq383W3x/1+VQsikl3G0yQpze6wY/eIGgJ5a6pGKAwQ== X-Received: by 2002:adf:fc08:: with SMTP id i8mr2537904wrr.116.1603455764887; Fri, 23 Oct 2020 05:22:44 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id y4sm3056484wrp.74.2020.10.23.05.22.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Oct 2020 05:22:44 -0700 (PDT) From: Daniel Vetter To: DRI Development Cc: Intel Graphics Development , Daniel Vetter , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, linux-rdma@vger.kernel.org, amd-gfx@lists.freedesktop.org, Chris Wilson , Maarten Lankhorst , =?utf-8?q?Christian_?= =?utf-8?q?K=C3=B6nig?= , Daniel Vetter Subject: [PATCH 20/65] drm/scheduler: use dma-fence annotations in tdr work Date: Fri, 23 Oct 2020 14:21:31 +0200 Message-Id: <20201023122216.2373294-20-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201023122216.2373294-1-daniel.vetter@ffwll.ch> References: <20201021163242.1458885-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org In the face of unpriviledged userspace being able to submit bogus gpu workloads the kernel needs gpu timeout and reset (tdr) to guarantee that dma_fences actually complete. Annotate this worker to make sure we don't have any accidental locking inversions or other problems lurking. Originally this was part of the overall scheduler annotation patch. But amdgpu has some glorious inversions here: - grabs console_lock - does a full modeset, which grabs all kinds of locks (drm_modeset_lock, dma_resv_lock) which can deadlock with dma_fence_wait held inside them. - almost minor at that point, but the modeset code also allocates memory These all look like they'll be very hard to fix properly, the hardware seems to require a full display reset with any gpu recovery. Hence split out as a seperate patch. Since amdgpu isn't the only hardware driver that needs to reset the display (at least gen2/3 on intel have the same problem) we need a generic solution for this. There's two tricks we could still from drm/i915 and lift to dma-fence: - The big whack, aka force-complete all fences. i915 does this for all pending jobs if the reset is somehow stuck. Trouble is we'd need to do this for all fences in the entire system, and just the book-keeping for that will be fun. Plus lots of drivers use fences for all kinds of internal stuff like memory management, so unconditionally resetting all of them doesn't work. I'm also hoping that with these fence annotations we could enlist lockdep in finding the last offenders causing deadlocks, and we could remove this get-out-of-jail trick. - The more feasible approach (across drivers at least as part of the dma_fence contract) is what drm/i915 does for gen2/3: When we need to reset the display we wake up all dma_fence_wait_interruptible calls, or well at least the equivalent of those in i915 internally. Relying on ioctl restart we force all other threads to release their locks, which means the tdr thread is guaranteed to be able to get them. I think we could implement this at the dma_fence level, including proper lockdep annotations. dma_fence_begin_tdr(): - must be nested within a dma_fence_begin/end_signalling section - will wake up all interruptible (but not the non-interruptible) dma_fence_wait() calls and force them to complete with a -ERESTARTSYS errno code. All new interrupitble calls to dma_fence_wait() will immeidately fail with the same error code. dma_fence_end_trdr(): - this will convert dma_fence_wait() calls back to normal. Of course interrupting dma_fence_wait is only ok if the caller specified that, which means we need to split the annotations into interruptible and non-interruptible version. If we then make sure that we only use interruptible dma_fence_wait() calls while holding drm_modeset_lock we can grab them in tdr code, and allow display resets. Doing the same for dma_resv_lock might be a lot harder, so buffer updates must be avoided. What's worse, we're not going to be able to make the dma_fence_wait calls in mmu-notifiers interruptible, that doesn't work. So allocating memory still wont' be allowed, even in tdr sections. Plus obviously we can use this trick only in tdr, it is rather intrusive. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/scheduler/sched_main.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index f69abc4e70d3..ae0d5ceca49a 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -281,9 +281,12 @@ static void drm_sched_job_timedout(struct work_struct *work) { struct drm_gpu_scheduler *sched; struct drm_sched_job *job; + bool fence_cookie; sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); + fence_cookie = dma_fence_begin_signalling(); + /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ spin_lock(&sched->job_list_lock); job = list_first_entry_or_null(&sched->ring_mirror_list, @@ -315,6 +318,8 @@ static void drm_sched_job_timedout(struct work_struct *work) spin_lock(&sched->job_list_lock); drm_sched_start_timeout(sched); spin_unlock(&sched->job_list_lock); + + dma_fence_end_signalling(fence_cookie); } /** From patchwork Fri Oct 23 12:21:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11852909 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 067021580 for ; Fri, 23 Oct 2020 12:22:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D53EB2225F for ; Fri, 23 Oct 2020 12:22:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="J1ovvaeR" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S371521AbgJWMWt (ORCPT ); Fri, 23 Oct 2020 08:22:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37946 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S463626AbgJWMWr (ORCPT ); Fri, 23 Oct 2020 08:22:47 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5879AC0613D4 for ; Fri, 23 Oct 2020 05:22:47 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id n15so1597772wrq.2 for ; Fri, 23 Oct 2020 05:22:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Pt7r3r8KzR5b5qiGeAc9wam0F2FH3YbYbdtTLI7+5pA=; b=J1ovvaeRgAyxo4+fhyaimEa7i+0LS95i5IhmBF2STfEkqG/VqrnAk7NJv/NWmihpPf X7Np5ypMJj57D3bg6y5M89f/D5DBHFvI4gBX/jBw4SJGMzN43CprGiEkGtFlLBdX8fbV vsahdqW51+kNqRYjTRhqxtPo2IT1e1Su+fPVA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Pt7r3r8KzR5b5qiGeAc9wam0F2FH3YbYbdtTLI7+5pA=; b=ljPftl8YM/i5IFmFX5//r+z+Nx5HZzh6yxJb09h4j66ixJ9sHjubHjG31fYsFiRzTg 5t/B0VuQvxdP9ooFgSruvVavgJ3dYRcX+UysvJJXaITtkJB6MGKiiDGoVakuJWs9Kyg0 6jAXcAmUKRb8f9lD4+sHFh6OdfzEvvzHOJDHXh53Rd1f5fmyvP4oK71KOQjJ/cVJzY3P GNXXBEqUpR345wsZY0hSewLgEBSIf1SjgvLMr4c9eVurqBkLBZlKMSDWsp4O3l5zLvLU ZNDpXrWN6jO3qd2DXKy1fBDU/IZEmkylUBe2Tw26S0HtdN0Hef6JxhgOAAI9jnjIDaBb xA9A== X-Gm-Message-State: AOAM530y03dciSjpyShZZYXiPNqa/0FOsT7cPfsHE39HpneKmn7BVuUz sz9EjHQYrz8iRdA9kv0XtMDbVuBpyrY6GRK8 X-Google-Smtp-Source: ABdhPJzeL0b3o7SdVpsqso58ueQCQULN8UwDSO9Ve2MQd7IHgs3k/2/xklwW61HBtOOF7k0eWuT9pw== X-Received: by 2002:a5d:4ccd:: with SMTP id c13mr2315042wrt.221.1603455766129; Fri, 23 Oct 2020 05:22:46 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id y4sm3056484wrp.74.2020.10.23.05.22.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Oct 2020 05:22:45 -0700 (PDT) From: Daniel Vetter To: DRI Development Cc: Intel Graphics Development , Daniel Vetter , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, linux-rdma@vger.kernel.org, amd-gfx@lists.freedesktop.org, Chris Wilson , Maarten Lankhorst , =?utf-8?q?Christian_?= =?utf-8?q?K=C3=B6nig?= Subject: [PATCH 21/65] drm/amdgpu: use dma-fence annotations for gpu reset code Date: Fri, 23 Oct 2020 14:21:32 +0200 Message-Id: <20201023122216.2373294-21-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201023122216.2373294-1-daniel.vetter@ffwll.ch> References: <20201021163242.1458885-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org To improve coverage also annotate the gpu reset code itself, since that's called from other places than drm/scheduler (which is already annotated). Annotations nests, so this doesn't break anything, and allows easier testing. Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index e8b41756c9f9..029a026ecfa9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4496,6 +4496,9 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, int i, r = 0; bool need_emergency_restart = false; bool audio_suspended = false; + bool fence_cookie; + + fence_cookie = dma_fence_begin_signalling(); /** * Special case: RAS triggered and full reset isn't supported @@ -4529,6 +4532,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another already in progress", job ? job->base.id : -1, hive->hive_id); amdgpu_put_xgmi_hive(hive); + dma_fence_end_signalling(fence_cookie); return 0; } mutex_lock(&hive->hive_lock); @@ -4541,8 +4545,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, */ INIT_LIST_HEAD(&device_list); if (adev->gmc.xgmi.num_physical_nodes > 1) { - if (!hive) + if (!hive) { + dma_fence_end_signalling(fence_cookie); return -ENODEV; + } if (!list_is_first(&adev->gmc.xgmi.head, &hive->device_list)) list_rotate_to_front(&adev->gmc.xgmi.head, &hive->device_list); device_list_handle = &hive->device_list; @@ -4556,8 +4562,6 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, if (!amdgpu_device_lock_adev(tmp_adev, hive)) { dev_info(tmp_adev->dev, "Bailing on TDR for s_job:%llx, as another already in progress", job ? job->base.id : -1); - r = 0; - goto skip_recovery; } /* @@ -4699,6 +4703,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, if (r) dev_info(adev->dev, "GPU reset end with ret = %d\n", r); + dma_fence_end_signalling(fence_cookie); return r; } From patchwork Fri Oct 23 12:21:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11852915 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2C24092C for ; Fri, 23 Oct 2020 12:22:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 050722168B for ; Fri, 23 Oct 2020 12:22:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="htfnkTc+" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S463637AbgJWMWu (ORCPT ); Fri, 23 Oct 2020 08:22:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37956 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S463632AbgJWMWt (ORCPT ); Fri, 23 Oct 2020 08:22:49 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 97051C0613D4 for ; Fri, 23 Oct 2020 05:22:48 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id x7so1595246wrl.3 for ; Fri, 23 Oct 2020 05:22:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=l2f4yPYYME2DTpuyrkHxIxIPVviQpkcCzpFQoof8xZU=; b=htfnkTc+EmPeG0/GmNNRktzQuZnBkEZ5WxtnLvx4A26hvPpxcZL/WZ1E2iMapXJ7F8 sYwnwW67xrGBCZcu7ZKtH7RGR3Klzf1FwMh5C3D2yfC4Y2HrxIoa+ZdrgP6bIuxcRc+m OC/kSMX339P4eTo2UpTUZGUqBf2+D/Hgktq3M= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=l2f4yPYYME2DTpuyrkHxIxIPVviQpkcCzpFQoof8xZU=; b=mKstwrymKiGAVLYrKLs90HRlie+54I1eQsGQyPBJi2soPgSiHtvl/DUCi6PR9bb9Qv DJSh3EIQtuqvRnflXFLGopter16uBIzkz1LwHuHr/5uvx7mcfE5yboM0Yc98NTAW9+s1 0oDSTbw0JEM9CgErcuHi/g7wL4BPzyCJxnJDxW+6IGC80EEIEMxlFFsjnmfWMtMSQGMG st6Yog7rWA4LinQBrAwauTerMQOhH7GP0wkuz3UA0l1SunjncRyt1PCna15jK6GxnAR5 ctymSb7WLBOVJ80U8dvi7jKLCALB3PUg4tmQ3TUGWwFcxIqEGfHG3JEvRp8FQjZp8UtW 0BzA== X-Gm-Message-State: AOAM533Q5X8QNOkzsVFlKTy8a5pFSOy2skCGWgTVbI3/BVzdk6Nvfkcj WNgaRQXoBqASXlINx4mvqw6KXQ== X-Google-Smtp-Source: ABdhPJyfjTgfuVVogkOvOEvv2+h5MYBRWOpR68BwVx3mHmk6n/hf7lS67d2ElS6ONGSJ5K0KZ+vD1w== X-Received: by 2002:adf:f643:: with SMTP id x3mr2525910wrp.180.1603455767291; Fri, 23 Oct 2020 05:22:47 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id y4sm3056484wrp.74.2020.10.23.05.22.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Oct 2020 05:22:46 -0700 (PDT) From: Daniel Vetter To: DRI Development Cc: Intel Graphics Development , Daniel Vetter , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, linux-rdma@vger.kernel.org, amd-gfx@lists.freedesktop.org, Chris Wilson , Maarten Lankhorst , =?utf-8?q?Christian_?= =?utf-8?q?K=C3=B6nig?= , Daniel Vetter Subject: [PATCH 22/65] Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset" Date: Fri, 23 Oct 2020 14:21:33 +0200 Message-Id: <20201023122216.2373294-22-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201023122216.2373294-1-daniel.vetter@ffwll.ch> References: <20201021163242.1458885-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org This is one from the department of "maybe play lottery if you hit this, karma compensation might work". Or at least lockdep ftw! This reverts commit 565d1941557756a584ac357d945bc374d5fcd1d0. It's not quite as low-risk as the commit message claims, because this grabs console_lock, which might be held when we allocate memory, which might never happen because the dma_fence_wait() is stuck waiting on our gpu reset: [ 136.763714] ====================================================== [ 136.763714] WARNING: possible circular locking dependency detected [ 136.763715] 5.7.0-rc3+ #346 Tainted: G W [ 136.763716] ------------------------------------------------------ [ 136.763716] kworker/2:3/682 is trying to acquire lock: [ 136.763716] ffffffff8226f140 (console_lock){+.+.}-{0:0}, at: drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper] [ 136.763723] but task is already holding lock: [ 136.763724] ffffffff82318c80 (dma_fence_map){++++}-{0:0}, at: drm_sched_job_timedout+0x25/0xf0 [gpu_sched] [ 136.763726] which lock already depends on the new lock. [ 136.763726] the existing dependency chain (in reverse order) is: [ 136.763727] -> #2 (dma_fence_map){++++}-{0:0}: [ 136.763730] __dma_fence_might_wait+0x41/0xb0 [ 136.763732] dma_resv_lockdep+0x171/0x202 [ 136.763734] do_one_initcall+0x5d/0x2f0 [ 136.763736] kernel_init_freeable+0x20d/0x26d [ 136.763738] kernel_init+0xa/0xfb [ 136.763740] ret_from_fork+0x27/0x50 [ 136.763740] -> #1 (fs_reclaim){+.+.}-{0:0}: [ 136.763743] fs_reclaim_acquire.part.0+0x25/0x30 [ 136.763745] kmem_cache_alloc_trace+0x2e/0x6e0 [ 136.763747] device_create_groups_vargs+0x52/0xf0 [ 136.763747] device_create+0x49/0x60 [ 136.763749] fb_console_init+0x25/0x145 [ 136.763750] fbmem_init+0xcc/0xe2 [ 136.763750] do_one_initcall+0x5d/0x2f0 [ 136.763751] kernel_init_freeable+0x20d/0x26d [ 136.763752] kernel_init+0xa/0xfb [ 136.763753] ret_from_fork+0x27/0x50 [ 136.763753] -> #0 (console_lock){+.+.}-{0:0}: [ 136.763755] __lock_acquire+0x1241/0x23f0 [ 136.763756] lock_acquire+0xad/0x370 [ 136.763757] console_lock+0x47/0x70 [ 136.763761] drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper] [ 136.763809] amdgpu_device_gpu_recover.cold+0x21e/0xe7b [amdgpu] [ 136.763850] amdgpu_job_timedout+0xfb/0x150 [amdgpu] [ 136.763851] drm_sched_job_timedout+0x8a/0xf0 [gpu_sched] [ 136.763852] process_one_work+0x23c/0x580 [ 136.763853] worker_thread+0x50/0x3b0 [ 136.763854] kthread+0x12e/0x150 [ 136.763855] ret_from_fork+0x27/0x50 [ 136.763855] other info that might help us debug this: [ 136.763856] Chain exists of: console_lock --> fs_reclaim --> dma_fence_map [ 136.763857] Possible unsafe locking scenario: [ 136.763857] CPU0 CPU1 [ 136.763857] ---- ---- [ 136.763857] lock(dma_fence_map); [ 136.763858] lock(fs_reclaim); [ 136.763858] lock(dma_fence_map); [ 136.763858] lock(console_lock); [ 136.763859] *** DEADLOCK *** [ 136.763860] 4 locks held by kworker/2:3/682: [ 136.763860] #0: ffff8887fb81c938 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1bc/0x580 [ 136.763862] #1: ffffc90000cafe58 ((work_completion)(&(&sched->work_tdr)->work)){+.+.}-{0:0}, at: process_one_work+0x1bc/0x580 [ 136.763863] #2: ffffffff82318c80 (dma_fence_map){++++}-{0:0}, at: drm_sched_job_timedout+0x25/0xf0 [gpu_sched] [ 136.763865] #3: ffff8887ab621748 (&adev->lock_reset){+.+.}-{3:3}, at: amdgpu_device_gpu_recover.cold+0x5ab/0xe7b [amdgpu] [ 136.763914] stack backtrace: [ 136.763915] CPU: 2 PID: 682 Comm: kworker/2:3 Tainted: G W 5.7.0-rc3+ #346 [ 136.763916] Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 4011 04/19/2018 [ 136.763918] Workqueue: events drm_sched_job_timedout [gpu_sched] [ 136.763919] Call Trace: [ 136.763922] dump_stack+0x8f/0xd0 [ 136.763924] check_noncircular+0x162/0x180 [ 136.763926] __lock_acquire+0x1241/0x23f0 [ 136.763927] lock_acquire+0xad/0x370 [ 136.763932] ? drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper] [ 136.763933] ? mark_held_locks+0x2d/0x80 [ 136.763934] ? _raw_spin_unlock_irqrestore+0x46/0x60 [ 136.763936] console_lock+0x47/0x70 [ 136.763940] ? drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper] [ 136.763944] drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper] [ 136.763993] amdgpu_device_gpu_recover.cold+0x21e/0xe7b [amdgpu] [ 136.764036] amdgpu_job_timedout+0xfb/0x150 [amdgpu] [ 136.764038] drm_sched_job_timedout+0x8a/0xf0 [gpu_sched] [ 136.764040] process_one_work+0x23c/0x580 [ 136.764041] worker_thread+0x50/0x3b0 [ 136.764042] ? process_one_work+0x580/0x580 [ 136.764044] kthread+0x12e/0x150 [ 136.764045] ? kthread_create_worker_on_cpu+0x70/0x70 [ 136.764046] ret_from_fork+0x27/0x50 Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 029a026ecfa9..935116614884 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4330,8 +4330,6 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive, if (r) goto out; - amdgpu_fbdev_set_suspend(tmp_adev, 0); - /* * The GPU enters bad state once faulty pages * by ECC has reached the threshold, and ras @@ -4590,8 +4588,6 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, */ amdgpu_unregister_gpu_instance(tmp_adev); - amdgpu_fbdev_set_suspend(tmp_adev, 1); - /* disable ras on ALL IPs */ if (!need_emergency_restart && amdgpu_device_ip_need_full_reset(tmp_adev)) From patchwork Fri Oct 23 12:21:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 11852923 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3B10A16BC for ; Fri, 23 Oct 2020 12:22:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1435124248 for ; Fri, 23 Oct 2020 12:22:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="GOpMVLRD" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S371535AbgJWMWz (ORCPT ); Fri, 23 Oct 2020 08:22:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37956 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S463639AbgJWMWv (ORCPT ); Fri, 23 Oct 2020 08:22:51 -0400 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EF51FC0613D4 for ; Fri, 23 Oct 2020 05:22:49 -0700 (PDT) Received: by mail-wm1-x342.google.com with SMTP id d78so1228948wmd.3 for ; Fri, 23 Oct 2020 05:22:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=PXR6++RRL9LARYV7LHJfxiTuXc7gp0ylfY0jpRLccnk=; b=GOpMVLRDSJhBIUH0Jl9RpNU7ca62jsa/0uF4vz52hy/l0Nb7bHFlNFX95AMzOO9A7g kFor3nekCRKiDob6YYYrBsijgmi8v2hST44iXnT7fzAasthUs8V+fGhsJqV+ZTfOhaUH GR9QGDBC7vCZ36wK45YiK3pyokqCw8S0lhdgY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=PXR6++RRL9LARYV7LHJfxiTuXc7gp0ylfY0jpRLccnk=; b=qo0kvMVaqi+7thENw96Ebf8lIlKY1JKRgSGX9QjoZQbijIzRrhNACV0hiTK0eFeOsu BhfM4MjcIjJGTixMpEyIfVQYUuMvjtULhMOKOHXAvfWi1GDQJbEA12r4R7H8HEuXWsB5 xWsJMWXqtBOC9JtWih3VARSkhAKONnciLcPvuQfuD/Rka1X9TQ/un/KnDOlkuBkMyPLR YaGpy4Fazcmrh+G6uakD+e5+3g1y8IsYWyKqW/KUAQcu9sg4HnJaE/D14KNnaluWFiyZ uqI3TJUH66FpJpkm6cF7WhDQScVqRU7nKRh1MpjBebc5yGpb2sgQ+SeurILoRH2/IgnU JGcQ== X-Gm-Message-State: AOAM533XhiJpUBAGKw7Quo+3dNtYI3BxK2xzKNqmXiUwI7ouMPYgVTj9 Q1FGXz93bRMPeA7wOurgjSjnIA== X-Google-Smtp-Source: ABdhPJxJctwHHwXLpK/i827ZMfpLrwvrm04VWkzSg+J1vZd4u3wLfW88Tc9LRut5cVMvDGZpg74J1g== X-Received: by 2002:a7b:cd96:: with SMTP id y22mr2135356wmj.126.1603455768671; Fri, 23 Oct 2020 05:22:48 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:57f4:0:efd0:b9e5:5ae6:c2fa]) by smtp.gmail.com with ESMTPSA id y4sm3056484wrp.74.2020.10.23.05.22.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Oct 2020 05:22:48 -0700 (PDT) From: Daniel Vetter To: DRI Development Cc: Intel Graphics Development , Daniel Vetter , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, linux-rdma@vger.kernel.org, amd-gfx@lists.freedesktop.org, Chris Wilson , Maarten Lankhorst , =?utf-8?q?Christian_?= =?utf-8?q?K=C3=B6nig?= , Daniel Vetter Subject: [PATCH 23/65] drm/i915: Annotate dma_fence_work Date: Fri, 23 Oct 2020 14:21:34 +0200 Message-Id: <20201023122216.2373294-23-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201023122216.2373294-1-daniel.vetter@ffwll.ch> References: <20201021163242.1458885-1-daniel.vetter@ffwll.ch> <20201023122216.2373294-1-daniel.vetter@ffwll.ch> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org i915 does tons of allocations from this worker, which lockdep catches. Also generic infrastructure like this with big potential for how dma_fence or other cross driver contracts work, really should be reviewed on dri-devel. Implementing custom wheels for everything within the driver is a classic case of "platform problem" [1]. Which in upstream we really shouldn't have. Since there's no quick way to solve these splats (dma_fence_work is used a bunch in basic buffer management and command submission) like for amdgpu, I'm giving up at this point here. Annotating i915 scheduler and gpu reset could would be interesting, but since lockdep is one-shot we can't see what surprises would lurk there. 1: https://lwn.net/Articles/443531/ Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson Cc: Maarten Lankhorst Cc: Christian König Signed-off-by: Daniel Vetter --- drivers/gpu/drm/i915/i915_sw_fence_work.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c b/drivers/gpu/drm/i915/i915_sw_fence_work.c index a3a81bb8f2c3..5b74acadaef5 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence_work.c +++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c @@ -17,12 +17,15 @@ static void fence_work(struct work_struct *work) { struct dma_fence_work *f = container_of(work, typeof(*f), work); int err; + bool fence_cookie; + fence_cookie = dma_fence_begin_signalling(); err = f->ops->work(f); if (err) dma_fence_set_error(&f->dma, err); fence_complete(f); + dma_fence_end_signalling(fence_cookie); dma_fence_put(&f->dma); }