From patchwork Wed Mar 8 13:26:28 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 9611087
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Wed, 8 Mar 2017 13:26:28 +0000
Message-Id: <20170308132629.7987-1-chris@chris-wilson.co.uk>
X-Mailer: git-send-email 2.11.0
Subject: [Intel-gfx] [CI 1/2] drm/i915: Avoiding recursing on ww_mutex inside shrinker
List-Id: Intel graphics driver community testing & development

We have to avoid taking
ww_mutex inside the shrinker as we use it as a plain mutex type and so need
to avoid recursive deadlocks:

[ 602.771969] =================================
[ 602.771970] [ INFO: inconsistent lock state ]
[ 602.771973] 4.10.0gpudebug+ #122 Not tainted
[ 602.771974] ---------------------------------
[ 602.771975] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[ 602.771978] kswapd0/40 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 602.771979]  (reservation_ww_class_mutex){+.+.?.}, at: [] i915_gem_object_wait+0x39a/0x410 [i915]
[ 602.772020] {RECLAIM_FS-ON-W} state was registered at:
[ 602.772024]   mark_held_locks+0x76/0x90
[ 602.772026]   lockdep_trace_alloc+0xb8/0xc0
[ 602.772028]   __kmalloc_track_caller+0x5d/0x130
[ 602.772031]   krealloc+0x89/0xb0
[ 602.772033]   reservation_object_reserve_shared+0xaf/0xd0
[ 602.772055]   i915_gem_do_execbuffer.isra.35+0x1413/0x18b0 [i915]
[ 602.772075]   i915_gem_execbuffer2+0x10e/0x1d0 [i915]
[ 602.772078]   drm_ioctl+0x291/0x480
[ 602.772079]   do_vfs_ioctl+0x695/0x6f0
[ 602.772081]   SyS_ioctl+0x3c/0x70
[ 602.772084]   entry_SYSCALL_64_fastpath+0x18/0xad
[ 602.772085] irq event stamp: 5197423
[ 602.772088] hardirqs last enabled at (5197423): [] kfree+0xdd/0x170
[ 602.772091] hardirqs last disabled at (5197422): [] kfree+0xb9/0x170
[ 602.772095] softirqs last enabled at (5190992): [] __do_softirq+0x221/0x280
[ 602.772097] softirqs last disabled at (5190575): [] irq_exit+0x64/0xc0
[ 602.772099] other info that might help us debug this:
[ 602.772100]  Possible unsafe locking scenario:
[ 602.772101]        CPU0
[ 602.772101]        ----
[ 602.772102]   lock(reservation_ww_class_mutex);
[ 602.772104]   <Interrupt>
[ 602.772105]     lock(reservation_ww_class_mutex);
[ 602.772107]  *** DEADLOCK ***
[ 602.772109] 2 locks held by kswapd0/40:
[ 602.772110]  #0: (shrinker_rwsem){++++..}, at: [] shrink_slab.constprop.62+0x35/0x280
[ 602.772116]  #1: (&dev->struct_mutex){+.+.+.}, at: [] i915_gem_shrinker_lock+0x27/0x60 [i915]
[ 602.772141] stack backtrace:
[ 602.772144] CPU: 2 PID: 40 Comm: kswapd0 Not tainted 4.10.0gpudebug+ #122
[ 602.772145] Hardware name: LENOVO 42433ZG/42433ZG, BIOS 8AET64WW (1.44 ) 07/26/2013
[ 602.772147] Call Trace:
[ 602.772151]  dump_stack+0x68/0xa1
[ 602.772153]  print_usage_bug+0x1d4/0x1f0
[ 602.772155]  mark_lock+0x390/0x530
[ 602.772157]  ? print_irq_inversion_bug+0x200/0x200
[ 602.772159]  __lock_acquire+0x405/0x1260
[ 602.772181]  ? i915_gem_object_wait+0x39a/0x410 [i915]
[ 602.772183]  lock_acquire+0x60/0x80
[ 602.772205]  ? i915_gem_object_wait+0x39a/0x410 [i915]
[ 602.772207]  mutex_lock_nested+0x69/0x760
[ 602.772229]  ? i915_gem_object_wait+0x39a/0x410 [i915]
[ 602.772231]  ? kfree+0xdd/0x170
[ 602.772253]  ? i915_gem_object_wait+0x163/0x410 [i915]
[ 602.772255]  ? trace_hardirqs_on_caller+0x18d/0x1c0
[ 602.772256]  ? trace_hardirqs_on+0xd/0x10
[ 602.772278]  i915_gem_object_wait+0x39a/0x410 [i915]
[ 602.772300]  i915_gem_object_unbind+0x5e/0x130 [i915]
[ 602.772323]  i915_gem_shrink+0x22d/0x3d0 [i915]
[ 602.772347]  i915_gem_shrinker_scan+0x3f/0x80 [i915]
[ 602.772349]  shrink_slab.constprop.62+0x1ad/0x280
[ 602.772352]  shrink_node+0x52/0x80
[ 602.772355]  kswapd+0x427/0x5c0
[ 602.772358]  kthread+0x122/0x130
[ 602.772360]  ? try_to_free_pages+0x270/0x270
[ 602.772362]  ? kthread_stop+0x70/0x70
[ 602.772365]  ret_from_fork+0x2e/0x40

v2: Add commentary about the pruning being opportunistic

Reported-by: Jan Nordholz
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99977#c10
Fixes: e54ca9774777 ("drm/i915: Remove completed fences after a wait")
Signed-off-by: Chris Wilson
Cc: Joonas Lahtinen
Cc: Matthew Auld
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_gem.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e29f9400c9d1..aca1eaddafb4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -462,11 +462,16 @@ i915_gem_object_wait_reservation(struct reservation_object *resv,
 
 	dma_fence_put(excl);
 
+	/* Opportunistically prune the fences iff we know they have *all* been
+	 * signaled and that the reservation object has not been changed (i.e.
+	 * no new fences have been added).
+	 */
 	if (prune_fences && !__read_seqcount_retry(&resv->seq, seq)) {
-		reservation_object_lock(resv, NULL);
-		if (!__read_seqcount_retry(&resv->seq, seq))
-			reservation_object_add_excl_fence(resv, NULL);
-		reservation_object_unlock(resv);
+		if (reservation_object_trylock(resv)) {
+			if (!__read_seqcount_retry(&resv->seq, seq))
+				reservation_object_add_excl_fence(resv, NULL);
+			reservation_object_unlock(resv);
+		}
 	}
 
 	return timeout;
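
For readers outside the driver, the trylock-and-recheck pattern in the hunk above can be sketched in userspace. This is only an illustration under stated assumptions, not the i915 code: `struct toy_resv`, `toy_resv_init()`, `toy_resv_set_fence()` and `toy_resv_try_prune()` are hypothetical names, a pthread mutex stands in for the ww_mutex, and a C11 atomic counter stands in for the seqcount. It also assumes the default pthread mutex type is non-recursive, so that a trylock on a mutex the caller already holds fails instead of succeeding.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a reservation object. */
struct toy_resv {
	pthread_mutex_t lock;	/* plays the role of the ww_mutex */
	atomic_uint seq;	/* bumped whenever the fence changes */
	const void *excl_fence;	/* protected by lock; NULL once pruned */
};

static void toy_resv_init(struct toy_resv *resv, const void *fence)
{
	int rc = pthread_mutex_init(&resv->lock, NULL);

	assert(rc == 0);
	atomic_init(&resv->seq, 0);
	resv->excl_fence = fence;
}

/* Replace the fence and bump seq. The caller must hold resv->lock. */
static void toy_resv_set_fence(struct toy_resv *resv, const void *fence)
{
	resv->excl_fence = fence;
	atomic_fetch_add(&resv->seq, 1);
}

/*
 * Opportunistically drop the fence. "seq" is the value the caller sampled
 * before it finished waiting. Pruning is skipped, never waited for, when
 * the object changed in the meantime or the lock cannot be taken at once
 * (contended, or already held further up our own call chain).
 * Returns true only if the fence was actually pruned.
 */
static bool toy_resv_try_prune(struct toy_resv *resv, unsigned int seq)
{
	bool pruned = false;

	if (atomic_load(&resv->seq) != seq)
		return false;

	if (pthread_mutex_trylock(&resv->lock) != 0)
		return false;

	/* Recheck under the lock: a writer may have raced with us. */
	if (atomic_load(&resv->seq) == seq) {
		toy_resv_set_fence(resv, NULL);
		pruned = true;
	}

	pthread_mutex_unlock(&resv->lock);
	return pruned;
}
```

Calling `toy_resv_try_prune()` while the same thread already holds `lock` returns false rather than blocking, which mirrors the reclaim path in the splat: the prune is simply skipped and can be retried on a later wait. A successful prune bumps `seq`, so a second call with the stale value is a no-op.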