From patchwork Tue Nov 1 08:48:41 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 9407067
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Tue, 1 Nov 2016 08:48:41 +0000
Message-Id: <20161101084843.3961-1-chris@chris-wilson.co.uk>
X-Mailer: git-send-email 2.10.2
Subject: [Intel-gfx] [CI 1/3] drm/i915: Use the full hammer when shutting down the rcu tasks
List-Id: Intel graphics driver community testing & development

To flush all call_rcu()
tasks (here from i915_gem_free_object()) we need to call rcu_barrier()
(not synchronize_rcu(), which only waits for a grace period and not for
queued callbacks to run). If we don't then we may still have objects being
freed as we continue to tear down the driver - in particular, the recently
released rings may race with the memory manager shutdown resulting in
sporadic:

[ 142.217186] WARNING: CPU: 7 PID: 6185 at drivers/gpu/drm/drm_mm.c:932 drm_mm_takedown+0x2e/0x40
[ 142.217187] Memory manager not clean during takedown.
[ 142.217187] Modules linked in: i915(-) x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel lpc_ich snd_hda_codec_realtek snd_hda_codec_generic mei_me mei snd_hda_codec_hdmi snd_hda_codec snd_hwdep snd_hda_core snd_pcm e1000e ptp pps_core [last unloaded: snd_hda_intel]
[ 142.217199] CPU: 7 PID: 6185 Comm: rmmod Not tainted 4.9.0-rc2-CI-Trybot_242+ #1
[ 142.217199] Hardware name: LENOVO 10AGS00601/SHARKBAY, BIOS FBKT34AUS 04/24/2013
[ 142.217200] ffffc90002ecfce0 ffffffff8142dd65 ffffc90002ecfd30 0000000000000000
[ 142.217202] ffffc90002ecfd20 ffffffff8107e4e6 000003a40778c2a8 ffff880401355c48
[ 142.217204] ffff88040778c2a8 ffffffffa040f3c0 ffffffffa040f4a0 00005621fbf8b1f0
[ 142.217206] Call Trace:
[ 142.217209] [] dump_stack+0x67/0x92
[ 142.217211] [] __warn+0xc6/0xe0
[ 142.217213] [] warn_slowpath_fmt+0x4a/0x50
[ 142.217214] [] drm_mm_takedown+0x2e/0x40
[ 142.217236] [] i915_gem_cleanup_stolen+0x1a/0x20 [i915]
[ 142.217246] [] i915_ggtt_cleanup_hw+0x31/0xb0 [i915]
[ 142.217253] [] i915_driver_cleanup_hw+0x31/0x40 [i915]
[ 142.217260] [] i915_driver_unload+0x141/0x1a0 [i915]
[ 142.217268] [] i915_pci_remove+0x14/0x20 [i915]
[ 142.217269] [] pci_device_remove+0x34/0xb0
[ 142.217271] [] __device_release_driver+0x9c/0x150
[ 142.217272] [] driver_detach+0xb6/0xc0
[ 142.217273] [] bus_remove_driver+0x53/0xd0
[ 142.217274] [] driver_unregister+0x27/0x50
[ 142.217276] [] pci_unregister_driver+0x25/0x70
[ 142.217287] [] i915_exit+0x1a/0x71 [i915]
[ 142.217289] [] SyS_delete_module+0x193/0x1e0
[ 142.217291] [] entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 142.217292] ---[ end trace 6fd164859c154772 ]---
[ 142.217505] [drm:show_leaks] *ERROR* node [6b6b6b6b6b6b6b6b + 6b6b6b6b6b6b6b6b]: inserted at
[] save_stack.isra.1+0x53/0xa0
[] drm_mm_insert_node_in_range_generic+0x2ad/0x360
[] i915_gem_stolen_insert_node_in_range+0x93/0xe0 [i915]
[] i915_gem_object_create_stolen+0x75/0xb0 [i915]
[] intel_engine_create_ring+0x9a/0x140 [i915]
[] intel_init_ring_buffer+0xf1/0x440 [i915]
[] intel_init_render_ring_buffer+0xab/0x1b0 [i915]
[] intel_engines_init+0xc8/0x210 [i915]
[] i915_gem_init+0xac/0xf0 [i915]
[] i915_driver_load+0x9c4/0x1430 [i915]
[] i915_pci_probe+0x28/0x40 [i915]
[] pci_device_probe+0x85/0xf0
[] driver_probe_device+0x21f/0x430
[] __driver_attach+0xde/0xe0

In particular note that the node was being poisoned (0x6b is the slab
free poison) as we inspected the list, a clear indication that the object
is being freed as we make the assertion.

v2: Don't loop, just assert that we do all the work required as that
will be better at detecting further errors.

Fixes: fbbd37b36fa5 ("drm/i915: Move object release to a freelist + worker")
Signed-off-by: Chris Wilson
Cc: Joonas Lahtinen
Cc: Tvrtko Ursulin
Reviewed-by: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/i915_drv.c | 2 +-
 drivers/gpu/drm/i915/i915_gem.c | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6a99544c98d3..3b9bfd2cf0c0 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -544,7 +544,7 @@ static void i915_gem_fini(struct drm_i915_private *dev_priv)
 	i915_gem_context_fini(&dev_priv->drm);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
-	synchronize_rcu();
+	rcu_barrier();
 	flush_work(&dev_priv->mm.free_work);
 
 	WARN_ON(!list_empty(&dev_priv->context_list));
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1e5d2bf777e4..b51274562e79 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4787,6 +4787,8 @@ void i915_gem_load_cleanup(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
 
+	WARN_ON(!llist_empty(&dev_priv->mm.free_list));
+
 	kmem_cache_destroy(dev_priv->requests);
 	kmem_cache_destroy(dev_priv->vmas);
 	kmem_cache_destroy(dev_priv->objects);