From patchwork Wed Jul 10 09:56:27 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Maarten Lankhorst
X-Patchwork-Id: 2825580
Message-ID: <51DD2FCB.70809@canonical.com>
Date: Wed, 10 Jul 2013 11:56:27 +0200
From: Maarten Lankhorst
To: Markus Trippelsdorf
Subject: Re: Commit ecff665f5e3f (drm/ttm: make ttm reservation calls...)
 causes system hang on Radeon RS780
References: <20130710092211.GB356@x4> <51DD2976.2010904@canonical.com>
 <20130710094637.GA354@x4>
In-Reply-To: <20130710094637.GA354@x4>
Cc: Dave Airlie, Jerome Glisse, dri-devel@lists.freedesktop.org
List-Id: Direct Rendering Infrastructure - Development

On 10-07-13 11:46, Markus Trippelsdorf wrote:
> On 2013.07.10 at 11:29 +0200, Maarten Lankhorst wrote:
>> On 10-07-13 11:22, Markus Trippelsdorf wrote:
>>> By simply copy/pasting a big document under LibreOffice my system hangs
>>> itself up. Only a hard reset gets it working again.
>>> see also: https://bugs.freedesktop.org/show_bug.cgi?id=66551
>>>
>>> I've bisected the issue to:
>>>
>>> commit ecff665f5e3f1c6909353e00b9420e45ae23d995
>>> Author: Maarten Lankhorst
>>> Date: Thu Jun 27 13:48:17 2013 +0200
>>>
>>>     drm/ttm: make ttm reservation calls behave like reservation calls
>>>
>>>     This commit converts the source of the val_seq counter to
>>>     the ww_mutex api. The reservation objects are converted later,
>>>     because there is still a lockdep splat in nouveau that has to
>>>     resolved first.
>>>
>>>     Signed-off-by: Maarten Lankhorst
>>>     Reviewed-by: Jerome Glisse
>>>     Signed-off-by: Dave Airlie
>> Hey,
>>
>> Can you try current head with CONFIG_PROVE_LOCKING set and post the
>> lockdep splat from dmesg, if any? If there is any locking issue
>> lockdep should warn about it.
>> Lockdep will turn itself off after the
>> first splat, so if the lockdep splat happens before running the
>> affected parts those will have to be fixed first.
> There was an unrelated EDAC lockdep splat, so I simply disabled it.
>
> This is what I get:
>
> Jul 10 11:40:44 x4 kernel: ================================================
> Jul 10 11:40:44 x4 kernel: [ BUG: lock held when returning to user space! ]
> Jul 10 11:40:44 x4 kernel: 3.10.0-08587-g496322b #35 Not tainted
> Jul 10 11:40:44 x4 kernel: ------------------------------------------------
> Jul 10 11:40:44 x4 kernel: X/211 is leaving the kernel with locks still held!
> Jul 10 11:40:44 x4 kernel: 2 locks held by X/211:
> Jul 10 11:40:44 x4 kernel: #0: (reservation_ww_class_acquire){+.+.+.}, at: [] radeon_bo_list_validate+0x20/0xd0
> Jul 10 11:40:44 x4 kernel: #1: (reservation_ww_class_mutex){+.+.+.}, at: [] ttm_eu_reserve_buffers+0x126/0x4b0
> Jul 10 11:40:52 x4 kernel: SysRq : Emergency Sync
> Jul 10 11:40:53 x4 kernel: Emergency Sync complete
>

Thanks, exactly what I thought. I missed a backoff somewhere.. Does the below patch fix it?

diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 0219d26..2020bf4 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -377,6 +377,7 @@ int radeon_bo_list_validate(struct ww_acquire_ctx *ticket,
 					domain = lobj->alt_domain;
 					goto retry;
 				}
+				ttm_eu_backoff_reservation(ticket, head);
 				return r;
 			}
 		}
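
For anyone following along, here is a rough single-threaded user-space model of what the missing backoff means. It is only a sketch and not the real TTM or radeon code: struct fake_bo and every fake_* function are hypothetical names made up for illustration, and a plain bool stands in for the reservation lock. The idea is that all buffers on the list get reserved first, and if validating one of them fails, every reservation has to be dropped again before the error is returned; otherwise the caller leaves with locks still held, which is what lockdep reported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; 'reserved' models a held lock. */
struct fake_bo {
	bool reserved;
	bool validate_fails;	/* forces the error path for this buffer */
};

/* Model of reserving every buffer on the list (always succeeds here). */
static void fake_reserve_all(struct fake_bo *bos, size_t n)
{
	for (size_t i = 0; i < n; i++)
		bos[i].reserved = true;
}

/* Model of the backoff: drop every reservation taken above. */
static void fake_backoff_all(struct fake_bo *bos, size_t n)
{
	for (size_t i = 0; i < n; i++)
		bos[i].reserved = false;
}

/*
 * Model of the validate loop. With do_backoff == false the error path
 * returns with reservations still held (the reported behaviour); with
 * do_backoff == true they are released first (the proposed behaviour).
 * On success the reservations deliberately stay held for the caller.
 */
static int fake_list_validate(struct fake_bo *bos, size_t n, bool do_backoff)
{
	fake_reserve_all(bos, n);
	for (size_t i = 0; i < n; i++) {
		if (bos[i].validate_fails) {
			if (do_backoff)
				fake_backoff_all(bos, n);
			return -1;
		}
	}
	return 0;
}

/* Count reservations still held after the call returns. */
static size_t fake_count_held(const struct fake_bo *bos, size_t n)
{
	size_t held = 0;

	for (size_t i = 0; i < n; i++)
		held += bos[i].reserved ? 1 : 0;
	return held;
}
```

Calling fake_list_validate() on a list where one buffer fails, then checking fake_count_held(), shows the difference between the two error paths: without the backoff every buffer is still reserved on return, with it none are.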