From patchwork Fri Mar 7 11:18:49 2014
X-Patchwork-Submitter: Sebastian Andrzej Siewior
X-Patchwork-Id: 3795011
Date: Fri, 7 Mar 2014 12:18:49 +0100
From: Sebastian Andrzej Siewior
To: Fernando Lopez-Lezcano, Ben Skeggs, Peter Hurley, Maarten Lankhorst
Cc: linux-rt-users, LKML, rostedt@goodmis.org,
 dri-devel@lists.freedesktop.org, John Kacur, Thomas Gleixner
Subject: nouveau crash due to missing channel (WAS: Re: [ANNOUNCE] 3.12.12-rt19)
Message-ID: <20140307111848.GA8637@linutronix.de>
References: <20140223184727.GA12442@linutronix.de> <53128DED.1000402@localhost>
In-Reply-To: <53128DED.1000402@localhost>

* Fernando Lopez-Lezcano | 2014-03-01 17:48:29 [-0800]:

>On 02/23/2014 10:47 AM, Sebastian Andrzej Siewior wrote:
>>Dear RT folks!
>>
>>I'm pleased to announce the v3.12.12-rt19 patch set.
>
>Just hit this Oops in my desktop at home:
>
>[22328.388996] BUG: unable to handle kernel NULL pointer dereference
>at 0000000000000008
>[22328.389013] IP: []
>nouveau_fence_wait_uevent.isra.2+0x22/0x440 [nouveau]

This is

| static int
| nouveau_fence_wait_uevent(struct nouveau_fence *fence, bool intr)
|
| {
|	struct nouveau_channel *chan = fence->channel;
|	struct nouveau_fifo *pfifo = nouveau_fifo(chan->drm->device);

and chan is NULL.
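
The fault address 0000000000000008 in the oops fits a NULL chan: loading chan->drm reads from chan plus the offset of ->drm. A minimal userspace sketch of that arithmetic follows. It assumes ->drm is the second pointer-sized member of the channel struct, which is an assumption and not checked against the driver source here. struct chan_sketch, struct drm_sketch and fault_addr_of_drm() are made-up names for illustration, not nouveau definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT the real nouveau structures. */
struct drm_sketch {
	void *device;
	void *fence;
};

struct chan_sketch {
	void *cli;               /* first pointer member, offset 0 */
	struct drm_sketch *drm;  /* assumed to be the second pointer member */
};

/*
 * Address a CPU would touch when loading chan->drm for a channel
 * located at 'base': base plus the member offset. With base == 0
 * (a NULL chan) the result is the address reported by the oops.
 */
static uintptr_t fault_addr_of_drm(uintptr_t base)
{
	return base + offsetof(struct chan_sketch, drm);
}
```

With 8-byte pointers, as on x86-64, fault_addr_of_drm(0) evaluates to 8, which is the address in the BUG line above.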

>[22328.389046] RAX: 0000000000000000 RBX: ffff8807a68f8fa8 RCX:
>0000000000000000
>[22328.389046] RDX: 0000000000000001 RSI: ffff8807a68f8fb0 RDI:
>ffff8807a68f8fa8
>[22328.389047] RBP: ffff8807c09bdca0 R08: 000000000000045e R09:
>000000000000e200
>[22328.389047] R10: ffffffffa0157d80 R11: ffff8807c09bdde0 R12:
>0000000000000001
>[22328.389047] R13: 0000000000000000 R14: ffff8807d8493a80 R15:
>ffff8807a68f8fb0
>[22328.389053] Call Trace:
>[22328.389069] [] nouveau_fence_wait+0x86/0x1a0 [nouveau]
>[22328.389081] [] nouveau_bo_fence_wait+0x15/0x20
>[nouveau]
>[22328.389084] [] ttm_bo_wait+0x96/0x1a0 [ttm]
>[22328.389095] []
>nouveau_gem_ioctl_cpu_prep+0x5c/0xf0 [nouveau]
>[22328.389101] [] drm_ioctl+0x502/0x630 [drm]
>[22328.389114] [] nouveau_drm_ioctl+0x51/0x90 [nouveau]

I can't find any kind of locking so my question is what ensures that
chan is not set to NULL between nouveau_fence_done() and
nouveau_fence_wait_uevent()? There are just a few opcodes in between but
nothing that pauses nouveau_fence_signal().

Fernando, can you please check the patch below and test if the warning
or the crash appears?

>-- Fernando

Sebastian

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -184,14 +184,20 @@ nouveau_fence_wait_uevent(struct nouveau_fence *fence, bool intr)
 
 {
 	struct nouveau_channel *chan = fence->channel;
-	struct nouveau_fifo *pfifo = nouveau_fifo(chan->drm->device);
-	struct nouveau_fence_priv *priv = chan->drm->fence;
+	struct nouveau_fifo *pfifo;
+	struct nouveau_fence_priv *priv;
 	struct nouveau_fence_uevent uevent = {
 		.handler.func = nouveau_fence_wait_uevent_handler,
-		.priv = priv,
 	};
 	int ret = 0;
 
+	if (WARN_ON_ONCE(!chan))
+		return 0;
+
+	pfifo = nouveau_fifo(chan->drm->device);
+	priv = chan->drm->fence;
+	uevent.priv = priv;
+
 	nouveau_event_get(pfifo->uevent, 0, &uevent.handler);
 
 	if (fence->timeout) {