From patchwork Sun Feb 17 16:31:03 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hugh Dickins X-Patchwork-Id: 2153851 Return-Path: X-Original-To: patchwork-dri-devel@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork1.kernel.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by patchwork1.kernel.org (Postfix) with ESMTP id 5BAF33FDF1 for ; Sun, 17 Feb 2013 19:55:14 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1F42AE61D7 for ; Sun, 17 Feb 2013 11:55:14 -0800 (PST) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mail-pa0-f53.google.com (mail-pa0-f53.google.com [209.85.220.53]) by gabe.freedesktop.org (Postfix) with ESMTP id 8F0D0E5D26 for ; Sun, 17 Feb 2013 08:31:52 -0800 (PST) Received: by mail-pa0-f53.google.com with SMTP id bg4so2386858pad.12 for ; Sun, 17 Feb 2013 08:31:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:date:from:x-x-sender:to:cc:subject:in-reply-to :message-id:references:user-agent:mime-version:content-type; bh=kP9H8a2/G2Ydb/2xBqxT6DKVOhr9KEN/N/E1TjDvOhg=; b=YuN0n4ddZBXO7IWtSj66/Q4KY7r0RlTc9Z2htzLxAxd8s2DO6R49FLMhVt4BSu97xa 7be2DR3JI+oc0+fZ53qc68gHoIj+5UgLu52R3/Q0mj+r1xxTBZL1aBesd1iXUMW7hfEp 79Hf9RxCj8g+KUd86bx7BEh1CmOLkj3NPwlp7JTh0HlHrIph/VDAZx3iJCUnrDv7bCqe xgfZaZpq30/B0Dw+VXnlmrndZsl45QgKPixF61w/DVEcqNqFUqGKsxNLAtTr9mssKIyN wxQWM5sliOKmBM1kae+N+d/6VcNDO9g2sendr2DfFVKUmrkDyVCD6I/osG0zr3f+Gq5J 9mAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:date:from:x-x-sender:to:cc:subject:in-reply-to :message-id:references:user-agent:mime-version:content-type :x-gm-message-state; bh=kP9H8a2/G2Ydb/2xBqxT6DKVOhr9KEN/N/E1TjDvOhg=; b=f31656MA2a6NJZm5LB7SfYFQMexbtrptGnwWO9ec55VCmgQw+OMfLFZ+w+gpHBKtUS pjD0UfbGEi+3g7se5UPlGAf3w7hh+VAS03kr+2IH6rtAOsIEVaUKOF7UbMoZoXAW8c84 WZIfzVrM4F9LyXS0Acl7AJKq/fMKxKs4K/tJYhoKtCb+Cvkfxm4W/yntJVv7eaL0PVZv vjirdxDKRBDwjf5ONpZdSxS4Q9ale+nEg1+pwHwECzcykYhLR3zhmOXW5KYEVxbmw13v EE29XtUnDCRy3RRbCiFSyQn7YpjdtYYuEfGYSMHX494LPtAtcIdJkhweBO75iIXgG5V3 0ppw== X-Received: by 10.68.236.10 with SMTP id uq10mr22632887pbc.77.1361118711986; Sun, 17 Feb 2013 08:31:51 -0800 (PST) Received: from [192.168.1.8] (c-67-188-178-35.hsd1.ca.comcast.net. [67.188.178.35]) by mx.google.com with ESMTPS id eh5sm11520441pbc.44.2013.02.17.08.31.50 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sun, 17 Feb 2013 08:31:51 -0800 (PST) Date: Sun, 17 Feb 2013 08:31:03 -0800 (PST) From: Hugh Dickins X-X-Sender: hugh@eggly.anvils To: Daniel Vetter Subject: Re: Debugging Thinkpad T430s occasional suspend failure. In-Reply-To: <20130217133814.GK5813@phenom.ffwll.local> Message-ID: References: <20130213193411.GA15928@redhat.com> <20130215011503.GA11914@redhat.com> <20130217133814.GK5813@phenom.ffwll.local> User-Agent: Alpine 2.00 (LNX 1167 2008-08-23) MIME-Version: 1.0 X-Gm-Message-State: ALoCoQlI00Wkh5W7Sz4ZhvDDrdBOT2fmwImwFXAPatjJaJmJFl3oeAcXga2Sypvgmwvw4GOBucd/bstJwVn2fVdmP6cdXbqzBcdnu1kNwog87h/rIl+h1/DrjEiSQYbZ5M2wAexCa62VHV2LAJpvtIvrKffSYmHAYfDbpZtjQO/RbTjBsz9GABMBBRHucyQ3htWBnNYPQXJx8rj0g1tOG5YGU1sKo5Eprg== X-Mailman-Approved-At: Sun, 17 Feb 2013 11:53:20 -0800 Cc: Linux Kernel Mailing List , DRI , Paul McKenney , Dave Jones , Linus Torvalds X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dri-devel-bounces+patchwork-dri-devel=patchwork.kernel.org@lists.freedesktop.org Errors-To: dri-devel-bounces+patchwork-dri-devel=patchwork.kernel.org@lists.freedesktop.org On Sun, 17 Feb 2013, Daniel Vetter wrote: > On Sun, Feb 17, 2013 at 3:21 AM, Hugh Dickins wrote: > > On Sat, 16 Feb 2013, Hugh Dickins wrote: > >> On Sat, 16 Feb 2013, Linus Torvalds wrote: > >> > > >> > I think it's worth it to give them a heads-up already. So I've cc'd > >> > the main suspects here.. > >> > >> Okay, thanks. > >> > >> > > >> > Daniel, Dave - any comments about a NULL fb in > >> > intel_choose_pipe_bpp_dither() during either suspend or resume? Some > >> > googling shows this: > >> > > >> > https://bugzilla.redhat.com/show_bug.cgi?id=895123 > >> > >> Great, yes, I'm sure that's the same (though it says "suspend" > >> and I say "resume"). > >> > >> > > >> > which sounds remarkably similar, and is also during a suspend attempt > >> > (but apparently Satish got a full oops out).. Some timing race with a > >> > worker entry? > > > > Comparing Satish's backtrace and drivers/gpu/drm history, it's clear that > > the oops comes from Daniel's 3.8-rc2 45e2b5f640b3 "drm/i915: force restore > > on lid open", whose force_restore case now passes down crtc->base.fb. But > > I wouldn't have a clue why that's usually non-NULL but occasionally NULL: > > your timing race with a worker entry, perhaps. > > > > And 45e2b5f640b3 contains a fine history of going back and forth, so I > > wouldn't want to play further with it out of ignorance - though tempted > > to replace the "if (force_restore) {" by an interim safe-seeming > > compromise of "if (force_restore && crtc->base.fb) {". My suggestion there in the line above was stupidly wrong :( > > Two things to try while I try to grow a clue on what exactly is going on: Thank you. By the way, I hope you've looked back through this thread, and realize that Dave and I both had ThinkPad T4?0s suspend/resume display problems, but they've gone off in different directions: so a lot of the discussion comes from Dave having CONFIG_PROVE_RCU_DELAY, and has nothing to do with what we now know to be my oops in i915/intel_display.c. > > 1. Related to new ACPI sleep states we've recently made the lid_notifier > locking more sound, maybe that helps: > > http://cgit.freedesktop.org/~danvet/drm-intel/commit/?h=drm-intel-next-queued&id=b8efb17b3d687695b81485f606fc4e6c35a50f9a Looks like it may be relevant, but I'll ignore it for now: preferring first to test the more minimal safety you suggest below. > > 2. The new i915 force restore code is indeed missing a safety check > compared to the old crtc helper based one: > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > index 6eb3882..095094c 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -9153,7 +9153,13 @@ void intel_modeset_setup_hw_state(struct drm_device *dev, > > if (force_restore) { > for_each_pipe(pipe) { > - intel_crtc_restore_mode(dev_priv->pipe_to_crtc_mapping[pipe]); > + struct drm_crtc * crtc = > + dev_priv->pipe_to_crtc_mapping[pipe]; > + > + if (!crtc->enabled) > + continue; > + > + intel_crtc_restore_mode(crtc); > } > > i915_redisable_vga(dev); I see your followup, where you observe that intel_modeset_affected_pipes() should already have made this check; but you do say it would still be good to prove one way or the other, so I'll run from now with the patch below. Note that we haven't got to worrying about what will be in 3.9 yet (linux-next tells me to expect hangcheck timer problems): it's Linus's current git for 3.8 final that we're working on at present. And since quick resumes have always worked for me, it's only occasionally that a long suspend (e.g. overnight) fails for me in this way, so I won't be able to report success for several days - but failure may come sooner. And, it being very tiresome to debug when it does fail, I have inserted WARN_ONs and more safety: here's what I'm running with now. --- 3.8-rc7/drivers/gpu/drm/i915/intel_display.c 2013-01-17 20:06:11.384002362 -0800 +++ linux/drivers/gpu/drm/i915/intel_display.c 2013-02-17 07:50:28.012075150 -0800 @@ -4156,7 +4156,9 @@ static bool intel_choose_pipe_bpp_dither * also stays within the max display bpc discovered above. */ - switch (fb->depth) { + if (WARN_ON(!fb)) + bpc = 8; + else switch (fb->depth) { case 8: bpc = 8; /* since we go through a colormap */ break; @@ -9302,6 +9304,10 @@ void intel_modeset_setup_hw_state(struct if (force_restore) { for_each_pipe(pipe) { crtc = to_intel_crtc(dev_priv->pipe_to_crtc_mapping[pipe]); + if (WARN_ON(!crtc->base.enabled)) + continue; + if (WARN_ON(!crtc->base.fb)) + continue; intel_set_mode(&crtc->base, &crtc->base.mode, crtc->base.x, crtc->base.y, crtc->base.fb); }