From patchwork Wed Sep 11 09:01:35 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Markus Trippelsdorf
X-Patchwork-Id: 2871551
Date: Wed, 11 Sep 2013 11:01:35 +0200
From: Markus Trippelsdorf
To: Christian König
Subject: Re: [PATCH 0/3] drm/radeon kexec fixes
Message-ID: <20130911090135.GB359@x4>
References: <20130908120947.GA360@x4> <87bo42eswi.fsf@xmission.com>
 <20130909092140.GA359@x4> <522D972D.5090805@vodafone.de>
In-Reply-To: <522D972D.5090805@vodafone.de>
Cc: kexec@lists.infradead.org, "Eric W. Biederman",
 dri-devel@lists.freedesktop.org

On 2013.09.09 at 11:38 +0200, Christian König wrote:
> On 09.09.2013 11:21, Markus Trippelsdorf wrote:
> > On 2013.09.08 at 17:32 -0700, Eric W. Biederman wrote:
> >> Markus Trippelsdorf writes:
> >>
> >>> Here are a couple of patches that get kexec working with radeon devices.
> >>> I've tested this on my RS780.
> >>> Comments or flames are welcome.
> >>> Thanks.
> >> A couple of high level comments.
> >>
> >> This looks promising for the usual case.
> >>
> >> Removing the printk at the end of the kexec path seems a little dubious,
> >> what of other cpus, interrupt handlers, etc. Basically establishing a
> >> new rule on when printk is allowed seems a little dubious at this point,
> >> even if it is a useful debugging trick.
> > OK. I will drop this patch. It doesn't seem to be necessary, because I
> > cannot reproduce the printk related hang anymore.
> >
> >> Having a clean shutdown of the radeon definitely seems worth doing,
> >> because the cases where we care about video are when a person is in
> >> front of the system.
> > Yes. But please note that even with radeon_pci_shutdown implemented, I
> > still get ring test failures on roughly every eighth kexec boot:
> >
> > [drm:r600_dma_ring_test] *ERROR* radeon: ring 3 test failed (0xCAFEDEAD)
> > radeon 0000:01:05.0: disabling GPU acceleration
> >
> > That's definitely better than the current state of affairs, with ring
> > test failures on every second boot. But I haven't figured out the reason
> > for these failures yet. It's curious that once a ring test failure
> > occurs, it will reliably fail after each kexec invocation, no matter how
> > often repeated. Only a reboot brings the machine back to normal.
>
> The main problem here is that the AMD gfx hardware doesn't really
> support being reinitialized once booted (for various reasons). It's an
> (intended) limitation of the hardware design that you can only
> initialize certain blocks once every power cycle, so the whole approach
> actually will never work 100% reliably.
>
> All you can hope for is that stopping the hardware while shutting down
> the old kernel and starting it again results in exactly the same
> hardware parameters (offsets, clock etc...) otherwise starting the
> blocks will just fail and you end up with disabled acceleration like above.
>
> Sorry, but there isn't much we can do about this,

I've tested this further and it turned out that if I revert commit
f5d9b7f0f9 on top of my "drm/radeon: Implement radeon_pci_shutdown"
patch, the initialization failures seem to go away completely.

Any idea what's going on?

Here's the patch:

diff --git a/drivers/gpu/drm/radeon/r600_dpm.c b/drivers/gpu/drm/radeon/r600_dpm.c
index fa0de46..4e8c1988 100644
--- a/drivers/gpu/drm/radeon/r600_dpm.c
+++ b/drivers/gpu/drm/radeon/r600_dpm.c
@@ -296,9 +296,9 @@ bool r600_dynamicpm_enabled(struct radeon_device *rdev)
 void r600_enable_sclk_control(struct radeon_device *rdev, bool enable)
 {
 	if (enable)
-		WREG32_P(SCLK_PWRMGT_CNTL, 0, ~SCLK_PWRMGT_OFF);
+		WREG32_P(GENERAL_PWRMGT, 0, ~SCLK_PWRMGT_OFF);
 	else
-		WREG32_P(SCLK_PWRMGT_CNTL, SCLK_PWRMGT_OFF, ~SCLK_PWRMGT_OFF);
+		WREG32_P(GENERAL_PWRMGT, SCLK_PWRMGT_OFF, ~SCLK_PWRMGT_OFF);
 }
 
 void r600_enable_mclk_control(struct radeon_device *rdev, bool enable)
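
For anyone following along without the driver source at hand, here is a rough userspace sketch of what the two variants in the diff do. It assumes WREG32_P(reg, val, mask) is a read-modify-write that keeps the bits set in mask and takes the remaining bits from val. The fake register array, the register indices and the bit position of SCLK_PWRMGT_OFF are made up for illustration and are not the real hardware values; rreg32, wreg32, wreg32_p and enable_sclk_control are hypothetical stand-ins, not driver functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: indices into a fake register file, not real
 * MMIO offsets. The bit position below is likewise an assumption. */
#define GENERAL_PWRMGT   0u
#define SCLK_PWRMGT_CNTL 1u
#define NUM_REGS         2u
#define SCLK_PWRMGT_OFF  (1u << 0)

static uint32_t fake_regs[NUM_REGS];

static uint32_t rreg32(unsigned reg)
{
	assert(reg < NUM_REGS);
	return fake_regs[reg];
}

static void wreg32(unsigned reg, uint32_t val)
{
	assert(reg < NUM_REGS);
	fake_regs[reg] = val;
}

/* Masked write in the style of WREG32_P: bits set in mask are preserved,
 * bits clear in mask are replaced by the corresponding bits of val. */
static void wreg32_p(unsigned reg, uint32_t val, uint32_t mask)
{
	uint32_t tmp = rreg32(reg);

	tmp &= mask;
	tmp |= val & ~mask;
	wreg32(reg, tmp);
}

/* Same shape as r600_enable_sclk_control(), but with the target register
 * passed in so both sides of the diff can be compared. */
static void enable_sclk_control(unsigned reg, int enable)
{
	if (enable)
		wreg32_p(reg, 0, ~SCLK_PWRMGT_OFF);
	else
		wreg32_p(reg, SCLK_PWRMGT_OFF, ~SCLK_PWRMGT_OFF);
}
```

In this model, enable_sclk_control(GENERAL_PWRMGT, 1) clears only bit 0 of GENERAL_PWRMGT and leaves SCLK_PWRMGT_CNTL untouched, while enable_sclk_control(SCLK_PWRMGT_CNTL, 1) does the opposite. So the only difference between the two variants is which register has that single bit toggled.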