From patchwork Thu Dec 14 15:30:39 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Rafael J. Wysocki"
X-Patchwork-Id: 10112497
X-Patchwork-Delegate: bhelgaas@google.com
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
    by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 1215160327
    for ; Thu, 14 Dec 2017 15:31:28 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
    by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 02D8C28C70
    for ; Thu, 14 Dec 2017 15:31:28 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
    id EB596290E5; Thu, 14 Dec 2017 15:31:27 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI
    autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
    by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7EF7028C70
    for ; Thu, 14 Dec 2017 15:31:27 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1753015AbdLNPbZ (ORCPT ); Thu, 14 Dec 2017 10:31:25 -0500
Received: from cloudserver094114.home.net.pl ([79.96.170.134]:45068
    "EHLO cloudserver094114.home.net.pl" rhost-flags-OK-OK-OK-OK)
    by vger.kernel.org with ESMTP id S1752671AbdLNPbZ (ORCPT );
    Thu, 14 Dec 2017 10:31:25 -0500
Received: from 79.184.254.73.ipv4.supernova.orange.pl (79.184.254.73) (HELO aspire.rjw.lan)
    by serwer1319399.home.pl (79.96.170.134) with SMTP (IdeaSmtpServer 0.82)
    id 35e4823dc40a2a1c; Thu, 14 Dec 2017 16:31:22 +0100
From: "Rafael J. Wysocki"
To: Thomas Gleixner
Cc: Linus Torvalds, Bjorn Helgaas, Maarten Lankhorst, Michal Hocko,
    Andy Lutomirski, Linux Kernel Mailing List, the arch/x86 maintainers,
    Daniel Vetter, Bjorn Helgaas, "Rafael J. Wysocki",
    linux-pci@vger.kernel.org, linux-pm@vger.kernel.org
Subject: Re: Linux 4.15-rc2: Regression in resume from ACPI S3
Date: Thu, 14 Dec 2017 16:30:39 +0100
Message-ID: <5115041.vUGA3IjvdM@aspire.rjw.lan>
In-Reply-To:
References: <168050887.sZlTFXWCmO@aspire.rjw.lan> <3265333.8krWOQvcRi@aspire.rjw.lan>
MIME-Version: 1.0
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

On Thursday, December 14, 2017 1:30:37 PM CET Thomas Gleixner wrote:
> On Thu, 14 Dec 2017, Rafael J. Wysocki wrote:
> > On Thursday, December 14, 2017 12:54:05 PM CET Thomas Gleixner wrote:
> > > Now the graphics issue is a different story. That only happens on
> > > hibernation after doing the snapshot. There all non boot cpus are onlined
> > > again and after that the devices are 'thawed'. The following reenable of
> > > interrupts fails because i915 is not in PCI_D0 state.
> > >
> > > Suspend:
> > >
> > > irq_migrate_all_off_this_cpu: Mask 125 pci_msi_mask_irq+0x0/0x10
> > > __pci_write_msi_msg: 0000:00:02.0 00000000fee0100c 0000412a
> > > __pci_write_msi_msg: Not written <- Device not in PCI_D0
> > > ....
> > > device_pm_callback_start: i915 0000:00:02.0, parent: pci0000:00, noirq bus [resume]
> > > pci_pm_resume_noirq <-dpm_run_callback
> > > pci_pm_resume_noirq <-dpm_run_callback
> > > pci_pm_default_resume_early <-pci_pm_resume_noirq
> > > pci_pm_default_resume_early <-pci_pm_resume_noirq
> > > __pci_write_msi_msg: 0000:00:02.0 00000000fee0100c 0000412a <-- Set the new affinity
> > > device_pm_callback_end: i915 0000:00:02.0, err=0
> >
> > So this works, because we power up the device during resume even if it
> > had been suspended (via runtime PM) before the suspend started.
> >
> > > Hibernate:
> > >
> > > irq_migrate_all_off_this_cpu: Mask 125 pci_msi_mask_irq+0x0/0x10
> > > __pci_write_msi_msg: 0000:00:02.0 00000000fee0100c 0000412a
> > > __pci_write_msi_msg: Not written <- Device not in PCI_D0
> > > ....
> > > device_pm_callback_start: i915 0000:00:02.0, parent: pci0000:00, noirq bus [thaw]
> > > pci_pm_thaw_noirq <-dpm_run_callback
> > > __pci_write_msi_msg: 0000:00:02.0 00000000fee0100c 0000412a
> > > __pci_write_msi_msg: Not written <--- Device is not in PCI_D0
> > > device_pm_callback_end: i915 0000:00:02.0, err=0
> >
> > And here we try to leave the device alone which is OK for devices in D0,
> > but not for suspended ones.
> >
> > It looks like we need to power up them at the "thaw" time too or at least
> > I don't see how to address that differently.
>
> The question is whether the code which brings the device out of D0 should
> write the message unconditionally. That would be sufficient I think.

It doesn't have to do that.

The problem here is that pci_pm_thaw_noirq() calls pci_restore_state() which
in fact requires the device to be in D0, so the caller should put it into D0
instead of trying to "update" its power state.

[Note that the PCI layer doesn't put devices into low-power states during the
hibernation's "freeze" transition, but drivers can legitimately do that in
their "freeze" callbacks which was overlooked in that code and that's what
i915 does.]

So IMO what we need is the change below. I'm going to test it shortly, but
please give it a go too.
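
To illustrate the difference between "updating" the power state and actually
setting it, here is a minimal userspace sketch. It is a toy model, not kernel
code: struct toy_dev and every toy_* function are hypothetical names made up
for this example, and the only behavior carried over from the traces above is
that the MSI message is not written unless the device is known to be in D0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy power states; only the naming echoes PCI_D0 and D3hot. */
enum toy_power { TOY_D0 = 0, TOY_D3HOT = 3 };

/* Hypothetical stand-in for a PCI device, not the kernel's struct pci_dev. */
struct toy_dev {
	enum toy_power hw_state;      /* state the "hardware" is really in */
	enum toy_power current_state; /* state the core believes it is in */
	uint32_t hw_msi_data;         /* MSI data held by the "hardware" */
	uint32_t saved_msi_data;      /* MSI data the core wants to restore */
};

/* Refresh the cached state from the hardware; the hardware is untouched. */
static void toy_update_current_state(struct toy_dev *dev)
{
	dev->current_state = dev->hw_state;
}

/* Move the hardware into the requested state and cache the result. */
static void toy_set_power_state(struct toy_dev *dev, enum toy_power state)
{
	dev->hw_state = state;
	dev->current_state = state;
}

/* Mirrors the "Not written" trace lines: skip the write unless in D0. */
static bool toy_write_msi_msg(struct toy_dev *dev, uint32_t data)
{
	if (dev->current_state != TOY_D0)
		return false;
	dev->hw_msi_data = data;
	return true;
}

/* Restoring the state includes rewriting the MSI message. */
static bool toy_restore_state(struct toy_dev *dev)
{
	return toy_write_msi_msg(dev, dev->saved_msi_data);
}

/* force_d0 == false models the old thaw path, true the patched one. */
static bool toy_thaw_noirq(struct toy_dev *dev, bool force_d0)
{
	if (force_d0)
		toy_set_power_state(dev, TOY_D0);
	else
		toy_update_current_state(dev);
	return toy_restore_state(dev);
}

/* A device whose driver "freeze" callback left it in a low-power state. */
static struct toy_dev toy_frozen_dev(void)
{
	struct toy_dev dev = { TOY_D3HOT, TOY_D3HOT, 0, 0x412a };
	return dev;
}
```

In this model, toy_thaw_noirq(&dev, false) on a device from toy_frozen_dev()
returns false and leaves hw_msi_data untouched, whereas
toy_thaw_noirq(&dev, true) puts the device into D0 first, so the restore goes
through.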
---
 drivers/pci/pci-driver.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/pci/pci-driver.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-driver.c
+++ linux-pm/drivers/pci/pci-driver.c
@@ -1027,7 +1027,12 @@ static int pci_pm_thaw_noirq(struct devi
 	if (pci_has_legacy_pm_support(pci_dev))
 		return pci_legacy_resume_early(dev);
 
-	pci_update_current_state(pci_dev, PCI_D0);
+	/*
+	 * pci_restore_state() requires the device to be in D0 (because of MSI
+	 * restoration among other things), so force it into D0 in case the
+	 * driver's "freeze" callbacks put it into a low-power state directly.
+	 */
+	pci_set_power_state(pci_dev, PCI_D0);
 	pci_restore_state(pci_dev);
 
 	if (drv && drv->pm && drv->pm->thaw_noirq)