PCI: Freeze PME scan before suspending devices

Laurent Pinchart reported that the Renesas R-Car H2 Lager board
(r8a7790) crashes during suspend tests.  Geert Uytterhoeven managed to
reproduce the issue on an M2-W Koelsch board (r8a7791).

It occurs when the PME scan runs, once per second.  During PME scan, the
PCI host bridge (rcar-pci) registers are accessed while its module clock
has already been disabled, leading to the crash.

One reproducer is to configure s2ram to use "s2idle" instead of "deep"
suspend:

# echo 0 > /sys/module/printk/parameters/console_suspend
# echo s2idle > /sys/power/mem_sleep
# echo mem > /sys/power/state

Another reproducer is to write either "platform" or "processors" to
/sys/power/pm_test (commands below).  The crash does not happen (or is
less likely to happen) during full system suspend ("core" or "none")
because system suspend also disables timers, and thus the workqueue
handling PME scans no longer runs.  Geert believes the issue may still
happen in the small window between disabling module clocks and disabling
timers.

# echo 0 > /sys/module/printk/parameters/console_suspend
# echo platform > /sys/power/pm_test    # Or "processors"
# echo mem > /sys/power/state

(Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are
enabled.)

Rafael Wysocki agrees that PME scans should be suspended before the host
bridge registers become inaccessible.  To that end, queue the task on a
workqueue that gets frozen before devices suspend.

Rafael notes however that as a result, some wakeup events may be missed
if they are delivered via PME from a device without working IRQ (which
hence must be polled) and occur after the workqueue has been frozen.
If that turns out to be an issue in practice, it may be possible to
solve it by calling pci_pme_list_scan() once directly from one of the
host bridge's pm_ops callbacks.
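
The ordering argument above can be illustrated with a toy userspace
model.  This is only a sketch: every name in it (model_wq, model_bridge,
model_pme_scan, model_suspend) is hypothetical and none of it is kernel
code.  It assumes that freezable workqueues stop at roughly the point
where the log prints "Freezing remaining freezable tasks", i.e. before
devices suspend.  In the kernel, the natural candidate for such a queue
would be system_freezable_wq instead of the default "events" workqueue
seen in the stack trace, but since the diff is not reproduced here, that
mapping is an assumption too.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; none of these names exist in the kernel. */

enum scan_result {
	SCAN_OK,      /* config space read succeeded */
	SCAN_NOT_RUN, /* work item stayed queued because the queue is frozen */
	SCAN_FAULT    /* register accessed with the module clock gated */
};

struct model_wq {
	bool freezable; /* frozen together with freezable tasks */
	bool frozen;
};

struct model_bridge {
	bool clock_enabled;
};

/* One expiry of the polling timer: try to run the PME scan. */
static enum scan_result model_pme_scan(const struct model_wq *wq,
				       const struct model_bridge *hb)
{
	if (wq->frozen)
		return SCAN_NOT_RUN;
	return hb->clock_enabled ? SCAN_OK : SCAN_FAULT;
}

/*
 * Walk through the suspend phases and report what happens when the
 * polling timer fires after devices have been suspended.
 */
static enum scan_result model_suspend(bool freezable_wq)
{
	struct model_wq wq = { .freezable = freezable_wq, .frozen = false };
	struct model_bridge hb = { .clock_enabled = true };

	/* Task freezing phase: freezable queues stop executing work here. */
	if (wq.freezable)
		wq.frozen = true;

	/* Device suspend: the host bridge's module clock is disabled. */
	hb.clock_enabled = false;

	/* Model invariant: the clock is gated before the timer fires. */
	assert(!hb.clock_enabled);

	return model_pme_scan(&wq, &hb);
}
```

In this model, model_suspend(false) ends in SCAN_FAULT, which corresponds
to the crash in the trace below, while model_suspend(true) ends in
SCAN_NOT_RUN.  The latter also shows the trade-off Rafael points out: a
scan that does not run cannot notice a PME raised after the freeze.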

Stacktrace for posterity:

PM: Syncing filesystems ... [   38.566237] done.
PM: Preparing system for sleep (mem)
Freezing user space processes ... [   38.579813] (elapsed 0.001 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: Suspending system (mem)
PM: suspend of devices complete after 152.456 msecs
PM: late suspend of devices complete after 2.809 msecs
PM: noirq suspend of devices complete after 29.863 msecs
suspend debug: Waiting for 5 second(s).
Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
pgd = c0003000
[00000000] *pgd=80000040004003, *pmd=00000000
Internal error: : 1211 [#1] SMP ARM
Modules linked in:
CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted
4.9.0-rc1-koelsch-00011-g68db9bc814362e7f #3383
Hardware name: Generic R8A7791 (Flattened Device Tree)
Workqueue: events pci_pme_list_scan
task: eb56e140 task.stack: eb58e000
PC is at pci_generic_config_read+0x64/0x6c
LR is at rcar_pci_cfg_base+0x64/0x84
pc : [<c041d7b4>]    lr : [<c04309a0>]    psr: 600d0093
sp : eb58fe98  ip : c041d750  fp : 00000008
r10: c0e2283c  r9 : 00000000  r8 : 600d0013
r7 : 00000008  r6 : eb58fed6  r5 : 00000002  r4 : eb58feb4
r3 : 00000000  r2 : 00000044  r1 : 00000008  r0 : 00000000
Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 30c5387d  Table: 6a9f6c80  DAC: 55555555
Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210)
Stack: (0xeb58fe98 to 0xeb590000)
fe80:                                                       00000002 00000044
fea0: eb6f5800 c041d9b0 eb58feb4 00000008 00000044 00000000 eb78a000 eb78a000
fec0: 00000044 00000000 eb9aff00 c0424bf0 eb78a000 00000000 eb78a000 c0e22830
fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc
ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100
ff20: eb55f398 c02366c4 eb56e140 eb5631c0 00000000 eb55f380 c023641c 00000000
ff40: 00000000 00000000 00000000 c023a928 cd105598 00000000 40506a34 eb55f380
ff60: 00000000 00000000 dead4ead ffffffff ffffffff eb58ff74 eb58ff74 00000000
ff80: 00000000 dead4ead ffffffff ffffffff eb58ff90 eb58ff90 eb58ffac eb5631c0
ffa0: c023a844 00000000 00000000 c0206d68 00000000 00000000 00000000 00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 3a81336c 10ccd1dd
[<c041d7b4>] (pci_generic_config_read) from [<c041d9b0>]
(pci_bus_read_config_word+0x58/0x80)
[<c041d9b0>] (pci_bus_read_config_word) from [<c0424bf0>]
(pci_check_pme_status+0x34/0x78)
[<c0424bf0>] (pci_check_pme_status) from [<c0424c5c>] (pci_pme_wakeup+0x28/0x54)
[<c0424c5c>] (pci_pme_wakeup) from [<c0424ce0>] (pci_pme_list_scan+0x58/0xb4)
[<c0424ce0>] (pci_pme_list_scan) from [<c0235fbc>]
(process_one_work+0x1bc/0x308)
[<c0235fbc>] (process_one_work) from [<c02366c4>] (worker_thread+0x2a8/0x3e0)
[<c02366c4>] (worker_thread) from [<c023a928>] (kthread+0xe4/0xfc)
[<c023a928>] (kthread) from [<c0206d68>] (ret_from_fork+0x14/0x2c)
Code: ea000000 e5903000 f57ff04f e3a00000 (e5843000)
---[ end trace 667d43ba3aa9e589 ]---

Reported-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Cc: Simon Horman <horms+renesas@verge.net.au>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: stable@vger.kernel.org # 2.6.37+
Fixes: df17e62e5bff ("PCI: Add support for polling PME state on suspended legacy PCI devices")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/pci/pci.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Message ID: 03aaf3220cccbabb7db86aa4eb10d1147add5c34.1492536769.git.lukas@wunner.de
State: Accepted
Delegated to: Geert Uytterhoeven
From: Lukas Wunner <lukas@wunner.de>
Date: Tue, 18 Apr 2017 20:44:30 +0200
To: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org, linux-pm@vger.kernel.org, linux-renesas-soc@vger.kernel.org
