From patchwork Thu Sep 17 15:48:29 2015
X-Patchwork-Submitter: Grygorii Strashko
X-Patchwork-Id: 7208661
Subject: Re: [PATCH] driver core: Ensure proper suspend/resume ordering
To: "Rafael J. Wysocki", Alan Stern
References: <4548634.pC6GYZYGWm@vostro.rjw.lan>
CC: Thierry Reding, Greg Kroah-Hartman, "Rafael J. Wysocki", Tomeu Vizoso
From: Grygorii Strashko
Message-ID: <55FAE0CD.2030605@ti.com>
Date: Thu, 17 Sep 2015 18:48:29 +0300
In-Reply-To: <4548634.pC6GYZYGWm@vostro.rjw.lan>
X-Mailing-List: linux-pm@vger.kernel.org

Hi,

On 09/17/2015 03:07 AM, Rafael J. Wysocki wrote:
> On Wednesday, September 16, 2015 03:27:55 PM Alan Stern wrote:
>> On Wed, 16 Sep 2015, Grygorii Strashko wrote:
>>
>>> I think it should be prohibited to probe devices during suspend/hibernation.
>>> And the solution introduced in this patch might help to fix it -
>>> in general, we could do:
>>> - add sync point on suspend enter: wait_for_device_probe() and
>>> - prohibit probing: move all devices which will request probing into
>>>   deferred_probe list
>>> - on suspend exit: allow probing and do driver_deferred_probe_trigger
>>
>> That could work; it's a good idea.
>>
>>> I'd like to mention here that this patch will work only
>>> if dpm_list is filled according to device creation order ("parent<-child" dependencies)
>>> *AND* according to the devices' probing order ("supplier<-consumer").
>>> So, if there is a case where a parent device can be probed AFTER its children
>>> - it will not work, because "parent<-child" dependencies will not be tracked
>>> any more :( Sorry, I could not even imagine that such a crazy case exists :'(
>>
>> If we avoid moving devices to the end of the dpm_list when they already
>> have children, then we should be okay, right?
>>
>>> Are there any other subsystems with the same behavior as PCI?
>>
>> I don't know.
>>
>>> If not - probably, it could be fixed in the PCI subsystem using device_pm_move_after() or
>>> device_move() in PCIe ports probe.
>>> If yes - ... maybe we can scan/re-check and reorder dpm_list on suspend enter and
>>> restore ("parent<-child" dependencies).
>>
>>> Truth is that something needs to be done, 100%. Personally, I was hit by this issue also,
>>> and it cost me 3 hours of debugging and I came up with the same patch as
>>> Bill Huang, then spent some time trying to understand what is wrong with PCI
>>> - finally, I've just changed the order of my devices in DT :)
>>>
>>> Also, I think it would be good to have this patch in -next to collect more feedback.
>>
>> I like the idea of forcing all probes during a sleep transition to be
>> deferred. We could carry them out just before unfreezing the user
>> threads. That combined with the change mentioned above ought to be
>> worth testing.
>
> Agreed.
>

I've prepared a code change (below) that should prohibit device probing during
suspend/hibernation. It is also expected to fix wait_for_device_probe() so that
it takes into account the case where the deferred probe workqueue is still
active.

NOTE: It's only compile-tested!

I'm very sorry that I'm replying here instead of sending a proper patch - I'm on
a business trip right now, I will also be traveling next week, and I will not be
able to work on it intensively. If the proposed approach is correct, I can send
RFC/RFT patches (or anyone else who wants to move forward faster could pick it
up).

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index be0eb46..dcadf30 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -55,6 +55,14 @@ static struct workqueue_struct *deferred_wq;
 static atomic_t deferred_trigger_count = ATOMIC_INIT(0);
 
 /*
+ * In some cases, like suspend to RAM or hibernation, it might be reasonable
+ * to prohibit probing of devices as it could be unsafe.
+ * Once driver_force_probe_deferral is true, all driver probes will
+ * be forcibly deferred.
+ */
+static bool driver_force_probe_deferral;
+
+/*
  * deferred_probe_work_func() - Retry probing devices in the active list.
  */
 static void deferred_probe_work_func(struct work_struct *work)
@@ -171,6 +179,14 @@ static void driver_deferred_probe_trigger(void)
 	queue_work(deferred_wq, &deferred_probe_work);
 }
 
+void device_force_probe_deferral(bool enable)
+{
+	driver_force_probe_deferral = enable;
+	if (!enable)
+		driver_deferred_probe_trigger();
+}
+EXPORT_SYMBOL_GPL(device_force_probe_deferral);
+
 /**
  * deferred_probe_initcall() - Enable probing of deferred devices
  *
@@ -277,9 +293,15 @@ static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
 
 static int really_probe(struct device *dev, struct device_driver *drv)
 {
-	int ret = 0;
+	int ret = -EPROBE_DEFER;
 	int local_trigger_count = atomic_read(&deferred_trigger_count);
 
+	if (driver_force_probe_deferral) {
+		dev_dbg(dev, "Driver %s force probe deferral\n", drv->name);
+		driver_deferred_probe_add(dev);
+		return ret;
+	}
+
 	atomic_inc(&probe_count);
 	pr_debug("bus: '%s': %s: probing driver %s with device %s\n",
 		 drv->bus->name, __func__, drv->name, dev_name(dev));
@@ -391,6 +413,10 @@ int driver_probe_done(void)
  */
 void wait_for_device_probe(void)
 {
+	/* wait for the deferred probe workqueue to finish */
+	if (driver_deferred_probe_enable)
+		flush_workqueue(deferred_wq);
+
 	/* wait for the known devices to complete their probing */
 	wait_event(probe_waitqueue, atomic_read(&probe_count) == 0);
 	async_synchronize_full();
diff --git a/include/linux/device.h b/include/linux/device.h
index 5d7bc63..c68b8e1 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1034,6 +1034,7 @@ extern int __must_check device_attach(struct device *dev);
 extern int __must_check driver_attach(struct device_driver *drv);
 extern void device_initial_probe(struct device *dev);
 extern int __must_check device_reprobe(struct device *dev);
+extern void device_force_probe_deferral(bool enable);
 
 /*
  * Easy functions for dynamically creating devices on the fly
diff --git a/kernel/power/process.c b/kernel/power/process.c
index 564f786..c13e78d 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -148,6 +148,13 @@ int freeze_processes(void)
 	if (!error && !oom_killer_disable())
 		error = -EBUSY;
 
+	if (!error) {
+		/* wait for the known devices to complete their probing */
+		wait_for_device_probe();
+		device_force_probe_deferral(true);
+		wait_for_device_probe();
+	}
+
 	if (error)
 		thaw_processes();
 	return error;
@@ -190,6 +197,7 @@ void thaw_processes(void)
 	atomic_dec(&system_freezing_cnt);
 	pm_freezing = false;
 	pm_nosig_freezing = false;
+	device_force_probe_deferral(false);
 
 	oom_killer_enable();
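
For illustration only, here is a minimal user-space sketch of the forced-deferral
idea in the patch: a gate flag, a deferred list, and a retry when the gate is
reopened. It is not kernel code. All sim_* names, the fixed-size array, and
SIM_MAX_DEFERRED are hypothetical; the real code uses the driver core's
deferred probe lists and workqueue, which this sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_EPROBE_DEFER 517	/* same numeric value as the kernel's EPROBE_DEFER */
#define SIM_MAX_DEFERRED 8	/* arbitrary capacity chosen for this sketch */

/* Hypothetical stand-in for struct device. */
struct sim_device {
	const char *name;
	bool bound;
};

static bool force_probe_deferral;	/* the "gate": true while suspending */
static struct sim_device *deferred[SIM_MAX_DEFERRED];
static size_t deferred_cnt;

/* Probe one device, or park it on the deferred list while the gate is closed. */
static int sim_probe(struct sim_device *dev)
{
	if (force_probe_deferral) {
		assert(deferred_cnt < SIM_MAX_DEFERRED);
		deferred[deferred_cnt++] = dev;
		return -SIM_EPROBE_DEFER;
	}
	dev->bound = true;
	return 0;
}

/* Retry every parked device; anything deferred again is re-queued by sim_probe(). */
static void sim_deferred_probe_trigger(void)
{
	struct sim_device *pending[SIM_MAX_DEFERRED];
	size_t i, n = deferred_cnt;

	/* Snapshot the list first so re-queueing cannot disturb the iteration. */
	for (i = 0; i < n; i++)
		pending[i] = deferred[i];
	deferred_cnt = 0;

	for (i = 0; i < n; i++)
		sim_probe(pending[i]);
}

/* Close the gate on suspend entry; reopen it and retry on resume. */
static void sim_force_probe_deferral(bool enable)
{
	force_probe_deferral = enable;
	if (!enable)
		sim_deferred_probe_trigger();
}
```

With the gate closed via sim_force_probe_deferral(true), sim_probe() returns
-SIM_EPROBE_DEFER and leaves the device unbound; sim_force_probe_deferral(false)
then retries and binds it, which mirrors the freeze_processes() and
thaw_processes() call sites in the patch.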