diff mbox

[2/2] PM / sleep: prohibit devices probing during suspend/hibernation

Message ID 1444323427-6822-3-git-send-email-grygorii.strashko@ti.com (mailing list archive)
State Changes Requested, archived
Headers show

Commit Message

Grygorii Strashko Oct. 8, 2015, 4:57 p.m. UTC
It is unsafe [1] if probing of devices will happen during suspend or
hibernation and system behavior will be unpredictable in this case.
So, lets prohibit device's probing in dpm_prepare() and defer their
probing instead. The normal behavior will be restored in
dpm_complete().

This patch introduces new DD core API device_defer_all_probes(bool enable)
to enable/disable probing of devices:
 if @enable = true
   It will disable probing of devices and defer their probes.
 otherwise
   It will restore normal behavior and trigger re-probing of deferred devices.

[1] https://lkml.org/lkml/2015/9/11/554
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
---
 drivers/base/base.h       |  1 +
 drivers/base/dd.c         | 35 ++++++++++++++++++++++++++++++++++-
 drivers/base/power/main.c | 13 +++++++++++++
 3 files changed, 48 insertions(+), 1 deletion(-)

Comments

Alan Stern Oct. 8, 2015, 5:24 p.m. UTC | #1
On Thu, 8 Oct 2015, Grygorii Strashko wrote:

> It is unsafe [1] if probing of devices will happen during suspend or
> hibernation and system behavior will be unpredictable in this case.
> So, lets prohibit device's probing in dpm_prepare() and defer their

s/lets/let's/, and same for the comment in the code.

> probing instead. The normal behavior will be restored in
> dpm_complete().


> @@ -172,6 +179,26 @@ static void driver_deferred_probe_trigger(void)
>  }
>  
>  /**
> + * device_defer_all_probes() - Enable/disable probing of devices
> + * @enable:  Enable/disable probing of devices
> + *
> + * if @enable = true
> + *	It will disable probing of devices and defer their probes.
> + * otherwise
> + *	It will restore normal behavior and trigger re-probing of deferred
> + *	devices.
> + */
> +void device_defer_all_probes(bool enable)
> +{
> +	defer_all_probes = enable;
> +	if (enable)
> +		/* sync with probes to avoid any races. */
> +		wait_for_device_probe();
> +	else
> +		driver_deferred_probe_trigger();
> +}

Some people might prefer to see two separate functions, an enable
routine and a disable routine.  I don't much care.

> @@ -277,9 +304,15 @@ static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
>  
>  static int really_probe(struct device *dev, struct device_driver *drv)
>  {
> -	int ret = 0;
> +	int ret = -EPROBE_DEFER;
>  	int local_trigger_count = atomic_read(&deferred_trigger_count);
>  
> +	if (defer_all_probes) {
> +		dev_dbg(dev, "Driver %s force probe deferral\n", drv->name);
> +		driver_deferred_probe_add(dev);
> +		return ret;
> +	}

In theory there's a race here.  If one CPU sets defer_all_probes, the 
new value might not be perceived by another CPU until a little while 
later.  Is there an easy way to insure that this race won't cause any 
problems?

Or do we already know that when this mechanism gets used, the system is 
already running on a single CPU?  I forget when that happens.

> @@ -1624,6 +1627,16 @@ int dpm_prepare(pm_message_t state)
>  	trace_suspend_resume(TPS("dpm_prepare"), state.event, true);
>  	might_sleep();
>  
> +	/* Give a chance for the known devices to complete their probing. */
> +	wait_for_device_probe();
> +	/*
> +	 * It is unsafe if probing of devices will happen during suspend or
> +	 * hibernation and system behavior will be unpredictable in this case.
> +	 * So, lets prohibit device's probing here and defer their probes
> +	 * instead. The normal behavior will be restored in dpm_complete().
> +	 */
> +	device_defer_all_probes(true);

Don't you want to call these two functions in the opposite order?  
First prevent new probes from occurring, then wait for any probes that 
are already in progress?  The way you have it here, a new probe could 
start between these two lines.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Grygorii Strashko Oct. 8, 2015, 6:54 p.m. UTC | #2
On 10/08/2015 12:24 PM, Alan Stern wrote:
> On Thu, 8 Oct 2015, Grygorii Strashko wrote:
> 
>> It is unsafe [1] if probing of devices will happen during suspend or
>> hibernation and system behavior will be unpredictable in this case.
>> So, lets prohibit device's probing in dpm_prepare() and defer their
> 
> s/lets/let's/, and same for the comment in the code.
> 
>> probing instead. The normal behavior will be restored in
>> dpm_complete().
> 
> 
>> @@ -172,6 +179,26 @@ static void driver_deferred_probe_trigger(void)
>>   }
>>   
>>   /**
>> + * device_defer_all_probes() - Enable/disable probing of devices
>> + * @enable:  Enable/disable probing of devices
>> + *
>> + * if @enable = true
>> + *	It will disable probing of devices and defer their probes.
>> + * otherwise
>> + *	It will restore normal behavior and trigger re-probing of deferred
>> + *	devices.
>> + */
>> +void device_defer_all_probes(bool enable)
>> +{
>> +	defer_all_probes = enable;
>> +	if (enable)
>> +		/* sync with probes to avoid any races. */
>> +		wait_for_device_probe();

^ pls, pay attention on above code line

>> +	else
>> +		driver_deferred_probe_trigger();
>> +}
> 
> Some people might prefer to see two separate functions, an enable
> routine and a disable routine.  I don't much care.

May be. Should I change it?

> 
>> @@ -277,9 +304,15 @@ static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
>>   
>>   static int really_probe(struct device *dev, struct device_driver *drv)
>>   {
>> -	int ret = 0;
>> +	int ret = -EPROBE_DEFER;
>>   	int local_trigger_count = atomic_read(&deferred_trigger_count);
>>   
>> +	if (defer_all_probes) {
>> +		dev_dbg(dev, "Driver %s force probe deferral\n", drv->name);
>> +		driver_deferred_probe_add(dev);
>> +		return ret;
>> +	}
> 
> In theory there's a race here.  If one CPU sets defer_all_probes, the
> new value might not be perceived by another CPU until a little while
> later.  Is there an easy way to insure that this race won't cause any
> problems?

Yes. this question was raised by Rafael also [1].

> 
> Or do we already know that when this mechanism gets used, the system is
> already running on a single CPU?  I forget when that happens.

No. nonboot cpus are still  on.

> 
>> @@ -1624,6 +1627,16 @@ int dpm_prepare(pm_message_t state)
>>   	trace_suspend_resume(TPS("dpm_prepare"), state.event, true);
>>   	might_sleep();
>>   
>> +	/* Give a chance for the known devices to complete their probing. */
>> +	wait_for_device_probe();

^ this sync point is important at least at boot time + hibernation restore

>> +	/*
>> +	 * It is unsafe if probing of devices will happen during suspend or
>> +	 * hibernation and system behavior will be unpredictable in this case.
>> +	 * So, lets prohibit device's probing here and defer their probes
>> +	 * instead. The normal behavior will be restored in dpm_complete().
>> +	 */
>> +	device_defer_all_probes(true);
> 
> Don't you want to call these two functions in the opposite order?
> First prevent new probes from occurring, then wait for any probes that
> are already in progress?  The way you have it here, a new probe could
> start between these two lines.

No. Initially I did it as below:
     wait_for_device_probe(); <- wait for active probes
     device_defer_all_probes(true); <- prohibit probing
     wait_for_device_probe(); <- sync again to avoid races

then I decided to move second wait_for_device_probe() call inside
device_defer_all_probes() because it's always required.

[1] https://lkml.org/lkml/2015/9/17/857
Alan Stern Oct. 8, 2015, 7:20 p.m. UTC | #3
On Thu, 8 Oct 2015, Grygorii Strashko wrote:

> >>   /**
> >> + * device_defer_all_probes() - Enable/disable probing of devices
> >> + * @enable:  Enable/disable probing of devices
> >> + *
> >> + * if @enable = true
> >> + *	It will disable probing of devices and defer their probes.
> >> + * otherwise
> >> + *	It will restore normal behavior and trigger re-probing of deferred
> >> + *	devices.
> >> + */
> >> +void device_defer_all_probes(bool enable)
> >> +{
> >> +	defer_all_probes = enable;
> >> +	if (enable)
> >> +		/* sync with probes to avoid any races. */
> >> +		wait_for_device_probe();
> 
> ^ pls, pay attention on above code line
> 
> >> +	else
> >> +		driver_deferred_probe_trigger();
> >> +}
> > 
> > Some people might prefer to see two separate functions, an enable
> > routine and a disable routine.  I don't much care.
> 
> May be. Should I change it?

It would then be more in line with functions like 
pm_runtime_set_{active|suspended} or pm_runtime_[dont_]use_autosuspend.

> >> @@ -277,9 +304,15 @@ static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
> >>   
> >>   static int really_probe(struct device *dev, struct device_driver *drv)
> >>   {
> >> -	int ret = 0;
> >> +	int ret = -EPROBE_DEFER;
> >>   	int local_trigger_count = atomic_read(&deferred_trigger_count);
> >>   
> >> +	if (defer_all_probes) {
> >> +		dev_dbg(dev, "Driver %s force probe deferral\n", drv->name);
> >> +		driver_deferred_probe_add(dev);
> >> +		return ret;
> >> +	}
> > 
> > In theory there's a race here.  If one CPU sets defer_all_probes, the
> > new value might not be perceived by another CPU until a little while
> > later.  Is there an easy way to insure that this race won't cause any
> > problems?
> 
> Yes. this question was raised by Rafael also [1].

I see.  Can you add a comment explaining all of this?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Grygorii Strashko Oct. 8, 2015, 7:48 p.m. UTC | #4
On 10/08/2015 02:20 PM, Alan Stern wrote:
> On Thu, 8 Oct 2015, Grygorii Strashko wrote:
> 
>>>>    /**
>>>> + * device_defer_all_probes() - Enable/disable probing of devices
>>>> + * @enable:  Enable/disable probing of devices
>>>> + *
>>>> + * if @enable = true
>>>> + *	It will disable probing of devices and defer their probes.
>>>> + * otherwise
>>>> + *	It will restore normal behavior and trigger re-probing of deferred
>>>> + *	devices.
>>>> + */
>>>> +void device_defer_all_probes(bool enable)
>>>> +{
>>>> +	defer_all_probes = enable;
>>>> +	if (enable)
>>>> +		/* sync with probes to avoid any races. */
>>>> +		wait_for_device_probe();
>>
>> ^ pls, pay attention on above code line
>>
>>>> +	else
>>>> +		driver_deferred_probe_trigger();
>>>> +}
>>>
>>> Some people might prefer to see two separate functions, an enable
>>> routine and a disable routine.  I don't much care.
>>
>> May be. Should I change it?
> 
> It would then be more in line with functions like
> pm_runtime_set_{active|suspended} or pm_runtime_[dont_]use_autosuspend.

ok

> 
>>>> @@ -277,9 +304,15 @@ static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
>>>>    
>>>>    static int really_probe(struct device *dev, struct device_driver *drv)
>>>>    {
>>>> -	int ret = 0;
>>>> +	int ret = -EPROBE_DEFER;
>>>>    	int local_trigger_count = atomic_read(&deferred_trigger_count);
>>>>    
>>>> +	if (defer_all_probes) {

Will it be ok If I add below comment here?
		/*
		 * Value of defer_all_probes can be set only by
		 * device_defer_all_probes_enable() which, in turn, will call
		 * wait_for_device_probe() right after that to avoid any races.
		 */


>>>> +		dev_dbg(dev, "Driver %s force probe deferral\n", drv->name);
>>>> +		driver_deferred_probe_add(dev);
>>>> +		return ret;
>>>> +	}
>>>
>>> In theory there's a race here.  If one CPU sets defer_all_probes, the
>>> new value might not be perceived by another CPU until a little while
>>> later.  Is there an easy way to insure that this race won't cause any
>>> problems?
>>
>> Yes. this question was raised by Rafael also [1].
> 
> I see.  Can you add a comment explaining all of this?
Alan Stern Oct. 8, 2015, 8:05 p.m. UTC | #5
On Thu, 8 Oct 2015, Grygorii Strashko wrote:

> >>>> @@ -277,9 +304,15 @@ static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
> >>>>    
> >>>>    static int really_probe(struct device *dev, struct device_driver *drv)
> >>>>    {
> >>>> -	int ret = 0;
> >>>> +	int ret = -EPROBE_DEFER;
> >>>>    	int local_trigger_count = atomic_read(&deferred_trigger_count);
> >>>>    
> >>>> +	if (defer_all_probes) {
> 
> Will it be ok If I add below comment here?
> 		/*
> 		 * Value of defer_all_probes can be set only by
> 		 * device_defer_all_probes_enable() which, in turn, will call
> 		 * wait_for_device_probe() right after that to avoid any races.
> 		 */

That will help.  You could also add a comment where
wait_for_device_probe is called before device_defer_all_probes_enable,
explaining that is needed.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Oct. 8, 2015, 8:46 p.m. UTC | #6
On Thursday, October 08, 2015 03:20:08 PM Alan Stern wrote:
> On Thu, 8 Oct 2015, Grygorii Strashko wrote:
> 
> > >>   /**
> > >> + * device_defer_all_probes() - Enable/disable probing of devices
> > >> + * @enable:  Enable/disable probing of devices
> > >> + *
> > >> + * if @enable = true
> > >> + *	It will disable probing of devices and defer their probes.
> > >> + * otherwise
> > >> + *	It will restore normal behavior and trigger re-probing of deferred
> > >> + *	devices.
> > >> + */
> > >> +void device_defer_all_probes(bool enable)
> > >> +{
> > >> +	defer_all_probes = enable;
> > >> +	if (enable)
> > >> +		/* sync with probes to avoid any races. */
> > >> +		wait_for_device_probe();
> > 
> > ^ pls, pay attention on above code line
> > 
> > >> +	else
> > >> +		driver_deferred_probe_trigger();
> > >> +}
> > > 
> > > Some people might prefer to see two separate functions, an enable
> > > routine and a disable routine.  I don't much care.
> > 
> > May be. Should I change it?
> 
> It would then be more in line with functions like 
> pm_runtime_set_{active|suspended} or pm_runtime_[dont_]use_autosuspend.

Also it would be cleaner code without conditionals:

enable:
	defer_all_probes = true;
	wait_for_device_probe();

disable:
	defer_all_probes = false;
	driver_deferred_probe_trigger();

Cleaner, no?

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Grygorii Strashko Oct. 9, 2015, 2:31 p.m. UTC | #7
On 10/08/2015 03:46 PM, Rafael J. Wysocki wrote:
> On Thursday, October 08, 2015 03:20:08 PM Alan Stern wrote:
>> On Thu, 8 Oct 2015, Grygorii Strashko wrote:
>>
>>>>>    /**
>>>>> + * device_defer_all_probes() - Enable/disable probing of devices
>>>>> + * @enable:  Enable/disable probing of devices
>>>>> + *
>>>>> + * if @enable = true
>>>>> + *	It will disable probing of devices and defer their probes.
>>>>> + * otherwise
>>>>> + *	It will restore normal behavior and trigger re-probing of deferred
>>>>> + *	devices.
>>>>> + */
>>>>> +void device_defer_all_probes(bool enable)
>>>>> +{
>>>>> +	defer_all_probes = enable;
>>>>> +	if (enable)
>>>>> +		/* sync with probes to avoid any races. */
>>>>> +		wait_for_device_probe();
>>>
>>> ^ pls, pay attention on above code line
>>>
>>>>> +	else
>>>>> +		driver_deferred_probe_trigger();
>>>>> +}
>>>>
>>>> Some people might prefer to see two separate functions, an enable
>>>> routine and a disable routine.  I don't much care.
>>>
>>> May be. Should I change it?
>>
>> It would then be more in line with functions like
>> pm_runtime_set_{active|suspended} or pm_runtime_[dont_]use_autosuspend.
>
> Also it would be cleaner code without conditionals:
>
> enable:
> 	defer_all_probes = true;
> 	wait_for_device_probe();
>
> disable:
> 	defer_all_probes = false;
> 	driver_deferred_probe_trigger();
>
> Cleaner, no?
>

NP. Will do.
diff mbox

Patch

diff --git a/drivers/base/base.h b/drivers/base/base.h
index 1782f3a..6c9684b 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -131,6 +131,7 @@  extern void device_remove_groups(struct device *dev,
 extern char *make_class_name(const char *name, struct kobject *kobj);
 
 extern int devres_release_all(struct device *dev);
+extern void device_defer_all_probes(bool enable);
 
 /* /sys/devices directory */
 extern struct kset *devices_kset;
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 98de1a5..d600c14 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -55,6 +55,13 @@  static struct workqueue_struct *deferred_wq;
 static atomic_t deferred_trigger_count = ATOMIC_INIT(0);
 
 /*
+ * In some cases, like suspend to RAM or hibernation, It might be reasonable
+ * to prohibit probing of devices as it could be unsafe.
+ * Once defer_all_probes is true all drivers probes will be forcibly deferred.
+ */
+static bool defer_all_probes;
+
+/*
  * deferred_probe_work_func() - Retry probing devices in the active list.
  */
 static void deferred_probe_work_func(struct work_struct *work)
@@ -172,6 +179,26 @@  static void driver_deferred_probe_trigger(void)
 }
 
 /**
+ * device_defer_all_probes() - Enable/disable probing of devices
+ * @enable:  Enable/disable probing of devices
+ *
+ * if @enable = true
+ *	It will disable probing of devices and defer their probes.
+ * otherwise
+ *	It will restore normal behavior and trigger re-probing of deferred
+ *	devices.
+ */
+void device_defer_all_probes(bool enable)
+{
+	defer_all_probes = enable;
+	if (enable)
+		/* sync with probes to avoid any races. */
+		wait_for_device_probe();
+	else
+		driver_deferred_probe_trigger();
+}
+
+/**
  * deferred_probe_initcall() - Enable probing of deferred devices
  *
  * We don't want to get in the way when the bulk of drivers are getting probed.
@@ -277,9 +304,15 @@  static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
 
 static int really_probe(struct device *dev, struct device_driver *drv)
 {
-	int ret = 0;
+	int ret = -EPROBE_DEFER;
 	int local_trigger_count = atomic_read(&deferred_trigger_count);
 
+	if (defer_all_probes) {
+		dev_dbg(dev, "Driver %s force probe deferral\n", drv->name);
+		driver_deferred_probe_add(dev);
+		return ret;
+	}
+
 	atomic_inc(&probe_count);
 	pr_debug("bus: '%s': %s: probing driver %s with device %s\n",
 		 drv->bus->name, __func__, drv->name, dev_name(dev));
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 1710c26..98c1da36 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -963,6 +963,9 @@  void dpm_complete(pm_message_t state)
 	}
 	list_splice(&list, &dpm_list);
 	mutex_unlock(&dpm_list_mtx);
+
+	/* Allow device probing and trigger re-probing of deferred devices */
+	device_defer_all_probes(false);
 	trace_suspend_resume(TPS("dpm_complete"), state.event, false);
 }
 
@@ -1624,6 +1627,16 @@  int dpm_prepare(pm_message_t state)
 	trace_suspend_resume(TPS("dpm_prepare"), state.event, true);
 	might_sleep();
 
+	/* Give a chance for the known devices to complete their probing. */
+	wait_for_device_probe();
+	/*
+	 * It is unsafe if probing of devices will happen during suspend or
+	 * hibernation and system behavior will be unpredictable in this case.
+	 * So, lets prohibit device's probing here and defer their probes
+	 * instead. The normal behavior will be restored in dpm_complete().
+	 */
+	device_defer_all_probes(true);
+
 	mutex_lock(&dpm_list_mtx);
 	while (!list_empty(&dpm_list)) {
 		struct device *dev = to_device(dpm_list.next);