diff mbox

thinkpad-acpi: fix potential suspend blocking issue

Message ID 1362504883-9180-1-git-send-email-msb@chromium.org (mailing list archive)
State Superseded, archived

Commit Message

Mandeep Singh Baines March 5, 2013, 5:34 p.m. UTC
Fixes the following lockdep error:

[ BUG: ktpacpi_nvramd/446 still has locks held! ]

hotkey_kthread() calls set_freezable() after acquiring
hotkey_thread_mutex. set_freezable() calls try_to_freeze().
This could block suspend if we were to freeze at this point
and another task were to block on the mutex, potentially via
writing to one of the sysfs attrs. This race is unlikely but
can be easily fixed by moving the set_freezable() call.

Reported-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
CC: Aaron Lu <aaron.lu@intel.com>
CC: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
CC: Tejun Heo <tj@kernel.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
---
 drivers/platform/x86/thinkpad_acpi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Oleg Nesterov March 5, 2013, 5:48 p.m. UTC | #1
On 03/05, Mandeep Singh Baines wrote:
>
> @@ -2462,13 +2462,13 @@ static int hotkey_kthread(void *data)
>  	unsigned int poll_freq;
>  	bool was_frozen;
>  
> +	set_freezable();
> +
>  	mutex_lock(&hotkey_thread_mutex);
>  
>  	if (tpacpi_lifecycle == TPACPI_LIFE_EXITING)
>  		goto exit;
>  
> -	set_freezable();
> -

I don't understand this code... but don't we have the same problem
with kthread_freezable_should_stop() below? It can call __refrigerator()
too under the same lock.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mandeep Singh Baines March 5, 2013, 5:59 p.m. UTC | #2
On Tue, Mar 5, 2013 at 9:48 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/05, Mandeep Singh Baines wrote:
>>
>> @@ -2462,13 +2462,13 @@ static int hotkey_kthread(void *data)
>>       unsigned int poll_freq;
>>       bool was_frozen;
>>
>> +     set_freezable();
>> +
>>       mutex_lock(&hotkey_thread_mutex);
>>
>>       if (tpacpi_lifecycle == TPACPI_LIFE_EXITING)
>>               goto exit;
>>
>> -     set_freezable();
>> -
>
> I don't understand this code... but don't we have the same problem
> with kthread_freezable_should_stop() below? It can call __refrigerator()
> too under the same lock.
>

I don't think the lock is held at that point. There is an unlock right
before entering the while loop and at the bottom of the loop.

> Oleg.
>
Oleg Nesterov March 5, 2013, 6:05 p.m. UTC | #3
On 03/05, Mandeep Singh Baines wrote:
>
> On Tue, Mar 5, 2013 at 9:48 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> > On 03/05, Mandeep Singh Baines wrote:
> >>
> >> @@ -2462,13 +2462,13 @@ static int hotkey_kthread(void *data)
> >>       unsigned int poll_freq;
> >>       bool was_frozen;
> >>
> >> +     set_freezable();
> >> +
> >>       mutex_lock(&hotkey_thread_mutex);
> >>
> >>       if (tpacpi_lifecycle == TPACPI_LIFE_EXITING)
> >>               goto exit;
> >>
> >> -     set_freezable();
> >> -
> >
> > I don't understand this code... but don't we have the same problem
> > with kthread_freezable_should_stop() below? It can call __refrigerator()
> > too under the same lock.
> >
>
> I don't think the lock is held at that point. There is an unlock right
> before entering the while loop and at the bottom of the loop.

Hmm... Afaics this is another lock, hotkey_thread_data_mutex. But
hotkey_thread_mutex is still held.

Oleg.

Maciej Rutecki March 5, 2013, 7:18 p.m. UTC | #4
On Tuesday, 5 March 2013 at 18:34:43 Mandeep Singh Baines wrote:
> Fixes the following lockdep error:
> 
> [ BUG: ktpacpi_nvramd/446 still has locks held! ]
> 
> hotkey_kthread() calls set_freezable() after acquiring
> hotkey_thread_mutex. set_freezable() calls try_to_freeze().
> This could block suspend if we were to freeze at this point
> and another task were to block on the mutex, potentially via
> writing to one of the sysfs attrs. This race is unlikely but
> can be easily fixed by moving the set_freezable() call.
> 
> Reported-by: Maciej Rutecki <maciej.rutecki@gmail.com>
> Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
> CC: Aaron Lu <aaron.lu@intel.com>
> CC: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
> CC: Tejun Heo <tj@kernel.org>
> CC: Oleg Nesterov <oleg@redhat.com>
> CC: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>

Patch fixes the problem. Thanks!

Regards
Mandeep Singh Baines March 5, 2013, 8:55 p.m. UTC | #5
On Tue, Mar 5, 2013 at 10:05 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/05, Mandeep Singh Baines wrote:
>>
>> On Tue, Mar 5, 2013 at 9:48 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> > On 03/05, Mandeep Singh Baines wrote:
>> >>
>> >> @@ -2462,13 +2462,13 @@ static int hotkey_kthread(void *data)
>> >>       unsigned int poll_freq;
>> >>       bool was_frozen;
>> >>
>> >> +     set_freezable();
>> >> +
>> >>       mutex_lock(&hotkey_thread_mutex);
>> >>
>> >>       if (tpacpi_lifecycle == TPACPI_LIFE_EXITING)
>> >>               goto exit;
>> >>
>> >> -     set_freezable();
>> >> -
>> >
>> > I don't understand this code... but don't we have the same problem
>> > with kthread_freezable_should_stop() below? It can call __refrigerator()
>> > too under the same lock.
>> >
>>
>> I don't think the lock is held at that point. There is an unlock right
>> before entering the while loop and at the bottom of the loop.
>
> Hmm... Afaics this is another lock, hotkey_thread_data_mutex. But
> hotkey_thread_mutex is still held.
>

Ah. You're right. The two names were similar so that confused me. I'm
also looking at this code for the first time:)

This mutex seems wrong. It's held the entire time the kthread is
running. I think it's used to synchronize on the exit of the kthread. A
completion would be more appropriate in that case.

Regards,
Mandeep
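
For readers unfamiliar with the idea, here is a minimal userspace sketch of waiting for a thread to finish via a completion rather than via a mutex held for the thread's whole lifetime. It is not driver code: the kernel provides init_completion(), complete() and wait_for_completion(), while every name ending in _sketch below is a hypothetical pthread stand-in invented for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct completion_sketch {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion_sketch(struct completion_sketch *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Mark the event as having happened and wake every waiter. */
static void complete_sketch(struct completion_sketch *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Block until complete_sketch() has been called. */
static void wait_for_completion_sketch(struct completion_sketch *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical poller: does some work, then announces that it is done. */
struct poller_sketch {
	struct completion_sketch exited;
	int polls;
};

static void *poller_thread_sketch(void *arg)
{
	struct poller_sketch *p = arg;

	for (int i = 0; i < 3; i++)
		p->polls++;
	/* No lock is held while the poller runs; it only signals at the end. */
	complete_sketch(&p->exited);
	return NULL;
}

/* Start the poller, wait for its completion, return the work it did. */
static int run_completion_demo(void)
{
	struct poller_sketch p = { .polls = 0 };
	pthread_t tid;
	int polls;
	int rc;

	init_completion_sketch(&p.exited);
	rc = pthread_create(&tid, NULL, poller_thread_sketch, &p);
	assert(rc == 0);
	(void)rc;

	wait_for_completion_sketch(&p.exited);
	polls = p.polls;	/* ordered after the poller's writes */

	pthread_join(tid, NULL);
	return polls;
}
```

The point of the sketch is that the waiter blocks only on the "exited" event, so the poller never holds a lock across a point where it might sleep.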

> Oleg.
>
Andrew Morton March 5, 2013, 10:18 p.m. UTC | #6
On Tue,  5 Mar 2013 09:34:43 -0800 Mandeep Singh Baines <msb@chromium.org> wrote:

> Fixes the following lockdep error:
> 
> [ BUG: ktpacpi_nvramd/446 still has locks held! ]
> 
> hotkey_kthread() calls set_freezable() after acquiring
> hotkey_thread_mutex. set_freezable() calls try_to_freeze().
> This could block suspend if we were to freeze at this point
> and another task were to block on the mutex, potentially via
> writing to one of the sysfs attrs. This race is unlikely but
> can be easily fixed by moving the set_freezable() call.
>
> ...
>
> --- a/drivers/platform/x86/thinkpad_acpi.c
> +++ b/drivers/platform/x86/thinkpad_acpi.c
> @@ -2462,13 +2462,13 @@ static int hotkey_kthread(void *data)
>  	unsigned int poll_freq;
>  	bool was_frozen;
>  
> +	set_freezable();
> +
>  	mutex_lock(&hotkey_thread_mutex);
>  
>  	if (tpacpi_lifecycle == TPACPI_LIFE_EXITING)
>  		goto exit;
>  
> -	set_freezable();
> -
>  	so = 0;
>  	si = 1;
>  	t = 0;

Basically the same as
http://ozlabs.org/~akpm/mmots/broken-out/drivers-platform-x86-thinkpad_acpic-move-hotkey_thread_mutex-lock-after-set_freezable.patch.
 I think Artem's patch is a little better.  There doesn't appear to be
any locking protocol for tpacpi_lifecycle.

I'll move Artem's patch into my for-3.9-rc2 queue.
Henrique de Moraes Holschuh March 5, 2013, 11:26 p.m. UTC | #7
On Tue, 05 Mar 2013, Mandeep Singh Baines wrote:
> This mutex seems wrong. It's held the entire time the kthread is
> running. I think it's used to synchronize on the exit of the kthread. A
> completion would be more appropriate in that case.

From the top of the driver source:

/* Acquired while the poller kthread is running, use to sync start/stop */
static struct mutex hotkey_thread_mutex;

/*
 * Acquire mutex to write poller control variables as an
 * atomic block.
 *
 * Increment hotkey_config_change when changing them if you
 * want the kthread to forget old state.
 *
 * See HOTKEY_CONFIG_CRITICAL_START/HOTKEY_CONFIG_CRITICAL_END
 */
static struct mutex hotkey_thread_data_mutex;
static unsigned int hotkey_config_change;

#define HOTKEY_CONFIG_CRITICAL_START \
        do { \
                mutex_lock(&hotkey_thread_data_mutex); \
                hotkey_config_change++; \
        } while (0);
#define HOTKEY_CONFIG_CRITICAL_END \
        mutex_unlock(&hotkey_thread_data_mutex);


This can likely be modernized a lot.  This code is from 2008, I think it
first shipped in 2.6.25-rc1.
Oleg Nesterov March 6, 2013, 3:44 p.m. UTC | #8
On 03/05, Henrique de Moraes Holschuh wrote:
>
> On Tue, 05 Mar 2013, Mandeep Singh Baines wrote:
> > This mutex seems wrong. It's held the entire time the kthread is
> > running. I think it's used to synchronize on the exit of the kthread. A
> > completion would be more appropriate in that case.
>
> From the top of the driver source:
>
> /* Acquired while the poller kthread is running, use to sync start/stop */
> static struct mutex hotkey_thread_mutex;

I simply can't understand what this "sync start/stop" means...

Ignoring hotkey_kthread(), the only user is

	static void hotkey_poll_stop_sync(void)
	{
		if (tpacpi_hotkey_task) {
			kthread_stop(tpacpi_hotkey_task);
			tpacpi_hotkey_task = NULL;
			mutex_lock(&hotkey_thread_mutex);
			/* at this point, the thread did exit */
			mutex_unlock(&hotkey_thread_mutex);
		}
	}

And I simply do not understand the comment. This thread has already exited
when kthread_stop() returns (OK, it can be running do_exit() paths but this
doesn't matter). So this mutex_lock() buys nothing afaics.

As for serializing with hotkey_poll_setup/etc, looks like this code relies
on hotkey_mutex.

So I think hotkey_thread_mutex can be simply removed?

Oleg.

Oleg Nesterov March 6, 2013, 3:50 p.m. UTC | #9
On 03/05, Andrew Morton wrote:
>
> Basically the same as
> http://ozlabs.org/~akpm/mmots/broken-out/drivers-platform-x86-thinkpad_acpic-move-hotkey_thread_mutex-lock-after-set_freezable.patch.
>  I think Artem's patch is a little better.  There doesn't appear to be
> any locking protocol for tpacpi_lifecycle.

Which seems to have the same problem, hotkey_kthread() still calls
kthread_freezable_should_stop() under hotkey_thread_mutex.

IOW, we have two try_to_freeze's here, the patch moves only one of
them outside of the hotkey_thread_mutex.

Oleg.
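
The hazard under discussion, parking at a freeze point while still holding a mutex, can be modelled in userspace. The sketch below is only an illustration under stated assumptions: freeze_point_sketch() is a hypothetical stand-in for try_to_freeze(), thread_mutex stands in for hotkey_thread_mutex, and the "writer" is modelled by a single pthread_mutex_trylock() call. None of these names come from the driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model: one mutex plus a parked/thawed handshake. */
struct freeze_model {
	pthread_mutex_t thread_mutex;	/* stands in for hotkey_thread_mutex */
	pthread_mutex_t state_lock;	/* protects parked and thawed */
	pthread_cond_t state_cond;
	bool parked;
	bool thawed;
	bool freeze_under_lock;
};

/* Stand-in for try_to_freeze(): park here until the model is thawed. */
static void freeze_point_sketch(struct freeze_model *m)
{
	pthread_mutex_lock(&m->state_lock);
	m->parked = true;
	pthread_cond_broadcast(&m->state_cond);
	while (!m->thawed)
		pthread_cond_wait(&m->state_cond, &m->state_lock);
	pthread_mutex_unlock(&m->state_lock);
}

static void *poller_sketch(void *arg)
{
	struct freeze_model *m = arg;

	if (m->freeze_under_lock) {
		/* Problematic order: lock first, then freeze. */
		pthread_mutex_lock(&m->thread_mutex);
		freeze_point_sketch(m);
	} else {
		/* Safe order: freeze first, then lock. */
		freeze_point_sketch(m);
		pthread_mutex_lock(&m->thread_mutex);
	}
	pthread_mutex_unlock(&m->thread_mutex);
	return NULL;
}

/* Returns true if another task could take the mutex while the poller is frozen. */
static bool writer_can_lock_while_frozen(bool freeze_under_lock)
{
	struct freeze_model m;
	pthread_t tid;
	bool got_lock;
	int rc;

	pthread_mutex_init(&m.thread_mutex, NULL);
	pthread_mutex_init(&m.state_lock, NULL);
	pthread_cond_init(&m.state_cond, NULL);
	m.parked = false;
	m.thawed = false;
	m.freeze_under_lock = freeze_under_lock;

	rc = pthread_create(&tid, NULL, poller_sketch, &m);
	assert(rc == 0);
	(void)rc;

	/* Wait until the poller has parked at its freeze point. */
	pthread_mutex_lock(&m.state_lock);
	while (!m.parked)
		pthread_cond_wait(&m.state_cond, &m.state_lock);
	pthread_mutex_unlock(&m.state_lock);

	/* A writer that blocked here could not itself reach a freeze point. */
	got_lock = (pthread_mutex_trylock(&m.thread_mutex) == 0);
	if (got_lock)
		pthread_mutex_unlock(&m.thread_mutex);

	/* Thaw the poller and let it finish. */
	pthread_mutex_lock(&m.state_lock);
	m.thawed = true;
	pthread_cond_broadcast(&m.state_cond);
	pthread_mutex_unlock(&m.state_lock);

	pthread_join(tid, NULL);
	return got_lock;
}
```

In this model, writer_can_lock_while_frozen(true) reports that the mutex is unavailable while the poller is parked, and writer_can_lock_while_frozen(false) reports that it is available. The same reasoning applies to every freeze point reached under the lock, not only the first one.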

Artem Savkov March 6, 2013, 8:18 p.m. UTC | #10
On Wed, Mar 06, 2013 at 04:50:39PM +0100, Oleg Nesterov wrote:
> On 03/05, Andrew Morton wrote:
> >
> > Basically the same as
> > http://ozlabs.org/~akpm/mmots/broken-out/drivers-platform-x86-thinkpad_acpic-move-hotkey_thread_mutex-lock-after-set_freezable.patch.
> >  I think Artem's patch is a little better.  There doesn't appear to be
> > any locking protocol for tpacpi_lifecycle.
> 
> Which seems to have the same problem, hotkey_kthread() still calls
> kthread_freezable_should_stop() under hotkey_thread_mutex.
> 
> IOW, we have two try_to_freeze's here, the patch moves only one of
> them outside of the hotkey_thread_mutex.

It's hard for me to judge but this lock does indeed look like it has
been used to block until the thread exits. I'm trying out the "remove
hotkey_thread_mutex completely" approach and everything looks fine so
far.
Henrique de Moraes Holschuh March 6, 2013, 11:32 p.m. UTC | #11
On Wed, 06 Mar 2013, Oleg Nesterov wrote:
> On 03/05, Henrique de Moraes Holschuh wrote:
> > On Tue, 05 Mar 2013, Mandeep Singh Baines wrote:
> > > This mutex seems wrong. It's held the entire time the kthread is
> > > running. I think it's used to synchronize on the exit of the kthread. A
> > > completion would be more appropriate in that case.
> >
> > From the top of the driver source:
> >
> > /* Acquired while the poller kthread is running, use to sync start/stop */
> > static struct mutex hotkey_thread_mutex;
> 
> I simply can't understand what this "sync start/stop" means...
> 
> Ignoring hotkey_kthread(), the only user is
> 
> 	static void hotkey_poll_stop_sync(void)
> 	{
> 		if (tpacpi_hotkey_task) {
> 			kthread_stop(tpacpi_hotkey_task);
> 			tpacpi_hotkey_task = NULL;
> 			mutex_lock(&hotkey_thread_mutex);
> 			/* at this point, the thread did exit */
> 			mutex_unlock(&hotkey_thread_mutex);
> 		}
> 	}
> 
> And I simply do not understand the comment. This thread has already exited
> when kthread_stop() returns (OK, it can be running do_exit() paths but this
> doesn't matter). So this mutex_lock() buys nothing afaics.

It was added due to an oops, waaaaay back then.  If it is not needed
anymore, and there is zero chance of the kthread still being active when
hotkey_poll_stop_sync() ends, hotkey_thread_mutex can be simply removed.

Note that hotkey_thread_data_mutex is still required.

> As for serializing with hotkey_poll_setup/etc, looks like this code relies
> on hotkey_mutex.
> 
> So I think hotkey_thread_mutex can be simply removed?

Looks like it, if the current semantics of kthread_stop() are synchronous.
Oleg Nesterov March 7, 2013, 5:53 p.m. UTC | #12
On 03/06, Henrique de Moraes Holschuh wrote:
>
> On Wed, 06 Mar 2013, Oleg Nesterov wrote:
> >
> > 	static void hotkey_poll_stop_sync(void)
> > 	{
> > 		if (tpacpi_hotkey_task) {
> > 			kthread_stop(tpacpi_hotkey_task);
> > 			tpacpi_hotkey_task = NULL;
> > 			mutex_lock(&hotkey_thread_mutex);
> > 			/* at this point, the thread did exit */
> > 			mutex_unlock(&hotkey_thread_mutex);
> > 		}
> > 	}
> >
> > And I simply do not understand the comment. This thread has already exited
> > when kthread_stop() returns (OK, it can be running do_exit() paths but this
> > doesn't matter). So this mutex_lock() buys nothing afaics.
>
> It was added due to an oops, waaaaay back then.  If it is not needed
> anymore, and there is zero chance of the kthread still being active when
> hotkey_poll_stop_sync() ends, hotkey_thread_mutex can be simply removed.

Well, there could be another bug. Say, hotkey_poll_stop_sync() can block
on hotkey_thread_mutex if another thread was started. But at first glance
this can't happen (hotkey_mutex), and even _if_ it can this needs another
fix.

> Looks like it, if the current semantics of kthread_stop() are synchronous.

IIRC, it always was... But at least currently it is certainly synchronous.
kthread_stop(t) does wait_for_completion(t->vfork_done), complete(vfork_done)
can't happen unless this task calls do_exit().

Hmm. I just noticed that the recent changes in kthread_stop() are not correct...
But this is offtopic and doesn't affect thinkpad_acpi.c, I'll write another
email later.

So, what do you think about (UNTESTED) 1/1 ?

Oleg.
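
As a rough userspace analogy for the argument above, pthread_join() plays the role that kthread_stop() plays in the kernel: it returns only after the target thread has finished. The sketch below assumes that analogy and uses hypothetical names (worker_sketch, stop_worker_sync_sketch); it is not the proposed driver change.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical worker state; none of these names exist in the driver. */
struct worker_sketch {
	pthread_t tid;
	atomic_bool should_stop;	/* analogous to kthread_should_stop() */
	bool exited;			/* written by the worker just before it returns */
};

static void *worker_main_sketch(void *arg)
{
	struct worker_sketch *w = arg;

	/* Poll until asked to stop, yielding the CPU between checks. */
	while (!atomic_load(&w->should_stop))
		sched_yield();

	w->exited = true;
	return NULL;
}

/*
 * Ask the worker to stop and wait for it. pthread_join() returns only
 * after worker_main_sketch() has returned, so no additional lock/unlock
 * pair is needed afterwards to know that the worker is gone.
 */
static bool stop_worker_sync_sketch(struct worker_sketch *w)
{
	atomic_store(&w->should_stop, true);
	pthread_join(w->tid, NULL);
	return w->exited;
}

/* Start a worker, stop it synchronously, report whether it had exited. */
static bool start_and_stop_worker(void)
{
	struct worker_sketch w;
	int rc;

	atomic_init(&w.should_stop, false);
	w.exited = false;

	rc = pthread_create(&w.tid, NULL, worker_main_sketch, &w);
	assert(rc == 0);
	(void)rc;

	return stop_worker_sync_sketch(&w);
}
```

Because the join already guarantees that the worker has returned, a lock/unlock pair placed after it would add no further ordering in this model.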


Patch

diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c
index 9a90756..69870a841 100644
--- a/drivers/platform/x86/thinkpad_acpi.c
+++ b/drivers/platform/x86/thinkpad_acpi.c
@@ -2462,13 +2462,13 @@  static int hotkey_kthread(void *data)
 	unsigned int poll_freq;
 	bool was_frozen;
 
+	set_freezable();
+
 	mutex_lock(&hotkey_thread_mutex);
 
 	if (tpacpi_lifecycle == TPACPI_LIFE_EXITING)
 		goto exit;
 
-	set_freezable();
-
 	so = 0;
 	si = 1;
 	t = 0;