workqueue: Handle race between wake up and rebind

Message ID	1516005492-4994-1-git-send-email-neeraju@codeaurora.org (mailing list archive)
State	Not Applicable, archived
Delegated to:	Andy Gross
Headers	show Return-Path: <linux-arm-msm-owner@kernel.org> sender: neeraju@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 0C8476053B; Mon, 15 Jan 2018 08:38:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 0C8476053B From: Neeraj Upadhyay <neeraju@codeaurora.org> To: tj@kernel.org, jiangshanlai@gmail.com Cc: linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, prsood@codeaurora.org, sramana@codeaurora.org, Neeraj Upadhyay <neeraju@codeaurora.org> Subject: [PATCH] workqueue: Handle race between wake up and rebind Date: Mon, 15 Jan 2018 14:08:12 +0530 Message-Id: <1516005492-4994-1-git-send-email-neeraju@codeaurora.org> Sender: linux-arm-msm-owner@vger.kernel.org Precedence: bulk

Message ID

1516005492-4994-1-git-send-email-neeraju@codeaurora.org (mailing list archive)

State

Not Applicable, archived

Delegated to:

Andy Gross

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 0C8476053B
From: Neeraj Upadhyay <neeraju@codeaurora.org>
To: tj@kernel.org, jiangshanlai@gmail.com
Cc: linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	prsood@codeaurora.org, sramana@codeaurora.org,
	Neeraj Upadhyay <neeraju@codeaurora.org>
Subject: [PATCH] workqueue: Handle race between wake up and rebind
Date: Mon, 15 Jan 2018 14:08:12 +0530
Message-Id: <1516005492-4994-1-git-send-email-neeraju@codeaurora.org>
Sender: linux-arm-msm-owner@vger.kernel.org
Precedence: bulk

Commit Message

Neeraj Upadhyay Jan. 15, 2018, 8:38 a.m. UTC

There is a potential race b/w rebind_workers() and
wakeup of a worker thread, which can result in
workqueue lockup for a bounder worker pool.

Below is the potential race:

- cpu0 is a bounded worker pool, which is unbound
  from its cpu. A new work is queued on this pool,
  which causes its worker (kworker/0:0) to be woken
  up on a cpu different from cpu0, lets say cpu1.

  workqueue_queue_work
    workqueue_activate_work
      <worker 0 is woken up on cpu1>

- cpu0 rebind happens
  rebind_workers()
    Clears POOL_DISASSOCIATED and binds cpumask of all
    workers.

- kworker/0:0 gets chance to run on cpu1; while processing
  a work, it goes to sleep. However, it does not decrement
  pool->nr_running. This is because WORKER_REBOUND (NOT_
  RUNNING) flag was cleared, when worker entered worker_
  thread().

  Worker 0 runs on cpu1
    worker_thread()
      process_one_work()
        wq_worker_sleeping()
          if (worker->flags & WORKER_NOT_RUNNING)
            return NULL;
          if (WARN_ON_ONCE(pool->cpu != raw_smp_processor_id()))
            <Does not decrement nr_running>

- After this, when kworker/0:0 wakes up, this time on its
  bounded cpu cpu0, it increments pool->nr_running again.
  So, pool->nr_running becomes 2.

- When kworker/0:0 enters idle, it decrements pool->nr_running
  by 1. This leaves pool->nr_running =1 , with no workers in
  runnable state.

- Now, no new workers will be woken up, as pool->nr_running is
  non-zero. This results in indefinite lockup for this pool.

Fix this by deferring the work to some other idle worker,
if the current worker is not bound to its pool's CPU.

Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
---
 kernel/workqueue.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Tejun Heo Jan. 16, 2018, 5:35 p.m. UTC | #1

Hello, Neeraj.

On Mon, Jan 15, 2018 at 02:08:12PM +0530, Neeraj Upadhyay wrote:
> - kworker/0:0 gets chance to run on cpu1; while processing
>   a work, it goes to sleep. However, it does not decrement
>   pool->nr_running. This is because WORKER_REBOUND (NOT_
>   RUNNING) flag was cleared, when worker entered worker_

Do you mean that because REBOUND was set?

>   thread().
> 
>   Worker 0 runs on cpu1
>     worker_thread()
>       process_one_work()
>         wq_worker_sleeping()
>           if (worker->flags & WORKER_NOT_RUNNING)
>             return NULL;
>           if (WARN_ON_ONCE(pool->cpu != raw_smp_processor_id()))
>             <Does not decrement nr_running>
> 
> - After this, when kworker/0:0 wakes up, this time on its
>   bounded cpu cpu0, it increments pool->nr_running again.
>   So, pool->nr_running becomes 2.

Why is it suddenly 2?  Who made it one on the account of the kworker?

Do you see this happening?  Or better, is there a (semi) reliable
repro for this issue?

Thanks.

Neeraj Upadhyay Jan. 16, 2018, 8:08 p.m. UTC | #2

On 01/16/2018 11:05 PM, Tejun Heo wrote:
> Hello, Neeraj.
>
> On Mon, Jan 15, 2018 at 02:08:12PM +0530, Neeraj Upadhyay wrote:
>> - kworker/0:0 gets chance to run on cpu1; while processing
>>    a work, it goes to sleep. However, it does not decrement
>>    pool->nr_running. This is because WORKER_REBOUND (NOT_
>>    RUNNING) flag was cleared, when worker entered worker_
> Do you mean that because REBOUND was set?

Actually, I meant REBOUND was not set. Below is the sequence

- cpu0 bounded pool is unbound.

- kworker/0:0 is woken up on cpu1.

- cpu0 pool is rebound
   REBOUND is set for kworker/0:0

- kworker/0:0 starts running on cpu1
   worker_thread()
     // It clears REBOUND and sets nr_running =1 after below call
     worker_clr_flags(worker, WORKER_PREP | WORKER_REBOUND);

- kworker/0:0 goes to sleep
   wq_worker_sleeping()
     // Below condition is not true, as all NOT_RUNNING
     // flags were cleared in worker_thread()
     if (worker->flags & WORKER_NOT_RUNNING)
     // Below is true, as worker is running on cpu1
     if (WARN_ON_ONCE(pool->cpu != raw_smp_processor_id()))
       return NULL;
     // Below is not reached and nr_running stays 1
     if (atomic_dec_and_test(&pool->nr_running) &&

- kworker/0:0 wakes up again, this time on cpu0, as worker->task
   cpus_allowed was set to cpu0, in rebind_workers.
   wq_worker_waking_up()
     if (!(worker->flags & WORKER_NOT_RUNNING)) {
         // Increments pool->nr_running to 2
         atomic_inc(&worker->pool->nr_running);

>
>>    thread().
>>
>>    Worker 0 runs on cpu1
>>      worker_thread()
>>        process_one_work()
>>          wq_worker_sleeping()
>>            if (worker->flags & WORKER_NOT_RUNNING)
>>              return NULL;
>>            if (WARN_ON_ONCE(pool->cpu != raw_smp_processor_id()))
>>              <Does not decrement nr_running>
>>
>> - After this, when kworker/0:0 wakes up, this time on its
>>    bounded cpu cpu0, it increments pool->nr_running again.
>>    So, pool->nr_running becomes 2.
> Why is it suddenly 2?  Who made it one on the account of the kworker?
As shown in above comment, it became 1 in
worker_clr_flags(worker, WORKER_PREP | WORKER_REBOUND);
>
> Do you see this happening?  Or better, is there a (semi) reliable
> repro for this issue?
Yes, this was reported in our long run testing with random hotplug.
Sorry, don't have a quick reproducer for it. Issue is reported in few
days of testing.
>
> Thanks.
>

Lai Jiangshan Jan. 18, 2018, 3:02 a.m. UTC | #3

On Wed, Jan 17, 2018 at 4:08 AM, Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
>
>
> On 01/16/2018 11:05 PM, Tejun Heo wrote:
>>
>> Hello, Neeraj.
>>
>> On Mon, Jan 15, 2018 at 02:08:12PM +0530, Neeraj Upadhyay wrote:
>>>
>>> - kworker/0:0 gets chance to run on cpu1; while processing
>>>    a work, it goes to sleep. However, it does not decrement
>>>    pool->nr_running. This is because WORKER_REBOUND (NOT_
>>>    RUNNING) flag was cleared, when worker entered worker_
>>
>> Do you mean that because REBOUND was set?
>
>
> Actually, I meant REBOUND was not set. Below is the sequence
>
> - cpu0 bounded pool is unbound.
>
> - kworker/0:0 is woken up on cpu1.
>
> - cpu0 pool is rebound
>   REBOUND is set for kworker/0:0
>

Thanks for looking into the detail of workqueue...

"REBOUND is set for kworker/0:0" means set_cpus_allowed_ptr(kworker/0:0)
already successfull returned and kworker/0:0 is already moved to cpu0.

It will not still run on cpu1 as the following steps you described.

If there is something wrong with " set_cpus_allowed_ptr()"
in this situation, could you please elaborate it.

> - kworker/0:0 starts running on cpu1
>   worker_thread()
>     // It clears REBOUND and sets nr_running =1 after below call
>     worker_clr_flags(worker, WORKER_PREP | WORKER_REBOUND);
>
> - kworker/0:0 goes to sleep
>   wq_worker_sleeping()
>     // Below condition is not true, as all NOT_RUNNING
>     // flags were cleared in worker_thread()
>     if (worker->flags & WORKER_NOT_RUNNING)
>     // Below is true, as worker is running on cpu1
>     if (WARN_ON_ONCE(pool->cpu != raw_smp_processor_id()))
>       return NULL;
>     // Below is not reached and nr_running stays 1
>     if (atomic_dec_and_test(&pool->nr_running) &&
>
> - kworker/0:0 wakes up again, this time on cpu0, as worker->task
>   cpus_allowed was set to cpu0, in rebind_workers.
>   wq_worker_waking_up()
>     if (!(worker->flags & WORKER_NOT_RUNNING)) {
>         // Increments pool->nr_running to 2
>         atomic_inc(&worker->pool->nr_running);
>
>>
>>>    thread().
>>>
>>>    Worker 0 runs on cpu1
>>>      worker_thread()
>>>        process_one_work()
>>>          wq_worker_sleeping()
>>>            if (worker->flags & WORKER_NOT_RUNNING)
>>>              return NULL;
>>>            if (WARN_ON_ONCE(pool->cpu != raw_smp_processor_id()))
>>>              <Does not decrement nr_running>
>>>
>>> - After this, when kworker/0:0 wakes up, this time on its
>>>    bounded cpu cpu0, it increments pool->nr_running again.
>>>    So, pool->nr_running becomes 2.
>>
>> Why is it suddenly 2?  Who made it one on the account of the kworker?
>
> As shown in above comment, it became 1 in
> worker_clr_flags(worker, WORKER_PREP | WORKER_REBOUND);
>>
>>
>> Do you see this happening?  Or better, is there a (semi) reliable
>> repro for this issue?
>
> Yes, this was reported in our long run testing with random hotplug.
> Sorry, don't have a quick reproducer for it. Issue is reported in few
> days of testing.
>>
>>
>> Thanks.
>>
>
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
> member of the Code Aurora Forum, hosted by The Linux Foundation
>
--
To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Neeraj Upadhyay Jan. 18, 2018, 10:07 a.m. UTC | #4

On 01/18/2018 08:32 AM, Lai Jiangshan wrote:
> On Wed, Jan 17, 2018 at 4:08 AM, Neeraj Upadhyay <neeraju@codeaurora.org> wrote:
>>
>> On 01/16/2018 11:05 PM, Tejun Heo wrote:
>>> Hello, Neeraj.
>>>
>>> On Mon, Jan 15, 2018 at 02:08:12PM +0530, Neeraj Upadhyay wrote:
>>>> - kworker/0:0 gets chance to run on cpu1; while processing
>>>>     a work, it goes to sleep. However, it does not decrement
>>>>     pool->nr_running. This is because WORKER_REBOUND (NOT_
>>>>     RUNNING) flag was cleared, when worker entered worker_
>>> Do you mean that because REBOUND was set?
>>
>> Actually, I meant REBOUND was not set. Below is the sequence
>>
>> - cpu0 bounded pool is unbound.
>>
>> - kworker/0:0 is woken up on cpu1.
>>
>> - cpu0 pool is rebound
>>    REBOUND is set for kworker/0:0
>>
> Thanks for looking into the detail of workqueue...
>
> "REBOUND is set for kworker/0:0" means set_cpus_allowed_ptr(kworker/0:0)
> already successfull returned and kworker/0:0 is already moved to cpu0.
>
> It will not still run on cpu1 as the following steps you described.
>
> If there is something wrong with " set_cpus_allowed_ptr()"
> in this situation, could you please elaborate it.

Thanks Lai, I missed that; will debug from that perspective.
>
>> - kworker/0:0 starts running on cpu1
>>    worker_thread()
>>      // It clears REBOUND and sets nr_running =1 after below call
>>      worker_clr_flags(worker, WORKER_PREP | WORKER_REBOUND);
>>
>> - kworker/0:0 goes to sleep
>>    wq_worker_sleeping()
>>      // Below condition is not true, as all NOT_RUNNING
>>      // flags were cleared in worker_thread()
>>      if (worker->flags & WORKER_NOT_RUNNING)
>>      // Below is true, as worker is running on cpu1
>>      if (WARN_ON_ONCE(pool->cpu != raw_smp_processor_id()))
>>        return NULL;
>>      // Below is not reached and nr_running stays 1
>>      if (atomic_dec_and_test(&pool->nr_running) &&
>>
>> - kworker/0:0 wakes up again, this time on cpu0, as worker->task
>>    cpus_allowed was set to cpu0, in rebind_workers.
>>    wq_worker_waking_up()
>>      if (!(worker->flags & WORKER_NOT_RUNNING)) {
>>          // Increments pool->nr_running to 2
>>          atomic_inc(&worker->pool->nr_running);
>>
>>>>     thread().
>>>>
>>>>     Worker 0 runs on cpu1
>>>>       worker_thread()
>>>>         process_one_work()
>>>>           wq_worker_sleeping()
>>>>             if (worker->flags & WORKER_NOT_RUNNING)
>>>>               return NULL;
>>>>             if (WARN_ON_ONCE(pool->cpu != raw_smp_processor_id()))
>>>>               <Does not decrement nr_running>
>>>>
>>>> - After this, when kworker/0:0 wakes up, this time on its
>>>>     bounded cpu cpu0, it increments pool->nr_running again.
>>>>     So, pool->nr_running becomes 2.
>>> Why is it suddenly 2?  Who made it one on the account of the kworker?
>> As shown in above comment, it became 1 in
>> worker_clr_flags(worker, WORKER_PREP | WORKER_REBOUND);
>>>
>>> Do you see this happening?  Or better, is there a (semi) reliable
>>> repro for this issue?
>> Yes, this was reported in our long run testing with random hotplug.
>> Sorry, don't have a quick reproducer for it. Issue is reported in few
>> days of testing.
>>>
>>> Thanks.
>>>
>> --
>> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
>> member of the Code Aurora Forum, hosted by The Linux Foundation
>>

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 43d18cb..71c0023 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2218,6 +2218,17 @@  static int worker_thread(void *__worker)
 	if (unlikely(!may_start_working(pool)) && manage_workers(worker))
 		goto recheck;
 
+	/* handle the case where, while a bounded pool is unbound,
+	 * its worker is woken up on a target CPU, which is different
+	 * from pool->cpu, but pool is rebound before this worker gets
+	 * chance to run on the target CPU.
+	 */
+	if (WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
+		raw_smp_processor_id() != pool->cpu)) {
+		wake_up_worker(pool);
+		goto sleep;
+	}
+
 	/*
 	 * ->scheduled list can only be filled while a worker is
 	 * preparing to process a work or actually processing it.

workqueue: Handle race between wake up and rebind

Commit Message

Comments

Patch