
[1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()

Message ID 20180821120317.4115-2-johannes@sipsolutions.net (mailing list archive)
State Not Applicable
Delegated to: Kalle Valo
Series workqueue lockdep limitations/bugs

Commit Message

Johannes Berg Aug. 21, 2018, 12:03 p.m. UTC
From: Johannes Berg <johannes.berg@intel.com>

In cancel_work_sync(), we can only have one of two cases, even
with an ordered workqueue:
 * the work isn't running, just cancelled before it started
 * the work is running, but then nothing else can be on the
   workqueue before it

Thus, we need to skip the lockdep workqueue dependency handling,
otherwise we get false positive reports from lockdep saying that
we have a potential deadlock when the workqueue also has other
work items with locking, e.g.

  work1_function() { mutex_lock(&mutex); ... }
  work2_function() { /* nothing */ }

  other_function() {
    queue_work(ordered_wq, &work1);
    queue_work(ordered_wq, &work2);
    mutex_lock(&mutex);
    cancel_work_sync(&work2);
  }

As described above, this isn't a problem, but lockdep will
currently flag it as if cancel_work_sync() was flush_work(),
which *is* a problem.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 kernel/workqueue.c | 37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

Comments

Tejun Heo Aug. 21, 2018, 4:08 p.m. UTC | #1
On Tue, Aug 21, 2018 at 02:03:16PM +0200, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
> 
> In cancel_work_sync(), we can only have one of two cases, even
> with an ordered workqueue:
>  * the work isn't running, just cancelled before it started
>  * the work is running, but then nothing else can be on the
>    workqueue before it
> 
> Thus, we need to skip the lockdep workqueue dependency handling,
> otherwise we get false positive reports from lockdep saying that
> we have a potential deadlock when the workqueue also has other
> work items with locking, e.g.
> 
>   work1_function() { mutex_lock(&mutex); ... }
>   work2_function() { /* nothing */ }
> 
>   other_function() {
>     queue_work(ordered_wq, &work1);
>     queue_work(ordered_wq, &work2);
>     mutex_lock(&mutex);
>     cancel_work_sync(&work2);
>   }
> 
> As described above, this isn't a problem, but lockdep will
> currently flag it as if cancel_work_sync() was flush_work(),
> which *is* a problem.
> 
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> ---
>  kernel/workqueue.c | 37 ++++++++++++++++++++++---------------
>  1 file changed, 22 insertions(+), 15 deletions(-)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 78b192071ef7..a6c2b823f348 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2843,7 +2843,8 @@ void drain_workqueue(struct workqueue_struct *wq)
>  }
>  EXPORT_SYMBOL_GPL(drain_workqueue);
>  
> -static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
> +static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
> +			     bool from_cancel)
>  {
>  	struct worker *worker = NULL;
>  	struct worker_pool *pool;
> @@ -2885,7 +2886,8 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
>  	 * workqueues the deadlock happens when the rescuer stalls, blocking
>  	 * forward progress.
>  	 */
> -	if (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer) {
> +	if (!from_cancel &&
> +	    (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)) {
>  		lock_map_acquire(&pwq->wq->lockdep_map);
>  		lock_map_release(&pwq->wq->lockdep_map);
>  	}

But this can lead to a deadlock.  I'd much rather err on the side of
discouraging complex lock dancing around ordered workqueues, no?

Thanks.
Johannes Berg Aug. 21, 2018, 5:18 p.m. UTC | #2
On Tue, 2018-08-21 at 09:08 -0700, Tejun Heo wrote:
> 
> > -static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
> > +static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
> > +			     bool from_cancel)
> >  {
> >  	struct worker *worker = NULL;
> >  	struct worker_pool *pool;
> > @@ -2885,7 +2886,8 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
> >  	 * workqueues the deadlock happens when the rescuer stalls, blocking
> >  	 * forward progress.
> >  	 */
> > -	if (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer) {
> > +	if (!from_cancel &&
> > +	    (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)) {
> >  		lock_map_acquire(&pwq->wq->lockdep_map);
> >  		lock_map_release(&pwq->wq->lockdep_map);
> >  	}
> 
> But this can lead to a deadlock.  I'd much rather err on the side of
> discouraging complex lock dancing around ordered workqueues, no?

What can lead to a deadlock?

Writing out the example again, with the unlock now:

   work1_function() { mutex_lock(&mutex); mutex_unlock(&mutex); }
   work2_function() { /* nothing */ }
 
   other_function() {
     queue_work(ordered_wq, &work1);
     queue_work(ordered_wq, &work2);
     mutex_lock(&mutex);
     cancel_work_sync(&work2);
     mutex_unlock(&mutex);
   }

This shouldn't be able to lead to a deadlock like I had explained:

> In cancel_work_sync(), we can only have one of two cases, even
> with an ordered workqueue:
>  * the work isn't running, just cancelled before it started
>  * the work is running, but then nothing else can be on the
>    workqueue before it
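
To spell out the ordered-wq timeline for the cancel case (just
restating those two cases, nothing new):

   queue_work(ordered_wq, &work1);   /* queued first */
   queue_work(ordered_wq, &work2);   /* queued behind work1 */
   ...
   mutex_lock(&mutex);
   cancel_work_sync(&work2);
   /*
    * Either work2 hasn't started yet - then it's simply removed from
    * the queue and we never wait at all - or work2 is already running,
    * which on an ordered workqueue means work1 has already finished,
    * so the only thing we wait for is work2, and work2 never takes
    * &mutex.
    */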

johannes
Tejun Heo Aug. 21, 2018, 5:27 p.m. UTC | #3
Hello, Johannes.

On Tue, Aug 21, 2018 at 07:18:14PM +0200, Johannes Berg wrote:
> > But this can lead to a deadlock.  I'd much rather err on the side of
> > discouraging complex lock dancing around ordered workqueues, no?
> 
> What can lead to a deadlock?

Oh not this particular case, but I was wondering whether we'd be
missing legitimate possible deadlock cases by skipping lockdep for all
cancel_work_sync()'s as they can actually flush.

Thanks.
Johannes Berg Aug. 21, 2018, 5:30 p.m. UTC | #4
Hi,

> On Tue, Aug 21, 2018 at 07:18:14PM +0200, Johannes Berg wrote:
> > > But this can lead to a deadlock.  I'd much rather err on the side of
> > > discouraging complex lock dancing around ordered workqueues, no?
> > 
> > What can lead to a deadlock?
> 
> Oh not this particular case, but I was wondering whether we'd be
> missing legitimate possible deadlock cases by skipping lockdep for all
> cancel_work_sync()'s as they can actually flush.

I don't see how? This is only relevant in ordered/single-threaded WQs,
but even there it doesn't matter, as explained?

I'm actually seeing a false positive report from lockdep, because it
*is* flushing, i.e. I'm running into the case of the work actually
running, i.e. the "_sync" part of "cancel_work_sync()" is kicking in,
but in that case a single-threaded WQ can't have anything executing
*before* it, so we don't need to generate a lockdep dependency - and in
fact don't *want* to create one to avoid the false positive.
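
Just to be explicit about what lockdep records here, as far as I
understand the annotations (the wq lockdep_map is acquired around each
work item's execution, and start_flush_work() acquires it again for
ordered workqueues):

   /* while work1 runs on the ordered wq: */
   lock_map_acquire(&wq->lockdep_map);
   mutex_lock(&mutex);                     /* dependency: wq -> mutex */
   ...

   /* in other_function(): */
   mutex_lock(&mutex);
   cancel_work_sync(&work2);
   -> lock_map_acquire(&wq->lockdep_map);  /* dependency: mutex -> wq */

Together those two look like a deadlock cycle to lockdep, even though
with cancel it can't actually happen.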

I'm not really sure what you think we might be missing? Am I missing
some case where cancel_work_sync() can possibly deadlock? Apart from the
issue I addressed in the second patch, obviously.

johannes
Tejun Heo Aug. 21, 2018, 5:55 p.m. UTC | #5
Hello,

On Tue, Aug 21, 2018 at 07:30:21PM +0200, Johannes Berg wrote:
> I don't see how? This is only relevant in ordered/single-threaded WQs,
> but even there it doesn't matter, as explained?
> 
> I'm actually seeing a false positive report from lockdep, because it
> *is* flushing, i.e. I'm running into the case of the work actually
> running, i.e. the "_sync" part of "cancel_work_sync()" is kicking in,
> but in that case a single-threaded WQ can't have anything executing
> *before* it, so we don't need to generate a lockdep dependency - and in
> fact don't *want* to create one to avoid the false positive.
> 
> I'm not really sure what you think we might be missing? Am I missing
> some case where cancel_work_sync() can possibly deadlock? Apart from the
> issue I addressed in the second patch, obviously.

Ah, that was me being slow.  I thought you were skipping the work's
lockdep_map.  I can almost swear we had that before (the part you're
adding on the second patch).  Right, fd1a5b04dfb8 ("workqueue: Remove
now redundant lock acquisitions wrt. workqueue flushes") removed it
because it gets propagated through wait_for_completion().  Did we miss
some cases with that change?

Thanks.
Johannes Berg Aug. 21, 2018, 7:20 p.m. UTC | #6
On Tue, 2018-08-21 at 10:55 -0700, Tejun Heo wrote:

> > I'm not really sure what you think we might be missing? Am I missing
> > some case where cancel_work_sync() can possibly deadlock? Apart from the
> > issue I addressed in the second patch, obviously.
> 
> Ah, that was me being slow.  I thought you were skipping the work's
> lockdep_map.  I can almost swear we had that before (the part you're
> adding on the second patch).  Right, fd1a5b04dfb8 ("workqueue: Remove
> now redundant lock acquisitions wrt. workqueue flushes") removed it
> because it gets propagated through wait_for_completion().  Did we miss
> some cases with that change?

Hmm.

It doesn't seem to be working.

No, ok, actually it probably *does*, but the point is similar to my
issue # 3 before - we don't do any of this unless the work is actually
running, but we really want the lockdep annotation *regardless* of that,
so that we catch the error unconditionally.

So perhaps that commit just needs to be reverted entirely - I'd only
looked at a small subset of it, but the flush_workqueue() case has the
same problem - we only get to the completion when there's something to
flush, not when the workqueue happens to actually be empty. But again,
for lockdep we want to catch *potential* problems, not only *actual*
ones.
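
For the flush_workqueue() side I'm thinking of something like this
(untested sketch from memory, basically re-adding what that commit
removed there, if I'm reading it right):

void flush_workqueue(struct workqueue_struct *wq)
{
	...
	/*
	 * Record the wq dependency unconditionally, so lockdep sees
	 * the potential deadlock even if the workqueue happens to be
	 * empty right now and we never reach wait_for_completion().
	 */
	lock_map_acquire(&wq->lockdep_map);
	lock_map_release(&wq->lockdep_map);
	...
}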

The remaining part of the patch I'm not sure I fully understand (removal
of lockdep_init_map_crosslock()), but I suppose if we revert the other
bits we need to revert this as well.

So please drop this patch, but revert Byungchul Park's commit
fd1a5b04dfb8 again, I don't think the lockdep annotations there are
really redundant as I just explained.

johannes
Byungchul Park Aug. 22, 2018, 2:45 a.m. UTC | #7
On Tue, Aug 21, 2018 at 09:20:41PM +0200, Johannes Berg wrote:
> On Tue, 2018-08-21 at 10:55 -0700, Tejun Heo wrote:
> 
> > > I'm not really sure what you think we might be missing? Am I missing
> > > some case where cancel_work_sync() can possibly deadlock? Apart from the
> > > issue I addressed in the second patch, obviously.
> > 
> > Ah, that was me being slow.  I thought you were skipping the work's
> > lockdep_map.  I can almost swear we had that before (the part you're
> > adding on the second patch).  Right, fd1a5b04dfb8 ("workqueue: Remove
> > now redundant lock acquisitions wrt. workqueue flushes") removed it
> > because it gets propagated through wait_for_completion().  Did we miss
> > some cases with that change?
> 
> Hmm.
> 
> It doesn't seem to be working.
> 
> No, ok, actually it probably *does*, but the point is similar to my
> issue # 3 before - we don't do any of this unless the work is actually
> running, but we really want the lockdep annotation *regardless* of that,
> so that we catch the error unconditionally.
> 
> So perhaps that commit just needs to be reverted entirely - I'd only
> looked at a small subset of it, but the flush_workqueue() case has the
> same problem - we only get to the completion when there's something to
> flush, not when the workqueue happens to actually be empty. But again,
> for lockdep we want to catch *potential* problems, not only *actual*
> ones.
> 
> The remaining part of the patch I'm not sure I fully understand (removal
> of lockdep_init_map_crosslock()), but I suppose if we revert the other
> bits we need to revert this as well.
> 
> So please drop this patch, but revert Byungchul Park's commit
> fd1a5b04dfb8 again, I don't think the lockdep annotations there are
> really redundant as I just explained.

That should've been adjusted as well when Ingo reverted Cross-release.
It would be much easier to add each pair, acquire/release, before
wait_for_completion() in both flush_workqueue() and flush_work() than
reverting the whole commit.

What's lacking is only lockdep annotations for wait_for_completion().

Byungchul

> 
> johannes
Johannes Berg Aug. 22, 2018, 4:02 a.m. UTC | #8
On Wed, 2018-08-22 at 11:45 +0900, Byungchul Park wrote:

> That should've been adjusted as well when Ingo reverted Cross-release.

I can't really say.

> It would be much easier to add each pair, acquire/release, before
> wait_for_completion() in both flush_workqueue() and flush_work() than
> reverting the whole commit.

The commit doesn't do much more than this though.

> What's lacking is only lockdep annotations for wait_for_completion().

No, I disagree. Like I said before, we need the lockdep annotations on
the WQ even when we don't actually create/use the completion, so we can
catch issues even when they don't actually happen.

johannes
Byungchul Park Aug. 22, 2018, 5:47 a.m. UTC | #9
On Wed, Aug 22, 2018 at 06:02:23AM +0200, Johannes Berg wrote:
> On Wed, 2018-08-22 at 11:45 +0900, Byungchul Park wrote:
> 
> > That should've been adjusted as well when Ingo reverted Cross-release.
> 
> I can't really say.

What do you mean?

> > It would be much easier to add each pair, acquire/release, before
> > wait_for_completion() in both flush_workqueue() and flush_work() than
> > reverting the whole commit.
> 
> The commit doesn't do much more than this though.

That also named the lockdep_map for wq/work in a better way.

> > What's lacking is only lockdep annotations for wait_for_completion().
> 
> No, I disagree. Like I said before, we need the lockdep annotations on

You seem to be confused. I was talking about wait_for_completion() in
both flush_workqueue() and flush_work(). Without
the wait_for_completion()s, nothing you are concerned about matters.

> the WQ even when we don't actually create/use the completion, so we can
> catch issues even when they don't actually happen.

This is obviously true.

Byungchul

> 
> johannes
Johannes Berg Aug. 22, 2018, 7:07 a.m. UTC | #10
On Wed, 2018-08-22 at 14:47 +0900, Byungchul Park wrote:
> On Wed, Aug 22, 2018 at 06:02:23AM +0200, Johannes Berg wrote:
> > On Wed, 2018-08-22 at 11:45 +0900, Byungchul Park wrote:
> > 
> > > That should've been adjusted as well when Ingo reverted Cross-release.
> > 
> > I can't really say.
> 
> What do you mean?

I haven't followed any of this, so I just don't know.

> > > It would be much easier to add each pair, acquire/release, before
> > > wait_for_completion() in both flush_workqueue() and flush_work() than
> > > reverting the whole commit.
> > 
> > The commit doesn't do much more than this though.
> 
> That also named the lockdep_map for wq/work in a better way.

What do you mean?

> > > What's lacking is only lockdep annotations for wait_for_completion().
> > 
> > No, I disagree. Like I said before, we need the lockdep annotations on
> 
> You seem to be confused. I was talking about wait_for_completion() in
> both flush_workqueue() and flush_work(). Without
> the wait_for_completion()s, nothing you are concerned about matters.

Yes and no.

You're basically saying if we don't get to do a wait_for_completion(),
then we don't need any lockdep annotation. I'm saying this isn't true.

Consider the following case:

work_function()
{
	mutex_lock(&mutex);
	mutex_unlock(&mutex);
}

other_function()
{
	queue_work(&my_wq, &work);

	if (common_case) {
		schedule_and_wait_for_something_that_takes_a_long_time()
	}

	mutex_lock(&mutex);
	flush_workqueue(&my_wq);
	mutex_unlock(&mutex);
}


Clearly this code is broken, right?

However, you'll almost never get lockdep to indicate that, because of
the "if (common_case)".

My argument basically is that the lockdep annotations in the workqueue
code should be entirely independent of the actual need to call
wait_for_completion().

Therefore, the commit should be reverted regardless of any cross-release 
work (that I neither know and thus don't understand right now), since it
makes workqueue code rely on lockdep for the completion, whereas we
really want to have annotations here even when we didn't actually need
to wait_for_completion().

johannes
Byungchul Park Aug. 22, 2018, 7:50 a.m. UTC | #11
On Wed, Aug 22, 2018 at 09:07:23AM +0200, Johannes Berg wrote:
> On Wed, 2018-08-22 at 14:47 +0900, Byungchul Park wrote:
> > On Wed, Aug 22, 2018 at 06:02:23AM +0200, Johannes Berg wrote:
> > > On Wed, 2018-08-22 at 11:45 +0900, Byungchul Park wrote:
> > > 
> > > > That should've been adjusted as well when Ingo reverted Cross-release.
> > > 
> > > I can't really say.
> > 
> > What do you mean?
> 
> I haven't followed any of this, so I just don't know.
> 
> > > > It would be much easier to add each pair, acquire/release, before
> > > > wait_for_completion() in both flush_workqueue() and flush_work() than
> > > > reverting the whole commit.
> > > 
> > > The commit doesn't do much more than this though.
> > 
> > That also named the lockdep_map for wq/work in a better way.
> 
> What do you mean?

Ah.. nothing important. I just mentioned that I changed the lock names
a bit when initializing the lockdep_map instances, which was suggested
by Ingo. But there's no problem even if you revert the whole thing. I
just wanted to let you know. ;)

> > > > What's lacking is only lockdep annotations for wait_for_completion().
> > > 
> > > No, I disagree. Like I said before, we need the lockdep annotations on
> > 
> > You seem to be confused. I was talking about wait_for_completion() in
> > both flush_workqueue() and flush_work(). Without
> > the wait_for_completion()s, nothing you are concerned about matters.
> 
> Yes and no.
> 
> You're basically saying if we don't get to do a wait_for_completion(),
> then we don't need any lockdep annotation. I'm saying this isn't true.

Strictly no. But I'm just talking about the case in wq flush code.

> Consider the following case:
> 
> work_function()
> {
> 	mutex_lock(&mutex);
> 	mutex_unlock(&mutex);
> }
> 
> other_function()
> {
> 	queue_work(&my_wq, &work);
> 
> 	if (common_case) {
> 		schedule_and_wait_for_something_that_takes_a_long_time()
> 	}
> 
> 	mutex_lock(&mutex);
> 	flush_workqueue(&my_wq);
> 	mutex_unlock(&mutex);
> }
> 
> 
> Clearly this code is broken, right?
> 
> However, you'll almost never get lockdep to indicate that, because of
> the "if (common_case)".

Sorry, I don't follow you. Why is the example a problem? Please give
a deadlock example.

> My argument basically is that the lockdep annotations in the workqueue
> code should be entirely independent of the actual need to call
> wait_for_completion().

No. Lockdep annotations always go with either a wait_for_something or a
self event loop within a single context, e.g. fs -> memory reclaim -> fs -> ..

> Therefore, the commit should be reverted regardless of any cross-release

No. That is necessary only when the wait_for_completion() cannot be
tracked automatically by cross-release when checking dependencies.

It might be the key to understanding you: could you explain a bit more
why you think lockdep annotations are independent of the actual need to
call wait_for_completion() (or wait_for_something_else), hopefully with a
deadlock example?

> work (that I neither know and thus don't understand right now), since it
> makes workqueue code rely on lockdep for the completion, whereas we

Using wait_for_completion(), right?

> really want to have annotations here even when we didn't actually need
> to wait_for_completion().

Please give an example of a deadlock even w/o wait_for_completion().

> 
> johannes

Byungchul
Johannes Berg Aug. 22, 2018, 8:02 a.m. UTC | #12
On Wed, 2018-08-22 at 16:50 +0900, Byungchul Park wrote:

> > You're basically saying if we don't get to do a wait_for_completion(),
> > then we don't need any lockdep annotation. I'm saying this isn't true.
> 
> Strictly no. But I'm just talking about the case in wq flush code.

Sure, I meant it's not true in the wq flush code, not talking about
anything else.

> > Consider the following case:
> > 
> > work_function()
> > {
> > 	mutex_lock(&mutex);
> > 	mutex_unlock(&mutex);
> > }
> > 
> > other_function()
> > {
> > 	queue_work(&my_wq, &work);
> > 
> > 	if (common_case) {
> > 		schedule_and_wait_for_something_that_takes_a_long_time()
> > 	}
> > 
> > 	mutex_lock(&mutex);
> > 	flush_workqueue(&my_wq);
> > 	mutex_unlock(&mutex);
> > }
> > 
> > 
> > Clearly this code is broken, right?
> > 
> > However, you'll almost never get lockdep to indicate that, because of
> > the "if (common_case)".
> 
> Sorry, I don't follow you. Why is the example a problem? Please give
> a deadlock example.

sure, I thought that was easy:

thread 1			thread 2 (wq thread)

common_case = false;
queue_work(&my_wq, &work);
mutex_lock(&mutex);

flush_workqueue(&my_wq);
				work_function()
				-> mutex_lock(&mutex);
				-> schedule(), wait for mutex
wait_for_completion()

-> deadlock - we can't make any forward progress here.


> > My argument basically is that the lockdep annotations in the workqueue
> > code should be entirely independent of the actual need to call
> > wait_for_completion().
> 
> No. Lockdep annotations always go with either a wait_for_something or a
> self event loop within a single context, e.g. fs -> memory reclaim -> fs -> ..

Yes, but I'm saying that we should catch *potential* deadlocks *before*
they happen.

See the example above. Clearly, you're actually deadlocking, and
obviously (assuming all the wait_for_completion() things work right)
lockdep will show why we just deadlocked.

BUT.

This is useless. We want/need lockdep to show *potential* deadlocks,
even when they didn't happen. Consider the other case in the above
scenario:

thread 1			thread 2 (wq thread)

common_case = true;
queue_work(&my_wq, &work);

schedule_and_wait_...();	work_function()
				-> mutex_lock(&mutex);
				-> mutex_unlock()
				done


mutex_lock(&mutex);
flush_workqueue(&my_wq);
-> nothing to do, will NOT
   call wait_for_completion();

-> no deadlock

Here we don't have a deadlock, but without the revert we will also not
get a lockdep report. We should though, because we're doing something
that's quite clearly dangerous - we simply don't know if the work
function will complete before we get to flush_workqueue(). Maybe the
work function has an uncommon case itself that takes forever, etc.

> > Therefore, the commit should be reverted regardless of any cross-release
> 
> No. That is necessary only when the wait_for_completion() cannot be
> tracked automatically by cross-release when checking dependencies.

No. See above. We want the annotation regardless of invoking
wait_for_completion().

> It might be the key to understanding you: could you explain a bit more
> why you think lockdep annotations are independent of the actual need to
> call wait_for_completion() (or wait_for_something_else), hopefully with a
> deadlock example?

See above.

You're getting too hung up about a deadlock example. We don't want to
have lockdep only catch *actual* deadlocks. The code I wrote clearly has
a potential deadlock (sequence given above), but in most cases the code
above will *not* deadlock. This is the interesting part we want lockdep
to catch.

> > work (that I neither know and thus don't understand right now), since it
> > makes workqueue code rely on lockdep for the completion, whereas we
> 
> Using wait_for_completion(), right?

Yes.

> > really want to have annotations here even when we didn't actually need
> > to wait_for_completion().
> 
> Please give an example of a deadlock even w/o wait_for_completion().

No, here's where you get confused. Clearly, there is no lockdep if we
don't do wait_for_completion(). But if you have the code above, lockdep
should still warn about the potential deadlock that happens when you
*do* get to wait_for_completion(). Lockdep shouldn't be restricted to
warning about a deadlock that *actually* happened.

johannes
Byungchul Park Aug. 22, 2018, 9:15 a.m. UTC | #13
On Wed, Aug 22, 2018 at 10:02:20AM +0200, Johannes Berg wrote:
> > Sorry, I don't follow you. Why is the example a problem? Please give
> > a deadlock example.
> 
> sure, I thought that was easy:
> 
> thread 1			thread 2 (wq thread)
> 
> common_case = false;
> queue_work(&my_wq, &work);
> mutex_lock(&mutex);
> 
> flush_workqueue(&my_wq);
> 				work_function()
> 				-> mutex_lock(&mutex);
> 				-> schedule(), wait for mutex
> wait_for_completion()
> 
> -> deadlock - we can't make any forward progress here.

I see. Now, I understand what you are talking about.

   if (common_case)
	schedule_and_wait_for_something_that_takes_a_long_time()

should be:

   if (common_case)
	schedule_and_wait_long_time_so_that_the_work_to_finish()

> > > My argument basically is that the lockdep annotations in the workqueue
> > > code should be entirely independent of the actual need to call
> > > wait_for_completion().
> > 
> > No. Lockdep annotations always go with either a wait_for_something or a
> > self event loop within a single context, e.g. fs -> memory reclaim -> fs -> ..
> 
> Yes, but I'm saying that we should catch *potential* deadlocks *before*
> they happen.

Absolutely true about what lockdep does.

> See the example above. Clearly, you're actually deadlocking, and
> obviously (assuming all the wait_for_completion() things work right)
> lockdep will show why we just deadlocked.
> 
> BUT.
> 
> This is useless. We want/need lockdep to show *potential* deadlocks,
> even when they didn't happen. Consider the other case in the above
> scenario:
> 
> thread 1			thread 2 (wq thread)
> 
> common_case = true;
> queue_work(&my_wq, &work);
> 
> schedule_and_wait_...();	work_function()
> 				-> mutex_lock(&mutex);
> 				-> mutex_unlock()
> 				done
> 
> 
> mutex_lock(&mutex);
> flush_workqueue(&my_wq);
> -> nothing to do, will NOT
>    call wait_for_completion();
> 
> -> no deadlock

Sure. It's not a deadlock, because wait_for_completion() is not involved
in this path - and wait_for_completion() is necessary for the deadlock.

Ok. I didn't know what you are talking about but now I got it.

You are talking about who knows whether common_case is true or not,
so we must aggressively consider the case where the wait_for_completion()
so far hasn't been called but may be called in the future.

I think it's a problem of how aggressively we need to check dependencies.
If we choose the aggressive option, then we could get reported
aggressively but could not avoid aggressive false positives either.
I don't want to strongly argue that because it's a problem of policy.

Just, I would consider only waits that actually happened anyway. Of
course, we've got to consider the waits even if no actual
deadlock happens, since detecting potential problems is more
important than only catching actual ones, as you said.

So no big objection to check dependencies aggressively.

> Here we don't have a deadlock, but without the revert we will also not

You misunderstand me. The commit should be reverted or acquire/release
pairs should be added in both flush functions.

> get a lockdep report. We should though, because we're doing something
> that's quite clearly dangerous - we simply don't know if the work
> function will complete before we get to flush_workqueue(). Maybe the
> work function has an uncommon case itself that takes forever, etc.
> 
> > > Therefore, the commit should be reverted regardless of any cross-release
> > 
> > No. That is necessary only when the wait_for_completion() cannot be
> > tracked automatically by cross-release when checking dependencies.
> 
> No. See above. We want the annotation regardless of invoking
> wait_for_completion().

Anyway, the annotation should be placed in the path where
wait_for_completion() might be called. But whether to do it
regardless of invoking wait_for_completion() or not is a problem of
policy, so I don't care. No big objection whichever way you do it.

> > It might be the key to understanding you: could you explain a bit more
> > why you think lockdep annotations are independent of the actual need to
> > call wait_for_completion() (or wait_for_something_else), hopefully with a
> > deadlock example?
> 
> See above.
> 
> You're getting too hung up about a deadlock example. We don't want to

No no. Your example was helpful to understand what you are talking about.
I asked you for a potential deadlock example.

> have lockdep only catch *actual* deadlocks. The code I wrote clearly has
> a potential deadlock (sequence given above), but in most cases the code
> above will *not* deadlock. This is the interesting part we want lockdep
> to catch.

Absolutely true. You can find my opinion about that in
Documentation/locking/crossrelease.txt which has been removed because
crossrelease is strong at detecting *potential* deadlock problems.

> > > work (that I neither know and thus don't understand right now), since it
> > > makes workqueue code rely on lockdep for the completion, whereas we
> > 
> > Using wait_for_completion(), right?
> 
> Yes.
> 
> > > really want to have annotations here even when we didn't actually need
> > > to wait_for_completion().
> > 
> > Please give an example of a deadlock even w/o wait_for_completion().
> 
> No, here's where you get confused. Clearly, there is no lockdep if we
> don't do wait_for_completion(). But if you have the code above, lockdep
> should still warn about the potential deadlock that happens when you
> *do* get to wait_for_completion(). Lockdep shouldn't be restricted to
> warning about a deadlock that *actually* happened.

I didn't know if you were talking about this in the previous talk.

Byungchul
Johannes Berg Aug. 22, 2018, 9:42 a.m. UTC | #14
On Wed, 2018-08-22 at 18:15 +0900, Byungchul Park wrote:
> 
> > thread 1			thread 2 (wq thread)
> > 
> > common_case = false;
> > queue_work(&my_wq, &work);
> > mutex_lock(&mutex);
> > 
> > flush_workqueue(&my_wq);
> > 				work_function()
> > 				-> mutex_lock(&mutex);
> > 				-> schedule(), wait for mutex
> > wait_for_completion()
> > 
> > -> deadlock - we can't make any forward progress here.
> 
> I see. Now, I understand what you are talking about.
> 
>    if (common_case)
> 	schedule_and_wait_for_something_that_takes_a_long_time()
> 
> should be:
> 
>    if (common_case)
> 	schedule_and_wait_long_time_so_that_the_work_to_finish()

Fair point.

> Ok. I didn't know what you are talking about but now I got it.

great.

> You are talking about who knows whether common_case is true or not,
> so we must aggressively consider the case where the wait_for_completion()
> so far hasn't been called but may be called in the future.

Yes.

> I think it's a problem of how aggressively we need to check dependencies.
> If we choose the aggressive option, then we could get reported
> aggressively but could not avoid aggressive false positives either.
> I don't want to strongly argue that because it's a problem of policy.

I don't think you could consider a report from "aggressive reporting" to
 be a false positive. It's clearly possible that this happens, and once
you go to a workqueue you basically don't really know your sequencing
and timing any more.

> Just, I would consider only waits that actually happened anyway. Of
> course, we've got to consider the waits even if no actual
> deadlock happens, since detecting potential problems is more
> important than only catching actual ones, as you said.
> 
> So no big objection to check dependencies aggressively.
> 
> > Here we don't have a deadlock, but without the revert we will also not
> 
> You misunderstand me. The commit should be reverted or acquire/release
> pairs should be added in both flush functions.

Ok, I thought you were arguing we shouldn't revert it :)

I don't know whether to revert or just add the pairs in the flush
functions, because I can't say I understand what the third part of the
patch does.

> Anyway the annotation should be placed in the path where
> wait_for_completion() might be called.

Yes, it should be called regardless of whether we actually wait or not,
i.e. before most of the checking in the functions.

My issue #3 that I outlined is related - we'd like to have lockdep
understand that if this work was on the WQ it might deadlock, but we
don't have a way to get the WQ unless the work is scheduled - and in the
case that it is scheduled we might already be hitting the deadlock
(depending on the order of execution though I guess).

> Absolutely true. You can find my opinion about that in
> Documentation/locking/crossrelease.txt which has been removed because
> crossrelease is strong at detecting *potential* deadlock problems.

Ok, you know what, I'm going to read this now ... hang on........

I see. You were trying to solve the problem of completions/context
transfers in lockdep.

Given the revert of crossrelease though, we can't actually revert your
patch anyway, and looking at it again I see what you mean by the "name"
now ...

So yeah, we should only re-add the two acquire/release pairs. I guess
I'll make a patch for that, replacing my patch 2.
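
Roughly like this, I think - a completely untested sketch on top of
patch 1 here, and I still need to think about whether the
work->lockdep_map pair should also be skipped in the cancel case:

static bool __flush_work(struct work_struct *work, bool from_cancel)
{
	struct wq_barrier barr;

	if (WARN_ON(!wq_online))
		return false;

	/*
	 * Record the dependency on the work itself before checking
	 * whether it's actually queued or running, so we get a report
	 * even when there happens to be nothing to wait for.
	 */
	lock_map_acquire(&work->lockdep_map);
	lock_map_release(&work->lockdep_map);

	if (start_flush_work(work, &barr, from_cancel)) {
		wait_for_completion(&barr.done);
		destroy_work_on_stack(&barr.work);
		return true;
	} else {
		return false;
	}
}

plus the corresponding wq->lockdep_map acquire/release pair near the
top of flush_workqueue(), like I sketched earlier in the thread.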

I'm intrigued by the crossrelease - but that's a separate topic.

johannes
Byungchul Park Aug. 22, 2018, 12:47 p.m. UTC | #15
On 08/22/2018 06:42 PM, Johannes Berg wrote:
> On Wed, 2018-08-22 at 18:15 +0900, Byungchul Park wrote:
>>> thread 1			thread 2 (wq thread)
>>>
>>> common_case = false;
>>> queue_work(&my_wq, &work);
>>> mutex_lock(&mutex);
>>>
>>> flush_workqueue(&my_wq);
>>> 				work_function()
>>> 				-> mutex_lock(&mutex);
>>> 				-> schedule(), wait for mutex
>>> wait_for_completion()
>>>
>>> -> deadlock - we can't make any forward progress here.
>> I see. Now, I understand what you are talking about.
>>
>>     if (common_case)
>> 	schedule_and_wait_for_something_that_takes_a_long_time()
>>
>> should be:
>>
>>     if (common_case)
>> 	schedule_and_wait_long_time_so_that_the_work_to_finish()
> Fair point.
>
>> Ok. I didn't know what you are talking about but now I got it.
> great.
>
>> You are talking about who knows whether common_case is true or not,
>> so we must aggressively consider the case where the wait_for_completion()
>> so far hasn't been called but may be called in the future.
> Yes.
>
>> I think it's a problem of how aggressively we need to check dependencies.
>> If we choose the aggressive option, then we could get reported
>> aggressively but could not avoid aggressive false positives either.
>> I don't want to strongly argue that because it's a problem of policy.
> I don't think you could consider a report from "aggressive reporting" to
>   be a false positive. It's clearly possible that this happens, and once
> you go to a workqueue you basically don't really know your sequencing
> and timing any more.
>
>> Just, I would consider only waits that actually happened anyway. Of
>> course, we've got to consider the waits even if no actual
>> deadlock happens, since detecting potential problems is more
>> important than only catching actual ones, as you said.
>>
>> So no big objection to check dependencies aggressively.
>>
>>> Here we don't have a deadlock, but without the revert we will also not
>> You misunderstand me. The commit should be reverted or acquire/release
>> pairs should be added in both flush functions.
> Ok, I thought you were arguing we shouldn't revert it :)
>
> I don't know whether to revert or just add the pairs in the flush
> functions, because I can't say I understand what the third part of the
> patch does.
>
>> Anyway the annotation should be placed in the path where
>> wait_for_completion() might be called.
> Yes, it should be called regardless of whether we actually wait or not,
> i.e. before most of the checking in the functions.
>
> My issue #3 that I outlined is related - we'd like to have lockdep
> understand that if this work was on the WQ it might deadlock, but we
> don't have a way to get the WQ unless the work is scheduled - and in the
> case that it is scheduled we might already be hitting the deadlock
> (depending on the order of execution though I guess).
>
>> Absolutely true. You can find my opinion about that in
>> Documentation/locking/crossrelease.txt which has been removed because
>> crossrelease is strong at detecting *potential* deadlock problems.
> Ok, you know what, I'm going to read this now ... hang on........

But I'm very sorry in advance for the poor English in it...

The document should have been contributed by people who
are good at English. It might be too unpolished for you to read.

>
> I see. You were trying to solve the problem of completions/context
> transfers in lockdep.
>
> Given the revert of crossrelease though, we can't actually revert your
> patch anyway, and looking at it again I see what you mean by the "name"
> now ...
>
> So yeah, we should only re-add the two acquire/release pairs. I guess
> I'll make a patch for that, replacing my patch 2.
>
> I'm intrigued by the crossrelease - but that's a separate topic.
>
> johannes
>

Patch

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 78b192071ef7..a6c2b823f348 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2843,7 +2843,8 @@  void drain_workqueue(struct workqueue_struct *wq)
 }
 EXPORT_SYMBOL_GPL(drain_workqueue);
 
-static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
+static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
+			     bool from_cancel)
 {
 	struct worker *worker = NULL;
 	struct worker_pool *pool;
@@ -2885,7 +2886,8 @@  static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
 	 * workqueues the deadlock happens when the rescuer stalls, blocking
 	 * forward progress.
 	 */
-	if (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer) {
+	if (!from_cancel &&
+	    (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)) {
 		lock_map_acquire(&pwq->wq->lockdep_map);
 		lock_map_release(&pwq->wq->lockdep_map);
 	}
@@ -2896,6 +2898,22 @@  static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
 	return false;
 }
 
+static bool __flush_work(struct work_struct *work, bool from_cancel)
+{
+	struct wq_barrier barr;
+
+	if (WARN_ON(!wq_online))
+		return false;
+
+	if (start_flush_work(work, &barr, from_cancel)) {
+		wait_for_completion(&barr.done);
+		destroy_work_on_stack(&barr.work);
+		return true;
+	} else {
+		return false;
+	}
+}
+
 /**
  * flush_work - wait for a work to finish executing the last queueing instance
  * @work: the work to flush
@@ -2909,18 +2927,7 @@  static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
  */
 bool flush_work(struct work_struct *work)
 {
-	struct wq_barrier barr;
-
-	if (WARN_ON(!wq_online))
-		return false;
-
-	if (start_flush_work(work, &barr)) {
-		wait_for_completion(&barr.done);
-		destroy_work_on_stack(&barr.work);
-		return true;
-	} else {
-		return false;
-	}
+	return __flush_work(work, false);
 }
 EXPORT_SYMBOL_GPL(flush_work);
 
@@ -2986,7 +2993,7 @@  static bool __cancel_work_timer(struct work_struct *work, bool is_dwork)
 	 * isn't executing.
 	 */
 	if (wq_online)
-		flush_work(work);
+		__flush_work(work, true);
 
 	clear_work_data(work);