diff mbox

[RESEND,1/3] completion: Add support for initializing completion with lockdep_map

Message ID 1508319532-24655-2-git-send-email-byungchul.park@lge.com (mailing list archive)
State New, archived
Headers show

Commit Message

Byungchul Park Oct. 18, 2017, 9:38 a.m. UTC
Sometimes, we want to initialize completions with sparate lockdep maps
to assign lock classes under control. For example, the workqueue code
manages lockdep maps, as it can classify lockdep maps properly.
Provided a function for that purpose.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/completion.h | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Bart Van Assche Oct. 19, 2017, 11:24 p.m. UTC | #1
On Wed, 2017-10-18 at 18:38 +0900, Byungchul Park wrote:
> Sometimes, we want to initialize completions with sparate lockdep maps

> to assign lock classes under control. For example, the workqueue code

> manages lockdep maps, as it can classify lockdep maps properly.

> Provided a function for that purpose.

> 

> Signed-off-by: Byungchul Park <byungchul.park@lge.com>

> ---

>  include/linux/completion.h | 8 ++++++++

>  1 file changed, 8 insertions(+)

> 

> diff --git a/include/linux/completion.h b/include/linux/completion.h

> index cae5400..182d56e 100644

> --- a/include/linux/completion.h

> +++ b/include/linux/completion.h

> @@ -49,6 +49,13 @@ static inline void complete_release_commit(struct completion *x)

>  	lock_commit_crosslock((struct lockdep_map *)&x->map);

>  }

>  

> +#define init_completion_with_map(x, m)					\

> +do {									\

> +	lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map,	\

> +			(m)->name, (m)->key, 0);				\

> +	__init_completion(x);						\

> +} while (0)


Are there any completion objects for which the cross-release checking is
useful? Are there any wait_for_completion() callers that hold a mutex or
other locking object?

Thanks,

Bart.
Byungchul Park Oct. 20, 2017, 6:14 a.m. UTC | #2
On Thu, Oct 19, 2017 at 11:24:00PM +0000, Bart Van Assche wrote:
> Are there any completion objects for which the cross-release checking is
> useful? Are there any wait_for_completion() callers that hold a mutex or
> other locking object?

Check /proc/lockdep, then you can find all dependencies wrt cross-lock.
I named a lock class of wait_for_completion(), a sting starting with
"(complete)".

For example, in my machine:

console_lock -> (complete)&req.done
cpu_hotplug_lock.rw_sem -> (complete)&st->done_up
cpuhp_state_mutex -> (complete)&st->done_up

and so on.
Thomas Gleixner Oct. 20, 2017, 6:34 a.m. UTC | #3
On Thu, 19 Oct 2017, Bart Van Assche wrote:
> On Wed, 2017-10-18 at 18:38 +0900, Byungchul Park wrote:
> > Sometimes, we want to initialize completions with sparate lockdep maps
> > to assign lock classes under control. For example, the workqueue code
> > manages lockdep maps, as it can classify lockdep maps properly.
> > Provided a function for that purpose.
> > 
> > Signed-off-by: Byungchul Park <byungchul.park@lge.com>
> > ---
> >  include/linux/completion.h | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/include/linux/completion.h b/include/linux/completion.h
> > index cae5400..182d56e 100644
> > --- a/include/linux/completion.h
> > +++ b/include/linux/completion.h
> > @@ -49,6 +49,13 @@ static inline void complete_release_commit(struct completion *x)
> >  	lock_commit_crosslock((struct lockdep_map *)&x->map);
> >  }
> >  
> > +#define init_completion_with_map(x, m)					\
> > +do {									\
> > +	lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map,	\
> > +			(m)->name, (m)->key, 0);				\
> > +	__init_completion(x);						\
> > +} while (0)
> 
> Are there any completion objects for which the cross-release checking is
> useful?

All of them by definition.

> Are there any wait_for_completion() callers that hold a mutex or
> other locking object?

Yes, there are also cross completion dependencies. There have been such
bugs and I expect more to be unearthed.

I really have to ask what your motiviation is to fight the lockdep coverage
of synchronization objects tooth and nail?

Thanks,

	tglx
Bart Van Assche Oct. 20, 2017, 7:58 p.m. UTC | #4
On Fri, 2017-10-20 at 08:34 +0200, Thomas Gleixner wrote:
> On Thu, 19 Oct 2017, Bart Van Assche wrote:

> > Are there any completion objects for which the cross-release checking is

> > useful?

> 

> All of them by definition.


Sorry but I'm not sure that's the best possible answer. In my opinion
avoiding that completion objects have dependencies on other lock objects,
e.g. by avoiding to wait on a completion object while holding a mutex, is a
far superior strategy over adding cross-release checking to completion
objects. The former strategy namely makes it unnecessary to add
cross-release checking to completion objects because that strategy ensures
that these completion objects cannot get involved in a deadlock. The latter
strategy can lead to false positive deadlock reports by the lockdep code,
something none of us wants.

A possible alternative strategy could be to enable cross-release checking
only for those completion objects for which waiting occurs inside a critical
section.

> > Are there any wait_for_completion() callers that hold a mutex or

> > other locking object?

> 

> Yes, there are also cross completion dependencies. There have been such

> bugs and I expect more to be unearthed.

> 

> I really have to ask what your motiviation is to fight the lockdep coverage

> of synchronization objects tooth and nail?


As explained in another e-mail thread, unlike the lock inversion checking
performed by the <= v4.13 lockdep code, cross-release checking is a heuristic
that does not have a sound theoretical basis. The lock validator is an
important tool for kernel developers. It is important that it produces as few
false positives as possible. Since the cross-release checks are enabled
automatically when enabling lockdep, I think it is normal that I, as a kernel
developer, care that the cross-release checks produce as few false positives
as possible.

Bart.
Byungchul Park Oct. 21, 2017, 2:23 a.m. UTC | #5
On Sat, Oct 21, 2017 at 4:58 AM, Bart Van Assche <Bart.VanAssche@wdc.com> wrote:
> Sorry but I'm not sure that's the best possible answer. In my opinion
> avoiding that completion objects have dependencies on other lock objects,
> e.g. by avoiding to wait on a completion object while holding a mutex, is a
> far superior strategy over adding cross-release checking to completion
> objects. The former strategy namely makes it unnecessary to add
> cross-release checking to completion objects because that strategy ensures
> that these completion objects cannot get involved in a deadlock. The latter

It's true if we force it. But do you think it's possible?

> strategy can lead to false positive deadlock reports by the lockdep code,

What do you think false positives come from? It comes from assigning
lock classes falsely where we should more care, rather than lockdep code
itself. The same is applicable to cross-release.

> something none of us wants.
>
> A possible alternative strategy could be to enable cross-release checking
> only for those completion objects for which waiting occurs inside a critical
> section.

Of course, it already did. Cross-release doesn't consider any waiting
outside of critical sections at all, and it should do.

> As explained in another e-mail thread, unlike the lock inversion checking
> performed by the <= v4.13 lockdep code, cross-release checking is a heuristic
> that does not have a sound theoretical basis. The lock validator is an

It's not heuristic but based on the same theoretical basis as <=4.13
lockdep. I mean, the key basis is:

   1) What causes deadlock
   2) What is a dependency
   3) Build a dependency when identified

> important tool for kernel developers. It is important that it produces as few
> false positives as possible. Since the cross-release checks are enabled
> automatically when enabling lockdep, I think it is normal that I, as a kernel
> developer, care that the cross-release checks produce as few false positives
> as possible.

No doubt. That's why I proposed these patches.
Bart Van Assche Oct. 22, 2017, 2:34 p.m. UTC | #6
On Sat, 2017-10-21 at 11:23 +0900, Byungchul Park wrote:
> On Sat, Oct 21, 2017 at 4:58 AM, Bart Van Assche <Bart.VanAssche@wdc.com> wrote:

> > As explained in another e-mail thread, unlike the lock inversion checking

> > performed by the <= v4.13 lockdep code, cross-release checking is a heuristic

> > that does not have a sound theoretical basis. The lock validator is an

> 

> It's not heuristic but based on the same theoretical basis as <=4.13

> lockdep. I mean, the key basis is:

> 

>    1) What causes deadlock

>    2) What is a dependency

>    3) Build a dependency when identified


Sorry but I doubt that that statement is correct. The publication [1] contains
a proof that an algorithm that is closely related to the traditional lockdep
lock inversion detector is able to detect all deadlocks and does not report
false positives for programs that only use mutexes as synchronization objects.
The comment of the authors of that paper for programs that use mutexes,
condition variables and semaphores is as follows: "It is unclear how to extend
the lock-graph-based algorithm in Section 3 to efficiently consider the effects
of condition variables and semaphores. Therefore, when considering all three
synchronization mechanisms, we currently use a naive algorithm that checks each
feasible permutation of the trace for deadlock." In other words, if you have
found an approach for detecting potential deadlocks for programs that use these
three kinds of synchronization objects and that does not report false positives
then that's a breakthrough that's worth publishing in a journal or in the
proceedings of a scientific conference.

Bart.

[1] Agarwal, Rahul, and Scott D. Stoller. "Run-time detection of potential
deadlocks for programs with locks, semaphores, and condition variables." In
Proceedings of the 2006 workshop on Parallel and distributed systems: testing
and debugging, pp. 51-60. ACM, 2006.
(https://pdfs.semanticscholar.org/9324/fc0b5d5cd5e05d551a3e98757122039946a2.pdf).
Byungchul Park Oct. 23, 2017, 2:08 a.m. UTC | #7
On Sun, Oct 22, 2017 at 02:34:56PM +0000, Bart Van Assche wrote:
> On Sat, 2017-10-21 at 11:23 +0900, Byungchul Park wrote:
> > On Sat, Oct 21, 2017 at 4:58 AM, Bart Van Assche <Bart.VanAssche@wdc.com> wrote:
> > > As explained in another e-mail thread, unlike the lock inversion checking
> > > performed by the <= v4.13 lockdep code, cross-release checking is a heuristic
> > > that does not have a sound theoretical basis. The lock validator is an
> > 
> > It's not heuristic but based on the same theoretical basis as <=4.13
> > lockdep. I mean, the key basis is:
> > 
> >    1) What causes deadlock
> >    2) What is a dependency
> >    3) Build a dependency when identified
> 
> Sorry but I doubt that that statement is correct. The publication [1] contains

IMHO, the paper is talking about totally different things wrt
deadlocks by wait_for_event/event, that is, lost events.

Furthermore, it doesn't rely on dependencies itself, but just lock
ordering 'case by case', which is a subset of the more general concept.

> a proof that an algorithm that is closely related to the traditional lockdep
> lock inversion detector is able to detect all deadlocks and does not report

I can admit this.

> false positives for programs that only use mutexes as synchronization objects.

I want to ask you. What makes false positives avoidable in the paper?

> The comment of the authors of that paper for programs that use mutexes,
> condition variables and semaphores is as follows: "It is unclear how to extend
> the lock-graph-based algorithm in Section 3 to efficiently consider the effects
> of condition variables and semaphores. Therefore, when considering all three
> synchronization mechanisms, we currently use a naive algorithm that checks each

Right. The paper seems to use a naive algorigm for that cases, not
replying on dependencies, which they should.

> feasible permutation of the trace for deadlock." In other words, if you have
> found an approach for detecting potential deadlocks for programs that use these
> three kinds of synchronization objects and that does not report false positives
> then that's a breakthrough that's worth publishing in a journal or in the
> proceedings of a scientific conference.

Please, point out logical problems of cross-release than saying it's
impossbile according to the paper. I think you'd better understand how
cross-release works *first*. I'll do my best to help you do.

> Bart.
> 
> [1] Agarwal, Rahul, and Scott D. Stoller. "Run-time detection of potential
> deadlocks for programs with locks, semaphores, and condition variables." In
> Proceedings of the 2006 workshop on Parallel and distributed systems: testing
> and debugging, pp. 51-60. ACM, 2006.
> (https://pdfs.semanticscholar.org/9324/fc0b5d5cd5e05d551a3e98757122039946a2.pdf).
Bart Van Assche Oct. 25, 2017, 7:07 a.m. UTC | #8
On Mon, 2017-10-23 at 11:08 +0900, Byungchul Park wrote:
> On Sun, Oct 22, 2017 at 02:34:56PM +0000, Bart Van Assche wrote:

> > On Sat, 2017-10-21 at 11:23 +0900, Byungchul Park wrote:

> > > On Sat, Oct 21, 2017 at 4:58 AM, Bart Van Assche <Bart.VanAssche@wdc.com> wrote:

> > > > As explained in another e-mail thread, unlike the lock inversion checking

> > > > performed by the <= v4.13 lockdep code, cross-release checking is a heuristic

> > > > that does not have a sound theoretical basis. The lock validator is an

> > > 

> > > It's not heuristic but based on the same theoretical basis as <=4.13

> > > lockdep. I mean, the key basis is:

> > > 

> > >    1) What causes deadlock

> > >    2) What is a dependency

> > >    3) Build a dependency when identified

> > 

> > Sorry but I doubt that that statement is correct. The publication [1] contains

> 

> IMHO, the paper is talking about totally different things wrt

> deadlocks by wait_for_event/event, that is, lost events.


Please reread the paper title. The authors of the paper explain that their algorithm
can detect lost events but the most significant contribution of the paper is deadlock
detection.

> > false positives for programs that only use mutexes as synchronization objects.

> 

> I want to ask you. What makes false positives avoidable in the paper?


The algorithm used to detect deadlocks. That algorithm has been explained clearly
in the paper.

> > The comment of the authors of that paper for programs that use mutexes,

> > condition variables and semaphores is as follows: "It is unclear how to extend

> > the lock-graph-based algorithm in Section 3 to efficiently consider the effects

> > of condition variables and semaphores. Therefore, when considering all three

> > synchronization mechanisms, we currently use a naive algorithm that checks each

> > feasible permutation of the trace for deadlock." In other words, if you have

> > found an approach for detecting potential deadlocks for programs that use these

> > three kinds of synchronization objects and that does not report false positives

> > then that's a breakthrough that's worth publishing in a journal or in the

> > proceedings of a scientific conference.

> 

> Please, point out logical problems of cross-release than saying it's

> impossbile according to the paper.


Isn't that the same? If it's impossible to use lock-graphs for detecting deadlocks
in programs that use mutexes, semaphores and condition variables without triggering
false positives that means that every approach that tries to detect deadlocks and
that is based on lock graphs, including cross-release, must report false positives
for certain programs.

Bart.
Byungchul Park Oct. 25, 2017, 11:49 a.m. UTC | #9
On Wed, Oct 25, 2017 at 07:07:06AM +0000, Bart Van Assche wrote:
> > Please, point out logical problems of cross-release than saying it's
> > impossbile according to the paper.
> 
> Isn't that the same? If it's impossible to use lock-graphs for detecting deadlocks
> in programs that use mutexes, semaphores and condition variables without triggering
> false positives that means that every approach that tries to detect deadlocks and
> that is based on lock graphs, including cross-release, must report false positives
> for certain programs.

Right. That's why I'm currently trying to assign lock classes properly
where false positives were reported. You seems to say there is another
cause of false positives. If yes, please let me know what it is. If you
do with an example, it would be more helpful for me to understand you.
diff mbox

Patch

diff --git a/include/linux/completion.h b/include/linux/completion.h
index cae5400..182d56e 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -49,6 +49,13 @@  static inline void complete_release_commit(struct completion *x)
 	lock_commit_crosslock((struct lockdep_map *)&x->map);
 }
 
+#define init_completion_with_map(x, m)					\
+do {									\
+	lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map,	\
+			(m)->name, (m)->key, 0);				\
+	__init_completion(x);						\
+} while (0)
+
 #define init_completion(x)						\
 do {									\
 	static struct lock_class_key __key;				\
@@ -58,6 +65,7 @@  static inline void complete_release_commit(struct completion *x)
 	__init_completion(x);						\
 } while (0)
 #else
+#define init_completion_with_map(x, m) __init_completion(x)
 #define init_completion(x) __init_completion(x)
 static inline void complete_acquire(struct completion *x) {}
 static inline void complete_release(struct completion *x) {}