diff mbox

[2/2] lockdep: check that no locks held at freeze time

Message ID 1367615050-3894-2-git-send-email-ccross@android.com (mailing list archive)
State New, archived
Headers show

Commit Message

Colin Cross May 3, 2013, 9:04 p.m. UTC
From: Mandeep Singh Baines <msb@chromium.org>

We shouldn't try_to_freeze if locks are held.  Holding a lock can cause a
deadlock if the lock is later acquired in the suspend or hibernate path
(e.g.  by dpm).  Holding a lock can also cause a deadlock in the case of
cgroup_freezer if a lock is held inside a frozen cgroup that is later
acquired by a process outside that group.

History:
This patch was originally applied as 6aa9707099c and reverted in
dbf520a9d7d4 because NFS was freezing with locks held.  It was
deemed better to keep the bad freeze point in NFS to allow laptops
to suspend consistently.  The previous patch in this series converts
NFS to call _unsafe versions of the freezable helpers so that
lockdep doesn't complain about them until a more correct fix
can be applied.

[akpm@linux-foundation.org: export debug_check_no_locks_held]
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Cc: Ben Chan <benchan@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ccross@android.com: don't warn if try_to_freeze_unsafe is called]
Signed-off-by: Colin Cross <ccross@android.com>
---
 include/linux/debug_locks.h |  4 ++--
 include/linux/freezer.h     |  3 +++
 kernel/exit.c               |  2 +-
 kernel/lockdep.c            | 17 ++++++++---------
 4 files changed, 14 insertions(+), 12 deletions(-)

As requested by Tejun, this is to prevent incorrect use when
I add new freezable helpers in a subsequent patch series.

I'm not sure what the protocol is for Signed-off-by when
reapplying a reverted patch, but I got the patch from Linus'
tree so I left all of them and added mine at the bottom.

Comments

Pavel Machek May 4, 2013, 1:04 p.m. UTC | #1
On Fri 2013-05-03 14:04:10, Colin Cross wrote:
> From: Mandeep Singh Baines <msb@chromium.org>
> 
> We shouldn't try_to_freeze if locks are held.  Holding a lock can cause a
> deadlock if the lock is later acquired in the suspend or hibernate path
> (e.g.  by dpm).  Holding a lock can also cause a deadlock in the case of
> cgroup_freezer if a lock is held inside a frozen cgroup that is later
> acquired by a process outside that group.

Ok, but this does not explain why

> --- a/include/linux/debug_locks.h
> +++ b/include/linux/debug_locks.h
> @@ -51,7 +51,7 @@ struct task_struct;
>  extern void debug_show_all_locks(void);
>  extern void debug_show_held_locks(struct task_struct *task);
>  extern void debug_check_no_locks_freed(const void *from, unsigned long len);
> -extern void debug_check_no_locks_held(struct task_struct *task);
> +extern void debug_check_no_locks_held(void);
>  #else
>  static inline void debug_show_all_locks(void)
>  {

Removing task_struct argument from  those functions is good idea?

> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -835,7 +835,7 @@ void do_exit(long code)
>  	/*
>  	 * Make sure we are holding no locks:
>  	 */
> -	debug_check_no_locks_held(tsk);
> +	debug_check_no_locks_held();

Is task guaranteed == current?

									Pavel
Colin Cross May 4, 2013, 8:27 p.m. UTC | #2
On Sat, May 4, 2013 at 6:04 AM, Pavel Machek <pavel@ucw.cz> wrote:
> On Fri 2013-05-03 14:04:10, Colin Cross wrote:
>> From: Mandeep Singh Baines <msb@chromium.org>
>>
>> We shouldn't try_to_freeze if locks are held.  Holding a lock can cause a
>> deadlock if the lock is later acquired in the suspend or hibernate path
>> (e.g.  by dpm).  Holding a lock can also cause a deadlock in the case of
>> cgroup_freezer if a lock is held inside a frozen cgroup that is later
>> acquired by a process outside that group.
>
> Ok, but this does not explain why
>
>> --- a/include/linux/debug_locks.h
>> +++ b/include/linux/debug_locks.h
>> @@ -51,7 +51,7 @@ struct task_struct;
>>  extern void debug_show_all_locks(void);
>>  extern void debug_show_held_locks(struct task_struct *task);
>>  extern void debug_check_no_locks_freed(const void *from, unsigned long len);
>> -extern void debug_check_no_locks_held(struct task_struct *task);
>> +extern void debug_check_no_locks_held(void);
>>  #else
>>  static inline void debug_show_all_locks(void)
>>  {
>
> Removing task_struct argument from  those functions is good idea?

This is an existing patch that was merged in 3.9 and then reverted
again, so it has already been reviewed and accepted once.

>> --- a/kernel/exit.c
>> +++ b/kernel/exit.c
>> @@ -835,7 +835,7 @@ void do_exit(long code)
>>       /*
>>        * Make sure we are holding no locks:
>>        */
>> -     debug_check_no_locks_held(tsk);
>> +     debug_check_no_locks_held();
>
> Is task guaranteed == current?

Yes, the first line of do_exit is:
        struct task_struct *tsk = current;
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pavel Machek May 4, 2013, 10:57 p.m. UTC | #3
On Sat 2013-05-04 13:27:23, Colin Cross wrote:
> On Sat, May 4, 2013 at 6:04 AM, Pavel Machek <pavel@ucw.cz> wrote:
> > On Fri 2013-05-03 14:04:10, Colin Cross wrote:
> >> From: Mandeep Singh Baines <msb@chromium.org>
> >>
> >> We shouldn't try_to_freeze if locks are held.  Holding a lock can cause a
> >> deadlock if the lock is later acquired in the suspend or hibernate path
> >> (e.g.  by dpm).  Holding a lock can also cause a deadlock in the case of
> >> cgroup_freezer if a lock is held inside a frozen cgroup that is later
> >> acquired by a process outside that group.
> >
> > Ok, but this does not explain why
> >
> >> --- a/include/linux/debug_locks.h
> >> +++ b/include/linux/debug_locks.h
> >> @@ -51,7 +51,7 @@ struct task_struct;
> >>  extern void debug_show_all_locks(void);
> >>  extern void debug_show_held_locks(struct task_struct *task);
> >>  extern void debug_check_no_locks_freed(const void *from, unsigned long len);
> >> -extern void debug_check_no_locks_held(struct task_struct *task);
> >> +extern void debug_check_no_locks_held(void);
> >>  #else
> >>  static inline void debug_show_all_locks(void)
> >>  {
> >
> > Removing task_struct argument from  those functions is good idea?
> 
> This is an existing patch that was merged in 3.9 and then reverted
> again, so it has already been reviewed and accepted once.

Well, it was also reverted once :-).

> >> --- a/kernel/exit.c
> >> +++ b/kernel/exit.c
> >> @@ -835,7 +835,7 @@ void do_exit(long code)
> >>       /*
> >>        * Make sure we are holding no locks:
> >>        */
> >> -     debug_check_no_locks_held(tsk);
> >> +     debug_check_no_locks_held();
> >
> > Is task guaranteed == current?
> 
> Yes, the first line of do_exit is:
>         struct task_struct *tsk = current;

Aha, I understand it now.

Accessing current is slower than local variable. So your "new" code
will work but will be slower. Please revert this part.
									Pavel
Colin Cross May 4, 2013, 11:49 p.m. UTC | #4
On Sat, May 4, 2013 at 3:57 PM, Pavel Machek <pavel@ucw.cz> wrote:
> On Sat 2013-05-04 13:27:23, Colin Cross wrote:
>> On Sat, May 4, 2013 at 6:04 AM, Pavel Machek <pavel@ucw.cz> wrote:
>> > On Fri 2013-05-03 14:04:10, Colin Cross wrote:
>> >> From: Mandeep Singh Baines <msb@chromium.org>
>> >>
>> >> We shouldn't try_to_freeze if locks are held.  Holding a lock can cause a
>> >> deadlock if the lock is later acquired in the suspend or hibernate path
>> >> (e.g.  by dpm).  Holding a lock can also cause a deadlock in the case of
>> >> cgroup_freezer if a lock is held inside a frozen cgroup that is later
>> >> acquired by a process outside that group.
>> >
>> > Ok, but this does not explain why
>> >
>> >> --- a/include/linux/debug_locks.h
>> >> +++ b/include/linux/debug_locks.h
>> >> @@ -51,7 +51,7 @@ struct task_struct;
>> >>  extern void debug_show_all_locks(void);
>> >>  extern void debug_show_held_locks(struct task_struct *task);
>> >>  extern void debug_check_no_locks_freed(const void *from, unsigned long len);
>> >> -extern void debug_check_no_locks_held(struct task_struct *task);
>> >> +extern void debug_check_no_locks_held(void);
>> >>  #else
>> >>  static inline void debug_show_all_locks(void)
>> >>  {
>> >
>> > Removing task_struct argument from  those functions is good idea?
>>
>> This is an existing patch that was merged in 3.9 and then reverted
>> again, so it has already been reviewed and accepted once.
>
> Well, it was also reverted once :-).

It was reverted because of problems in NFS, not because of any problem
with this patch.

>> >> --- a/kernel/exit.c
>> >> +++ b/kernel/exit.c
>> >> @@ -835,7 +835,7 @@ void do_exit(long code)
>> >>       /*
>> >>        * Make sure we are holding no locks:
>> >>        */
>> >> -     debug_check_no_locks_held(tsk);
>> >> +     debug_check_no_locks_held();
>> >
>> > Is task guaranteed == current?
>>
>> Yes, the first line of do_exit is:
>>         struct task_struct *tsk = current;
>
> Aha, I understand it now.
>
> Accessing current is slower than local variable. So your "new" code
> will work but will be slower. Please revert this part.

Using current instead of passing in tsk was done at Andrew Morton's
suggestion, and makes no difference from the freezer's perspective
since it would have to use current to get the task to pass in, so I'm
going to leave it as is.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pavel Machek May 5, 2013, 12:05 a.m. UTC | #5
Hi!

> >> >> --- a/kernel/exit.c
> >> >> +++ b/kernel/exit.c
> >> >> @@ -835,7 +835,7 @@ void do_exit(long code)
> >> >>       /*
> >> >>        * Make sure we are holding no locks:
> >> >>        */
> >> >> -     debug_check_no_locks_held(tsk);
> >> >> +     debug_check_no_locks_held();
> >> >
> >> > Is task guaranteed == current?
> >>
> >> Yes, the first line of do_exit is:
> >>         struct task_struct *tsk = current;
> >
> > Aha, I understand it now.
> >
> > Accessing current is slower than local variable. So your "new" code
> > will work but will be slower. Please revert this part.
> 
> Using current instead of passing in tsk was done at Andrew Morton's
> suggestion, and makes no difference from the freezer's perspective
> since it would have to use current to get the task to pass in, so I'm
> going to leave it as is.

Well, current is:

static inline struct thread_info *current_thread_info(void)
{
        register unsigned long sp asm ("sp");
        return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
}

#define get_current() (current_thread_info()->task)

#define current get_current()

Instead of passing computed value to debug_check_no_locks_held(), you
force it to be computed again. do_exit() performance matters for
configure scripts, etc.

I'd say it makes sense to keep the optimalization. akpm can correct
me.

									Pavel
Colin Cross May 5, 2013, 12:23 a.m. UTC | #6
On Sat, May 4, 2013 at 5:05 PM, Pavel Machek <pavel@ucw.cz> wrote:
> Hi!
>
>> >> >> --- a/kernel/exit.c
>> >> >> +++ b/kernel/exit.c
>> >> >> @@ -835,7 +835,7 @@ void do_exit(long code)
>> >> >>       /*
>> >> >>        * Make sure we are holding no locks:
>> >> >>        */
>> >> >> -     debug_check_no_locks_held(tsk);
>> >> >> +     debug_check_no_locks_held();
>> >> >
>> >> > Is task guaranteed == current?
>> >>
>> >> Yes, the first line of do_exit is:
>> >>         struct task_struct *tsk = current;
>> >
>> > Aha, I understand it now.
>> >
>> > Accessing current is slower than local variable. So your "new" code
>> > will work but will be slower. Please revert this part.
>>
>> Using current instead of passing in tsk was done at Andrew Morton's
>> suggestion, and makes no difference from the freezer's perspective
>> since it would have to use current to get the task to pass in, so I'm
>> going to leave it as is.
>
> Well, current is:
>
> static inline struct thread_info *current_thread_info(void)
> {
>         register unsigned long sp asm ("sp");
>         return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
> }
>
> #define get_current() (current_thread_info()->task)
>
> #define current get_current()
>
> Instead of passing computed value to debug_check_no_locks_held(), you
> force it to be computed again. do_exit() performance matters for
> configure scripts, etc.
>
> I'd say it makes sense to keep the optimalization. akpm can correct
> me.

That translates to 3 instructions, with no memory accesses:
c0008350:       e1a0300d        mov     r3, sp
c0008354:       e3c32d7f        bic     r2, r3, #8128   ; 0x1fc0
c0008358:       e3c2203f        bic     r2, r2, #63     ; 0x3f
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pavel Machek May 5, 2013, 1:13 a.m. UTC | #7
On Sat 2013-05-04 17:23:01, Colin Cross wrote:
> On Sat, May 4, 2013 at 5:05 PM, Pavel Machek <pavel@ucw.cz> wrote:
> > Hi!
> >
> >> >> >> --- a/kernel/exit.c
> >> >> >> +++ b/kernel/exit.c
> >> >> >> @@ -835,7 +835,7 @@ void do_exit(long code)
> >> >> >>       /*
> >> >> >>        * Make sure we are holding no locks:
> >> >> >>        */
> >> >> >> -     debug_check_no_locks_held(tsk);
> >> >> >> +     debug_check_no_locks_held();
> >> >> >
> >> >> > Is task guaranteed == current?
> >> >>
> >> >> Yes, the first line of do_exit is:
> >> >>         struct task_struct *tsk = current;
> >> >
> >> > Aha, I understand it now.
> >> >
> >> > Accessing current is slower than local variable. So your "new" code
> >> > will work but will be slower. Please revert this part.
> >>
> >> Using current instead of passing in tsk was done at Andrew Morton's
> >> suggestion, and makes no difference from the freezer's perspective
> >> since it would have to use current to get the task to pass in, so I'm
> >> going to leave it as is.
> >
> > Well, current is:
> >
> > static inline struct thread_info *current_thread_info(void)
> > {
> >         register unsigned long sp asm ("sp");
> >         return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
> > }
> >
> > #define get_current() (current_thread_info()->task)
> >
> > #define current get_current()
> >
> > Instead of passing computed value to debug_check_no_locks_held(), you
> > force it to be computed again. do_exit() performance matters for
> > configure scripts, etc.
> >
> > I'd say it makes sense to keep the optimalization. akpm can correct
> > me.
> 
> That translates to 3 instructions, with no memory accesses:
> c0008350:       e1a0300d        mov     r3, sp
> c0008354:       e3c32d7f        bic     r2, r3, #8128   ; 0x1fc0
> c0008358:       e3c2203f        bic     r2, r2, #63     ; 0x3f

On ARM, you are right. It seems to have memory access on s390 and
x86-64. Ok, it is probably going to be in cache, but...

									Pavel
Ingo Molnar May 5, 2013, 9:18 a.m. UTC | #8
* Pavel Machek <pavel@ucw.cz> wrote:

> Hi!
> 
> > >> >> --- a/kernel/exit.c
> > >> >> +++ b/kernel/exit.c
> > >> >> @@ -835,7 +835,7 @@ void do_exit(long code)
> > >> >>       /*
> > >> >>        * Make sure we are holding no locks:
> > >> >>        */
> > >> >> -     debug_check_no_locks_held(tsk);
> > >> >> +     debug_check_no_locks_held();
> > >> >
> > >> > Is task guaranteed == current?
> > >>
> > >> Yes, the first line of do_exit is:
> > >>         struct task_struct *tsk = current;
> > >
> > > Aha, I understand it now.
> > >
> > > Accessing current is slower than local variable. So your "new" code
> > > will work but will be slower. Please revert this part.
> > 
> > Using current instead of passing in tsk was done at Andrew Morton's
> > suggestion, and makes no difference from the freezer's perspective
> > since it would have to use current to get the task to pass in, so I'm
> > going to leave it as is.
> 
> Well, current is:
> 
> static inline struct thread_info *current_thread_info(void)
> {
>         register unsigned long sp asm ("sp");
>         return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
> }

That's the old 32-bit x86 trick to compute 'current' from the kernel stack 
pointer.

It can be done better - for example on platforms with optimized percpu 
variables (x86-64) it looks like this:

static inline struct thread_info *current_thread_info(void)
{
        struct thread_info *ti;
        ti = (void *)(this_cpu_read_stable(kernel_stack) +
                      KERNEL_STACK_OFFSET - THREAD_SIZE);
        return ti;
}

Which turns the computation of 'current' into a single instruction. For 
example, to access current->pid [which fields is at offset 0x2a4], we get:

    3ad0:       65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
    3ad7:       00 00
    3ad9:       8b 80 a4 02 00 00       mov    0x2a4(%rax),%eax

I also agree with the removal of the 'tsk' parameter because the function 
itself internally assumes that tsk == current.

[ We could perhaps rename the function to 
  debug_check_no_locks_held_curr(), to make it clear it operates on the 
  current task. ]

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra May 6, 2013, 8:55 a.m. UTC | #9
On Sun, May 05, 2013 at 11:18:44AM +0200, Ingo Molnar wrote:
> That's the old 32-bit x86 trick to compute 'current' from the kernel stack 
> pointer.
> 
> It can be done better - for example on platforms with optimized percpu 
> variables (x86-64) it looks like this:

Doesn't i386 have all the funny per-cpu stuff too? So the only reason it still
does the fugly stack based thing is because nobody could be arsed to do the
work of converting it.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pavel Machek May 6, 2013, 12:11 p.m. UTC | #10
1;2403;0cOn Mon 2013-05-06 10:55:47, Peter Zijlstra wrote:
> On Sun, May 05, 2013 at 11:18:44AM +0200, Ingo Molnar wrote:
> > That's the old 32-bit x86 trick to compute 'current' from the kernel stack 
> > pointer.
> > 
> > It can be done better - for example on platforms with optimized percpu 
> > variables (x86-64) it looks like this:
> 
> Doesn't i386 have all the funny per-cpu stuff too? So the only reason it still
> does the fugly stack based thing is because nobody could be arsed to do the
> work of converting it.

(Note that the quoted example was from ARM. But also note that the
percpu stuff requires memory access, so...)
									Pavel
Linus Torvalds May 6, 2013, 2:33 p.m. UTC | #11
On Mon, May 6, 2013 at 1:55 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
>
> Doesn't i386 have all the funny per-cpu stuff too? So the only reason it still
> does the fugly stack based thing is because nobody could be arsed to do the
> work of converting it.

Umm. That "fugly stack-based" thing is better than the per-cpu crap.

The percpu stuff implies a memory load. The stack based thing gets
thread_info with pure register accesses. Much better.

For "current()" the per-cpu thing may be better, but if you actually
need the thread-info (not the case here, but in other places), the
stack masking is superior when it works (ie when you don't have
multi-stack issues due to irq's etc)

                Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra May 6, 2013, 2:42 p.m. UTC | #12
On Mon, May 06, 2013 at 07:33:28AM -0700, Linus Torvalds wrote:
> On Mon, May 6, 2013 at 1:55 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> >
> > Doesn't i386 have all the funny per-cpu stuff too? So the only reason it still
> > does the fugly stack based thing is because nobody could be arsed to do the
> > work of converting it.
> 
> Umm. That "fugly stack-based" thing is better than the per-cpu crap.
> 
> The percpu stuff implies a memory load. The stack based thing gets
> thread_info with pure register accesses. Much better.
> 
> For "current()" the per-cpu thing may be better, but if you actually
> need the thread-info (not the case here, but in other places), the
> stack masking is superior when it works (ie when you don't have
> multi-stack issues due to irq's etc)

But you can do both right? Use per-cpu for current and stack frobbery for
current_thread_info().

That said, ISTR some risky bits where the stack frobbery went awry due to
irq-stacks which is the source for my feelings towards the stack frobbery.

That and of course that i386 and x86-64 behave differently for no apparent
reason.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo May 6, 2013, 7:01 p.m. UTC | #13
On Fri, May 03, 2013 at 02:04:10PM -0700, Colin Cross wrote:
> From: Mandeep Singh Baines <msb@chromium.org>
> 
> We shouldn't try_to_freeze if locks are held.  Holding a lock can cause a
> deadlock if the lock is later acquired in the suspend or hibernate path
> (e.g.  by dpm).  Holding a lock can also cause a deadlock in the case of
> cgroup_freezer if a lock is held inside a frozen cgroup that is later
> acquired by a process outside that group.
> 
> History:
> This patch was originally applied as 6aa9707099c and reverted in
> dbf520a9d7d4 because NFS was freezing with locks held.  It was
> deemed better to keep the bad freeze point in NFS to allow laptops
> to suspend consistently.  The previous patch in this series converts
> NFS to call _unsafe versions of the freezable helpers so that
> lockdep doesn't complain about them until a more correct fix
> can be applied.

I don't care about %current change, especially given that it's a debug
interface but that really should be a separate patch, so please split
it out if you want it (and I think we want it).

Thanks.
Colin Cross May 6, 2013, 7:30 p.m. UTC | #14
On Mon, May 6, 2013 at 12:01 PM, Tejun Heo <tj@kernel.org> wrote:
> On Fri, May 03, 2013 at 02:04:10PM -0700, Colin Cross wrote:
>> From: Mandeep Singh Baines <msb@chromium.org>
>>
>> We shouldn't try_to_freeze if locks are held.  Holding a lock can cause a
>> deadlock if the lock is later acquired in the suspend or hibernate path
>> (e.g.  by dpm).  Holding a lock can also cause a deadlock in the case of
>> cgroup_freezer if a lock is held inside a frozen cgroup that is later
>> acquired by a process outside that group.
>>
>> History:
>> This patch was originally applied as 6aa9707099c and reverted in
>> dbf520a9d7d4 because NFS was freezing with locks held.  It was
>> deemed better to keep the bad freeze point in NFS to allow laptops
>> to suspend consistently.  The previous patch in this series converts
>> NFS to call _unsafe versions of the freezable helpers so that
>> lockdep doesn't complain about them until a more correct fix
>> can be applied.
>
> I don't care about %current change, especially given that it's a debug
> interface but that really should be a separate patch, so please split
> it out if you want it (and I think we want it).

The current change was requested by akpm and was part of the original
patch.  Is it really worth confusing the history of this patch even
more, applying it the first time, reverting it, and then applying it
again in two parts?
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo May 6, 2013, 7:33 p.m. UTC | #15
Hello,

On Mon, May 06, 2013 at 12:30:19PM -0700, Colin Cross wrote:
> > I don't care about %current change, especially given that it's a debug
> > interface but that really should be a separate patch, so please split
> > it out if you want it (and I think we want it).
> 
> The current change was requested by akpm and was part of the original
> patch.  Is it really worth confusing the history of this patch even
> more, applying it the first time, reverting it, and then applying it
> again in two parts?

I don't know.  The patch seems confusing to me.  It really is about
adding single lockdep annotation but comes with other changes.  I
don't think it's a big deal either way but at least we wouldn't be
having this %current vs. @tsk conversation which is mostly irrelevant
to the actual proposed change, right?  It really should have been a
separate patch from the beginning.  Just refer to the original commit
and explain what happened?

Thanks.
Rafael Wysocki May 6, 2013, 8:06 p.m. UTC | #16
On Monday, May 06, 2013 12:33:07 PM Tejun Heo wrote:
> Hello,
> 
> On Mon, May 06, 2013 at 12:30:19PM -0700, Colin Cross wrote:
> > > I don't care about %current change, especially given that it's a debug
> > > interface but that really should be a separate patch, so please split
> > > it out if you want it (and I think we want it).
> > 
> > The current change was requested by akpm and was part of the original
> > patch.  Is it really worth confusing the history of this patch even
> > more, applying it the first time, reverting it, and then applying it
> > again in two parts?
> 
> I don't know.  The patch seems confusing to me.  It really is about
> adding single lockdep annotation but comes with other changes.  I
> don't think it's a big deal either way but at least we wouldn't be
> having this %current vs. @tsk conversation which is mostly irrelevant
> to the actual proposed change, right?  It really should have been a
> separate patch from the beginning.  Just refer to the original commit
> and explain what happened?

Yeah.  I'd prefer that very much.

Thanks,
Rafael
diff mbox

Patch

diff --git a/include/linux/debug_locks.h b/include/linux/debug_locks.h
index 3bd46f7..a975de1 100644
--- a/include/linux/debug_locks.h
+++ b/include/linux/debug_locks.h
@@ -51,7 +51,7 @@  struct task_struct;
 extern void debug_show_all_locks(void);
 extern void debug_show_held_locks(struct task_struct *task);
 extern void debug_check_no_locks_freed(const void *from, unsigned long len);
-extern void debug_check_no_locks_held(struct task_struct *task);
+extern void debug_check_no_locks_held(void);
 #else
 static inline void debug_show_all_locks(void)
 {
@@ -67,7 +67,7 @@  debug_check_no_locks_freed(const void *from, unsigned long len)
 }
 
 static inline void
-debug_check_no_locks_held(struct task_struct *task)
+debug_check_no_locks_held(void)
 {
 }
 #endif
diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index 5b31e21c..efb80dd 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -3,6 +3,7 @@ 
 #ifndef FREEZER_H_INCLUDED
 #define FREEZER_H_INCLUDED
 
+#include <linux/debug_locks.h>
 #include <linux/sched.h>
 #include <linux/wait.h>
 #include <linux/atomic.h>
@@ -60,6 +61,8 @@  static inline bool try_to_freeze_unsafe(void)
 
 static inline bool try_to_freeze(void)
 {
+	if (!(current->flags & PF_NOFREEZE))
+		debug_check_no_locks_held();
 	return try_to_freeze_unsafe();
 }
 
diff --git a/kernel/exit.c b/kernel/exit.c
index 60bc027..51e485c 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -835,7 +835,7 @@  void do_exit(long code)
 	/*
 	 * Make sure we are holding no locks:
 	 */
-	debug_check_no_locks_held(tsk);
+	debug_check_no_locks_held();
 	/*
 	 * We can do this unlocked here. The futex code uses this flag
 	 * just to verify whether the pi state cleanup has been done
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 8a0efac..259db20 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -4088,7 +4088,7 @@  void debug_check_no_locks_freed(const void *mem_from, unsigned long mem_len)
 }
 EXPORT_SYMBOL_GPL(debug_check_no_locks_freed);
 
-static void print_held_locks_bug(struct task_struct *curr)
+static void print_held_locks_bug(void)
 {
 	if (!debug_locks_off())
 		return;
@@ -4097,22 +4097,21 @@  static void print_held_locks_bug(struct task_struct *curr)
 
 	printk("\n");
 	printk("=====================================\n");
-	printk("[ BUG: lock held at task exit time! ]\n");
+	printk("[ BUG: %s/%d still has locks held! ]\n",
+	       current->comm, task_pid_nr(current));
 	print_kernel_ident();
 	printk("-------------------------------------\n");
-	printk("%s/%d is exiting with locks still held!\n",
-		curr->comm, task_pid_nr(curr));
-	lockdep_print_held_locks(curr);
-
+	lockdep_print_held_locks(current);
 	printk("\nstack backtrace:\n");
 	dump_stack();
 }
 
-void debug_check_no_locks_held(struct task_struct *task)
+void debug_check_no_locks_held(void)
 {
-	if (unlikely(task->lockdep_depth > 0))
-		print_held_locks_bug(task);
+	if (unlikely(current->lockdep_depth > 0))
+		print_held_locks_bug();
 }
+EXPORT_SYMBOL_GPL(debug_check_no_locks_held);
 
 void debug_show_all_locks(void)
 {