diff mbox

[v4,2/5] x86,entry: Only call user_exit if TIF_NOHZ

Message ID 7123b2489cc5d1d5abb7bcf1364ca729cab3e6ca.1406604806.git.luto@amacapital.net (mailing list archive)
State New, archived
Headers show

Commit Message

Andy Lutomirski July 29, 2014, 3:38 a.m. UTC
The RCU context tracking code requires that arch code call
user_exit() on any entry into kernel code if TIF_NOHZ is set.  This
patch adds a check for TIF_NOHZ and a comment to the syscall entry
tracing code.

The main purpose of this patch is to make the code easier to follow:
one can read the body of user_exit and of every function it calls
without finding any explanation of why it's called for traced
syscalls but not for untraced syscalls.  This makes it clear when
user_exit() is necessary.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
---
 arch/x86/kernel/ptrace.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Oleg Nesterov July 29, 2014, 7:32 p.m. UTC | #1
On 07/28, Andy Lutomirski wrote:
>
> @@ -1449,7 +1449,12 @@ long syscall_trace_enter(struct pt_regs *regs)
>  {
>  	long ret = 0;
>  
> -	user_exit();
> +	/*
> +	 * If TIF_NOHZ is set, we are required to call user_exit() before
> +	 * doing anything that could touch RCU.
> +	 */
> +	if (test_thread_flag(TIF_NOHZ))
> +		user_exit();

Personally I still think this change just adds more confusion, but I leave
this to you and Frederic.

It is not that "If TIF_NOHZ is set, we are required to call user_exit()", we
need to call user_exit() just because we enter the kernel. TIF_NOHZ is just
the implementation detail which triggers this slow path.

At least it should be correct, unless I am confused even more than I think.

Oleg.
Frederic Weisbecker July 30, 2014, 4:43 p.m. UTC | #2
On Tue, Jul 29, 2014 at 09:32:32PM +0200, Oleg Nesterov wrote:
> On 07/28, Andy Lutomirski wrote:
> >
> > @@ -1449,7 +1449,12 @@ long syscall_trace_enter(struct pt_regs *regs)
> >  {
> >  	long ret = 0;
> >  
> > -	user_exit();
> > +	/*
> > +	 * If TIF_NOHZ is set, we are required to call user_exit() before
> > +	 * doing anything that could touch RCU.
> > +	 */
> > +	if (test_thread_flag(TIF_NOHZ))
> > +		user_exit();
> 
> Personally I still think this change just adds more confusion, but I leave
> this to you and Frederic.
> 
> It is not that "If TIF_NOHZ is set, we are required to call user_exit()", we
> need to call user_exit() just because we enter the kernel. TIF_NOHZ is just
> the implementation detail which triggers this slow path.
> 
> At least it should be correct, unless I am confused even more than I think.

Agreed, Perhaps the confusion is on the syscall_trace_enter() name which suggests
this is only about tracing? syscall_slowpath_enter() could be an alternative.
But that's still tracing in a general sense so...
Andy Lutomirski July 30, 2014, 5:23 p.m. UTC | #3
On Wed, Jul 30, 2014 at 9:43 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Tue, Jul 29, 2014 at 09:32:32PM +0200, Oleg Nesterov wrote:
>> On 07/28, Andy Lutomirski wrote:
>> >
>> > @@ -1449,7 +1449,12 @@ long syscall_trace_enter(struct pt_regs *regs)
>> >  {
>> >     long ret = 0;
>> >
>> > -   user_exit();
>> > +   /*
>> > +    * If TIF_NOHZ is set, we are required to call user_exit() before
>> > +    * doing anything that could touch RCU.
>> > +    */
>> > +   if (test_thread_flag(TIF_NOHZ))
>> > +           user_exit();
>>
>> Personally I still think this change just adds more confusion, but I leave
>> this to you and Frederic.
>>
>> It is not that "If TIF_NOHZ is set, we are required to call user_exit()", we
>> need to call user_exit() just because we enter the kernel. TIF_NOHZ is just
>> the implementation detail which triggers this slow path.
>>
>> At least it should be correct, unless I am confused even more than I think.
>
> Agreed, Perhaps the confusion is on the syscall_trace_enter() name which suggests
> this is only about tracing? syscall_slowpath_enter() could be an alternative.
> But that's still tracing in a general sense so...

At the end of the day, the syscall slowpath code calls a bunch of
functions depending on what TIF_XYZ flags are set.  As long as it's
structured like "if (TIF_A) do_a(); if (TIF_B) do_b();" or something
like that, it's comprehensible.  But once random functions with no
explicit flag checks or comments start showing up, it gets confusing.

If it's indeed all-or-nothing, I could remove the check and add a
comment.  But please keep in mind that, currently, the slow path is
*slow*, and my patches only improve the entry case.  So enabling
context tracking on every task will hurt.

--Andy
Frederic Weisbecker July 31, 2014, 3:16 p.m. UTC | #4
On Wed, Jul 30, 2014 at 10:23:34AM -0700, Andy Lutomirski wrote:
> On Wed, Jul 30, 2014 at 9:43 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > On Tue, Jul 29, 2014 at 09:32:32PM +0200, Oleg Nesterov wrote:
> >> On 07/28, Andy Lutomirski wrote:
> >> >
> >> > @@ -1449,7 +1449,12 @@ long syscall_trace_enter(struct pt_regs *regs)
> >> >  {
> >> >     long ret = 0;
> >> >
> >> > -   user_exit();
> >> > +   /*
> >> > +    * If TIF_NOHZ is set, we are required to call user_exit() before
> >> > +    * doing anything that could touch RCU.
> >> > +    */
> >> > +   if (test_thread_flag(TIF_NOHZ))
> >> > +           user_exit();
> >>
> >> Personally I still think this change just adds more confusion, but I leave
> >> this to you and Frederic.
> >>
> >> It is not that "If TIF_NOHZ is set, we are required to call user_exit()", we
> >> need to call user_exit() just because we enter the kernel. TIF_NOHZ is just
> >> the implementation detail which triggers this slow path.
> >>
> >> At least it should be correct, unless I am confused even more than I think.
> >
> > Agreed, Perhaps the confusion is on the syscall_trace_enter() name which suggests
> > this is only about tracing? syscall_slowpath_enter() could be an alternative.
> > But that's still tracing in a general sense so...
> 
> At the end of the day, the syscall slowpath code calls a bunch of
> functions depending on what TIF_XYZ flags are set.  As long as it's
> structured like "if (TIF_A) do_a(); if (TIF_B) do_b();" or something
> like that, it's comprehensible.  But once random functions with no
> explicit flag checks or comments start showing up, it gets confusing.

Yeah that's a point. I don't mind much the TIF_NOHZ test if you like.

> 
> If it's indeed all-or-nothing, I could remove the check and add a
> comment.  But please keep in mind that, currently, the slow path is
> *slow*, and my patches only improve the entry case.  So enabling
> context tracking on every task will hurt.

That's what we do anyway. I haven't found a safe way to enabled context tracking
without tracking all CPUs.

> 
> --Andy
Oleg Nesterov July 31, 2014, 4:42 p.m. UTC | #5
On 07/31, Frederic Weisbecker wrote:
>
> On Wed, Jul 30, 2014 at 10:23:34AM -0700, Andy Lutomirski wrote:
> >
> > At the end of the day, the syscall slowpath code calls a bunch of
> > functions depending on what TIF_XYZ flags are set.  As long as it's
> > structured like "if (TIF_A) do_a(); if (TIF_B) do_b();" or something
> > like that, it's comprehensible.  But once random functions with no
> > explicit flag checks or comments start showing up, it gets confusing.
>
> Yeah that's a point. I don't mind much the TIF_NOHZ test if you like.

And in my opinion

	if (work & TIF_XYZ)
		user_exit();

looks even more confusing. Because, once again, TIF_XYZ is not the
reason to call user_exit().

Not to mention this adds a minor performance penalty.

> > If it's indeed all-or-nothing, I could remove the check and add a
> > comment.  But please keep in mind that, currently, the slow path is
> > *slow*, and my patches only improve the entry case.  So enabling
> > context tracking on every task will hurt.
>
> That's what we do anyway. I haven't found a safe way to enabled context tracking
> without tracking all CPUs.

And if we change this, then the code above becomes racy. The state of
TIF_XYZ can be changed right after the check. OK, it is racy anyway ;)
but still this adds more confusion.

I feel that TIF_XYZ must die. But yes, yes, I know that it is very simple
to say this. And no, so far I do not know how we can improve this all.


But again, again, I won't insist. Just another "can't resist" email,
please ignore.

Oleg.
Frederic Weisbecker July 31, 2014, 4:49 p.m. UTC | #6
On Thu, Jul 31, 2014 at 06:42:46PM +0200, Oleg Nesterov wrote:
> On 07/31, Frederic Weisbecker wrote:
> >
> > On Wed, Jul 30, 2014 at 10:23:34AM -0700, Andy Lutomirski wrote:
> > >
> > > At the end of the day, the syscall slowpath code calls a bunch of
> > > functions depending on what TIF_XYZ flags are set.  As long as it's
> > > structured like "if (TIF_A) do_a(); if (TIF_B) do_b();" or something
> > > like that, it's comprehensible.  But once random functions with no
> > > explicit flag checks or comments start showing up, it gets confusing.
> >
> > Yeah that's a point. I don't mind much the TIF_NOHZ test if you like.
> 
> And in my opinion
> 
> 	if (work & TIF_XYZ)
> 		user_exit();
> 
> looks even more confusing. Because, once again, TIF_XYZ is not the
> reason to call user_exit().
> 
> Not to mention this adds a minor performance penalty.

That's a point too! You guys both convinced me! ;-)

> 
> > > If it's indeed all-or-nothing, I could remove the check and add a
> > > comment.  But please keep in mind that, currently, the slow path is
> > > *slow*, and my patches only improve the entry case.  So enabling
> > > context tracking on every task will hurt.
> >
> > That's what we do anyway. I haven't found a safe way to enabled context tracking
> > without tracking all CPUs.
> 
> And if we change this, then the code above becomes racy. The state of
> TIF_XYZ can be changed right after the check. OK, it is racy anyway ;)
> but still this adds more confusion.

No because all running tasks have this flag set when context tracking is
enabled. And context tracking can't be disabled on runtime.
Oleg Nesterov July 31, 2014, 4:54 p.m. UTC | #7
On 07/31, Frederic Weisbecker wrote:
>
> On Thu, Jul 31, 2014 at 06:42:46PM +0200, Oleg Nesterov wrote:
> > On 07/31, Frederic Weisbecker wrote:
> > >
> > > On Wed, Jul 30, 2014 at 10:23:34AM -0700, Andy Lutomirski wrote:
> > > >
> > > > At the end of the day, the syscall slowpath code calls a bunch of
> > > > functions depending on what TIF_XYZ flags are set.  As long as it's
> > > > structured like "if (TIF_A) do_a(); if (TIF_B) do_b();" or something
> > > > like that, it's comprehensible.  But once random functions with no
> > > > explicit flag checks or comments start showing up, it gets confusing.
> > >
> > > Yeah that's a point. I don't mind much the TIF_NOHZ test if you like.
> >
> > And in my opinion
> >
> > 	if (work & TIF_XYZ)
> > 		user_exit();
> >
> > looks even more confusing. Because, once again, TIF_XYZ is not the
> > reason to call user_exit().
> >
> > Not to mention this adds a minor performance penalty.
>
> That's a point too! You guys both convinced me! ;-)

Very nice, now I know that you can agree with 2 opposite opinions at
the same time ;)

> > > > If it's indeed all-or-nothing, I could remove the check and add a
> > > > comment.  But please keep in mind that, currently, the slow path is
> > > > *slow*, and my patches only improve the entry case.  So enabling
> > > > context tracking on every task will hurt.
> > >
> > > That's what we do anyway. I haven't found a safe way to enabled context tracking
> > > without tracking all CPUs.
> >
> > And if we change this, then the code above becomes racy. The state of
> > TIF_XYZ can be changed right after the check. OK, it is racy anyway ;)
> > but still this adds more confusion.
>
> No because all running tasks have this flag set when context tracking is
> enabled. And context tracking can't be disabled on runtime.

Yes, yes, please note that I said "if we change this".

Oleg.
Oleg Nesterov July 31, 2014, 4:58 p.m. UTC | #8
On 07/31, Oleg Nesterov wrote:
>
> On 07/31, Frederic Weisbecker wrote:
> >
> > On Thu, Jul 31, 2014 at 06:42:46PM +0200, Oleg Nesterov wrote:
> > >
> > > And if we change this, then the code above becomes racy. The state of
> > > TIF_XYZ can be changed right after the check. OK, it is racy anyway ;)
> > > but still this adds more confusion.
> >
> > No because all running tasks have this flag set when context tracking is
> > enabled. And context tracking can't be disabled on runtime.
>
> Yes, yes, please note that I said "if we change this".

And "it is racy anyway" connects to the problems we discuss in another
thread.

Oleg.
Frederic Weisbecker July 31, 2014, 5:17 p.m. UTC | #9
On Thu, Jul 31, 2014 at 06:54:23PM +0200, Oleg Nesterov wrote:
> On 07/31, Frederic Weisbecker wrote:
> >
> > On Thu, Jul 31, 2014 at 06:42:46PM +0200, Oleg Nesterov wrote:
> > > On 07/31, Frederic Weisbecker wrote:
> > > >
> > > > On Wed, Jul 30, 2014 at 10:23:34AM -0700, Andy Lutomirski wrote:
> > > > >
> > > > > At the end of the day, the syscall slowpath code calls a bunch of
> > > > > functions depending on what TIF_XYZ flags are set.  As long as it's
> > > > > structured like "if (TIF_A) do_a(); if (TIF_B) do_b();" or something
> > > > > like that, it's comprehensible.  But once random functions with no
> > > > > explicit flag checks or comments start showing up, it gets confusing.
> > > >
> > > > Yeah that's a point. I don't mind much the TIF_NOHZ test if you like.
> > >
> > > And in my opinion
> > >
> > > 	if (work & TIF_XYZ)
> > > 		user_exit();
> > >
> > > looks even more confusing. Because, once again, TIF_XYZ is not the
> > > reason to call user_exit().
> > >
> > > Not to mention this adds a minor performance penalty.
> >
> > That's a point too! You guys both convinced me! ;-)
> 
> Very nice, now I know that you can agree with 2 opposite opinions at
> the same time ;)

That's what an Acked-by from Schroedinger would look like!

> 
> > > > > If it's indeed all-or-nothing, I could remove the check and add a
> > > > > comment.  But please keep in mind that, currently, the slow path is
> > > > > *slow*, and my patches only improve the entry case.  So enabling
> > > > > context tracking on every task will hurt.
> > > >
> > > > That's what we do anyway. I haven't found a safe way to enabled context tracking
> > > > without tracking all CPUs.
> > >
> > > And if we change this, then the code above becomes racy. The state of
> > > TIF_XYZ can be changed right after the check. OK, it is racy anyway ;)
> > > but still this adds more confusion.
> >
> > No because all running tasks have this flag set when context tracking is
> > enabled. And context tracking can't be disabled on runtime.
> 
> Yes, yes, please note that I said "if we change this".

Yeah but the NO_HZ test wouldn't change much the situation I think.
diff mbox

Patch

diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 39296d2..bbf338a 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -1449,7 +1449,12 @@  long syscall_trace_enter(struct pt_regs *regs)
 {
 	long ret = 0;
 
-	user_exit();
+	/*
+	 * If TIF_NOHZ is set, we are required to call user_exit() before
+	 * doing anything that could touch RCU.
+	 */
+	if (test_thread_flag(TIF_NOHZ))
+		user_exit();
 
 	/*
 	 * If we stepped into a sysenter/syscall insn, it trapped in