diff mbox series

[v5,7/9] task_isolation: don't interrupt CPUs with tick_nohz_full_kick_cpu()

Message ID 76ed0b222d2f16fb5aebd144ac0222a7f3b87fa1.camel@marvell.com (mailing list archive)
State New, archived
Headers show
Series "Task_isolation" mode | expand

Commit Message

Alex Belits Nov. 23, 2020, 5:58 p.m. UTC
From: Yuri Norov <ynorov@marvell.com>

For nohz_full CPUs the desirable behavior is to receive interrupts
generated by tick_nohz_full_kick_cpu(). But for hard isolation it's
obviously not desirable because it breaks isolation.

This patch adds check for it.

Signed-off-by: Yuri Norov <ynorov@marvell.com>
[abelits@marvell.com: updated, only exclude CPUs running isolated tasks]
Signed-off-by: Alex Belits <abelits@marvell.com>
---
 kernel/time/tick-sched.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Frederic Weisbecker Nov. 23, 2020, 10:13 p.m. UTC | #1
Hi Alex,

On Mon, Nov 23, 2020 at 05:58:22PM +0000, Alex Belits wrote:
> From: Yuri Norov <ynorov@marvell.com>
> 
> For nohz_full CPUs the desirable behavior is to receive interrupts
> generated by tick_nohz_full_kick_cpu(). But for hard isolation it's
> obviously not desirable because it breaks isolation.
> 
> This patch adds check for it.
> 
> Signed-off-by: Yuri Norov <ynorov@marvell.com>
> [abelits@marvell.com: updated, only exclude CPUs running isolated tasks]
> Signed-off-by: Alex Belits <abelits@marvell.com>
> ---
>  kernel/time/tick-sched.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index a213952541db..6c8679e200f0 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -20,6 +20,7 @@
>  #include <linux/sched/clock.h>
>  #include <linux/sched/stat.h>
>  #include <linux/sched/nohz.h>
> +#include <linux/isolation.h>
>  #include <linux/module.h>
>  #include <linux/irq_work.h>
>  #include <linux/posix-timers.h>
> @@ -268,7 +269,8 @@ static void tick_nohz_full_kick(void)
>   */
>  void tick_nohz_full_kick_cpu(int cpu)
>  {
> -	if (!tick_nohz_full_cpu(cpu))
> +	smp_rmb();
> +	if (!tick_nohz_full_cpu(cpu) || task_isolation_on_cpu(cpu))
>  		return;

Like I said in subsequent reviews, we are not going to ignore IPIs.
We must fix the sources of these IPIs instead.

Thanks.

>  
>  	irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
> -- 
> 2.20.1
>
Alex Belits Nov. 23, 2020, 10:35 p.m. UTC | #2
On Mon, 2020-11-23 at 23:13 +0100, Frederic Weisbecker wrote:
> External Email
> 
> -------------------------------------------------------------------
> ---
> Hi Alex,
> 
> On Mon, Nov 23, 2020 at 05:58:22PM +0000, Alex Belits wrote:
> > From: Yuri Norov <ynorov@marvell.com>
> > 
> > For nohz_full CPUs the desirable behavior is to receive interrupts
> > generated by tick_nohz_full_kick_cpu(). But for hard isolation it's
> > obviously not desirable because it breaks isolation.
> > 
> > This patch adds check for it.
> > 
> > Signed-off-by: Yuri Norov <ynorov@marvell.com>
> > [abelits@marvell.com: updated, only exclude CPUs running isolated
> > tasks]
> > Signed-off-by: Alex Belits <abelits@marvell.com>
> > ---
> >  kernel/time/tick-sched.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index a213952541db..6c8679e200f0 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/sched/clock.h>
> >  #include <linux/sched/stat.h>
> >  #include <linux/sched/nohz.h>
> > +#include <linux/isolation.h>
> >  #include <linux/module.h>
> >  #include <linux/irq_work.h>
> >  #include <linux/posix-timers.h>
> > @@ -268,7 +269,8 @@ static void tick_nohz_full_kick(void)
> >   */
> >  void tick_nohz_full_kick_cpu(int cpu)
> >  {
> > -	if (!tick_nohz_full_cpu(cpu))
> > +	smp_rmb();
> > +	if (!tick_nohz_full_cpu(cpu) || task_isolation_on_cpu(cpu))
> >  		return;
> 
> Like I said in subsequent reviews, we are not going to ignore IPIs.
> We must fix the sources of these IPIs instead.

This is what I am working on right now. This is made with an assumption
that CPU running isolated task has no reason to be kicked because
nothing else is supposed to be there. Usually this is true and when not
true is still safe when everything else is behaving right. For this
version I have kept the original implementation with minimal changes to
make it possible to use task isolation at all.

I agree that it's a much better idea is to determine if the CPU should
be kicked. If it really should, that will be a legitimate cause to
break isolation there, because CPU running isolated task has no
legitimate reason to have timers running. Right now I am trying to
determine the origin of timers that _still_ show up as running in the
current kernel version, so I think, this is a rather large chunk of
work that I have to do separately.
Thomas Gleixner Nov. 23, 2020, 10:36 p.m. UTC | #3
On Mon, Nov 23 2020 at 17:58, Alex Belits wrote:
> From: Yuri Norov <ynorov@marvell.com>
>
> For nohz_full CPUs the desirable behavior is to receive interrupts
> generated by tick_nohz_full_kick_cpu(). But for hard isolation it's
> obviously not desirable because it breaks isolation.
>
> This patch adds check for it.

git grep 'This patch' Documentation/process/

>   */
>  void tick_nohz_full_kick_cpu(int cpu)
>  {
> -	if (!tick_nohz_full_cpu(cpu))
> +	smp_rmb();

Undocumented smp_rmb() ...

> +	if (!tick_nohz_full_cpu(cpu) || task_isolation_on_cpu(cpu))
>  		return;

I still have to see a convincing argument why task isolation is special
and not just a straight forward extension of NOHZ full cpu isolation.

It's not special as much as you want it to be special.

Thanks,

        tglx
Mark Rutland Dec. 2, 2020, 2:20 p.m. UTC | #4
On Mon, Nov 23, 2020 at 05:58:22PM +0000, Alex Belits wrote:
> From: Yuri Norov <ynorov@marvell.com>
> 
> For nohz_full CPUs the desirable behavior is to receive interrupts
> generated by tick_nohz_full_kick_cpu(). But for hard isolation it's
> obviously not desirable because it breaks isolation.
> 
> This patch adds check for it.
> 
> Signed-off-by: Yuri Norov <ynorov@marvell.com>
> [abelits@marvell.com: updated, only exclude CPUs running isolated tasks]
> Signed-off-by: Alex Belits <abelits@marvell.com>
> ---
>  kernel/time/tick-sched.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index a213952541db..6c8679e200f0 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -20,6 +20,7 @@
>  #include <linux/sched/clock.h>
>  #include <linux/sched/stat.h>
>  #include <linux/sched/nohz.h>
> +#include <linux/isolation.h>
>  #include <linux/module.h>
>  #include <linux/irq_work.h>
>  #include <linux/posix-timers.h>
> @@ -268,7 +269,8 @@ static void tick_nohz_full_kick(void)
>   */
>  void tick_nohz_full_kick_cpu(int cpu)
>  {
> -	if (!tick_nohz_full_cpu(cpu))
> +	smp_rmb();

What does this barrier pair with? The commit message doesn't mention it,
and it's not clear in-context.

Thanks,
Mark.

> +	if (!tick_nohz_full_cpu(cpu) || task_isolation_on_cpu(cpu))
>  		return;
>  
>  	irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
> -- 
> 2.20.1
>
Alex Belits Dec. 4, 2020, 12:54 a.m. UTC | #5
On Wed, 2020-12-02 at 14:20 +0000, Mark Rutland wrote:
> External Email
> 
> -------------------------------------------------------------------
> ---
> On Mon, Nov 23, 2020 at 05:58:22PM +0000, Alex Belits wrote:
> > From: Yuri Norov <ynorov@marvell.com>
> > 
> > For nohz_full CPUs the desirable behavior is to receive interrupts
> > generated by tick_nohz_full_kick_cpu(). But for hard isolation it's
> > obviously not desirable because it breaks isolation.
> > 
> > This patch adds check for it.
> > 
> > Signed-off-by: Yuri Norov <ynorov@marvell.com>
> > [abelits@marvell.com: updated, only exclude CPUs running isolated
> > tasks]
> > Signed-off-by: Alex Belits <abelits@marvell.com>
> > ---
> >  kernel/time/tick-sched.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index a213952541db..6c8679e200f0 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/sched/clock.h>
> >  #include <linux/sched/stat.h>
> >  #include <linux/sched/nohz.h>
> > +#include <linux/isolation.h>
> >  #include <linux/module.h>
> >  #include <linux/irq_work.h>
> >  #include <linux/posix-timers.h>
> > @@ -268,7 +269,8 @@ static void tick_nohz_full_kick(void)
> >   */
> >  void tick_nohz_full_kick_cpu(int cpu)
> >  {
> > -	if (!tick_nohz_full_cpu(cpu))
> > +	smp_rmb();
> 
> What does this barrier pair with? The commit message doesn't mention
> it,
> and it's not clear in-context.

With barriers in task_isolation_kernel_enter()
and task_isolation_exit_to_user_mode().
Mark Rutland Dec. 7, 2020, 11:58 a.m. UTC | #6
On Fri, Dec 04, 2020 at 12:54:29AM +0000, Alex Belits wrote:
> 
> On Wed, 2020-12-02 at 14:20 +0000, Mark Rutland wrote:
> > External Email
> > 
> > -------------------------------------------------------------------
> > ---
> > On Mon, Nov 23, 2020 at 05:58:22PM +0000, Alex Belits wrote:
> > > From: Yuri Norov <ynorov@marvell.com>
> > > 
> > > For nohz_full CPUs the desirable behavior is to receive interrupts
> > > generated by tick_nohz_full_kick_cpu(). But for hard isolation it's
> > > obviously not desirable because it breaks isolation.
> > > 
> > > This patch adds check for it.
> > > 
> > > Signed-off-by: Yuri Norov <ynorov@marvell.com>
> > > [abelits@marvell.com: updated, only exclude CPUs running isolated
> > > tasks]
> > > Signed-off-by: Alex Belits <abelits@marvell.com>
> > > ---
> > >  kernel/time/tick-sched.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index a213952541db..6c8679e200f0 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -20,6 +20,7 @@
> > >  #include <linux/sched/clock.h>
> > >  #include <linux/sched/stat.h>
> > >  #include <linux/sched/nohz.h>
> > > +#include <linux/isolation.h>
> > >  #include <linux/module.h>
> > >  #include <linux/irq_work.h>
> > >  #include <linux/posix-timers.h>
> > > @@ -268,7 +269,8 @@ static void tick_nohz_full_kick(void)
> > >   */
> > >  void tick_nohz_full_kick_cpu(int cpu)
> > >  {
> > > -	if (!tick_nohz_full_cpu(cpu))
> > > +	smp_rmb();
> > 
> > What does this barrier pair with? The commit message doesn't mention
> > it,
> > and it's not clear in-context.
> 
> With barriers in task_isolation_kernel_enter()
> and task_isolation_exit_to_user_mode().

Please add a comment in the code as to what it pairs with.

Thanks,
Mark.
diff mbox series

Patch

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index a213952541db..6c8679e200f0 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -20,6 +20,7 @@ 
 #include <linux/sched/clock.h>
 #include <linux/sched/stat.h>
 #include <linux/sched/nohz.h>
+#include <linux/isolation.h>
 #include <linux/module.h>
 #include <linux/irq_work.h>
 #include <linux/posix-timers.h>
@@ -268,7 +269,8 @@  static void tick_nohz_full_kick(void)
  */
 void tick_nohz_full_kick_cpu(int cpu)
 {
-	if (!tick_nohz_full_cpu(cpu))
+	smp_rmb();
+	if (!tick_nohz_full_cpu(cpu) || task_isolation_on_cpu(cpu))
 		return;
 
 	irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);