
arm64/sve: Make kernel FPU protection RT friendly

Message ID 20210729105215.2222338-3-bigeasy@linutronix.de (mailing list archive)
State New, archived
Series: arm64/sve: Make kernel FPU protection RT friendly

Commit Message

Sebastian Andrzej Siewior July 29, 2021, 10:52 a.m. UTC
Non-RT kernels need to protect the FPU against preemption and bottom half
processing. This is achieved by disabling bottom halves via
local_bh_disable(), which implicitly disables preemption.

On RT kernels this protection mechanism is not sufficient because
local_bh_disable() does not disable preemption. It serializes bottom half
related processing via a CPU-local lock instead.

As bottom halves always run in thread context on RT kernels, disabling
preemption is the proper choice as it implicitly prevents bottom half
processing.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/arm64/kernel/fpsimd.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)
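
For context, a minimal sketch of how kernel-mode SIMD users sit on top of
this protection. do_neon_work() is a hypothetical caller;
kernel_neon_begin()/kernel_neon_end() and may_use_simd() are the existing
arm64 interfaces:

#include <asm/neon.h>
#include <asm/simd.h>

/* Hypothetical caller: kernel_neon_begin() takes the per-CPU fpsimd
 * context (via get_cpu_fpsimd_context()), so between begin and end
 * neither a task switch nor softirq processing can clobber the vector
 * registers on this CPU. */
static void do_neon_work(void)
{
	if (!may_use_simd())	/* kernel-mode SIMD is task context only */
		return;

	kernel_neon_begin();
	/* ... use FP/SIMD registers ... */
	kernel_neon_end();
}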

Comments

Dave Martin July 29, 2021, 1:54 p.m. UTC | #1
On Thu, Jul 29, 2021 at 12:52:15PM +0200, Sebastian Andrzej Siewior wrote:
> Non-RT kernels need to protect the FPU against preemption and bottom half
> processing. This is achieved by disabling bottom halves via
> local_bh_disable(), which implicitly disables preemption.
> 
> On RT kernels this protection mechanism is not sufficient because
> local_bh_disable() does not disable preemption. It serializes bottom half
> related processing via a CPU-local lock instead.
> 
> As bottom halves always run in thread context on RT kernels, disabling
> preemption is the proper choice as it implicitly prevents bottom half
> processing.
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  arch/arm64/kernel/fpsimd.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index e098f6c67b1de..a208514bd69a9 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -177,10 +177,19 @@ static void __get_cpu_fpsimd_context(void)
>   *
>   * The double-underscore version must only be called if you know the task
>   * can't be preempted.
> + *
> + * On RT kernels local_bh_disable() is not sufficient because it only
> + * serializes soft interrupt related sections via a local lock, but stays
> + * preemptible. Disabling preemption is the right choice here as bottom
> + * half processing is always in thread context on RT kernels so it
> + * implicitly prevents bottom half processing as well.
>   */
>  static void get_cpu_fpsimd_context(void)
>  {
> -	local_bh_disable();
> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> +		local_bh_disable();
> +	else
> +		preempt_disable();

Is this wrongly abstracted for RT?

The requirement here is that the code should temporarily be
nonpreemptible by anything except hardirq context.

Having to do this conditional everywhere that is required feels fragile.
Is a similar thing needed anywhere else?

If bh (as a preempting context) doesn't exist on RT, then can't
local_bh_disable() just suppress all preemption up to but excluding
hardirq?  Would anything break?

[...]

Cheers
---Dave
Sebastian Andrzej Siewior July 29, 2021, 2:17 p.m. UTC | #2
On 2021-07-29 14:54:59 [+0100], Dave Martin wrote:
> > index e098f6c67b1de..a208514bd69a9 100644
> > --- a/arch/arm64/kernel/fpsimd.c
> > +++ b/arch/arm64/kernel/fpsimd.c
> > @@ -177,10 +177,19 @@ static void __get_cpu_fpsimd_context(void)
> >   *
> >   * The double-underscore version must only be called if you know the task
> >   * can't be preempted.
> > + *
> > + * On RT kernels local_bh_disable() is not sufficient because it only
> > + * serializes soft interrupt related sections via a local lock, but stays
> > + * preemptible. Disabling preemption is the right choice here as bottom
> > + * half processing is always in thread context on RT kernels so it
> > + * implicitly prevents bottom half processing as well.
> >   */
> >  static void get_cpu_fpsimd_context(void)
> >  {
> > -	local_bh_disable();
> > +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> > +		local_bh_disable();
> > +	else
> > +		preempt_disable();
> 
> Is this wrongly abstracted for RT?

No, we want to keep BH preemptible. Say your NAPI callback is busy for
the next 200us and your RT task needs the CPU now.

> The requirement here is that the code should temporarily be
> nonpreemptible by anything except hardirq context.

That is what I assumed.

> Having to do this conditional everywhere that is required feels fragile.
> Is a similar thing needed anywhere else?

Pssst. I whisper now so that the others don't hear us. If you look at
arch/x86/include/asm/fpu/api.h and search for fpregs_lock() then you
will find the same pattern. Even some of the comments look similar. And
please don't look up the original commit :)
x86 restores the FPU registers on return to userland (not immediately on
context switch) and requires the same kind of synchronisation/protection
regarding other tasks and crypto in softirq. So it should be much the
same thing that arm64 does here.
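
From memory the x86 helpers look roughly like the sketch below - the real
code lives in arch/x86/include/asm/fpu/api.h and may differ in detail:

#include <linux/bottom_half.h>
#include <linux/preempt.h>

/* Rough sketch of the x86 pattern referenced above; see
 * arch/x86/include/asm/fpu/api.h for the real thing. */
static inline void fpregs_lock(void)
{
	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
		local_bh_disable();
	else
		preempt_disable();
}

static inline void fpregs_unlock(void)
{
	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
		local_bh_enable();
	else
		preempt_enable();
}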

> If bh (as a preempting context) doesn't exist on RT, then can't
> local_bh_disable() just suppress all preemption up to but excluding
> hardirq?  Would anything break?

Yes. A lot. Starting with spin_lock_bh() itself because it does:
	local_bh_disable();
	spin_lock()

and with preemption disabled you can't do spin_lock() - but you have to
be able to sleep there because the owner may be preempted. The next
thing is that kmalloc() and friends won't work in a local_bh_disable()
section for the same reason.
The list goes on.
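
To spell that out, a rough sketch (illustrative only, not the actual RT
implementation) of what spin_lock_bh() amounts to on PREEMPT_RT, where
spinlock_t is a sleeping rtmutex-based lock:

#include <linux/spinlock.h>

/* Illustrative sketch only: spin_lock() maps to a sleeping lock on RT,
 * so the preceding local_bh_disable() must not disable preemption -
 * the caller may have to sleep until a preempted lock owner runs. */
static inline void spin_lock_bh_sketch(spinlock_t *lock)
{
	local_bh_disable();	/* serializes BH, stays preemptible on RT */
	spin_lock(lock);	/* may sleep on RT */
}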

> [...]
> 
> Cheers
> ---Dave

Sebastian
Mark Brown July 29, 2021, 2:22 p.m. UTC | #3
On Thu, Jul 29, 2021 at 12:52:15PM +0200, Sebastian Andrzej Siewior wrote:
> Non-RT kernels need to protect the FPU against preemption and bottom half
> processing. This is achieved by disabling bottom halves via
> local_bh_disable(), which implicitly disables preemption.
> 
> On RT kernels this protection mechanism is not sufficient because
> local_bh_disable() does not disable preemption. It serializes bottom half
> related processing via a CPU-local lock instead.
> 
> As bottom halves always run in thread context on RT kernels, disabling
> preemption is the proper choice as it implicitly prevents bottom half
> processing.

I think someone sent a very similar looking patch in the past week or
two?
Sebastian Andrzej Siewior July 29, 2021, 2:41 p.m. UTC | #4
On 2021-07-29 15:22:10 [+0100], Mark Brown wrote:
> I think someone sent a very similar looking patch in the past week or
> two?

Not that I'm aware of.

Sebastian
Dave Martin July 29, 2021, 3:34 p.m. UTC | #5
On Thu, Jul 29, 2021 at 04:17:48PM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-07-29 14:54:59 [+0100], Dave Martin wrote:
> > > index e098f6c67b1de..a208514bd69a9 100644
> > > --- a/arch/arm64/kernel/fpsimd.c
> > > +++ b/arch/arm64/kernel/fpsimd.c
> > > @@ -177,10 +177,19 @@ static void __get_cpu_fpsimd_context(void)
> > >   *
> > >   * The double-underscore version must only be called if you know the task
> > >   * can't be preempted.
> > > + *
> > > + * On RT kernels local_bh_disable() is not sufficient because it only
> > > + * serializes soft interrupt related sections via a local lock, but stays
> > > + * preemptible. Disabling preemption is the right choice here as bottom
> > > + * half processing is always in thread context on RT kernels so it
> > > + * implicitly prevents bottom half processing as well.
> > >   */
> > >  static void get_cpu_fpsimd_context(void)
> > >  {
> > > -	local_bh_disable();
> > > +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> > > +		local_bh_disable();
> > > +	else
> > > +		preempt_disable();
> > 
> > Is this wrongly abstracted for RT?
> 
> No, we want to keep BH preemptible. Say your NAPI callback is busy for
> the next 200us and your RT task needs the CPU now.
> 
> > The requirement here is that the code should temporarily be
> > nonpreemptible by anything except hardirq context.
> 
> That is what I assumed.
> 
> > Having to do this conditional everywhere that is required feels fragile.
> > Is a similar thing needed anywhere else?
> 
> Pssst. I whisper now so that the others don't hear us. If you look at
> arch/x86/include/asm/fpu/api.h and search for fpregs_lock() then you
> will find the same pattern. Even some of the comments look similar. And
> please don't look up the original commit :)
> x86 restores the FPU registers on return to userland (not immediately on
> context switch) and requires the same kind of synchronisation/protection
> regarding other tasks and crypto in softirq. So it should be much the
> same thing that arm64 does here.

That rather suggests to me that it is worth factoring this out and giving it
a name, precisely because irrespective of CONFIG_PREEMPT_RT, we need to
make sure that no task switch occurs _and_ no bh runs on the same cpu.  The
problem seems to be that the local_bh_disable() API doesn't express the
difference between wanting to prevent local bh processing and wanting to
prevent local bh _and_ task switch.

So, could this be wrapped up and called something like:

preempt_and_local_bh_disable()
...
local_bh_and_preempt_enable()?
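
A hypothetical wrapper along those lines (names as suggested above, body
mirroring the hunk under discussion; purely illustrative, not in any tree)
might look like:

#include <linux/bottom_half.h>
#include <linux/preempt.h>

/* Hypothetical helpers, not in any tree: one name for "no task switch
 * and no local bh processing", with the RT distinction hidden inside. */
static inline void preempt_and_local_bh_disable(void)
{
	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
		local_bh_disable();	/* implies preempt_disable() */
	else
		preempt_disable();	/* BH runs in task context on RT */
}

static inline void local_bh_and_preempt_enable(void)
{
	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
		local_bh_enable();
	else
		preempt_enable();
}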

I do wonder whether there are other places making the same assumption
about the local_irq > local_bh > preempt hierarchy that have been
missed...

> > If bh (as a preempting context) doesn't exist on RT, then can't
> > local_bh_disable() just suppress all preemption up to but excluding
> > hardirq?  Would anything break?
> 
> Yes. A lot. Starting with spin_lock_bh() itself because it does:
> 	local_bh_disable();
> 	spin_lock()
> 
> and with preemption disabled you can't do spin_lock() - but you have to
> be able to sleep there because the owner may be preempted. The next
> thing is that kmalloc() and friends won't work in a local_bh_disable()
> section for the same reason.

Couldn't this be solved with a trylock loop that re-enables bh (and
preemption) on the sleeping path?  But that may still be trying to
achieve something that doesn't make sense given the goals of
PREEMPT_RT(?)

> The list goes on.
> 
> Sebastian

Cheers
---Dave
Sebastian Andrzej Siewior July 29, 2021, 4 p.m. UTC | #6
On 2021-07-29 16:34:22 [+0100], Dave Martin wrote:
> 
> That rather suggests to me that it is worth factoring this out and giving it
> a name, precisely because irrespective of CONFIG_PREEMPT_RT, we need to
> make sure that no task switch occurs _and_ no bh runs on the same cpu.  The
> problem seems to be that the local_bh_disable() API doesn't express the
> difference between wanting to prevent local bh processing and wanting to
> prevent local bh _and_ task switch.
> 
> So, could this be wrapped up and called something like:
> 
> preempt_and_local_bh_disable()
> ...
> local_bh_and_preempt_enable()?

We don't disable both on RT. It is preemption on RT and BH + implicit
preemption on non-RT.
The difference is that on RT a softirq may have been preempted and in
this case get_cpu_fpsimd_context() won't force its completion. However
since get_cpu_fpsimd_context() disables preemption on RT it won't be
preempted in the section where the SIMD registers are modified. And the
softirq does not run on HARDIRQ-exit but in task context so it is okay.

But I get what you mean. I'm just not sure regarding the naming since
the code behaves the same on x86 and arm64 here.

> I do wonder whether there are other places making the same assumption
> about the local_irq > local_bh > preempt hierarchy that have been
> missed...

Based on memory we had a few cases of those while cleaning up
in_atomic(), in_softirq() and friends.

> > > If bh (as a preempting context) doesn't exist on RT, then can't
> > > local_bh_disable() just suppress all preemption up to but excluding
> > > hardirq?  Would anything break?
> > 
> > Yes. A lot. Starting with spin_lock_bh() itself because it does:
> > 	local_bh_disable();
> > 	spin_lock()
> > 
> > and with preemption disabled you can't do spin_lock() - but you have to
> > be able to sleep there because the owner may be preempted. The next
> > thing is that kmalloc() and friends won't work in a local_bh_disable()
> > section for the same reason.
> 
> Couldn't this be solved with a trylock loop that re-enables bh (and
> preemption) on the sleeping path?  But that may still be trying to
> achieve something that doesn't make sense given the goals of
> PREEMPT_RT(?)

What about
 spin_lock_bh(a);
 spin_lock_bh(b);

? And then you still can't kmalloc() in a spin_lock_bh() section if you
disable preemption as part of local_bh_disable().
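
Spelled out as a sketch (hypothetical locks a and b; none of this would
work if local_bh_disable() disabled preemption on RT):

#include <linux/slab.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(a);	/* hypothetical locks */
static DEFINE_SPINLOCK(b);

static void nested_bh_sections(void)
{
	void *ptr;

	spin_lock_bh(&a);
	spin_lock_bh(&b);	/* sleeping lock on RT: may block until a
				 * preempted owner lets go */
	ptr = kmalloc(16, GFP_ATOMIC);	/* the allocator itself may take
					 * sleeping per-CPU locks on RT */
	kfree(ptr);
	spin_unlock_bh(&b);
	spin_unlock_bh(&a);
}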

> Cheers
> ---Dave

Sebastian
Sebastian Andrzej Siewior July 29, 2021, 4:07 p.m. UTC | #7
On 2021-07-29 18:00:31 [+0200], To Dave Martin wrote:
> 
> But I get what you mean. I'm just not sure regarding the naming since
> the code behaves the same on x86 and arm64 here.

Assuming that everyone follows this pattern what about
  fpregs_lock()
  fpregs_unlock()

?

Sebastian
Mark Brown July 29, 2021, 4:23 p.m. UTC | #8
On Thu, Jul 29, 2021 at 04:41:11PM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-07-29 15:22:10 [+0100], Mark Brown wrote:
> > I think someone sent a very similar looking patch in the past week or
> > two?
> 
> Not that I'm aware of.

Hrm, right - I can't find it (even internally), but I guess it must've
been some internal thing.  Anyway

Reviewed-by: Mark Brown <broonie@kernel.org>

though it does feel like we should have a helper for this.
Dave Martin July 29, 2021, 4:32 p.m. UTC | #9
On Thu, Jul 29, 2021 at 06:07:56PM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-07-29 18:00:31 [+0200], To Dave Martin wrote:
> > 
> > But I get what you mean. I'm just not sure regarding the naming since
> > the code behaves the same on x86 and arm64 here.
> 
> Assuming that everyone follows this pattern what about
>   fpregs_lock()
>   fpregs_unlock()

I'm not sure about the "fpregs".  This is really about CPU-local
resources that are contended between softirq and task context.

Some arches might not use fp in softirq context and then their fp
regs wouldn't need this; others might have other resources that aren't
"fp" regs, but are contended in the same way.


My "local_bh" was meaning purely softirqs running on this CPU.  This was
the original meaning of "local" in this API IIUC.  This is one reason
why they must disable preemption: "local" is meaningless if preemption
is enabled, since otherwise we might randomly migrate between CPUs.

I guess the "local" was preserved in the naming on PREEMPT_RT to reduce
the amount of noise that would have resulted from a treewide rename, but
this word seems confusing if there is no CPU-localness involved.


In this particular case, we really do want to bind ourselves onto the
current CPU and disable softirqs on this CPU -- if they continue to run
elsewhere, that's just fine.

What do you think about:

get_bh_cpu_context()
put_bh_cpu_context()

or something along those lines?

Cheers
---Dave
Sebastian Andrzej Siewior July 29, 2021, 5:11 p.m. UTC | #10
On 2021-07-29 17:32:27 [+0100], Dave Martin wrote:
> On Thu, Jul 29, 2021 at 06:07:56PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2021-07-29 18:00:31 [+0200], To Dave Martin wrote:
> > > 
> > > But I get what you mean. I'm just not sure regarding the naming since
> > > the code behaves the same on x86 and arm64 here.
> > 
> > Assuming that everyone follows this pattern what about
> >   fpregs_lock()
> >   fpregs_unlock()
> 
> I'm not sure about the "fpregs".  This is really about CPU-local
> resources that are contended between softirq and task context.
> 
> Some arches might not use fp in softirq context and then their fp
> regs wouldn't need this; others might have other resources that aren't
> "fp" regs, but are contended in the same way.

If FP/SIMD is used in crypto then it is likely that they will aim for
softirq for IPsec reasons. WireGuard and dm-crypt perform crypto in
process context if I remember correctly.

> My "local_bh" was meaning purely softirqs running on this CPU.  This was
> the original meaning of "local" in this API IIUC.  This is one reason
> why they must disable preemption: "local" is meaningless if preemption
> is enabled, since otherwise we might randomly migrate between CPUs.

You assume that BH also disables preemption, which is an implementation
detail. On RT local_bh_disable() disables BH on the current CPU. The
task won't migrate to another CPU while preempted and another task, which
invokes local_bh_disable() on the same CPU, will wait until the previous
local_bh_disable() section completes.
So all semantics which are expected by local_bh_disable() are preserved
in RT - except for the part where preemption is disabled.
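
Conceptually the serialization looks like the sketch below - a rough
reconstruction from memory of kernel/softirq.c under CONFIG_PREEMPT_RT
(the real code tracks nesting and differs in detail):

#include <linux/local_lock.h>
#include <linux/percpu.h>

struct softirq_ctrl {
	local_lock_t	lock;
	int		cnt;
};

static DEFINE_PER_CPU(struct softirq_ctrl, softirq_ctrl) = {
	.lock = INIT_LOCAL_LOCK(softirq_ctrl.lock),
};

/* Outermost local_bh_disable(), sketched: block until any softirq or
 * other BH-disabled section on this CPU completes, while the caller
 * itself stays preemptible. */
static void local_bh_disable_sketch(void)
{
	local_lock(&softirq_ctrl.lock);
	this_cpu_inc(softirq_ctrl.cnt);
}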

> I guess the "local" was preserved in the naming on PREEMPT_RT to reduce
> the amount of noise that would have resulted from a treewide rename, but
> this word seems confusing if there is no CPU-localness involved.

As I explained above, the BH on the local/current CPU is disabled as in
no softirq is currently running on this CPU. The context however remains
preemptible so there is no need for a rename.

> In this particular case, we really do want to bind ourselves onto the
> current CPU and disable softirqs on this CPU -- if they continue to run
> elsewhere, that's just fine.
> 
> What do you think about:
> 
> get_bh_cpu_context()
> put_bh_cpu_context()
> 
> or something along those lines?

So you are not buying the fpregs_lock()? Then I don't need to reach for
the simd_lock() or such from the shelf?
The problem I have with _bh_ is that on RT the BH/softirq context is not
forced to complete if preempted, as it would be with local_bh_disable().
So that is misleading.
It is just that it happens not to matter in this case and it is
sufficient on RT to disable preemption. It would be wrong to disable BH
and preemption on RT (but it might be okay / needed in other cases).

powerpc for instance has
	arch/powerpc/crypto/crc32c-vpmsum_glue.c
doing crc32c with ALTIVEC (simd). They only disable preemption and
their crypto_simd_usable() (the generic version of it) forbids an
invocation in softirq context.
So no matter how I look at it, it really comes down to the specific SIMD
usage on an architecture.
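
For reference, the generic check boils down to the following - a sketch
from memory of include/asm-generic/simd.h, which crypto_simd_usable()
wraps:

#include <linux/hardirq.h>

/* Sketch from memory: SIMD is allowed in task context only, so any
 * softirq (or hardirq) use is forbidden by this check. */
static __must_check inline bool may_use_simd(void)
{
	return !in_interrupt();
}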

> Cheers
> ---Dave

Sebastian

Patch

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index e098f6c67b1de..a208514bd69a9 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -177,10 +177,19 @@ static void __get_cpu_fpsimd_context(void)
  *
  * The double-underscore version must only be called if you know the task
  * can't be preempted.
+ *
+ * On RT kernels local_bh_disable() is not sufficient because it only
+ * serializes soft interrupt related sections via a local lock, but stays
+ * preemptible. Disabling preemption is the right choice here as bottom
+ * half processing is always in thread context on RT kernels so it
+ * implicitly prevents bottom half processing as well.
  */
 static void get_cpu_fpsimd_context(void)
 {
-	local_bh_disable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_bh_disable();
+	else
+		preempt_disable();
 	__get_cpu_fpsimd_context();
 }
 
@@ -201,7 +210,10 @@ static void __put_cpu_fpsimd_context(void)
 static void put_cpu_fpsimd_context(void)
 {
 	__put_cpu_fpsimd_context();
-	local_bh_enable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_bh_enable();
+	else
+		preempt_enable();
 }
 
 static bool have_cpu_fpsimd_context(void)