| Message ID | 20210729105215.2222338-3-bigeasy@linutronix.de (mailing list archive) |
|---|---|
| State | New, archived |
| Series | arm64/sve: Make kernel FPU protection RT friendly |
On Thu, Jul 29, 2021 at 12:52:15PM +0200, Sebastian Andrzej Siewior wrote:
> Non RT kernels need to protect FPU against preemption and bottom half
> processing. This is achieved by disabling bottom halves via
> local_bh_disable() which implicitly disables preemption.
>
> On RT kernels this protection mechanism is not sufficient because
> local_bh_disable() does not disable preemption. It serializes bottom half
> related processing via a CPU local lock.
>
> As bottom halves always run in thread context on RT kernels, disabling
> preemption is the proper choice as it implicitly prevents bottom half
> processing.
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  arch/arm64/kernel/fpsimd.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index e098f6c67b1de..a208514bd69a9 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -177,10 +177,19 @@ static void __get_cpu_fpsimd_context(void)
>   *
>   * The double-underscore version must only be called if you know the task
>   * can't be preempted.
> + *
> + * On RT kernels local_bh_disable() is not sufficient because it only
> + * serializes soft interrupt related sections via a local lock, but stays
> + * preemptible. Disabling preemption is the right choice here as bottom
> + * half processing is always in thread context on RT kernels so it
> + * implicitly prevents bottom half processing as well.
>   */
>  static void get_cpu_fpsimd_context(void)
>  {
> -	local_bh_disable();
> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> +		local_bh_disable();
> +	else
> +		preempt_disable();

Is this wrongly abstracted for RT?

The requirement here is that the code should temporarily be
nonpreemptible by anything except hardirq context.

Having to do this conditional everywhere that is required feels fragile.
Is a similar thing needed anywhere else?

If bh (as a preempting context) doesn't exist on RT, then can't
local_bh_disable() just suppress all preemption up to but excluding
hardirq?  Would anything break?

[...]

Cheers
---Dave
On 2021-07-29 14:54:59 [+0100], Dave Martin wrote:
> > index e098f6c67b1de..a208514bd69a9 100644
> > --- a/arch/arm64/kernel/fpsimd.c
> > +++ b/arch/arm64/kernel/fpsimd.c
> > @@ -177,10 +177,19 @@ static void __get_cpu_fpsimd_context(void)
> >   *
> >   * The double-underscore version must only be called if you know the task
> >   * can't be preempted.
> > + *
> > + * On RT kernels local_bh_disable() is not sufficient because it only
> > + * serializes soft interrupt related sections via a local lock, but stays
> > + * preemptible. Disabling preemption is the right choice here as bottom
> > + * half processing is always in thread context on RT kernels so it
> > + * implicitly prevents bottom half processing as well.
> >   */
> >  static void get_cpu_fpsimd_context(void)
> >  {
> > -	local_bh_disable();
> > +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> > +		local_bh_disable();
> > +	else
> > +		preempt_disable();
>
> Is this wrongly abstracted for RT?

No, we want to keep BH preemptible. Say your NAPI callback is busy for
the next 200us and your RT task needs the CPU now.

> The requirement here is that the code should temporarily be
> nonpreemptible by anything except hardirq context.

That is what I assumed.

> Having to do this conditional everywhere that is required feels fragile.
> Is a similar thing needed anywhere else?

Pssst. I whisper now so that the others don't hear us. If you look at
arch/x86/include/asm/fpu/api.h and search for fpregs_lock() then you
find the same pattern. Even some of the comments look similar. And
please don't look up the original commit :)
x86 restores the FPU registers on return to userland (not immediately on
context switch) and requires the same kind of synchronisation/
protection regarding other tasks and crypto in softirq. So it should be
much the same thing that arm64 does here.

> If bh (as a preempting context) doesn't exist on RT, then can't
> local_bh_disable() just suppress all preemption up to but excluding
> hardirq? Would anything break?

Yes. A lot. Starting with spin_lock_bh() itself because it does:
	local_bh_disable();
	spin_lock()

and with disabled preemption you can't do spin_lock() and you have to
because the owner may be preempted. The next thing is that kmalloc() and
friends won't work in a local_bh_disable() section for the same reason.
The list goes on.

> [...]
>
> Cheers
> ---Dave

Sebastian
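[Editor's note: for reference, the x86 helpers Sebastian points to in
arch/x86/include/asm/fpu/api.h follow the same conditional he describes
above. A rough sketch of that pattern, paraphrased from the description
in this thread rather than copied verbatim from the header:]

	/* Sketch of the fpregs_lock()/fpregs_unlock() pattern described above. */
	static inline void fpregs_lock(void)
	{
		if (!IS_ENABLED(CONFIG_PREEMPT_RT))
			local_bh_disable();	/* non-RT: blocks softirqs and, implicitly, preemption */
		else
			preempt_disable();	/* RT: softirqs run in task context, so this is enough */
	}

	static inline void fpregs_unlock(void)
	{
		if (!IS_ENABLED(CONFIG_PREEMPT_RT))
			local_bh_enable();
		else
			preempt_enable();
	}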
On Thu, Jul 29, 2021 at 12:52:15PM +0200, Sebastian Andrzej Siewior wrote:
> Non RT kernels need to protect FPU against preemption and bottom half
> processing. This is achieved by disabling bottom halves via
> local_bh_disable() which implicitly disables preemption.
>
> On RT kernels this protection mechanism is not sufficient because
> local_bh_disable() does not disable preemption. It serializes bottom half
> related processing via a CPU local lock.
>
> As bottom halves always run in thread context on RT kernels, disabling
> preemption is the proper choice as it implicitly prevents bottom half
> processing.

I think someone sent a very similar looking patch in the past week or
two?
On 2021-07-29 15:22:10 [+0100], Mark Brown wrote:
> I think someone sent a very similar looking patch in the past week or
> two?

Not that I'm aware of.

Sebastian
On Thu, Jul 29, 2021 at 04:17:48PM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-07-29 14:54:59 [+0100], Dave Martin wrote:
> > > index e098f6c67b1de..a208514bd69a9 100644
> > > --- a/arch/arm64/kernel/fpsimd.c
> > > +++ b/arch/arm64/kernel/fpsimd.c
> > > @@ -177,10 +177,19 @@ static void __get_cpu_fpsimd_context(void)
> > >   *
> > >   * The double-underscore version must only be called if you know the task
> > >   * can't be preempted.
> > > + *
> > > + * On RT kernels local_bh_disable() is not sufficient because it only
> > > + * serializes soft interrupt related sections via a local lock, but stays
> > > + * preemptible. Disabling preemption is the right choice here as bottom
> > > + * half processing is always in thread context on RT kernels so it
> > > + * implicitly prevents bottom half processing as well.
> > >   */
> > >  static void get_cpu_fpsimd_context(void)
> > >  {
> > > -	local_bh_disable();
> > > +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> > > +		local_bh_disable();
> > > +	else
> > > +		preempt_disable();
> >
> > Is this wrongly abstracted for RT?
>
> No, we want to keep BH preemptible. Say your NAPI callback is busy for
> the next 200us and your RT task needs the CPU now.
>
> > The requirement here is that the code should temporarily be
> > nonpreemptible by anything except hardirq context.
>
> That is what I assumed.
>
> > Having to do this conditional everywhere that is required feels fragile.
> > Is a similar thing needed anywhere else?
>
> Pssst. I whisper now so that the others don't hear us. If you look at
> arch/x86/include/asm/fpu/api.h and search for fpregs_lock() then you
> find the same pattern. Even some of the comments look similar. And
> please don't look up the original commit :)
> x86 restores the FPU registers on return to userland (not immediately on
> context switch) and requires the same kind of synchronisation/
> protection regarding other tasks and crypto in softirq. So it should be
> much the same thing that arm64 does here.

That rather suggests to me that it is worth factoring this and giving it
a name, precisely because irrespective of CONFIG_PREEMPT_RT we need to
make sure that no task switch _and_ no bh runs on the same cpu. The
problem seems to be that the local_bh_disable() API doesn't express the
difference between wanting to prevent local bh processing and wanting to
prevent local bh _and_ task switch.

So, could this be wrapped up and called something like:

	preempt_and_local_bh_disable()
	...
	local_bh_and_preempt_enable()?

I do wonder whether there are other places making the same assumption
about the local_irq > local_bh > preempt hierarchy that have been
missed...

> > If bh (as a preempting context) doesn't exist on RT, then can't
> > local_bh_disable() just suppress all preemption up to but excluding
> > hardirq? Would anything break?
>
> Yes. A lot. Starting with spin_lock_bh() itself because it does:
> 	local_bh_disable();
> 	spin_lock()
>
> and with disabled preemption you can't do spin_lock() and you have to
> because the owner may be preempted. The next thing is that kmalloc() and
> friends won't work in a local_bh_disable() section for the same reason.

Couldn't this be solved with a trylock loop that re-enables bh (and
preemption) on the sleeping path? But that may still be trying to
achieve something that doesn't make sense given the goals of
PREEMPT_RT(?)

> The list goes on.
>
> Sebastian

Cheers
---Dave
On 2021-07-29 16:34:22 [+0100], Dave Martin wrote:
>
> That rather suggests to me that it is worth factoring this and giving it
> a name, precisely because irrespective of CONFIG_PREEMPT_RT we need to
> make sure that no task switch _and_ no bh runs on the same cpu. The
> problem seems to be that the local_bh_disable() API doesn't express the
> difference between wanting to prevent local bh processing and wanting to
> prevent local bh _and_ task switch.
>
> So, could this be wrapped up and called something like:
>
> 	preempt_and_local_bh_disable()
> 	...
> 	local_bh_and_preempt_enable()?

We don't disable both on RT. It is preemption on RT and BH + implicit
preemption on non-RT. The difference is that on RT a softirq may have
been preempted and in this case get_cpu_fpsimd_context() won't force its
completion. However, since get_cpu_fpsimd_context() disables preemption
on RT, it won't be preempted in the section where the SIMD registers are
modified. And the softirq does not run on HARDIRQ-exit but in task
context, so it is okay.

But I get what you mean. I'm just not sure regarding the naming since
the code behaves the same on x86 and arm64 here.

> I do wonder whether there are other places making the same assumption
> about the local_irq > local_bh > preempt hierarchy that have been
> missed...

Based on memory we had a few cases of those while cleaning up
in_atomic(), in_softirq() and friends.

> > > If bh (as a preempting context) doesn't exist on RT, then can't
> > > local_bh_disable() just suppress all preemption up to but excluding
> > > hardirq? Would anything break?
> >
> > Yes. A lot. Starting with spin_lock_bh() itself because it does:
> > 	local_bh_disable();
> > 	spin_lock()
> >
> > and with disabled preemption you can't do spin_lock() and you have to
> > because the owner may be preempted. The next thing is that kmalloc() and
> > friends won't work in a local_bh_disable() section for the same reason.
>
> Couldn't this be solved with a trylock loop that re-enables bh (and
> preemption) on the sleeping path? But that may still be trying to
> achieve something that doesn't make sense given the goals of
> PREEMPT_RT(?)

What about
	spin_lock_bh(a);
	spin_lock_bh(b);
?

And then you still can't kmalloc() in a spin_lock_bh() section if you
disable preemption as part of local_bh_disable().

> Cheers
> ---Dave

Sebastian
On 2021-07-29 18:00:31 [+0200], To Dave Martin wrote:
>
> But I get what you mean. I'm just not sure regarding the naming since
> the code behaves the same on x86 and arm64 here.

Assuming that everyone follows this pattern, what about

	fpregs_lock()
	fpregs_unlock()

?

Sebastian
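[Editor's note: if a shared helper along these lines existed, the arm64
wrappers from the patch could collapse to something like the sketch
below. This is purely illustrative — fpregs_lock()/fpregs_unlock() as
cross-arch helpers are only a naming proposal at this point in the
thread:]

	/* Hypothetical: assumes a generic fpregs_lock()/fpregs_unlock() helper exists. */
	static void get_cpu_fpsimd_context(void)
	{
		fpregs_lock();		/* non-RT: local_bh_disable(); RT: preempt_disable() */
		__get_cpu_fpsimd_context();
	}

	static void put_cpu_fpsimd_context(void)
	{
		__put_cpu_fpsimd_context();
		fpregs_unlock();
	}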
On Thu, Jul 29, 2021 at 04:41:11PM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-07-29 15:22:10 [+0100], Mark Brown wrote:
> > I think someone sent a very similar looking patch in the past week or
> > two?
>
> Not that I'm aware of.

Hrm, right - I can't find it (even internally, but I guess it must've
been some internal thing). Anyway

Reviewed-by: Mark Brown <broonie@kernel.org>

though it does feel like we should have a helper for this.
On Thu, Jul 29, 2021 at 06:07:56PM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-07-29 18:00:31 [+0200], To Dave Martin wrote:
> >
> > But I get what you mean. I'm just not sure regarding the naming since
> > the code behaves the same on x86 and arm64 here.
>
> Assuming that everyone follows this pattern, what about
> 	fpregs_lock()
> 	fpregs_unlock()

I'm not sure about the "fpregs". This is really about CPU-local
resources that are contended between softirq and task context.

Some arches might not use fp in softirq context and then their fp
regs wouldn't need this; others might have other resources that aren't
"fp" regs, but are contended in the same way.

My "local_bh" was meaning purely softirqs running on this CPU. This was
the original meaning of "local" in this API IIUC. This is one reason
why they must disable preemption: "local" is meaningless if preemption
is enabled, since otherwise we might randomly migrate between CPUs.

I guess the "local" was preserved in the naming on PREEMPT_RT to reduce
the amount of noise that would have resulted from a treewide rename, but
this word seems confusing if there is no CPU-localness involved.

In this particular case, we really do want to bind ourselves onto the
current CPU and disable softirqs on this CPU -- if they continue to run
elsewhere, that's just fine.

What do you think about:

	get_bh_cpu_context()
	put_bh_cpu_context()

or something along those lines?

Cheers
---Dave
On 2021-07-29 17:32:27 [+0100], Dave Martin wrote:
> On Thu, Jul 29, 2021 at 06:07:56PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2021-07-29 18:00:31 [+0200], To Dave Martin wrote:
> > >
> > > But I get what you mean. I'm just not sure regarding the naming since
> > > the code behaves the same on x86 and arm64 here.
> >
> > Assuming that everyone follows this pattern, what about
> > 	fpregs_lock()
> > 	fpregs_unlock()
>
> I'm not sure about the "fpregs". This is really about CPU-local
> resources that are contended between softirq and task context.
>
> Some arches might not use fp in softirq context and then their fp
> regs wouldn't need this; others might have other resources that aren't
> "fp" regs, but are contended in the same way.

If FP / SIMD is used in crypto then it is likely that they will aim for
softirq for ipsec reasons. Wireguard and dm-crypt perform crypto in
process context if I remember correctly.

> My "local_bh" was meaning purely softirqs running on this CPU. This was
> the original meaning of "local" in this API IIUC. This is one reason
> why they must disable preemption: "local" is meaningless if preemption
> is enabled, since otherwise we might randomly migrate between CPUs.

You assume that BH also disables preemption, which is an implementation
detail. On RT, local_bh_disable() disables BH on the current CPU. The
task won't migrate to another CPU while preempted, and another task
which invokes local_bh_disable() on the same CPU will wait until the
previous local_bh_disable() section completes. So all semantics which
are expected by local_bh_disable() are preserved on RT - except for the
part where preemption is disabled.

> I guess the "local" was preserved in the naming on PREEMPT_RT to reduce
> the amount of noise that would have resulted from a treewide rename, but
> this word seems confusing if there is no CPU-localness involved.

As I explained above, BH on the local/current CPU is disabled, as in no
softirq is currently running on this CPU. The context however remains
preemptible so there is no need for a rename.

> In this particular case, we really do want to bind ourselves onto the
> current CPU and disable softirqs on this CPU -- if they continue to run
> elsewhere, that's just fine.
>
> What do you think about:
>
> 	get_bh_cpu_context()
> 	put_bh_cpu_context()
>
> or something along those lines?

So you are not buying the fpregs_lock()? Then I don't need to reach for
the simd_lock() or such from the shelf?

The problem I have with _bh_ is that on RT the BH/softirq context is not
forced to complete if preempted, as it would be the case with
local_bh_disable(). So that is misleading. It is just that it happens
not to matter in this case and it is sufficient on RT to disable
preemption. It would be wrong to disable BH and preemption on RT (but it
might be okay / needed in other cases).

powerpc for instance has arch/powerpc/crypto/crc32c-vpmsum_glue.c doing
crc32c with ALTIVEC (simd). They only disable preemption and their
crypto_simd_usable() (the generic version of it) forbids an invocation
in softirq context.

No matter how I look at it, it really comes down to the specific SIMD
usage on an architecture.

> Cheers
> ---Dave

Sebastian
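[Editor's note: to illustrate the powerpc example mentioned above, the
vpmsum glue code guards ALTIVEC use with preempt_disable() only and
relies on the crypto_simd_usable() check to keep SIMD crypto out of
softirq context. A rough sketch of that shape follows — it is not the
file's literal code, and fallback_crc32c() is a hypothetical scalar
fallback used here only for illustration:]

	/* Sketch of the powerpc pattern described above, under the stated assumptions. */
	static u32 crc32c_with_altivec(u32 crc, const u8 *p, size_t len)
	{
		if (!crypto_simd_usable())		/* generic may_use_simd(): refuse softirq/hardirq */
			return fallback_crc32c(crc, p, len);

		preempt_disable();
		enable_kernel_altivec();		/* make ALTIVEC usable in kernel context */
		crc = __crc32c_vpmsum(crc, p, len);	/* the vector implementation */
		disable_kernel_altivec();
		preempt_enable();

		return crc;
	}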
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index e098f6c67b1de..a208514bd69a9 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -177,10 +177,19 @@ static void __get_cpu_fpsimd_context(void)
  *
  * The double-underscore version must only be called if you know the task
  * can't be preempted.
+ *
+ * On RT kernels local_bh_disable() is not sufficient because it only
+ * serializes soft interrupt related sections via a local lock, but stays
+ * preemptible. Disabling preemption is the right choice here as bottom
+ * half processing is always in thread context on RT kernels so it
+ * implicitly prevents bottom half processing as well.
  */
 static void get_cpu_fpsimd_context(void)
 {
-	local_bh_disable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_bh_disable();
+	else
+		preempt_disable();
 	__get_cpu_fpsimd_context();
 }
@@ -201,7 +210,10 @@ static void __put_cpu_fpsimd_context(void)
 static void put_cpu_fpsimd_context(void)
 {
 	__put_cpu_fpsimd_context();
-	local_bh_enable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_bh_enable();
+	else
+		preempt_enable();
 }

 static bool have_cpu_fpsimd_context(void)
Non RT kernels need to protect FPU against preemption and bottom half
processing. This is achieved by disabling bottom halves via
local_bh_disable() which implicitly disables preemption.

On RT kernels this protection mechanism is not sufficient because
local_bh_disable() does not disable preemption. It serializes bottom half
related processing via a CPU local lock.

As bottom halves always run in thread context on RT kernels, disabling
preemption is the proper choice as it implicitly prevents bottom half
processing.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/arm64/kernel/fpsimd.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)