Message ID | 20240521163720.3812851-3-tabba@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | KVM: arm64: Fix handling of host fpsimd/sve state in protected mode | expand |
On Tue, 21 May 2024 17:37:15 +0100, Fuad Tabba <tabba@google.com> wrote: > > The same traps controlled by CPTR_EL2 or CPACR_EL1 need to be > toggled in different parts of the code, but the exact bits and > their polarity differ between these two formats and the mode > (vhe/nvhe/hvhe). > > To reduce the amount of duplicated code and the chance of getting > the wrong bit/polarity or missing a field, abstract the set/clear > of CPTR_EL2 bits behind a helper. > > Since (h)VHE is the way of the future, use the CPACR_EL1 format, > which is a subset of the VHE CPTR_EL2, as a reference. > > No functional change intended. > > Suggested-by: Oliver Upton <oliver.upton@linux.dev> > Signed-off-by: Fuad Tabba <tabba@google.com> > --- > arch/arm64/include/asm/kvm_emulate.h | 34 +++++++++++++++++++++++++ > arch/arm64/kvm/hyp/include/hyp/switch.h | 17 +++---------- > arch/arm64/kvm/hyp/nvhe/hyp-main.c | 6 +---- > 3 files changed, 39 insertions(+), 18 deletions(-) > > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h > index 501e3e019c93..74837d1762e5 100644 > --- a/arch/arm64/include/asm/kvm_emulate.h > +++ b/arch/arm64/include/asm/kvm_emulate.h > @@ -557,6 +557,40 @@ static __always_inline void kvm_incr_pc(struct kvm_vcpu *vcpu) > vcpu_set_flag((v), e); \ > } while (0) > > + > +static inline void __cptr_clear_set_nvhe(u64 cpacr_clr, u64 cpacr_set) > +{ > + u64 clr = 0, set = 0; > + > + if (cpacr_clr & CPACR_ELx_FPEN) > + set |= CPTR_EL2_TFP; > + if (cpacr_clr & CPACR_ELx_ZEN) > + set |= CPTR_EL2_TZ; > + if (cpacr_clr & CPACR_ELx_SMEN) These 3 fields are actually pairs of bits. Can we have a compile-time check that both bits are set? > + set |= CPTR_EL2_TSM; > + if (cpacr_clr & CPACR_ELx_TTA) > + clr |= CPTR_EL2_TTA; How about TCPAC, TAM, and E0POE? > + > + if (cpacr_set & CPACR_ELx_FPEN) > + clr |= CPTR_EL2_TFP; > + if (cpacr_set & CPACR_ELx_ZEN) > + clr |= CPTR_EL2_TZ; > + if (cpacr_set & CPACR_ELx_SMEN) > + clr |= CPTR_EL2_TSM; > + if (cpacr_set & CPACR_ELx_TTA) > + set |= CPTR_EL2_TTA; The duplication is pretty unfortunate. Having a single helper that translate a register layout into another would be better. > + > + sysreg_clear_set(cptr_el2, clr, set); And omit this... > +} > + > +static inline void cpacr_clear_set(u64 clr, u64 set) > +{ > + if (has_vhe() || has_hvhe()) > + sysreg_clear_set(cpacr_el1, clr, set); > + else > + __cptr_clear_set_nvhe(clr, set); So that this could read as: sysreg_clear_set(cptr_el2, cpacr_to_cptr(clr), cpacr_to_cptr(set)); Thanks, M.
Hi Marc, On Tue, May 21, 2024 at 10:08 PM Marc Zyngier <maz@kernel.org> wrote: > > On Tue, 21 May 2024 17:37:15 +0100, > Fuad Tabba <tabba@google.com> wrote: > > > > The same traps controlled by CPTR_EL2 or CPACR_EL1 need to be > > toggled in different parts of the code, but the exact bits and > > their polarity differ between these two formats and the mode > > (vhe/nvhe/hvhe). > > > > To reduce the amount of duplicated code and the chance of getting > > the wrong bit/polarity or missing a field, abstract the set/clear > > of CPTR_EL2 bits behind a helper. > > > > Since (h)VHE is the way of the future, use the CPACR_EL1 format, > > which is a subset of the VHE CPTR_EL2, as a reference. > > > > No functional change intended. > > > > Suggested-by: Oliver Upton <oliver.upton@linux.dev> > > Signed-off-by: Fuad Tabba <tabba@google.com> > > --- > > arch/arm64/include/asm/kvm_emulate.h | 34 +++++++++++++++++++++++++ > > arch/arm64/kvm/hyp/include/hyp/switch.h | 17 +++---------- > > arch/arm64/kvm/hyp/nvhe/hyp-main.c | 6 +---- > > 3 files changed, 39 insertions(+), 18 deletions(-) > > > > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h > > index 501e3e019c93..74837d1762e5 100644 > > --- a/arch/arm64/include/asm/kvm_emulate.h > > +++ b/arch/arm64/include/asm/kvm_emulate.h > > @@ -557,6 +557,40 @@ static __always_inline void kvm_incr_pc(struct kvm_vcpu *vcpu) > > vcpu_set_flag((v), e); \ > > } while (0) > > > > + > > +static inline void __cptr_clear_set_nvhe(u64 cpacr_clr, u64 cpacr_set) > > +{ > > + u64 clr = 0, set = 0; > > + > > + if (cpacr_clr & CPACR_ELx_FPEN) > > + set |= CPTR_EL2_TFP; > > + if (cpacr_clr & CPACR_ELx_ZEN) > > + set |= CPTR_EL2_TZ; > > + if (cpacr_clr & CPACR_ELx_SMEN) > > These 3 fields are actually pairs of bits. Can we have a compile-time > check that both bits are set? Yes. > > + set |= CPTR_EL2_TSM; > > + if (cpacr_clr & CPACR_ELx_TTA) > > + clr |= CPTR_EL2_TTA; > > How about TCPAC, TAM, and E0POE? CPTR_EL2 (nvhe) doesn't have an equivalent to E0POE. Should I have a compile time check that it's not cleared at all (since it has negative polarity), or a runtime check for that only for nvhe? As for TCPAC, TAM, they're not defined under the CPACR bits, but they are in CPTR_EL2, which is what CPACR refers to in VHE. I will handle them, but should I rename cpacr_to_cptr() to cptr_vhe_to_nvhe()? > > > + > > + if (cpacr_set & CPACR_ELx_FPEN) > > + clr |= CPTR_EL2_TFP; > > + if (cpacr_set & CPACR_ELx_ZEN) > > + clr |= CPTR_EL2_TZ; > > + if (cpacr_set & CPACR_ELx_SMEN) > > + clr |= CPTR_EL2_TSM; > > + if (cpacr_set & CPACR_ELx_TTA) > > + set |= CPTR_EL2_TTA; > > The duplication is pretty unfortunate. Having a single helper that > translate a register layout into another would be better. > > + > > + sysreg_clear_set(cptr_el2, clr, set); > > And omit this... > > > +} > > + > > +static inline void cpacr_clear_set(u64 clr, u64 set) > > +{ > > + if (has_vhe() || has_hvhe()) > > + sysreg_clear_set(cpacr_el1, clr, set); > > + else > > + __cptr_clear_set_nvhe(clr, set); > > So that this could read as: > > sysreg_clear_set(cptr_el2, cpacr_to_cptr(clr), cpacr_to_cptr(set)); I don't know how this could work with only one function/macro, while covering all the fields. The problem is that the polarity of these bits vary: some have the same polarity between CPACR/CPTR. and others have the opposite polarity. Therefore, the conversion to `clr` is different from the conversion to `set`. Say we need to clear both CPACR_ELx_FPEN and CPACR_ELx_TTA. The problem is that CPACR_ELx_FPEN has the opposite polarity as CPTR_EL2_TFP, whereas CPACR_ELx_TTA has the same polarity as CPTR_EL2_TTA (different position though). What would cpacr_to_cptr() return that would work in the code you propose: sysreg_clear_set(cptr_el2, cpacr_to_cptr(CPACR_ELx_FPEN|CPACR_ELx_TTA), cpacr_to_cptr(0)); Desired result: sysreg_clear_set(cptr_el2, CPTR_EL2_TTA, CPTR_EL2_TFP); I can make it work with two functions/macros: __cpacr_to_cptr_clear(clr, set) and __cpacr_to_cptr_set(clr, set) --> sysreg_clear_set(cptr_el2, __cpacr_to_cptr_clear(clr, set), __cpacr_to_cptr_set(clr, set)); Unfortunately this doesn't get rid of the duplication, but it might be tidier (I converted them to macros to add compile-time checks later): +#define __cpacr_to_cptr_clr(clr, set) \ + ({ \ + u64 cptr = 0; \ + \ + if ((set) & CPACR_ELx_FPEN) \ + cptr |= CPTR_EL2_TFP; \ + if ((set) & CPACR_ELx_ZEN) \ + cptr |= CPTR_EL2_TZ; \ + if ((set) & CPACR_ELx_SMEN) \ + cptr |= CPTR_EL2_TSM; \ + if ((clr) & CPACR_ELx_TTA) \ + cptr |= CPTR_EL2_TTA; \ + if ((clr) & CPTR_EL2_TAM) \ + cptr |= CPTR_EL2_TAM; \ + if ((clr) & CPTR_EL2_TCPAC) \ + cptr |= CPTR_EL2_TCPAC; \ + \ + cptr; \ + }) + +#define __cpacr_to_cptr_set(clr, set) \ + ({ \ + u64 cptr = 0; \ + \ + if ((clr) & CPACR_ELx_FPEN) \ + cptr |= CPTR_EL2_TFP; \ + if ((clr) & CPACR_ELx_ZEN) \ + cptr |= CPTR_EL2_TZ; \ + if ((clr) & CPACR_ELx_SMEN) \ + cptr |= CPTR_EL2_TSM; \ + if ((set) & CPACR_ELx_TTA) \ + cptr |= CPTR_EL2_TTA; \ + if ((set) & CPTR_EL2_TAM) \ + cptr |= CPTR_EL2_TAM; \ + if ((set) & CPTR_EL2_TCPAC) \ + cptr |= CPTR_EL2_TCPAC; \ + \ + cptr; \ + }) + +#define cpacr_clear_set(clr, set) \ + do { \ + if (has_vhe() || has_hvhe()) \ + sysreg_clear_set(cpacr_el1, clr, set); \ + else \ + sysreg_clear_set(cptr_el2, \ + __cpacr_to_cptr_clr(clr, set), \ + __cpacr_to_cptr_set(clr, set));\ + } while (0) What do you think? Cheers, /fuad > Thanks, > > M. > > -- > Without deviation from the norm, progress is not possible.
On Wed, 22 May 2024 14:48:19 +0100, Fuad Tabba <tabba@google.com> wrote: > > Hi Marc, > > On Tue, May 21, 2024 at 10:08 PM Marc Zyngier <maz@kernel.org> wrote: > > > > On Tue, 21 May 2024 17:37:15 +0100, > > Fuad Tabba <tabba@google.com> wrote: > > > > > > The same traps controlled by CPTR_EL2 or CPACR_EL1 need to be > > > toggled in different parts of the code, but the exact bits and > > > their polarity differ between these two formats and the mode > > > (vhe/nvhe/hvhe). > > > > > > To reduce the amount of duplicated code and the chance of getting > > > the wrong bit/polarity or missing a field, abstract the set/clear > > > of CPTR_EL2 bits behind a helper. > > > > > > Since (h)VHE is the way of the future, use the CPACR_EL1 format, > > > which is a subset of the VHE CPTR_EL2, as a reference. > > > > > > No functional change intended. > > > > > > Suggested-by: Oliver Upton <oliver.upton@linux.dev> > > > Signed-off-by: Fuad Tabba <tabba@google.com> > > > --- > > > arch/arm64/include/asm/kvm_emulate.h | 34 +++++++++++++++++++++++++ > > > arch/arm64/kvm/hyp/include/hyp/switch.h | 17 +++---------- > > > arch/arm64/kvm/hyp/nvhe/hyp-main.c | 6 +---- > > > 3 files changed, 39 insertions(+), 18 deletions(-) > > > > > > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h > > > index 501e3e019c93..74837d1762e5 100644 > > > --- a/arch/arm64/include/asm/kvm_emulate.h > > > +++ b/arch/arm64/include/asm/kvm_emulate.h > > > @@ -557,6 +557,40 @@ static __always_inline void kvm_incr_pc(struct kvm_vcpu *vcpu) > > > vcpu_set_flag((v), e); \ > > > } while (0) > > > > > > + > > > +static inline void __cptr_clear_set_nvhe(u64 cpacr_clr, u64 cpacr_set) > > > +{ > > > + u64 clr = 0, set = 0; > > > + > > > + if (cpacr_clr & CPACR_ELx_FPEN) > > > + set |= CPTR_EL2_TFP; > > > + if (cpacr_clr & CPACR_ELx_ZEN) > > > + set |= CPTR_EL2_TZ; > > > + if (cpacr_clr & CPACR_ELx_SMEN) > > > > These 3 fields are actually pairs of bits. Can we have a compile-time > > check that both bits are set? > > Yes. > > > > + set |= CPTR_EL2_TSM; > > > + if (cpacr_clr & CPACR_ELx_TTA) > > > + clr |= CPTR_EL2_TTA; > > > > How about TCPAC, TAM, and E0POE? > > CPTR_EL2 (nvhe) doesn't have an equivalent to E0POE. Should I have a > compile time check that it's not cleared at all (since it has negative > polarity), or a runtime check for that only for nvhe? You're right, and I ended figuring that out while reworking the NV side of CPTR_EL2 handling. > > As for TCPAC, TAM, they're not defined under the CPACR bits, but they > are in CPTR_EL2, which is what CPACR refers to in VHE. I will handle > them, but should I rename cpacr_to_cptr() to cptr_vhe_to_nvhe()? I think this is clear enough in context, but I don't mind either way. > > > > > > + > > > + if (cpacr_set & CPACR_ELx_FPEN) > > > + clr |= CPTR_EL2_TFP; > > > + if (cpacr_set & CPACR_ELx_ZEN) > > > + clr |= CPTR_EL2_TZ; > > > + if (cpacr_set & CPACR_ELx_SMEN) > > > + clr |= CPTR_EL2_TSM; > > > + if (cpacr_set & CPACR_ELx_TTA) > > > + set |= CPTR_EL2_TTA; > > > > The duplication is pretty unfortunate. Having a single helper that > > translate a register layout into another would be better. > > > + > > > + sysreg_clear_set(cptr_el2, clr, set); > > > > And omit this... > > > > > +} > > > + > > > +static inline void cpacr_clear_set(u64 clr, u64 set) > > > +{ > > > + if (has_vhe() || has_hvhe()) > > > + sysreg_clear_set(cpacr_el1, clr, set); > > > + else > > > + __cptr_clear_set_nvhe(clr, set); > > > > So that this could read as: > > > > sysreg_clear_set(cptr_el2, cpacr_to_cptr(clr), cpacr_to_cptr(set)); > > I don't know how this could work with only one function/macro, while > covering all the fields. The problem is that the polarity of these > bits vary: some have the same polarity between CPACR/CPTR. and others > have the opposite polarity. Therefore, the conversion to `clr` is > different from the conversion to `set`. Ah, you're absolutely right. I totally glanced over the fact that the polarities are inverted, totally ruining the fun. > > Say we need to clear both CPACR_ELx_FPEN and CPACR_ELx_TTA. The > problem is that CPACR_ELx_FPEN has the opposite polarity as > CPTR_EL2_TFP, whereas CPACR_ELx_TTA has the same polarity as > CPTR_EL2_TTA (different position though). > > What would cpacr_to_cptr() return that would work in the code you propose: > sysreg_clear_set(cptr_el2, > cpacr_to_cptr(CPACR_ELx_FPEN|CPACR_ELx_TTA), cpacr_to_cptr(0)); > > Desired result: > sysreg_clear_set(cptr_el2, CPTR_EL2_TTA, CPTR_EL2_TFP); > > I can make it work with two functions/macros: > __cpacr_to_cptr_clear(clr, set) and __cpacr_to_cptr_set(clr, set) --> > sysreg_clear_set(cptr_el2, __cpacr_to_cptr_clear(clr, set), > __cpacr_to_cptr_set(clr, set)); > > Unfortunately this doesn't get rid of the duplication, but it might be > tidier (I converted them to macros to add compile-time checks later): > > +#define __cpacr_to_cptr_clr(clr, set) \ > + ({ \ > + u64 cptr = 0; \ > + \ > + if ((set) & CPACR_ELx_FPEN) \ > + cptr |= CPTR_EL2_TFP; \ > + if ((set) & CPACR_ELx_ZEN) \ > + cptr |= CPTR_EL2_TZ; \ > + if ((set) & CPACR_ELx_SMEN) \ > + cptr |= CPTR_EL2_TSM; \ > + if ((clr) & CPACR_ELx_TTA) \ > + cptr |= CPTR_EL2_TTA; \ > + if ((clr) & CPTR_EL2_TAM) \ > + cptr |= CPTR_EL2_TAM; \ > + if ((clr) & CPTR_EL2_TCPAC) \ > + cptr |= CPTR_EL2_TCPAC; \ > + \ > + cptr; \ > + }) > + > +#define __cpacr_to_cptr_set(clr, set) \ > + ({ \ > + u64 cptr = 0; \ > + \ > + if ((clr) & CPACR_ELx_FPEN) \ > + cptr |= CPTR_EL2_TFP; \ > + if ((clr) & CPACR_ELx_ZEN) \ > + cptr |= CPTR_EL2_TZ; \ > + if ((clr) & CPACR_ELx_SMEN) \ > + cptr |= CPTR_EL2_TSM; \ > + if ((set) & CPACR_ELx_TTA) \ > + cptr |= CPTR_EL2_TTA; \ > + if ((set) & CPTR_EL2_TAM) \ > + cptr |= CPTR_EL2_TAM; \ > + if ((set) & CPTR_EL2_TCPAC) \ > + cptr |= CPTR_EL2_TCPAC; \ > + \ > + cptr; \ > + }) > + > +#define cpacr_clear_set(clr, set) \ > + do { \ > + if (has_vhe() || has_hvhe()) \ > + sysreg_clear_set(cpacr_el1, clr, set); \ > + else \ > + sysreg_clear_set(cptr_el2, \ > + __cpacr_to_cptr_clr(clr, set), \ > + __cpacr_to_cptr_set(clr, set));\ > + } while (0) > > What do you think? I quite like this. It is a large body of code, but it is easy to read and symmetric. Thanks, M.
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 501e3e019c93..74837d1762e5 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -557,6 +557,40 @@ static __always_inline void kvm_incr_pc(struct kvm_vcpu *vcpu) vcpu_set_flag((v), e); \ } while (0) + +static inline void __cptr_clear_set_nvhe(u64 cpacr_clr, u64 cpacr_set) +{ + u64 clr = 0, set = 0; + + if (cpacr_clr & CPACR_ELx_FPEN) + set |= CPTR_EL2_TFP; + if (cpacr_clr & CPACR_ELx_ZEN) + set |= CPTR_EL2_TZ; + if (cpacr_clr & CPACR_ELx_SMEN) + set |= CPTR_EL2_TSM; + if (cpacr_clr & CPACR_ELx_TTA) + clr |= CPTR_EL2_TTA; + + if (cpacr_set & CPACR_ELx_FPEN) + clr |= CPTR_EL2_TFP; + if (cpacr_set & CPACR_ELx_ZEN) + clr |= CPTR_EL2_TZ; + if (cpacr_set & CPACR_ELx_SMEN) + clr |= CPTR_EL2_TSM; + if (cpacr_set & CPACR_ELx_TTA) + set |= CPTR_EL2_TTA; + + sysreg_clear_set(cptr_el2, clr, set); +} + +static inline void cpacr_clear_set(u64 clr, u64 set) +{ + if (has_vhe() || has_hvhe()) + sysreg_clear_set(cpacr_el1, clr, set); + else + __cptr_clear_set_nvhe(clr, set); +} + static __always_inline void kvm_write_cptr_el2(u64 val) { if (has_vhe() || has_hvhe()) diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h index a92566f36022..2ebe2fea8768 100644 --- a/arch/arm64/kvm/hyp/include/hyp/switch.h +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h @@ -353,19 +353,10 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code) /* Valid trap. Switch the context: */ /* First disable enough traps to allow us to update the registers */ - if (has_vhe() || has_hvhe()) { - reg = CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN; - if (sve_guest) - reg |= CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN; - - sysreg_clear_set(cpacr_el1, 0, reg); - } else { - reg = CPTR_EL2_TFP; - if (sve_guest) - reg |= CPTR_EL2_TZ; - - sysreg_clear_set(cptr_el2, reg, 0); - } + reg = CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN; + if (sve_guest) + reg |= CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN; + cpacr_clear_set(0, reg); isb(); /* Write out the host state if it's in the registers */ diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c index d5c48dc98f67..b07d44484f42 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -405,11 +405,7 @@ void handle_trap(struct kvm_cpu_context *host_ctxt) handle_host_smc(host_ctxt); break; case ESR_ELx_EC_SVE: - if (has_hvhe()) - sysreg_clear_set(cpacr_el1, 0, (CPACR_EL1_ZEN_EL1EN | - CPACR_EL1_ZEN_EL0EN)); - else - sysreg_clear_set(cptr_el2, CPTR_EL2_TZ, 0); + cpacr_clear_set(0, (CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN)); isb(); sve_cond_update_zcr_vq(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2); break;
The same traps controlled by CPTR_EL2 or CPACR_EL1 need to be toggled in different parts of the code, but the exact bits and their polarity differ between these two formats and the mode (vhe/nvhe/hvhe). To reduce the amount of duplicated code and the chance of getting the wrong bit/polarity or missing a field, abstract the set/clear of CPTR_EL2 bits behind a helper. Since (h)VHE is the way of the future, use the CPACR_EL1 format, which is a subset of the VHE CPTR_EL2, as a reference. No functional change intended. Suggested-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> --- arch/arm64/include/asm/kvm_emulate.h | 34 +++++++++++++++++++++++++ arch/arm64/kvm/hyp/include/hyp/switch.h | 17 +++---------- arch/arm64/kvm/hyp/nvhe/hyp-main.c | 6 +---- 3 files changed, 39 insertions(+), 18 deletions(-)