Message ID | 20241211154611.40395-3-miko.lenczewski@arm.com (mailing list archive) |
---|---|
State | New, archived |
Series | Initial BBML2 support for contpte_convert() |

On Wed, Dec 11, 2024 at 03:45:03PM +0000, Mikołaj Lenczewski wrote:
> The Break-Before-Make cpu feature supports multiple levels (levels 0-2),
> and this commit adds a dedicated BBML2 cpufeature to test against
> support for.
>
> In supporting BBM level 2, we open ourselves up to potential TLB
> Conflict Abort Exceptions during expected execution, instead of only
> in exceptional circumstances. In the case of an abort, it is
> implementation defined at what stage the abort is generated, and
> the minimal set of required invalidations is also implementation
> defined. The maximal set of invalidations is to do a `tlbi vmalle1`
> or `tlbi vmalls12e1`, depending on the stage.
>
> Such aborts should not occur on Arm hardware, and were not seen in
> benchmarked systems, so unless performance concerns arise, implementing
> the abort handlers with the worst-case invalidations seems like an
> alright hack.
>
> Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
> ---
>  arch/arm64/include/asm/cpufeature.h | 14 ++++++++++++++
>  arch/arm64/kernel/cpufeature.c      |  7 +++++++
>  arch/arm64/mm/fault.c               | 27 ++++++++++++++++++++++++++-
>  arch/arm64/tools/cpucaps            |  1 +
>  4 files changed, 48 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 8b4e5a3cd24c..a9f2ac335392 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -866,6 +866,20 @@ static __always_inline bool system_supports_mpam_hcr(void)
>  	return alternative_has_cap_unlikely(ARM64_MPAM_HCR);
>  }
>
> +static inline bool system_supports_bbml2(void)
> +{
> +	/* currently, BBM is only relied on by code touching the userspace page
> +	 * tables, and as such we are guaranteed that caps have been finalised.
> +	 *
> +	 * if later we want to use BBM for kernel mappings, particularly early
> +	 * in the kernel, this may return 0 even if BBML2 is actually supported,
> +	 * which means unnecessary break-before-make sequences, but is still
> +	 * correct
> +	 */
> +
> +	return alternative_has_cap_unlikely(ARM64_HAS_BBML2);
> +}
> +
>  int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
>  bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 6ce71f444ed8..7cc94bd5da24 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -2917,6 +2917,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.matches = has_cpuid_feature,
>  		ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP)
>  	},
> +	{
> +		.desc = "BBM Level 2 Support",
> +		.capability = ARM64_HAS_BBML2,
> +		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
> +		.matches = has_cpuid_feature,
> +		ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, BBM, 2)
> +	},
>  	{
>  		.desc = "52-bit Virtual Addressing for KVM (LPA2)",
>  		.capability = ARM64_HAS_LPA2,
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index ef63651099a9..dc119358cbc1 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -844,6 +844,31 @@ static int do_tag_check_fault(unsigned long far, unsigned long esr,
>  	return 0;
>  }
>
> +static int do_conflict_abort(unsigned long far, unsigned long esr,
> +			     struct pt_regs *regs)
> +{
> +	if (!system_supports_bbml2())
> +		return do_bad(far, esr, regs);
> +
> +	/* if we receive a TLB conflict abort, we know that there are multiple
> +	 * TLB entries that translate the same address range. the minimum set
> +	 * of invalidations to clear these entries is implementation defined.
> +	 * the maximum set is defined as either tlbi(vmalls12e1) or tlbi(alle1).
> +	 *
> +	 * if el2 is enabled and stage 2 translation enabled, this may be
> +	 * raised as a stage 2 abort. if el2 is enabled but stage 2 translation
> +	 * disabled, or if el2 is disabled, it will be raised as a stage 1
> +	 * abort.
> +	 *
> +	 * local_flush_tlb_all() does a tlbi(vmalle1), which is enough to
> +	 * handle a stage 1 abort.
> +	 */
> +
> +	local_flush_tlb_all();
> +
> +	return 0;
> +}

Can we actually guarantee that we make it this far without taking another
abort? Given that I'm yet to see one of these things in the wild, I'm
fairly opposed to pretending that we can handle them. We'd be much better
off only violating BBM on CPUs that are known to handle the conflict
gracefully. Judging by your later patch, this is practically keyed off
the MIDR _anyway_...

Will

> > +static int do_conflict_abort(unsigned long far, unsigned long esr,
> > +			     struct pt_regs *regs)
> > +{
> > +	if (!system_supports_bbml2())
> > +		return do_bad(far, esr, regs);
> > +
> > +	/* if we receive a TLB conflict abort, we know that there are multiple
> > +	 * TLB entries that translate the same address range. the minimum set
> > +	 * of invalidations to clear these entries is implementation defined.
> > +	 * the maximum set is defined as either tlbi(vmalls12e1) or tlbi(alle1).
> > +	 *
> > +	 * if el2 is enabled and stage 2 translation enabled, this may be
> > +	 * raised as a stage 2 abort. if el2 is enabled but stage 2 translation
> > +	 * disabled, or if el2 is disabled, it will be raised as a stage 1
> > +	 * abort.
> > +	 *
> > +	 * local_flush_tlb_all() does a tlbi(vmalle1), which is enough to
> > +	 * handle a stage 1 abort.
> > +	 */
> > +
> > +	local_flush_tlb_all();
> > +
> > +	return 0;
> > +}
>
> Can we actually guarantee that we make it this far without taking another
> abort? Given that I'm yet to see one of these things in the wild, I'm
> fairly opposed to pretending that we can handle them. We'd be much better
> off only violating BBM on CPUs that are known to handle the conflict
> gracefully. Judging by your later patch, this is practically keyed off
> the MIDR _anyway_...
>
> Will

Thanks for reviewing. Apologies for the delay in responding, and for
spam (replied instead of group-replied).

There should not be an option to take another fault while performing the
handler, as long as the mappings covering the fault handler table or any
code in this path are not screwed with. This is discussed further in the
resent patch series [1].

The MIDR revisions will be fixed. I was confused as to which revisions
were affected on an earlier version of the series, and had missed
updating them. The kconfig workarounds should be correct in this regard.

[1]: https://lore.kernel.org/all/084c5ada-51af-4c1a-b50a-4401e62ddbd6@arm.com/

On 12/13/24 8:17 AM, Mikołaj Lenczewski wrote:
>>> +static int do_conflict_abort(unsigned long far, unsigned long esr,
>>> +			     struct pt_regs *regs)
>>> +{
>>> +	if (!system_supports_bbml2())
>>> +		return do_bad(far, esr, regs);
>>> +
>>> +	/* if we receive a TLB conflict abort, we know that there are multiple
>>> +	 * TLB entries that translate the same address range. the minimum set
>>> +	 * of invalidations to clear these entries is implementation defined.
>>> +	 * the maximum set is defined as either tlbi(vmalls12e1) or tlbi(alle1).
>>> +	 *
>>> +	 * if el2 is enabled and stage 2 translation enabled, this may be
>>> +	 * raised as a stage 2 abort. if el2 is enabled but stage 2 translation
>>> +	 * disabled, or if el2 is disabled, it will be raised as a stage 1
>>> +	 * abort.
>>> +	 *
>>> +	 * local_flush_tlb_all() does a tlbi(vmalle1), which is enough to
>>> +	 * handle a stage 1 abort.
>>> +	 */
>>> +
>>> +	local_flush_tlb_all();
>>> +
>>> +	return 0;
>>> +}
>> Can we actually guarantee that we make it this far without taking another
>> abort? Given that I'm yet to see one of these things in the wild, I'm
>> fairly opposed to pretending that we can handle them. We'd be much better
>> off only violating BBM on CPUs that are known to handle the conflict
>> gracefully. Judging by your later patch, this is practically keyed off
>> the MIDR _anyway_...
>>
>> Will

Hi Mikołaj,

> Thanks for reviewing. Apologies for the delay in responding, and for
> spam (replied instead of group-replied).
>
> There should not be an option to take another fault while performing the
> handler, as long as the mappings covering the fault handler table or any
> code in this path are not screwed with. This is discussed further in the
> resent patch series [1].

Will led me to this thread when we discussed my series
(https://lore.kernel.org/lkml/20241211223034.GA17836@willie-the-truck/#t).
My series tried to use large mappings for the kernel linear mapping when
rodata=full. The common part of both series is BBM lv2 support. My
series depends on BBM lv2 to change the page table block size when changing
permissions for the kernel linear mapping.

Handling the TLB conflict should be ok for your usecase since the conflict
should only happen in the user address space. But it may not be fine for my
usecase. At least the TLB conflict handler needs to depend on
CONFIG_VMAP_STACK, otherwise the kernel stack pointer points into the kernel
linear address space. I'm not quite sure whether there is any other corner
case which can trigger a recursive abort in the path or not, maybe the bad
stack handler? So Will suggested an MIDR based lookup. IIUC, he suggested
only enabling BBM lv2 support on the CPUs which can handle the TLB conflict
gracefully. This should make our lives easier. But the downside is that
fewer CPUs may actually advertise BBM lv2 support.

Thanks,
Yang

> The MIDR revisions will be fixed. I was confused as to which revisions
> were affected on an earlier version of the series, and had missed
> updating them. The kconfig workarounds should be correct in this regard.
>
> [1]:https://lore.kernel.org/all/084c5ada-51af-4c1a-b50a-4401e62ddbd6@arm.com/
>

On Fri, Dec 13, 2024 at 02:34:57PM -0800, Yang Shi wrote:
> On 12/13/24 8:17 AM, Mikołaj Lenczewski wrote:
> > > > +static int do_conflict_abort(unsigned long far, unsigned long esr,
> > > > +			     struct pt_regs *regs)
> > > > +{
> > > > +	if (!system_supports_bbml2())
> > > > +		return do_bad(far, esr, regs);
> > > > +
> > > > +	/* if we receive a TLB conflict abort, we know that there are multiple
> > > > +	 * TLB entries that translate the same address range. the minimum set
> > > > +	 * of invalidations to clear these entries is implementation defined.
> > > > +	 * the maximum set is defined as either tlbi(vmalls12e1) or tlbi(alle1).
> > > > +	 *
> > > > +	 * if el2 is enabled and stage 2 translation enabled, this may be
> > > > +	 * raised as a stage 2 abort. if el2 is enabled but stage 2 translation
> > > > +	 * disabled, or if el2 is disabled, it will be raised as a stage 1
> > > > +	 * abort.
> > > > +	 *
> > > > +	 * local_flush_tlb_all() does a tlbi(vmalle1), which is enough to
> > > > +	 * handle a stage 1 abort.
> > > > +	 */
> > > > +
> > > > +	local_flush_tlb_all();
> > > > +
> > > > +	return 0;
> > > > +}
> > > Can we actually guarantee that we make it this far without taking another
> > > abort? Given that I'm yet to see one of these things in the wild, I'm
> > > fairly opposed to pretending that we can handle them. We'd be much better
> > > off only violating BBM on CPUs that are known to handle the conflict
> > > gracefully. Judging by your later patch, this is practically keyed off
> > > the MIDR _anyway_...
> > >
> > > Will
>
> Hi Mikołaj,
>
> > Thanks for reviewing. Apologies for the delay in responding, and for
> > spam (replied instead of group-replied).
> >
> > There should not be an option to take another fault while performing the
> > handler, as long as the mappings covering the fault handler table or any
> > code in this path are not screwed with. This is discussed further in the
> > resent patch series [1].
>
> Will led me to this thread when we discussed my series
> (https://lore.kernel.org/lkml/20241211223034.GA17836@willie-the-truck/#t).
> My series tried to use large mappings for the kernel linear mapping when
> rodata=full. The common part of both series is BBM lv2 support. My
> series depends on BBM lv2 to change the page table block size when changing
> permissions for the kernel linear mapping.
>
> Handling the TLB conflict should be ok for your usecase since the conflict
> should only happen in the user address space. But it may not be fine for my
> usecase. At least the TLB conflict handler needs to depend on
> CONFIG_VMAP_STACK, otherwise the kernel stack pointer points into the kernel
> linear address space. I'm not quite sure whether there is any other corner
> case which can trigger a recursive abort in the path or not, maybe the bad
> stack handler? So Will suggested an MIDR based lookup. IIUC, he suggested
> only enabling BBM lv2 support on the CPUs which can handle the TLB conflict
> gracefully. This should make our lives easier. But the downside is that
> fewer CPUs may actually advertise BBM lv2 support.

Since most of them seem to be broken anyway, I still think this (having
an allowlist and not handling the conflict abort in the kernel) is the
right approach.

Will

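The allowlist idea from the reply above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not code from the series: the names `bbml2_allow_entry`, `bbml2_allowlist`, `cpu_on_bbml2_allowlist()` and `cpu_can_use_bbml2()` are hypothetical, and the part numbers in the table are placeholders rather than CPUs known to handle TLB conflicts gracefully. Only the MIDR_EL1 field positions and the Arm implementer code 0x41 are taken from the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * MIDR_EL1 layout: implementer [31:24], variant [23:20],
 * architecture [19:16], part number [15:4], revision [3:0].
 */
static inline uint32_t midr_field(uint32_t midr, unsigned int shift,
				  unsigned int width)
{
	assert(width > 0 && width < 32 && shift + width <= 32);
	return (midr >> shift) & ((1u << width) - 1);
}

/* Hypothetical allowlist entry: one implementer/part-number pair. */
struct bbml2_allow_entry {
	uint8_t implementer;
	uint16_t partnum;
};

/*
 * PLACEHOLDER entries for illustration only. 0x41 is the Arm Ltd
 * implementer code; the part numbers are made up.
 */
static const struct bbml2_allow_entry bbml2_allowlist[] = {
	{ 0x41, 0xFFE },
	{ 0x41, 0xFFF },
};

/* True if this CPU's MIDR matches an allowlist entry. */
static bool cpu_on_bbml2_allowlist(uint32_t midr)
{
	uint32_t impl = midr_field(midr, 24, 8);
	uint32_t part = midr_field(midr, 4, 12);

	for (size_t i = 0; i < sizeof(bbml2_allowlist) / sizeof(bbml2_allowlist[0]); i++) {
		if (bbml2_allowlist[i].implementer == impl &&
		    bbml2_allowlist[i].partnum == part)
			return true;
	}
	return false;
}

/*
 * BBML2 is relied on only when the ID register reports it AND the CPU
 * is allowlisted, so no conflict abort handler is needed.
 */
static bool cpu_can_use_bbml2(uint32_t midr, bool id_reg_reports_bbml2)
{
	return id_reg_reports_bbml2 && cpu_on_bbml2_allowlist(midr);
}
```

With this shape, a CPU that advertises BBM level 2 but is not on the list simply keeps using full break-before-make sequences, which matches the trade-off described above: fewer CPUs benefit, but the kernel never has to recover from a conflict abort.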
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 8b4e5a3cd24c..a9f2ac335392 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -866,6 +866,20 @@ static __always_inline bool system_supports_mpam_hcr(void)
 	return alternative_has_cap_unlikely(ARM64_MPAM_HCR);
 }
 
+static inline bool system_supports_bbml2(void)
+{
+	/* currently, BBM is only relied on by code touching the userspace page
+	 * tables, and as such we are guaranteed that caps have been finalised.
+	 *
+	 * if later we want to use BBM for kernel mappings, particularly early
+	 * in the kernel, this may return 0 even if BBML2 is actually supported,
+	 * which means unnecessary break-before-make sequences, but is still
+	 * correct
+	 */
+
+	return alternative_has_cap_unlikely(ARM64_HAS_BBML2);
+}
+
 int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
 bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 6ce71f444ed8..7cc94bd5da24 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2917,6 +2917,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.matches = has_cpuid_feature,
 		ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP)
 	},
+	{
+		.desc = "BBM Level 2 Support",
+		.capability = ARM64_HAS_BBML2,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.matches = has_cpuid_feature,
+		ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, BBM, 2)
+	},
 	{
 		.desc = "52-bit Virtual Addressing for KVM (LPA2)",
 		.capability = ARM64_HAS_LPA2,
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index ef63651099a9..dc119358cbc1 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -844,6 +844,31 @@ static int do_tag_check_fault(unsigned long far, unsigned long esr,
 	return 0;
 }
 
+static int do_conflict_abort(unsigned long far, unsigned long esr,
+			     struct pt_regs *regs)
+{
+	if (!system_supports_bbml2())
+		return do_bad(far, esr, regs);
+
+	/* if we receive a TLB conflict abort, we know that there are multiple
+	 * TLB entries that translate the same address range. the minimum set
+	 * of invalidations to clear these entries is implementation defined.
+	 * the maximum set is defined as either tlbi(vmalls12e1) or tlbi(alle1).
+	 *
+	 * if el2 is enabled and stage 2 translation enabled, this may be
+	 * raised as a stage 2 abort. if el2 is enabled but stage 2 translation
+	 * disabled, or if el2 is disabled, it will be raised as a stage 1
+	 * abort.
+	 *
+	 * local_flush_tlb_all() does a tlbi(vmalle1), which is enough to
+	 * handle a stage 1 abort.
+	 */
+
+	local_flush_tlb_all();
+
+	return 0;
+}
+
 static const struct fault_info fault_info[] = {
 	{ do_bad, SIGKILL, SI_KERNEL, "ttbr address size fault" },
 	{ do_bad, SIGKILL, SI_KERNEL, "level 1 address size fault" },
@@ -893,7 +918,7 @@ static const struct fault_info fault_info[] = {
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 45" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 46" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 47" },
-	{ do_bad, SIGKILL, SI_KERNEL, "TLB conflict abort" },
+	{ do_conflict_abort, SIGKILL, SI_KERNEL, "TLB conflict abort" },
 	{ do_bad, SIGKILL, SI_KERNEL, "Unsupported atomic hardware update fault" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 50" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 51" },
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index eb17f59e543c..4ee0fbb7765b 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -26,6 +26,7 @@ HAS_ECV
 HAS_ECV_CNTPOFF
 HAS_EPAN
 HAS_EVT
+HAS_BBML2
 HAS_FPMR
 HAS_FGT
 HAS_FPSIMD

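To make the effect of the fault.c hunks easier to follow, here is a rough userspace model of the dispatch they change. It is a sketch under assumptions, not kernel code: every `model_*` name is hypothetical, the handlers take only `far` and `esr`, and the TLB flush is replaced by a counter. What it keeps from the patch is the control flow: the fault status code in the low six bits of the ESR selects the handler, entry 48 (0x30) is the "TLB conflict abort" slot, and a handler returning 0 means the fault was dealt with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_FSC_MASK		0x3Fu	/* fault status code, ESR bits [5:0] */
#define MODEL_FSC_TLB_CONFLICT	0x30u	/* index 48 in the fault_info table */

typedef int (*model_fault_fn)(uint64_t far, uint64_t esr);

static bool model_has_bbml2;		/* stands in for system_supports_bbml2() */
static unsigned int model_tlb_flushes;	/* stands in for local_flush_tlb_all() */

/* Non-zero return: the fault was not handled (the kernel would signal/die). */
static int model_do_bad(uint64_t far, uint64_t esr)
{
	(void)far;
	(void)esr;
	return 1;
}

/* Mirrors the shape of do_conflict_abort() in the patch. */
static int model_do_conflict_abort(uint64_t far, uint64_t esr)
{
	/* This handler is only installed in the TLB conflict slot. */
	assert((esr & MODEL_FSC_MASK) == MODEL_FSC_TLB_CONFLICT);

	if (!model_has_bbml2)
		return model_do_bad(far, esr);

	model_tlb_flushes++;	/* worst-case invalidation in the real patch */
	return 0;
}

/* Pick the handler from the fault status code, as the table lookup does. */
static model_fault_fn model_handler_for(uint64_t esr)
{
	if ((esr & MODEL_FSC_MASK) == MODEL_FSC_TLB_CONFLICT)
		return model_do_conflict_abort;
	return model_do_bad;
}

static bool model_fault_handled(uint64_t far, uint64_t esr)
{
	return model_handler_for(esr)(far, esr) == 0;
}
```

The model deliberately cannot show the concern raised in review, namely that a second conflict abort could be taken before the flush completes; it only shows what the handler does when it is reached.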
The Break-Before-Make cpu feature supports multiple levels (levels 0-2),
and this commit adds a dedicated BBML2 cpufeature to test against
support for.

In supporting BBM level 2, we open ourselves up to potential TLB
Conflict Abort Exceptions during expected execution, instead of only
in exceptional circumstances. In the case of an abort, it is
implementation defined at what stage the abort is generated, and
the minimal set of required invalidations is also implementation
defined. The maximal set of invalidations is to do a `tlbi vmalle1`
or `tlbi vmalls12e1`, depending on the stage.

Such aborts should not occur on Arm hardware, and were not seen in
benchmarked systems, so unless performance concerns arise, implementing
the abort handlers with the worst-case invalidations seems like an
alright hack.

Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
---
 arch/arm64/include/asm/cpufeature.h | 14 ++++++++++++++
 arch/arm64/kernel/cpufeature.c      |  7 +++++++
 arch/arm64/mm/fault.c               | 27 ++++++++++++++++++++++++++-
 arch/arm64/tools/cpucaps            |  1 +
 4 files changed, 48 insertions(+), 1 deletion(-)
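
As a rough illustration of what the new capability entry tests, the following userspace sketch decodes the BBM field from a raw ID_AA64MMFR2_EL1 value. It assumes the field sits in bits [55:52] and that `ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, BBM, 2)` amounts to an unsigned "field value at least 2" comparison; the helper names `id_reg_field()` and `mmfr2_reports_bbml2()` are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the BBM field in ID_AA64MMFR2_EL1: bits [55:52]. */
#define MMFR2_BBM_SHIFT	52
#define MMFR2_BBM_WIDTH	4

/* Extract an unsigned ID register field of the given width. */
static inline uint64_t id_reg_field(uint64_t reg, unsigned int shift,
				    unsigned int width)
{
	assert(width > 0 && width < 64 && shift + width <= 64);
	return (reg >> shift) & ((UINT64_C(1) << width) - 1);
}

/* True if the register value advertises BBM level 2 or higher. */
static bool mmfr2_reports_bbml2(uint64_t mmfr2)
{
	return id_reg_field(mmfr2, MMFR2_BBM_SHIFT, MMFR2_BBM_WIDTH) >= 2;
}
```

In the kernel the equivalent check runs on the sanitised, system-wide register value, since the capability is declared as `ARM64_CPUCAP_SYSTEM_FEATURE`; this sketch only covers the field comparison itself.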