Message ID | 20250404220659.1312465-3-rananta@google.com (mailing list archive) |
---|---|
State | New |
Series | KVM: selftests: arm64: Explicitly set the page attrs to Inner-Shareable |
On Fri, Apr 04, 2025 at 10:06:59PM +0000, Raghavendra Rao Ananta wrote:
> Atomic instructions such as 'ldset' over (global) variables in the guest
> is observed to cause an EL1 data abort with FSC 0x35 (IMPLEMENTATION
> DEFINED fault (Unsupported Exclusive or Atomic access)). The observation
> was particularly apparent on Neoverse-N3.
>
> According to DDI0487L.a C3.2.6 (Single-copy atomic 64-byte load/store),
> it is implementation defined that a data abort with the mentioned FSC
> is reported for the first stage of translation that provides an
> inappropriate memory type. It's likely that the same rule also applies
> to memory attribute mismatch. When the guest loads the memory location of
> the variable that was already cached during the host userspace's copying
> of the ELF into the memory, the core is likely running into a mismatch
> of memory attrs that's checked in stage-1 itself, and thus causing the
> abort in EL1.

Sorry, my index of the ARM ARM was trashed when we were discussing this
before.

DDI0487L.a B2.2.6 describes the exact situation you encountered, where
atomics are only guaranteed to work on Inner/Outer Shareable MT_NORMAL
memory.

What's a bit more explicit for other memory attribute aborts (like the
one you've cited) is whether or not the implementation can generate the
abort solely on the stage-1 attributes vs. the combined stage-1/stage-2
attributes at the end of translation.

Either way, let's correct the citation to point at the right section.

Thanks,
Oliver
Hi Oliver

On Fri, Apr 4, 2025 at 4:01 PM Oliver Upton <oliver.upton@linux.dev> wrote:
>
> On Fri, Apr 04, 2025 at 10:06:59PM +0000, Raghavendra Rao Ananta wrote:
> > Atomic instructions such as 'ldset' over (global) variables in the guest
> > is observed to cause an EL1 data abort with FSC 0x35 (IMPLEMENTATION
> > DEFINED fault (Unsupported Exclusive or Atomic access)). The observation
> > was particularly apparent on Neoverse-N3.
> >
> > According to DDI0487L.a C3.2.6 (Single-copy atomic 64-byte load/store),
> > it is implementation defined that a data abort with the mentioned FSC
> > is reported for the first stage of translation that provides an
> > inappropriate memory type. It's likely that the same rule also applies
> > to memory attribute mismatch. When the guest loads the memory location of
> > the variable that was already cached during the host userspace's copying
> > of the ELF into the memory, the core is likely running into a mismatch
> > of memory attrs that's checked in stage-1 itself, and thus causing the
> > abort in EL1.
>
> Sorry, my index of the ARM ARM was trashed when we were discussing this
> before.
>
> DDI0487L.a B2.2.6 describes the exact situation you encountered, where
> atomics are only guaranteed to work on Inner/Outer Shareable MT_NORMAL
> memory.
>
> What's a bit more explicit for other memory attribute aborts (like the
> one you've cited) is whether or not the implementation can generate the
> abort solely on the stage-1 attributes vs. the combined stage-1/stage-2
> attributes at the end of translation.
>
> Either way, let's correct the citation to point at the right section.
>
Ah yes, DDI0487L.a B2.2.6 seems to be very close. OTOH DDI0487L.a C3.2.6
explains why we see an abort in EL1. I can cite both to get a full
picture.

Thank you.
Raghavendra

> Thanks,
> Oliver
diff --git a/tools/testing/selftests/kvm/include/arm64/processor.h b/tools/testing/selftests/kvm/include/arm64/processor.h
index 691670bbe226..b337a606aac4 100644
--- a/tools/testing/selftests/kvm/include/arm64/processor.h
+++ b/tools/testing/selftests/kvm/include/arm64/processor.h
@@ -75,6 +75,7 @@
 #define PMD_TYPE_TABLE		BIT(1)
 #define PTE_TYPE_PAGE		BIT(1)
 
+#define PTE_SHARED		(UL(3) << 8) /* SH[1:0], inner shareable */
 #define PTE_AF			BIT(10)
 
 #define PTE_ADDR_MASK(page_shift)	GENMASK(47, (page_shift))
diff --git a/tools/testing/selftests/kvm/lib/arm64/processor.c b/tools/testing/selftests/kvm/lib/arm64/processor.c
index da5802c8a59c..9d69904cb608 100644
--- a/tools/testing/selftests/kvm/lib/arm64/processor.c
+++ b/tools/testing/selftests/kvm/lib/arm64/processor.c
@@ -172,6 +172,9 @@ static void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
 	}
 
 	pg_attr = PTE_AF | PTE_ATTRINDX(attr_idx) | PTE_TYPE_PAGE | PTE_VALID;
+	if (!use_lpa2_pte_format(vm))
+		pg_attr |= PTE_SHARED;
+
 	*ptep = addr_pte(vm, paddr, pg_attr);
 }
 
Atomic instructions such as 'ldset' over (global) variables in the guest
is observed to cause an EL1 data abort with FSC 0x35 (IMPLEMENTATION
DEFINED fault (Unsupported Exclusive or Atomic access)). The observation
was particularly apparent on Neoverse-N3.

According to DDI0487L.a C3.2.6 (Single-copy atomic 64-byte load/store),
it is implementation defined that a data abort with the mentioned FSC
is reported for the first stage of translation that provides an
inappropriate memory type. It's likely that the same rule also applies
to memory attribute mismatch. When the guest loads the memory location of
the variable that was already cached during the host userspace's copying
of the ELF into the memory, the core is likely running into a mismatch
of memory attrs that's checked in stage-1 itself, and thus causing the
abort in EL1.

Fix this by explicitly setting the memory attribute to Inner-Shareable
to avoid the mismatch, and by extension, the data abort.

Suggested-by: Oliver Upton <oupton@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
---
 tools/testing/selftests/kvm/include/arm64/processor.h | 1 +
 tools/testing/selftests/kvm/lib/arm64/processor.c     | 3 +++
 2 files changed, 4 insertions(+)
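
For readers following the fix, here is a minimal standalone sketch of how the page attribute word is composed once the patch is applied. It is not the selftest code itself. PTE_SHARED, PTE_AF and PTE_TYPE_PAGE mirror the values visible in the diff. PTE_VALID and PTE_ATTRINDX() are not shown in the diff and are assumed here from the architectural stage-1 descriptor layout. make_pg_attr() and its lpa2 flag are hypothetical stand-ins for the computation in _virt_pg_map() and for use_lpa2_pte_format(vm). The LPA2 exclusion is presumably there because, in that descriptor format, bits [9:8] carry output address bits rather than SH[1:0].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)           (UINT64_C(1) << (n))

/* Assumed from the architectural stage-1 descriptor layout (not in the diff). */
#define PTE_VALID        BIT(0)
#define PTE_ATTRINDX(i)  ((uint64_t)(i) << 2)  /* AttrIndx[2:0], bits [4:2] */

/* Values as they appear in the diff. */
#define PTE_TYPE_PAGE    BIT(1)
#define PTE_SHARED       (UINT64_C(3) << 8)    /* SH[1:0] = 0b11, inner shareable */
#define PTE_AF           BIT(10)

/*
 * Hypothetical helper modelling the attribute computation in
 * _virt_pg_map(). 'lpa2' stands in for use_lpa2_pte_format(vm): when it
 * is true, bits [9:8] are left clear because they are assumed not to
 * encode shareability in that descriptor format.
 */
static uint64_t make_pg_attr(unsigned int attr_idx, bool lpa2)
{
	uint64_t pg_attr;

	assert(attr_idx < 8);	/* AttrIndx is a 3-bit field */

	pg_attr = PTE_AF | PTE_ATTRINDX(attr_idx) | PTE_TYPE_PAGE | PTE_VALID;
	if (!lpa2)
		pg_attr |= PTE_SHARED;

	return pg_attr;
}
```

With an attribute index of 4 (picked purely for illustration), make_pg_attr(4, false) yields 0x713 and make_pg_attr(4, true) yields 0x413. The two values differ only in the SH[1:0] bits.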