[v3,06/20] accel/tcg: Use the alignment test in tlb_fill_align

Message ID: 20241009000453.315652-7-richard.henderson@linaro.org (mailing list archive)
State: New
Series: accel/tcg: Introduce tlb_fill_align hook

Commit Message

Richard Henderson Oct. 9, 2024, 12:04 a.m. UTC
When we have a tlb miss, defer the alignment check to
the new tlb_fill_align hook.  Move the existing alignment
check so that we only perform it with a tlb hit.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 88 ++++++++++++++++++++++++----------------------
 1 file changed, 45 insertions(+), 43 deletions(-)
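
For reference, the guest alignment test that moves behind the hook is just a
mask check derived from the MemOp. Below is a minimal, illustrative sketch of
that check, using only the helpers that appear at the call sites in the diff
(memop_alignment_bits(), cpu_unaligned_access()); the tlb_fill_align hook
prototype and the per-target implementations that now perform this test on a
TLB miss are introduced elsewhere in this series and are not reproduced here.

/*
 * Illustrative sketch only, not part of the patch: the guest alignment
 * test, expressed with the helpers visible in the diff below.
 */
static void example_check_alignment(CPUState *cpu, vaddr addr, MemOp memop,
                                    MMUAccessType access_type, int mmu_idx,
                                    uintptr_t ra)
{
    int a_bits = memop_alignment_bits(memop);

    if (addr & ((1 << a_bits) - 1)) {
        /* Raises the guest alignment exception and does not return. */
        cpu_unaligned_access(cpu, addr, access_type, mmu_idx, ra);
    }
}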

Comments

Helge Deller Oct. 9, 2024, 9:29 p.m. UTC | #1
On 10/9/24 02:04, Richard Henderson wrote:
> When we have a tlb miss, defer the alignment check to
> the new tlb_fill_align hook.  Move the existing alignment
> check so that we only perform it with a tlb hit.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Helge Deller <deller@gmx.de>
Peter Maydell Oct. 10, 2024, 10:44 a.m. UTC | #2
On Wed, 9 Oct 2024 at 01:05, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> When we have a tlb miss, defer the alignment check to
> the new tlb_fill_align hook.  Move the existing alignment
> check so that we only perform it with a tlb hit.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


> @@ -1754,8 +1767,8 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>           * Lookup both pages, recognizing exceptions from either.  If the
>           * second lookup potentially resized, refresh first CPUTLBEntryFull.
>           */
> -        mmu_lookup1(cpu, &l->page[0], l->mmu_idx, type, ra);
> -        if (mmu_lookup1(cpu, &l->page[1], l->mmu_idx, type, ra)) {
> +        mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
> +        if (mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra)) {

Is 0 the right thing here ? Given that alignment-requirements can
differ per-page, what happens for the case of an unaligned access
that crosses a page boundary where the first page is "memory that
doesn't care about alignment" and the second page is "memory that
does enforce alignment" ?

For Arm this is moot, because an access that crosses a page
boundary into something with different memory attributes is
CONSTRAINED UNPREDICTABLE (per Arm ARM rev K.a B2.15.3), but
maybe other architectures are less flexible.

Also, the comment does say "recognizing exceptions from either",
and we aren't going to do that for alignment issues in page 2,
so if we're OK with ignoring this we should update the comment.

thanks
-- PMM
Richard Henderson Oct. 11, 2024, 3:22 p.m. UTC | #3
On 10/10/24 03:44, Peter Maydell wrote:
> On Wed, 9 Oct 2024 at 01:05, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> When we have a tlb miss, defer the alignment check to
>> the new tlb_fill_align hook.  Move the existing alignment
>> check so that we only perform it with a tlb hit.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> 
> 
>> @@ -1754,8 +1767,8 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>>            * Lookup both pages, recognizing exceptions from either.  If the
>>            * second lookup potentially resized, refresh first CPUTLBEntryFull.
>>            */
>> -        mmu_lookup1(cpu, &l->page[0], l->mmu_idx, type, ra);
>> -        if (mmu_lookup1(cpu, &l->page[1], l->mmu_idx, type, ra)) {
>> +        mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
>> +        if (mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra)) {
> 
> Is 0 the right thing here ? Given that alignment-requirements can
> differ per-page, what happens for the case of an unaligned access
> that crosses a page boundary where the first page is "memory that
> doesn't care about alignment" and the second page is "memory that
> does enforce alignment" ?

I can't think of anything else that makes sense.  The access to the second page is 
"aligned" in the sense that it begins at offset 0.

Anyway, alignment as a page property is unique to Arm and...


> For Arm this is moot, because an access that crosses a page
> boundary into something with different memory attributes is
> CONSTRAINED UNPREDICTABLE (per Arm ARM rev K.a B2.15.3), but
> maybe other architectures are less flexible.

... as you say.

> Also, the comment does say "recognizing exceptions from either",
> and we aren't going to do that for alignment issues in page 2,
> so if we're OK with ignoring this we should update the comment.

The second page can still raise page faults.  How would you re-word this?


r~
Peter Maydell Oct. 11, 2024, 3:36 p.m. UTC | #4
On Fri, 11 Oct 2024 at 16:22, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 10/10/24 03:44, Peter Maydell wrote:
> > On Wed, 9 Oct 2024 at 01:05, Richard Henderson
> > <richard.henderson@linaro.org> wrote:
> >>
> >> When we have a tlb miss, defer the alignment check to
> >> the new tlb_fill_align hook.  Move the existing alignment
> >> check so that we only perform it with a tlb hit.
> >>
> >> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> >
> >
> >> @@ -1754,8 +1767,8 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> >>            * Lookup both pages, recognizing exceptions from either.  If the
> >>            * second lookup potentially resized, refresh first CPUTLBEntryFull.
> >>            */
> >> -        mmu_lookup1(cpu, &l->page[0], l->mmu_idx, type, ra);
> >> -        if (mmu_lookup1(cpu, &l->page[1], l->mmu_idx, type, ra)) {
> >> +        mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
> >> +        if (mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra)) {
> >
> > Is 0 the right thing here ? Given that alignment-requirements can
> > differ per-page, what happens for the case of an unaligned access
> > that crosses a page boundary where the first page is "memory that
> > doesn't care about alignment" and the second page is "memory that
> > does enforce alignment" ?
>
> I can't think of anything else that makes sense.  The access to the second page is
> "aligned" in the sense that it begins at offset 0.

Ah, yes, you don't get to see how unaligned the access
is in the second mmu lookup.
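
(Concretely, with made-up numbers and 4 KiB pages: an 8-byte access at
0x1ffc crosses the boundary, and mmu_lookup computes
l->page[1].addr = (addr + l->page[0].size - 1) & TARGET_PAGE_MASK, i.e.
0x2000, so the second half of the access starts at page offset 0 and has
no misalignment left to report.)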

> Anyway, alignment as a page property is unique to Arm and...
>
>
> > For Arm this is moot, because an access that crosses a page
> > boundary into something with different memory attributes is
> > CONSTRAINED UNPREDICTABLE (per Arm ARM rev K.a B2.15.3), but
> > maybe other architectures are less flexible.
>
> ... as you say.
>
> > Also, the comment does say "recognizing exceptions from either",
> > and we aren't going to do that for alignment issues in page 2,
> > so if we're OK with ignoring this we should update the comment.
>
> The second page can still raise page faults.  How would you re-word this?

Mmm, so there's also the case of an unaligned access across the
page boundary where page 1 is present and page 2 is not -- do
we prioritize the alignment fault or the page fault? This
patch will always prioritize the alignment fault. For Arm
that's fine, as we architecturally should do the checks in
that order.

-- PMM
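
To make the ordering above concrete, a rough condensation of the miss and hit
paths of mmu_lookup1 after this patch (paraphrased from the diff below;
"tlb_miss" stands in for the real victim-TLB test, and the TLB_CHECK_ALIGNED
handling is omitted):

    if (tlb_miss) {
        /* Miss: the tlb_fill_align hook receives the MemOp, so the target
         * decides how the alignment check is ordered against translation. */
        tlb_fill_align(cpu, addr, access_type, mmu_idx,
                       memop, data->size, false, ra);
    } else {
        /* Hit: only the guest alignment check remains to be done here. */
        int a_bits = memop_alignment_bits(memop);
        if (addr & ((1 << a_bits) - 1)) {
            cpu_unaligned_access(cpu, addr, access_type, mmu_idx, ra);
        }
    }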

Patch

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index d72f454e9e..b76a4eac4e 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1620,6 +1620,7 @@  typedef struct MMULookupLocals {
  * mmu_lookup1: translate one page
  * @cpu: generic cpu state
  * @data: lookup parameters
+ * @memop: memory operation for the access, or 0
  * @mmu_idx: virtual address context
  * @access_type: load/store/code
  * @ra: return address into tcg generated code, or 0
@@ -1629,7 +1630,7 @@  typedef struct MMULookupLocals {
  * tlb_fill_align will longjmp out.  Return true if the softmmu tlb for
  * @mmu_idx may have resized.
  */
-static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data,
+static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
                         int mmu_idx, MMUAccessType access_type, uintptr_t ra)
 {
     vaddr addr = data->addr;
@@ -1645,7 +1646,7 @@  static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data,
         if (!victim_tlb_hit(cpu, mmu_idx, index, access_type,
                             addr & TARGET_PAGE_MASK)) {
             tlb_fill_align(cpu, addr, access_type, mmu_idx,
-                           0, data->size, false, ra);
+                           memop, data->size, false, ra);
             maybe_resized = true;
             index = tlb_index(cpu, mmu_idx, addr);
             entry = tlb_entry(cpu, mmu_idx, addr);
@@ -1657,6 +1658,25 @@  static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data,
     flags = tlb_addr & (TLB_FLAGS_MASK & ~TLB_FORCE_SLOW);
     flags |= full->slow_flags[access_type];
 
+    if (likely(!maybe_resized)) {
+        /* Alignment has not been checked by tlb_fill_align. */
+        int a_bits = memop_alignment_bits(memop);
+
+        /*
+         * This alignment check differs from the one above, in that this is
+         * based on the atomicity of the operation. The intended use case is
+         * the ARM memory type field of each PTE, where access to pages with
+         * Device memory type require alignment.
+         */
+        if (unlikely(flags & TLB_CHECK_ALIGNED)) {
+            int at_bits = memop_atomicity_bits(memop);
+            a_bits = MAX(a_bits, at_bits);
+        }
+        if (unlikely(addr & ((1 << a_bits) - 1))) {
+            cpu_unaligned_access(cpu, addr, access_type, mmu_idx, ra);
+        }
+    }
+
     data->full = full;
     data->flags = flags;
     /* Compute haddr speculatively; depending on flags it might be invalid. */
@@ -1713,7 +1733,6 @@  static void mmu_watch_or_dirty(CPUState *cpu, MMULookupPageData *data,
 static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
                        uintptr_t ra, MMUAccessType type, MMULookupLocals *l)
 {
-    unsigned a_bits;
     bool crosspage;
     int flags;
 
@@ -1722,12 +1741,6 @@  static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
 
     tcg_debug_assert(l->mmu_idx < NB_MMU_MODES);
 
-    /* Handle CPU specific unaligned behaviour */
-    a_bits = memop_alignment_bits(l->memop);
-    if (addr & ((1 << a_bits) - 1)) {
-        cpu_unaligned_access(cpu, addr, type, l->mmu_idx, ra);
-    }
-
     l->page[0].addr = addr;
     l->page[0].size = memop_size(l->memop);
     l->page[1].addr = (addr + l->page[0].size - 1) & TARGET_PAGE_MASK;
@@ -1735,7 +1748,7 @@  static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
     crosspage = (addr ^ l->page[1].addr) & TARGET_PAGE_MASK;
 
     if (likely(!crosspage)) {
-        mmu_lookup1(cpu, &l->page[0], l->mmu_idx, type, ra);
+        mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
 
         flags = l->page[0].flags;
         if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
@@ -1754,8 +1767,8 @@  static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
          * Lookup both pages, recognizing exceptions from either.  If the
          * second lookup potentially resized, refresh first CPUTLBEntryFull.
          */
-        mmu_lookup1(cpu, &l->page[0], l->mmu_idx, type, ra);
-        if (mmu_lookup1(cpu, &l->page[1], l->mmu_idx, type, ra)) {
+        mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
+        if (mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra)) {
             uintptr_t index = tlb_index(cpu, l->mmu_idx, addr);
             l->page[0].full = &cpu->neg.tlb.d[l->mmu_idx].fulltlb[index];
         }
@@ -1774,19 +1787,6 @@  static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
         tcg_debug_assert((flags & TLB_BSWAP) == 0);
     }
 
-    /*
-     * This alignment check differs from the one above, in that this is
-     * based on the atomicity of the operation. The intended use case is
-     * the ARM memory type field of each PTE, where access to pages with
-     * Device memory type require alignment.
-     */
-    if (unlikely(flags & TLB_CHECK_ALIGNED)) {
-        a_bits = memop_atomicity_bits(l->memop);
-        if (addr & ((1 << a_bits) - 1)) {
-            cpu_unaligned_access(cpu, addr, type, l->mmu_idx, ra);
-        }
-    }
-
     return crosspage;
 }
 
@@ -1799,34 +1799,18 @@  static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
 {
     uintptr_t mmu_idx = get_mmuidx(oi);
     MemOp mop = get_memop(oi);
-    int a_bits = memop_alignment_bits(mop);
     uintptr_t index;
     CPUTLBEntry *tlbe;
     vaddr tlb_addr;
     void *hostaddr;
     CPUTLBEntryFull *full;
+    bool did_tlb_fill = false;
 
     tcg_debug_assert(mmu_idx < NB_MMU_MODES);
 
     /* Adjust the given return address.  */
     retaddr -= GETPC_ADJ;
 
-    /* Enforce guest required alignment.  */
-    if (unlikely(a_bits > 0 && (addr & ((1 << a_bits) - 1)))) {
-        /* ??? Maybe indicate atomic op to cpu_unaligned_access */
-        cpu_unaligned_access(cpu, addr, MMU_DATA_STORE,
-                             mmu_idx, retaddr);
-    }
-
-    /* Enforce qemu required alignment.  */
-    if (unlikely(addr & (size - 1))) {
-        /* We get here if guest alignment was not requested,
-           or was not enforced by cpu_unaligned_access above.
-           We might widen the access and emulate, but for now
-           mark an exception and exit the cpu loop.  */
-        goto stop_the_world;
-    }
-
     index = tlb_index(cpu, mmu_idx, addr);
     tlbe = tlb_entry(cpu, mmu_idx, addr);
 
@@ -1836,7 +1820,8 @@  static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
         if (!victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE,
                             addr & TARGET_PAGE_MASK)) {
             tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
-                           0, size, false, retaddr);
+                           mop, size, false, retaddr);
+            did_tlb_fill = true;
             index = tlb_index(cpu, mmu_idx, addr);
             tlbe = tlb_entry(cpu, mmu_idx, addr);
         }
@@ -1859,6 +1844,23 @@  static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
          */
         g_assert_not_reached();
     }
+
+    /* Enforce guest required alignment, if not handled by tlb_fill_align. */
+    if (!did_tlb_fill && (addr & ((1 << memop_alignment_bits(mop)) - 1))) {
+        cpu_unaligned_access(cpu, addr, MMU_DATA_STORE, mmu_idx, retaddr);
+    }
+
+    /* Enforce qemu required alignment.  */
+    if (unlikely(addr & (size - 1))) {
+        /*
+         * We get here if guest alignment was not requested, or was not
+         * enforced by cpu_unaligned_access or tlb_fill_align above.
+         * We might widen the access and emulate, but for now
+         * mark an exception and exit the cpu loop.
+         */
+        goto stop_the_world;
+    }
+
     /* Collect tlb flags for read. */
     tlb_addr |= tlbe->addr_read;