[01/17] accel/tcg: Store section pointer in CPUTLBEntryFull

Message ID: 20250415081231.21186-2-jim.shu@sifive.com
State: New
Series: Implements RISC-V WorldGuard extension v0.4

Commit Message

Jim Shu April 15, 2025, 8:12 a.m. UTC
'CPUTLBEntryFull.xlat_section' stores the section index in its last 12
bits, which is used to find the correct section when the CPU accesses
an I/O region through the IOTLB (iotlb_to_section()).
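
For illustration, the packing works roughly like this. This is a
simplified sketch, not the exact QEMU code; the helper names are
hypothetical, and TARGET_PAGE_BITS is assumed to be 12:

    /* The page-aligned part of xlat_section holds the translation
     * offset; the low TARGET_PAGE_BITS hold the section index within
     * the AddressSpace. */
    static hwaddr iotlb_pack(hwaddr mr_offset, unsigned section_index)
    {
        assert((mr_offset & ~TARGET_PAGE_MASK) == 0); /* page-aligned */
        assert(section_index < TARGET_PAGE_SIZE);     /* fits in a page */
        return mr_offset | section_index;
    }

    static unsigned iotlb_section_index(hwaddr xlat_section)
    {
        return xlat_section & ~TARGET_PAGE_MASK;
    }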

However, a section index is only unique within a single AddressSpace.
If the address space translation goes through an IOMMUMemoryRegion, it
may return a section from a different AddressSpace. The
iotlb_to_section() API only looks up sections in the CPU's
AddressSpace, so it cannot find sections belonging to other
AddressSpaces. As a result, iotlb_to_section() returns the wrong
section and QEMU performs the wrong load/store access.
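
The lookup that goes wrong is roughly the following (simplified from
QEMU's physmem.c; details vary across versions). The key point is that
the dispatch table consulted is always the one belonging to the CPU's
own AddressSpace:

    MemoryRegionSection *iotlb_to_section(CPUState *cpu, hwaddr index,
                                          MemTxAttrs attrs)
    {
        int asidx = cpu_asidx_from_attrs(cpu, attrs);
        /* Always the CPU's AddressSpace: a section index minted by an
         * IOMMU translation into another AddressSpace indexes the
         * wrong sections array here. */
        AddressSpaceDispatch *d = cpu->cpu_ases[asidx].memory_dispatch;
        int section_index = index & ~TARGET_PAGE_MASK;

        assert(section_index < d->map.sections_nb);
        return d->map.sections + section_index;
    }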

To fix this bug, store the complete MemoryRegionSection pointer in
CPUTLBEntryFull instead of only the section index.

This bug occurs only when
(1) an IOMMUMemoryRegion is in the path of a CPU access, and
(2) the IOMMUMemoryRegion returns a different target_as, and the
resulting section is in an I/O region.

Common IOMMU devices don't have this issue, since they sit only in the
path of DMA accesses. Currently, the bug is triggered only by the ARM
MPC device (hw/misc/tz-mpc.c), which returns 'blocked_io_as' to
emulate blocked-access handling; a sketch of such a translate callback
follows. The upcoming RISC-V wgChecker device is also affected by this
bug.

Signed-off-by: Jim Shu <jim.shu@sifive.com>
---
 accel/tcg/cputlb.c    | 19 +++++++++----------
 include/hw/core/cpu.h |  3 +++
 2 files changed, 12 insertions(+), 10 deletions(-)

Comments

Ilya Leoshkevich April 15, 2025, 9:12 a.m. UTC | #1
On 2025-04-15 10:12, Jim Shu wrote:
> [...]

Does this mean that there can be more than 4k sections now and the
assertion in phys_section_add() can be removed?
Jim Shu April 15, 2025, 4:38 p.m. UTC | #2
Hi Ilya,

My patch removes the use of `iotlb_to_section()`, so the section index
in the last 12 bits of `xlat_section` is no longer used. (Please
correct me if I'm wrong!)
Based on the comment in `phys_section_add()` and the commit log [1], I
believe we can remove the assertion and support more than 4k sections.
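
For reference, the assertion in question looks roughly like this
(paraphrased from QEMU's physmem.c; the comment there ties the 4k
limit to exactly this packing):

    static uint16_t phys_section_add(PhysPageMap *map,
                                     MemoryRegionSection *section)
    {
        /* The physical section number is ORed with a page-aligned
         * pointer to produce the iotlb entries, so it must never
         * overflow into the page-aligned value. */
        assert(map->sections_nb < TARGET_PAGE_SIZE);
        /* ... grow map->sections and return the new index ... */
    }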

I think my commit should also remove the `iotlb_to_section()` function
and rename `xlat_section` to `xlat`.
I will fix this in the next version of the series.

[1] https://github.com/qemu/qemu/commit/68f3f65b09a1ce8c82fac17911ffc3bb6031ebe4

Jim

On Tue, Apr 15, 2025 at 5:13 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> [...]
>
> Does this mean that there can be more than 4k sections now and the
> assertion in phys_section_add() can be removed?

Patch

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index fb22048876..581611d82d 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1150,6 +1150,7 @@  void tlb_set_page_full(CPUState *cpu, int mmu_idx,
     desc->fulltlb[index] = *full;
     full = &desc->fulltlb[index];
     full->xlat_section = iotlb - addr_page;
+    full->section = section;
     full->phys_addr = paddr_page;
 
     /* Now calculate the new entry */
@@ -1265,14 +1266,14 @@  static inline void cpu_unaligned_access(CPUState *cpu, vaddr addr,
 }
 
 static MemoryRegionSection *
-io_prepare(hwaddr *out_offset, CPUState *cpu, hwaddr xlat,
+io_prepare(hwaddr *out_offset, CPUState *cpu, CPUTLBEntryFull *full,
            MemTxAttrs attrs, vaddr addr, uintptr_t retaddr)
 {
     MemoryRegionSection *section;
     hwaddr mr_offset;
 
-    section = iotlb_to_section(cpu, xlat, attrs);
-    mr_offset = (xlat & TARGET_PAGE_MASK) + addr;
+    section = full->section;
+    mr_offset = (full->xlat_section & TARGET_PAGE_MASK) + addr;
     cpu->mem_io_pc = retaddr;
     if (!cpu->neg.can_do_io) {
         cpu_io_recompile(cpu, retaddr);
@@ -1588,9 +1589,7 @@  bool tlb_plugin_lookup(CPUState *cpu, vaddr addr, int mmu_idx,
 
     /* We must have an iotlb entry for MMIO */
     if (tlb_addr & TLB_MMIO) {
-        MemoryRegionSection *section =
-            iotlb_to_section(cpu, full->xlat_section & ~TARGET_PAGE_MASK,
-                             full->attrs);
+        MemoryRegionSection *section = full->section;
         data->is_io = true;
         data->mr = section->mr;
     } else {
@@ -1980,7 +1979,7 @@  static uint64_t do_ld_mmio_beN(CPUState *cpu, CPUTLBEntryFull *full,
     tcg_debug_assert(size > 0 && size <= 8);
 
     attrs = full->attrs;
-    section = io_prepare(&mr_offset, cpu, full->xlat_section, attrs, addr, ra);
+    section = io_prepare(&mr_offset, cpu, full, attrs, addr, ra);
     mr = section->mr;
 
     BQL_LOCK_GUARD();
@@ -2001,7 +2000,7 @@  static Int128 do_ld16_mmio_beN(CPUState *cpu, CPUTLBEntryFull *full,
     tcg_debug_assert(size > 8 && size <= 16);
 
     attrs = full->attrs;
-    section = io_prepare(&mr_offset, cpu, full->xlat_section, attrs, addr, ra);
+    section = io_prepare(&mr_offset, cpu, full, attrs, addr, ra);
     mr = section->mr;
 
     BQL_LOCK_GUARD();
@@ -2521,7 +2520,7 @@  static uint64_t do_st_mmio_leN(CPUState *cpu, CPUTLBEntryFull *full,
     tcg_debug_assert(size > 0 && size <= 8);
 
     attrs = full->attrs;
-    section = io_prepare(&mr_offset, cpu, full->xlat_section, attrs, addr, ra);
+    section = io_prepare(&mr_offset, cpu, full, attrs, addr, ra);
     mr = section->mr;
 
     BQL_LOCK_GUARD();
@@ -2541,7 +2540,7 @@  static uint64_t do_st16_mmio_leN(CPUState *cpu, CPUTLBEntryFull *full,
     tcg_debug_assert(size > 8 && size <= 16);
 
     attrs = full->attrs;
-    section = io_prepare(&mr_offset, cpu, full->xlat_section, attrs, addr, ra);
+    section = io_prepare(&mr_offset, cpu, full, attrs, addr, ra);
     mr = section->mr;
 
     BQL_LOCK_GUARD();
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index abd8764e83..8759e30fcc 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -218,6 +218,9 @@  struct CPUTLBEntryFull {
      */
     hwaddr xlat_section;
 
+    /* @section contains physical section. */
+    MemoryRegionSection *section;
+
     /*
      * @phys_addr contains the physical address in the address space
      * given by cpu_asidx_from_attrs(cpu, @attrs).