[RFC,0/3] translation performance improvements

Message ID 20250331155423.619451-1-npiggin@gmail.com (mailing list archive)

Message

Nicholas Piggin March 31, 2025, 3:54 p.m. UTC
I've been struggling with these couple of performance issues with
TB coherency. I almost thought deferring flush to icbi would be
workable, but a note in the docs says that exceptions require TB
to be coherent... I don't know what requires that, maybe it could
be worked around?

Another thing is PowerVM runtime firmware runs with MMU disabled
for ifetch. This means a fixed linear map with no memory protection.
Is it possible we can enable goto tb across TARGET_PAGE_SIZE for
ifetches in this mode?

Thanks,
Nick

Nicholas Piggin (3):
  accel/tcg: Option to permit incoherent translation block cache vs
    stores
  target/ppc: define TARGET_HAS_LAZY_ICACHE
  target/ppc: Allow goto-tb on fixed real mode translations

 accel/tcg/tb-internal.h  | 10 ++++++
 include/exec/tb-flush.h  |  3 ++
 target/ppc/cpu.h         | 16 +++++++++
 accel/tcg/cputlb.c       | 15 +++++++--
 accel/tcg/tb-maint.c     | 73 ++++++++++++++++++++++++++++++++++++++++
 target/ppc/mem_helper.c  |  2 ++
 target/ppc/translate.c   | 21 ++++++++++++
 system/memory_ldst.c.inc |  2 +-
 8 files changed, 138 insertions(+), 4 deletions(-)

Comments

Richard Henderson March 31, 2025, 7:40 p.m. UTC | #1
On 3/31/25 10:54, Nicholas Piggin wrote:
> I've been struggling with these couple of performance issues with
> TB coherency. I almost thought deferring flush to icbi would be
> workable, but a note in the docs says that exceptions require TB
> to be coherent... I don't know what requires that, maybe it could
> be worked around?

Which note?  Anyway, qemu implements accurate tb invalidation for x86 and s390x, which 
means we don't really need to do anything special for other targets.

Compare aarch64 "IC_IVAU" which (at least for system mode) is implemented as a nop.

> Another thing is PowerVM runtime firmware runs with MMU disabled
> for ifetch. This means a fixed linear map with no memory protection.
> Is it possible we can enable goto tb across TARGET_PAGE_SIZE for
> ifetches in this mode?

No, there are several things that assume nothing jumps across TARGET_PAGE_SIZE, including 
breakpoints.
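
For illustration, the same-page restriction being discussed can be sketched as follows. This is a standalone sketch, not QEMU code: `PAGE_BITS`, `page_mask()` and `same_page()` are hypothetical names, and the 4 KiB page size is an assumption standing in for TARGET_PAGE_SIZE. It only shows the idea that a direct TB-to-TB link is taken when the branch destination lies on the same guest page as the start of the current TB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for illustration only: 4 KiB pages. */
#define PAGE_BITS 12u

/* Mask that keeps the page number and clears the in-page offset. */
static uint64_t page_mask(unsigned page_bits)
{
    assert(page_bits > 0 && page_bits < 64);
    return ~((UINT64_C(1) << page_bits) - 1);
}

/*
 * Hypothetical helper: true when dest is on the same guest page as
 * tb_start, i.e. when a direct TB-to-TB link would stay within one page.
 */
static bool same_page(uint64_t tb_start, uint64_t dest)
{
    return ((tb_start ^ dest) & page_mask(PAGE_BITS)) == 0;
}
```

With these assumptions, a branch from 0x1ffc to 0x2000 crosses a page and would fall back to a slower exit, while a branch from 0x1000 to 0x1ffc would not.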


r~

Nicholas Piggin April 1, 2025, 8:33 a.m. UTC | #2
On Tue Apr 1, 2025 at 5:40 AM AEST, Richard Henderson wrote:
> On 3/31/25 10:54, Nicholas Piggin wrote:
>> I've been struggling with these couple of performance issues with
>> TB coherency. I almost thought deferring flush to icbi would be
>> workable, but a note in the docs says that exceptions require TB
>> to be coherent... I don't know what requires that, maybe it could
>> be worked around?
>
> Which note?  Anyway, qemu implements accurate tb invalidation for x86 and s390x, which 
> means we don't really need to do anything special for other targets.

In docs/devel/tcg.rst:

  On RISC targets, correctly written software uses memory barriers and
  cache flushes, so some of the protection above would not be
  necessary. However, QEMU still requires that the generated code always
  matches the target instructions in memory in order to handle
  exceptions correctly.
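
To make the distinction concrete, here is a toy model of the two policies: invalidating translations on every store, versus deferring invalidation until an icbi-like operation. It is a minimal sketch and not how QEMU is implemented: every name (`toy_tb`, `toy_cache`, `toy_store`, `toy_icbi`) is hypothetical, and the 4 KiB page size is an assumption. In the lazy mode, the window between the store and the icbi is exactly when the cached translation no longer matches memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS 12u   /* assumed 4 KiB pages */
#define MAX_TBS   8u    /* toy cache capacity */

/* Toy stand-in for a translation block: only what the sketch needs. */
struct toy_tb {
    uint64_t page;   /* guest page number the code was translated from */
    bool valid;
};

struct toy_cache {
    struct toy_tb tbs[MAX_TBS];
    size_t count;
    bool lazy;       /* true: stores do not invalidate, only icbi does */
};

/* Record a translation for the code at guest address pc. */
static void toy_translate(struct toy_cache *c, uint64_t pc)
{
    assert(c->count < MAX_TBS);
    c->tbs[c->count].page = pc >> PAGE_BITS;
    c->tbs[c->count].valid = true;
    c->count++;
}

/* Drop every valid translation made from the page containing addr. */
static size_t toy_invalidate_page(struct toy_cache *c, uint64_t addr)
{
    size_t dropped = 0;
    for (size_t i = 0; i < c->count; i++) {
        if (c->tbs[i].valid && c->tbs[i].page == (addr >> PAGE_BITS)) {
            c->tbs[i].valid = false;
            dropped++;
        }
    }
    return dropped;
}

/* Guest store: invalidates immediately only in coherent mode. */
static size_t toy_store(struct toy_cache *c, uint64_t addr)
{
    return c->lazy ? 0 : toy_invalidate_page(c, addr);
}

/* Guest icache invalidate (icbi-like): always invalidates. */
static size_t toy_icbi(struct toy_cache *c, uint64_t addr)
{
    return toy_invalidate_page(c, addr);
}
```

Each function returns the number of translations it dropped, so a caller can compare the two modes: a coherent cache drops stale entries in `toy_store()`, while a lazy cache keeps them until `toy_icbi()` is called.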

>
> Compare aarch64 "IC_IVAU" which (at least for system mode) is implemented as a nop.

I'll take a look at it.


>> Another thing is PowerVM runtime firmware runs with MMU disabled
>> for ifetch. This means a fixed linear map with no memory protection.
>> Is it possible we can enable goto tb across TARGET_PAGE_SIZE for
>> ifetches in this mode?
>
> No, there are several things that assume nothing jumps across TARGET_PAGE_SIZE, including 
> breakpoints.

I see. It did actually work and run fine, so I wonder how much effort
it would take to cater for these issues. I guess for this rather niche
"real mode" it may not be worth bending over backward.

Thanks,
Nick