| Message ID | 33748BB7-E617-4661-BDE3-5D29780FC9FF@wdc.com |
|---|---|
| State | New, archived |
| Series | Another CXL/MMIO tcg tlb corner case |
Jørgen Hansen <Jorgen.Hansen@wdc.com> writes:

> Hi,
>
> While doing some testing using numactl-based interleaving of application memory
> across regular memory and CXL-based memory using QEMU with tcg, I ran into an
> issue similar to what we saw a while back - link to old issue:
> https://lore.kernel.org/qemu-devel/CAFEAcA_a_AyQ=Epz3_+CheAT8Crsk9mOu894wbNW_FywamkZiw@mail.gmail.com/#t.
>
> When running:
>
> numactl --interleave 0,1 ./cachebench …
>
> I hit the following:
>
> numactl --interleave 0,1 ./cachebench --json_test_config ../test_configs/hit_ratio/graph_cache_follower_assocs/config.json
> qemu: fatal: cpu_io_recompile: could not find TB for pc=0x7fffc3926dd4

Ok so this will be because we (by design) don't cache TBs for MMIO
regions. But as you say:

> Looking at the tb being executed, it looks like it is a single instruction tb,
> so with my _very_ limited understanding of tcg, it shouldn't be necessary to
> do the IO recompile:
>
> (gdb) up 13
>
> #13 0x0000555555ca13ac in cpu_loop_exec_tb (tb_exit=0x7ffff49ff6d8, last_tb=<synthetic pointer>, pc=<optimized out>, tb=0x7fffc3926cc0 <code_gen_buffer+464678035>, cpu=0x5555578c19c0) at ../accel/tcg/cpu-exec.c:904
> 904         tb = cpu_tb_exec(cpu, tb, tb_exit);
> (gdb) print *tb
> $1 = {pc = 0, cs_base = 0, flags = 415285939, cflags = 4278321152, size = 7, icount = 1,
>   tc = {ptr = 0x7fffc3926d80 <code_gen_buffer+464678227>, size = 176}, page_next = {0, 0},
>   page_addr = {18446744073709551615, 18446744073709551615}, jmp_lock = {value = 0},
>   jmp_reset_offset = {65535, 65535}, jmp_insn_offset = {65535, 65535}, jmp_target_addr = {0, 0},
>   jmp_list_head = 140736474540928, jmp_list_next = {0, 0}, jmp_dest = {0, 0}}

With an icount of 1 we by definition can do io. This is done in the
translator_loop:

        /*
         * Disassemble one instruction. The translate_insn hook should
         * update db->pc_next and db->is_jmp to indicate what should be
         * done next -- either exiting this loop or locate the start of
         * the next instruction.
         */
        if (db->num_insns == db->max_insns) {
            /* Accept I/O on the last instruction. */
            set_can_do_io(db, true);
        }

and we see later on:

    tb->icount = db->num_insns;

So I guess we must have gone into the translator loop expecting to
translate more? (Side note: having an int8_t type for the saved bool
value seems weird.)

Could you confirm by doing something like:

    if (tb->icount == 1 && db->max_insns > 1) {
        fprintf(stderr, "ended up doing one insn (%d, %d)",
                db->max_insns, db->saved_can_do_io);
    }

after we set tb->icount?

> If the application is run entirely out of MMIO memory, things work fine (the
> previous patches related to this are in Jonathan's branch), so one thought is that
> it is related to having the code on a mix of regular and CXL memory. Since we
> previously had issues with code crossing page boundaries where only the second
> page is MMIO, I tried out the following change to the fix introduced for that
> issue, thinking that reverting to the slow path in the middle of the translation
> might not correctly update can_do_io:
>
> diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
> index 38c34009a5..db6ea360e0 100644
> --- a/accel/tcg/translator.c
> +++ b/accel/tcg/translator.c
> @@ -258,6 +258,7 @@ static void *translator_access(CPUArchState *env, DisasContextBase *db,
>      if (unlikely(new_page1 == -1)) {
>          tb_unlock_pages(tb);
>          tb_set_page_addr0(tb, -1);
> +        set_can_do_io(db, true);
>          return NULL;
>      }
>
> With that change, things work fine. Not saying that this is a valid fix for the
> issue, but just trying to provide as much information as possible :)

I can see why this works; my worry is whether we address all the early
exit cases here.

> Thanks,
> Jorgen
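Putting the two quoted fragments together, the suggested diagnostic would sit roughly like this at the end of translator_loop(), right after icount is recorded. This is only a sketch assembled from the lines quoted above; the surrounding code in accel/tcg/translator.c is elided:

```c
    /* End of translator_loop() in accel/tcg/translator.c (sketch). */
    tb->icount = db->num_insns;

    /*
     * Suggested diagnostic: flag TBs that end up with a single
     * instruction even though the loop expected to translate more,
     * together with the saved can_do_io state.
     */
    if (tb->icount == 1 && db->max_insns > 1) {
        fprintf(stderr, "ended up doing one insn (%d, %d)",
                db->max_insns, db->saved_can_do_io);
    }
```

The format string has no trailing newline, which is why the repeated messages run together in the output shown in the reply below.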
> On 15 Mar 2024, at 13.25, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Jørgen Hansen <Jorgen.Hansen@wdc.com> writes:
>
>> Hi,
>>
>> While doing some testing using numactl-based interleaving of application memory
>> across regular memory and CXL-based memory using QEMU with tcg, I ran into an
>> issue similar to what we saw a while back - link to old issue:
>> https://lore.kernel.org/qemu-devel/CAFEAcA_a_AyQ=Epz3_+CheAT8Crsk9mOu894wbNW_FywamkZiw@mail.gmail.com/#t.
>>
>> When running:
>>
>> numactl --interleave 0,1 ./cachebench …
>>
>> I hit the following:
>>
>> numactl --interleave 0,1 ./cachebench --json_test_config ../test_configs/hit_ratio/graph_cache_follower_assocs/config.json
>> qemu: fatal: cpu_io_recompile: could not find TB for pc=0x7fffc3926dd4
>
> Ok so this will be because we (by design) don't cache TBs for MMIO
> regions. But as you say:
>
>> Looking at the tb being executed, it looks like it is a single instruction tb,
>> so with my _very_ limited understanding of tcg, it shouldn't be necessary to
>> do the IO recompile:
>>
>> (gdb) up 13
>>
>> #13 0x0000555555ca13ac in cpu_loop_exec_tb (tb_exit=0x7ffff49ff6d8, last_tb=<synthetic pointer>, pc=<optimized out>, tb=0x7fffc3926cc0 <code_gen_buffer+464678035>, cpu=0x5555578c19c0) at ../accel/tcg/cpu-exec.c:904
>> 904         tb = cpu_tb_exec(cpu, tb, tb_exit);
>> (gdb) print *tb
>> $1 = {pc = 0, cs_base = 0, flags = 415285939, cflags = 4278321152, size = 7, icount = 1,
>>   tc = {ptr = 0x7fffc3926d80 <code_gen_buffer+464678227>, size = 176}, page_next = {0, 0},
>>   page_addr = {18446744073709551615, 18446744073709551615}, jmp_lock = {value = 0},
>>   jmp_reset_offset = {65535, 65535}, jmp_insn_offset = {65535, 65535}, jmp_target_addr = {0, 0},
>>   jmp_list_head = 140736474540928, jmp_list_next = {0, 0}, jmp_dest = {0, 0}}
>
> With an icount of 1 we by definition can do io. This is done in the
> translator_loop:
>
>         /*
>          * Disassemble one instruction. The translate_insn hook should
>          * update db->pc_next and db->is_jmp to indicate what should be
>          * done next -- either exiting this loop or locate the start of
>          * the next instruction.
>          */
>         if (db->num_insns == db->max_insns) {
>             /* Accept I/O on the last instruction. */
>             set_can_do_io(db, true);
>         }
>
> and we see later on:
>
>     tb->icount = db->num_insns;
>
> So I guess we must have gone into the translator loop expecting to
> translate more? (Side note: having an int8_t type for the saved bool
> value seems weird.)
>
> Could you confirm by doing something like:
>
>     if (tb->icount == 1 && db->max_insns > 1) {
>         fprintf(stderr, "ended up doing one insn (%d, %d)",
>                 db->max_insns, db->saved_can_do_io);
>     }
>
> after we set tb->icount?

Thanks for looking into this - I tried what you suggested above, and it
looks like that is a case that happens quite often (I see 100s of these
just when booting the VM), so it is hard to tell whether it is related
directly to the issue, e.g.:

ended up doing one insn (512, 0)ended up doing one insn (512, 0)ended up doing one insn (512, 0)ended up doing one insn (512, 0)ended up doing one insn (512, 0)ended up doing one insn (512, 0)ended up doing one insn (512, 0)qemu: fatal: cpu_io_recompile: could not find TB for pc=0x7f4264da3d48
RAX=0000000000000000 RBX=00007f6798e69040 RCX=000000000205f958 RDX=000000000205f8f0
RSI=0000000000000000 RDI=00007ffd5663c720 RBP=00007ffd5663c720 RSP=00007ffd5663c6c0
R8 =000000000000001e R9 =000000000205f920 R10=0000000000000004 R11=00007f67984b56b0
R12=00007ffd5663c7e0 R13=00007f6798e5d0b8 R14=00007f6798e5d588 R15=00007ffd5663c700
RIP=00007f6798e39ffd RFL=00000246 [---Z-P-] CPL=3 II=0 A20=1 SMM=0 HLT=0

>> If the application is run entirely out of MMIO memory, things work fine (the
>> previous patches related to this are in Jonathan's branch), so one thought is that
>> it is related to having the code on a mix of regular and CXL memory. Since we
>> previously had issues with code crossing page boundaries where only the second
>> page is MMIO, I tried out the following change to the fix introduced for that
>> issue, thinking that reverting to the slow path in the middle of the translation
>> might not correctly update can_do_io:
>>
>> diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
>> index 38c34009a5..db6ea360e0 100644
>> --- a/accel/tcg/translator.c
>> +++ b/accel/tcg/translator.c
>> @@ -258,6 +258,7 @@ static void *translator_access(CPUArchState *env, DisasContextBase *db,
>>      if (unlikely(new_page1 == -1)) {
>>          tb_unlock_pages(tb);
>>          tb_set_page_addr0(tb, -1);
>> +        set_can_do_io(db, true);
>>          return NULL;
>>      }
>>
>> With that change, things work fine. Not saying that this is a valid fix for the
>> issue, but just trying to provide as much information as possible :)
>
> I can see why this works; my worry is whether we address all the early
> exit cases here.
>
>> Thanks,
>> Jorgen
>
> --
> Alex Bennée
> Virtualisation Tech Lead @ Linaro

Thanks,
Jorgen
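Since the "ended up doing one insn" message fires frequently even during a normal boot, one way to narrow things down would be to log from the cross-page early-exit path itself rather than (or before) changing its behaviour. A minimal sketch, reusing only the calls and fields visible in the quoted hunk and diagnostic above (the rest of translator_access() is elided, and the exact placement within the function is as quoted in the diff):

```c
    if (unlikely(new_page1 == -1)) {
        tb_unlock_pages(tb);
        tb_set_page_addr0(tb, -1);
        /*
         * Diagnostic only: report how often translation falls back to
         * the slow path because the second page cannot be accessed
         * directly, and what the saved can_do_io state is at that point.
         */
        fprintf(stderr, "translator_access: fallback on second page, saved_can_do_io=%d\n",
                db->saved_can_do_io);
        return NULL;
    }
```

Correlating these messages with the single-instruction TBs would show whether the failing TB really comes from this path.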
```diff
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 38c34009a5..db6ea360e0 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -258,6 +258,7 @@ static void *translator_access(CPUArchState *env, DisasContextBase *db,
     if (unlikely(new_page1 == -1)) {
         tb_unlock_pages(tb);
         tb_set_page_addr0(tb, -1);
+        set_can_do_io(db, true);
         return NULL;
     }
```
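Alex's remaining worry is whether this patch covers all the early-exit cases. One purely illustrative way to make that concrete (the helper name is hypothetical and not existing QEMU code) would be to funnel every exit in translator_access() that performs this same fallback cleanup through a single function, so the can_do_io update cannot be missed on any of them:

```c
/*
 * Hypothetical helper (illustration only, not part of QEMU): any path
 * in translator_access() that gives up on direct host access and
 * already performs this cleanup would call it instead of open-coding
 * it, so the TB is consistently treated as a single, uncached
 * instruction that is allowed to do I/O.
 */
static void *translator_access_fallback(DisasContextBase *db,
                                        TranslationBlock *tb)
{
    tb_unlock_pages(tb);
    tb_set_page_addr0(tb, -1);
    set_can_do_io(db, true);
    return NULL;
}
```

Whether each early exit actually needs the same cleanup is exactly the open question raised in the thread.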