Message ID | 163662450891.125458.6706022775465303586.stgit@pasha-ThinkPad-X280 (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Some watchpoint-related patches | expand |
Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> writes: > When debugging with the watchpoints, qemu may need to create > TB with single instruction. This is achieved by setting cpu->cflags_next_tb. > But when this block is about to execute, it may be interrupted by another > thread. In this case cflags will be lost and next executed TB will not > be the special one. > This patch checks TB exit reason and restores cflags_next_tb to allow > finding the interrupted block. > > Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> > --- > accel/tcg/cpu-exec.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c > index 2d14d02f6c..df12452b8f 100644 > --- a/accel/tcg/cpu-exec.c > +++ b/accel/tcg/cpu-exec.c > @@ -846,6 +846,16 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb, > * cpu_handle_interrupt. cpu_handle_interrupt will also > * clear cpu->icount_decr.u16.high. > */ > + if (cpu->cflags_next_tb == -1 > + && (!use_icount || !(tb->cflags & CF_USE_ICOUNT) Why check use_icount here? The cflags should always have CF_USE_ICOUNT set when icount is enabled. Lets not over complicate the inverted || tests we have here. > + || cpu_neg(cpu)->icount_decr.u16.low >= tb->icount)) > { Is u16.low ever set when icount isn't enabled? > + /* > + * icount is disabled or there are enough instructions > + * in the budget, do not retranslate this block with > + * different parameters. > + */ > + cpu->cflags_next_tb = tb->cflags; > + } > return; > } >
On 11.11.2021 15:20, Alex Bennée wrote: > > Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> writes: > >> When debugging with the watchpoints, qemu may need to create >> TB with single instruction. This is achieved by setting cpu->cflags_next_tb. >> But when this block is about to execute, it may be interrupted by another >> thread. In this case cflags will be lost and next executed TB will not >> be the special one. >> This patch checks TB exit reason and restores cflags_next_tb to allow >> finding the interrupted block. >> >> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> >> --- >> accel/tcg/cpu-exec.c | 10 ++++++++++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c >> index 2d14d02f6c..df12452b8f 100644 >> --- a/accel/tcg/cpu-exec.c >> +++ b/accel/tcg/cpu-exec.c >> @@ -846,6 +846,16 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb, >> * cpu_handle_interrupt. cpu_handle_interrupt will also >> * clear cpu->icount_decr.u16.high. >> */ >> + if (cpu->cflags_next_tb == -1 >> + && (!use_icount || !(tb->cflags & CF_USE_ICOUNT) > > Why check use_icount here? The cflags should always have CF_USE_ICOUNT > set when icount is enabled. Lets not over complicate the inverted || > tests we have here. Not really. Sometimes we use non-icount blocks in icount mode. But AFAIR they are used only for triggering the exeptions, but not for real execution. > >> + || cpu_neg(cpu)->icount_decr.u16.low >= tb->icount)) >> { > > Is u16.low ever set when icount isn't enabled? This condition is checked for icount mode only. u16.low is not used without icount. > >> + /* >> + * icount is disabled or there are enough instructions >> + * in the budget, do not retranslate this block with >> + * different parameters. >> + */ >> + cpu->cflags_next_tb = tb->cflags; >> + } >> return; >> } >> > >
Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> writes: > On 11.11.2021 15:20, Alex Bennée wrote: >> Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> writes: >> >>> When debugging with the watchpoints, qemu may need to create >>> TB with single instruction. This is achieved by setting cpu->cflags_next_tb. >>> But when this block is about to execute, it may be interrupted by another >>> thread. In this case cflags will be lost and next executed TB will not >>> be the special one. >>> This patch checks TB exit reason and restores cflags_next_tb to allow >>> finding the interrupted block. >>> >>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> >>> --- >>> accel/tcg/cpu-exec.c | 10 ++++++++++ >>> 1 file changed, 10 insertions(+) >>> >>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c >>> index 2d14d02f6c..df12452b8f 100644 >>> --- a/accel/tcg/cpu-exec.c >>> +++ b/accel/tcg/cpu-exec.c >>> @@ -846,6 +846,16 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb, >>> * cpu_handle_interrupt. cpu_handle_interrupt will also >>> * clear cpu->icount_decr.u16.high. >>> */ >>> + if (cpu->cflags_next_tb == -1 Can cpu->cflags_next_tb ever be anything else? It is consumed in cpu_exec() and it can only be reset if we have executed some instructions which resulted in some sort of helper call that set it for the next TB. >>> + && (!use_icount || !(tb->cflags & CF_USE_ICOUNT) >> Why check use_icount here? The cflags should always have >> CF_USE_ICOUNT >> set when icount is enabled. Lets not over complicate the inverted || >> tests we have here. > > Not really. Right this is were the logic gets complicated to follow. Are we dealing with icount cases or non-icount cases or some mixture of both? > Sometimes we use non-icount blocks in icount mode. > But AFAIR they are used only for triggering the exeptions, but not for > real execution. Right so tcg_cpu_init_cflags ensures CF_USE_ICOUNT is set for all blocks when use_icount() in enabled except the one special case during exception replay where we suppress it: #ifndef CONFIG_USER_ONLY if (replay_has_exception() && cpu_neg(cpu)->icount_decr.u16.low + cpu->icount_extra == 0) { /* Execute just one insn to trigger exception pending in the log */ cpu->cflags_next_tb = (curr_cflags(cpu) & ~CF_USE_ICOUNT) | 1; } #endif which still slightly scrambles my brain because does that affect the final updating of icount_get_executed() or do we "loose" the instruction in that case. > >> >>> + || cpu_neg(cpu)->icount_decr.u16.low >= tb->icount)) >>> { >> Is u16.low ever set when icount isn't enabled? > > This condition is checked for icount mode only. > u16.low is not used without icount. > >> >>> + /* >>> + * icount is disabled or there are enough instructions >>> + * in the budget, do not retranslate this block with >>> + * different parameters. >>> + */ >>> + cpu->cflags_next_tb = tb->cflags; Technically this isn't what cpu->cflags_next_tb used to be because the eventual tb->cflags might get tweaked by various conditions in tb_gen_code(). It seems to me what we really need is a clear unambiguous indication from cpu_tb_exec() that the we have executed nothing apart from the initial preamble generated by gen_tb_start(). If we have advanced beyond that point it would never be valid to restore the cflag state form the TB. Richard, what do you think?
Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> writes: > When debugging with the watchpoints, qemu may need to create > TB with single instruction. This is achieved by setting cpu->cflags_next_tb. > But when this block is about to execute, it may be interrupted by another > thread. In this case cflags will be lost and next executed TB will not > be the special one. > This patch checks TB exit reason and restores cflags_next_tb to allow > finding the interrupted block. How about this alternative? --8<---------------cut here---------------start------------->8--- accel/tcg: suppress IRQ check for special TBs Generally when we set cpu->cflags_next_tb it is because we want to carefully control the execution of the next TB. Currently there is a race that causes cflags_next_tb to get ignored if an IRQ is processed before we execute any actual instructions. To avoid this we introduce a new compiler flag: CF_NOIRQ to suppress this check in the generated code so we know we will definitely execute the next block. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Cc: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/245 3 files changed, 22 insertions(+), 3 deletions(-) include/exec/exec-all.h | 1 + include/exec/gen-icount.h | 19 ++++++++++++++++--- accel/tcg/cpu-exec.c | 5 +++++ modified include/exec/exec-all.h @@ -503,6 +503,7 @@ struct TranslationBlock { #define CF_USE_ICOUNT 0x00020000 #define CF_INVALID 0x00040000 /* TB is stale. Set with @jmp_lock held */ #define CF_PARALLEL 0x00080000 /* Generate code for a parallel context */ +#define CF_NOIRQ 0x00100000 /* Generate an uninterruptible TB */ #define CF_CLUSTER_MASK 0xff000000 /* Top 8 bits are cluster ID */ #define CF_CLUSTER_SHIFT 24 modified include/exec/gen-icount.h @@ -21,7 +21,6 @@ static inline void gen_tb_start(const TranslationBlock *tb) { TCGv_i32 count; - tcg_ctx->exitreq_label = gen_new_label(); if (tb_cflags(tb) & CF_USE_ICOUNT) { count = tcg_temp_local_new_i32(); } else { @@ -42,7 +41,19 @@ static inline void gen_tb_start(const TranslationBlock *tb) icount_start_insn = tcg_last_op(); } - tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, tcg_ctx->exitreq_label); + /* + * Emit the check against icount_decr.u32 to see if we should exit + * unless we suppress the check with CF_NOIRQ. If we are using + * icount and have suppressed interruption the higher level code + * should have ensured we don't run more instructions than the + * budget. + */ + if (tb_cflags(tb) & CF_NOIRQ) { + tcg_ctx->exitreq_label = NULL; + } else { + tcg_ctx->exitreq_label = gen_new_label(); + tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, tcg_ctx->exitreq_label); + } if (tb_cflags(tb) & CF_USE_ICOUNT) { tcg_gen_st16_i32(count, cpu_env, @@ -74,7 +85,9 @@ static inline void gen_tb_end(const TranslationBlock *tb, int num_insns) tcgv_i32_arg(tcg_constant_i32(num_insns))); } - gen_set_label(tcg_ctx->exitreq_label); + if (tcg_ctx->exitreq_label) { + gen_set_label(tcg_ctx->exitreq_label); + } tcg_gen_exit_tb(tb, TB_EXIT_REQUESTED); } modified accel/tcg/cpu-exec.c @@ -954,11 +954,16 @@ int cpu_exec(CPUState *cpu) * after-access watchpoints. Since this request should never * have CF_INVALID set, -1 is a convenient invalid value that * does not require tcg headers for cpu_common_reset. + * + * As we don't want this special TB being interrupted by + * some sort of asynchronous event we apply CF_NOIRQ to + * disable the usual event checking. */ cflags = cpu->cflags_next_tb; if (cflags == -1) { cflags = curr_cflags(cpu); } else { + cflags |= CF_NOIRQ; cpu->cflags_next_tb = -1; } --8<---------------cut here---------------end--------------->8--- > > Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> > --- > accel/tcg/cpu-exec.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c > index 2d14d02f6c..df12452b8f 100644 > --- a/accel/tcg/cpu-exec.c > +++ b/accel/tcg/cpu-exec.c > @@ -846,6 +846,16 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb, > * cpu_handle_interrupt. cpu_handle_interrupt will also > * clear cpu->icount_decr.u16.high. > */ > + if (cpu->cflags_next_tb == -1 > + && (!use_icount || !(tb->cflags & CF_USE_ICOUNT) > + || cpu_neg(cpu)->icount_decr.u16.low >= tb->icount)) { > + /* > + * icount is disabled or there are enough instructions > + * in the budget, do not retranslate this block with > + * different parameters. > + */ > + cpu->cflags_next_tb = tb->cflags; > + } > return; > } >
On 11/17/21 10:47 AM, Alex Bennée wrote: > - gen_set_label(tcg_ctx->exitreq_label); > + if (tcg_ctx->exitreq_label) { > + gen_set_label(tcg_ctx->exitreq_label); > + } > tcg_gen_exit_tb(tb, TB_EXIT_REQUESTED); The exit_tb is also not reachable, and should go in with the label. > } > > modified accel/tcg/cpu-exec.c > @@ -954,11 +954,16 @@ int cpu_exec(CPUState *cpu) > * after-access watchpoints. Since this request should never > * have CF_INVALID set, -1 is a convenient invalid value that > * does not require tcg headers for cpu_common_reset. > + * > + * As we don't want this special TB being interrupted by > + * some sort of asynchronous event we apply CF_NOIRQ to > + * disable the usual event checking. > */ > cflags = cpu->cflags_next_tb; > if (cflags == -1) { > cflags = curr_cflags(cpu); > } else { > + cflags |= CF_NOIRQ; > cpu->cflags_next_tb = -1; > } Still missing something to avoid cpu_handle_interrupt firing? r~
Richard Henderson <richard.henderson@linaro.org> writes: > On 11/17/21 10:47 AM, Alex Bennée wrote: >> - gen_set_label(tcg_ctx->exitreq_label); >> + if (tcg_ctx->exitreq_label) { >> + gen_set_label(tcg_ctx->exitreq_label); >> + } >> tcg_gen_exit_tb(tb, TB_EXIT_REQUESTED); > > The exit_tb is also not reachable, and should go in with the label. ok > >> } >> modified accel/tcg/cpu-exec.c >> @@ -954,11 +954,16 @@ int cpu_exec(CPUState *cpu) >> * after-access watchpoints. Since this request should never >> * have CF_INVALID set, -1 is a convenient invalid value that >> * does not require tcg headers for cpu_common_reset. >> + * >> + * As we don't want this special TB being interrupted by >> + * some sort of asynchronous event we apply CF_NOIRQ to >> + * disable the usual event checking. >> */ >> cflags = cpu->cflags_next_tb; >> if (cflags == -1) { >> cflags = curr_cflags(cpu); >> } else { >> + cflags |= CF_NOIRQ; >> cpu->cflags_next_tb = -1; >> } > > Still missing something to avoid cpu_handle_interrupt firing? Something as simple as: --8<---------------cut here---------------start------------->8--- modified accel/tcg/cpu-exec.c @@ -721,6 +721,15 @@ static inline bool need_replay_interrupt(int interrupt_request) static inline bool cpu_handle_interrupt(CPUState *cpu, TranslationBlock **last_tb) { + /* + * If we have special cflags lets not get distracted with IRQs. We + * shall exit the loop as soon as the next TB completes what it + * needs to do. + */ + if (cpu->cflags_next_tb != -1) { + return false; + } + /* Clear the interrupt flag now since we're processing * cpu->interrupt_request and cpu->exit_request. --8<---------------cut here---------------end--------------->8--- ? > > > r~
On 11/17/21 11:29 AM, Alex Bennée wrote: >> Still missing something to avoid cpu_handle_interrupt firing? > > Something as simple as: > > --8<---------------cut here---------------start------------->8--- > modified accel/tcg/cpu-exec.c > @@ -721,6 +721,15 @@ static inline bool need_replay_interrupt(int interrupt_request) > static inline bool cpu_handle_interrupt(CPUState *cpu, > TranslationBlock **last_tb) > { > + /* > + * If we have special cflags lets not get distracted with IRQs. We > + * shall exit the loop as soon as the next TB completes what it > + * needs to do. > + */ > + if (cpu->cflags_next_tb != -1) { > + return false; > + } > + > /* Clear the interrupt flag now since we're processing > * cpu->interrupt_request and cpu->exit_request. > --8<---------------cut here---------------end--------------->8--- > > ? Yeah, that should do it. r~
On 17.11.2021 12:47, Alex Bennée wrote: > > Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> writes: > >> When debugging with the watchpoints, qemu may need to create >> TB with single instruction. This is achieved by setting cpu->cflags_next_tb. >> But when this block is about to execute, it may be interrupted by another >> thread. In this case cflags will be lost and next executed TB will not >> be the special one. >> This patch checks TB exit reason and restores cflags_next_tb to allow >> finding the interrupted block. > > How about this alternative? I checked all cflags_next_tb assignments. Looks that this variant should work. > > --8<---------------cut here---------------start------------->8--- > accel/tcg: suppress IRQ check for special TBs > > Generally when we set cpu->cflags_next_tb it is because we want to > carefully control the execution of the next TB. Currently there is a > race that causes cflags_next_tb to get ignored if an IRQ is processed > before we execute any actual instructions. > > To avoid this we introduce a new compiler flag: CF_NOIRQ to suppress > this check in the generated code so we know we will definitely execute > the next block. > > Signed-off-by: Alex Bennée <alex.bennee@linaro.org> > Cc: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/245 > > 3 files changed, 22 insertions(+), 3 deletions(-) > include/exec/exec-all.h | 1 + > include/exec/gen-icount.h | 19 ++++++++++++++++--- > accel/tcg/cpu-exec.c | 5 +++++ > > modified include/exec/exec-all.h > @@ -503,6 +503,7 @@ struct TranslationBlock { > #define CF_USE_ICOUNT 0x00020000 > #define CF_INVALID 0x00040000 /* TB is stale. Set with @jmp_lock held */ > #define CF_PARALLEL 0x00080000 /* Generate code for a parallel context */ > +#define CF_NOIRQ 0x00100000 /* Generate an uninterruptible TB */ > #define CF_CLUSTER_MASK 0xff000000 /* Top 8 bits are cluster ID */ > #define CF_CLUSTER_SHIFT 24 > > modified include/exec/gen-icount.h > @@ -21,7 +21,6 @@ static inline void gen_tb_start(const TranslationBlock *tb) > { > TCGv_i32 count; > > - tcg_ctx->exitreq_label = gen_new_label(); > if (tb_cflags(tb) & CF_USE_ICOUNT) { > count = tcg_temp_local_new_i32(); > } else { > @@ -42,7 +41,19 @@ static inline void gen_tb_start(const TranslationBlock *tb) > icount_start_insn = tcg_last_op(); > } > > - tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, tcg_ctx->exitreq_label); > + /* > + * Emit the check against icount_decr.u32 to see if we should exit > + * unless we suppress the check with CF_NOIRQ. If we are using > + * icount and have suppressed interruption the higher level code > + * should have ensured we don't run more instructions than the > + * budget. > + */ > + if (tb_cflags(tb) & CF_NOIRQ) { > + tcg_ctx->exitreq_label = NULL; > + } else { > + tcg_ctx->exitreq_label = gen_new_label(); > + tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, tcg_ctx->exitreq_label); > + } > > if (tb_cflags(tb) & CF_USE_ICOUNT) { > tcg_gen_st16_i32(count, cpu_env, > @@ -74,7 +85,9 @@ static inline void gen_tb_end(const TranslationBlock *tb, int num_insns) > tcgv_i32_arg(tcg_constant_i32(num_insns))); > } > > - gen_set_label(tcg_ctx->exitreq_label); > + if (tcg_ctx->exitreq_label) { > + gen_set_label(tcg_ctx->exitreq_label); > + } > tcg_gen_exit_tb(tb, TB_EXIT_REQUESTED); > } > > modified accel/tcg/cpu-exec.c > @@ -954,11 +954,16 @@ int cpu_exec(CPUState *cpu) > * after-access watchpoints. Since this request should never > * have CF_INVALID set, -1 is a convenient invalid value that > * does not require tcg headers for cpu_common_reset. > + * > + * As we don't want this special TB being interrupted by > + * some sort of asynchronous event we apply CF_NOIRQ to > + * disable the usual event checking. > */ > cflags = cpu->cflags_next_tb; > if (cflags == -1) { > cflags = curr_cflags(cpu); > } else { > + cflags |= CF_NOIRQ; > cpu->cflags_next_tb = -1; > } > > --8<---------------cut here---------------end--------------->8--- > >> >> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> >> --- >> accel/tcg/cpu-exec.c | 10 ++++++++++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c >> index 2d14d02f6c..df12452b8f 100644 >> --- a/accel/tcg/cpu-exec.c >> +++ b/accel/tcg/cpu-exec.c >> @@ -846,6 +846,16 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb, >> * cpu_handle_interrupt. cpu_handle_interrupt will also >> * clear cpu->icount_decr.u16.high. >> */ >> + if (cpu->cflags_next_tb == -1 >> + && (!use_icount || !(tb->cflags & CF_USE_ICOUNT) >> + || cpu_neg(cpu)->icount_decr.u16.low >= tb->icount)) { >> + /* >> + * icount is disabled or there are enough instructions >> + * in the budget, do not retranslate this block with >> + * different parameters. >> + */ >> + cpu->cflags_next_tb = tb->cflags; >> + } >> return; >> } >> > >
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c index 2d14d02f6c..df12452b8f 100644 --- a/accel/tcg/cpu-exec.c +++ b/accel/tcg/cpu-exec.c @@ -846,6 +846,16 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb, * cpu_handle_interrupt. cpu_handle_interrupt will also * clear cpu->icount_decr.u16.high. */ + if (cpu->cflags_next_tb == -1 + && (!use_icount || !(tb->cflags & CF_USE_ICOUNT) + || cpu_neg(cpu)->icount_decr.u16.low >= tb->icount)) { + /* + * icount is disabled or there are enough instructions + * in the budget, do not retranslate this block with + * different parameters. + */ + cpu->cflags_next_tb = tb->cflags; + } return; }
When debugging with the watchpoints, qemu may need to create TB with single instruction. This is achieved by setting cpu->cflags_next_tb. But when this block is about to execute, it may be interrupted by another thread. In this case cflags will be lost and next executed TB will not be the special one. This patch checks TB exit reason and restores cflags_next_tb to allow finding the interrupted block. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> --- accel/tcg/cpu-exec.c | 10 ++++++++++ 1 file changed, 10 insertions(+)