Message ID | 20190616191900.GH61734@hippo.sing.id.au (mailing list archive)
---|---
State | New, archived
Series | atomic failures on qemu-system-riscv64
Patchew URL: https://patchew.org/QEMU/20190616191900.GH61734@hippo.sing.id.au/

Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH] atomic failures on qemu-system-riscv64
Type: series
Message-id: 20190616191900.GH61734@hippo.sing.id.au

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20190616191900.GH61734@hippo.sing.id.au -> patchew/20190616191900.GH61734@hippo.sing.id.au
Switched to a new branch 'test'
b582c6f8fc atomic failures on qemu-system-riscv64

=== OUTPUT BEGIN ===
WARNING: Block comments use a leading /* on a separate line
#81: FILE: target/riscv/cpu_helper.c:136:
+    /* Clear the load reservation - otherwise a reservation placed in one

ERROR: Missing Signed-off-by: line(s)

total: 1 errors, 1 warnings, 35 lines checked

Commit b582c6f8fccc (atomic failures on qemu-system-riscv64) has style
problems, please review.  If any of these errors are false positives report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1

The full log is available at
http://patchew.org/logs/20190616191900.GH61734@hippo.sing.id.au/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
On 6/16/19 12:19 PM, Joel Sing wrote:
> +    /*
> +     * Clear the load reservation, since an SC must fail if there is
> +     * an SC to any address, in between an LR and SC pair.
> +     */
> +    tcg_gen_movi_tl(load_res, 0);
> +
>      gen_set_label(l2);

This clear needs to be moved down below label l2.
Otherwise, with lr / sc / sc, the second sc could succeed in error.

FWIW, other targets have used -1 as the "invalid" load reservation, since the
architecture does not require address 0 to be unmapped.  This should be quite
visible in M-mode with paging disabled and ram at offset 0.  Often, other
targets require alignment for the lr/sc address, though I don't see that for
riscv.

r~
On Mon, Jun 17, 2019 at 4:53 PM Richard Henderson
<richard.henderson@linaro.org> wrote:
> On 6/16/19 12:19 PM, Joel Sing wrote:
> > +    /*
> > +     * Clear the load reservation, since an SC must fail if there is
> > +     * an SC to any address, in between an LR and SC pair.
> > +     */
> > +    tcg_gen_movi_tl(load_res, 0);
> > +
> >      gen_set_label(l2);
>
> This clear needs to be moved down below label l2.
> Otherwise, with lr / sc / sc, the second sc could succeed in error.
>
> FWIW, other targets have used -1 as the "invalid" load reservation, since the
> architecture does not require address 0 to be unmapped.  This should be quite
> visible in M-mode with paging disabled and ram at offset 0.  Often, other
> targets require alignment for the lr/sc address, though I don't see that
> for riscv.
>
> r~

Joel: can you submit this with a "Signed-off-by" line?  There's only a week
until the soft freeze, and I'd really like this in if possible.  It'd be
great if you could fix up the other issues, but I can't even do that myself
without a SOB.
On 19-06-17 16:52:44, Richard Henderson wrote:
> On 6/16/19 12:19 PM, Joel Sing wrote:
> > +    /*
> > +     * Clear the load reservation, since an SC must fail if there is
> > +     * an SC to any address, in between an LR and SC pair.
> > +     */
> > +    tcg_gen_movi_tl(load_res, 0);
> > +
> >      gen_set_label(l2);
>
> This clear needs to be moved down below label l2.
> Otherwise, with lr / sc / sc, the second sc could succeed in error.

Indeed, thanks.

> FWIW, other targets have used -1 as the "invalid" load reservation, since the
> architecture does not require address 0 to be unmapped.  This should be quite
> visible in M-mode with paging disabled and ram at offset 0.  Often, other
> targets require alignment for the lr/sc address, though I don't see that for
> riscv.

I've switched to -1 as suggested. Regarding the alignment for reservations,
the specification does require this, although I do not recall seeing any
enforcement of this by qemu itself.

New diff follows.

From 8ef31a2ce8ef1cbeee92995a0b2994f480e9bb6d Mon Sep 17 00:00:00 2001
From: Joel Sing <joel@sing.id.au>
Date: Tue, 25 Jun 2019 02:44:24 +1000
Subject: [PATCH] Clear load reservations on qemu riscv target

This prevents a load reservation from being placed in one context/process,
then being used in another, resulting in an SC succeeding incorrectly and
breaking atomics.

Signed-off-by: Joel Sing <joel@sing.id.au>
---
 target/riscv/cpu.c                      | 1 +
 target/riscv/cpu_helper.c               | 9 +++++++++
 target/riscv/insn_trans/trans_rva.inc.c | 8 +++++++-
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index d61bce6d55..e7c8bf48fc 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -281,6 +281,7 @@ static void riscv_cpu_reset(CPUState *cs)
     env->pc = env->resetvec;
 #endif
     cs->exception_index = EXCP_NONE;
+    env->load_res = -1;
     set_default_nan_mode(1, &env->fp_status);
 }

diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index b17f169681..6a07b12e65 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -113,6 +113,15 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong newpriv)
     }
     /* tlb_flush is unnecessary as mode is contained in mmu_idx */
     env->priv = newpriv;
+
+    /* Clear the load reservation - otherwise a reservation placed in one
+     * context/process can be used by another, resulting in an SC succeeding
+     * incorrectly. Version 2.2 of the ISA specification explicitly requires
+     * this behaviour, while later revisions say that the kernel "should" use
+     * an SC instruction to force the yielding of a load reservation on a
+     * preemptive context switch. As a result, do both.
+     */
+    env->load_res = -1;
 }

 /* get_physical_address - get the physical address for this virtual address
diff --git a/target/riscv/insn_trans/trans_rva.inc.c b/target/riscv/insn_trans/trans_rva.inc.c
index f6dbbc065e..fadd88849e 100644
--- a/target/riscv/insn_trans/trans_rva.inc.c
+++ b/target/riscv/insn_trans/trans_rva.inc.c
@@ -61,7 +61,7 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)

     gen_set_label(l1);
     /*
-     * Address comparion failure. However, we still need to
+     * Address comparison failure. However, we still need to
      * provide the memory barrier implied by AQ/RL.
      */
     tcg_gen_mb(TCG_MO_ALL + a->aq * TCG_BAR_LDAQ + a->rl * TCG_BAR_STRL);
@@ -69,6 +69,12 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
     gen_set_gpr(a->rd, dat);

     gen_set_label(l2);
+    /*
+     * Clear the load reservation, since an SC must fail if there is
+     * an SC to any address, in between an LR and SC pair.
+     */
+    tcg_gen_movi_tl(load_res, -1);
+
     tcg_temp_free(dat);
     tcg_temp_free(src1);
     tcg_temp_free(src2);
On 6/24/19 8:08 PM, Joel Sing wrote:
> Regarding the alignment for reservations, the
> specification does require this, although I do not recall seeing any
> enforcement of this by qemu itself.

Ah, I see it now.  Enforcement begins here:

static bool trans_lr_w(DisasContext *ctx, arg_lr_w *a)
{
    REQUIRE_EXT(ctx, RVA);
    return gen_lr(ctx, a, (MO_ALIGN | MO_TESL));
                           ^^^^^^^^

This will force softmmu (but notably not linux-user; a design limitation) to
generate an alignment fault for an unaligned address.

r~
On 6/24/19 8:08 PM, Joel Sing wrote:
> From 8ef31a2ce8ef1cbeee92995a0b2994f480e9bb6d Mon Sep 17 00:00:00 2001
> From: Joel Sing <joel@sing.id.au>
> Date: Tue, 25 Jun 2019 02:44:24 +1000
> Subject: [PATCH] Clear load reservations on qemu riscv target
>
> This prevents a load reservation from being placed in one context/process,
> then being used in another, resulting in an SC succeeding incorrectly and
> breaking atomics.
>
> Signed-off-by: Joel Sing <joel@sing.id.au>

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

> +    /* Clear the load reservation - otherwise a reservation placed in one

Excepting this line, which will fail checkpatch.pl.
Needs to be formatted as

    /*
     * Clear the load...

r~
On Tue, 25 Jun 2019 08:36:28 PDT (-0700), richard.henderson@linaro.org wrote:
> On 6/24/19 8:08 PM, Joel Sing wrote:
>> Regarding the alignment for reservations, the
>> specification does require this, although I do not recall seeing any
>> enforcement
>> of this by qemu itself.
>
> Ah, I see it now.  Enforcement begins here:
>
> static bool trans_lr_w(DisasContext *ctx, arg_lr_w *a)
> {
>     REQUIRE_EXT(ctx, RVA);
>     return gen_lr(ctx, a, (MO_ALIGN | MO_TESL));
>                            ^^^^^^^^
>
> This will force softmmu (but notably not linux-user; a design limitation) to
> generate an alignment fault for an unaligned address.

That was probably correct at the time the code went in, as the ISA used to
allow these to succeed but not be atomic.  No implementations did this, so as
part of the ratification process we just mandated that unaligned atomics
always trap.

Is there a better way to fix this than just doing the alignment check
explicitly?
On Tue, 25 Jun 2019 08:39:21 PDT (-0700), richard.henderson@linaro.org wrote:
> On 6/24/19 8:08 PM, Joel Sing wrote:
>> From 8ef31a2ce8ef1cbeee92995a0b2994f480e9bb6d Mon Sep 17 00:00:00 2001
>> From: Joel Sing <joel@sing.id.au>
>> Date: Tue, 25 Jun 2019 02:44:24 +1000
>> Subject: [PATCH] Clear load reservations on qemu riscv target
>>
>> This prevents a load reservation from being placed in one context/process,
>> then being used in another, resulting in an SC succeeding incorrectly and
>> breaking atomics.
>>
>> Signed-off-by: Joel Sing <joel@sing.id.au>
>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
>
>> +    /* Clear the load reservation - otherwise a reservation placed in one
>
> Excepting this line, which will fail checkpatch.pl.
> Needs to be formatted as
>
>     /*
>      * Clear the load...

Thanks.  I fixed that one up as I pulled the patch in.

>
> r~
On 6/26/19 8:07 AM, Palmer Dabbelt wrote:
> On Tue, 25 Jun 2019 08:36:28 PDT (-0700), richard.henderson@linaro.org wrote:
>> On 6/24/19 8:08 PM, Joel Sing wrote:
>>> Regarding the alignment for reservations, the
>>> specification does require this, although I do not recall seeing any
>>> enforcement
>>> of this by qemu itself.
>>
>> Ah, I see it now.  Enforcement begins here:
>>
>> static bool trans_lr_w(DisasContext *ctx, arg_lr_w *a)
>> {
>>     REQUIRE_EXT(ctx, RVA);
>>     return gen_lr(ctx, a, (MO_ALIGN | MO_TESL));
>>                            ^^^^^^^^
>>
>> This will force softmmu (but notably not linux-user; a design limitation) to
>> generate an alignment fault for an unaligned address.
>
> That was probably correct at the time the code went in, as the ISA used to
> allow these to succeed but not be atomic.  No implementations did this, so as
> part of the ratification process we just mandated that unaligned atomics
> always trap.
>
> Is there a better way to fix this than just doing the alignment check
> explicitly?

You misunderstand.  The code is exactly correct as-is.  The alignment check
happens implicitly as a part of the softmmu tlb resolution.

r~
On Wed, 26 Jun 2019 00:48:51 PDT (-0700), richard.henderson@linaro.org wrote:
> On 6/26/19 8:07 AM, Palmer Dabbelt wrote:
>> On Tue, 25 Jun 2019 08:36:28 PDT (-0700), richard.henderson@linaro.org wrote:
>>> On 6/24/19 8:08 PM, Joel Sing wrote:
>>>> Regarding the alignment for reservations, the
>>>> specification does require this, although I do not recall seeing any
>>>> enforcement
>>>> of this by qemu itself.
>>>
>>> Ah, I see it now.  Enforcement begins here:
>>>
>>> static bool trans_lr_w(DisasContext *ctx, arg_lr_w *a)
>>> {
>>>     REQUIRE_EXT(ctx, RVA);
>>>     return gen_lr(ctx, a, (MO_ALIGN | MO_TESL));
>>>                            ^^^^^^^^
>>>
>>> This will force softmmu (but notably not linux-user; a design limitation) to
>>> generate an alignment fault for an unaligned address.
>>
>> That was probably correct at the time the code went in, as the ISA used to
>> allow these to succeed but not be atomic.  No implementations did this, so as
>> part of the ratification process we just mandated that unaligned atomics
>> always trap.
>>
>> Is there a better way to fix this than just doing the alignment check
>> explicitly?
>
> You misunderstand.  The code is exactly correct as-is.  The alignment check
> happens implicitly as a part of the softmmu tlb resolution.

Sorry, I thought you said it wasn't happening for linux-user?  If it happens
for both then we're good.
On 6/26/19 10:25 AM, Palmer Dabbelt wrote:
>> You misunderstand.  The code is exactly correct as-is.  The alignment check
>> happens implicitly as a part of the softmmu tlb resolution.
>
> Sorry, I thought you said it wasn't happening for linux-user?  If it happens
> for both then we're good.

Oh, that.  No, there's no better way for linux-user.  Though honestly, if
we're going to fix this, it should be done in tcg/* rather than adding hacks
within target/riscv.

r~
On Wed, 26 Jun 2019 01:30:35 PDT (-0700), richard.henderson@linaro.org wrote:
> On 6/26/19 10:25 AM, Palmer Dabbelt wrote:
>>> You misunderstand.  The code is exactly correct as-is.  The alignment check
>>> happens implicitly as a part of the softmmu tlb resolution.
>>
>> Sorry, I thought you said it wasn't happening for linux-user?  If it happens
>> for both then we're good.
>
> Oh, that.  No, there's no better way for linux-user.  Though honestly, if
> we're going to fix this, it should be done in tcg/* rather than adding hacks
> within target/riscv.

OK, I'm fine punting on it for now.  If I ever get time to fix it up then
we'll attempt to do so generically.  Thanks!
On Mon, Jun 24, 2019 at 11:21 AM Joel Sing <joel@sing.id.au> wrote:
>
> On 19-06-17 16:52:44, Richard Henderson wrote:
> > On 6/16/19 12:19 PM, Joel Sing wrote:
> > > +    /*
> > > +     * Clear the load reservation, since an SC must fail if there is
> > > +     * an SC to any address, in between an LR and SC pair.
> > > +     */
> > > +    tcg_gen_movi_tl(load_res, 0);
> > > +
> > >      gen_set_label(l2);
> >
> > This clear needs to be moved down below label l2.
> > Otherwise, with lr / sc / sc, the second sc could succeed in error.
>
> Indeed, thanks.
>
> > FWIW, other targets have used -1 as the "invalid" load reservation, since the
> > architecture does not require address 0 to be unmapped.  This should be quite
> > visible in M-mode with paging disabled and ram at offset 0.  Often, other
> > targets require alignment for the lr/sc address, though I don't see that for riscv.
>
> I've switched to -1 as suggested. Regarding the alignment for reservations, the
> specification does require this, although I do not recall seeing any enforcement
> of this by qemu itself.
>
> New diff follows.
>
> From 8ef31a2ce8ef1cbeee92995a0b2994f480e9bb6d Mon Sep 17 00:00:00 2001
> From: Joel Sing <joel@sing.id.au>
> Date: Tue, 25 Jun 2019 02:44:24 +1000
> Subject: [PATCH] Clear load reservations on qemu riscv target
>
> This prevents a load reservation from being placed in one context/process,
> then being used in another, resulting in an SC succeeding incorrectly and
> breaking atomics.
>
> Signed-off-by: Joel Sing <joel@sing.id.au>
> ---
>  target/riscv/cpu.c                      | 1 +
>  target/riscv/cpu_helper.c               | 9 +++++++++
>  target/riscv/insn_trans/trans_rva.inc.c | 8 +++++++-
>  3 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index d61bce6d55..e7c8bf48fc 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -281,6 +281,7 @@ static void riscv_cpu_reset(CPUState *cs)
>      env->pc = env->resetvec;
>  #endif
>      cs->exception_index = EXCP_NONE;
> +    env->load_res = -1;
>      set_default_nan_mode(1, &env->fp_status);
>  }
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index b17f169681..6a07b12e65 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -113,6 +113,15 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong newpriv)
>      }
>      /* tlb_flush is unnecessary as mode is contained in mmu_idx */
>      env->priv = newpriv;
> +
> +    /* Clear the load reservation - otherwise a reservation placed in one
> +     * context/process can be used by another, resulting in an SC succeeding
> +     * incorrectly. Version 2.2 of the ISA specification explicitly requires
> +     * this behaviour, while later revisions say that the kernel "should" use
> +     * an SC instruction to force the yielding of a load reservation on a
> +     * preemptive context switch. As a result, do both.
> +     */
> +    env->load_res = -1;
>  }
>
>  /* get_physical_address - get the physical address for this virtual address
> diff --git a/target/riscv/insn_trans/trans_rva.inc.c b/target/riscv/insn_trans/trans_rva.inc.c
> index f6dbbc065e..fadd88849e 100644
> --- a/target/riscv/insn_trans/trans_rva.inc.c
> +++ b/target/riscv/insn_trans/trans_rva.inc.c
> @@ -61,7 +61,7 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
>
>      gen_set_label(l1);
>      /*
> -     * Address comparion failure. However, we still need to
> +     * Address comparison failure. However, we still need to
>       * provide the memory barrier implied by AQ/RL.
>       */
>      tcg_gen_mb(TCG_MO_ALL + a->aq * TCG_BAR_LDAQ + a->rl * TCG_BAR_STRL);
> @@ -69,6 +69,12 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
>      gen_set_gpr(a->rd, dat);
>
>      gen_set_label(l2);
> +    /*
> +     * Clear the load reservation, since an SC must fail if there is
> +     * an SC to any address, in between an LR and SC pair.
> +     */
> +    tcg_gen_movi_tl(load_res, -1);
> +
>      tcg_temp_free(dat);
>      tcg_temp_free(src1);
>      tcg_temp_free(src2);
> --

This patch causes boot failures when booting systemd built with musl on RV64.

It could possibly be a musl bug, but I wanted to throw that out here
first to see what people think.

Alistair

> 2.21.0
>
>
On Tue, 24 Sep 2019 11:29:25 PDT (-0700), alistair23@gmail.com wrote:
> On Mon, Jun 24, 2019 at 11:21 AM Joel Sing <joel@sing.id.au> wrote:
>>
>> On 19-06-17 16:52:44, Richard Henderson wrote:
>> > On 6/16/19 12:19 PM, Joel Sing wrote:
>> > > +    /*
>> > > +     * Clear the load reservation, since an SC must fail if there is
>> > > +     * an SC to any address, in between an LR and SC pair.
>> > > +     */
>> > > +    tcg_gen_movi_tl(load_res, 0);
>> > > +
>> > >      gen_set_label(l2);
>> >
>> > This clear needs to be moved down below label l2.
>> > Otherwise, with lr / sc / sc, the second sc could succeed in error.
>>
>> Indeed, thanks.
>>
>> > FWIW, other targets have used -1 as the "invalid" load reservation, since the
>> > architecture does not require address 0 to be unmapped.  This should be quite
>> > visible in M-mode with paging disabled and ram at offset 0.  Often, other
>> > targets require alignment for the lr/sc address, though I don't see that for riscv.
>>
>> I've switched to -1 as suggested. Regarding the alignment for reservations, the
>> specification does require this, although I do not recall seeing any enforcement
>> of this by qemu itself.
>>
>> New diff follows.
>>
>> From 8ef31a2ce8ef1cbeee92995a0b2994f480e9bb6d Mon Sep 17 00:00:00 2001
>> From: Joel Sing <joel@sing.id.au>
>> Date: Tue, 25 Jun 2019 02:44:24 +1000
>> Subject: [PATCH] Clear load reservations on qemu riscv target
>>
>> This prevents a load reservation from being placed in one context/process,
>> then being used in another, resulting in an SC succeeding incorrectly and
>> breaking atomics.
>>
>> Signed-off-by: Joel Sing <joel@sing.id.au>
>> ---
>>  target/riscv/cpu.c                      | 1 +
>>  target/riscv/cpu_helper.c               | 9 +++++++++
>>  target/riscv/insn_trans/trans_rva.inc.c | 8 +++++++-
>>  3 files changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
>> index d61bce6d55..e7c8bf48fc 100644
>> --- a/target/riscv/cpu.c
>> +++ b/target/riscv/cpu.c
>> @@ -281,6 +281,7 @@ static void riscv_cpu_reset(CPUState *cs)
>>      env->pc = env->resetvec;
>>  #endif
>>      cs->exception_index = EXCP_NONE;
>> +    env->load_res = -1;
>>      set_default_nan_mode(1, &env->fp_status);
>>  }
>>
>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
>> index b17f169681..6a07b12e65 100644
>> --- a/target/riscv/cpu_helper.c
>> +++ b/target/riscv/cpu_helper.c
>> @@ -113,6 +113,15 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong newpriv)
>>      }
>>      /* tlb_flush is unnecessary as mode is contained in mmu_idx */
>>      env->priv = newpriv;
>> +
>> +    /* Clear the load reservation - otherwise a reservation placed in one
>> +     * context/process can be used by another, resulting in an SC succeeding
>> +     * incorrectly. Version 2.2 of the ISA specification explicitly requires
>> +     * this behaviour, while later revisions say that the kernel "should" use
>> +     * an SC instruction to force the yielding of a load reservation on a
>> +     * preemptive context switch. As a result, do both.
>> +     */
>> +    env->load_res = -1;
>>  }
>>
>>  /* get_physical_address - get the physical address for this virtual address
>> diff --git a/target/riscv/insn_trans/trans_rva.inc.c b/target/riscv/insn_trans/trans_rva.inc.c
>> index f6dbbc065e..fadd88849e 100644
>> --- a/target/riscv/insn_trans/trans_rva.inc.c
>> +++ b/target/riscv/insn_trans/trans_rva.inc.c
>> @@ -61,7 +61,7 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
>>
>>      gen_set_label(l1);
>>      /*
>> -     * Address comparion failure. However, we still need to
>> +     * Address comparison failure. However, we still need to
>>       * provide the memory barrier implied by AQ/RL.
>>       */
>>      tcg_gen_mb(TCG_MO_ALL + a->aq * TCG_BAR_LDAQ + a->rl * TCG_BAR_STRL);
>> @@ -69,6 +69,12 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
>>      gen_set_gpr(a->rd, dat);
>>
>>      gen_set_label(l2);
>> +    /*
>> +     * Clear the load reservation, since an SC must fail if there is
>> +     * an SC to any address, in between an LR and SC pair.
>> +     */
>> +    tcg_gen_movi_tl(load_res, -1);
>> +
>>      tcg_temp_free(dat);
>>      tcg_temp_free(src1);
>>      tcg_temp_free(src2);
>> --
>
> This patch causes boot failures when booting systemd built with musl on RV64.
>
> It could possibly be a musl bug, but I wanted to throw that out here
> first to see what people think.

Looking at the musl port, I see at least one bug in their atomics jumping out
at me:

diff --git a/arch/riscv64/atomic_arch.h b/arch/riscv64/atomic_arch.h
index c9765342..41ad4d04 100644
--- a/arch/riscv64/atomic_arch.h
+++ b/arch/riscv64/atomic_arch.h
@@ -14,7 +14,7 @@ static inline int a_cas(volatile int *p, int t, int s)
 		"	sc.w.aqrl %1, %4, (%2)\n"
 		"	bnez %1, 1b\n"
 		"1:"
-		: "=&r"(old), "=r"(tmp)
+		: "=&r"(old), "=&r"(tmp)
 		: "r"(p), "r"(t), "r"(s)
 		: "memory");
 	return old;
@@ -31,7 +31,7 @@ static inline void *a_cas_p(volatile void *p, void *t, void *s)
 		"	sc.d.aqrl %1, %4, (%2)\n"
 		"	bnez %1, 1b\n"
 		"1:"
-		: "=&r"(old), "=r"(tmp)
+		: "=&r"(old), "=&r"(tmp)
 		: "r"(p), "r"(t), "r"(s)
 		: "memory");
 	return old;

It's a shot in the dark as to whether that'll fix your bug, but I could at
least see a mechanism for it: before we yielded load reservations on context
switches then that backwards branch would never be taken, so we wouldn't
notice if tmp was allocated into one of the same registers as the inputs.
Even if it doesn't fix your issue it's still a bug so I'll send the patch
out, just LMK so I can indicate how important the issue is.

This should manifest on hardware, but it looks like we managed to drop that
SC patch.  I'll go send the patch out now...

> Alistair
>
>> 2.21.0
>>
>>
On Tue, Sep 24, 2019 at 1:04 PM Palmer Dabbelt <palmer@sifive.com> wrote:
>
> On Tue, 24 Sep 2019 11:29:25 PDT (-0700), alistair23@gmail.com wrote:
> > On Mon, Jun 24, 2019 at 11:21 AM Joel Sing <joel@sing.id.au> wrote:
> >>
> >> On 19-06-17 16:52:44, Richard Henderson wrote:
> >> > On 6/16/19 12:19 PM, Joel Sing wrote:
> >> > > +    /*
> >> > > +     * Clear the load reservation, since an SC must fail if there is
> >> > > +     * an SC to any address, in between an LR and SC pair.
> >> > > +     */
> >> > > +    tcg_gen_movi_tl(load_res, 0);
> >> > > +
> >> > >      gen_set_label(l2);
> >> >
> >> > This clear needs to be moved down below label l2.
> >> > Otherwise, with lr / sc / sc, the second sc could succeed in error.
> >>
> >> Indeed, thanks.
> >>
> >> > FWIW, other targets have used -1 as the "invalid" load reservation, since the
> >> > architecture does not require address 0 to be unmapped.  This should be quite
> >> > visible in M-mode with paging disabled and ram at offset 0.  Often, other
> >> > targets require alignment for the lr/sc address, though I don't see that for riscv.
> >>
> >> I've switched to -1 as suggested. Regarding the alignment for reservations, the
> >> specification does require this, although I do not recall seeing any enforcement
> >> of this by qemu itself.
> >>
> >> New diff follows.
> >>
> >> From 8ef31a2ce8ef1cbeee92995a0b2994f480e9bb6d Mon Sep 17 00:00:00 2001
> >> From: Joel Sing <joel@sing.id.au>
> >> Date: Tue, 25 Jun 2019 02:44:24 +1000
> >> Subject: [PATCH] Clear load reservations on qemu riscv target
> >>
> >> This prevents a load reservation from being placed in one context/process,
> >> then being used in another, resulting in an SC succeeding incorrectly and
> >> breaking atomics.
> >>
> >> Signed-off-by: Joel Sing <joel@sing.id.au>
> >> ---
> >>  target/riscv/cpu.c                      | 1 +
> >>  target/riscv/cpu_helper.c               | 9 +++++++++
> >>  target/riscv/insn_trans/trans_rva.inc.c | 8 +++++++-
> >>  3 files changed, 17 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> >> index d61bce6d55..e7c8bf48fc 100644
> >> --- a/target/riscv/cpu.c
> >> +++ b/target/riscv/cpu.c
> >> @@ -281,6 +281,7 @@ static void riscv_cpu_reset(CPUState *cs)
> >>      env->pc = env->resetvec;
> >>  #endif
> >>      cs->exception_index = EXCP_NONE;
> >> +    env->load_res = -1;
> >>      set_default_nan_mode(1, &env->fp_status);
> >>  }
> >>
> >> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> >> index b17f169681..6a07b12e65 100644
> >> --- a/target/riscv/cpu_helper.c
> >> +++ b/target/riscv/cpu_helper.c
> >> @@ -113,6 +113,15 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong newpriv)
> >>      }
> >>      /* tlb_flush is unnecessary as mode is contained in mmu_idx */
> >>      env->priv = newpriv;
> >> +
> >> +    /* Clear the load reservation - otherwise a reservation placed in one
> >> +     * context/process can be used by another, resulting in an SC succeeding
> >> +     * incorrectly. Version 2.2 of the ISA specification explicitly requires
> >> +     * this behaviour, while later revisions say that the kernel "should" use
> >> +     * an SC instruction to force the yielding of a load reservation on a
> >> +     * preemptive context switch. As a result, do both.
> >> +     */
> >> +    env->load_res = -1;
> >>  }
> >>
> >>  /* get_physical_address - get the physical address for this virtual address
> >> diff --git a/target/riscv/insn_trans/trans_rva.inc.c b/target/riscv/insn_trans/trans_rva.inc.c
> >> index f6dbbc065e..fadd88849e 100644
> >> --- a/target/riscv/insn_trans/trans_rva.inc.c
> >> +++ b/target/riscv/insn_trans/trans_rva.inc.c
> >> @@ -61,7 +61,7 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
> >>
> >>      gen_set_label(l1);
> >>      /*
> >> -     * Address comparion failure. However, we still need to
> >> +     * Address comparison failure. However, we still need to
> >>       * provide the memory barrier implied by AQ/RL.
> >>       */
> >>      tcg_gen_mb(TCG_MO_ALL + a->aq * TCG_BAR_LDAQ + a->rl * TCG_BAR_STRL);
> >> @@ -69,6 +69,12 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
> >>      gen_set_gpr(a->rd, dat);
> >>
> >>      gen_set_label(l2);
> >> +    /*
> >> +     * Clear the load reservation, since an SC must fail if there is
> >> +     * an SC to any address, in between an LR and SC pair.
> >> +     */
> >> +    tcg_gen_movi_tl(load_res, -1);
> >> +
> >>      tcg_temp_free(dat);
> >>      tcg_temp_free(src1);
> >>      tcg_temp_free(src2);
> >> --
> >
> > This patch causes boot failures when booting systemd built with musl on RV64.
> >
> > It could possibly be a musl bug, but I wanted to throw that out here
> > first to see what people think.
>
> Looking at the musl port, I see at least one bug in their atomics jumping out
> at me:
>
> diff --git a/arch/riscv64/atomic_arch.h b/arch/riscv64/atomic_arch.h
> index c9765342..41ad4d04 100644
> --- a/arch/riscv64/atomic_arch.h
> +++ b/arch/riscv64/atomic_arch.h
> @@ -14,7 +14,7 @@ static inline int a_cas(volatile int *p, int t, int s)
>  		"	sc.w.aqrl %1, %4, (%2)\n"
>  		"	bnez %1, 1b\n"
>  		"1:"
> -		: "=&r"(old), "=r"(tmp)
> +		: "=&r"(old), "=&r"(tmp)
>  		: "r"(p), "r"(t), "r"(s)
>  		: "memory");
>  	return old;
> @@ -31,7 +31,7 @@ static inline void *a_cas_p(volatile void *p, void *t, void *s)
>  		"	sc.d.aqrl %1, %4, (%2)\n"
>  		"	bnez %1, 1b\n"
>  		"1:"
> -		: "=&r"(old), "=r"(tmp)
> +		: "=&r"(old), "=&r"(tmp)
>  		: "r"(p), "r"(t), "r"(s)
>  		: "memory");
>  	return old;
>
> It's a shot in the dark as to whether that'll fix your bug, but I could at
> least see a mechanism for it: before we yielded load reservations on context
> switches then that backwards branch would never be taken, so we wouldn't
> notice if tmp was allocated into one of the same registers as the inputs.
> Even if it doesn't fix your issue it's still a bug so I'll send the patch
> out, just LMK so I can indicate how important the issue is.

I haven't had a chance to test this fix yet.  The bug was reported by
Khem (and other OE people) as it breaks musl for RISC-V.

> This should manifest on hardware, but it looks like we managed to drop that
> SC patch.  I'll go send the patch out now...

Thanks, do you mind CCing me?

Alistair

> >
> > Alistair
> >
> >> 2.21.0
> >>
> >>
On Tue, Sep 24, 2019 at 4:35 PM Alistair Francis <alistair23@gmail.com> wrote: > > On Tue, Sep 24, 2019 at 1:04 PM Palmer Dabbelt <palmer@sifive.com> wrote: > > > > On Tue, 24 Sep 2019 11:29:25 PDT (-0700), alistair23@gmail.com wrote: > > > On Mon, Jun 24, 2019 at 11:21 AM Joel Sing <joel@sing.id.au> wrote: > > >> > > >> On 19-06-17 16:52:44, Richard Henderson wrote: > > >> > On 6/16/19 12:19 PM, Joel Sing wrote: > > >> > > + /* > > >> > > + * Clear the load reservation, since an SC must fail if there is > > >> > > + * an SC to any address, in between an LR and SC pair. > > >> > > + */ > > >> > > + tcg_gen_movi_tl(load_res, 0); > > >> > > + > > >> > > gen_set_label(l2); > > >> > > > >> > This clear needs to be moved down below label l2. > > >> > Otherwise, with lr / sc / sc, the second sc could succeed in error. > > >> > > >> Indeed, thanks. > > >> > > >> > FWIW, other targets have used -1 as the "invalid" load reservation, since the > > >> > architecture does not require address 0 to be unmapped. This should be quite > > >> > visible in M-mode with paging disabled and ram at offset 0. Often, other > > >> > targets require alignment for the lr/sc address, though I don't see that for riscv. > > >> > > >> I've switched to -1 as suggested. Regarding the alignment for reservations, the > > >> specification does require this, although I do not recall seeing any enforcement > > >> of this by qemu itself. > > >> > > >> New diff follows. > > >> > > >> From 8ef31a2ce8ef1cbeee92995a0b2994f480e9bb6d Mon Sep 17 00:00:00 2001 > > >> From: Joel Sing <joel@sing.id.au> > > >> Date: Tue, 25 Jun 2019 02:44:24 +1000 > > >> Subject: [PATCH] Clear load reservations on qemu riscv target > > >> > > >> This prevents a load reservation from being placed in one context/process, > > >> then being used in another, resulting in an SC succeeding incorrectly and > > >> breaking atomics. 
> > >> > > >> Signed-off-by: Joel Sing <joel@sing.id.au> > > >> --- > > >> target/riscv/cpu.c | 1 + > > >> target/riscv/cpu_helper.c | 9 +++++++++ > > >> target/riscv/insn_trans/trans_rva.inc.c | 8 +++++++- > > >> 3 files changed, 17 insertions(+), 1 deletion(-) > > >> > > >> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > > >> index d61bce6d55..e7c8bf48fc 100644 > > >> --- a/target/riscv/cpu.c > > >> +++ b/target/riscv/cpu.c > > >> @@ -281,6 +281,7 @@ static void riscv_cpu_reset(CPUState *cs) > > >> env->pc = env->resetvec; > > >> #endif > > >> cs->exception_index = EXCP_NONE; > > >> + env->load_res = -1; > > >> set_default_nan_mode(1, &env->fp_status); > > >> } > > >> > > >> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c > > >> index b17f169681..6a07b12e65 100644 > > >> --- a/target/riscv/cpu_helper.c > > >> +++ b/target/riscv/cpu_helper.c > > >> @@ -113,6 +113,15 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong newpriv) > > >> } > > >> /* tlb_flush is unnecessary as mode is contained in mmu_idx */ > > >> env->priv = newpriv; > > >> + > > >> + /* Clear the load reservation - otherwise a reservation placed in one > > >> + * context/process can be used by another, resulting in an SC succeeding > > >> + * incorrectly. Version 2.2 of the ISA specification explicitly requires > > >> + * this behaviour, while later revisions say that the kernel "should" use > > >> + * an SC instruction to force the yielding of a load reservation on a > > >> + * preemptive context switch. As a result, do both. 
> > >> +     */
> > >> +    env->load_res = -1;
> > >>  }
> > >>
> > >>  /* get_physical_address - get the physical address for this virtual address
> > >> diff --git a/target/riscv/insn_trans/trans_rva.inc.c b/target/riscv/insn_trans/trans_rva.inc.c
> > >> index f6dbbc065e..fadd88849e 100644
> > >> --- a/target/riscv/insn_trans/trans_rva.inc.c
> > >> +++ b/target/riscv/insn_trans/trans_rva.inc.c
> > >> @@ -61,7 +61,7 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
> > >>
> > >>      gen_set_label(l1);
> > >>      /*
> > >> -     * Address comparion failure. However, we still need to
> > >> +     * Address comparison failure. However, we still need to
> > >>       * provide the memory barrier implied by AQ/RL.
> > >>       */
> > >>      tcg_gen_mb(TCG_MO_ALL + a->aq * TCG_BAR_LDAQ + a->rl * TCG_BAR_STRL);
> > >> @@ -69,6 +69,12 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
> > >>      gen_set_gpr(a->rd, dat);
> > >>
> > >>      gen_set_label(l2);
> > >> +    /*
> > >> +     * Clear the load reservation, since an SC must fail if there is
> > >> +     * an SC to any address, in between an LR and SC pair.
> > >> +     */
> > >> +    tcg_gen_movi_tl(load_res, -1);
> > >> +
> > >>      tcg_temp_free(dat);
> > >>      tcg_temp_free(src1);
> > >>      tcg_temp_free(src2);
> > >> --
> > >
> > > This patch causes boot failures when booting systemd built with musl on RV64.
> > >
> > > It could possibly be a musl bug, but I wanted to throw that out here
> > > first to see what people think.
> >
> > Looking at the musl port, I see at least one bug in their atomics jumping out
> > at me:
> >
> >     diff --git a/arch/riscv64/atomic_arch.h b/arch/riscv64/atomic_arch.h
> >     index c9765342..41ad4d04 100644
> >     --- a/arch/riscv64/atomic_arch.h
> >     +++ b/arch/riscv64/atomic_arch.h
> >     @@ -14,7 +14,7 @@ static inline int a_cas(volatile int *p, int t, int s)
> >                     "       sc.w.aqrl %1, %4, (%2)\n"
> >                     "       bnez %1, 1b\n"
> >                     "1:"
> >     -               : "=&r"(old), "=r"(tmp)
> >     +               : "=&r"(old), "=&r"(tmp)
> >                     : "r"(p), "r"(t), "r"(s)
> >                     : "memory");
> >             return old;
> >     @@ -31,7 +31,7 @@ static inline void *a_cas_p(volatile void *p, void *t, void *s)
> >                     "       sc.d.aqrl %1, %4, (%2)\n"
> >                     "       bnez %1, 1b\n"
> >                     "1:"
> >     -               : "=&r"(old), "=r"(tmp)
> >     +               : "=&r"(old), "=&r"(tmp)
> >                     : "r"(p), "r"(t), "r"(s)
> >                     : "memory");
> >             return old;
> >
> > It's a shot in the dark as to whether that'll fix your bug, but I could at
> > least see a mechanism for it: before we yielded load reservations on context
> > switches, that backwards branch would never be taken, so we wouldn't notice
> > if tmp was allocated into one of the same registers as the inputs.  Even if it
> > doesn't fix your issue it's still a bug so I'll send the patch out, just LMK so
> > I can indicate how important the issue is.
>
> I haven't had a chance to test this fix yet. The bug was reported by
> Khem (and other OE people) as it's breaking musl for RISC-V.

I did get a chance to test it. This seems to fix the issue :)

Please send the patch to musl and CC me when you do. Good catch!

Alistair

>
> >
> > This should manifest on hardware, but it looks like we managed to drop that SC
> > patch.  I'll go send the patch out now...
>
> Thanks, do you mind CCing me?
>
> Alistair
>
> >
> > >
> > > Alistair
> > >
> > >> 2.21.0
> > >>
> > >>
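[The register-aliasing failure mode described above can be sketched in plain C. This is a hypothetical model of the register allocation, not musl's actual code or real inline asm: simulated_sc() stands in for sc.w failing once because the reservation was yielded on a context switch, and cas_aliased() models the missing earlyclobber, where the sc status value ("tmp") shares a register with the input s, so the first failed sc clobbers the value the retry needs to store.]

```c
/* Count of sc attempts; the first one fails, as when a context switch
 * cleared the load reservation between the lr and the sc. */
int sc_attempts;

static int simulated_sc(volatile int *p, int val)
{
    if (sc_attempts++ == 0) {
        return 1;               /* failure: reservation was lost */
    }
    *p = val;
    return 0;                   /* success */
}

/* Correct loop: the status lives in its own register (earlyclobber honored). */
int cas_correct(volatile int *p, int t, int s)
{
    int old, tmp = 0;
    do {
        old = *p;               /* models lr.w */
        if (old != t) {
            break;
        }
        tmp = simulated_sc(p, s);   /* sc.w writes its status into tmp */
    } while (tmp);              /* bnez tmp, retry */
    return old;
}

/* Buggy loop: models tmp being allocated into the same register as s,
 * so the sc status overwrites the value to be stored before the retry. */
int cas_aliased(volatile int *p, int t, int s)
{
    int old;
    do {
        old = *p;
        if (old != t) {
            break;
        }
        s = simulated_sc(p, s); /* status clobbers the input register */
    } while (s);
    return old;
}
```

[With a reservation lost once, cas_correct still stores the intended value on retry, while cas_aliased retries and stores the stale status value 1 instead, which matches the kind of silent corruption a missing "&" can cause.]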
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index b17f169681..19029429a7 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -113,6 +113,15 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong newpriv)
     }
     /* tlb_flush is unnecessary as mode is contained in mmu_idx */
     env->priv = newpriv;
+
+    /* Clear the load reservation - otherwise a reservation placed in one
+     * context/process can be used by another, resulting in an SC succeeding
+     * incorrectly. Version 2.2 of the ISA specification explicitly requires
+     * this behaviour, while later revisions say that the kernel "should" use
+     * an SC instruction to force the yielding of a load reservation on a
+     * preemptive context switch. As a result, do both.
+     */
+    env->load_res = 0;
 }

 /* get_physical_address - get the physical address for this virtual address
diff --git a/target/riscv/insn_trans/trans_rva.inc.c b/target/riscv/insn_trans/trans_rva.inc.c
index f6dbbc065e..bb560a9d05 100644
--- a/target/riscv/insn_trans/trans_rva.inc.c
+++ b/target/riscv/insn_trans/trans_rva.inc.c
@@ -61,13 +61,19 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)

     gen_set_label(l1);
     /*
-     * Address comparion failure. However, we still need to
+     * Address comparison failure. However, we still need to
      * provide the memory barrier implied by AQ/RL.
      */
     tcg_gen_mb(TCG_MO_ALL + a->aq * TCG_BAR_LDAQ + a->rl * TCG_BAR_STRL);
     tcg_gen_movi_tl(dat, 1);
     gen_set_gpr(a->rd, dat);

+    /*
+     * Clear the load reservation, since an SC must fail if there is
+     * an SC to any address, in between an LR and SC pair.
+     */
+    tcg_gen_movi_tl(load_res, 0);
+
     gen_set_label(l2);
     tcg_temp_free(dat);
     tcg_temp_free(src1);
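[The semantics this thread converges on can be modeled in plain C. This is a sketch with made-up names, not QEMU's actual TCG-generated code, and it follows the revised patch in using -1 rather than 0 as the "no reservation" marker: an SC consumes the reservation whether it succeeds or fails (so the second sc of an lr/sc/sc sequence always fails), and a context/privilege switch yields any outstanding reservation.]

```c
#include <stdint.h>

#define NO_RESERVATION ((uint64_t)-1)

/* Hypothetical model of a hart's reservation state and one memory word. */
typedef struct {
    uint64_t load_res;   /* reserved address; -1 means "no reservation" */
    int64_t  mem;        /* single memory word, enough for the sketch */
} HartState;

/* lr: load the word and place a reservation on its address. */
int64_t model_lr(HartState *h, uint64_t addr)
{
    h->load_res = addr;
    return h->mem;
}

/*
 * sc: succeed (return 0, as on riscv) only if the reservation matches.
 * Whatever the outcome, the reservation is consumed - this is why the
 * clear belongs below label l2, on both the success and failure paths.
 */
int model_sc(HartState *h, uint64_t addr, int64_t val)
{
    int ok = (h->load_res == addr);
    h->load_res = NO_RESERVATION;   /* clear on success and failure alike */
    if (ok) {
        h->mem = val;
    }
    return ok ? 0 : 1;
}

/* Context/privilege switch: yield any outstanding reservation, so a
 * reservation placed in one context cannot satisfy an SC in another. */
void model_set_mode(HartState *h)
{
    h->load_res = NO_RESERVATION;
}
```

[Using -1 rather than 0 as the invalid value matters because, as noted earlier in the thread, address 0 may be mapped RAM in M-mode, so a reservation "cleared" to 0 could still match a legitimate LR/SC at that address.]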