Message ID | 20250116120710.51673-1-luxu.kernel@bytedance.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | irqchip: riscv: Order normal writes and IPI writes | expand |
On Thu, Jan 16, 2025 at 08:07:10PM +0800, Xu Lu wrote: > Replace writel_relaxed() with writel() when issuing IPI to ensure all > previous write operations made by current CPU are visible to other CPUs. Did you experience an ordering issue from this? - Charlie > > Signed-off-by: Xu Lu <luxu.kernel@bytedance.com> > --- > drivers/irqchip/irq-riscv-imsic-early.c | 2 +- > drivers/irqchip/irq-thead-c900-aclint-sswi.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c > index c5c2e6929a2f..275df5005705 100644 > --- a/drivers/irqchip/irq-riscv-imsic-early.c > +++ b/drivers/irqchip/irq-riscv-imsic-early.c > @@ -27,7 +27,7 @@ static void imsic_ipi_send(unsigned int cpu) > { > struct imsic_local_config *local = per_cpu_ptr(imsic->global.local, cpu); > > - writel_relaxed(IMSIC_IPI_ID, local->msi_va); > + writel(IMSIC_IPI_ID, local->msi_va); > } > > static void imsic_ipi_starting_cpu(void) > diff --git a/drivers/irqchip/irq-thead-c900-aclint-sswi.c b/drivers/irqchip/irq-thead-c900-aclint-sswi.c > index b0e366ade427..8ff6e7a1363b 100644 > --- a/drivers/irqchip/irq-thead-c900-aclint-sswi.c > +++ b/drivers/irqchip/irq-thead-c900-aclint-sswi.c > @@ -31,7 +31,7 @@ static DEFINE_PER_CPU(void __iomem *, sswi_cpu_regs); > > static void thead_aclint_sswi_ipi_send(unsigned int cpu) > { > - writel_relaxed(0x1, per_cpu(sswi_cpu_regs, cpu)); > + writel(0x1, per_cpu(sswi_cpu_regs, cpu)); > } > > static void thead_aclint_sswi_ipi_clear(void) > -- > 2.20.1 > > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv
Charlie! On Thu, Jan 16 2025 at 13:09, Charlie Jenkins wrote: > On Thu, Jan 16, 2025 at 08:07:10PM +0800, Xu Lu wrote: >> Replace writel_relaxed() with writel() when issuing IPI to ensure all >> previous write operations made by current CPU are visible to other CPUs. > > Did you experience an ordering issue from this? That's not the right question. CPU 0 CPU 1 store A // data store B // IPI IPI handler load A The real question is whether the RISC-V memory model guarantees under all circumstances that A is globally visible before the IPI handler load. If so, then the writel_relaxed() is fine. If not, the fence is required. That's not a question of observation. It's a question of facts defined by the architecture. People have wasted months to analyze such fails which tend to happen once every blue moon with no other trace than "something went wrong" .... > - Charlie Please trim your replies... Thanks, tglx
Hi Thomas, On Fri, Jan 17, 2025 at 4:05 PM Thomas Gleixner <tglx@linutronix.de> wrote: > > Charlie! > > On Thu, Jan 16 2025 at 13:09, Charlie Jenkins wrote: > > On Thu, Jan 16, 2025 at 08:07:10PM +0800, Xu Lu wrote: > >> Replace writel_relaxed() with writel() when issuing IPI to ensure all > >> previous write operations made by current CPU are visible to other CPUs. > > > > Did you experience an ordering issue from this? > > That's not the right question. > > CPU 0 CPU 1 > store A // data > store B // IPI > IPI handler > load A > > The real question is whether the RISC-V memory model guarantees under > all circumstances that A is globally visible before the IPI handler > load. If so, then the writel_relaxed() is fine. If not, the fence is > required. > > That's not a question of observation. It's a question of facts defined > by the architecture. People have wasted months to analyze such fails > which tend to happen once every blue moon with no other trace than > "something went wrong" .... The RISC-V FENCE instruction distinguishes between normal memory operations and I/O operations in its predecessor and successor sets where r = normal read, w = normal write, i = I/O read, and o = I/O write. The ipi_mux_send_mask() does smp_mb__after_atomic() (equals to "fence rw,rw") before calling imsic_ipi_send(). This prevents ordering of normal memory writes in imsic_ipi_send() before smp_mb__after_atomic() in ipi_mux_send_mask() but it does not prevent I/O memory writes in imsic_ipi_send() to be ordered before smp_mb__after_atomic(). This means currently nothing prevents the I/O memory write in imsic_ipi_send() to be ordered before normal memory writes in ipi_mux_send_mask() hence there is a very unlikely possibility of an IPI handler on the target CPU seeing incorrect data. The conversion of writel_relaxed() to writel() in imsic_ipi_send() adds a "fence w,o" before the actual I/O memory write which ensures that I/O memory write is not ordered before normal memory writes. Based on the above, the conversion of writel_relaxed() to writel() in imsic_ipi_send() looks good to me. Regards, Anup
On Fri, Jan 17 2025 at 21:23, Anup Patel wrote: > On Fri, Jan 17, 2025 at 4:05 PM Thomas Gleixner <tglx@linutronix.de> wrote: >> On Thu, Jan 16 2025 at 13:09, Charlie Jenkins wrote: >> > On Thu, Jan 16, 2025 at 08:07:10PM +0800, Xu Lu wrote: >> >> Replace writel_relaxed() with writel() when issuing IPI to ensure all >> >> previous write operations made by current CPU are visible to other CPUs. >> > >> > Did you experience an ordering issue from this? >> >> That's not the right question. >> >> CPU 0 CPU 1 >> store A // data >> store B // IPI >> IPI handler >> load A >> >> The real question is whether the RISC-V memory model guarantees under >> all circumstances that A is globally visible before the IPI handler >> load. If so, then the writel_relaxed() is fine. If not, the fence is >> required. >> >> That's not a question of observation. It's a question of facts defined >> by the architecture. People have wasted months to analyze such fails >> which tend to happen once every blue moon with no other trace than >> "something went wrong" .... > > The RISC-V FENCE instruction distinguishes between normal > memory operations and I/O operations in its predecessor and > successor sets where r = normal read, w = normal write, > i = I/O read, and o = I/O write. > > The ipi_mux_send_mask() does smp_mb__after_atomic() (equals > to "fence rw,rw") before calling imsic_ipi_send(). This prevents > ordering of normal memory writes in imsic_ipi_send() before > smp_mb__after_atomic() in ipi_mux_send_mask() but it does > not prevent I/O memory writes in imsic_ipi_send() to be ordered > before smp_mb__after_atomic(). > > This means currently nothing prevents the I/O memory write in > imsic_ipi_send() to be ordered before normal memory writes in > ipi_mux_send_mask() hence there is a very unlikely possibility > of an IPI handler on the target CPU seeing incorrect data. Very unlikely is a valid assumption for a single system, but at scale it becomes very likely :) > The conversion of writel_relaxed() to writel() in imsic_ipi_send() > adds a "fence w,o" before the actual I/O memory write which > ensures that I/O memory write is not ordered before normal > memory writes. > > Based on the above, the conversion of writel_relaxed() to > writel() in imsic_ipi_send() looks good to me. Can we please have something like the above in the change log so this is documented for posterity? Thanks tglx
On Mon, Jan 20, 2025 at 3:37 PM Thomas Gleixner <tglx@linutronix.de> wrote: > > On Fri, Jan 17 2025 at 21:23, Anup Patel wrote: > > On Fri, Jan 17, 2025 at 4:05 PM Thomas Gleixner <tglx@linutronix.de> wrote: > >> On Thu, Jan 16 2025 at 13:09, Charlie Jenkins wrote: > >> > On Thu, Jan 16, 2025 at 08:07:10PM +0800, Xu Lu wrote: > >> >> Replace writel_relaxed() with writel() when issuing IPI to ensure all > >> >> previous write operations made by current CPU are visible to other CPUs. > >> > > >> > Did you experience an ordering issue from this? > >> > >> That's not the right question. > >> > >> CPU 0 CPU 1 > >> store A // data > >> store B // IPI > >> IPI handler > >> load A > >> > >> The real question is whether the RISC-V memory model guarantees under > >> all circumstances that A is globally visible before the IPI handler > >> load. If so, then the writel_relaxed() is fine. If not, the fence is > >> required. > >> > >> That's not a question of observation. It's a question of facts defined > >> by the architecture. People have wasted months to analyze such fails > >> which tend to happen once every blue moon with no other trace than > >> "something went wrong" .... > > > > The RISC-V FENCE instruction distinguishes between normal > > memory operations and I/O operations in its predecessor and > > successor sets where r = normal read, w = normal write, > > i = I/O read, and o = I/O write. > > > > The ipi_mux_send_mask() does smp_mb__after_atomic() (equals > > to "fence rw,rw") before calling imsic_ipi_send(). This prevents > > ordering of normal memory writes in imsic_ipi_send() before > > smp_mb__after_atomic() in ipi_mux_send_mask() but it does > > not prevent I/O memory writes in imsic_ipi_send() to be ordered > > before smp_mb__after_atomic(). > > > > This means currently nothing prevents the I/O memory write in > > imsic_ipi_send() to be ordered before normal memory writes in > > ipi_mux_send_mask() hence there is a very unlikely possibility > > of an IPI handler on the target CPU seeing incorrect data. > > Very unlikely is a valid assumption for a single system, but at scale it > becomes very likely :) > > > The conversion of writel_relaxed() to writel() in imsic_ipi_send() > > adds a "fence w,o" before the actual I/O memory write which > > ensures that I/O memory write is not ordered before normal > > memory writes. > > > > Based on the above, the conversion of writel_relaxed() to > > writel() in imsic_ipi_send() looks good to me. > > Can we please have something like the above in the change log so this is > documented for posterity? Thanks for Anup's supplied information. I will refine the commit message and resend the patch. Thanks, Xu Lu > > Thanks > > tglx
diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c index c5c2e6929a2f..275df5005705 100644 --- a/drivers/irqchip/irq-riscv-imsic-early.c +++ b/drivers/irqchip/irq-riscv-imsic-early.c @@ -27,7 +27,7 @@ static void imsic_ipi_send(unsigned int cpu) { struct imsic_local_config *local = per_cpu_ptr(imsic->global.local, cpu); - writel_relaxed(IMSIC_IPI_ID, local->msi_va); + writel(IMSIC_IPI_ID, local->msi_va); } static void imsic_ipi_starting_cpu(void) diff --git a/drivers/irqchip/irq-thead-c900-aclint-sswi.c b/drivers/irqchip/irq-thead-c900-aclint-sswi.c index b0e366ade427..8ff6e7a1363b 100644 --- a/drivers/irqchip/irq-thead-c900-aclint-sswi.c +++ b/drivers/irqchip/irq-thead-c900-aclint-sswi.c @@ -31,7 +31,7 @@ static DEFINE_PER_CPU(void __iomem *, sswi_cpu_regs); static void thead_aclint_sswi_ipi_send(unsigned int cpu) { - writel_relaxed(0x1, per_cpu(sswi_cpu_regs, cpu)); + writel(0x1, per_cpu(sswi_cpu_regs, cpu)); } static void thead_aclint_sswi_ipi_clear(void)
Replace writel_relaxed() with writel() when issuing IPI to ensure all previous write operations made by current CPU are visible to other CPUs. Signed-off-by: Xu Lu <luxu.kernel@bytedance.com> --- drivers/irqchip/irq-riscv-imsic-early.c | 2 +- drivers/irqchip/irq-thead-c900-aclint-sswi.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)