diff mbox series

irqchip: riscv: Order normal writes and IPI writes

Message ID 20250116120710.51673-1-luxu.kernel@bytedance.com (mailing list archive)
State New
Headers show
Series irqchip: riscv: Order normal writes and IPI writes | expand

Checks

Context Check Description
conchuod/vmtest-for-next-PR success PR summary
conchuod/patch-1-test-1 success .github/scripts/patches/tests/build_rv32_defconfig.sh took 107.17s
conchuod/patch-1-test-2 success .github/scripts/patches/tests/build_rv64_clang_allmodconfig.sh took 1076.12s
conchuod/patch-1-test-3 success .github/scripts/patches/tests/build_rv64_gcc_allmodconfig.sh took 1274.48s
conchuod/patch-1-test-4 success .github/scripts/patches/tests/build_rv64_nommu_k210_defconfig.sh took 16.54s
conchuod/patch-1-test-5 success .github/scripts/patches/tests/build_rv64_nommu_virt_defconfig.sh took 18.30s
conchuod/patch-1-test-6 success .github/scripts/patches/tests/checkpatch.sh took 0.56s
conchuod/patch-1-test-7 success .github/scripts/patches/tests/dtb_warn_rv64.sh took 36.95s
conchuod/patch-1-test-8 success .github/scripts/patches/tests/header_inline.sh took 0.00s
conchuod/patch-1-test-9 success .github/scripts/patches/tests/kdoc.sh took 0.48s
conchuod/patch-1-test-10 success .github/scripts/patches/tests/module_param.sh took 0.01s
conchuod/patch-1-test-11 success .github/scripts/patches/tests/verify_fixes.sh took 0.00s
conchuod/patch-1-test-12 success .github/scripts/patches/tests/verify_signedoff.sh took 0.02s

Commit Message

Xu Lu Jan. 16, 2025, 12:07 p.m. UTC
Replace writel_relaxed() with writel() when issuing IPI to ensure all
previous write operations made by current CPU are visible to other CPUs.

Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
---
 drivers/irqchip/irq-riscv-imsic-early.c      | 2 +-
 drivers/irqchip/irq-thead-c900-aclint-sswi.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Comments

Charlie Jenkins Jan. 16, 2025, 9:09 p.m. UTC | #1
On Thu, Jan 16, 2025 at 08:07:10PM +0800, Xu Lu wrote:
> Replace writel_relaxed() with writel() when issuing IPI to ensure all
> previous write operations made by current CPU are visible to other CPUs.

Did you experience an ordering issue from this?

- Charlie

> 
> Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
> ---
>  drivers/irqchip/irq-riscv-imsic-early.c      | 2 +-
>  drivers/irqchip/irq-thead-c900-aclint-sswi.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c
> index c5c2e6929a2f..275df5005705 100644
> --- a/drivers/irqchip/irq-riscv-imsic-early.c
> +++ b/drivers/irqchip/irq-riscv-imsic-early.c
> @@ -27,7 +27,7 @@ static void imsic_ipi_send(unsigned int cpu)
>  {
>  	struct imsic_local_config *local = per_cpu_ptr(imsic->global.local, cpu);
>  
> -	writel_relaxed(IMSIC_IPI_ID, local->msi_va);
> +	writel(IMSIC_IPI_ID, local->msi_va);
>  }
>  
>  static void imsic_ipi_starting_cpu(void)
> diff --git a/drivers/irqchip/irq-thead-c900-aclint-sswi.c b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> index b0e366ade427..8ff6e7a1363b 100644
> --- a/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> +++ b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> @@ -31,7 +31,7 @@ static DEFINE_PER_CPU(void __iomem *, sswi_cpu_regs);
>  
>  static void thead_aclint_sswi_ipi_send(unsigned int cpu)
>  {
> -	writel_relaxed(0x1, per_cpu(sswi_cpu_regs, cpu));
> +	writel(0x1, per_cpu(sswi_cpu_regs, cpu));
>  }
>  
>  static void thead_aclint_sswi_ipi_clear(void)
> -- 
> 2.20.1
> 
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
Thomas Gleixner Jan. 17, 2025, 10:35 a.m. UTC | #2
Charlie!

On Thu, Jan 16 2025 at 13:09, Charlie Jenkins wrote:
> On Thu, Jan 16, 2025 at 08:07:10PM +0800, Xu Lu wrote:
>> Replace writel_relaxed() with writel() when issuing IPI to ensure all
>> previous write operations made by current CPU are visible to other CPUs.
>
> Did you experience an ordering issue from this?

That's not the right question.

      CPU 0                     CPU 1
      store A   // data
      store B   // IPI
                                IPI handler
                                load A

The real question is whether the RISC-V memory model guarantees under
all circumstances that A is globally visible before the IPI handler
load. If so, then the writel_relaxed() is fine. If not, the fence is
required.

That's not a question of observation. It's a question of facts defined
by the architecture. People have wasted months to analyze such fails
which tend to happen once every blue moon with no other trace than
"something went wrong" ....

> - Charlie

Please trim your replies...

Thanks,

        tglx
Anup Patel Jan. 17, 2025, 3:53 p.m. UTC | #3
Hi Thomas,

On Fri, Jan 17, 2025 at 4:05 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Charlie!
>
> On Thu, Jan 16 2025 at 13:09, Charlie Jenkins wrote:
> > On Thu, Jan 16, 2025 at 08:07:10PM +0800, Xu Lu wrote:
> >> Replace writel_relaxed() with writel() when issuing IPI to ensure all
> >> previous write operations made by current CPU are visible to other CPUs.
> >
> > Did you experience an ordering issue from this?
>
> That's not the right question.
>
>       CPU 0                     CPU 1
>       store A   // data
>       store B   // IPI
>                                 IPI handler
>                                 load A
>
> The real question is whether the RISC-V memory model guarantees under
> all circumstances that A is globally visible before the IPI handler
> load. If so, then the writel_relaxed() is fine. If not, the fence is
> required.
>
> That's not a question of observation. It's a question of facts defined
> by the architecture. People have wasted months to analyze such fails
> which tend to happen once every blue moon with no other trace than
> "something went wrong" ....

The RISC-V FENCE instruction distinguishes between normal
memory operations and I/O operations in its predecessor and
successor sets where r = normal read, w = normal write,
i = I/O read, and o = I/O write.

The ipi_mux_send_mask() does smp_mb__after_atomic() (equals
to "fence rw,rw") before calling imsic_ipi_send(). This prevents
ordering of normal memory writes in imsic_ipi_send() before
smp_mb__after_atomic() in ipi_mux_send_mask() but it does
not prevent I/O memory writes in imsic_ipi_send() to be ordered
before smp_mb__after_atomic().

This means currently nothing prevents the I/O memory write in
imsic_ipi_send() to be ordered before normal memory writes in
ipi_mux_send_mask() hence there is a very unlikely possibility
of an IPI handler on the target CPU seeing incorrect data.

The conversion of writel_relaxed() to writel() in imsic_ipi_send()
adds a "fence w,o" before the actual I/O memory write which
ensures that I/O memory write is not ordered before normal
memory writes.

Based on the above, the conversion of writel_relaxed() to
writel() in imsic_ipi_send() looks good to me.

Regards,
Anup
Thomas Gleixner Jan. 20, 2025, 7:37 a.m. UTC | #4
On Fri, Jan 17 2025 at 21:23, Anup Patel wrote:
> On Fri, Jan 17, 2025 at 4:05 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>> On Thu, Jan 16 2025 at 13:09, Charlie Jenkins wrote:
>> > On Thu, Jan 16, 2025 at 08:07:10PM +0800, Xu Lu wrote:
>> >> Replace writel_relaxed() with writel() when issuing IPI to ensure all
>> >> previous write operations made by current CPU are visible to other CPUs.
>> >
>> > Did you experience an ordering issue from this?
>>
>> That's not the right question.
>>
>>       CPU 0                     CPU 1
>>       store A   // data
>>       store B   // IPI
>>                                 IPI handler
>>                                 load A
>>
>> The real question is whether the RISC-V memory model guarantees under
>> all circumstances that A is globally visible before the IPI handler
>> load. If so, then the writel_relaxed() is fine. If not, the fence is
>> required.
>>
>> That's not a question of observation. It's a question of facts defined
>> by the architecture. People have wasted months to analyze such fails
>> which tend to happen once every blue moon with no other trace than
>> "something went wrong" ....
>
> The RISC-V FENCE instruction distinguishes between normal
> memory operations and I/O operations in its predecessor and
> successor sets where r = normal read, w = normal write,
> i = I/O read, and o = I/O write.
>
> The ipi_mux_send_mask() does smp_mb__after_atomic() (equals
> to "fence rw,rw") before calling imsic_ipi_send(). This prevents
> ordering of normal memory writes in imsic_ipi_send() before
> smp_mb__after_atomic() in ipi_mux_send_mask() but it does
> not prevent I/O memory writes in imsic_ipi_send() to be ordered
> before smp_mb__after_atomic().
>
> This means currently nothing prevents the I/O memory write in
> imsic_ipi_send() to be ordered before normal memory writes in
> ipi_mux_send_mask() hence there is a very unlikely possibility
> of an IPI handler on the target CPU seeing incorrect data.

Very unlikely is a valid assumption for a single system, but at scale it
becomes very likely :)

> The conversion of writel_relaxed() to writel() in imsic_ipi_send()
> adds a "fence w,o" before the actual I/O memory write which
> ensures that I/O memory write is not ordered before normal
> memory writes.
>
> Based on the above, the conversion of writel_relaxed() to
> writel() in imsic_ipi_send() looks good to me.

Can we please have something like the above in the change log so this is
documented for posterity?

Thanks

        tglx
Xu Lu Jan. 20, 2025, 11:05 a.m. UTC | #5
On Mon, Jan 20, 2025 at 3:37 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Fri, Jan 17 2025 at 21:23, Anup Patel wrote:
> > On Fri, Jan 17, 2025 at 4:05 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> On Thu, Jan 16 2025 at 13:09, Charlie Jenkins wrote:
> >> > On Thu, Jan 16, 2025 at 08:07:10PM +0800, Xu Lu wrote:
> >> >> Replace writel_relaxed() with writel() when issuing IPI to ensure all
> >> >> previous write operations made by current CPU are visible to other CPUs.
> >> >
> >> > Did you experience an ordering issue from this?
> >>
> >> That's not the right question.
> >>
> >>       CPU 0                     CPU 1
> >>       store A   // data
> >>       store B   // IPI
> >>                                 IPI handler
> >>                                 load A
> >>
> >> The real question is whether the RISC-V memory model guarantees under
> >> all circumstances that A is globally visible before the IPI handler
> >> load. If so, then the writel_relaxed() is fine. If not, the fence is
> >> required.
> >>
> >> That's not a question of observation. It's a question of facts defined
> >> by the architecture. People have wasted months to analyze such fails
> >> which tend to happen once every blue moon with no other trace than
> >> "something went wrong" ....
> >
> > The RISC-V FENCE instruction distinguishes between normal
> > memory operations and I/O operations in its predecessor and
> > successor sets where r = normal read, w = normal write,
> > i = I/O read, and o = I/O write.
> >
> > The ipi_mux_send_mask() does smp_mb__after_atomic() (equals
> > to "fence rw,rw") before calling imsic_ipi_send(). This prevents
> > ordering of normal memory writes in imsic_ipi_send() before
> > smp_mb__after_atomic() in ipi_mux_send_mask() but it does
> > not prevent I/O memory writes in imsic_ipi_send() to be ordered
> > before smp_mb__after_atomic().
> >
> > This means currently nothing prevents the I/O memory write in
> > imsic_ipi_send() to be ordered before normal memory writes in
> > ipi_mux_send_mask() hence there is a very unlikely possibility
> > of an IPI handler on the target CPU seeing incorrect data.
>
> Very unlikely is a valid assumption for a single system, but at scale it
> becomes very likely :)
>
> > The conversion of writel_relaxed() to writel() in imsic_ipi_send()
> > adds a "fence w,o" before the actual I/O memory write which
> > ensures that I/O memory write is not ordered before normal
> > memory writes.
> >
> > Based on the above, the conversion of writel_relaxed() to
> > writel() in imsic_ipi_send() looks good to me.
>
> Can we please have something like the above in the change log so this is
> documented for posterity?

Thanks for Anup's supplied information. I will refine the commit
message and resend the patch.

Thanks,

Xu Lu

>
> Thanks
>
>         tglx
diff mbox series

Patch

diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c
index c5c2e6929a2f..275df5005705 100644
--- a/drivers/irqchip/irq-riscv-imsic-early.c
+++ b/drivers/irqchip/irq-riscv-imsic-early.c
@@ -27,7 +27,7 @@  static void imsic_ipi_send(unsigned int cpu)
 {
 	struct imsic_local_config *local = per_cpu_ptr(imsic->global.local, cpu);
 
-	writel_relaxed(IMSIC_IPI_ID, local->msi_va);
+	writel(IMSIC_IPI_ID, local->msi_va);
 }
 
 static void imsic_ipi_starting_cpu(void)
diff --git a/drivers/irqchip/irq-thead-c900-aclint-sswi.c b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
index b0e366ade427..8ff6e7a1363b 100644
--- a/drivers/irqchip/irq-thead-c900-aclint-sswi.c
+++ b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
@@ -31,7 +31,7 @@  static DEFINE_PER_CPU(void __iomem *, sswi_cpu_regs);
 
 static void thead_aclint_sswi_ipi_send(unsigned int cpu)
 {
-	writel_relaxed(0x1, per_cpu(sswi_cpu_regs, cpu));
+	writel(0x1, per_cpu(sswi_cpu_regs, cpu));
 }
 
 static void thead_aclint_sswi_ipi_clear(void)