diff mbox series

[kvm-unit-tests,01/10] lib: arm/arm64: gicv3: Add missing barrier when sending IPIs

Message ID 20201125155113.192079-2-alexandru.elisei@arm.com (mailing list archive)
State New, archived
Headers show
Series GIC fixes and improvements | expand

Commit Message

Alexandru Elisei Nov. 25, 2020, 3:51 p.m. UTC
One common usage for IPIs is for one CPU to write to a shared memory
location, send the IPI to kick another CPU, and the receiver to read from
the same location. Proper synchronization is needed to make sure that the
IPI receiver reads the most recent value and not stale data (for example,
the write from the sender CPU might still be in a store buffer).

For GICv3, IPIs are generated with a write to the ICC_SGI1R_EL1 register.
To make sure the memory stores are observable by other CPUs, we need a
wmb() barrier (DSB ST), which waits for stores to complete.

From the definition of DSB from ARM DDI 0487F.b, page B2-139:

"In addition, no instruction that appears in program order after the DSB
instruction can alter any state of the system or perform any part of its
functionality until the DSB completes other than:

- Being fetched from memory and decoded.
- Reading the general-purpose, SIMD and floating-point, Special-purpose, or
System registers that are directly or indirectly read without causing
side-effects."

Similar definition for armv7 (ARM DDI 0406C.d, page A3-150).

The DSB instruction is enough to prevent reordering of the GIC register
write which comes in program order after the memory access.

This also matches what the Linux GICv3 irqchip driver does (commit
21ec30c0ef52 ("irqchip/gic-v3: Use wmb() instead of smb_wmb() in
gic_raise_softirq()")).

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/arm/gic-v3.c | 3 +++
 arm/gic.c        | 2 ++
 2 files changed, 5 insertions(+)

Comments

Eric Auger Dec. 1, 2020, 4:37 p.m. UTC | #1
Hi Alexandru,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> One common usage for IPIs is for one CPU to write to a shared memory
> location, send the IPI to kick another CPU, and the receiver to read from
> the same location. Proper synchronization is needed to make sure that the
> IPI receiver reads the most recent value and not stale data (for example,
> the write from the sender CPU might still be in a store buffer).
> 
> For GICv3, IPIs are generated with a write to the ICC_SGI1R_EL1 register.
> To make sure the memory stores are observable by other CPUs, we need a
> wmb() barrier (DSB ST), which waits for stores to complete.
> 
> From the definition of DSB from ARM DDI 0487F.b, page B2-139:
> 
> "In addition, no instruction that appears in program order after the DSB
> instruction can alter any state of the system or perform any part of its
> functionality until the DSB completes other than:
> 
> - Being fetched from memory and decoded.
> - Reading the general-purpose, SIMD and floating-point, Special-purpose, or
> System registers that are directly or indirectly read without causing
> side-effects."
> 
> Similar definition for armv7 (ARM DDI 0406C.d, page A3-150).
> 
> The DSB instruction is enough to prevent reordering of the GIC register
> write which comes in program order after the memory access.
> 
> This also matches what the Linux GICv3 irqchip driver does (commit
> 21ec30c0ef52 ("irqchip/gic-v3: Use wmb() instead of smb_wmb() in
> gic_raise_softirq()")).
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>

> ---
>  lib/arm/gic-v3.c | 3 +++
>  arm/gic.c        | 2 ++
>  2 files changed, 5 insertions(+)
> 
> diff --git a/lib/arm/gic-v3.c b/lib/arm/gic-v3.c
> index a7e2cb819746..a6afa42d5fbe 100644
> --- a/lib/arm/gic-v3.c
> +++ b/lib/arm/gic-v3.c
> @@ -77,6 +77,9 @@ void gicv3_ipi_send_mask(int irq, const cpumask_t *dest)
>  
>  	assert(irq < 16);
>  
> +	/* Ensure stores are visible to other CPUs before sending the IPI */
nit: stores to normal memory ...
> +	wmb();
> +
>  	/*
>  	 * For each cpu in the mask collect its peers, which are also in
>  	 * the mask, in order to form target lists.
> diff --git a/arm/gic.c b/arm/gic.c
> index acb060585fae..512c83636a2e 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -275,6 +275,8 @@ static void gicv3_ipi_send_self(void)
>  
>  static void gicv3_ipi_send_broadcast(void)
>  {
> +	/* Ensure stores are visible to other CPUs before sending the IPI */
same
> +	wmb();
>  	gicv3_write_sgi1r(1ULL << 40 | IPI_IRQ << 24);
>  	isb();
>  }
> 
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric
Alexandru Elisei Dec. 1, 2020, 5:37 p.m. UTC | #2
Hi Eric,

Thank you so much for having a look at the patches!

On 12/1/20 4:37 PM, Auger Eric wrote:
> Hi Alexandru,
>
> On 11/25/20 4:51 PM, Alexandru Elisei wrote:
>> One common usage for IPIs is for one CPU to write to a shared memory
>> location, send the IPI to kick another CPU, and the receiver to read from
>> the same location. Proper synchronization is needed to make sure that the
>> IPI receiver reads the most recent value and not stale data (for example,
>> the write from the sender CPU might still be in a store buffer).
>>
>> For GICv3, IPIs are generated with a write to the ICC_SGI1R_EL1 register.
>> To make sure the memory stores are observable by other CPUs, we need a
>> wmb() barrier (DSB ST), which waits for stores to complete.
>>
>> From the definition of DSB from ARM DDI 0487F.b, page B2-139:
>>
>> "In addition, no instruction that appears in program order after the DSB
>> instruction can alter any state of the system or perform any part of its
>> functionality until the DSB completes other than:
>>
>> - Being fetched from memory and decoded.
>> - Reading the general-purpose, SIMD and floating-point, Special-purpose, or
>> System registers that are directly or indirectly read without causing
>> side-effects."
>>
>> Similar definition for armv7 (ARM DDI 0406C.d, page A3-150).
>>
>> The DSB instruction is enough to prevent reordering of the GIC register
>> write which comes in program order after the memory access.
>>
>> This also matches what the Linux GICv3 irqchip driver does (commit
>> 21ec30c0ef52 ("irqchip/gic-v3: Use wmb() instead of smb_wmb() in
>> gic_raise_softirq()")).
>>
>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>> ---
>>  lib/arm/gic-v3.c | 3 +++
>>  arm/gic.c        | 2 ++
>>  2 files changed, 5 insertions(+)
>>
>> diff --git a/lib/arm/gic-v3.c b/lib/arm/gic-v3.c
>> index a7e2cb819746..a6afa42d5fbe 100644
>> --- a/lib/arm/gic-v3.c
>> +++ b/lib/arm/gic-v3.c
>> @@ -77,6 +77,9 @@ void gicv3_ipi_send_mask(int irq, const cpumask_t *dest)
>>  
>>  	assert(irq < 16);
>>  
>> +	/* Ensure stores are visible to other CPUs before sending the IPI */
> nit: stores to normal memory ...

Yes, you are completely right. Specifying that it affects only stores to normal
memory would match the comment in the Linux irqchip driver and also what the
architecture specifies for device memory (page B2-158):

"Data accesses to memory locations are coherent for all observers in the system,
and correspondingly are treated as being Outer Shareable.
A memory location with any Device memory attribute cannot be allocated into a cache".

Same thing below, will change.

Thanks,

Alex

>> +	wmb();
>> +
>>  	/*
>>  	 * For each cpu in the mask collect its peers, which are also in
>>  	 * the mask, in order to form target lists.
>> diff --git a/arm/gic.c b/arm/gic.c
>> index acb060585fae..512c83636a2e 100644
>> --- a/arm/gic.c
>> +++ b/arm/gic.c
>> @@ -275,6 +275,8 @@ static void gicv3_ipi_send_self(void)
>>  
>>  static void gicv3_ipi_send_broadcast(void)
>>  {
>> +	/* Ensure stores are visible to other CPUs before sending the IPI */
> same
>> +	wmb();
>>  	gicv3_write_sgi1r(1ULL << 40 | IPI_IRQ << 24);
>>  	isb();
>>  }
>>
> Reviewed-by: Eric Auger <eric.auger@redhat.com>
>
> Thanks
>
> Eric
>
diff mbox series

Patch

diff --git a/lib/arm/gic-v3.c b/lib/arm/gic-v3.c
index a7e2cb819746..a6afa42d5fbe 100644
--- a/lib/arm/gic-v3.c
+++ b/lib/arm/gic-v3.c
@@ -77,6 +77,9 @@  void gicv3_ipi_send_mask(int irq, const cpumask_t *dest)
 
 	assert(irq < 16);
 
+	/* Ensure stores are visible to other CPUs before sending the IPI */
+	wmb();
+
 	/*
 	 * For each cpu in the mask collect its peers, which are also in
 	 * the mask, in order to form target lists.
diff --git a/arm/gic.c b/arm/gic.c
index acb060585fae..512c83636a2e 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -275,6 +275,8 @@  static void gicv3_ipi_send_self(void)
 
 static void gicv3_ipi_send_broadcast(void)
 {
+	/* Ensure stores are visible to other CPUs before sending the IPI */
+	wmb();
 	gicv3_write_sgi1r(1ULL << 40 | IPI_IRQ << 24);
 	isb();
 }