diff mbox series

[v2,03/10] xen/arm: smmuv3: Ensure queue is read after updating prod pointer

Message ID 59c24309e2d8494edf414904fe9725b4e7387098.1662123432.git.rahul.singh@arm.com (mailing list archive)
State Superseded
Headers show
Series xen/arm: smmuv3: Merge Linux fixes to Xen | expand

Commit Message

Rahul Singh Sept. 2, 2022, 1:34 p.m. UTC
From: Zhou Wang <wangzhou1@hisilicon.com>

Backport Linux commit a76a37777f2c. This is the clean backport without
any changes.

Reading the 'prod' MMIO register in order to determine whether or
not there is valid data beyond 'cons' for a given queue does not
provide sufficient dependency ordering, as the resulting access is
address dependent only on 'cons' and can therefore be speculated
ahead of time, potentially allowing stale data to be read by the
CPU.

Use readl() instead of readl_relaxed() when updating the shadow copy
of the 'prod' pointer, so that all speculated memory reads from the
corresponding queue can occur only from valid slots.

Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzhou1@hisilicon.com
[will: Use readl() instead of explicit barrier. Update 'cons' side to match.]
Signed-off-by: Will Deacon <will@kernel.org>
Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a76a37777f2c
Signed-off-by: Rahul Singh <rahul.singh@arm.com>
---
Changes in v2:
 - fix commit msg
 - add _iomb changes also from the origin patch
---
 xen/arch/arm/include/asm/system.h     |  1 +
 xen/drivers/passthrough/arm/smmu-v3.c | 11 +++++++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

Comments

Stefano Stabellini Sept. 2, 2022, 11:21 p.m. UTC | #1
On Fri, 2 Sep 2022, Rahul Singh wrote:
> From: Zhou Wang <wangzhou1@hisilicon.com>
> 
> Backport Linux commit a76a37777f2c. This is the clean backport without
> any changes.
> 
> Reading the 'prod' MMIO register in order to determine whether or
> not there is valid data beyond 'cons' for a given queue does not
> provide sufficient dependency ordering, as the resulting access is
> address dependent only on 'cons' and can therefore be speculated
> ahead of time, potentially allowing stale data to be read by the
> CPU.
> 
> Use readl() instead of readl_relaxed() when updating the shadow copy
> of the 'prod' pointer, so that all speculated memory reads from the
> corresponding queue can occur only from valid slots.
> 
> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
> Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzhou1@hisilicon.com
> [will: Use readl() instead of explicit barrier. Update 'cons' side to match.]
> Signed-off-by: Will Deacon <will@kernel.org>
> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a76a37777f2c
> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
> ---
> Changes in v2:
>  - fix commit msg
>  - add _iomb changes also from the origin patch
> ---
>  xen/arch/arm/include/asm/system.h     |  1 +
>  xen/drivers/passthrough/arm/smmu-v3.c | 11 +++++++++--
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/include/asm/system.h b/xen/arch/arm/include/asm/system.h
> index 65d5c8e423..fe27cf8c5e 100644
> --- a/xen/arch/arm/include/asm/system.h
> +++ b/xen/arch/arm/include/asm/system.h
> @@ -29,6 +29,7 @@
>  #endif
>  
>  #define smp_wmb()       dmb(ishst)
> +#define __iomb()        dmb(osh)

We don't have any other #define starting with __ in system.h.
I wonder if we should call this macro differently or simply iomb().


>  #define smp_mb__before_atomic()    smp_mb()
>  #define smp_mb__after_atomic()     smp_mb()
> diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c
> index 64d39bb4d3..cee13d1fc7 100644
> --- a/xen/drivers/passthrough/arm/smmu-v3.c
> +++ b/xen/drivers/passthrough/arm/smmu-v3.c
> @@ -951,7 +951,7 @@ static void queue_sync_cons_out(struct arm_smmu_queue *q)
>  	 * Ensure that all CPU accesses (reads and writes) to the queue
>  	 * are complete before we update the cons pointer.
>  	 */
> -	mb();
> +	__iomb();
>  	writel_relaxed(q->llq.cons, q->cons_reg);
>  }
>  
> @@ -963,8 +963,15 @@ static void queue_inc_cons(struct arm_smmu_ll_queue *q)
>  
>  static int queue_sync_prod_in(struct arm_smmu_queue *q)
>  {
> +	u32 prod;
>  	int ret = 0;
> -	u32 prod = readl_relaxed(q->prod_reg);
> +
> +	/*
> +	 * We can't use the _relaxed() variant here, as we must prevent
> +	 * speculative reads of the queue before we have determined that
> +	 * prod has indeed moved.
> +	 */
> +	prod = readl(q->prod_reg);
>  
>  	if (Q_OVF(prod) != Q_OVF(q->llq.prod))
>  		ret = -EOVERFLOW;
> -- 
> 2.25.1
>
Rahul Singh Sept. 5, 2022, 9:18 a.m. UTC | #2
Hi Stefano,

> On 3 Sep 2022, at 12:21 am, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Fri, 2 Sep 2022, Rahul Singh wrote:
>> From: Zhou Wang <wangzhou1@hisilicon.com>
>> 
>> Backport Linux commit a76a37777f2c. This is the clean backport without
>> any changes.
>> 
>> Reading the 'prod' MMIO register in order to determine whether or
>> not there is valid data beyond 'cons' for a given queue does not
>> provide sufficient dependency ordering, as the resulting access is
>> address dependent only on 'cons' and can therefore be speculated
>> ahead of time, potentially allowing stale data to be read by the
>> CPU.
>> 
>> Use readl() instead of readl_relaxed() when updating the shadow copy
>> of the 'prod' pointer, so that all speculated memory reads from the
>> corresponding queue can occur only from valid slots.
>> 
>> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
>> Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzhou1@hisilicon.com
>> [will: Use readl() instead of explicit barrier. Update 'cons' side to match.]
>> Signed-off-by: Will Deacon <will@kernel.org>
>> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a76a37777f2c
>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
>> ---
>> Changes in v2:
>> - fix commit msg
>> - add _iomb changes also from the origin patch
>> ---
>> xen/arch/arm/include/asm/system.h     |  1 +
>> xen/drivers/passthrough/arm/smmu-v3.c | 11 +++++++++--
>> 2 files changed, 10 insertions(+), 2 deletions(-)
>> 
>> diff --git a/xen/arch/arm/include/asm/system.h b/xen/arch/arm/include/asm/system.h
>> index 65d5c8e423..fe27cf8c5e 100644
>> --- a/xen/arch/arm/include/asm/system.h
>> +++ b/xen/arch/arm/include/asm/system.h
>> @@ -29,6 +29,7 @@
>> #endif
>> 
>> #define smp_wmb()       dmb(ishst)
>> +#define __iomb()        dmb(osh)
> 
> We don't have any other #define starting with __ in system.h.
> I wonder if we should call this macro differently or simply iomb().
 
I think either iomb() or dma_mb() will be the right name.
Please let me know your view on this.

Regards,
Rahul
Julien Grall Sept. 5, 2022, 9:31 a.m. UTC | #3
On 05/09/2022 10:18, Rahul Singh wrote:
>> On 3 Sep 2022, at 12:21 am, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>
>> On Fri, 2 Sep 2022, Rahul Singh wrote:
>>> From: Zhou Wang <wangzhou1@hisilicon.com>
>>>
>>> Backport Linux commit a76a37777f2c. This is the clean backport without
>>> any changes.
>>>
>>> Reading the 'prod' MMIO register in order to determine whether or
>>> not there is valid data beyond 'cons' for a given queue does not
>>> provide sufficient dependency ordering, as the resulting access is
>>> address dependent only on 'cons' and can therefore be speculated
>>> ahead of time, potentially allowing stale data to be read by the
>>> CPU.
>>>
>>> Use readl() instead of readl_relaxed() when updating the shadow copy
>>> of the 'prod' pointer, so that all speculated memory reads from the
>>> corresponding queue can occur only from valid slots.
>>>
>>> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
>>> Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzhou1@hisilicon.com
>>> [will: Use readl() instead of explicit barrier. Update 'cons' side to match.]
>>> Signed-off-by: Will Deacon <will@kernel.org>
>>> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a76a37777f2c
>>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
>>> ---
>>> Changes in v2:
>>> - fix commit msg
>>> - add _iomb changes also from the origin patch
>>> ---
>>> xen/arch/arm/include/asm/system.h     |  1 +
>>> xen/drivers/passthrough/arm/smmu-v3.c | 11 +++++++++--
>>> 2 files changed, 10 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/include/asm/system.h b/xen/arch/arm/include/asm/system.h
>>> index 65d5c8e423..fe27cf8c5e 100644
>>> --- a/xen/arch/arm/include/asm/system.h
>>> +++ b/xen/arch/arm/include/asm/system.h
>>> @@ -29,6 +29,7 @@
>>> #endif
>>>
>>> #define smp_wmb()       dmb(ishst)
>>> +#define __iomb()        dmb(osh)
>>
>> We don't have any other #define starting with __ in system.h.
>> I wonder if we should call this macro differently or simply iomb().
>   
> I think either iomb() or dma_mb() will be the right name.
> Please let me know your view on this.

It is not 100% clear why Linux went with __iomb() rather than iomb(). 
But I would prefer to keep the __* version to match Linux.

If the others really want to drop the __. Then I think it should be name 
iomb(). The rationale is while __iomb() is an alias to dma_mb(), the 
__iormb() behaves differently compare to dma_mb() (I haven't into 
details why).

So if it was a read barrier, we would likely want to use the iormb() 
semantic. This will keep the terminology consistent with Linux (even if 
we remove the __).

Cheers,
Bertrand Marquis Sept. 5, 2022, 10:23 a.m. UTC | #4
Hi,

> On 5 Sep 2022, at 10:31, Julien Grall <julien@xen.org> wrote:
> 
> 
> 
> On 05/09/2022 10:18, Rahul Singh wrote:
>>> On 3 Sep 2022, at 12:21 am, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>> 
>>> On Fri, 2 Sep 2022, Rahul Singh wrote:
>>>> From: Zhou Wang <wangzhou1@hisilicon.com>
>>>> 
>>>> Backport Linux commit a76a37777f2c. This is the clean backport without
>>>> any changes.
>>>> 
>>>> Reading the 'prod' MMIO register in order to determine whether or
>>>> not there is valid data beyond 'cons' for a given queue does not
>>>> provide sufficient dependency ordering, as the resulting access is
>>>> address dependent only on 'cons' and can therefore be speculated
>>>> ahead of time, potentially allowing stale data to be read by the
>>>> CPU.
>>>> 
>>>> Use readl() instead of readl_relaxed() when updating the shadow copy
>>>> of the 'prod' pointer, so that all speculated memory reads from the
>>>> corresponding queue can occur only from valid slots.
>>>> 
>>>> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
>>>> Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzhou1@hisilicon.com
>>>> [will: Use readl() instead of explicit barrier. Update 'cons' side to match.]
>>>> Signed-off-by: Will Deacon <will@kernel.org>
>>>> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a76a37777f2c
>>>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
>>>> ---
>>>> Changes in v2:
>>>> - fix commit msg
>>>> - add _iomb changes also from the origin patch
>>>> ---
>>>> xen/arch/arm/include/asm/system.h     |  1 +
>>>> xen/drivers/passthrough/arm/smmu-v3.c | 11 +++++++++--
>>>> 2 files changed, 10 insertions(+), 2 deletions(-)
>>>> 
>>>> diff --git a/xen/arch/arm/include/asm/system.h b/xen/arch/arm/include/asm/system.h
>>>> index 65d5c8e423..fe27cf8c5e 100644
>>>> --- a/xen/arch/arm/include/asm/system.h
>>>> +++ b/xen/arch/arm/include/asm/system.h
>>>> @@ -29,6 +29,7 @@
>>>> #endif
>>>> 
>>>> #define smp_wmb()       dmb(ishst)
>>>> +#define __iomb()        dmb(osh)
>>> 
>>> We don't have any other #define starting with __ in system.h.
>>> I wonder if we should call this macro differently or simply iomb().
>>  I think either iomb() or dma_mb() will be the right name.
>> Please let me know your view on this.
> 
> It is not 100% clear why Linux went with __iomb() rather than iomb(). But I would prefer to keep the __* version to match Linux.
> 
> If the others really want to drop the __. Then I think it should be name iomb(). The rationale is while __iomb() is an alias to dma_mb(), the __iormb() behaves differently compare to dma_mb() (I haven't into details why).
> 
> So if it was a read barrier, we would likely want to use the iormb() semantic. This will keep the terminology consistent with Linux (even if we remove the __).

We need the __iomb as “linux compatibility” in fact so I would suggest for now to only introduce it at the beginning of smmu-v3.c with other linux compatibility stuff to prevent adding this to Xen overall.

Cheers
Bertrand


> 
> Cheers,
> 
> -- 
> Julien Grall
Julien Grall Sept. 5, 2022, 10:24 a.m. UTC | #5
Hi Bertrand,

On 05/09/2022 11:23, Bertrand Marquis wrote:
> Hi,
> 
>> On 5 Sep 2022, at 10:31, Julien Grall <julien@xen.org> wrote:
>>
>>
>>
>> On 05/09/2022 10:18, Rahul Singh wrote:
>>>> On 3 Sep 2022, at 12:21 am, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>
>>>> On Fri, 2 Sep 2022, Rahul Singh wrote:
>>>>> From: Zhou Wang <wangzhou1@hisilicon.com>
>>>>>
>>>>> Backport Linux commit a76a37777f2c. This is the clean backport without
>>>>> any changes.
>>>>>
>>>>> Reading the 'prod' MMIO register in order to determine whether or
>>>>> not there is valid data beyond 'cons' for a given queue does not
>>>>> provide sufficient dependency ordering, as the resulting access is
>>>>> address dependent only on 'cons' and can therefore be speculated
>>>>> ahead of time, potentially allowing stale data to be read by the
>>>>> CPU.
>>>>>
>>>>> Use readl() instead of readl_relaxed() when updating the shadow copy
>>>>> of the 'prod' pointer, so that all speculated memory reads from the
>>>>> corresponding queue can occur only from valid slots.
>>>>>
>>>>> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
>>>>> Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzhou1@hisilicon.com
>>>>> [will: Use readl() instead of explicit barrier. Update 'cons' side to match.]
>>>>> Signed-off-by: Will Deacon <will@kernel.org>
>>>>> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a76a37777f2c
>>>>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
>>>>> ---
>>>>> Changes in v2:
>>>>> - fix commit msg
>>>>> - add _iomb changes also from the origin patch
>>>>> ---
>>>>> xen/arch/arm/include/asm/system.h     |  1 +
>>>>> xen/drivers/passthrough/arm/smmu-v3.c | 11 +++++++++--
>>>>> 2 files changed, 10 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/xen/arch/arm/include/asm/system.h b/xen/arch/arm/include/asm/system.h
>>>>> index 65d5c8e423..fe27cf8c5e 100644
>>>>> --- a/xen/arch/arm/include/asm/system.h
>>>>> +++ b/xen/arch/arm/include/asm/system.h
>>>>> @@ -29,6 +29,7 @@
>>>>> #endif
>>>>>
>>>>> #define smp_wmb()       dmb(ishst)
>>>>> +#define __iomb()        dmb(osh)
>>>>
>>>> We don't have any other #define starting with __ in system.h.
>>>> I wonder if we should call this macro differently or simply iomb().
>>>   I think either iomb() or dma_mb() will be the right name.
>>> Please let me know your view on this.
>>
>> It is not 100% clear why Linux went with __iomb() rather than iomb(). But I would prefer to keep the __* version to match Linux.
>>
>> If the others really want to drop the __. Then I think it should be name iomb(). The rationale is while __iomb() is an alias to dma_mb(), the __iormb() behaves differently compare to dma_mb() (I haven't into details why).
>>
>> So if it was a read barrier, we would likely want to use the iormb() semantic. This will keep the terminology consistent with Linux (even if we remove the __).
> 
> We need the __iomb as “linux compatibility” in fact so I would suggest for now to only introduce it at the beginning of smmu-v3.c with other linux compatibility stuff to prevent adding this to Xen overall.

I would be fine with that.

Cheers,
Stefano Stabellini Sept. 5, 2022, 10:17 p.m. UTC | #6
On Mon, 5 Sep 2022, Julien Grall wrote:
> On 05/09/2022 11:23, Bertrand Marquis wrote:
> > > On 5 Sep 2022, at 10:31, Julien Grall <julien@xen.org> wrote:
> > > On 05/09/2022 10:18, Rahul Singh wrote:
> > > > > On 3 Sep 2022, at 12:21 am, Stefano Stabellini
> > > > > <sstabellini@kernel.org> wrote:
> > > > > 
> > > > > On Fri, 2 Sep 2022, Rahul Singh wrote:
> > > > > > From: Zhou Wang <wangzhou1@hisilicon.com>
> > > > > > 
> > > > > > Backport Linux commit a76a37777f2c. This is the clean backport
> > > > > > without
> > > > > > any changes.
> > > > > > 
> > > > > > Reading the 'prod' MMIO register in order to determine whether or
> > > > > > not there is valid data beyond 'cons' for a given queue does not
> > > > > > provide sufficient dependency ordering, as the resulting access is
> > > > > > address dependent only on 'cons' and can therefore be speculated
> > > > > > ahead of time, potentially allowing stale data to be read by the
> > > > > > CPU.
> > > > > > 
> > > > > > Use readl() instead of readl_relaxed() when updating the shadow copy
> > > > > > of the 'prod' pointer, so that all speculated memory reads from the
> > > > > > corresponding queue can occur only from valid slots.
> > > > > > 
> > > > > > Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
> > > > > > Link:
> > > > > > https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzhou1@hisilicon.com
> > > > > > [will: Use readl() instead of explicit barrier. Update 'cons' side
> > > > > > to match.]
> > > > > > Signed-off-by: Will Deacon <will@kernel.org>
> > > > > > Origin:
> > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > > > > > a76a37777f2c
> > > > > > Signed-off-by: Rahul Singh <rahul.singh@arm.com>
> > > > > > ---
> > > > > > Changes in v2:
> > > > > > - fix commit msg
> > > > > > - add _iomb changes also from the origin patch
> > > > > > ---
> > > > > > xen/arch/arm/include/asm/system.h     |  1 +
> > > > > > xen/drivers/passthrough/arm/smmu-v3.c | 11 +++++++++--
> > > > > > 2 files changed, 10 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/xen/arch/arm/include/asm/system.h
> > > > > > b/xen/arch/arm/include/asm/system.h
> > > > > > index 65d5c8e423..fe27cf8c5e 100644
> > > > > > --- a/xen/arch/arm/include/asm/system.h
> > > > > > +++ b/xen/arch/arm/include/asm/system.h
> > > > > > @@ -29,6 +29,7 @@
> > > > > > #endif
> > > > > > 
> > > > > > #define smp_wmb()       dmb(ishst)
> > > > > > +#define __iomb()        dmb(osh)
> > > > > 
> > > > > We don't have any other #define starting with __ in system.h.
> > > > > I wonder if we should call this macro differently or simply iomb().
> > > >   I think either iomb() or dma_mb() will be the right name.
> > > > Please let me know your view on this.
> > > 
> > > It is not 100% clear why Linux went with __iomb() rather than iomb(). But
> > > I would prefer to keep the __* version to match Linux.
> > > 
> > > If the others really want to drop the __. Then I think it should be name
> > > iomb(). The rationale is while __iomb() is an alias to dma_mb(), the
> > > __iormb() behaves differently compare to dma_mb() (I haven't into details
> > > why).
> > > 
> > > So if it was a read barrier, we would likely want to use the iormb()
> > > semantic. This will keep the terminology consistent with Linux (even if we
> > > remove the __).
> > 
> > We need the __iomb as “linux compatibility” in fact so I would suggest for
> > now to only introduce it at the beginning of smmu-v3.c with other linux
> > compatibility stuff to prevent adding this to Xen overall.
> 
> I would be fine with that.

+1
diff mbox series

Patch

diff --git a/xen/arch/arm/include/asm/system.h b/xen/arch/arm/include/asm/system.h
index 65d5c8e423..fe27cf8c5e 100644
--- a/xen/arch/arm/include/asm/system.h
+++ b/xen/arch/arm/include/asm/system.h
@@ -29,6 +29,7 @@ 
 #endif
 
 #define smp_wmb()       dmb(ishst)
+#define __iomb()        dmb(osh)
 
 #define smp_mb__before_atomic()    smp_mb()
 #define smp_mb__after_atomic()     smp_mb()
diff --git a/xen/drivers/passthrough/arm/smmu-v3.c b/xen/drivers/passthrough/arm/smmu-v3.c
index 64d39bb4d3..cee13d1fc7 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -951,7 +951,7 @@  static void queue_sync_cons_out(struct arm_smmu_queue *q)
 	 * Ensure that all CPU accesses (reads and writes) to the queue
 	 * are complete before we update the cons pointer.
 	 */
-	mb();
+	__iomb();
 	writel_relaxed(q->llq.cons, q->cons_reg);
 }
 
@@ -963,8 +963,15 @@  static void queue_inc_cons(struct arm_smmu_ll_queue *q)
 
 static int queue_sync_prod_in(struct arm_smmu_queue *q)
 {
+	u32 prod;
 	int ret = 0;
-	u32 prod = readl_relaxed(q->prod_reg);
+
+	/*
+	 * We can't use the _relaxed() variant here, as we must prevent
+	 * speculative reads of the queue before we have determined that
+	 * prod has indeed moved.
+	 */
+	prod = readl(q->prod_reg);
 
 	if (Q_OVF(prod) != Q_OVF(q->llq.prod))
 		ret = -EOVERFLOW;