[v4,1/6] RDMA/bnxt_re: Eliminate duplicate barriers on weakly-ordered archs

Message ID 1521514068-8856-2-git-send-email-okaya@codeaurora.org (mailing list archive)
State New, archived

Commit Message

Sinan Kaya March 20, 2018, 2:47 a.m. UTC
Code includes wmb() followed by writel(). writel() already has a barrier on
some architectures like arm64.

As a result, the CPU observes two barriers back to back before executing
the register write.

Since the code already has an explicit barrier call, change writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Jason Gunthorpe March 20, 2018, 2:48 p.m. UTC | #1
On Mon, Mar 19, 2018 at 10:47:43PM -0400, Sinan Kaya wrote:
> Code includes wmb() followed by writel(). writel() already has a barrier on
> some architectures like arm64.
> 
> This ends up CPU observing two barriers back to back before executing the
> register write.
> 
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
> 
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
>  drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
> index 8329ec6..4a6b981 100644
> +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
> @@ -181,10 +181,10 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, struct cmdq_base *req,
>  
>  	/* ring CMDQ DB */
>  	wmb();
> -	writel(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
> -	       rcfw->cmdq_bar_reg_prod_off);
> -	writel(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
> -	       rcfw->cmdq_bar_reg_trig_off);
> +	writel_relaxed(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
> +		       rcfw->cmdq_bar_reg_prod_off);
> +	writel_relaxed(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
> +		       rcfw->cmdq_bar_reg_trig_off);

Whoa, this may not be safe.

The definition of writel_relaxed() is that it is fully unordered, so
the above two writes may now change order. The Broadcom guys would have
to ack whether that is OK for their hardware.

In general this is not an OK approach for a mechanical
conversion. Only the first writel can be converted.

You need to check all your patches to make sure there are no
subsequent writel's in the places touched.

Jason
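
The "only the first writel can be converted" suggestion can be sketched as
below. This is a minimal userspace model, not the driver code: wmb(),
writel_relaxed() and writel() here are stand-ins that only record the order
in which they are called, and struct fake_cmdq_regs and FAKE_TRIG_VAL are
hypothetical names standing in for the CMDQ BAR registers and
RCFW_CMDQ_TRIG_VAL.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins that only record call order. They are NOT the
 * kernel's wmb(), writel_relaxed() or writel(). */
enum io_event { EV_WMB, EV_WRITEL_RELAXED, EV_WRITEL };

static enum io_event io_log[8];
static size_t io_log_len;

static void log_event(enum io_event ev)
{
	if (io_log_len < sizeof(io_log) / sizeof(io_log[0]))
		io_log[io_log_len++] = ev;
}

static void wmb(void)
{
	log_event(EV_WMB);
}

static void writel_relaxed(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
	log_event(EV_WRITEL_RELAXED);
}

static void writel(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
	log_event(EV_WRITEL);
}

/* Hypothetical register block standing in for the CMDQ BAR. */
struct fake_cmdq_regs {
	volatile uint32_t prod;
	volatile uint32_t trig;
};

#define FAKE_TRIG_VAL 1u	/* hypothetical trigger value */

/* Ring the doorbell: the explicit wmb() covers the first register write,
 * so only that write is relaxed. The second write stays a plain writel(). */
static void ring_cmdq_db(struct fake_cmdq_regs *regs, uint32_t cmdq_prod)
{
	wmb();
	writel_relaxed(cmdq_prod, &regs->prod);
	writel(FAKE_TRIG_VAL, &regs->trig);
}
```

Calling ring_cmdq_db() once should leave three entries in io_log: the
barrier, the relaxed write, then the ordinary write.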
Sinan Kaya March 20, 2018, 3 p.m. UTC | #2
On 3/20/2018 9:48 AM, Jason Gunthorpe wrote:
> On Mon, Mar 19, 2018 at 10:47:43PM -0400, Sinan Kaya wrote:
>> Code includes wmb() followed by writel(). writel() already has a barrier on
>> some architectures like arm64.
>>
>> This ends up CPU observing two barriers back to back before executing the
>> register write.
>>
>> Since code already has an explicit barrier call, changing writel() to
>> writel_relaxed().
>>
>> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
>>  drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
>> index 8329ec6..4a6b981 100644
>> +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
>> @@ -181,10 +181,10 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, struct cmdq_base *req,
>>  
>>  	/* ring CMDQ DB */
>>  	wmb();
>> -	writel(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
>> -	       rcfw->cmdq_bar_reg_prod_off);
>> -	writel(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
>> -	       rcfw->cmdq_bar_reg_trig_off);
>> +	writel_relaxed(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
>> +		       rcfw->cmdq_bar_reg_prod_off);
>> +	writel_relaxed(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
>> +		       rcfw->cmdq_bar_reg_trig_off);
> 
> Woah, this may not be safe..
> 
> The definition of writel_relaxed() is that it is fully unordered, so
> the above two writes may change order now. Broadcom guys would have to
> ack if that it is OK or not for their hardware.
> 
> In general this is not an OK approach for a mechanical
> conversion. Only the first writel can be converted.
> 
> You need to check all your patches to make sure there are no
> subsequent writel's in the places touched.

I paid special attention to this one and went to check the barriers
document. According to the document, writes (whether relaxed or not)
are always observed by the HW in order with respect to each other.
It just doesn't guarantee that anything issued before writel_relaxed()
has been observed. Since we already have a wmb(), this is taken care of.

If somebody wants things to be observed after the register write, there
should have been a wmb() or mmiowb() afterwards.


> 
> Jason
>
Sinan Kaya March 20, 2018, 3:08 p.m. UTC | #3
On 3/20/2018 10:00 AM, Sinan Kaya wrote:
> On 3/20/2018 9:48 AM, Jason Gunthorpe wrote:
>> On Mon, Mar 19, 2018 at 10:47:43PM -0400, Sinan Kaya wrote:
>>> Code includes wmb() followed by writel(). writel() already has a barrier on
>>> some architectures like arm64.
>>>
>>> This ends up CPU observing two barriers back to back before executing the
>>> register write.
>>>
>>> Since code already has an explicit barrier call, changing writel() to
>>> writel_relaxed().
>>>
>>> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
>>>  drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 8 ++++----
>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
>>> index 8329ec6..4a6b981 100644
>>> +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
>>> @@ -181,10 +181,10 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, struct cmdq_base *req,
>>>  
>>>  	/* ring CMDQ DB */
>>>  	wmb();
>>> -	writel(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
>>> -	       rcfw->cmdq_bar_reg_prod_off);
>>> -	writel(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
>>> -	       rcfw->cmdq_bar_reg_trig_off);
>>> +	writel_relaxed(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
>>> +		       rcfw->cmdq_bar_reg_prod_off);
>>> +	writel_relaxed(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
>>> +		       rcfw->cmdq_bar_reg_trig_off);
>>
>> Woah, this may not be safe..
>>
>> The definition of writel_relaxed() is that it is fully unordered, so
>> the above two writes may change order now. Broadcom guys would have to
>> ack if that it is OK or not for their hardware.
>>
>> In general this is not an OK approach for a mechanical
>> conversion. Only the first writel can be converted.
>>
>> You need to check all your patches to make sure there are no
>> subsequent writel's in the places touched.
> 
> I paid special attention to this one and went to check the barriers
> document. According to the document, writes (whether it is relaxed or not)
> are always observed by the HW in order with respect to each other.
> It just doesn't guarantee anything above writel_relaxed() to be observed.
> Since we already have a wmb(), this is taken care of. 
> 
> If somebody wants things to be observed after register write, there should
> have been a wmb() or mmiowb() afterwards.

Never mind, it will break some architectures. I'll only change the first one.

 (1) On some systems, I/O stores are not strongly ordered across all CPUs, and
     so for _all_ general drivers locks should be used and mmiowb() must be
     issued prior to unlocking the critical section.

 (2) If the accessor functions are used to refer to an I/O memory window with
     relaxed memory access properties, then _mandatory_ memory barriers are
     required to enforce ordering. 


> 
> 
>>
>> Jason
>>
> 
>
Jason Gunthorpe March 20, 2018, 3:20 p.m. UTC | #4
On Tue, Mar 20, 2018 at 10:00:49AM -0500, Sinan Kaya wrote:
> On 3/20/2018 9:48 AM, Jason Gunthorpe wrote:
> > On Mon, Mar 19, 2018 at 10:47:43PM -0400, Sinan Kaya wrote:
> >> Code includes wmb() followed by writel(). writel() already has a barrier on
> >> some architectures like arm64.
> >>
> >> This ends up CPU observing two barriers back to back before executing the
> >> register write.
> >>
> >> Since code already has an explicit barrier call, changing writel() to
> >> writel_relaxed().
> >>
> >> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> >>  drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 8 ++++----
> >>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
> >> index 8329ec6..4a6b981 100644
> >> +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
> >> @@ -181,10 +181,10 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, struct cmdq_base *req,
> >>  
> >>  	/* ring CMDQ DB */
> >>  	wmb();
> >> -	writel(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
> >> -	       rcfw->cmdq_bar_reg_prod_off);
> >> -	writel(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
> >> -	       rcfw->cmdq_bar_reg_trig_off);
> >> +	writel_relaxed(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
> >> +		       rcfw->cmdq_bar_reg_prod_off);
> >> +	writel_relaxed(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
> >> +		       rcfw->cmdq_bar_reg_trig_off);
> > 
> > Woah, this may not be safe..
> > 
> > The definition of writel_relaxed() is that it is fully unordered, so
> > the above two writes may change order now. Broadcom guys would have to
> > ack if that it is OK or not for their hardware.
> > 
> > In general this is not an OK approach for a mechanical
> > conversion. Only the first writel can be converted.
> > 
> > You need to check all your patches to make sure there are no
> > subsequent writel's in the places touched.
> 
> I paid special attention to this one and went to check the barriers
> document. According to the document, writes (whether it is relaxed or not)
> > are always observed by the HW in order with respect to each other.

Oh interesting, that document got revised to make writel_relaxed less
relaxed a few years ago, didn't know that. Thanks.

However, this is still not OK, the full code is:

        /* ring CMDQ DB */
        wmb();
        writel(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
               rcfw->cmdq_bar_reg_prod_off);
        writel(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
               rcfw->cmdq_bar_reg_trig_off);
done:
        spin_unlock_irqrestore(&cmdq->lock, flags);


And the definition of _relaxed allows the writes to order outside the
spinlock region, which is very likely to be wrong in this driver.

I'm not sure adding a mmiowb() just to use a writel_relaxed is any
sort of win though?

Jason
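
The alternative raised above, keeping both writes relaxed and adding an
mmiowb() before the unlock, can be sketched as follows. This is a userspace
model under stated assumptions: model_lock()/model_unlock() stand in for the
spin_lock_irqsave()/spin_unlock_irqrestore() pair, the barrier and accessor
functions only log that they were called, and struct fake_cmdq_regs and
FAKE_TRIG_VAL are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins that only record call order. They are NOT the
 * kernel's barrier, MMIO accessor or spinlock primitives. */
enum io_event { EV_LOCK, EV_WMB, EV_WRITEL_RELAXED, EV_MMIOWB, EV_UNLOCK };

static enum io_event io_log[16];
static size_t io_log_len;

static void log_event(enum io_event ev)
{
	if (io_log_len < sizeof(io_log) / sizeof(io_log[0]))
		io_log[io_log_len++] = ev;
}

static void model_lock(void)   { log_event(EV_LOCK); }
static void model_unlock(void) { log_event(EV_UNLOCK); }
static void wmb(void)          { log_event(EV_WMB); }
static void mmiowb(void)       { log_event(EV_MMIOWB); }

static void writel_relaxed(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
	log_event(EV_WRITEL_RELAXED);
}

/* Hypothetical register block standing in for the CMDQ BAR. */
struct fake_cmdq_regs {
	volatile uint32_t prod;
	volatile uint32_t trig;
};

#define FAKE_TRIG_VAL 1u	/* hypothetical trigger value */

/* Model of the tail of __send_message(): both doorbell writes are relaxed,
 * and mmiowb() is issued before the lock is dropped so the MMIO writes are
 * not reordered past the end of the critical section. */
static void send_message_model(struct fake_cmdq_regs *regs, uint32_t cmdq_prod)
{
	model_lock();
	wmb();
	writel_relaxed(cmdq_prod, &regs->prod);
	writel_relaxed(FAKE_TRIG_VAL, &regs->trig);
	mmiowb();
	model_unlock();
}
```

The point of the sketch is only the ordering: mmiowb() sits between the last
relaxed write and the unlock.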
Jason Gunthorpe March 20, 2018, 3:23 p.m. UTC | #5
On Tue, Mar 20, 2018 at 10:08:16AM -0500, Sinan Kaya wrote:

> Never mind, it will break some architectures. I'll only change the first one.
> 
>  (1) On some systems, I/O stores are not strongly ordered across all CPUs, and
>      so for _all_ general drivers locks should be used and mmiowb() must be
>      issued prior to unlocking the critical section.

I think the kernel could do well to have a spin_unlock_mmiowb()
function. We have this pattern quite a bit.

Arches like x86 can just make it == spin_unlock, while PPC and ARM can
add their extra barriers.

Then we can safely and efficiently use _relaxed within such a
spinlock region.

Jason
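
A spin_unlock_mmiowb() helper along the lines proposed above might look
roughly like this. It is a hypothetical sketch in userspace: no such
function is confirmed to exist in the kernel by this thread, struct
model_lock and model_spin_unlock() are stand-ins for a real spinlock, and
MODEL_ARCH_NEEDS_MMIOWB is an invented switch that models the
per-architecture choice.

```c
#include <stddef.h>

/* Userspace stand-ins that only record call order. They are NOT kernel
 * primitives. */
enum io_event { EV_MMIOWB, EV_UNLOCK };

static enum io_event io_log[4];
static size_t io_log_len;

static void log_event(enum io_event ev)
{
	if (io_log_len < sizeof(io_log) / sizeof(io_log[0]))
		io_log[io_log_len++] = ev;
}

/* Invented switch: 1 models an arch that needs the extra barrier,
 * 0 models an arch where the helper is just an unlock. */
#define MODEL_ARCH_NEEDS_MMIOWB 1

struct model_lock {
	int held;
};

static void model_spin_unlock(struct model_lock *lock)
{
	lock->held = 0;
	log_event(EV_UNLOCK);
}

static void mmiowb(void)
{
	log_event(EV_MMIOWB);
}

/* Hypothetical helper: order prior MMIO writes, then release the lock. */
static void spin_unlock_mmiowb(struct model_lock *lock)
{
#if MODEL_ARCH_NEEDS_MMIOWB
	mmiowb();
#endif
	model_spin_unlock(lock);
}
```

With the switch set to 1, one call logs the barrier followed by the unlock;
with it set to 0 the helper would reduce to the plain unlock.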
Sinan Kaya March 20, 2018, 3:30 p.m. UTC | #6
On 3/20/2018 10:20 AM, Jason Gunthorpe wrote:
> On Tue, Mar 20, 2018 at 10:00:49AM -0500, Sinan Kaya wrote:
>> On 3/20/2018 9:48 AM, Jason Gunthorpe wrote:
>>> On Mon, Mar 19, 2018 at 10:47:43PM -0400, Sinan Kaya wrote:
>>>> Code includes wmb() followed by writel(). writel() already has a barrier on
>>>> some architectures like arm64.
>>>>
>>>> This ends up CPU observing two barriers back to back before executing the
>>>> register write.
>>>>
>>>> Since code already has an explicit barrier call, changing writel() to
>>>> writel_relaxed().
>>>>
>>>> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
>>>>  drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 8 ++++----
>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
>>>> index 8329ec6..4a6b981 100644
>>>> +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
>>>> @@ -181,10 +181,10 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, struct cmdq_base *req,
>>>>  
>>>>  	/* ring CMDQ DB */
>>>>  	wmb();
>>>> -	writel(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
>>>> -	       rcfw->cmdq_bar_reg_prod_off);
>>>> -	writel(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
>>>> -	       rcfw->cmdq_bar_reg_trig_off);
>>>> +	writel_relaxed(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
>>>> +		       rcfw->cmdq_bar_reg_prod_off);
>>>> +	writel_relaxed(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
>>>> +		       rcfw->cmdq_bar_reg_trig_off);
>>>
>>> Woah, this may not be safe..
>>>
>>> The definition of writel_relaxed() is that it is fully unordered, so
>>> the above two writes may change order now. Broadcom guys would have to
>>> ack if that it is OK or not for their hardware.
>>>
>>> In general this is not an OK approach for a mechanical
>>> conversion. Only the first writel can be converted.
>>>
>>> You need to check all your patches to make sure there are no
>>> subsequent writel's in the places touched.
>>
>> I paid special attention to this one and went to check the barriers
>> document. According to the document, writes (whether it is relaxed or not)
>> are always observed by the HW in order with respect to each other.
> 
> Oh interesting, that document got revised to make writel_relaxed less
> relaxed a few years ago, didn't know that. Thanks.
> 
> However, this is still not OK, the full code is:
> 
>         /* ring CMDQ DB */
>         wmb();
>         writel(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
>                rcfw->cmdq_bar_reg_prod_off);
>         writel(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
>                rcfw->cmdq_bar_reg_trig_off);
> done:
>         spin_unlock_irqrestore(&cmdq->lock, flags);
> 
> 
> And the definition of _relaxed allows the writes to order outside the
> spinlock region, which is very likely to be wrong in this driver.
> 
> I'm not sure adding a mmiowb() just to use a writel_relaxed is any
> sort of win though?

I'd prefer this.

mmiowb() on ARM64 is empty, and mmiowb() guarantees that the code also
works on PPC.

I'll switch to this instead so it works for everybody.

> 
> Jason
>
Jason Gunthorpe March 20, 2018, 4:02 p.m. UTC | #7
On Tue, Mar 20, 2018 at 10:30:34AM -0500, Sinan Kaya wrote:
> On 3/20/2018 10:20 AM, Jason Gunthorpe wrote:
> > On Tue, Mar 20, 2018 at 10:00:49AM -0500, Sinan Kaya wrote:
> >> On 3/20/2018 9:48 AM, Jason Gunthorpe wrote:
> >>> On Mon, Mar 19, 2018 at 10:47:43PM -0400, Sinan Kaya wrote:
> >>>> Code includes wmb() followed by writel(). writel() already has a barrier on
> >>>> some architectures like arm64.
> >>>>
> >>>> This ends up CPU observing two barriers back to back before executing the
> >>>> register write.
> >>>>
> >>>> Since code already has an explicit barrier call, changing writel() to
> >>>> writel_relaxed().
> >>>>
> >>>> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> >>>>  drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 8 ++++----
> >>>>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>>>
> >>>> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
> >>>> index 8329ec6..4a6b981 100644
> >>>> +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
> >>>> @@ -181,10 +181,10 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, struct cmdq_base *req,
> >>>>  
> >>>>  	/* ring CMDQ DB */
> >>>>  	wmb();
> >>>> -	writel(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
> >>>> -	       rcfw->cmdq_bar_reg_prod_off);
> >>>> -	writel(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
> >>>> -	       rcfw->cmdq_bar_reg_trig_off);
> >>>> +	writel_relaxed(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
> >>>> +		       rcfw->cmdq_bar_reg_prod_off);
> >>>> +	writel_relaxed(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
> >>>> +		       rcfw->cmdq_bar_reg_trig_off);
> >>>
> >>> Woah, this may not be safe..
> >>>
> >>> The definition of writel_relaxed() is that it is fully unordered, so
> >>> the above two writes may change order now. Broadcom guys would have to
> >>> ack if that it is OK or not for their hardware.
> >>>
> >>> In general this is not an OK approach for a mechanical
> >>> conversion. Only the first writel can be converted.
> >>>
> >>> You need to check all your patches to make sure there are no
> >>> subsequent writel's in the places touched.
> >>
> >> I paid special attention to this one and went to check the barriers
> >> document. According to the document, writes (whether it is relaxed or not)
> >> are always observed by the HW in order with respect to each other.
> > 
> > Oh interesting, that document got revised to make writel_relaxed less
> > relaxed a few years ago, didn't know that. Thanks.
> > 
> > However, this is still not OK, the full code is:
> > 
> >         /* ring CMDQ DB */
> >         wmb();
> >         writel(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
> >                rcfw->cmdq_bar_reg_prod_off);
> >         writel(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
> >                rcfw->cmdq_bar_reg_trig_off);
> > done:
> >         spin_unlock_irqrestore(&cmdq->lock, flags);
> > 
> > 
> > And the definition of _relaxed allows the writes to order outside the
> > spinlock region, which is very likely to be wrong in this driver.
> > 
> > I'm not sure adding a mmiowb() just to use a writel_relaxed is any
> > sort of win though?
> 
> I'd prefer this. 
> 
> mmiowb() on ARM64 is empty. mmiowb() guarantees that code also works for PPC too.
> 
> I'll switch to this instead so it works for everybody.

It looks like a compiler barrier on x86 so that seems fine too.

Jason

Patch

diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
index 8329ec6..4a6b981 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
@@ -181,10 +181,10 @@  static int __send_message(struct bnxt_qplib_rcfw *rcfw, struct cmdq_base *req,
 
 	/* ring CMDQ DB */
 	wmb();
-	writel(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
-	       rcfw->cmdq_bar_reg_prod_off);
-	writel(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
-	       rcfw->cmdq_bar_reg_trig_off);
+	writel_relaxed(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
+		       rcfw->cmdq_bar_reg_prod_off);
+	writel_relaxed(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
+		       rcfw->cmdq_bar_reg_trig_off);
 done:
 	spin_unlock_irqrestore(&cmdq->lock, flags);
 	/* Return the CREQ response pointer */