Message ID | 1521514068-8856-4-git-send-email-okaya@codeaurora.org (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Jason Gunthorpe |
On Mon, Mar 19, 2018 at 10:47:45PM -0400, Sinan Kaya wrote:
> Code includes wmb() followed by writel(). writel() already has a barrier on
> some architectures like arm64.
>
> This ends up CPU observing two barriers back to back before executing the
> register write.
>
> Create a new wrapper function with relaxed write operator. Use the new
> wrapper when a write is following a wmb().
>
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
>
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
>  drivers/infiniband/hw/i40iw/i40iw_ctrl.c  |  6 ++++--
>  drivers/infiniband/hw/i40iw/i40iw_osdep.h |  1 +
>  drivers/infiniband/hw/i40iw/i40iw_uk.c    |  2 +-
>  drivers/infiniband/hw/i40iw/i40iw_utils.c | 11 +++++++++++
>  4 files changed, 17 insertions(+), 3 deletions(-)

This one looks fine

Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Mon, Mar 19, 2018 at 08:47:45PM -0600, Sinan Kaya wrote:
> Code includes wmb() followed by writel(). writel() already has a barrier on
> some architectures like arm64.
>
> This ends up CPU observing two barriers back to back before executing the
> register write.
>
> Create a new wrapper function with relaxed write operator. Use the new
> wrapper when a write is following a wmb().
>
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
>
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>

Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>

On Mon, Mar 19, 2018 at 10:47:45PM -0400, Sinan Kaya wrote:
> diff --git a/drivers/infiniband/hw/i40iw/i40iw_uk.c b/drivers/infiniband/hw/i40iw/i40iw_uk.c
> index 8afa5a6..7f0ebed 100644
> +++ b/drivers/infiniband/hw/i40iw/i40iw_uk.c
> @@ -723,7 +723,7 @@ static void i40iw_cq_request_notification(struct i40iw_cq_uk *cq,
>
>  	wmb(); /* make sure WQE is populated before valid bit is set */
>
> -	writel(cq->cq_id, cq->cqe_alloc_reg);
> +	writel_relaxed(cq->cq_id, cq->cqe_alloc_reg);
>  }

Ah, this one is probably not OK, i40iw_cq_request_notification is
called here:

	spin_lock_irqsave(&iwcq->lock, flags);
	ukcq->ops.iw_cq_request_notification(ukcq, cq_notify);
	spin_unlock_irqrestore(&iwcq->lock, flags);

So this needs to add mmiowb(); to keep the same semantics.

Generally I think you need to be very careful to ensure that any
conversion to _relaxed isn't contained by a spinlock, or the mmiowb()
is present.

Maybe even do a first series with this obviously correct pattern:

 wmb();
 writel() -> writel_relaxed()
 writel() -> writel_relaxed()
 [..]
 mmiowb();

Jason

On 3/21/2018 3:02 PM, Jason Gunthorpe wrote:
> On Mon, Mar 19, 2018 at 10:47:45PM -0400, Sinan Kaya wrote:
>> diff --git a/drivers/infiniband/hw/i40iw/i40iw_uk.c b/drivers/infiniband/hw/i40iw/i40iw_uk.c
>> index 8afa5a6..7f0ebed 100644
>> +++ b/drivers/infiniband/hw/i40iw/i40iw_uk.c
>> @@ -723,7 +723,7 @@ static void i40iw_cq_request_notification(struct i40iw_cq_uk *cq,
>>
>>  	wmb(); /* make sure WQE is populated before valid bit is set */
>>
>> -	writel(cq->cq_id, cq->cqe_alloc_reg);
>> +	writel_relaxed(cq->cq_id, cq->cqe_alloc_reg);
>>  }
>
> Ah, this one is probably not OK, i40iw_cq_request_notification is
> called here:
>
> 	spin_lock_irqsave(&iwcq->lock, flags);
> 	ukcq->ops.iw_cq_request_notification(ukcq, cq_notify);
> 	spin_unlock_irqrestore(&iwcq->lock, flags);
>
> So this needs to add mmiowb(); to keep the same semantics.
>
> Generally I think you need to be very careful to ensure that any
> conversion to _relaxed isn't contained by a spinlock, or the mmiowb()
> is present.
>
> Maybe even do a first series with this obviously correct pattern:
>
>  wmb();
>  writel() -> writel_relaxed()
>  writel() -> writel_relaxed()
>  [..]
>  mmiowb();

Good catch. I changed it as follows:

+++ b/drivers/infiniband/hw/i40iw/i40iw_uk.c
@@ -723,7 +723,8 @@ static void i40iw_cq_request_notification(struct i40iw_cq_uk *cq,

 	wmb(); /* make sure WQE is populated before valid bit is set */

-	writel(cq->cq_id, cq->cqe_alloc_reg);
+	writel_relaxed(cq->cq_id, cq->cqe_alloc_reg);
+	mmiowb();
 }

>
> Jason
>

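The pattern agreed on in this exchange (one wmb(), then relaxed writes, then mmiowb() before the lock is dropped) can be illustrated with a small userspace sketch. Everything here is a hypothetical stand-in: the mock_* helpers, fake_doorbell, and ring_doorbells_locked() are not kernel or i40iw code, and they only record the order in which the operations run rather than performing any real barrier.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TRACE_MAX 16

/* Hypothetical userspace mock: each helper only records that it ran. */
static char trace[TRACE_MAX];
static size_t trace_len;
static uint32_t fake_doorbell[2];	/* stands in for two device registers */

static void record(char op)
{
	assert(trace_len < TRACE_MAX - 1);
	trace[trace_len++] = op;
	trace[trace_len] = '\0';
}

static void mock_spin_lock(void)   { record('L'); }
static void mock_spin_unlock(void) { record('U'); }
static void mock_wmb(void)         { record('B'); }
static void mock_mmiowb(void)      { record('M'); }

/* Relaxed write: the store only, with no barrier of its own. */
static void mock_writel_relaxed(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
	record('W');
}

/*
 * One barrier up front, several relaxed writes, and mmiowb() before
 * the unlock. Returns the recorded operation order.
 */
const char *ring_doorbells_locked(uint32_t a, uint32_t b)
{
	trace_len = 0;
	trace[0] = '\0';

	mock_spin_lock();
	mock_wmb();
	mock_writel_relaxed(a, &fake_doorbell[0]);
	mock_writel_relaxed(b, &fake_doorbell[1]);
	mock_mmiowb();
	mock_spin_unlock();

	return trace;
}
```

In the recorded trace, 'M' (mmiowb) appears after the last 'W' and before 'U' (unlock), which is the placement the review asks for when the relaxed write happens inside a spinlock.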
diff --git a/drivers/infiniband/hw/i40iw/i40iw_ctrl.c b/drivers/infiniband/hw/i40iw/i40iw_ctrl.c
index c74fd33..47f473e 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_ctrl.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_ctrl.c
@@ -706,9 +706,11 @@ static void i40iw_sc_ccq_arm(struct i40iw_sc_cq *ccq)
 	wmb(); /* make sure shadow area is updated before arming */

 	if (ccq->dev->is_pf)
-		i40iw_wr32(ccq->dev->hw, I40E_PFPE_CQARM, ccq->cq_uk.cq_id);
+		i40iw_wr32_relaxed(ccq->dev->hw, I40E_PFPE_CQARM,
+				   ccq->cq_uk.cq_id);
 	else
-		i40iw_wr32(ccq->dev->hw, I40E_VFPE_CQARM1, ccq->cq_uk.cq_id);
+		i40iw_wr32_relaxed(ccq->dev->hw, I40E_VFPE_CQARM1,
+				   ccq->cq_uk.cq_id);
 }

 /**
diff --git a/drivers/infiniband/hw/i40iw/i40iw_osdep.h b/drivers/infiniband/hw/i40iw/i40iw_osdep.h
index f27be3e..e06f4b9 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_osdep.h
+++ b/drivers/infiniband/hw/i40iw/i40iw_osdep.h
@@ -213,5 +213,6 @@ void i40iw_hw_stats_start_timer(struct i40iw_sc_vsi *vsi);
 void i40iw_hw_stats_stop_timer(struct i40iw_sc_vsi *vsi);
 #define i40iw_mmiowb() mmiowb()
 void i40iw_wr32(struct i40iw_hw *hw, u32 reg, u32 value);
+void i40iw_wr32_relaxed(struct i40iw_hw *hw, u32 reg, u32 value);
 u32 i40iw_rd32(struct i40iw_hw *hw, u32 reg);
 #endif /* _I40IW_OSDEP_H_ */
diff --git a/drivers/infiniband/hw/i40iw/i40iw_uk.c b/drivers/infiniband/hw/i40iw/i40iw_uk.c
index 8afa5a6..7f0ebed 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_uk.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_uk.c
@@ -723,7 +723,7 @@ static void i40iw_cq_request_notification(struct i40iw_cq_uk *cq,

 	wmb(); /* make sure WQE is populated before valid bit is set */

-	writel(cq->cq_id, cq->cqe_alloc_reg);
+	writel_relaxed(cq->cq_id, cq->cqe_alloc_reg);
 }

 /**
diff --git a/drivers/infiniband/hw/i40iw/i40iw_utils.c b/drivers/infiniband/hw/i40iw/i40iw_utils.c
index ddc1056..99aa6f8 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_utils.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_utils.c
@@ -125,6 +125,17 @@ inline void i40iw_wr32(struct i40iw_hw *hw, u32 reg, u32 value)
 }

 /**
+ * i40iw_wr32_relaxed - write 32 bits to hw register without ordering
+ * @hw: hardware information including registers
+ * @reg: register offset
+ * @value: value to write to register
+ */
+inline void i40iw_wr32_relaxed(struct i40iw_hw *hw, u32 reg, u32 value)
+{
+	writel_relaxed(value, hw->hw_addr + reg);
+}
+
+/**
  * i40iw_rd32 - read a 32 bit hw register
  * @hw: hardware information including registers
  * @reg: register offset

Code includes wmb() followed by writel(). writel() already has a barrier on
some architectures like arm64.

As a result, the CPU executes two barriers back to back before the
register write.

Create a new wrapper function that uses the relaxed write operator, and use
it wherever a write follows a wmb().

Since the code already has an explicit barrier call, change writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/infiniband/hw/i40iw/i40iw_ctrl.c  |  6 ++++--
 drivers/infiniband/hw/i40iw/i40iw_osdep.h |  1 +
 drivers/infiniband/hw/i40iw/i40iw_uk.c    |  2 +-
 drivers/infiniband/hw/i40iw/i40iw_utils.c | 11 +++++++++++
 4 files changed, 17 insertions(+), 3 deletions(-)
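
To make the "two barriers back to back" point concrete, here is a minimal userspace sketch that counts barriers. It assumes an arm64-like writel() that issues a write barrier before the store, as the commit message describes. All mock_* helpers, fake_reg, and the two barriers_* functions are hypothetical illustrations, not the kernel implementations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; the real helpers live in the kernel. */
static unsigned int barrier_count;	/* number of write barriers issued */
static uint32_t fake_reg;		/* stands in for a device register */

static void mock_wmb(void)
{
	barrier_count++;
}

/* Relaxed write: the store only, no barrier. */
static void mock_writel_relaxed(uint32_t val, volatile uint32_t *addr)
{
	assert(addr != NULL);
	*addr = val;
}

/* Assumed arm64-like writel(): write barrier, then the relaxed store. */
static void mock_writel(uint32_t val, volatile uint32_t *addr)
{
	mock_wmb();
	mock_writel_relaxed(val, addr);
}

/* Pattern before the patch: explicit wmb() followed by writel(). */
unsigned int barriers_before_patch(uint32_t val)
{
	barrier_count = 0;
	mock_wmb();
	mock_writel(val, &fake_reg);
	return barrier_count;
}

/* Pattern after the patch: explicit wmb() followed by writel_relaxed(). */
unsigned int barriers_after_patch(uint32_t val)
{
	barrier_count = 0;
	mock_wmb();
	mock_writel_relaxed(val, &fake_reg);
	return barrier_count;
}
```

Under that assumption the first function issues two barriers for one register write and the second issues one. On an architecture whose writel() has no built-in barrier, both would issue one.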