Message ID | 1521829277-9398-8-git-send-email-okaya@codeaurora.org (mailing list archive) |
---|---|
State | New, archived |
On Fri, Mar 23, 2018 at 11:21 AM, Sinan Kaya <okaya@codeaurora.org> wrote:
> Code includes wmb() followed by writel() in multiple places. writel()
> already has a barrier on some architectures like arm64.
>
> This ends up CPU observing two barriers back to back before executing the
> register write.
>
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
>
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> ---
>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> index 815cb1a..9e684b1 100644
> --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> @@ -725,7 +725,12 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
>  		 * such as IA-64).
>  		 */
>  		wmb();
> -		writel(i, rx_ring->tail);
> +		writel_relaxed(i, rx_ring->tail);
> +
> +		/* We need this if more than one processor can write to our tail
> +		 * at a time, it synchronizes IO on IA64/Altix systems
> +		 */
> +		mmiowb();
>  	}

The mmiowb shouldn't be needed for Rx. Only one CPU will be running
NAPI for the queue and we will synchronize this with a full writel
anyway when we re-enable the interrupts.

>  }
>
> @@ -1232,7 +1237,12 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
>  			 * know there are new descriptors to fetch.
>  			 */
>  			wmb();
> -			writel(xdp_ring->next_to_use, xdp_ring->tail);
> +			writel_relaxed(xdp_ring->next_to_use, xdp_ring->tail);
> +
> +			/* We need this if more than one processor can write to our tail
> +			 * at a time, it synchronizes IO on IA64/Altix systems
> +			 */
> +			mmiowb();
>  		}
>
>  		u64_stats_update_begin(&rx_ring->syncp);
> @@ -4004,7 +4014,12 @@ static void ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
>  	tx_ring->next_to_use = i;
>
>  	/* notify HW of packet */
> -	writel(i, tx_ring->tail);
> +	writel_relaxed(i, tx_ring->tail);
> +
> +	/* We need this if more than one processor can write to our tail
> +	 * at a time, it synchronizes IO on IA64/Altix systems
> +	 */
> +	mmiowb();
>
>  	return;
>  dma_error:
> --
> 2.7.4
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
On 3/23/2018 2:25 PM, Alexander Duyck wrote:
>> +		/* We need this if more than one processor can write to our tail
>> +		 * at a time, it synchronizes IO on IA64/Altix systems
>> +		 */
>> +		mmiowb();
>>  	}
> The mmiowb shouldn't be needed for Rx. Only one CPU will be running
> NAPI for the queue and we will synchronize this with a full writel
> anyway when we re-enable the interrupts.
>

OK. I can fix this on the next version. I did a blanket search and replace for
my writel_relaxed() changes as I don't know the code well enough.

Please point me to the redundant ones.
On Fri, Mar 23, 2018 at 11:27 AM, Sinan Kaya <okaya@codeaurora.org> wrote:
> On 3/23/2018 2:25 PM, Alexander Duyck wrote:
>>> +		/* We need this if more than one processor can write to our tail
>>> +		 * at a time, it synchronizes IO on IA64/Altix systems
>>> +		 */
>>> +		mmiowb();
>>>  	}
>> The mmiowb shouldn't be needed for Rx. Only one CPU will be running
>> NAPI for the queue and we will synchronize this with a full writel
>> anyway when we re-enable the interrupts.
>>
>
> OK. I can fix this on the next version. I did a blanket search and replace for
> my writel_relaxed() changes as I don't know the code well enough.
>
> Please point me to the redundant ones.

So from what I can tell only this file and i40e needed any additional
mmiowb calls added. The rest are not needed.

- Alex
On 3/23/2018 2:31 PM, Alexander Duyck wrote:
>> Please point me to the redundant ones.
> So from what I can tell only this file and i40e needed any additional
> mmiowb calls added. The rest are not needed.

Thanks, I'll clean up between 2..6 and then make your suggested changes on
1 and 7.
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 815cb1a..9e684b1 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -725,7 +725,12 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
 		 * such as IA-64).
 		 */
 		wmb();
-		writel(i, rx_ring->tail);
+		writel_relaxed(i, rx_ring->tail);
+
+		/* We need this if more than one processor can write to our tail
+		 * at a time, it synchronizes IO on IA64/Altix systems
+		 */
+		mmiowb();
 	}
 }
 
@@ -1232,7 +1237,12 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 			 * know there are new descriptors to fetch.
 			 */
 			wmb();
-			writel(xdp_ring->next_to_use, xdp_ring->tail);
+			writel_relaxed(xdp_ring->next_to_use, xdp_ring->tail);
+
+			/* We need this if more than one processor can write to our tail
+			 * at a time, it synchronizes IO on IA64/Altix systems
+			 */
+			mmiowb();
 		}
 
 		u64_stats_update_begin(&rx_ring->syncp);
@@ -4004,7 +4014,12 @@ static void ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
 	tx_ring->next_to_use = i;
 
 	/* notify HW of packet */
-	writel(i, tx_ring->tail);
+	writel_relaxed(i, tx_ring->tail);
+
+	/* We need this if more than one processor can write to our tail
+	 * at a time, it synchronizes IO on IA64/Altix systems
+	 */
+	mmiowb();
 
 	return;
 dma_error:
Code includes wmb() followed by writel() in multiple places. writel()
already has a barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)
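
For readers who want to try the ordering idea outside the kernel, here is a rough userspace analogy of the "explicit barrier, then relaxed register write" pattern the commit message describes. It is only a sketch under stated assumptions: struct demo_ring, demo_ring_post() and demo_ring_peek() are hypothetical names, the atomic tail field merely stands in for the MMIO tail register, and a C11 release fence is not equivalent to the kernel's wmb() for device memory. mmiowb() is not modelled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u

/* Hypothetical stand-in for a descriptor ring; NOT the ixgbevf layout. */
struct demo_ring {
	uint64_t desc[RING_SIZE];	/* descriptors the "device" would fetch */
	_Atomic uint32_t tail;		/* stands in for the MMIO tail register */
};

/* Fill one descriptor, then publish the new tail index. */
static uint32_t demo_ring_post(struct demo_ring *ring, uint32_t idx,
			       uint64_t value)
{
	uint32_t next;

	assert(idx < RING_SIZE);
	ring->desc[idx] = value;

	/* Rough analogue of wmb(): the descriptor write above must be
	 * ordered before the tail update below.
	 */
	atomic_thread_fence(memory_order_release);

	/* Rough analogue of writel_relaxed(): the store itself carries no
	 * ordering, so it relies entirely on the fence above.
	 */
	next = (idx + 1u) % RING_SIZE;
	atomic_store_explicit(&ring->tail, next, memory_order_relaxed);

	return next;
}

/* Consumer side (the "device"): read a descriptor only once the tail
 * has moved past idx. Returns 1 if *out was filled, 0 otherwise.
 */
static int demo_ring_peek(struct demo_ring *ring, uint32_t idx, uint64_t *out)
{
	uint32_t tail = atomic_load_explicit(&ring->tail, memory_order_acquire);

	if (tail == idx)
		return 0;

	*out = ring->desc[idx];
	return 1;
}
```

In this analogy, replacing the relaxed store with a release store would correspond to keeping the plain writel(): the ordering would then be paid for twice, which is the redundancy the patch aims to remove on architectures such as arm64.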