[v6,7/7] ixgbevf: eliminate duplicate barriers on weakly-ordered archs

Message ID 1521829277-9398-8-git-send-email-okaya@codeaurora.org (mailing list archive)
State New, archived

Commit Message

Sinan Kaya March 23, 2018, 6:21 p.m. UTC
The code calls wmb() immediately before writel() in multiple places. On
some architectures, such as arm64, writel() already includes a barrier.

As a result, the CPU executes two barriers back to back before performing
the register write.

Since the code already has an explicit barrier call, change writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)
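
For readers who want to try the ordering idea outside the kernel, here is a minimal userspace sketch of the "one explicit barrier, then a relaxed write" pattern described above. It is only an analogy and not the driver code: C11 atomics stand in for wmb() and writel_relaxed(), and struct demo_ring, demo_publish() and demo_tail() are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8u

/* Hypothetical stand-in for a descriptor ring shared with a consumer. */
struct demo_ring {
	uint32_t desc[DEMO_RING_SIZE];	/* "descriptors" in normal memory */
	_Atomic uint32_t tail;		/* stand-in for the MMIO tail register */
};

/* Userspace analogue of: fill descriptor; wmb(); writel_relaxed(i, tail); */
static void demo_publish(struct demo_ring *r, uint32_t idx, uint32_t val)
{
	assert(idx < DEMO_RING_SIZE);
	r->desc[idx] = val;

	/* The single explicit barrier: orders the descriptor store
	 * before the tail store that follows.
	 */
	atomic_thread_fence(memory_order_release);

	/* Relaxed store: it carries no second, implied barrier. */
	atomic_store_explicit(&r->tail, idx + 1, memory_order_relaxed);
}

/* Consumer side: read the tail with acquire ordering. */
static uint32_t demo_tail(struct demo_ring *r)
{
	return atomic_load_explicit(&r->tail, memory_order_acquire);
}
```

Calling demo_publish(&r, 0, 42) on a zero-initialized ring leaves the tail at 1 and desc[0] at 42. The point of the sketch is that the fence already provides the ordering, so the store that follows it can be relaxed.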

Comments

Alexander Duyck March 23, 2018, 6:25 p.m. UTC | #1
On Fri, Mar 23, 2018 at 11:21 AM, Sinan Kaya <okaya@codeaurora.org> wrote:
> Code includes wmb() followed by writel() in multiple places. writel()
> already has a barrier on some architectures like arm64.
>
> This ends up CPU observing two barriers back to back before executing the
> register write.
>
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
>
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> ---
>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> index 815cb1a..9e684b1 100644
> --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> @@ -725,7 +725,12 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
>                  * such as IA-64).
>                  */
>                 wmb();
> -               writel(i, rx_ring->tail);
> +               writel_relaxed(i, rx_ring->tail);
> +
> +               /* We need this if more than one processor can write to our tail
> +                * at a time, it synchronizes IO on IA64/Altix systems
> +                */
> +               mmiowb();
>         }

The mmiowb shouldn't be needed for Rx. Only one CPU will be running
NAPI for the queue and we will synchronize this with a full writel
anyway when we re-enable the interrupts.

>  }
>
> @@ -1232,7 +1237,12 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
>                  * know there are new descriptors to fetch.
>                  */
>                 wmb();
> -               writel(xdp_ring->next_to_use, xdp_ring->tail);
> +               writel_relaxed(xdp_ring->next_to_use, xdp_ring->tail);
> +
> +               /* We need this if more than one processor can write to our tail
> +                * at a time, it synchronizes IO on IA64/Altix systems
> +                */
> +               mmiowb();
>         }
>
>         u64_stats_update_begin(&rx_ring->syncp);
> @@ -4004,7 +4014,12 @@ static void ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
>         tx_ring->next_to_use = i;
>
>         /* notify HW of packet */
> -       writel(i, tx_ring->tail);
> +       writel_relaxed(i, tx_ring->tail);
> +
> +       /* We need this if more than one processor can write to our tail
> +        * at a time, it synchronizes IO on IA64/Altix systems
> +        */
> +       mmiowb();
>
>         return;
>  dma_error:
> --
> 2.7.4
>
Sinan Kaya March 23, 2018, 6:27 p.m. UTC | #2
On 3/23/2018 2:25 PM, Alexander Duyck wrote:
>> +               /* We need this if more than one processor can write to our tail
>> +                * at a time, it synchronizes IO on IA64/Altix systems
>> +                */
>> +               mmiowb();
>>         }
> The mmiowb shouldn't be needed for Rx. Only one CPU will be running
> NAPI for the queue and we will synchronize this with a full writel
> anyway when we re-enable the interrupts.
> 

OK. I can fix this in the next version. I did a blanket search and replace for
my writel_relaxed() changes because I don't know the code well enough.

Please point me to the redundant ones.
Alexander Duyck March 23, 2018, 6:31 p.m. UTC | #3
On Fri, Mar 23, 2018 at 11:27 AM, Sinan Kaya <okaya@codeaurora.org> wrote:
> On 3/23/2018 2:25 PM, Alexander Duyck wrote:
>>> +               /* We need this if more than one processor can write to our tail
>>> +                * at a time, it synchronizes IO on IA64/Altix systems
>>> +                */
>>> +               mmiowb();
>>>         }
>> The mmiowb shouldn't be needed for Rx. Only one CPU will be running
>> NAPI for the queue and we will synchronize this with a full writel
>> anyway when we re-enable the interrupts.
>>
>
> OK. I can fix this on the next version. I did a blanket search and replace for
> my writel_relaxed() changes as I don't know the code well enough.
>
> Please point me to the redundant ones.

So from what I can tell, only this file and i40e needed additional
mmiowb() calls. The rest are not needed.

- Alex
Sinan Kaya March 23, 2018, 6:45 p.m. UTC | #4
On 3/23/2018 2:31 PM, Alexander Duyck wrote:
>> Please point me to the redundant ones.
> So from what I can tell only this file and i40e needed any additional
> mmiowb calls added. The rest are not needed.


Thanks, I'll clean up patches 2 through 6 and then make your suggested
changes to patches 1 and 7.
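
One possible reading of the review outcome, sketched as a userspace mock rather than real driver code: the Rx refill path keeps wmb() plus the relaxed write and drops mmiowb(), while the Tx path keeps the relaxed write followed by mmiowb(). All demo_* names are hypothetical stand-ins that only count calls so the difference can be checked; they are not the kernel primitives. The Tx helper mirrors only what the posted hunk shows, so any barrier earlier in ixgbevf_tx_map() is outside the hunk and is not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Call counters for the mock primitives (illustration only). */
struct demo_stats {
	unsigned int wmb;
	unsigned int relaxed_writes;
	unsigned int mmiowb;
};

static struct demo_stats demo_stats;

/* Hypothetical stand-ins: they record the call, nothing more. */
static void demo_wmb(void)
{
	demo_stats.wmb++;
}

static void demo_writel_relaxed(uint32_t val, volatile uint32_t *reg)
{
	*reg = val;
	demo_stats.relaxed_writes++;
}

static void demo_mmiowb(void)
{
	demo_stats.mmiowb++;
}

/* Rx refill: per the review, only one CPU runs NAPI for the queue,
 * so no mmiowb() follows the tail write.
 */
static void demo_rx_bump_tail(volatile uint32_t *tail, uint32_t i)
{
	demo_wmb();
	demo_writel_relaxed(i, tail);
}

/* Tx: as in the posted hunk, a relaxed tail write followed by mmiowb(),
 * since more than one processor may write the tail.
 */
static void demo_tx_bump_tail(volatile uint32_t *tail, uint32_t i)
{
	demo_writel_relaxed(i, tail);
	demo_mmiowb();
}
```

After one call to each helper, the counters show one wmb, two relaxed writes and a single mmiowb, the last coming from the Tx path only.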

Patch

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 815cb1a..9e684b1 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -725,7 +725,12 @@  static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
 		 * such as IA-64).
 		 */
 		wmb();
-		writel(i, rx_ring->tail);
+		writel_relaxed(i, rx_ring->tail);
+
+		/* We need this if more than one processor can write to our tail
+		 * at a time, it synchronizes IO on IA64/Altix systems
+		 */
+		mmiowb();
 	}
 }
 
@@ -1232,7 +1237,12 @@  static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 		 * know there are new descriptors to fetch.
 		 */
 		wmb();
-		writel(xdp_ring->next_to_use, xdp_ring->tail);
+		writel_relaxed(xdp_ring->next_to_use, xdp_ring->tail);
+
+		/* We need this if more than one processor can write to our tail
+		 * at a time, it synchronizes IO on IA64/Altix systems
+		 */
+		mmiowb();
 	}
 
 	u64_stats_update_begin(&rx_ring->syncp);
@@ -4004,7 +4014,12 @@  static void ixgbevf_tx_map(struct ixgbevf_ring *tx_ring,
 	tx_ring->next_to_use = i;
 
 	/* notify HW of packet */
-	writel(i, tx_ring->tail);
+	writel_relaxed(i, tx_ring->tail);
+
+	/* We need this if more than one processor can write to our tail
+	 * at a time, it synchronizes IO on IA64/Altix systems
+	 */
+	mmiowb();
 
 	return;
 dma_error: