[v2] ethernet:arc: Fix racing of TX ring buffer

Message ID 20160517152520.GA2750@debian-dorm
State New

Commit Message

Shuyu Wei May 17, 2016, 3:25 p.m. UTC
Setting the FOR_EMAC flag must be the last step when modifying a buffer
descriptor, otherwise a race with the hardware may occur.

The loop counter i in tx_clean() is not needed; instead, make sure the
loop does not clean txbds that are still in flight.

Signed-off-by: Shuyu Wei <sy.w@outlook.com>
---
Changes in v2:
- Remove loop counter in tx_clean and check for unfinished txbds (Ueimor)
- Use dma_wmb() to sync writes (Ueimor)

Comments

Lino Sanfilippo May 17, 2016, 4:36 p.m. UTC | #1
Hi,

> Von: "Shuyu Wei" <wsy2220@gmail.com>
> @@ -685,13 +684,15 @@ static int arc_emac_tx(struct sk_buff *skb, struct net_device *ndev)
>  	wmb();
>  
>  	skb_tx_timestamp(skb);
> +	priv->tx_buff[*txbd_curr].skb = skb;
> +
> +	dma_wmb();
>  
>  	*info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
>  
>  	/* Make sure info word is set */
>  	wmb();
>  
> -	priv->tx_buff[*txbd_curr].skb = skb;
>  
>  	/* Increment index to point to the next BD */
>  	*txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;
> 

I wonder if this is correct. AFAIK a dma_wmb() only guarantees ordering of writes to DMA memory.
The assignment of the skb, however, is to normal RAM, not to DMA memory.


Regards,
Lino
David Miller May 17, 2016, 6:24 p.m. UTC | #2
From: Shuyu Wei <wsy2220@gmail.com>
Date: Tue, 17 May 2016 23:25:20 +0800

> diff --git a/drivers/net/ethernet/arc/emac_main.c b/drivers/net/ethernet/arc/emac_main.c
> index a3a9392..df3dfef 100644
> --- a/drivers/net/ethernet/arc/emac_main.c
> +++ b/drivers/net/ethernet/arc/emac_main.c
> @@ -153,9 +153,8 @@ static void arc_emac_tx_clean(struct net_device *ndev)
>  {
>  	struct arc_emac_priv *priv = netdev_priv(ndev);
>  	struct net_device_stats *stats = &ndev->stats;
> -	unsigned int i;
>  
> -	for (i = 0; i < TX_BD_NUM; i++) {
> +	while (priv->txbd_dirty != priv->txbd_curr) {
>  		unsigned int *txbd_dirty = &priv->txbd_dirty;
>  		struct arc_emac_bd *txbd = &priv->txbd[*txbd_dirty];
>  		struct buffer_state *tx_buff = &priv->tx_buff[*txbd_dirty];
> @@ -685,13 +684,15 @@ static int arc_emac_tx(struct sk_buff *skb, struct net_device *ndev)
>  	wmb();
>  
>  	skb_tx_timestamp(skb);
> +	priv->tx_buff[*txbd_curr].skb = skb;
> +
> +	dma_wmb();
>  
>  	*info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
>  
>  	/* Make sure info word is set */
>  	wmb();
>  
> -	priv->tx_buff[*txbd_curr].skb = skb;
>  
>  	/* Increment index to point to the next BD */
>  	*txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;
> 

These memory barriers do not look correct to me.

dma_wmb() is about visibility between CPU reads/writes and device
accesses to a piece of memory.  But what you're concerned about wrt.
the SKB pointer assignment is CPU to CPU accesses.  Therefore something
like smp_wmb() would be appropriate.

And the wmb() looks like it should be a dma_wmb().
Francois Romieu May 18, 2016, 12:01 a.m. UTC | #3
David Miller <davem@davemloft.net> :
> From: Shuyu Wei <wsy2220@gmail.com>
> Date: Tue, 17 May 2016 23:25:20 +0800
> 
> > diff --git a/drivers/net/ethernet/arc/emac_main.c b/drivers/net/ethernet/arc/emac_main.c
> > index a3a9392..df3dfef 100644
> > --- a/drivers/net/ethernet/arc/emac_main.c
> > +++ b/drivers/net/ethernet/arc/emac_main.c
> > @@ -153,9 +153,8 @@ static void arc_emac_tx_clean(struct net_device *ndev)
> >  {
> >  	struct arc_emac_priv *priv = netdev_priv(ndev);
> >  	struct net_device_stats *stats = &ndev->stats;
> > -	unsigned int i;
> >  
> > -	for (i = 0; i < TX_BD_NUM; i++) {
> > +	while (priv->txbd_dirty != priv->txbd_curr) {
> >  		unsigned int *txbd_dirty = &priv->txbd_dirty;
> >  		struct arc_emac_bd *txbd = &priv->txbd[*txbd_dirty];
> >  		struct buffer_state *tx_buff = &priv->tx_buff[*txbd_dirty];
> > @@ -685,13 +684,15 @@ static int arc_emac_tx(struct sk_buff *skb, struct net_device *ndev)
> >  	wmb();
> >  
> >  	skb_tx_timestamp(skb);
> > +	priv->tx_buff[*txbd_curr].skb = skb;
> > +
> > +	dma_wmb();
> >  
> >  	*info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
> >  
> >  	/* Make sure info word is set */
> >  	wmb();
> >  
> > -	priv->tx_buff[*txbd_curr].skb = skb;
> >  
> >  	/* Increment index to point to the next BD */
> >  	*txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;
> > 
> 
> These memory barriers do not look correct to me.
> 
> dma_wmb() is about visibility between CPU reads/writes and device
> accesses to a piece of memory.  But what you're concerned about wrt.
> the SKB pointer assignment is CPU to CPU accesses.  Therefore something
> like smp_wmb() would be appropriate.

Something like:

 	skb_tx_timestamp(skb);

	/* CPU write vs device access. Must be done before releasing control
	 * of the descriptor (*info).
	 */
	dma_wmb();

	priv->tx_buff[*txbd_curr].skb = skb;

	/* CPU arc_emac_tx_clean vs CPU arc_emac_tx. Must be done before
	 * index (tx_curr) update. It does not necessarily need to be done
	 * before releasing control of the descriptor (*info) due to
	 * descriptor vs index ordering.
	 *
	 * FIXME: missing smp_rmb before the while loop in arc_emac_tx_clean.
	 */
	smp_wmb();

 	*info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);

	/* local descriptor (*info) update vs index (tx_curr) update. */
 	wmb();

	*txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;

	smp_mb();	// The driver already contains this one.

The smp_wmb() and wmb() could be made to sit side-by-side once *info is
updated, but I don't see an adequate idiom to improve the smp_wmb + wmb
combo. :o/

> And the wmb() looks like it should be a dma_wmb().

I see two points against it:
- it could be too late for skb_tx_timestamp().
- arc_emac_tx_clean must not see an index update before the device
  got a chance to acquire the descriptor. arc_emac_tx_clean can't
  tell the difference between an about-to-be-released descriptor
  and a returned-from-device one.

Patch

diff --git a/drivers/net/ethernet/arc/emac_main.c b/drivers/net/ethernet/arc/emac_main.c
index a3a9392..df3dfef 100644
--- a/drivers/net/ethernet/arc/emac_main.c
+++ b/drivers/net/ethernet/arc/emac_main.c
@@ -153,9 +153,8 @@  static void arc_emac_tx_clean(struct net_device *ndev)
 {
 	struct arc_emac_priv *priv = netdev_priv(ndev);
 	struct net_device_stats *stats = &ndev->stats;
-	unsigned int i;
 
-	for (i = 0; i < TX_BD_NUM; i++) {
+	while (priv->txbd_dirty != priv->txbd_curr) {
 		unsigned int *txbd_dirty = &priv->txbd_dirty;
 		struct arc_emac_bd *txbd = &priv->txbd[*txbd_dirty];
 		struct buffer_state *tx_buff = &priv->tx_buff[*txbd_dirty];
@@ -685,13 +684,15 @@  static int arc_emac_tx(struct sk_buff *skb, struct net_device *ndev)
 	wmb();
 
 	skb_tx_timestamp(skb);
+	priv->tx_buff[*txbd_curr].skb = skb;
+
+	dma_wmb();
 
 	*info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
 
 	/* Make sure info word is set */
 	wmb();
 
-	priv->tx_buff[*txbd_curr].skb = skb;
 
 	/* Increment index to point to the next BD */
 	*txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;