diff mbox

[v6,2/4] net: moxa: replace build_skb() with netdev_alloc_skb_ip_align() / memcpy()

Message ID 1408976542-15624-1-git-send-email-jonas.jensen@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jonas Jensen Aug. 25, 2014, 2:22 p.m. UTC
build_skb() is used to make skbs out of existing RX ring memory
which is bad because the RX ring is allocated only once, on probe.
Memory corruption occur because said memory is reclaimed, i.e.
__kfree_skb() (and eventually put_page()).

Replace build_skb() with netdev_alloc_skb_ip_align() and use memcpy().

Remove SKB_DATA_ALIGN() from RX buffer size while we're at it.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=69041

Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com>
---

Notes:
    Changes since v5:
    
    1. broke out DMA synchronization to separate patch
    
    Applies to next-20140825

 drivers/net/ethernet/moxa/moxart_ether.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Comments

David Miller Aug. 26, 2014, 12:26 a.m. UTC | #1
From: Jonas Jensen <jonas.jensen@gmail.com>
Date: Mon, 25 Aug 2014 16:22:22 +0200

> build_skb() is used to make skbs out of existing RX ring memory
> which is bad because the RX ring is allocated only once, on probe.
> Memory corruption occur because said memory is reclaimed, i.e.
> __kfree_skb() (and eventually put_page()).
> 
> Replace build_skb() with netdev_alloc_skb_ip_align() and use memcpy().
> 
> Remove SKB_DATA_ALIGN() from RX buffer size while we're at it.
> 
> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=69041
> 
> Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com>

Applied.
Arnd Bergmann Aug. 26, 2014, 9:04 a.m. UTC | #2
On Monday 25 August 2014 16:22:22 Jonas Jensen wrote:
> @@ -226,13 +226,15 @@ static int moxart_rx_poll(struct napi_struct *napi, int budget)
>                 if (len > RX_BUF_SIZE)
>                         len = RX_BUF_SIZE;
>  
> -               skb = build_skb(priv->rx_buf[rx_head], priv->rx_buf_size);
> +               skb = netdev_alloc_skb_ip_align(ndev, len);
> +
>                 if (unlikely(!skb)) {
> -                       net_dbg_ratelimited("build_skb failed\n");
> +                       net_dbg_ratelimited("netdev_alloc_skb_ip_align failed\n");
>                         priv->stats.rx_dropped++;
>                         priv->stats.rx_errors++;
>                 }
>  
> +               memcpy(skb->data, priv->rx_buf[rx_head], len);
>                 skb_put(skb, len);
>                 skb->protocol = eth_type_trans(skb, ndev);
>                 napi_gro_receive(&priv->napi, skb);

While this seems correct, I wonder why you don't do the normal approach of
dequeuing the skb from the chain and adding a newly allocated skb to it to
save the memcpy.

	Arnd
David Laight Aug. 26, 2014, 9:10 a.m. UTC | #3
From: Arnd Bergmann
> On Monday 25 August 2014 16:22:22 Jonas Jensen wrote:
> > @@ -226,13 +226,15 @@ static int moxart_rx_poll(struct napi_struct *napi, int budget)
> >                 if (len > RX_BUF_SIZE)
> >                         len = RX_BUF_SIZE;
> >
> > -               skb = build_skb(priv->rx_buf[rx_head], priv->rx_buf_size);
> > +               skb = netdev_alloc_skb_ip_align(ndev, len);
> > +
> >                 if (unlikely(!skb)) {
> > -                       net_dbg_ratelimited("build_skb failed\n");
> > +                       net_dbg_ratelimited("netdev_alloc_skb_ip_align failed\n");
> >                         priv->stats.rx_dropped++;
> >                         priv->stats.rx_errors++;
> >                 }
> >
> > +               memcpy(skb->data, priv->rx_buf[rx_head], len);

Is this memcpy() aligned?
If the hardware can receive to a 4n+2 offset it is probably better to
copy the two bytes before the frame data and to round the length of to
a whole number of words.

> >                 skb_put(skb, len);
> >                 skb->protocol = eth_type_trans(skb, ndev);
> >                 napi_gro_receive(&priv->napi, skb);
> 
> While this seems correct, I wonder why you don't do the normal approach of
> dequeuing the skb from the chain and adding a newly allocated skb to it to
> save the memcpy.

Because the receive buffer area isn't made of skbs.
Post-allocating the skb also reduces the 'true size' of the skb.

	David
Eric Dumazet Aug. 26, 2014, 10:55 a.m. UTC | #4
On Tue, 2014-08-26 at 09:10 +0000, David Laight wrote:
> From: Arnd Bergmann

> > While this seems correct, I wonder why you don't do the normal approach of
> > dequeuing the skb from the chain and adding a newly allocated skb to it to
> > save the memcpy.
> 
> Because the receive buffer area isn't made of skbs.
> Post-allocating the skb also reduces the 'true size' of the skb.

This strategy assumes this is not a 10Gbe NIC.

We try to avoid copies because they are generally not needed.

Wifi devices are usually slow, and packet losses are more frequent, so
the copybreak gives better chance to not doing the collapses [1] later
in the TCP stack.

[1] collapses : reducing skb overhead (skb->len / skb->truesize ratio)
diff mbox

Patch

diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c
index eed70d9..d66058d 100644
--- a/drivers/net/ethernet/moxa/moxart_ether.c
+++ b/drivers/net/ethernet/moxa/moxart_ether.c
@@ -226,13 +226,15 @@  static int moxart_rx_poll(struct napi_struct *napi, int budget)
 		if (len > RX_BUF_SIZE)
 			len = RX_BUF_SIZE;
 
-		skb = build_skb(priv->rx_buf[rx_head], priv->rx_buf_size);
+		skb = netdev_alloc_skb_ip_align(ndev, len);
+
 		if (unlikely(!skb)) {
-			net_dbg_ratelimited("build_skb failed\n");
+			net_dbg_ratelimited("netdev_alloc_skb_ip_align failed\n");
 			priv->stats.rx_dropped++;
 			priv->stats.rx_errors++;
 		}
 
+		memcpy(skb->data, priv->rx_buf[rx_head], len);
 		skb_put(skb, len);
 		skb->protocol = eth_type_trans(skb, ndev);
 		napi_gro_receive(&priv->napi, skb);
@@ -464,8 +466,7 @@  static int moxart_mac_probe(struct platform_device *pdev)
 	spin_lock_init(&priv->txlock);
 
 	priv->tx_buf_size = TX_BUF_SIZE;
-	priv->rx_buf_size = RX_BUF_SIZE +
-			    SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+	priv->rx_buf_size = RX_BUF_SIZE;
 
 	priv->tx_desc_base = dma_alloc_coherent(NULL, TX_REG_DESC_SIZE *
 						TX_DESC_NUM, &priv->tx_base,