
[net-next] bcm63xx_enet: batch process rx path

Message ID 20201204054616.26876-1-liew.s.piaw@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series: [net-next] bcm63xx_enet: batch process rx path

Checks

Context Check Description
netdev/cover_letter success
netdev/fixes_present success
netdev/patch_count success
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 25 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success
netdev/stable success Stable not CCed

Commit Message

Sieng-Piaw Liew Dec. 4, 2020, 5:46 a.m. UTC
Use netif_receive_skb_list to batch process rx skb.
Tested on BCM6328 320 MHz using iperf3 -M 512, increasing performance
by 12.5%.

Before:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-30.00  sec   120 MBytes  33.7 Mbits/sec  277         sender
[  4]   0.00-30.00  sec   120 MBytes  33.5 Mbits/sec            receiver

After:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-30.00  sec   136 MBytes  37.9 Mbits/sec  203         sender
[  4]   0.00-30.00  sec   135 MBytes  37.7 Mbits/sec            receiver

Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>
---
 drivers/net/ethernet/broadcom/bcm63xx_enet.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
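
[Editor's note] For readers unfamiliar with the API, the pattern the patch adopts looks roughly like the sketch below. This is a generic NAPI rx loop, not the driver's actual code; my_rx_desc_ready() and my_build_skb() are hypothetical placeholders for the descriptor handling that bcm_enet_receive_queue() does inline.

#include <linux/etherdevice.h>
#include <linux/list.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Hypothetical helpers standing in for driver-specific descriptor code. */
static bool my_rx_desc_ready(struct net_device *dev);
static struct sk_buff *my_build_skb(struct net_device *dev);

/* Sketch only: queue rx skbs on a list and hand them to the stack in
 * one call instead of passing them up one at a time.
 */
static int my_receive_queue(struct net_device *dev, int budget)
{
	LIST_HEAD(rx_list);
	int processed = 0;

	while (processed < budget && my_rx_desc_ready(dev)) {
		struct sk_buff *skb = my_build_skb(dev);

		if (!skb)
			break;
		skb->protocol = eth_type_trans(skb, dev);
		dev->stats.rx_packets++;
		list_add_tail(&skb->list, &rx_list);
		processed++;
	}

	/* One call delivers the whole batch, amortising per-packet
	 * stack-entry overhead across the NAPI budget.
	 */
	netif_receive_skb_list(&rx_list);

	return processed;
}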

Comments

Eric Dumazet Dec. 4, 2020, 9:50 a.m. UTC | #1
On 12/4/20 6:46 AM, Sieng Piaw Liew wrote:
> Use netif_receive_skb_list to batch process rx skb.
> Tested on BCM6328 320 MHz using iperf3 -M 512, increasing performance
> by 12.5%.
> 



Well, the real question is why you do not simply use GRO,
to get a 100% performance gain or more for TCP flows.

netif_receive_skb_list() is no longer needed; the GRO layer
already uses batching for non-TCP packets.

We probably should mark it deprecated.

diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
index 916824cca3fda194c42fefec7f514ced1a060043..6fdbe231b7c1b27f523889bda8a20ab7eaab65a6 100644
--- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
+++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
@@ -391,7 +391,7 @@ static int bcm_enet_receive_queue(struct net_device *dev, int budget)
                skb->protocol = eth_type_trans(skb, dev);
                dev->stats.rx_packets++;
                dev->stats.rx_bytes += len;
-               netif_receive_skb(skb);
+               napi_gro_receive(&priv->napi, skb);
 
        } while (--budget > 0);
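
[Editor's note] Eric's suggestion is the standard GRO-enabled NAPI pattern, sketched below under assumed names (struct my_priv, my_receive_one()); it is not bcm63xx_enet's actual poll routine. The point is that napi_gro_receive() coalesces TCP segments and internally batches everything else before it reaches the stack.

#include <linux/etherdevice.h>
#include <linux/kernel.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>

struct my_priv {				/* hypothetical driver state */
	struct napi_struct napi;
	struct net_device *dev;
};

/* Hypothetical helper returning the next completed rx skb, or NULL. */
static struct sk_buff *my_receive_one(struct my_priv *priv);

static int my_napi_poll(struct napi_struct *napi, int budget)
{
	struct my_priv *priv = container_of(napi, struct my_priv, napi);
	struct sk_buff *skb;
	int work_done = 0;

	while (work_done < budget && (skb = my_receive_one(priv))) {
		skb->protocol = eth_type_trans(skb, priv->dev);
		/* GRO merges TCP segments into large skbs and batches
		 * other packets internally before passing them up.
		 */
		napi_gro_receive(napi, skb);
		work_done++;
	}

	if (work_done < budget)
		napi_complete_done(napi, work_done);

	return work_done;
}
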
Florian Fainelli Dec. 4, 2020, 6:24 p.m. UTC | #2
On 12/3/2020 9:46 PM, Sieng Piaw Liew wrote:
> Use netif_receive_skb_list to batch process rx skb.
> Tested on BCM6328 320 MHz using iperf3 -M 512, increasing performance
> by 12.5%.
> 
> Before:
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-30.00  sec   120 MBytes  33.7 Mbits/sec  277         sender
> [  4]   0.00-30.00  sec   120 MBytes  33.5 Mbits/sec            receiver
> 
> After:
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-30.00  sec   136 MBytes  37.9 Mbits/sec  203         sender
> [  4]   0.00-30.00  sec   135 MBytes  37.7 Mbits/sec            receiver
> 
> Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>

Your patches are all dependent on one another and part of a series, so
please add a cover letter and order them so they can be applied in the
correct order, after you address Eric's feedback. Thank you.
Sieng-Piaw Liew Dec. 9, 2020, 3:33 a.m. UTC | #3
On Fri, Dec 04, 2020 at 10:50:45AM +0100, Eric Dumazet wrote:
> 
> 
> On 12/4/20 6:46 AM, Sieng Piaw Liew wrote:
> > Use netif_receive_skb_list to batch process rx skb.
> > Tested on BCM6328 320 MHz using iperf3 -M 512, increasing performance
> > by 12.5%.
> > 
> 
> 
> 
> Well, the real question is why you do not simply use GRO,
> to get a 100% performance gain or more for TCP flows.
> 
> netif_receive_skb_list() is no longer needed; the GRO layer
> already uses batching for non-TCP packets.
> 
> We probably should mark it deprecated.
> 
> diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> index 916824cca3fda194c42fefec7f514ced1a060043..6fdbe231b7c1b27f523889bda8a20ab7eaab65a6 100644
> --- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> +++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> @@ -391,7 +391,7 @@ static int bcm_enet_receive_queue(struct net_device *dev, int budget)
>                 skb->protocol = eth_type_trans(skb, dev);
>                 dev->stats.rx_packets++;
>                 dev->stats.rx_bytes += len;
> -               netif_receive_skb(skb);
> +               napi_gro_receive(&priv->napi, skb);
>  
>         } while (--budget > 0);
>  

The bcm63xx router SoC has neither the CPU power nor a hardware
accelerator to perform checksum validation fast enough for GRO/GSO.

I have tested napi_gro_receive() on a LAN-WAN setup. The resulting
bandwidth dropped from 95 Mbps wire speed down to 80 Mbps, and it was
inconsistent, with spikes and drops of >5 Mbps.

The ag71xx driver for the ath79 router SoC reverted its use for the
same reason:
http://lists.infradead.org/pipermail/lede-commits/2017-October/004864.html
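
[Editor's note] The checksum point is the crux: GRO pays off when the hardware has already validated the L4 checksum, so the driver can mark the skb and the GRO/TCP code skips the software verification. The illustration below is generic driver code under an assumed descriptor flag, not bcm63xx_enet code, since this hardware has no such offload.

#include <linux/skbuff.h>

/* Illustration only: how drivers with rx checksum offload report it.
 * hw_l4_csum_ok is a hypothetical descriptor status bit; bcm63xx_enet
 * has no equivalent, so its skbs stay CHECKSUM_NONE and GRO must
 * verify every coalesced TCP segment in software on the 320 MHz CPU.
 */
static void my_set_rx_csum(struct sk_buff *skb, bool hw_l4_csum_ok)
{
	if (hw_l4_csum_ok)
		skb->ip_summed = CHECKSUM_UNNECESSARY;
	else
		skb->ip_summed = CHECKSUM_NONE;
}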

Patch

diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
index 916824cca3fd..b82b7805c36a 100644
--- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
+++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
@@ -297,10 +297,12 @@  static void bcm_enet_refill_rx_timer(struct timer_list *t)
 static int bcm_enet_receive_queue(struct net_device *dev, int budget)
 {
 	struct bcm_enet_priv *priv;
+	struct list_head rx_list;
 	struct device *kdev;
 	int processed;
 
 	priv = netdev_priv(dev);
+	INIT_LIST_HEAD(&rx_list);
 	kdev = &priv->pdev->dev;
 	processed = 0;
 
@@ -391,10 +393,12 @@  static int bcm_enet_receive_queue(struct net_device *dev, int budget)
 		skb->protocol = eth_type_trans(skb, dev);
 		dev->stats.rx_packets++;
 		dev->stats.rx_bytes += len;
-		netif_receive_skb(skb);
+		list_add_tail(&skb->list, &rx_list);
 
 	} while (--budget > 0);
 
+	netif_receive_skb_list(&rx_list);
+
 	if (processed || !priv->rx_desc_count) {
 		bcm_enet_refill_rx(dev);