Message ID | 20210211161830.17366-2-TheSven73@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | lan743x speed boost |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net-next |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | success | CCed 5 of 5 maintainers |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 414 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
netdev/stable | success | Stable not CCed |
On Thursday, February 11, 2021 7:18:26 PM MSK you wrote:
> From: Sven Van Asbroeck <thesven73@gmail.com>
>
> The buffers in the lan743x driver's receive ring are always 9K,
> even when the largest packet that can be received (the mtu) is
> much smaller. This performs particularly badly on cpu archs
> without dma cache snooping (such as ARM): each received packet
> results in a 9K dma_{map|unmap} operation, which is very expensive
> because cpu caches need to be invalidated.
>
> Careful measurement of the driver rx path on armv7 reveals that
> the cpu spends the majority of its time waiting for cache
> invalidation.
>
> Optimize by keeping the rx ring buffer size as close as possible
> to the mtu. This limits the amount of cache that requires
> invalidation.
>
> This optimization would normally force us to re-allocate all
> ring buffers when the mtu is changed - a disruptive event,
> because it can only happen when the network interface is down.
>
> Remove the need to re-allocate all ring buffers by adding support
> for multi-buffer frames. Now any combination of mtu and ring
> buffer size will work. When the mtu changes from mtu1 to mtu2,
> consumed buffers of size mtu1 are lazily replaced by newly
> allocated buffers of size mtu2.
>
> These optimizations double the rx performance on armv7.
> Third parties report 3x rx speedup on armv8.
>
> Tested with iperf3 on a freescale imx6qp + lan7430, both sides
> set to mtu 1500 bytes, measure rx performance:
>
> Before:
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-20.00  sec   550 MBytes   231 Mbits/sec    0
> After:
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-20.00  sec  1.33 GBytes   570 Mbits/sec    0
>
> Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
> ---

(For reference on the current speed, the response to v1 of the patch can be found at
https://lkml.org/lkml/2021/2/5/472 )

Hi Sven,
although the whole set of tests might be overly extensive, after applying patch v2 [1/5]
the test results are:

sbauer@metamini ~/devel/kernel-works/net-next.git lan743x_virtual_phy$ ifmtu eth7 500
mtu = 500
sbauer@metamini ~/devel/kernel-works/net-next.git lan743x_virtual_phy$ sudo test_ber -l eth7 -c 1000 -n 1000000 -f500 --no-conf
...
number of sent packets = 1000000
number of received packets = 747411
number of lost packets = 252589
number of out of order packets = 0
number of bit errors = 0
total errors detected = 252589
bit error rate = 0.252589
average speed: 408.0757 Mbit/s
...
number of sent packets = 1000000
number of received packets = 738377
number of lost packets = 261623
number of out of order packets = 0
number of bit errors = 0
total errors detected = 261623
bit error rate = 0.261623
average speed: 413.1470 Mbit/s
...
number of sent packets = 1000000
number of received packets = 738142
number of lost packets = 261858
number of out of order packets = 0
number of bit errors = 0
total errors detected = 261858
bit error rate = 0.261858
average speed: 413.2262 Mbit/s
...
number of sent packets = 1000000
number of received packets = 708973
number of lost packets = 291027
number of out of order packets = 0
number of bit errors = 0
total errors detected = 291027
bit error rate = 0.291027
average speed: 430.6224 Mbit/s
...
number of sent packets = 1000000
number of received packets = 725452
number of lost packets = 274548
number of out of order packets = 0
number of bit errors = 0
total errors detected = 274548
bit error rate = 0.274548
average speed: 420.7341 Mbit/s

sbauer@metamini ~/devel/kernel-works/net-next.git lan743x_virtual_phy$ ifmtu eth7 1500
mtu = 1500
sbauer@metamini ~/devel/kernel-works/net-next.git lan743x_virtual_phy$ sudo test_ber -l eth7 -c 1000 -n 1000000 -f500 --no-conf
...
number of sent packets = 1000000
number of received packets = 714228
number of lost packets = 285772
number of out of order packets = 0
number of bit errors = 0
total errors detected = 285772
bit error rate = 0.285772
average speed: 427.1300 Mbit/s
...
number of sent packets = 1000000
number of received packets = 750055
number of lost packets = 249945
number of out of order packets = 0
number of bit errors = 0
total errors detected = 249945
bit error rate = 0.249945
average speed: 405.0383 Mbit/s
...
number of sent packets = 1000000
number of received packets = 689458
number of lost packets = 310542
number of out of order packets = 0
number of bit errors = 0
total errors detected = 310542
bit error rate = 0.310542
average speed: 442.5301 Mbit/s

number of sent packets = 1000000
number of received packets = 676830
number of lost packets = 323170
number of out of order packets = 0
number of bit errors = 0
total errors detected = 323170
bit error rate = 0.32317
average speed: 450.9439 Mbit/s

number of sent packets = 1000000
number of received packets = 701719
number of lost packets = 298281
number of out of order packets = 0
number of bit errors = 0
total errors detected = 298281
bit error rate = 0.298281
average speed: 434.7563 Mbit/s

sbauer@metamini ~/devel/kernel-works/net-next.git lan743x_virtual_phy$ sudo test_ber -l eth7 -c 1000 -n 1000000 -f1500 --no-conf
...
number of sent packets = 1000000
number of received packets = 1000000
number of lost packets = 0
number of out of order packets = 0
number of bit errors = 0
total errors detected = 0
bit error rate = 0
average speed: 643.5758 Mbit/s
...
number of sent packets = 1000000
number of received packets = 1000000
number of lost packets = 0
number of out of order packets = 0
number of bit errors = 0
total errors detected = 0
bit error rate = 0
average speed: 644.7713 Mbit/s
...
number of sent packets = 1000000
number of received packets = 1000000
number of lost packets = 0
number of out of order packets = 0
number of bit errors = 0
total errors detected = 0
bit error rate = 0
average speed: 645.4407 Mbit/s
...
number of sent packets = 1000000
number of received packets = 1000000
number of lost packets = 0
number of out of order packets = 0
number of bit errors = 0
total errors detected = 0
bit error rate = 0
average speed: 645.6741 Mbit/s
...
number of sent packets = 1000000
number of received packets = 1000000
number of lost packets = 0
number of out of order packets = 0
number of bit errors = 0
total errors detected = 0
bit error rate = 0
average speed: 646.0109 Mbit/s

sbauer@metamini ~/devel/kernel-works/net-next.git lan743x_virtual_phy$ ifmtu eth7 9216
mtu = 9216
sbauer@metamini ~/devel/kernel-works/net-next.git lan743x_virtual_phy$ sudo test_ber -l eth7 -c 1000 -n 1000000 -f1500 --no-conf
...
number of sent packets = 1000000
number of received packets = 575141
number of lost packets = 424859
number of out of order packets = 0
number of bit errors = 0
total errors detected = 424859
bit error rate = 0.424859
average speed: 646.7859 Mbit/s
...
number of sent packets = 1000000
number of received packets = 583353
number of lost packets = 416647
number of out of order packets = 0
number of bit errors = 0
total errors detected = 416647
bit error rate = 0.416647
average speed: 637.8472 Mbit/s
...
number of sent packets = 1000000
number of received packets = 577127
number of lost packets = 422873
number of out of order packets = 0
number of bit errors = 0
total errors detected = 422873
bit error rate = 0.422873
average speed: 644.5562 Mbit/s
...
number of sent packets = 1000000
number of received packets = 576916
number of lost packets = 423084
number of out of order packets = 0
number of bit errors = 0
total errors detected = 423084
bit error rate = 0.423084
average speed: 644.8260 Mbit/s
...
number of sent packets = 1000000
number of received packets = 577154
number of lost packets = 422846
number of out of order packets = 0
number of bit errors = 0
total errors detected = 422846
bit error rate = 0.422846
average speed: 644.6815 Mbit/s

sbauer@metamini ~/devel/kernel-works/net-next.git lan743x_virtual_phy$ sudo test_ber -l eth7 -c 1000 -n 1000000 -f9216 --no-conf
...
number of sent packets = 1000000
number of received packets = 1000000
number of lost packets = 0
number of out of order packets = 0
number of bit errors = 0
total errors detected = 0
bit error rate = 0
average speed: 775.2005 Mbit/s
...
number of sent packets = 1000000
number of received packets = 999998
number of lost packets = 2
number of out of order packets = 0
number of bit errors = 0
total errors detected = 2
bit error rate = 2e-06
average speed: 775.0468 Mbit/s
...
number of sent packets = 1000000
number of received packets = 999998
number of lost packets = 2
number of out of order packets = 0
number of bit errors = 0
total errors detected = 2
bit error rate = 2e-06
average speed: 775.2150 Mbit/s
...
number of sent packets = 1000000
number of received packets = 999997
number of lost packets = 3
number of out of order packets = 0
number of bit errors = 0
total errors detected = 3
bit error rate = 3e-06
average speed: 775.2666 Mbit/s
...
number of sent packets = 1000000
number of received packets = 999999
number of lost packets = 1
number of out of order packets = 0
number of bit errors = 0
total errors detected = 1
bit error rate = 1e-06
average speed: 775.2182 Mbit/s
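A note on reading these counters: in every run above, the reported "bit error rate" equals the number of lost packets divided by the number of sent packets, with zero actual bit errors. The sketch below reproduces that figure from the raw counters. It is an illustration only: the function name is hypothetical, and that test_ber computes the value this way is an assumption inferred from the printed numbers, not taken from the tool's source.

```c
#include <assert.h>
#include <stdint.h>

/* Fraction of packets lost, from the "sent" and "received" counters.
 * ASSUMPTION: when no bit errors are detected, test_ber's "bit error rate"
 * is simply lost / sent. This matches the runs quoted in this thread, but
 * the tool's real formula is not shown here.
 */
double loss_ratio(uint64_t sent, uint64_t received)
{
	assert(sent > 0 && received <= sent);
	return (double)(sent - received) / (double)sent;
}
```

For the first run at mtu 500, `loss_ratio(1000000, 747411)` gives 0.252589, the same value the tool printed.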
Hi Sergej, thank you for testing this!

On Thu, Feb 11, 2021 at 7:18 PM Sergej Bauer <sbauer@blackbox.su> wrote:
>
> although the whole set of tests might be overly extensive, after applying patch v2 [1/5]
> the test results are:

I am unfamiliar with the test_ber tool. Does this patch improve things?
On Friday, February 12, 2021 3:27:40 AM MSK you wrote:
> Hi Sergej, thank you for testing this!
Don't mention it, it's just a small bit of assistance.

> On Thu, Feb 11, 2021 at 7:18 PM Sergej Bauer <sbauer@blackbox.su> wrote:
> > although the whole set of tests might be overly extensive, after
> > applying patch v2 [1/5]
> > the test results are:
> I am unfamiliar with the test_ber tool. Does this patch improve things?
v1 does a great job: the number of lost packets decreased by a factor of 2.5-3.
Apart from that, without the patch I had a bit error rate of about 0.423531 with MTU=1500,
and now with this patch BER=0.
The results of v2 are about the same as the results of v1.
Tomorrow I can test it over a wider range of frame sizes.
I can also test v2 again tomorrow, if it needs to be tested again.
Hi Sven,

> Subject: [PATCH net-next v2 1/5] lan743x: boost performance on cpu archs
> w/o dma cache snooping
>
> From: Sven Van Asbroeck <thesven73@gmail.com>
>
> The buffers in the lan743x driver's receive ring are always 9K, even when the
> largest packet that can be received (the mtu) is much smaller. This performs
> particularly badly on cpu archs without dma cache snooping (such as ARM):
> each received packet results in a 9K dma_{map|unmap} operation, which is
> very expensive because cpu caches need to be invalidated.
>
> Careful measurement of the driver rx path on armv7 reveals that the cpu
> spends the majority of its time waiting for cache invalidation.
>
> Optimize by keeping the rx ring buffer size as close as possible to the mtu.
> This limits the amount of cache that requires invalidation.
>
> This optimization would normally force us to re-allocate all ring buffers when
> the mtu is changed - a disruptive event, because it can only happen when
> the network interface is down.
>
> Remove the need to re-allocate all ring buffers by adding support for
> multi-buffer frames. Now any combination of mtu and ring buffer size will work.
> When the mtu changes from mtu1 to mtu2, consumed buffers of size mtu1
> are lazily replaced by newly allocated buffers of size mtu2.
>
> These optimizations double the rx performance on armv7.
> Third parties report 3x rx speedup on armv8.
>
> Tested with iperf3 on a freescale imx6qp + lan7430, both sides set to mtu
> 1500 bytes, measure rx performance:
>
> Before:
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-20.00  sec   550 MBytes   231 Mbits/sec    0
> After:
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-20.00  sec  1.33 GBytes   570 Mbits/sec    0
>
> Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>

Looks good

Reviewed-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
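To make the sizing argument in the commit message concrete, here is a small user-space sketch of the buffer length before and after the patch, and of how many ring buffers a frame spans once buffers track the mtu. The length formula (mtu + ETH_HLEN + 4 + RX_HEAD_PADDING) comes from the patch itself. The value used for RX_HEAD_PADDING, the 9 * 1024 stand-in for LAN743X_MAX_FRAME_SIZE, and all function names are assumptions made for illustration; the per-frame count also ignores the head padding reserved in the first buffer.

```c
#include <assert.h>

#define ETH_HLEN        14          /* Ethernet header length, as in <linux/if_ether.h> */
#define RX_EXTRA_LEN    4           /* the literal "+ 4" in the patch */
#define RX_HEAD_PADDING 2           /* ASSUMPTION: the real value lives in lan743x_main.h */
#define MAX_FRAME_SIZE  (9 * 1024)  /* ASSUMPTION: stand-in for LAN743X_MAX_FRAME_SIZE ("9K") */

/* Buffer length before the patch: fixed, regardless of the mtu. */
int rx_buffer_length_old(void)
{
	return MAX_FRAME_SIZE + ETH_HLEN + RX_EXTRA_LEN + RX_HEAD_PADDING;
}

/* Buffer length after the patch: follows the current mtu. */
int rx_buffer_length_new(int mtu)
{
	return mtu + ETH_HLEN + RX_EXTRA_LEN + RX_HEAD_PADDING;
}

/* Number of ring buffers needed to hold frame_len bytes when each buffer
 * holds buf_len bytes (simplified: head padding is not accounted for).
 */
int buffers_per_frame(int frame_len, int buf_len)
{
	assert(frame_len > 0 && buf_len > 0);
	return (frame_len + buf_len - 1) / buf_len;
}
```

Under these assumptions, an mtu of 1500 gives 1520-byte buffers instead of 9236-byte ones, so each dma map/unmap covers roughly a sixth of the memory. A jumbo frame arriving while the ring still holds 1520-byte buffers simply spans several of them, which is the case the multi-buffer support handles.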
diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
index f1f6eba4ace4..0c48bb559719 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -1955,15 +1955,6 @@ static int lan743x_rx_next_index(struct lan743x_rx *rx, int index)
 	return ((++index) % rx->ring_size);
 }
 
-static struct sk_buff *lan743x_rx_allocate_skb(struct lan743x_rx *rx)
-{
-	int length = 0;
-
-	length = (LAN743X_MAX_FRAME_SIZE + ETH_HLEN + 4 + RX_HEAD_PADDING);
-	return __netdev_alloc_skb(rx->adapter->netdev,
-				  length, GFP_ATOMIC | GFP_DMA);
-}
-
 static void lan743x_rx_update_tail(struct lan743x_rx *rx, int index)
 {
 	/* update the tail once per 8 descriptors */
@@ -1972,36 +1963,40 @@ static void lan743x_rx_update_tail(struct lan743x_rx *rx, int index)
 				  index);
 }
 
-static int lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index,
-					struct sk_buff *skb)
+static int lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index)
 {
+	struct net_device *netdev = rx->adapter->netdev;
+	struct device *dev = &rx->adapter->pdev->dev;
 	struct lan743x_rx_buffer_info *buffer_info;
 	struct lan743x_rx_descriptor *descriptor;
-	int length = 0;
+	struct sk_buff *skb;
+	dma_addr_t dma_ptr;
+	int length;
+
+	length = netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING;
 
-	length = (LAN743X_MAX_FRAME_SIZE + ETH_HLEN + 4 + RX_HEAD_PADDING);
 	descriptor = &rx->ring_cpu_ptr[index];
 	buffer_info = &rx->buffer_info[index];
-	buffer_info->skb = skb;
-	if (!(buffer_info->skb))
+	skb = __netdev_alloc_skb(netdev, length, GFP_ATOMIC | GFP_DMA);
+	if (!skb)
 		return -ENOMEM;
-	buffer_info->dma_ptr = dma_map_single(&rx->adapter->pdev->dev,
-					      buffer_info->skb->data,
-					      length,
-					      DMA_FROM_DEVICE);
-	if (dma_mapping_error(&rx->adapter->pdev->dev,
-			      buffer_info->dma_ptr)) {
-		buffer_info->dma_ptr = 0;
+	dma_ptr = dma_map_single(dev, skb->data, length, DMA_FROM_DEVICE);
+	if (dma_mapping_error(dev, dma_ptr)) {
+		dev_kfree_skb_any(skb);
 		return -ENOMEM;
 	}
+	if (buffer_info->dma_ptr)
+		dma_unmap_single(dev, buffer_info->dma_ptr,
+				 buffer_info->buffer_length, DMA_FROM_DEVICE);
 
+	buffer_info->skb = skb;
+	buffer_info->dma_ptr = dma_ptr;
 	buffer_info->buffer_length = length;
 	descriptor->data1 = cpu_to_le32(DMA_ADDR_LOW32(buffer_info->dma_ptr));
 	descriptor->data2 = cpu_to_le32(DMA_ADDR_HIGH32(buffer_info->dma_ptr));
 	descriptor->data3 = 0;
 	descriptor->data0 = cpu_to_le32((RX_DESC_DATA0_OWN_ |
 			    (length & RX_DESC_DATA0_BUF_LENGTH_MASK_)));
-	skb_reserve(buffer_info->skb, RX_HEAD_PADDING);
 	lan743x_rx_update_tail(rx, index);
 
 	return 0;
@@ -2050,16 +2045,32 @@ static void lan743x_rx_release_ring_element(struct lan743x_rx *rx, int index)
 	memset(buffer_info, 0, sizeof(*buffer_info));
 }
 
-static int lan743x_rx_process_packet(struct lan743x_rx *rx)
+static struct sk_buff *
+lan743x_rx_trim_skb(struct sk_buff *skb, int frame_length)
+{
+	if (skb_linearize(skb)) {
+		dev_kfree_skb_irq(skb);
+		return NULL;
+	}
+	frame_length = max_t(int, 0, frame_length - RX_HEAD_PADDING - 2);
+	if (skb->len > frame_length) {
+		skb->tail -= skb->len - frame_length;
+		skb->len = frame_length;
+	}
+	return skb;
+}
+
+static int lan743x_rx_process_buffer(struct lan743x_rx *rx)
 {
-	struct skb_shared_hwtstamps *hwtstamps = NULL;
-	int result = RX_PROCESS_RESULT_NOTHING_TO_DO;
 	int current_head_index = le32_to_cpu(*rx->head_cpu_ptr);
+	struct lan743x_rx_descriptor *descriptor, *desc_ext;
+	struct net_device *netdev = rx->adapter->netdev;
+	int result = RX_PROCESS_RESULT_NOTHING_TO_DO;
 	struct lan743x_rx_buffer_info *buffer_info;
-	struct lan743x_rx_descriptor *descriptor;
+	int frame_length, buffer_length;
 	int extension_index = -1;
-	int first_index = -1;
-	int last_index = -1;
+	bool is_last, is_first;
+	struct sk_buff *skb;
 
 	if (current_head_index < 0 || current_head_index >= rx->ring_size)
 		goto done;
@@ -2067,163 +2078,121 @@ static int lan743x_rx_process_packet(struct lan743x_rx *rx)
 	if (rx->last_head < 0 || rx->last_head >= rx->ring_size)
 		goto done;
 
-	if (rx->last_head != current_head_index) {
-		descriptor = &rx->ring_cpu_ptr[rx->last_head];
-		if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
-			goto done;
+	if (rx->last_head == current_head_index)
+		goto done;
 
-		if (!(le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_FS_))
-			goto done;
+	descriptor = &rx->ring_cpu_ptr[rx->last_head];
+	if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
+		goto done;
+	buffer_info = &rx->buffer_info[rx->last_head];
 
-		first_index = rx->last_head;
-		if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_) {
-			last_index = rx->last_head;
-		} else {
-			int index;
-
-			index = lan743x_rx_next_index(rx, first_index);
-			while (index != current_head_index) {
-				descriptor = &rx->ring_cpu_ptr[index];
-				if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
-					goto done;
-
-				if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_) {
-					last_index = index;
-					break;
-				}
-				index = lan743x_rx_next_index(rx, index);
-			}
-		}
-		if (last_index >= 0) {
-			descriptor = &rx->ring_cpu_ptr[last_index];
-			if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_EXT_) {
-				/* extension is expected to follow */
-				int index = lan743x_rx_next_index(rx,
-								  last_index);
-				if (index != current_head_index) {
-					descriptor = &rx->ring_cpu_ptr[index];
-					if (le32_to_cpu(descriptor->data0) &
-					    RX_DESC_DATA0_OWN_) {
-						goto done;
-					}
-					if (le32_to_cpu(descriptor->data0) &
-					    RX_DESC_DATA0_EXT_) {
-						extension_index = index;
-					} else {
-						goto done;
-					}
-				} else {
-					/* extension is not yet available */
-					/* prevent processing of this packet */
-					first_index = -1;
-					last_index = -1;
-				}
-			}
-		}
-	}
-	if (first_index >= 0 && last_index >= 0) {
-		int real_last_index = last_index;
-		struct sk_buff *skb = NULL;
-		u32 ts_sec = 0;
-		u32 ts_nsec = 0;
-
-		/* packet is available */
-		if (first_index == last_index) {
-			/* single buffer packet */
-			struct sk_buff *new_skb = NULL;
-			int packet_length;
-
-			new_skb = lan743x_rx_allocate_skb(rx);
-			if (!new_skb) {
-				/* failed to allocate next skb.
-				 * Memory is very low.
-				 * Drop this packet and reuse buffer.
-				 */
-				lan743x_rx_reuse_ring_element(rx, first_index);
-				goto process_extension;
-			}
+	is_last = le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_;
+	is_first = le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_FS_;
 
-			buffer_info = &rx->buffer_info[first_index];
-			skb = buffer_info->skb;
-			descriptor = &rx->ring_cpu_ptr[first_index];
-
-			/* unmap from dma */
-			if (buffer_info->dma_ptr) {
-				dma_unmap_single(&rx->adapter->pdev->dev,
-						 buffer_info->dma_ptr,
-						 buffer_info->buffer_length,
-						 DMA_FROM_DEVICE);
-				buffer_info->dma_ptr = 0;
-				buffer_info->buffer_length = 0;
-			}
-			buffer_info->skb = NULL;
-			packet_length = RX_DESC_DATA0_FRAME_LENGTH_GET_
-					(le32_to_cpu(descriptor->data0));
-			skb_put(skb, packet_length - 4);
-			skb->protocol = eth_type_trans(skb,
-						       rx->adapter->netdev);
-			lan743x_rx_init_ring_element(rx, first_index, new_skb);
-		} else {
-			int index = first_index;
+	if (is_last && le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_EXT_) {
+		/* extension is expected to follow */
+		int index = lan743x_rx_next_index(rx, rx->last_head);
 
-			/* multi buffer packet not supported */
-			/* this should not happen since
-			 * buffers are allocated to be at least jumbo size
-			 */
+		if (index == current_head_index)
+			/* extension not yet available */
+			goto done;
+		desc_ext = &rx->ring_cpu_ptr[index];
+		if (le32_to_cpu(desc_ext->data0) & RX_DESC_DATA0_OWN_)
+			/* extension not yet available */
+			goto done;
+		if (!(le32_to_cpu(desc_ext->data0) & RX_DESC_DATA0_EXT_))
+			goto move_forward;
+		extension_index = index;
+	}
 
-			/* clean up buffers */
-			if (first_index <= last_index) {
-				while ((index >= first_index) &&
-				       (index <= last_index)) {
-					lan743x_rx_reuse_ring_element(rx,
-								      index);
-					index = lan743x_rx_next_index(rx,
-								      index);
-				}
-			} else {
-				while ((index >= first_index) ||
-				       (index <= last_index)) {
-					lan743x_rx_reuse_ring_element(rx,
-								      index);
-					index = lan743x_rx_next_index(rx,
-								      index);
-				}
-			}
-		}
+	/* Only the last buffer in a multi-buffer frame contains the total frame
+	 * length. All other buffers have a zero frame length. The chip
+	 * occasionally sends more buffers than strictly required to reach the
+	 * total frame length.
+	 * Handle this by adding all buffers to the skb in their entirety.
+	 * Once the real frame length is known, trim the skb.
+	 */
+	frame_length =
+		RX_DESC_DATA0_FRAME_LENGTH_GET_(le32_to_cpu(descriptor->data0));
+	buffer_length = buffer_info->buffer_length;
+
+	netdev_dbg(netdev, "%s%schunk: %d/%d",
+		   is_first ? "first " : " ",
+		   is_last ? "last " : " ",
+		   frame_length, buffer_length);
+
+	/* save existing skb, allocate new skb and map to dma */
+	skb = buffer_info->skb;
+	if (lan743x_rx_init_ring_element(rx, rx->last_head)) {
+		/* failed to allocate next skb.
+		 * Memory is very low.
+		 * Drop this packet and reuse buffer.
+		 */
+		lan743x_rx_reuse_ring_element(rx, rx->last_head);
+		/* drop packet that was being assembled */
+		dev_kfree_skb_irq(rx->skb_head);
+		rx->skb_head = NULL;
+		goto process_extension;
+	}
+
+	/* add buffers to skb via skb->frag_list */
+	if (is_first) {
+		skb_reserve(skb, RX_HEAD_PADDING);
+		skb_put(skb, buffer_length - RX_HEAD_PADDING);
+		if (rx->skb_head)
+			dev_kfree_skb_irq(rx->skb_head);
+		rx->skb_head = skb;
+	} else if (rx->skb_head) {
+		skb_put(skb, buffer_length);
+		if (skb_shinfo(rx->skb_head)->frag_list)
+			rx->skb_tail->next = skb;
+		else
+			skb_shinfo(rx->skb_head)->frag_list = skb;
+		rx->skb_tail = skb;
+		rx->skb_head->len += skb->len;
+		rx->skb_head->data_len += skb->len;
+		rx->skb_head->truesize += skb->truesize;
+	} else {
+		/* packet to assemble has already been dropped because one or
+		 * more of its buffers could not be allocated
+		 */
+		netdev_dbg(netdev, "drop buffer intended for dropped packet");
+		dev_kfree_skb_irq(skb);
+	}
 
 process_extension:
-		if (extension_index >= 0) {
-			descriptor = &rx->ring_cpu_ptr[extension_index];
-			buffer_info = &rx->buffer_info[extension_index];
-
-			ts_sec = le32_to_cpu(descriptor->data1);
-			ts_nsec = (le32_to_cpu(descriptor->data2) &
-				  RX_DESC_DATA2_TS_NS_MASK_);
-			lan743x_rx_reuse_ring_element(rx, extension_index);
-			real_last_index = extension_index;
-		}
+	if (extension_index >= 0) {
+		u32 ts_sec;
+		u32 ts_nsec;
 
-		if (!skb) {
-			result = RX_PROCESS_RESULT_PACKET_DROPPED;
-			goto move_forward;
-		}
+		ts_sec = le32_to_cpu(desc_ext->data1);
+		ts_nsec = (le32_to_cpu(desc_ext->data2) &
+			  RX_DESC_DATA2_TS_NS_MASK_);
+		if (rx->skb_head)
+			skb_hwtstamps(rx->skb_head)->hwtstamp =
+				ktime_set(ts_sec, ts_nsec);
+		lan743x_rx_reuse_ring_element(rx, extension_index);
+		rx->last_head = extension_index;
+		netdev_dbg(netdev, "process extension");
+	}
 
-		if (extension_index < 0)
-			goto pass_packet_to_os;
-		hwtstamps = skb_hwtstamps(skb);
-		if (hwtstamps)
-			hwtstamps->hwtstamp = ktime_set(ts_sec, ts_nsec);
+	if (is_last && rx->skb_head)
+		rx->skb_head = lan743x_rx_trim_skb(rx->skb_head, frame_length);
 
-pass_packet_to_os:
-		/* pass packet to OS */
-		napi_gro_receive(&rx->napi, skb);
-		result = RX_PROCESS_RESULT_PACKET_RECEIVED;
+	if (is_last && rx->skb_head) {
+		rx->skb_head->protocol = eth_type_trans(rx->skb_head,
+							rx->adapter->netdev);
+		netdev_dbg(netdev, "sending %d byte frame to OS",
+			   rx->skb_head->len);
+		napi_gro_receive(&rx->napi, rx->skb_head);
+		rx->skb_head = NULL;
+	}
 
 move_forward:
-		/* push tail and head forward */
-		rx->last_tail = real_last_index;
-		rx->last_head = lan743x_rx_next_index(rx, real_last_index);
-	}
+	/* push tail and head forward */
+	rx->last_tail = rx->last_head;
+	rx->last_head = lan743x_rx_next_index(rx, rx->last_head);
+	result = RX_PROCESS_RESULT_BUFFER_RECEIVED;
 done:
 	return result;
 }
@@ -2242,12 +2211,12 @@ static int lan743x_rx_napi_poll(struct napi_struct *napi, int weight)
 				  DMAC_INT_BIT_RXFRM_(rx->channel_number));
 	}
 	for (count = 0; count < weight; count++) {
-		result = lan743x_rx_process_packet(rx);
+		result = lan743x_rx_process_buffer(rx);
 		if (result == RX_PROCESS_RESULT_NOTHING_TO_DO)
 			break;
 	}
 	rx->frame_count += count;
-	if (count == weight || result == RX_PROCESS_RESULT_PACKET_RECEIVED)
+	if (count == weight || result == RX_PROCESS_RESULT_BUFFER_RECEIVED)
 		return weight;
 
 	if (!napi_complete_done(napi, count))
@@ -2359,9 +2328,7 @@ static int lan743x_rx_ring_init(struct lan743x_rx *rx)
 
 	rx->last_head = 0;
 	for (index = 0; index < rx->ring_size; index++) {
-		struct sk_buff *new_skb = lan743x_rx_allocate_skb(rx);
-
-		ret = lan743x_rx_init_ring_element(rx, index, new_skb);
+		ret = lan743x_rx_init_ring_element(rx, index);
 		if (ret)
 			goto cleanup;
 	}
diff --git a/drivers/net/ethernet/microchip/lan743x_main.h b/drivers/net/ethernet/microchip/lan743x_main.h
index 751f2bc9ce84..40dfb564c4f7 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.h
+++ b/drivers/net/ethernet/microchip/lan743x_main.h
@@ -698,6 +698,8 @@ struct lan743x_rx {
 	struct napi_struct napi;
 
 	u32 frame_count;
+
+	struct sk_buff *skb_head, *skb_tail;
 };
 
 struct lan743x_adapter {
@@ -831,8 +833,7 @@ struct lan743x_rx_buffer_info {
 #define LAN743X_RX_RING_SIZE (65)
 
 #define RX_PROCESS_RESULT_NOTHING_TO_DO (0)
-#define RX_PROCESS_RESULT_PACKET_RECEIVED (1)
-#define RX_PROCESS_RESULT_PACKET_DROPPED (2)
+#define RX_PROCESS_RESULT_BUFFER_RECEIVED (1)
 
 u32 lan743x_csr_read(struct lan743x_adapter *adapter, int offset);
 void lan743x_csr_write(struct lan743x_adapter *adapter, int offset, u32 data);
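The assemble-then-trim approach described in the patch comment (append every buffer in full, and trim only when the last buffer reveals the real frame length) can be modelled in user space roughly as follows. This is a simplified sketch, not the driver code: it copies chunks into a flat array, whereas the patch chains skbs through frag_list and linearizes them at the end. The struct, the function name and MAX_FRAME are hypothetical, and the adjustment the driver applies to frame_length before trimming (subtracting RX_HEAD_PADDING + 2) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FRAME 16384		/* hypothetical capacity for this sketch */

struct rx_frame {
	unsigned char data[MAX_FRAME];
	size_t len;		/* bytes accumulated so far */
	int assembling;		/* nonzero between a first and a last chunk */
};

/* Feed one ring buffer into the assembler.
 * Returns 1 when a complete frame is ready in f, 0 when more chunks are
 * expected, and -1 when the chunk was dropped.
 * frame_length is only meaningful on the last chunk, mirroring the
 * descriptor behaviour described in the patch comment.
 */
int rx_feed(struct rx_frame *f, const unsigned char *buf, size_t buf_len,
	    int is_first, int is_last, size_t frame_length)
{
	if (is_first) {
		f->len = 0;		/* discards any unfinished frame */
		f->assembling = 1;
	} else if (!f->assembling) {
		return -1;		/* chunk of an already dropped frame */
	}

	if (f->len + buf_len > MAX_FRAME) {
		f->assembling = 0;	/* drop the frame being assembled */
		return -1;
	}

	/* append the buffer in its entirety; the real length is not known yet */
	memcpy(f->data + f->len, buf, buf_len);
	f->len += buf_len;

	if (!is_last)
		return 0;

	/* last chunk: the total frame length is known, so trim the excess */
	if (f->len > frame_length)
		f->len = frame_length;
	f->assembling = 0;
	return 1;
}
```

Feeding three 1520-byte chunks whose last chunk reports a frame length of 3000 accumulates 4560 bytes and then trims to 3000, which also covers the case where the chip delivers more buffers than strictly needed. A middle chunk that arrives without a preceding first chunk is rejected, analogous to the "drop buffer intended for dropped packet" branch in the patch.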