Message ID | 20210106122403.1321180-1-kristian.evensen@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Netdev Maintainers |
Series | [net-next] qmi_wwan: Increase headroom for QMAP SKBs |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net-next |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | warning | 3 maintainers not CCed: kuba@kernel.org linux-usb@vger.kernel.org davem@davemloft.net |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 15 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
netdev/stable | success | Stable not CCed |
Kristian Evensen <kristian.evensen@gmail.com> writes:

> When measuring the throughput (iperf3 + TCP) while routing on a
> not-so-powerful device (Mediatek MT7621, 880MHz CPU), I noticed that I
> achieved significantly lower speeds with QMI-based modems than for
> example a USB LAN dongle. The CPU was saturated in all of my tests.
>
> With the dongle I got ~300 Mbit/s, while I only measured ~200 Mbit/s
> with the modems. All offloads, etc. were switched off for the dongle,
> and I configured the modems to use QMAP (16k aggregation). The tests
> with the dongle were performed in my local (gigabit) network, while the
> LTE network the modems were connected to delivers 700-800 Mbit/s.
>
> Profiling the kernel revealed the cause of the performance difference.
> In qmimux_rx_fixup(), an SKB is allocated for each packet contained in
> the URB. This SKB has too little headroom, causing the check in
> skb_cow() (called from ip_forward()) to fail. pskb_expand_head() is then
> called and the SKB is reallocated. In the output from perf, I see that a
> significant amount of time is spent in pskb_expand_head() + support
> functions.
>
> In order to ensure that the SKB has enough headroom, this commit
> increases the amount of memory allocated in qmimux_rx_fixup() by
> LL_MAX_HEADER. The reason for using LL_MAX_HEADER and not a more
> accurate value, is that we do not know the type of the outgoing network
> interface. After making this change, I achieve the same throughput with
> the modems as with the dongle.
>
> Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>

Nice work!

Just wondering: Will the same problem affect the usbnet allocated skbs
as well in case of raw-ip? They will obviously be large enough, but the
reserved headroom probably isn't when we put an IP packet there without
any L2 header?

In any case:

Acked-by: Bjørn Mork <bjorn@mork.no>

Hi Bjørn,

On Wed, Jan 6, 2021 at 3:31 PM Bjørn Mork <bjorn@mork.no> wrote:
> Nice work!

Thanks a lot!

> Just wondering: Will the same problem affect the usbnet allocated skbs
> as well in case of raw-ip? They will obviously be large enough, but the
> reserved headroom probably isn't when we put an IP packet there without
> any L2 header?

You are right, I completely forgot about those SKBs. I will try to find
some time to investigate the non-QMAP performance, if a similar fix (I
guess an skb_reserve after the case-statement is enough) will have an
effect and submit a follow-up patch in case. Thanks for reminding me, I
have switched to only use QMAP :)

BR,
Kristian

On Wed, 06 Jan 2021 15:31:10 +0100 Bjørn Mork wrote:
> Kristian Evensen <kristian.evensen@gmail.com> writes:
>
> > [...]
> >
> > Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
>
> Nice work!
>
> Just wondering: Will the same problem affect the usbnet allocated skbs
> as well in case of raw-ip? They will obviously be large enough, but the
> reserved headroom probably isn't when we put an IP packet there without
> any L2 header?
>
> In any case:
>
> Acked-by: Bjørn Mork <bjorn@mork.no>

Applied, thanks!

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index af19513a9..7ea113f51 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -186,7 +186,7 @@ static int qmimux_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
 		net = qmimux_find_dev(dev, hdr->mux_id);
 		if (!net)
 			goto skip;
-		skbn = netdev_alloc_skb(net, pkt_len);
+		skbn = netdev_alloc_skb(net, pkt_len + LL_MAX_HEADER);
 		if (!skbn)
 			return 0;
 		skbn->dev = net;
@@ -203,6 +203,7 @@ static int qmimux_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
 			goto skip;
 		}
 
+		skb_reserve(skbn, LL_MAX_HEADER);
 		skb_put_data(skbn, skb->data + offset + qmimux_hdr_sz, pkt_len);
 		if (netif_rx(skbn) != NET_RX_SUCCESS) {
 			net->stats.rx_errors++;

When measuring the throughput (iperf3 + TCP) while routing on a
not-so-powerful device (Mediatek MT7621, 880MHz CPU), I noticed that I
achieved significantly lower speeds with QMI-based modems than for
example a USB LAN dongle. The CPU was saturated in all of my tests.

With the dongle I got ~300 Mbit/s, while I only measured ~200 Mbit/s
with the modems. All offloads, etc. were switched off for the dongle,
and I configured the modems to use QMAP (16k aggregation). The tests
with the dongle were performed in my local (gigabit) network, while the
LTE network the modems were connected to delivers 700-800 Mbit/s.

Profiling the kernel revealed the cause of the performance difference.
In qmimux_rx_fixup(), an SKB is allocated for each packet contained in
the URB. This SKB has too little headroom, causing the check in
skb_cow() (called from ip_forward()) to fail. pskb_expand_head() is then
called and the SKB is reallocated. In the output from perf, I see that a
significant amount of time is spent in pskb_expand_head() + support
functions.

In order to ensure that the SKB has enough headroom, this commit
increases the amount of memory allocated in qmimux_rx_fixup() by
LL_MAX_HEADER. The reason for using LL_MAX_HEADER and not a more
accurate value, is that we do not know the type of the outgoing network
interface. After making this change, I achieve the same throughput with
the modems as with the dongle.

Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>

---
 drivers/net/usb/qmi_wwan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
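
The headroom mechanics described in the commit message can be illustrated with a small userspace sketch in C. This is not kernel code: `struct toy_skb`, the `toy_*` helpers, and `TOY_LL_MAX_HEADER` are hypothetical stand-ins that only loosely mirror `netdev_alloc_skb()`, `skb_reserve()`, `skb_put_data()`, and the headroom check behind `skb_cow()`. The model assumes the allocator adds no padding of its own and ignores everything else the real functions do (cloning, alignment, and so on). It only counts how often a forwarding step would have to reallocate the buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for LL_MAX_HEADER; the real value depends on kernel config. */
#define TOY_LL_MAX_HEADER 32

/* Toy buffer: [0, data_off) is headroom, [data_off, data_off + len) is payload. */
struct toy_skb {
	unsigned char *head;
	size_t size;       /* total bytes allocated */
	size_t data_off;   /* current headroom */
	size_t len;        /* payload length */
	unsigned reallocs; /* times toy_cow() had to reallocate */
};

static int toy_alloc(struct toy_skb *skb, size_t size)
{
	skb->head = malloc(size ? size : 1);
	if (!skb->head)
		return -1;
	skb->size = size;
	skb->data_off = 0;
	skb->len = 0;
	skb->reallocs = 0;
	return 0;
}

/* Move the payload start forward; only valid while the buffer is empty. */
static void toy_reserve(struct toy_skb *skb, size_t n)
{
	assert(skb->len == 0 && skb->data_off + n <= skb->size);
	skb->data_off += n;
}

/* Append payload bytes after any reserved headroom. */
static void toy_put_data(struct toy_skb *skb, const void *src, size_t n)
{
	assert(skb->data_off + skb->len + n <= skb->size);
	memcpy(skb->head + skb->data_off + skb->len, src, n);
	skb->len += n;
}

/* If headroom is too small, reallocate and copy (the expensive path). */
static int toy_cow(struct toy_skb *skb, size_t needed)
{
	size_t delta;
	unsigned char *nh;

	if (skb->data_off >= needed)
		return 0; /* enough headroom, nothing to do */

	delta = needed - skb->data_off;
	nh = malloc(skb->size + delta);
	if (!nh)
		return -1;
	memcpy(nh + needed, skb->head + skb->data_off, skb->len);
	free(skb->head);
	skb->head = nh;
	skb->size += delta;
	skb->data_off = needed;
	skb->reallocs++;
	return 0;
}

/*
 * Receive one packet, with or without reserved headroom, then "forward" it
 * to an interface that needs `needed` bytes of headroom.
 * Returns the number of reallocations, or -1 on allocation failure.
 */
static int toy_rx_then_forward(const void *pkt, size_t pkt_len,
			       int reserve, size_t needed)
{
	struct toy_skb skb;
	size_t extra = reserve ? TOY_LL_MAX_HEADER : 0;
	int reallocs;

	if (toy_alloc(&skb, pkt_len + extra))
		return -1;
	if (reserve)
		toy_reserve(&skb, TOY_LL_MAX_HEADER);
	toy_put_data(&skb, pkt, pkt_len);
	if (toy_cow(&skb, needed)) {
		free(skb.head);
		return -1;
	}
	reallocs = (int)skb.reallocs;
	free(skb.head);
	return reallocs;
}
```

In this model, calling `toy_rx_then_forward(pkt, len, 0, 16)` costs one reallocation per packet, while `toy_rx_then_forward(pkt, len, 1, 16)` costs none, which is the per-packet saving the patch is after.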