Message ID | 1533724802-30944-1-git-send-email-wgong@codeaurora.org (mailing list archive) |
---|---|
Headers | show |
Series | Change sk_pacing_shift in ieee80211_hw for best tx throughput | expand |
On 08/08/2018 03:40 AM, Wen Gong wrote: > Add a field for ath10k to adjust the sk_pacing_shift, mac80211 set > the default value to 8, and ath10k will change it to 6. Then mac80211 > will use the changed value 6 as sk_pacing_shift since 6 is the best > value for tx throughput by test result. I don't think you can convince people with the numbers unless you provide latency along with the numbers and also measurement result on different chipsets as Michal addressed (QCA4019, QCA9984, etc.) From users view point, I also agree on Toke that we cannot scarify latency for the small throughput improvement. Thanks, Peter
On 8/8/2018 9:00 PM, Peter Oh wrote: > > > On 08/08/2018 03:40 AM, Wen Gong wrote: >> Add a field for ath10k to adjust the sk_pacing_shift, mac80211 set >> the default value to 8, and ath10k will change it to 6. Then mac80211 >> will use the changed value 6 as sk_pacing_shift since 6 is the best >> value for tx throughput by test result. > I don't think you can convince people with the numbers unless you > provide latency along with the numbers and also measurement result on > different chipsets as Michal addressed (QCA4019, QCA9984, etc.) From > users view point, I also agree on Toke that we cannot scarify latency > for the small throughput improvement. Yeah. The wireless industry (admittedly that is me too :-p ) has been focused on just throughput long enough. All the preaching about bufferbloat from Dave and others is (just) starting to sink in here and there. Now as for the value of the sk_pacing_shift I think we agree it depends on the specific device so in that sense the api makes sense, but I think there are a lot of variables so I was wondering if we could introduce a sysctl parameter for it. Does that make sense? Regards, Arend
Arend van Spriel <arend.vanspriel@broadcom.com> writes: > On 8/8/2018 9:00 PM, Peter Oh wrote: >> >> >> On 08/08/2018 03:40 AM, Wen Gong wrote: >>> Add a field for ath10k to adjust the sk_pacing_shift, mac80211 set >>> the default value to 8, and ath10k will change it to 6. Then mac80211 >>> will use the changed value 6 as sk_pacing_shift since 6 is the best >>> value for tx throughput by test result. >> I don't think you can convince people with the numbers unless you >> provide latency along with the numbers and also measurement result on >> different chipsets as Michal addressed (QCA4019, QCA9984, etc.) From >> users view point, I also agree on Toke that we cannot scarify latency >> for the small throughput improvement. > > Yeah. The wireless industry (admittedly that is me too :-p ) has been > focused on just throughput long enough. Tell me about it ;) > All the preaching about bufferbloat from Dave and others is (just) > starting to sink in here and there. Yeah, I've noticed; this is good! > Now as for the value of the sk_pacing_shift I think we agree it > depends on the specific device so in that sense the api makes sense, > but I think there are a lot of variables so I was wondering if we > could introduce a sysctl parameter for it. Does that make sense? I'm not sure a sysctl parameter would make sense; for one thing, it would be global for the host, while different network interfaces will probably need different values. And for another, I don't think it's something a user can reasonably be expected to set correctly, and I think it *is* actually possible to pick a value that works well at the driver level. -Toke
On 8/10/2018 3:20 PM, Toke Høiland-Jørgensen wrote: > Arend van Spriel <arend.vanspriel@broadcom.com> writes: > >> On 8/8/2018 9:00 PM, Peter Oh wrote: >>> >>> >>> On 08/08/2018 03:40 AM, Wen Gong wrote: >>>> Add a field for ath10k to adjust the sk_pacing_shift, mac80211 set >>>> the default value to 8, and ath10k will change it to 6. Then mac80211 >>>> will use the changed value 6 as sk_pacing_shift since 6 is the best >>>> value for tx throughput by test result. >>> I don't think you can convince people with the numbers unless you >>> provide latency along with the numbers and also measurement result on >>> different chipsets as Michal addressed (QCA4019, QCA9984, etc.) From >>> users view point, I also agree on Toke that we cannot scarify latency >>> for the small throughput improvement. >> >> Yeah. The wireless industry (admittedly that is me too :-p ) has been >> focused on just throughput long enough. > > Tell me about it ;) > >> All the preaching about bufferbloat from Dave and others is (just) >> starting to sink in here and there. > > Yeah, I've noticed; this is good! > >> Now as for the value of the sk_pacing_shift I think we agree it >> depends on the specific device so in that sense the api makes sense, >> but I think there are a lot of variables so I was wondering if we >> could introduce a sysctl parameter for it. Does that make sense? > > I'm not sure a sysctl parameter would make sense; for one thing, it > would be global for the host, while different network interfaces will > probably need different values. And for another, I don't think it's > something a user can reasonably be expected to set correctly, and I > think it *is* actually possible to pick a value that works well at the > driver level. I not sure either. Do you think a user could come up with something like this (found here [1]): sysctl -w net.core.rmem_max=8388608 sysctl -w net.core.wmem_max=8388608 sysctl -w net.core.rmem_default=65536 sysctl -w net.core.wmem_default=65536 sysctl -w net.ipv4.tcp_rmem='4096 87380 8388608' sysctl -w net.ipv4.tcp_wmem='4096 65536 8388608' sysctl -w net.ipv4.tcp_mem='8388608 8388608 8388608' sysctl -w net.ipv4.route.flush=1 Now the page listing this config claims this is for use "on Linux 2.4+ for high-bandwidth applications". Beats me if it still is correct in 4.17. Anyway, sysctl is nice for parameterizing code that is built-in the kernel so you don't need to rebuild it. mac80211 tends to be a module in most distros so maybe sysctl is not a good fit. So lets agree on that. Picking a value at driver level may be possible, but a driver tends to support a number of different devices. So how do you see the picking work. Some static table with entries for the different devices? Regards, Arend [1] https://wwwx.cs.unc.edu/~sparkst/howto/network_tuning.php
On 08/10/2018 12:28 PM, Arend van Spriel wrote: > On 8/10/2018 3:20 PM, Toke Høiland-Jørgensen wrote: >> Arend van Spriel <arend.vanspriel@broadcom.com> writes: >> >>> On 8/8/2018 9:00 PM, Peter Oh wrote: >>>> >>>> >>>> On 08/08/2018 03:40 AM, Wen Gong wrote: >>>>> Add a field for ath10k to adjust the sk_pacing_shift, mac80211 set >>>>> the default value to 8, and ath10k will change it to 6. Then mac80211 >>>>> will use the changed value 6 as sk_pacing_shift since 6 is the best >>>>> value for tx throughput by test result. >>>> I don't think you can convince people with the numbers unless you >>>> provide latency along with the numbers and also measurement result on >>>> different chipsets as Michal addressed (QCA4019, QCA9984, etc.) From >>>> users view point, I also agree on Toke that we cannot scarify latency >>>> for the small throughput improvement. >>> >>> Yeah. The wireless industry (admittedly that is me too :-p ) has been >>> focused on just throughput long enough. >> >> Tell me about it ;) >> >>> All the preaching about bufferbloat from Dave and others is (just) >>> starting to sink in here and there. >> >> Yeah, I've noticed; this is good! >> >>> Now as for the value of the sk_pacing_shift I think we agree it >>> depends on the specific device so in that sense the api makes sense, >>> but I think there are a lot of variables so I was wondering if we >>> could introduce a sysctl parameter for it. Does that make sense? >> >> I'm not sure a sysctl parameter would make sense; for one thing, it >> would be global for the host, while different network interfaces will >> probably need different values. And for another, I don't think it's >> something a user can reasonably be expected to set correctly, and I >> think it *is* actually possible to pick a value that works well at the >> driver level. > > I not sure either. Do you think a user could come up with something like this (found here [1]): > > sysctl -w net.core.rmem_max=8388608 > sysctl -w net.core.wmem_max=8388608 > sysctl -w net.core.rmem_default=65536 > sysctl -w net.core.wmem_default=65536 > sysctl -w net.ipv4.tcp_rmem='4096 87380 8388608' > sysctl -w net.ipv4.tcp_wmem='4096 65536 8388608' > sysctl -w net.ipv4.tcp_mem='8388608 8388608 8388608' > sysctl -w net.ipv4.route.flush=1 > > Now the page listing this config claims this is for use "on Linux 2.4+ for high-bandwidth applications". Beats me if it still is correct in 4.17. > > Anyway, sysctl is nice for parameterizing code that is built-in the kernel so you don't need to rebuild it. mac80211 tends to be a module in most distros so > maybe sysctl is not a good fit. So lets agree on that. > > Picking a value at driver level may be possible, but a driver tends to support a number of different devices. So how do you see the picking work. Some static > table with entries for the different devices? Some users are not going to care about latency, and for others, latency may be absolutely important and they don't care about bandwidth. So, it should be tunable. sysctl can support per network-device settings, right? Or, probably could use ethtool API to set a per-netdev value as well. That might be nice for other network devices as well, not just wifi. If the driver is configuring the defaults, it can know the hardware type, firmware revision, and lots of other info to make the best decision it can when registering the radio with the upper stacks. Thanks, Ben
+ Eric On 8/10/2018 9:52 PM, Ben Greear wrote: > On 08/10/2018 12:28 PM, Arend van Spriel wrote: >> On 8/10/2018 3:20 PM, Toke Høiland-Jørgensen wrote: >>> Arend van Spriel <arend.vanspriel@broadcom.com> writes: >>> >>>> On 8/8/2018 9:00 PM, Peter Oh wrote: >>>>> >>>>> >>>>> On 08/08/2018 03:40 AM, Wen Gong wrote: >>>>>> Add a field for ath10k to adjust the sk_pacing_shift, mac80211 set >>>>>> the default value to 8, and ath10k will change it to 6. Then mac80211 >>>>>> will use the changed value 6 as sk_pacing_shift since 6 is the best >>>>>> value for tx throughput by test result. >>>>> I don't think you can convince people with the numbers unless you >>>>> provide latency along with the numbers and also measurement result on >>>>> different chipsets as Michal addressed (QCA4019, QCA9984, etc.) From >>>>> users view point, I also agree on Toke that we cannot scarify latency >>>>> for the small throughput improvement. >>>> >>>> Yeah. The wireless industry (admittedly that is me too :-p ) has been >>>> focused on just throughput long enough. >>> >>> Tell me about it ;) >>> >>>> All the preaching about bufferbloat from Dave and others is (just) >>>> starting to sink in here and there. >>> >>> Yeah, I've noticed; this is good! >>> >>>> Now as for the value of the sk_pacing_shift I think we agree it >>>> depends on the specific device so in that sense the api makes sense, >>>> but I think there are a lot of variables so I was wondering if we >>>> could introduce a sysctl parameter for it. Does that make sense? >>> >>> I'm not sure a sysctl parameter would make sense; for one thing, it >>> would be global for the host, while different network interfaces will >>> probably need different values. And for another, I don't think it's >>> something a user can reasonably be expected to set correctly, and I >>> think it *is* actually possible to pick a value that works well at the >>> driver level. >> >> I not sure either. Do you think a user could come up with something >> like this (found here [1]): >> >> sysctl -w net.core.rmem_max=8388608 >> sysctl -w net.core.wmem_max=8388608 >> sysctl -w net.core.rmem_default=65536 >> sysctl -w net.core.wmem_default=65536 >> sysctl -w net.ipv4.tcp_rmem='4096 87380 8388608' >> sysctl -w net.ipv4.tcp_wmem='4096 65536 8388608' >> sysctl -w net.ipv4.tcp_mem='8388608 8388608 8388608' >> sysctl -w net.ipv4.route.flush=1 >> >> Now the page listing this config claims this is for use "on Linux 2.4+ >> for high-bandwidth applications". Beats me if it still is correct in >> 4.17. >> >> Anyway, sysctl is nice for parameterizing code that is built-in the >> kernel so you don't need to rebuild it. mac80211 tends to be a module >> in most distros so >> maybe sysctl is not a good fit. So lets agree on that. >> >> Picking a value at driver level may be possible, but a driver tends to >> support a number of different devices. So how do you see the picking >> work. Some static >> table with entries for the different devices? > > Some users are not going to care about latency, and for others, latency may > be absolutely important and they don't care about bandwidth. > > So, it should be tunable. sysctl can support per network-device settings, > right? Or, probably could use ethtool API to set a per-netdev value as > well. > That might be nice for other network devices as well, not just wifi. I was under the impression that the parameters are all global, but your statement made me look. I came across some references here [2] so I checked the kernel sources under net/ and found net/ipv4/devinet.c [3]. So that confirms it supports per-netdev settings. The sk_pacing_shift is actually a socket setting although I did not find a user-space api to change it. For instance it is not a socket option. There might be a good reason for not exposing it to user-space (added Eric). Regards, Arend [2] https://askubuntu.com/questions/316492/disabling-ipv6-on-a-single-interface [3] https://elixir.bootlin.com/linux/latest/source/net/ipv4/devinet.c#L2309
Arend van Spriel <arend.vanspriel@broadcom.com> writes: > + Eric > > On 8/10/2018 9:52 PM, Ben Greear wrote: >> On 08/10/2018 12:28 PM, Arend van Spriel wrote: >>> On 8/10/2018 3:20 PM, Toke Høiland-Jørgensen wrote: >>>> Arend van Spriel <arend.vanspriel@broadcom.com> writes: >>>> >>>>> On 8/8/2018 9:00 PM, Peter Oh wrote: >>>>>> >>>>>> >>>>>> On 08/08/2018 03:40 AM, Wen Gong wrote: >>>>>>> Add a field for ath10k to adjust the sk_pacing_shift, mac80211 set >>>>>>> the default value to 8, and ath10k will change it to 6. Then mac80211 >>>>>>> will use the changed value 6 as sk_pacing_shift since 6 is the best >>>>>>> value for tx throughput by test result. >>>>>> I don't think you can convince people with the numbers unless you >>>>>> provide latency along with the numbers and also measurement result on >>>>>> different chipsets as Michal addressed (QCA4019, QCA9984, etc.) From >>>>>> users view point, I also agree on Toke that we cannot scarify latency >>>>>> for the small throughput improvement. >>>>> >>>>> Yeah. The wireless industry (admittedly that is me too :-p ) has been >>>>> focused on just throughput long enough. >>>> >>>> Tell me about it ;) >>>> >>>>> All the preaching about bufferbloat from Dave and others is (just) >>>>> starting to sink in here and there. >>>> >>>> Yeah, I've noticed; this is good! >>>> >>>>> Now as for the value of the sk_pacing_shift I think we agree it >>>>> depends on the specific device so in that sense the api makes sense, >>>>> but I think there are a lot of variables so I was wondering if we >>>>> could introduce a sysctl parameter for it. Does that make sense? >>>> >>>> I'm not sure a sysctl parameter would make sense; for one thing, it >>>> would be global for the host, while different network interfaces will >>>> probably need different values. And for another, I don't think it's >>>> something a user can reasonably be expected to set correctly, and I >>>> think it *is* actually possible to pick a value that works well at the >>>> driver level. >>> >>> I not sure either. Do you think a user could come up with something >>> like this (found here [1]): >>> >>> sysctl -w net.core.rmem_max=8388608 >>> sysctl -w net.core.wmem_max=8388608 >>> sysctl -w net.core.rmem_default=65536 >>> sysctl -w net.core.wmem_default=65536 >>> sysctl -w net.ipv4.tcp_rmem='4096 87380 8388608' >>> sysctl -w net.ipv4.tcp_wmem='4096 65536 8388608' >>> sysctl -w net.ipv4.tcp_mem='8388608 8388608 8388608' >>> sysctl -w net.ipv4.route.flush=1 >>> >>> Now the page listing this config claims this is for use "on Linux 2.4+ >>> for high-bandwidth applications". Beats me if it still is correct in >>> 4.17. >>> >>> Anyway, sysctl is nice for parameterizing code that is built-in the >>> kernel so you don't need to rebuild it. mac80211 tends to be a module >>> in most distros so >>> maybe sysctl is not a good fit. So lets agree on that. >>> >>> Picking a value at driver level may be possible, but a driver tends to >>> support a number of different devices. So how do you see the picking >>> work. Some static >>> table with entries for the different devices? >> >> Some users are not going to care about latency, and for others, latency may >> be absolutely important and they don't care about bandwidth. >> >> So, it should be tunable. sysctl can support per network-device settings, >> right? Or, probably could use ethtool API to set a per-netdev value as >> well. >> That might be nice for other network devices as well, not just wifi. > > I was under the impression that the parameters are all global, but your > statement made me look. I came across some references here [2] so I > checked the kernel sources under net/ and found net/ipv4/devinet.c [3]. > So that confirms it supports per-netdev settings. Yeah, I think that *if* this is to be made configurable, a per-netdev sysctl would be the way to go, with the driver being able to set the default. However, the reason I think it may not be worth it to expose this as a setting is that it is very much a case of diminishing returns. Once the buffer size is large enough that full aggregates can be built, increasing it further just adds latency with very little effect on throughput. Which means that fiddling with the parameter is not going to have a lot of effect, so it is not very useful to expose, which makes it not worth the added configuration complexity... -Toke
On 08/20/2018 05:46 AM, Toke Høiland-Jørgensen wrote: > Arend van Spriel <arend.vanspriel@broadcom.com> writes: > >> + Eric >> >> On 8/10/2018 9:52 PM, Ben Greear wrote: >>> On 08/10/2018 12:28 PM, Arend van Spriel wrote: >>>> On 8/10/2018 3:20 PM, Toke Høiland-Jørgensen wrote: >>>>> Arend van Spriel <arend.vanspriel@broadcom.com> writes: >>>>> >>>>>> On 8/8/2018 9:00 PM, Peter Oh wrote: >>>>>>> >>>>>>> >>>>>>> On 08/08/2018 03:40 AM, Wen Gong wrote: >>>>>>>> Add a field for ath10k to adjust the sk_pacing_shift, mac80211 set >>>>>>>> the default value to 8, and ath10k will change it to 6. Then mac80211 >>>>>>>> will use the changed value 6 as sk_pacing_shift since 6 is the best >>>>>>>> value for tx throughput by test result. >>>>>>> I don't think you can convince people with the numbers unless you >>>>>>> provide latency along with the numbers and also measurement result on >>>>>>> different chipsets as Michal addressed (QCA4019, QCA9984, etc.) From >>>>>>> users view point, I also agree on Toke that we cannot scarify latency >>>>>>> for the small throughput improvement. >>>>>> >>>>>> Yeah. The wireless industry (admittedly that is me too :-p ) has been >>>>>> focused on just throughput long enough. >>>>> >>>>> Tell me about it ;) >>>>> >>>>>> All the preaching about bufferbloat from Dave and others is (just) >>>>>> starting to sink in here and there. >>>>> >>>>> Yeah, I've noticed; this is good! >>>>> >>>>>> Now as for the value of the sk_pacing_shift I think we agree it >>>>>> depends on the specific device so in that sense the api makes sense, >>>>>> but I think there are a lot of variables so I was wondering if we >>>>>> could introduce a sysctl parameter for it. Does that make sense? >>>>> >>>>> I'm not sure a sysctl parameter would make sense; for one thing, it >>>>> would be global for the host, while different network interfaces will >>>>> probably need different values. And for another, I don't think it's >>>>> something a user can reasonably be expected to set correctly, and I >>>>> think it *is* actually possible to pick a value that works well at the >>>>> driver level. >>>> >>>> I not sure either. Do you think a user could come up with something >>>> like this (found here [1]): >>>> >>>> sysctl -w net.core.rmem_max=8388608 >>>> sysctl -w net.core.wmem_max=8388608 >>>> sysctl -w net.core.rmem_default=65536 >>>> sysctl -w net.core.wmem_default=65536 >>>> sysctl -w net.ipv4.tcp_rmem='4096 87380 8388608' >>>> sysctl -w net.ipv4.tcp_wmem='4096 65536 8388608' >>>> sysctl -w net.ipv4.tcp_mem='8388608 8388608 8388608' >>>> sysctl -w net.ipv4.route.flush=1 >>>> >>>> Now the page listing this config claims this is for use "on Linux 2.4+ >>>> for high-bandwidth applications". Beats me if it still is correct in >>>> 4.17. >>>> >>>> Anyway, sysctl is nice for parameterizing code that is built-in the >>>> kernel so you don't need to rebuild it. mac80211 tends to be a module >>>> in most distros so >>>> maybe sysctl is not a good fit. So lets agree on that. >>>> >>>> Picking a value at driver level may be possible, but a driver tends to >>>> support a number of different devices. So how do you see the picking >>>> work. Some static >>>> table with entries for the different devices? >>> >>> Some users are not going to care about latency, and for others, latency may >>> be absolutely important and they don't care about bandwidth. >>> >>> So, it should be tunable. sysctl can support per network-device settings, >>> right? Or, probably could use ethtool API to set a per-netdev value as >>> well. >>> That might be nice for other network devices as well, not just wifi. >> >> I was under the impression that the parameters are all global, but your >> statement made me look. I came across some references here [2] so I >> checked the kernel sources under net/ and found net/ipv4/devinet.c [3]. >> So that confirms it supports per-netdev settings. > > Yeah, I think that *if* this is to be made configurable, a per-netdev > sysctl would be the way to go, with the driver being able to set the > default. > > However, the reason I think it may not be worth it to expose this as a > setting is that it is very much a case of diminishing returns. Once the > buffer size is large enough that full aggregates can be built, > increasing it further just adds latency with very little effect on > throughput. Which means that fiddling with the parameter is not going to > have a lot of effect, so it is not very useful to expose, which makes it > not worth the added configuration complexity... If it were easy, it would already be correct. I think adding tuning knob and some documentation will allow users to more easily try different things and use what is best for them (and let the community at large know what works so maybe the defaults can be tweaked over time). Thanks, Ben