Message ID | CA+BoTQ=u_xPuqTVOVaFTQNRrJ+UTXe89SY+=+7Y1LpxxrkRDfg@mail.gmail.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
On Tue, 2015-02-10 at 11:33 +0100, Michal Kazior wrote: > + if (msdu->sk) { > + ewma_add(&ar->tx_delay_us, > + ktime_to_ns(ktime_sub(ktime_get(), skb_cb->stamp)) / > + NSEC_PER_USEC); > + > + ACCESS_ONCE(msdu->sk->sk_tx_completion_delay_cushion) = > + (ewma_read(&ar->tx_delay_us) * > + msdu->sk->sk_pacing_rate) >> 20; > + } > + Hi Michal This is almost it ;) As I said you must do this using u64 arithmetics, we still support 32bit kernels. Also, >> 20 instead of / 1000000 introduces a 5% error, I would use a plain divide, as the compiler will use a reciprocal divide (ie : a multiply) We use >> 10 instead of /1000 because a 2.4 % error is probably okay. ewma_add(&ar->tx_delay_us, ktime_to_ns(ktime_sub(ktime_get(), skb_cb->stamp)) / NSEC_PER_USEC); u64 val = (u64)ewma_read(&ar->tx_delay_us) * msdu->sk->sk_pacing_rate; do_div(val, USEC_PER_SEC); ACCESS_ONCE(msdu->sk->sk_tx_completion_delay_cushion) = (u32)val; (WRITE_ONCE() would be better for new kernels, but ACCESS_ONCE() is ok since we probably want to backport to stable kernels) Thanks -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2015-02-10 at 04:54 -0800, Eric Dumazet wrote: > Hi Michal > > This is almost it ;) > > As I said you must do this using u64 arithmetics, we still support 32bit > kernels. > > Also, >> 20 instead of / 1000000 introduces a 5% error, I would use a > plain divide, as the compiler will use a reciprocal divide (ie : a > multiply) > > We use >> 10 instead of /1000 because a 2.4 % error is probably okay. > > ewma_add(&ar->tx_delay_us, > ktime_to_ns(ktime_sub(ktime_get(), > skb_cb->stamp)) / > NSEC_PER_USEC); btw I suspect this wont compile on 32 bit kernel You need to use do_div() as well : u64 val = ktime_to_ns(ktime_sub(ktime_get(), skb_cb->stamp)); do_div(val, NSEC_PER_USEC); ewma_add(&ar->tx_delay_us, val); -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2015-02-10 at 11:33 +0100, Michal Kazior wrote: > ath10k_core_napi_dummy_poll, 64); > + ewma_init(&ar->tx_delay_us, 16384, 8); 1) 16384 factor might be too big. 2) a weight of 8 seems too low given aggregation values used in wifi. On 32bit arches, the max range for ewma value would be 262144 usec, a quarter of a second... You could use a factor of 64 instead, and a weight of 16. -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2015-02-10 at 11:33 +0100, Michal Kazior wrote: > + if (msdu->sk) { > + ewma_add(&ar->tx_delay_us, > + ktime_to_ns(ktime_sub(ktime_get(), skb_cb->stamp)) / > + NSEC_PER_USEC); > + > + ACCESS_ONCE(msdu->sk->sk_tx_completion_delay_cushion) = > + (ewma_read(&ar->tx_delay_us) * > + msdu->sk->sk_pacing_rate) >> 20; > + } To some extent, every wifi driver is going to have this problem. Perhaps we should do this in mac80211? johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2015-02-10 at 15:19 +0100, Johannes Berg wrote: > On Tue, 2015-02-10 at 11:33 +0100, Michal Kazior wrote: > > > + if (msdu->sk) { > > + ewma_add(&ar->tx_delay_us, > > + ktime_to_ns(ktime_sub(ktime_get(), skb_cb->stamp)) / > > + NSEC_PER_USEC); > > + > > + ACCESS_ONCE(msdu->sk->sk_tx_completion_delay_cushion) = > > + (ewma_read(&ar->tx_delay_us) * > > + msdu->sk->sk_pacing_rate) >> 20; > > + } > > To some extent, every wifi driver is going to have this problem. Perhaps > we should do this in mac80211? I'll provide the TCP patch. sk->sk_tx_completion_delay_cushion is probably a wrong name, as the units here are in bytes, since it is really number of bytes in the network driver that accommodate for tx completions delays. tx_completion_delay * pacing_rate sk_tx_completion_cushion maybe. -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10 February 2015 at 14:14, Eric Dumazet <eric.dumazet@gmail.com> wrote: > On Tue, 2015-02-10 at 11:33 +0100, Michal Kazior wrote: >> ath10k_core_napi_dummy_poll, 64); >> + ewma_init(&ar->tx_delay_us, 16384, 8); > > > 1) 16384 factor might be too big. > > 2) a weight of 8 seems too low given aggregation values used in wifi. > > On 32bit arches, the max range for ewma value would be 262144 usec, > a quarter of a second... > > You could use a factor of 64 instead, and a weight of 16. 64/16 seems to work fine as well. On a related note: I still wonder how to get single TCP flow to reach line rate with ath10k (it still doesn't; I reach line rate with multiple flows only). Isn't the tcp_limit_output_bytes just too small for devices like Wi-Fi where you can send aggregates of even 64*3*1500 bytes long in a single shot and you can't expect even a single tx-completion of it to come in before its transmitted entirely? You effectively operate with bursts of traffic. Some numbers: ath10k w/o cushion w/o aggregation 1 flow: UDP 65mbps, TCP 30mbps ath10k w/ cushion w/o aggregation 1 flow: UDP 65mbps, TCP 59mbps ath10k w/o cushion w/ aggregation 1 flow: UDP 650mbps, TCP 250mbps ath10k w/ cushion w/ aggregation 1 flow: UDP 650mbps, TCP 250mbps ath10k w/o cushion w/ aggregation 5 flows: UDP 650mbps, TCP 250mbps ath10k w/ cushion w/ aggregation 5 flows: UDP 650mbps, TCP 600mbps "w/o aggregation" means forcing ath10k to use 1 A-MSDU and 1 A-MPDU per aggregate so latencies due to aggregation itself should be pretty much nil. If I set tcp_limit_output_bytes to 700K+ I can get ath10k w/ cushion w/ aggregation to reach 600mbps on a single flow. Micha? -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10 February 2015 at 15:19, Johannes Berg <johannes@sipsolutions.net> wrote: > On Tue, 2015-02-10 at 11:33 +0100, Michal Kazior wrote: > >> + if (msdu->sk) { >> + ewma_add(&ar->tx_delay_us, >> + ktime_to_ns(ktime_sub(ktime_get(), skb_cb->stamp)) / >> + NSEC_PER_USEC); >> + >> + ACCESS_ONCE(msdu->sk->sk_tx_completion_delay_cushion) = >> + (ewma_read(&ar->tx_delay_us) * >> + msdu->sk->sk_pacing_rate) >> 20; >> + } > > To some extent, every wifi driver is going to have this problem. Perhaps > we should do this in mac80211? Good point. I was actually thinking about it. I can try cooking a patch unless you want to do it yourself :-) Micha? -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-02-11 at 09:33 +0100, Michal Kazior wrote: > If I set tcp_limit_output_bytes to 700K+ I can get ath10k w/ cushion > w/ aggregation to reach 600mbps on a single flow. You know, there is a reason this sysctl exists in the first place ;) The first suggestion I made to you was to raise it. The default setting must stay as is as long default Qdisc is pfifo_fast. I believe I already mentioned skb->truesize tricks for drivers willing to adjust the TSQ given their constraints. -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11 February 2015 at 14:17, Eric Dumazet <eric.dumazet@gmail.com> wrote: > On Wed, 2015-02-11 at 09:33 +0100, Michal Kazior wrote: > >> If I set tcp_limit_output_bytes to 700K+ I can get ath10k w/ cushion >> w/ aggregation to reach 600mbps on a single flow. > > You know, there is a reason this sysctl exists in the first place ;) > > The first suggestion I made to you was to raise it. > > The default setting must stay as is as long default Qdisc is pfifo_fast. > > I believe I already mentioned skb->truesize tricks for drivers willing > to adjust the TSQ given their constraints. Right. truesize didn't help in my early tests and once the cushion thing came about I had assumed that it's not relevant anymore. I just checked: @@ -2620,6 +2621,12 @@ static void ath10k_tx(struct ieee80211_hw *hw, if (info->flags & IEEE80211_TX_CTL_NO_CCK_RATE) ath10k_dbg(ar, ATH10K_DBG_MAC, "IEEE80211_TX_CTL_NO_CCK_RATE\n"); + if (skb->sk) { + u32 trim = skb->truesize - (skb->truesize / 8); + skb->truesize -= trim; + atomic_sub(trim, &skb->sk->sk_wmem_alloc); + } With this I get 600mbps on a single flow. The /2 wasn't enough (it barely made a difference, 250->300mbps). The question is how do I know how much of trimming is too much? Could the tx completion delay be used to compute the trim factor, hmm.. Maybe this should be done in mac80211 as well? Micha? -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11 February 2015 at 09:57, Michal Kazior <michal.kazior@tieto.com> wrote: > On 10 February 2015 at 15:19, Johannes Berg <johannes@sipsolutions.net> wrote: >> On Tue, 2015-02-10 at 11:33 +0100, Michal Kazior wrote: >> >>> + if (msdu->sk) { >>> + ewma_add(&ar->tx_delay_us, >>> + ktime_to_ns(ktime_sub(ktime_get(), skb_cb->stamp)) / >>> + NSEC_PER_USEC); >>> + >>> + ACCESS_ONCE(msdu->sk->sk_tx_completion_delay_cushion) = >>> + (ewma_read(&ar->tx_delay_us) * >>> + msdu->sk->sk_pacing_rate) >> 20; >>> + } >> >> To some extent, every wifi driver is going to have this problem. Perhaps >> we should do this in mac80211? > > Good point. I was actually thinking about it. I can try cooking a > patch unless you want to do it yourself :-) I've taken a look into this. The most obvious place to add the timestamp for each packet would be ieee80211_tx_info (i.e. the skb->cb[48]). The problem is it's very tight there. Even squeezing 2 bytes (allowing up to 64ms of tx completion delay which I'm worried won't be enough) will be troublesome. Some drivers already use every last byte of their allowance on 64bit archs (e.g. ar5523 uses entire 40 bytes of driver_data). I wonder if it's okay to bump skb->cb to 56 bytes to avoid the cascade of changes required to implement the tx completion delay accounting? Micha? -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Feb 11, 2015 at 11:48 PM, Michal Kazior <michal.kazior@tieto.com> wrote: > On 11 February 2015 at 09:57, Michal Kazior <michal.kazior@tieto.com> wrote: >> On 10 February 2015 at 15:19, Johannes Berg <johannes@sipsolutions.net> wrote: >>> On Tue, 2015-02-10 at 11:33 +0100, Michal Kazior wrote: >>> >>>> + if (msdu->sk) { >>>> + ewma_add(&ar->tx_delay_us, >>>> + ktime_to_ns(ktime_sub(ktime_get(), skb_cb->stamp)) / >>>> + NSEC_PER_USEC); >>>> + >>>> + ACCESS_ONCE(msdu->sk->sk_tx_completion_delay_cushion) = >>>> + (ewma_read(&ar->tx_delay_us) * >>>> + msdu->sk->sk_pacing_rate) >> 20; >>>> + } >>> >>> To some extent, every wifi driver is going to have this problem. Perhaps >>> we should do this in mac80211? >> >> Good point. I was actually thinking about it. I can try cooking a >> patch unless you want to do it yourself :-) > > I've taken a look into this. The most obvious place to add the > timestamp for each packet would be ieee80211_tx_info (i.e. the > skb->cb[48]). The problem is it's very tight there. Even squeezing 2 > bytes (allowing up to 64ms of tx completion delay which I'm worried I will argue strongly in favor of never allowing more than 4ms packets to accumulate in the firmware. > won't be enough) will be troublesome. Some drivers already use every > last byte of their allowance on 64bit archs (e.g. ar5523 uses entire > 40 bytes of driver_data). > > I wonder if it's okay to bump skb->cb to 56 bytes to avoid the cascade > of changes required to implement the tx completion delay accounting? > > > Micha? > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2015-02-12 at 08:48 +0100, Michal Kazior wrote: > > Good point. I was actually thinking about it. I can try cooking a > > patch unless you want to do it yourself :-) > > I've taken a look into this. The most obvious place to add the > timestamp for each packet would be ieee80211_tx_info (i.e. the > skb->cb[48]). The problem is it's very tight there. Even squeezing 2 > bytes (allowing up to 64ms of tx completion delay which I'm worried > won't be enough) will be troublesome. Some drivers already use every > last byte of their allowance on 64bit archs (e.g. ar5523 uses entire > 40 bytes of driver_data). Couldn't we just repurpose the existing skb->tstamp field for this, as long as the skb is fully contained within the wireless layer? Actually, it looks like we can't, since I guess timestamping options can be turned on on any socket. > I wonder if it's okay to bump skb->cb to 56 bytes to avoid the cascade > of changes required to implement the tx completion delay accounting? I have no doubt that would be rejected :) johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2015-02-24 at 11:24 +0100, Johannes Berg wrote: > On Thu, 2015-02-12 at 08:48 +0100, Michal Kazior wrote: > > > > Good point. I was actually thinking about it. I can try cooking a > > > patch unless you want to do it yourself :-) > > > > I've taken a look into this. The most obvious place to add the > > timestamp for each packet would be ieee80211_tx_info (i.e. the > > skb->cb[48]). The problem is it's very tight there. Even squeezing 2 > > bytes (allowing up to 64ms of tx completion delay which I'm worried > > won't be enough) will be troublesome. Some drivers already use every > > last byte of their allowance on 64bit archs (e.g. ar5523 uses entire > > 40 bytes of driver_data). > > Couldn't we just repurpose the existing skb->tstamp field for this, as > long as the skb is fully contained within the wireless layer? > > Actually, it looks like we can't, since I guess timestamping options can > be turned on on any socket. Actually, that creates a clone or a new skb? Hmm. Anyway, I think we should just move the vif and hw_key pointers out of the tx_info and into ieee80211_tx_control. That will give us plenty of space, and they can't really be used long-term anyway? Although ... both might be somewhat problematic for mac80211 itself? Hmm. johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2015-02-24 at 11:30 +0100, Johannes Berg wrote: > On Tue, 2015-02-24 at 11:24 +0100, Johannes Berg wrote: > > On Thu, 2015-02-12 at 08:48 +0100, Michal Kazior wrote: > > > > > > Good point. I was actually thinking about it. I can try cooking a > > > > patch unless you want to do it yourself :-) > > > > > > I've taken a look into this. The most obvious place to add the > > > timestamp for each packet would be ieee80211_tx_info (i.e. the > > > skb->cb[48]). The problem is it's very tight there. Even squeezing 2 > > > bytes (allowing up to 64ms of tx completion delay which I'm worried > > > won't be enough) will be troublesome. Some drivers already use every > > > last byte of their allowance on 64bit archs (e.g. ar5523 uses entire > > > 40 bytes of driver_data). > > > > Couldn't we just repurpose the existing skb->tstamp field for this, as > > long as the skb is fully contained within the wireless layer? > > > > Actually, it looks like we can't, since I guess timestamping options can > > be turned on on any socket. > > Actually, that creates a clone or a new skb? Hmm. Ah and then it puts it on the error queue right away, so I think we can reuse it. johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c index 367e896..a29111c 100644 --- a/drivers/net/wireless/ath/ath10k/core.c +++ b/drivers/net/wireless/ath/ath10k/core.c @@ -18,6 +18,7 @@ #include <linux/module.h> #include <linux/firmware.h> #include <linux/of.h> +#include <linux/average.h> #include "core.h" #include "mac.h" @@ -1423,6 +1424,7 @@ struct ath10k *ath10k_core_create(size_t priv_size, struct device *dev, init_dummy_netdev(&ar->napi_dev); ieee80211_napi_add(ar->hw, &ar->napi, &ar->napi_dev, ath10k_core_napi_dummy_poll, 64); + ewma_init(&ar->tx_delay_us, 16384, 8); ret = ath10k_debug_create(ar); if (ret) diff --git a/drivers/net/wireless/ath/ath10k/core.h b/drivers/net/wireless/ath/ath10k/core.h index 3be3a59..34f6d78 100644 --- a/drivers/net/wireless/ath/ath10k/core.h +++ b/drivers/net/wireless/ath/ath10k/core.h @@ -24,6 +24,7 @@ #include <linux/pci.h> #include <linux/uuid.h> #include <linux/time.h> +#include <linux/average.h> #include "htt.h" #include "htc.h" @@ -82,6 +83,7 @@ struct ath10k_skb_cb { dma_addr_t paddr; u8 eid; u8 vdev_id; + ktime_t stamp; struct { u8 tid; @@ -625,6 +627,7 @@ struct ath10k { struct net_device napi_dev; struct napi_struct napi; + struct ewma tx_delay_us; #ifdef CONFIG_ATH10K_DEBUGFS struct ath10k_debug debug; diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c index 15e47f4..5efb2a7 100644 --- a/drivers/net/wireless/ath/ath10k/mac.c +++ b/drivers/net/wireless/ath/ath10k/mac.c @@ -2620,6 +2620,7 @@ static void ath10k_tx(struct ieee80211_hw *hw, if (info->flags & IEEE80211_TX_CTL_NO_CCK_RATE) ath10k_dbg(ar, ATH10K_DBG_MAC, "IEEE80211_TX_CTL_NO_CCK_RATE\n"); + ATH10K_SKB_CB(skb)->stamp = ktime_get(); ATH10K_SKB_CB(skb)->htt.is_offchan = false; ATH10K_SKB_CB(skb)->htt.tid = ath10k_tx_h_get_tid(hdr); ATH10K_SKB_CB(skb)->vdev_id = ath10k_tx_h_get_vdev_id(ar, vif); diff --git a/drivers/net/wireless/ath/ath10k/txrx.c b/drivers/net/wireless/ath/ath10k/txrx.c index 3f00cec..0f5f0f2 100644 --- a/drivers/net/wireless/ath/ath10k/txrx.c +++ b/drivers/net/wireless/ath/ath10k/txrx.c @@ -15,6 +15,8 @@ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ +#include <net/sock.h> +#include <linux/average.h> #include "core.h" #include "txrx.h" #include "htt.h" @@ -82,6 +84,16 @@ void ath10k_txrx_tx_unref(struct ath10k_htt *htt, ath10k_report_offchan_tx(htt->ar, msdu); + if (msdu->sk) { + ewma_add(&ar->tx_delay_us, + ktime_to_ns(ktime_sub(ktime_get(), skb_cb->stamp)) / + NSEC_PER_USEC); + + ACCESS_ONCE(msdu->sk->sk_tx_completion_delay_cushion) = + (ewma_read(&ar->tx_delay_us) * + msdu->sk->sk_pacing_rate) >> 20; + } + info = IEEE80211_SKB_CB(msdu); memset(&info->status, 0, sizeof(info->status)); trace_ath10k_txrx_tx_unref(ar, tx_done->msdu_id); diff --git a/include/net/sock.h b/include/net/sock.h index 2210fec..6772543 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -391,6 +391,7 @@ struct sock { gfp_t sk_allocation; u32 sk_pacing_rate; /* bytes per second */ u32 sk_max_pacing_rate; + u32 sk_tx_completion_delay_cushion; netdev_features_t sk_route_caps; netdev_features_t sk_route_nocaps; int sk_gso_type; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 65caf8b..526a568 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1996,6 +1996,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle, max_segs = tcp_tso_autosize(sk, mss_now); while ((skb = tcp_send_head(sk))) { unsigned int limit; + unsigned int amount; tso_segs = tcp_init_tso_segs(sk, skb, mss_now);