Message ID | 20220701094256.1970076-1-johan.almbladh@anyfinetworks.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Series | [bpf] xdp: Fix spurious packet loss in generic XDP TX path |
On Fri, Jul 1, 2022 at 11:43 AM Johan Almbladh <johan.almbladh@anyfinetworks.com> wrote:
>
> The byte queue limits (BQL) mechanism is intended to move queuing from
> the driver to the network stack in order to reduce latency caused by
> excessive queuing in hardware. However, when transmitting or redirecting
> a packet with XDP, the qdisc layer is bypassed and there are no
> additional queues. Since netif_xmit_stopped() also takes BQL limits into
> account, but without having any alternative queuing, packets are
> silently dropped.
>
> This patch modifies the drop condition to only consider cases when the
> driver itself cannot accept any more packets. This is analogous to the
> condition in __dev_direct_xmit(). Dropped packets are also counted on
> the device.

This means XDP packets are able to starve other packets going through a qdisc,
DDOS attacks will be more effective.

in-driver-XDP use dedicated TX queues, so they do not have this
starvation issue.

This should be mentioned somewhere I guess.

>
> Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
> ---
>  net/core/dev.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 8e6f22961206..41b5d7ac5ec5 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4875,10 +4875,12 @@ void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
>  	txq = netdev_core_pick_tx(dev, skb, NULL);
>  	cpu = smp_processor_id();
>  	HARD_TX_LOCK(dev, txq, cpu);
> -	if (!netif_xmit_stopped(txq)) {
> +	if (!netif_xmit_frozen_or_drv_stopped(txq)) {
>  		rc = netdev_start_xmit(skb, dev, txq, 0);
>  		if (dev_xmit_complete(rc))
>  			free_skb = false;
> +	} else {
> +		dev_core_stats_tx_dropped_inc(dev);
>  	}
>  	HARD_TX_UNLOCK(dev, txq);
>  	if (free_skb) {
> --
> 2.30.2
>
On 01/07/2022 11.57, Eric Dumazet wrote:
> On Fri, Jul 1, 2022 at 11:43 AM Johan Almbladh
> <johan.almbladh@anyfinetworks.com> wrote:
>>
>> The byte queue limits (BQL) mechanism is intended to move queuing from
>> the driver to the network stack in order to reduce latency caused by
>> excessive queuing in hardware. However, when transmitting or redirecting
>> a packet with XDP, the qdisc layer is bypassed and there are no
>> additional queues. Since netif_xmit_stopped() also takes BQL limits into
>> account, but without having any alternative queuing, packets are
>> silently dropped.
>>
>> This patch modifies the drop condition to only consider cases when the
>> driver itself cannot accept any more packets. This is analogous to the
>> condition in __dev_direct_xmit(). Dropped packets are also counted on
>> the device.
>
> This means XDP packets are able to starve other packets going through a qdisc,
> DDOS attacks will be more effective.
>
> in-driver-XDP use dedicated TX queues, so they do not have this
> starvation issue.

Good point. This happens in the XDP-generic path, because there XDP shares
the TX queue with the normal network stack.

>
> This should be mentioned somewhere I guess.

I want to mention that (even for in-driver-XDP) not having a queuing
mechanism for XDP redirect is a general problem (and a huge foot gun).
E.g. doing XDP-redirect between interfaces with different link rates
quickly results in issues. We have Toke + PhD student (Frey Cc)
working[1] on "XDQ" to address this generically. I urge them to look at
the code for the push-back mechanism that
netif_xmit_frozen_or_drv_stopped() and BQL provide, and to somehow
integrate XDQ with it...

--Jesper

[1] https://youtu.be/tthG9LP5GFk

>>
>> Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
>> ---
>>  net/core/dev.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 8e6f22961206..41b5d7ac5ec5 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -4875,10 +4875,12 @@ void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
>>  	txq = netdev_core_pick_tx(dev, skb, NULL);
>>  	cpu = smp_processor_id();
>>  	HARD_TX_LOCK(dev, txq, cpu);
>> -	if (!netif_xmit_stopped(txq)) {
>> +	if (!netif_xmit_frozen_or_drv_stopped(txq)) {
>>  		rc = netdev_start_xmit(skb, dev, txq, 0);
>>  		if (dev_xmit_complete(rc))
>>  			free_skb = false;
>> +	} else {
>> +		dev_core_stats_tx_dropped_inc(dev);
>>  	}
>>  	HARD_TX_UNLOCK(dev, txq);
>>  	if (free_skb) {
>> --
>> 2.30.2
>>
>
On 7/1/22 11:57 AM, Eric Dumazet wrote:
> On Fri, Jul 1, 2022 at 11:43 AM Johan Almbladh
> <johan.almbladh@anyfinetworks.com> wrote:
>>
>> The byte queue limits (BQL) mechanism is intended to move queuing from
>> the driver to the network stack in order to reduce latency caused by
>> excessive queuing in hardware. However, when transmitting or redirecting
>> a packet with XDP, the qdisc layer is bypassed and there are no
>> additional queues. Since netif_xmit_stopped() also takes BQL limits into
>> account, but without having any alternative queuing, packets are
>> silently dropped.
>>
>> This patch modifies the drop condition to only consider cases when the
>> driver itself cannot accept any more packets. This is analogous to the
>> condition in __dev_direct_xmit(). Dropped packets are also counted on
>> the device.
>
> This means XDP packets are able to starve other packets going through a qdisc,
> DDOS attacks will be more effective.
>
> in-driver-XDP use dedicated TX queues, so they do not have this
> starvation issue.
>
> This should be mentioned somewhere I guess.

+1, Johan, could you add this as comment and into commit description in a v2
of your fix? Definitely should be clarified that it's limited to generic XDP.

Thanks,
Daniel
On Fri, Jul 1, 2022 at 3:29 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 7/1/22 11:57 AM, Eric Dumazet wrote:
> > On Fri, Jul 1, 2022 at 11:43 AM Johan Almbladh
> > <johan.almbladh@anyfinetworks.com> wrote:
> >>
> >> The byte queue limits (BQL) mechanism is intended to move queuing from
> >> the driver to the network stack in order to reduce latency caused by
> >> excessive queuing in hardware. However, when transmitting or redirecting
> >> a packet with XDP, the qdisc layer is bypassed and there are no
> >> additional queues. Since netif_xmit_stopped() also takes BQL limits into
> >> account, but without having any alternative queuing, packets are
> >> silently dropped.
> >>
> >> This patch modifies the drop condition to only consider cases when the
> >> driver itself cannot accept any more packets. This is analogous to the
> >> condition in __dev_direct_xmit(). Dropped packets are also counted on
> >> the device.
> >
> > This means XDP packets are able to starve other packets going through a qdisc,
> > DDOS attacks will be more effective.
> >
> > in-driver-XDP use dedicated TX queues, so they do not have this
> > starvation issue.
> >
> > This should be mentioned somewhere I guess.
>
> +1, Johan, could you add this as comment and into commit description in a v2
> of your fix? Definitely should be clarified that it's limited to generic XDP.

Thanks for the review. Daniel, I will prepare a v2 shortly.

Thanks,
Johan
diff --git a/net/core/dev.c b/net/core/dev.c
index 8e6f22961206..41b5d7ac5ec5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4875,10 +4875,12 @@ void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
 	txq = netdev_core_pick_tx(dev, skb, NULL);
 	cpu = smp_processor_id();
 	HARD_TX_LOCK(dev, txq, cpu);
-	if (!netif_xmit_stopped(txq)) {
+	if (!netif_xmit_frozen_or_drv_stopped(txq)) {
 		rc = netdev_start_xmit(skb, dev, txq, 0);
 		if (dev_xmit_complete(rc))
 			free_skb = false;
+	} else {
+		dev_core_stats_tx_dropped_inc(dev);
 	}
 	HARD_TX_UNLOCK(dev, txq);
 	if (free_skb) {
The byte queue limits (BQL) mechanism is intended to move queuing from
the driver to the network stack in order to reduce latency caused by
excessive queuing in hardware. However, when transmitting or redirecting
a packet with XDP, the qdisc layer is bypassed and there are no
additional queues. Since netif_xmit_stopped() also takes BQL limits into
account, but without having any alternative queuing, packets are
silently dropped.

This patch modifies the drop condition to only consider cases when the
driver itself cannot accept any more packets. This is analogous to the
condition in __dev_direct_xmit(). Dropped packets are also counted on
the device.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
---
 net/core/dev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
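
For readers following the thread, here is a rough user-space sketch of the
difference between the two queue-state checks discussed in the commit
message. It is not kernel code: every model_* name and MODEL_* flag is
hypothetical, and the mapping of flags to checks (netif_xmit_stopped()
looking at both the driver and the BQL/stack stop bits,
netif_xmit_frozen_or_drv_stopped() looking at the driver stop bit and the
frozen bit) reflects my reading of include/linux/netdevice.h and should be
verified against the tree you are working on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's per-queue state bits. */
enum {
	MODEL_DRV_XOFF   = 1u << 0, /* driver stopped the queue, e.g. ring full */
	MODEL_STACK_XOFF = 1u << 1, /* stack stopped the queue, e.g. BQL limit hit */
	MODEL_FROZEN     = 1u << 2, /* queue temporarily frozen */
};

struct model_txq {
	unsigned int state;
};

struct model_dev {
	unsigned long tx_dropped;
};

/* Old condition (assumed semantics): driver stop OR BQL/stack stop. */
static bool model_xmit_stopped(const struct model_txq *txq)
{
	return (txq->state & (MODEL_DRV_XOFF | MODEL_STACK_XOFF)) != 0;
}

/* New condition (assumed semantics): driver stop OR frozen, BQL ignored. */
static bool model_xmit_frozen_or_drv_stopped(const struct model_txq *txq)
{
	return (txq->state & (MODEL_DRV_XOFF | MODEL_FROZEN)) != 0;
}

/*
 * Models the patched branch in generic_xdp_tx(): returns true when the
 * packet would be handed to the driver, otherwise counts a drop.
 */
static bool model_generic_xdp_tx(struct model_dev *dev,
				 const struct model_txq *txq)
{
	assert(dev != NULL && txq != NULL);

	if (!model_xmit_frozen_or_drv_stopped(txq))
		return true;

	dev->tx_dropped++;
	return false;
}
```

With only MODEL_STACK_XOFF set, model_xmit_stopped() reports the queue as
stopped while model_generic_xdp_tx() still transmits, which is the behaviour
change the patch is after. It is also why the starvation concern raised in
review applies: generic XDP traffic no longer backs off when BQL asks the
stack to.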