Message ID:   20220701151200.2033129-1-johan.almbladh@anyfinetworks.com
State:        Superseded
Delegated to: BPF
Series:       [bpf,v2] xdp: Fix spurious packet loss in generic XDP TX path
On 7/1/22 5:12 PM, Johan Almbladh wrote:
> The byte queue limits (BQL) mechanism is intended to move queuing from
> the driver to the network stack in order to reduce latency caused by
> excessive queuing in hardware. However, when transmitting or redirecting
> a packet using generic XDP, the qdisc layer is bypassed and there are no
> additional queues. Since netif_xmit_stopped() also takes BQL limits into
> account, but without having any alternative queuing, packets are
> silently dropped.
>
> This patch modifies the drop condition to only consider cases when the
> driver itself cannot accept any more packets. This is analogous to the
> condition in __dev_direct_xmit(). Dropped packets are also counted on
> the device.
>
> Bypassing the qdisc layer in the generic XDP TX path means that XDP
> packets are able to starve other packets going through a qdisc, and
> DDOS attacks will be more effective. In-driver-XDP use dedicated TX
> queues, so they do not have this starvation issue.
>
> Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
> ---
>  net/core/dev.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 8e6f22961206..00fb9249357f 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4863,7 +4863,10 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
>  }
>
>  /* When doing generic XDP we have to bypass the qdisc layer and the
> - * network taps in order to match in-driver-XDP behavior.
> + * network taps in order to match in-driver-XDP behavior. This also means
> + * that XDP packets are able to starve other packets going through a qdisc,
> + * and DDOS attacks will be more effective. In-driver-XDP use dedicated TX
> + * queues, so they do not have this starvation issue.
>   */
>  void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
>  {
> @@ -4875,10 +4878,12 @@ void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
>          txq = netdev_core_pick_tx(dev, skb, NULL);
>          cpu = smp_processor_id();
>          HARD_TX_LOCK(dev, txq, cpu);
> -        if (!netif_xmit_stopped(txq)) {
> +        if (!netif_xmit_frozen_or_drv_stopped(txq)) {
>                  rc = netdev_start_xmit(skb, dev, txq, 0);
>                  if (dev_xmit_complete(rc))
>                          free_skb = false;
> +        } else {
> +                dev_core_stats_tx_dropped_inc(dev);
>          }
>          HARD_TX_UNLOCK(dev, txq);
>          if (free_skb) {

Small q: Shouldn't the drop counter go into the free_skb branch?

diff --git a/net/core/dev.c b/net/core/dev.c
index 00fb9249357f..17e2c39477c5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4882,11 +4882,10 @@ void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
                 rc = netdev_start_xmit(skb, dev, txq, 0);
                 if (dev_xmit_complete(rc))
                         free_skb = false;
-        } else {
-                dev_core_stats_tx_dropped_inc(dev);
         }
         HARD_TX_UNLOCK(dev, txq);
         if (free_skb) {
+                dev_core_stats_tx_dropped_inc(dev);
                 trace_xdp_exception(dev, xdp_prog, XDP_TX);
                 kfree_skb(skb);
         }
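For reference, this is roughly how generic_xdp_tx() would read with the v2 hunks plus the follow-up above applied, i.e. with the drop counter moved into the free_skb branch. The variable declarations are not part of the quoted hunks and are reconstructed here, so treat this as a sketch rather than the exact upstream function:

void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
{
        struct net_device *dev = skb->dev;
        struct netdev_queue *txq;
        bool free_skb = true;
        int cpu, rc;

        txq = netdev_core_pick_tx(dev, skb, NULL);
        cpu = smp_processor_id();
        HARD_TX_LOCK(dev, txq, cpu);
        if (!netif_xmit_frozen_or_drv_stopped(txq)) {
                /* Only the driver's own stop/freeze state blocks the send;
                 * the BQL (stack) stop bit is intentionally ignored here.
                 */
                rc = netdev_start_xmit(skb, dev, txq, 0);
                if (dev_xmit_complete(rc))
                        free_skb = false;
        }
        HARD_TX_UNLOCK(dev, txq);
        if (free_skb) {
                /* Single place where the skb is freed, so the drop is
                 * counted here exactly once.
                 */
                dev_core_stats_tx_dropped_inc(dev);
                trace_xdp_exception(dev, xdp_prog, XDP_TX);
                kfree_skb(skb);
        }
}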
On Sat, Jul 2, 2022 at 12:47 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 7/1/22 5:12 PM, Johan Almbladh wrote:
> > The byte queue limits (BQL) mechanism is intended to move queuing from
> > the driver to the network stack in order to reduce latency caused by
> > excessive queuing in hardware. However, when transmitting or redirecting
> > a packet using generic XDP, the qdisc layer is bypassed and there are no
> > additional queues. Since netif_xmit_stopped() also takes BQL limits into
> > account, but without having any alternative queuing, packets are
> > silently dropped.
> >
> > This patch modifies the drop condition to only consider cases when the
> > driver itself cannot accept any more packets. This is analogous to the
> > condition in __dev_direct_xmit(). Dropped packets are also counted on
> > the device.
> >
> > Bypassing the qdisc layer in the generic XDP TX path means that XDP
> > packets are able to starve other packets going through a qdisc, and
> > DDOS attacks will be more effective. In-driver-XDP use dedicated TX
> > queues, so they do not have this starvation issue.
> >
> > Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
> > ---
> >  net/core/dev.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 8e6f22961206..00fb9249357f 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -4863,7 +4863,10 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
> >  }
> >
> >  /* When doing generic XDP we have to bypass the qdisc layer and the
> > - * network taps in order to match in-driver-XDP behavior.
> > + * network taps in order to match in-driver-XDP behavior. This also means
> > + * that XDP packets are able to starve other packets going through a qdisc,
> > + * and DDOS attacks will be more effective. In-driver-XDP use dedicated TX
> > + * queues, so they do not have this starvation issue.
> >   */
> >  void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
> >  {
> > @@ -4875,10 +4878,12 @@ void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
> >          txq = netdev_core_pick_tx(dev, skb, NULL);
> >          cpu = smp_processor_id();
> >          HARD_TX_LOCK(dev, txq, cpu);
> > -        if (!netif_xmit_stopped(txq)) {
> > +        if (!netif_xmit_frozen_or_drv_stopped(txq)) {
> >                  rc = netdev_start_xmit(skb, dev, txq, 0);
> >                  if (dev_xmit_complete(rc))
> >                          free_skb = false;
> > +        } else {
> > +                dev_core_stats_tx_dropped_inc(dev);
> >          }
> >          HARD_TX_UNLOCK(dev, txq);
> >          if (free_skb) {
>
> Small q: Shouldn't the drop counter go into the free_skb branch?

This was on purpose, to avoid incrementing the counter twice, but I think
you are right. The driver updates the tx_dropped counter if the packet is
dropped, but I see that it also consumes the skb in those cases. Looking
again at the driver tree, I cannot find any example where a driver updates
the counter *without* consuming the skb. This logic makes sense: whoever
consumes the skb is also responsible for updating the counters on the
netdev.
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 00fb9249357f..17e2c39477c5 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4882,11 +4882,10 @@ void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
>                  rc = netdev_start_xmit(skb, dev, txq, 0);
>                  if (dev_xmit_complete(rc))
>                          free_skb = false;
> -        } else {
> -                dev_core_stats_tx_dropped_inc(dev);
>          }
>          HARD_TX_UNLOCK(dev, txq);
>          if (free_skb) {
> +                dev_core_stats_tx_dropped_inc(dev);
>                  trace_xdp_exception(dev, xdp_prog, XDP_TX);
>                  kfree_skb(skb);
>          }
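To illustrate the convention described in the reply above: a driver's ndo_start_xmit callback that decides to drop a packet typically bumps the device's tx_dropped counter and frees the skb in the same place, then still returns NETDEV_TX_OK so the stack does not account the drop a second time. The sketch below is purely hypothetical (the foo_* names do not exist in the tree) and only shows the pattern:

/* Hypothetical driver transmit routine, for illustration only. */
static netdev_tx_t foo_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
        if (foo_tx_ring_full(dev)) {
                dev->stats.tx_dropped++;   /* counter updated here...             */
                dev_kfree_skb_any(skb);    /* ...by the code that consumes the skb */
                return NETDEV_TX_OK;       /* "complete" from the stack's view     */
        }

        foo_queue_to_hw(dev, skb);
        return NETDEV_TX_OK;
}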
diff --git a/net/core/dev.c b/net/core/dev.c
index 8e6f22961206..00fb9249357f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4863,7 +4863,10 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
 }
 
 /* When doing generic XDP we have to bypass the qdisc layer and the
- * network taps in order to match in-driver-XDP behavior.
+ * network taps in order to match in-driver-XDP behavior. This also means
+ * that XDP packets are able to starve other packets going through a qdisc,
+ * and DDOS attacks will be more effective. In-driver-XDP use dedicated TX
+ * queues, so they do not have this starvation issue.
  */
 void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
 {
@@ -4875,10 +4878,12 @@ void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog)
         txq = netdev_core_pick_tx(dev, skb, NULL);
         cpu = smp_processor_id();
         HARD_TX_LOCK(dev, txq, cpu);
-        if (!netif_xmit_stopped(txq)) {
+        if (!netif_xmit_frozen_or_drv_stopped(txq)) {
                 rc = netdev_start_xmit(skb, dev, txq, 0);
                 if (dev_xmit_complete(rc))
                         free_skb = false;
+        } else {
+                dev_core_stats_tx_dropped_inc(dev);
         }
         HARD_TX_UNLOCK(dev, txq);
         if (free_skb) {
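The behavioural change comes down to which queue state bits the two helpers test. The following is a simplified paraphrase of the helpers in include/linux/netdevice.h (not a verbatim copy; the upstream versions use composite QUEUE_STATE_* masks): netif_xmit_stopped() also reacts to the stack/BQL stop bit, while netif_xmit_frozen_or_drv_stopped() only reacts to the driver's own stop bit and the frozen bit.

/* Simplified paraphrase of the netdev_queue state helpers:
 *   QUEUE_STATE_DRV_XOFF   - driver stopped the queue (netif_tx_stop_queue())
 *   QUEUE_STATE_STACK_XOFF - stack stopped the queue, set by BQL
 *   QUEUE_STATE_FROZEN     - queue temporarily frozen by the stack
 */
static inline bool netif_xmit_stopped(const struct netdev_queue *dev_queue)
{
        /* Also trips on the BQL bit, which is what caused the silent drops. */
        return dev_queue->state & (QUEUE_STATE_DRV_XOFF | QUEUE_STATE_STACK_XOFF);
}

static inline bool netif_xmit_frozen_or_drv_stopped(const struct netdev_queue *dev_queue)
{
        /* Ignores the BQL bit: only the driver's own back-pressure counts. */
        return dev_queue->state & (QUEUE_STATE_DRV_XOFF | QUEUE_STATE_FROZEN);
}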
The byte queue limits (BQL) mechanism is intended to move queuing from
the driver to the network stack in order to reduce latency caused by
excessive queuing in hardware. However, when transmitting or redirecting
a packet using generic XDP, the qdisc layer is bypassed and there are no
additional queues. Since netif_xmit_stopped() also takes BQL limits into
account, but without having any alternative queuing, packets are
silently dropped.

This patch modifies the drop condition to only consider cases when the
driver itself cannot accept any more packets. This is analogous to the
condition in __dev_direct_xmit(). Dropped packets are also counted on
the device.

Bypassing the qdisc layer in the generic XDP TX path means that XDP
packets are able to starve other packets going through a qdisc, and
DDOS attacks will be more effective. In-driver-XDP use dedicated TX
queues, so they do not have this starvation issue.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
---
 net/core/dev.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)
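The commit message points to __dev_direct_xmit() for the analogous condition. The sketch below paraphrases the transmit step of that path from net/core/dev.c (simplified from memory, not a verbatim excerpt, and wrapped in a hypothetical helper name): it likewise bypasses the qdisc and only backs off when the driver itself has stopped or frozen the queue, since there is no other queue to absorb BQL back-pressure.

/* Simplified sketch of the transmit step in __dev_direct_xmit(); setup,
 * validation and busy handling are omitted.
 */
static int direct_xmit_sketch(struct sk_buff *skb, struct net_device *dev,
                              struct netdev_queue *txq)
{
        int ret = NETDEV_TX_BUSY;

        HARD_TX_LOCK(dev, txq, smp_processor_id());
        /* Same condition as the patched generic_xdp_tx(): only the driver's
         * own stop/freeze state matters, the BQL stack-stop bit does not.
         */
        if (!netif_xmit_frozen_or_drv_stopped(txq))
                ret = netdev_start_xmit(skb, dev, txq, false);
        HARD_TX_UNLOCK(dev, txq);

        return ret;
}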