[net-next,3/4] bpf: devmap: implement devmap prog execution for generic XDP

Message ID 20210620233200.855534-4-memxor@gmail.com (mailing list archive)
State Superseded
Delegated to: BPF
Series Generic XDP improvements

Checks

Context Check Description
netdev/cover_letter success
netdev/fixes_present success
netdev/patch_count success
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success
netdev/cc_maintainers warning 4 maintainers not CCed: kpsingh@kernel.org yhs@fb.com songliubraving@fb.com hawk@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 1 this patch: 1
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success
netdev/checkpatch warning WARNING: line length of 83 exceeds 80 columns
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/header_inline success

Commit Message

Kumar Kartikeya Dwivedi June 20, 2021, 11:31 p.m. UTC
This lifts the restriction on running devmap BPF progs in generic
redirect mode. To match native XDP behavior, the devmap prog is invoked
right before generic_xdp_tx is called, and only the XDP_PASS/
XDP_ABORTED/XDP_DROP actions are supported.

We also return 0 even if the devmap program drops the packet, as the
redirect has already succeeded semantically and the devmap prog is the
last point before TX to the device at which a verdict can be delivered
on the packet.

This also means the new code must take care of freeing the skb, as
callers of xdp_do_generic_redirect only do that when an error is
returned.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/devmap.c | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)
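
For context, the kind of per-entry program this patch makes runnable in
generic mode looks roughly like the sketch below. All names and the
SEC() convention (which has varied across libbpf versions) are
illustrative, not taken from this series:

/* Minimal devmap-attached XDP program. Only XDP_PASS, XDP_ABORTED and
 * XDP_DROP are honoured on the generic path added by this patch.
 */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp_devmap/devmap_filter")
int devmap_filter(struct xdp_md *ctx)
{
	void *data     = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;

	/* Example policy: drop frames shorter than an Ethernet header
	 * (14 bytes, ETH_HLEN) right before generic_xdp_tx().
	 */
	if (data + 14 > data_end)
		return XDP_DROP;

	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";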

Comments

Toke Høiland-Jørgensen June 21, 2021, 3:50 p.m. UTC | #1
Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:

> This lifts the restriction on running devmap BPF progs in generic
> redirect mode. To match native XDP behavior, the devmap prog is invoked
> right before generic_xdp_tx is called, and only the XDP_PASS/
> XDP_ABORTED/XDP_DROP actions are supported.
>
> We also return 0 even if the devmap program drops the packet, as the
> redirect has already succeeded semantically and the devmap prog is the
> last point before TX to the device at which a verdict can be delivered
> on the packet.
>
> This also means the new code must take care of freeing the skb, as
> callers of xdp_do_generic_redirect only do that when an error is
> returned.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  kernel/bpf/devmap.c | 42 +++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 41 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
> index 2a75e6c2d27d..db3ed8b20c8c 100644
> --- a/kernel/bpf/devmap.c
> +++ b/kernel/bpf/devmap.c
> @@ -322,7 +322,8 @@ bool dev_map_can_have_prog(struct bpf_map *map)
>  {
>  	if ((map->map_type == BPF_MAP_TYPE_DEVMAP ||
>  	     map->map_type == BPF_MAP_TYPE_DEVMAP_HASH) &&
> -	    map->value_size != offsetofend(struct bpf_devmap_val, ifindex))
> +	    map->value_size != offsetofend(struct bpf_devmap_val, ifindex) &&
> +	    map->value_size != offsetofend(struct bpf_devmap_val, bpf_prog.fd))
>  		return true;

With this you've basically removed the need for the check that calls
this, so why not just get rid of it entirely? Same thing for cpumap:
instead of updating cpu_map_prog_allowed(), just get rid of it...

-Toke
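
For readers without the tree at hand, the check Toke refers to is the
loop in generic_xdp_install() (net/core/dev.c) that rejects attaching a
generic-mode XDP program whose maps may carry per-entry progs. A
from-memory sketch of that loop, not part of this patch and possibly
differing in detail from the exact tree:

	/* Generic XDP previously could not run DEVMAP/CPUMAP entry
	 * progs, so attachment was refused; this series lifts that
	 * restriction, which is why the check (and its helpers) can
	 * arguably go away.
	 */
	for (i = 0; i < new->aux->used_map_cnt; i++) {
		if (dev_map_can_have_prog(new->aux->used_maps[i]) ||
		    cpu_map_prog_allowed(new->aux->used_maps[i])) {
			mutex_unlock(&new->aux->used_maps_mutex);
			return -EINVAL;
		}
	}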

Patch

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 2a75e6c2d27d..db3ed8b20c8c 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -322,7 +322,8 @@ bool dev_map_can_have_prog(struct bpf_map *map)
 {
 	if ((map->map_type == BPF_MAP_TYPE_DEVMAP ||
 	     map->map_type == BPF_MAP_TYPE_DEVMAP_HASH) &&
-	    map->value_size != offsetofend(struct bpf_devmap_val, ifindex))
+	    map->value_size != offsetofend(struct bpf_devmap_val, ifindex) &&
+	    map->value_size != offsetofend(struct bpf_devmap_val, bpf_prog.fd))
 		return true;
 
 	return false;
@@ -499,6 +500,37 @@ static inline int __xdp_enqueue(struct net_device *dev, struct xdp_buff *xdp,
 	return 0;
 }
 
+static u32 dev_map_bpf_prog_run_skb(struct sk_buff *skb, struct bpf_prog *xdp_prog)
+{
+	struct xdp_txq_info txq = { .dev = skb->dev };
+	struct xdp_buff xdp;
+	u32 act;
+
+	if (!xdp_prog)
+		return XDP_PASS;
+
+	__skb_pull(skb, skb->mac_len);
+	xdp.txq = &txq;
+
+	act = bpf_prog_run_generic_xdp(skb, &xdp, xdp_prog);
+	switch (act) {
+	case XDP_PASS:
+		__skb_push(skb, skb->mac_len);
+		break;
+	default:
+		bpf_warn_invalid_xdp_action(act);
+		fallthrough;
+	case XDP_ABORTED:
+		trace_xdp_exception(skb->dev, xdp_prog, act);
+		fallthrough;
+	case XDP_DROP:
+		kfree_skb(skb);
+		break;
+	}
+
+	return act;
+}
+
 int dev_xdp_enqueue(struct net_device *dev, struct xdp_buff *xdp,
 		    struct net_device *dev_rx)
 {
@@ -615,6 +647,14 @@ int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb,
 	if (unlikely(err))
 		return err;
 	skb->dev = dst->dev;
+
+	/* Redirect has already succeeded semantically at this point, so we just
+	 * return 0 even if packet is dropped. Helper below takes care of
+	 * freeing skb.
+	 */
+	if (dev_map_bpf_prog_run_skb(skb, dst->xdp_prog) != XDP_PASS)
+		return 0;
+
 	generic_xdp_tx(skb, xdp_prog);
 
 	return 0;
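
For completeness, the value layout that the dev_map_can_have_prog()
check above keys off is the UAPI struct bpf_devmap_val: userspace opts
into a per-entry program by storing its fd next to the target ifindex.
A minimal usage sketch with libbpf; the function and variable names are
hypothetical:

#include <linux/bpf.h>
#include <bpf/bpf.h>

/* Install a devmap entry that redirects to ifindex and, with this
 * patch, also runs the program behind prog_fd in generic XDP mode.
 */
int add_devmap_entry(int devmap_fd, __u32 key, int ifindex, int prog_fd)
{
	struct bpf_devmap_val val = {
		.ifindex = ifindex,
		.bpf_prog.fd = prog_fd,
	};

	return bpf_map_update_elem(devmap_fd, &key, &val, BPF_ANY);
}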