diff mbox series

[v3,bpf-next,06/17] libbpf: xsk: use bpf_link

Message ID 20210322205816.65159-7-maciej.fijalkowski@intel.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series AF_XDP selftests improvements & bpf_link | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count fail Series longer than 15 patches
netdev/tree_selection success Clearly marked for bpf-next
netdev/subject_prefix success Link
netdev/cc_maintainers warning 10 maintainers not CCed: jonathan.lemon@gmail.com yhs@fb.com hawk@kernel.org kpsingh@kernel.org andrii@kernel.org kafai@fb.com bjorn@kernel.org songliubraving@fb.com davem@davemloft.net kuba@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch warning WARNING: line length of 81 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns WARNING: line length of 88 exceeds 80 columns WARNING: line length of 89 exceeds 80 columns
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link

Commit Message

Maciej Fijalkowski March 22, 2021, 8:58 p.m. UTC
Currently, if there are multiple xdpsock instances running on a single
interface and in case one of the instances is terminated, the rest of
them are left in an inoperable state due to the fact of unloaded XDP
prog from interface.

Consider the scenario below:

// load xdp prog and xskmap and add entry to xskmap at idx 10
$ sudo ./xdpsock -i ens801f0 -t -q 10

// add entry to xskmap at idx 11
$ sudo ./xdpsock -i ens801f0 -t -q 11

terminate one of the processes and another one is unable to work due to
the fact that the XDP prog was unloaded from interface.

To address that, step away from setting bpf prog in favour of bpf_link.
This means that refcounting of BPF resources will be done automatically
by bpf_link itself.

Provide backward compatibility by checking if underlying system is
bpf_link capable. Do this by looking up/creating bpf_link on loopback
device. If it failed in any way, stick with netlink-based XDP prog.
Otherwise, use bpf_link-based logic.

When setting up BPF resources during xsk socket creation, check whether
bpf_link for a given ifindex already exists via set of calls to
bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
and comparing the ifindexes from bpf_link and xsk socket.

For case where resources exist but they are not AF_XDP related, bail out
and ask user to remove existing prog and then retry.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 tools/lib/bpf/xsk.c | 175 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 156 insertions(+), 19 deletions(-)

Comments

Toke Høiland-Jørgensen March 22, 2021, 9:47 p.m. UTC | #1
Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:

> Currently, if there are multiple xdpsock instances running on a single
> interface and in case one of the instances is terminated, the rest of
> them are left in an inoperable state due to the fact of unloaded XDP
> prog from interface.
>
> Consider the scenario below:
>
> // load xdp prog and xskmap and add entry to xskmap at idx 10
> $ sudo ./xdpsock -i ens801f0 -t -q 10
>
> // add entry to xskmap at idx 11
> $ sudo ./xdpsock -i ens801f0 -t -q 11
>
> terminate one of the processes and another one is unable to work due to
> the fact that the XDP prog was unloaded from interface.
>
> To address that, step away from setting bpf prog in favour of bpf_link.
> This means that refcounting of BPF resources will be done automatically
> by bpf_link itself.
>
> Provide backward compatibility by checking if underlying system is
> bpf_link capable. Do this by looking up/creating bpf_link on loopback
> device. If it failed in any way, stick with netlink-based XDP prog.
> Otherwise, use bpf_link-based logic.

So how is the caller supposed to know which of the cases happened?
Presumably they need to do their own cleanup in that case? AFAICT you're
changing the code to always clobber the existing XDP program on detach
in the fallback case, which seems like a bit of an aggressive change? :)

-Toke
Maciej Fijalkowski March 24, 2021, 1:09 p.m. UTC | #2
On Mon, Mar 22, 2021 at 10:47:09PM +0100, Toke Høiland-Jørgensen wrote:
> Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
> 
> > Currently, if there are multiple xdpsock instances running on a single
> > interface and in case one of the instances is terminated, the rest of
> > them are left in an inoperable state due to the fact of unloaded XDP
> > prog from interface.
> >
> > Consider the scenario below:
> >
> > // load xdp prog and xskmap and add entry to xskmap at idx 10
> > $ sudo ./xdpsock -i ens801f0 -t -q 10
> >
> > // add entry to xskmap at idx 11
> > $ sudo ./xdpsock -i ens801f0 -t -q 11
> >
> > terminate one of the processes and another one is unable to work due to
> > the fact that the XDP prog was unloaded from interface.
> >
> > To address that, step away from setting bpf prog in favour of bpf_link.
> > This means that refcounting of BPF resources will be done automatically
> > by bpf_link itself.
> >
> > Provide backward compatibility by checking if underlying system is
> > bpf_link capable. Do this by looking up/creating bpf_link on loopback
> > device. If it failed in any way, stick with netlink-based XDP prog.
> > Otherwise, use bpf_link-based logic.
> 
> So how is the caller supposed to know which of the cases happened?
> Presumably they need to do their own cleanup in that case? AFAICT you're
> changing the code to always clobber the existing XDP program on detach
> in the fallback case, which seems like a bit of an aggressive change? :)

Sorry Toke, I was offline yesterday.
Yeah once again I went too far and we shouldn't do:

bpf_set_link_xdp_fd(xsk->ctx->ifindex, -1, 0);

if xsk_lookup_bpf_maps(xsk) returned non-zero value which implies that the
underlying prog is not AF_XDP related.

closing prog_fd (and link_fd under the condition that system is bpf_link
capable) is enough for that case.

If we agree on that and there's nothing else that I missed, I'll send a
v4.

Thanks for review!

> 
> -Toke
>
Toke Høiland-Jørgensen March 24, 2021, 11:38 p.m. UTC | #3
Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:

> On Mon, Mar 22, 2021 at 10:47:09PM +0100, Toke Høiland-Jørgensen wrote:
>> Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
>> 
>> > Currently, if there are multiple xdpsock instances running on a single
>> > interface and in case one of the instances is terminated, the rest of
>> > them are left in an inoperable state due to the fact of unloaded XDP
>> > prog from interface.
>> >
>> > Consider the scenario below:
>> >
>> > // load xdp prog and xskmap and add entry to xskmap at idx 10
>> > $ sudo ./xdpsock -i ens801f0 -t -q 10
>> >
>> > // add entry to xskmap at idx 11
>> > $ sudo ./xdpsock -i ens801f0 -t -q 11
>> >
>> > terminate one of the processes and another one is unable to work due to
>> > the fact that the XDP prog was unloaded from interface.
>> >
>> > To address that, step away from setting bpf prog in favour of bpf_link.
>> > This means that refcounting of BPF resources will be done automatically
>> > by bpf_link itself.
>> >
>> > Provide backward compatibility by checking if underlying system is
>> > bpf_link capable. Do this by looking up/creating bpf_link on loopback
>> > device. If it failed in any way, stick with netlink-based XDP prog.
>> > Otherwise, use bpf_link-based logic.
>> 
>> So how is the caller supposed to know which of the cases happened?
>> Presumably they need to do their own cleanup in that case? AFAICT you're
>> changing the code to always clobber the existing XDP program on detach
>> in the fallback case, which seems like a bit of an aggressive change? :)
>
> Sorry Toke, I was offline yesterday.
> Yeah once again I went too far and we shouldn't do:
>
> bpf_set_link_xdp_fd(xsk->ctx->ifindex, -1, 0);
>
> if xsk_lookup_bpf_maps(xsk) returned non-zero value which implies that the
> underlying prog is not AF_XDP related.
>
> closing prog_fd (and link_fd under the condition that system is bpf_link
> capable) is enough for that case.

I think the same thing goes for further down? With your patch, if the
code takes the else branch (after checking prog_id), and then ends up
going to err_set_bpf_maps, it'll now also do an unconditional
bpf_set_link_xdp_fd(), where before it was checking prog_id again and
only unloading if it previously loaded the program...

> If we agree on that and there's nothing else that I missed, I'll send
> a v4.

Apart from the above, sure!

-Toke
Maciej Fijalkowski March 26, 2021, 3:23 p.m. UTC | #4
On Thu, Mar 25, 2021 at 12:38:07AM +0100, Toke Høiland-Jørgensen wrote:
> Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
> 
> > On Mon, Mar 22, 2021 at 10:47:09PM +0100, Toke Høiland-Jørgensen wrote:
> >> Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
> >> 
> >> > Currently, if there are multiple xdpsock instances running on a single
> >> > interface and in case one of the instances is terminated, the rest of
> >> > them are left in an inoperable state due to the fact of unloaded XDP
> >> > prog from interface.
> >> >
> >> > Consider the scenario below:
> >> >
> >> > // load xdp prog and xskmap and add entry to xskmap at idx 10
> >> > $ sudo ./xdpsock -i ens801f0 -t -q 10
> >> >
> >> > // add entry to xskmap at idx 11
> >> > $ sudo ./xdpsock -i ens801f0 -t -q 11
> >> >
> >> > terminate one of the processes and another one is unable to work due to
> >> > the fact that the XDP prog was unloaded from interface.
> >> >
> >> > To address that, step away from setting bpf prog in favour of bpf_link.
> >> > This means that refcounting of BPF resources will be done automatically
> >> > by bpf_link itself.
> >> >
> >> > Provide backward compatibility by checking if underlying system is
> >> > bpf_link capable. Do this by looking up/creating bpf_link on loopback
> >> > device. If it failed in any way, stick with netlink-based XDP prog.
> >> > Otherwise, use bpf_link-based logic.
> >> 
> >> So how is the caller supposed to know which of the cases happened?
> >> Presumably they need to do their own cleanup in that case? AFAICT you're
> >> changing the code to always clobber the existing XDP program on detach
> >> in the fallback case, which seems like a bit of an aggressive change? :)
> >
> > Sorry Toke, I was offline yesterday.
> > Yeah once again I went too far and we shouldn't do:
> >
> > bpf_set_link_xdp_fd(xsk->ctx->ifindex, -1, 0);
> >
> > if xsk_lookup_bpf_maps(xsk) returned non-zero value which implies that the
> > underlying prog is not AF_XDP related.
> >
> > closing prog_fd (and link_fd under the condition that system is bpf_link
> > capable) is enough for that case.
> 
> I think the same thing goes for further down? With your patch, if the
> code takes the else branch (after checking prog_id), and then ends up
> going to err_set_bpf_maps, it'll now also do an unconditional
> bpf_set_link_xdp_fd(), where before it was checking prog_id again and
> only unloading if it previously loaded the program...

Hmm it's messy, I think we need a bit of refactoring here. Note that old
code was missing a close on ctx->xsks_map_fd if there was an error on
xsk_set_bpf_maps(xsk) and prog_id != 0 - given that
xsk_lookup_bpf_maps(xsk) succeeded, we therefore have a valid map fd that
we need to take care of on error path, for !prog_id case it was taken care
of within xsk_delete_bpf_maps(xsk).

So how about a diff below (on top of this patch), where we separate paths
based on prog_id value retrieved earlier? xsk_set_bpf_maps(xsk) is
repeated but this way I feel like it's more clear with cleanup/error
paths.

Wdyt?


diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 15812e4b93ca..c75067f0035f 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -790,71 +790,103 @@ static int xsk_create_xsk_struct(int ifindex, struct xsk_socket *xsk)
 	return 0;
 }
 
-static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp,
-				int *xsks_map_fd)
+static int xsk_init_xdp_res(struct xsk_socket *xsk,
+			    int *xsks_map_fd)
 {
-	struct xsk_socket *xsk = _xdp;
 	struct xsk_ctx *ctx = xsk->ctx;
-	__u32 prog_id = 0;
 	int err;
 
-	if (ctx->has_bpf_link)
-		err = xsk_link_lookup(ctx->ifindex, &prog_id, &ctx->link_fd);
-	else
-		err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags);
+	err = xsk_create_bpf_maps(xsk);
 	if (err)
 		return err;
 
-	if (!prog_id) {
-		err = xsk_create_bpf_maps(xsk);
-		if (err)
-			return err;
+	err = xsk_load_xdp_prog(xsk);
+	if (err)
+		goto err_load_xdp_prog;
 
-		err = xsk_load_xdp_prog(xsk);
-		if (err)
-			goto err_load_xdp_prog;
+	if (ctx->has_bpf_link)
+		err = xsk_create_bpf_link(xsk);
+	else
+		err = bpf_set_link_xdp_fd(xsk->ctx->ifindex, ctx->prog_fd,
+					  xsk->config.xdp_flags);
 
-		if (ctx->has_bpf_link)
-			err = xsk_create_bpf_link(xsk);
-		else
-			err = bpf_set_link_xdp_fd(xsk->ctx->ifindex, ctx->prog_fd,
-						  xsk->config.xdp_flags);
-		if (err)
-			goto err_attach_prog;
-	} else {
-		ctx->prog_fd = bpf_prog_get_fd_by_id(prog_id);
-		if (ctx->prog_fd < 0)
-			return -errno;
-		err = xsk_lookup_bpf_maps(xsk);
-		if (err) {
-			close(ctx->prog_fd);
-			if (ctx->has_bpf_link)
-				close(ctx->link_fd);
-			else
-				bpf_set_link_xdp_fd(xsk->ctx->ifindex, -1, 0);
-			return err;
-		}
-	}
+	if (err)
+		goto err_atach_xdp_prog;
 
-	if (xsk->rx) {
-		err = xsk_set_bpf_maps(xsk);
-		if (err)
-			goto err_set_bpf_maps;
-	}
-	if (xsks_map_fd)
-		*xsks_map_fd = ctx->xsks_map_fd;
+	if (!xsk->rx)
+		return err;
 
-	return 0;
+	err = xsk_set_bpf_maps(xsk);
+	if (err)
+		goto err_set_bpf_maps;
+
+	return err;
 
 err_set_bpf_maps:
 	if (ctx->has_bpf_link)
 		close(ctx->link_fd);
 	else
-		bpf_set_link_xdp_fd(xsk->ctx->ifindex, -1, 0);
-err_attach_prog:
+		bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
+err_atach_xdp_prog:
 	close(ctx->prog_fd);
 err_load_xdp_prog:
 	xsk_delete_bpf_maps(xsk);
+	return err;
+}
+
+static int xsk_lookup_xdp_res(struct xsk_socket *xsk, int *xsks_map_fd, int prog_id)
+{
+	struct xsk_ctx *ctx = xsk->ctx;
+	int err;
+
+	ctx->prog_fd = bpf_prog_get_fd_by_id(prog_id);
+	if (ctx->prog_fd < 0) {
+		err = -errno;
+		goto err_prog_fd;
+	}
+	err = xsk_lookup_bpf_maps(xsk);
+	if (err)
+		goto err_lookup_maps;
+
+	if (!xsk->rx)
+		return err;
+
+	err = xsk_set_bpf_maps(xsk);
+	if (err)
+		goto err_set_maps;
+
+	return err;
+
+err_set_maps:
+	close(ctx->xsks_map_fd);
+err_lookup_maps:
+	close(ctx->prog_fd);
+err_prog_fd:
+	if (ctx->has_bpf_link)
+		close(ctx->link_fd);
+	return err;
+}
+
+static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp, int *xsks_map_fd)
+{
+	struct xsk_socket *xsk = _xdp;
+	struct xsk_ctx *ctx = xsk->ctx;
+	__u32 prog_id = 0;
+	int err;
+
+	if (ctx->has_bpf_link)
+		err = xsk_link_lookup(ctx->ifindex, &prog_id, &ctx->link_fd);
+	else
+		err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags);
+
+	if (err)
+		return err;
+
+	err = !prog_id ? xsk_init_xdp_res(xsk, xsks_map_fd) :
+			 xsk_lookup_xdp_res(xsk, xsks_map_fd, prog_id);
+
+	if (!err && xsks_map_fd)
+		*xsks_map_fd = ctx->xsks_map_fd;
 
 	return err;
 }
> 
> > If we agree on that and there's nothing else that I missed, I'll send
> > a v4.
> 
> Apart from the above, sure!
> 
> -Toke
>
Toke Høiland-Jørgensen March 26, 2021, 6:52 p.m. UTC | #5
Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:

> On Thu, Mar 25, 2021 at 12:38:07AM +0100, Toke Høiland-Jørgensen wrote:
>> Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
>> 
>> > On Mon, Mar 22, 2021 at 10:47:09PM +0100, Toke Høiland-Jørgensen wrote:
>> >> Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:
>> >> 
>> >> > Currently, if there are multiple xdpsock instances running on a single
>> >> > interface and in case one of the instances is terminated, the rest of
>> >> > them are left in an inoperable state due to the fact of unloaded XDP
>> >> > prog from interface.
>> >> >
>> >> > Consider the scenario below:
>> >> >
>> >> > // load xdp prog and xskmap and add entry to xskmap at idx 10
>> >> > $ sudo ./xdpsock -i ens801f0 -t -q 10
>> >> >
>> >> > // add entry to xskmap at idx 11
>> >> > $ sudo ./xdpsock -i ens801f0 -t -q 11
>> >> >
>> >> > terminate one of the processes and another one is unable to work due to
>> >> > the fact that the XDP prog was unloaded from interface.
>> >> >
>> >> > To address that, step away from setting bpf prog in favour of bpf_link.
>> >> > This means that refcounting of BPF resources will be done automatically
>> >> > by bpf_link itself.
>> >> >
>> >> > Provide backward compatibility by checking if underlying system is
>> >> > bpf_link capable. Do this by looking up/creating bpf_link on loopback
>> >> > device. If it failed in any way, stick with netlink-based XDP prog.
>> >> > Otherwise, use bpf_link-based logic.
>> >> 
>> >> So how is the caller supposed to know which of the cases happened?
>> >> Presumably they need to do their own cleanup in that case? AFAICT you're
>> >> changing the code to always clobber the existing XDP program on detach
>> >> in the fallback case, which seems like a bit of an aggressive change? :)
>> >
>> > Sorry Toke, I was offline yesterday.
>> > Yeah once again I went too far and we shouldn't do:
>> >
>> > bpf_set_link_xdp_fd(xsk->ctx->ifindex, -1, 0);
>> >
>> > if xsk_lookup_bpf_maps(xsk) returned non-zero value which implies that the
>> > underlying prog is not AF_XDP related.
>> >
>> > closing prog_fd (and link_fd under the condition that system is bpf_link
>> > capable) is enough for that case.
>> 
>> I think the same thing goes for further down? With your patch, if the
>> code takes the else branch (after checking prog_id), and then ends up
>> going to err_set_bpf_maps, it'll now also do an unconditional
>> bpf_set_link_xdp_fd(), where before it was checking prog_id again and
>> only unloading if it previously loaded the program...
>
> Hmm it's messy, I think we need a bit of refactoring here. Note that old
> code was missing a close on ctx->xsks_map_fd if there was an error on
> xsk_set_bpf_maps(xsk) and prog_id != 0 - given that
> xsk_lookup_bpf_maps(xsk) succeeded, we therefore have a valid map fd that
> we need to take care of on error path, for !prog_id case it was taken care
> of within xsk_delete_bpf_maps(xsk).
>
> So how about a diff below (on top of this patch), where we separate paths
> based on prog_id value retrieved earlier? xsk_set_bpf_maps(xsk) is
> repeated but this way I feel like it's more clear with cleanup/error
> paths.
>
> Wdyt?

Yeah, that's much easier to follow! Nice :)

-Toke
diff mbox series

Patch

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 526fc35c0b23..15812e4b93ca 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -28,6 +28,7 @@ 
 #include <sys/mman.h>
 #include <sys/socket.h>
 #include <sys/types.h>
+#include <linux/if_link.h>
 
 #include "bpf.h"
 #include "libbpf.h"
@@ -70,8 +71,10 @@  struct xsk_ctx {
 	int ifindex;
 	struct list_head list;
 	int prog_fd;
+	int link_fd;
 	int xsks_map_fd;
 	char ifname[IFNAMSIZ];
+	bool has_bpf_link;
 };
 
 struct xsk_socket {
@@ -409,7 +412,7 @@  static int xsk_load_xdp_prog(struct xsk_socket *xsk)
 	static const int log_buf_size = 16 * 1024;
 	struct xsk_ctx *ctx = xsk->ctx;
 	char log_buf[log_buf_size];
-	int err, prog_fd;
+	int prog_fd;
 
 	/* This is the fallback C-program:
 	 * SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
@@ -499,14 +502,43 @@  static int xsk_load_xdp_prog(struct xsk_socket *xsk)
 		return prog_fd;
 	}
 
-	err = bpf_set_link_xdp_fd(xsk->ctx->ifindex, prog_fd,
-				  xsk->config.xdp_flags);
+	ctx->prog_fd = prog_fd;
+	return 0;
+}
+
+static int xsk_create_bpf_link(struct xsk_socket *xsk)
+{
+	/* bpf_link only accepts XDP_FLAGS_MODES, but xsk->config.xdp_flags
+	 * might have set XDP_FLAGS_UPDATE_IF_NOEXIST
+	 */
+	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
+			    .flags = (xsk->config.xdp_flags & XDP_FLAGS_MODES));
+	struct xsk_ctx *ctx = xsk->ctx;
+	__u32 prog_id = 0;
+	int link_fd;
+	int err;
+
+	err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags);
 	if (err) {
-		close(prog_fd);
+		pr_warn("getting XDP prog id failed\n");
 		return err;
 	}
 
-	ctx->prog_fd = prog_fd;
+	/* if there's a netlink-based XDP prog loaded on interface, bail out
+	 * and ask user to do the removal by himself
+	 */
+	if (prog_id) {
+		pr_warn("Netlink-based XDP prog detected, please unload it in order to launch AF_XDP prog\n");
+		return -EINVAL;
+	}
+
+	link_fd = bpf_link_create(ctx->prog_fd, ctx->ifindex, BPF_XDP, &opts);
+	if (link_fd < 0) {
+		pr_warn("bpf_link_create failed: %s\n", strerror(errno));
+		return link_fd;
+	}
+
+	ctx->link_fd = link_fd;
 	return 0;
 }
 
@@ -625,7 +657,6 @@  static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
 		close(fd);
 	}
 
-	err = 0;
 	if (ctx->xsks_map_fd == -1)
 		err = -ENOENT;
 
@@ -642,6 +673,97 @@  static int xsk_set_bpf_maps(struct xsk_socket *xsk)
 				   &xsk->fd, 0);
 }
 
+static int xsk_link_lookup(int ifindex, __u32 *prog_id, int *link_fd)
+{
+	struct bpf_link_info link_info;
+	__u32 link_len;
+	__u32 id = 0;
+	int err;
+	int fd;
+
+	while (true) {
+		err = bpf_link_get_next_id(id, &id);
+		if (err) {
+			if (errno == ENOENT) {
+				err = 0;
+				break;
+			}
+			pr_warn("can't get next link: %s\n", strerror(errno));
+			break;
+		}
+
+		fd = bpf_link_get_fd_by_id(id);
+		if (fd < 0) {
+			if (errno == ENOENT)
+				continue;
+			pr_warn("can't get link by id (%u): %s\n", id, strerror(errno));
+			err = -errno;
+			break;
+		}
+
+		link_len = sizeof(struct bpf_link_info);
+		memset(&link_info, 0, link_len);
+		err = bpf_obj_get_info_by_fd(fd, &link_info, &link_len);
+		if (err) {
+			pr_warn("can't get link info: %s\n", strerror(errno));
+			close(fd);
+			break;
+		}
+		if (link_info.type == BPF_LINK_TYPE_XDP) {
+			if (link_info.xdp.ifindex == ifindex) {
+				*link_fd = fd;
+				if (prog_id)
+					*prog_id = link_info.prog_id;
+				break;
+			}
+		}
+		close(fd);
+	}
+
+	return err;
+}
+
+static bool xsk_probe_bpf_link(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
+			    .flags = XDP_FLAGS_SKB_MODE);
+	struct bpf_load_program_attr prog_attr;
+	struct bpf_insn insns[2] = {
+		BPF_MOV64_IMM(BPF_REG_0, XDP_PASS),
+		BPF_EXIT_INSN()
+	};
+	int prog_fd, link_fd = -1;
+	int ifindex_lo = 1;
+	bool ret = false;
+	int err;
+
+	err = xsk_link_lookup(ifindex_lo, NULL, &link_fd);
+	if (err)
+		return ret;
+
+	if (link_fd >= 0)
+		return true;
+
+	memset(&prog_attr, 0, sizeof(prog_attr));
+	prog_attr.prog_type = BPF_PROG_TYPE_XDP;
+	prog_attr.insns = insns;
+	prog_attr.insns_cnt = ARRAY_SIZE(insns);
+	prog_attr.license = "GPL";
+
+	prog_fd = bpf_load_program_xattr(&prog_attr, NULL, 0);
+	if (prog_fd < 0)
+		return ret;
+
+	link_fd = bpf_link_create(prog_fd, ifindex_lo, BPF_XDP, &opts);
+	if (link_fd >= 0)
+		ret = true;
+
+	close(prog_fd);
+	close(link_fd);
+
+	return ret;
+}
+
 static int xsk_create_xsk_struct(int ifindex, struct xsk_socket *xsk)
 {
 	char ifname[IFNAMSIZ];
@@ -663,6 +785,7 @@  static int xsk_create_xsk_struct(int ifindex, struct xsk_socket *xsk)
 	ctx->ifname[IFNAMSIZ - 1] = 0;
 
 	xsk->ctx = ctx;
+	xsk->ctx->has_bpf_link = xsk_probe_bpf_link();
 
 	return 0;
 }
@@ -675,8 +798,10 @@  static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp,
 	__u32 prog_id = 0;
 	int err;
 
-	err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id,
-				  xsk->config.xdp_flags);
+	if (ctx->has_bpf_link)
+		err = xsk_link_lookup(ctx->ifindex, &prog_id, &ctx->link_fd);
+	else
+		err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags);
 	if (err)
 		return err;
 
@@ -686,9 +811,16 @@  static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp,
 			return err;
 
 		err = xsk_load_xdp_prog(xsk);
-		if (err) {
+		if (err)
 			goto err_load_xdp_prog;
-		}
+
+		if (ctx->has_bpf_link)
+			err = xsk_create_bpf_link(xsk);
+		else
+			err = bpf_set_link_xdp_fd(xsk->ctx->ifindex, ctx->prog_fd,
+						  xsk->config.xdp_flags);
+		if (err)
+			goto err_attach_prog;
 	} else {
 		ctx->prog_fd = bpf_prog_get_fd_by_id(prog_id);
 		if (ctx->prog_fd < 0)
@@ -696,20 +828,18 @@  static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp,
 		err = xsk_lookup_bpf_maps(xsk);
 		if (err) {
 			close(ctx->prog_fd);
+			if (ctx->has_bpf_link)
+				close(ctx->link_fd);
+			else
+				bpf_set_link_xdp_fd(xsk->ctx->ifindex, -1, 0);
 			return err;
 		}
 	}
 
 	if (xsk->rx) {
 		err = xsk_set_bpf_maps(xsk);
-		if (err) {
-			if (!prog_id) {
-				goto err_set_bpf_maps;
-			} else {
-				close(ctx->prog_fd);
-				return err;
-			}
-		}
+		if (err)
+			goto err_set_bpf_maps;
 	}
 	if (xsks_map_fd)
 		*xsks_map_fd = ctx->xsks_map_fd;
@@ -717,8 +847,12 @@  static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp,
 	return 0;
 
 err_set_bpf_maps:
+	if (ctx->has_bpf_link)
+		close(ctx->link_fd);
+	else
+		bpf_set_link_xdp_fd(xsk->ctx->ifindex, -1, 0);
+err_attach_prog:
 	close(ctx->prog_fd);
-	bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
 err_load_xdp_prog:
 	xsk_delete_bpf_maps(xsk);
 
@@ -898,6 +1032,7 @@  int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
 		}
 	}
 	xsk->ctx = ctx;
+	xsk->ctx->has_bpf_link = xsk_probe_bpf_link();
 
 	if (rx) {
 		err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
@@ -1054,6 +1189,8 @@  void xsk_socket__delete(struct xsk_socket *xsk)
 	if (ctx->prog_fd != -1) {
 		xsk_delete_bpf_maps(xsk);
 		close(ctx->prog_fd);
+		if (ctx->has_bpf_link)
+			close(ctx->link_fd);
 	}
 
 	err = xsk_get_mmap_offsets(xsk->fd, &off);