diff mbox series

[bpf-next,6/8] libbpf: wire up BPF token support at BPF object level

Message ID 20231207185443.2297160-7-andrii@kernel.org (mailing list archive)
State New, archived
Headers show
Series BPF token support in libbpf's BPF object | expand

Commit Message

Andrii Nakryiko Dec. 7, 2023, 6:54 p.m. UTC
Add BPF token support to BPF object-level functionality.

BPF token is supported by BPF object logic either as an explicitly
provided BPF token from outside (through BPF FS path or explicit BPF
token FD), or implicitly (unless prevented through
bpf_object_open_opts).

Implicit mode is assumed to be the most common one for user namespaced
unprivileged workloads. The assumption is that privileged container
manager sets up default BPF FS mount point at /sys/fs/bpf with BPF token
delegation options (delegate_{cmds,maps,progs,attachs} mount options).
BPF object during loading will attempt to create BPF token from
/sys/fs/bpf location, and pass it for all relevant operations
(currently, map creation, BTF load, and program load).

In this implicit mode, if BPF token creation fails due to whatever
reason (BPF FS is not mounted, or kernel doesn't support BPF token,
etc), this is not considered an error. BPF object loading sequence will
proceed with no BPF token.

In explicit BPF token mode, user provides explicitly either custom BPF
FS mount point path or creates BPF token on their own and just passes
token FD directly. In such case, BPF object will either dup() token FD
(to not require caller to hold onto it for entire duration of BPF object
lifetime) or will attempt to create BPF token from provided BPF FS
location. If BPF token creation fails, that is considered a critical
error and BPF object load fails with an error.

Libbpf provides a way to disable implicit BPF token creation, if it
causes any troubles (BPF token is designed to be completely optional and
shouldn't cause any problems even if provided, but in the world of BPF
LSM, custom security logic can be installed that might change outcome
dependin on the presence of BPF token). To disable libbpf's default BPF
token creation behavior user should provide either invalid BPF token FD
(negative), or empty bpf_token_path option.

BPF token presence can influence libbpf's feature probing, so if BPF
object has associated BPF token, feature probing is instructed to use
BPF object-specific feature detection cache and token FD.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/btf.c             |   7 +-
 tools/lib/bpf/libbpf.c          | 120 ++++++++++++++++++++++++++++++--
 tools/lib/bpf/libbpf.h          |  28 +++++++-
 tools/lib/bpf/libbpf_internal.h |  17 ++++-
 4 files changed, 160 insertions(+), 12 deletions(-)

Comments

John Fastabend Dec. 11, 2023, 10:56 p.m. UTC | #1
Andrii Nakryiko wrote:
> Add BPF token support to BPF object-level functionality.
> 
> BPF token is supported by BPF object logic either as an explicitly
> provided BPF token from outside (through BPF FS path or explicit BPF
> token FD), or implicitly (unless prevented through
> bpf_object_open_opts).
> 
> Implicit mode is assumed to be the most common one for user namespaced
> unprivileged workloads. The assumption is that privileged container
> manager sets up default BPF FS mount point at /sys/fs/bpf with BPF token
> delegation options (delegate_{cmds,maps,progs,attachs} mount options).
> BPF object during loading will attempt to create BPF token from
> /sys/fs/bpf location, and pass it for all relevant operations
> (currently, map creation, BTF load, and program load).
> 
> In this implicit mode, if BPF token creation fails due to whatever
> reason (BPF FS is not mounted, or kernel doesn't support BPF token,
> etc), this is not considered an error. BPF object loading sequence will
> proceed with no BPF token.
> 
> In explicit BPF token mode, user provides explicitly either custom BPF
> FS mount point path or creates BPF token on their own and just passes
> token FD directly. In such case, BPF object will either dup() token FD
> (to not require caller to hold onto it for entire duration of BPF object
> lifetime) or will attempt to create BPF token from provided BPF FS
> location. If BPF token creation fails, that is considered a critical
> error and BPF object load fails with an error.
> 
> Libbpf provides a way to disable implicit BPF token creation, if it
> causes any troubles (BPF token is designed to be completely optional and
> shouldn't cause any problems even if provided, but in the world of BPF
> LSM, custom security logic can be installed that might change outcome
> dependin on the presence of BPF token). To disable libbpf's default BPF
> token creation behavior user should provide either invalid BPF token FD
> (negative), or empty bpf_token_path option.
> 
> BPF token presence can influence libbpf's feature probing, so if BPF
> object has associated BPF token, feature probing is instructed to use
> BPF object-specific feature detection cache and token FD.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
>  tools/lib/bpf/btf.c             |   7 +-
>  tools/lib/bpf/libbpf.c          | 120 ++++++++++++++++++++++++++++++--
>  tools/lib/bpf/libbpf.h          |  28 +++++++-
>  tools/lib/bpf/libbpf_internal.h |  17 ++++-
>  4 files changed, 160 insertions(+), 12 deletions(-)
> 

...

>  
> +static int bpf_object_prepare_token(struct bpf_object *obj)
> +{
> +	const char *bpffs_path;
> +	int bpffs_fd = -1, token_fd, err;
> +	bool mandatory;
> +	enum libbpf_print_level level = LIBBPF_DEBUG;

redundant set on level?

> +
> +	/* token is already set up */
> +	if (obj->token_fd > 0)
> +		return 0;
> +	/* token is explicitly prevented */
> +	if (obj->token_fd < 0) {
> +		pr_debug("object '%s': token is prevented, skipping...\n", obj->name);
> +		/* reset to zero to avoid extra checks during map_create and prog_load steps */
> +		obj->token_fd = 0;
> +		return 0;
> +	}
> +
> +	mandatory = obj->token_path != NULL;
> +	level = mandatory ? LIBBPF_WARN : LIBBPF_DEBUG;
> +
> +	bpffs_path = obj->token_path ?: BPF_FS_DEFAULT_PATH;
> +	bpffs_fd = open(bpffs_path, O_DIRECTORY, O_RDWR);
> +	if (bpffs_fd < 0) {
> +		err = -errno;
> +		__pr(level, "object '%s': failed (%d) to open BPF FS mount at '%s'%s\n",
> +		     obj->name, err, bpffs_path,
> +		     mandatory ? "" : ", skipping optional step...");
> +		return mandatory ? err : 0;
> +	}
> +
> +	token_fd = bpf_token_create(bpffs_fd, 0);

Did this get tested on older kernels? In that case TOKEN_CREATE will
fail with -EINVAL.

> +	close(bpffs_fd);
> +	if (token_fd < 0) {
> +		if (!mandatory && token_fd == -ENOENT) {
> +			pr_debug("object '%s': BPF FS at '%s' doesn't have BPF token delegation set up, skipping...\n",
> +				 obj->name, bpffs_path);
> +			return 0;
> +		}

Isn't there a case here we should give a warning about?  If BPF_TOKEN_CREATE
exists and !mandatory, but default BPFFS failed for enomem, or eperm reasons?
If the user reall/y doesn't want tokens here they should maybe override with
-1 token? My thought is if you have delegations set up then something on the
system is trying to configure this and an error might be ok? I'm asking just
because I paused on it for a bit not sure either way at the moment. I might
imagine a lazy program not specifying the default bpffs, but also really
thinking its going to get a valid token.


> +		__pr(level, "object '%s': failed (%d) to create BPF token from '%s'%s\n",
> +		     obj->name, token_fd, bpffs_path,
> +		     mandatory ? "" : ", skipping optional step...");
> +		return mandatory ? token_fd : 0;
> +	}
> +
> +	obj->feat_cache = calloc(1, sizeof(*obj->feat_cache));
> +	if (!obj->feat_cache) {
> +		close(token_fd);
> +		return -ENOMEM;
> +	}
> +
> +	obj->token_fd = token_fd;
> +	obj->feat_cache->token_fd = token_fd;
> +
> +	return 0;
> +}
> +
>  static int
>  bpf_object__probe_loading(struct bpf_object *obj)
>  {
> @@ -4601,6 +4664,7 @@ bpf_object__probe_loading(struct bpf_object *obj)
>  		BPF_EXIT_INSN(),
>  	};
>  	int ret, insn_cnt = ARRAY_SIZE(insns);
> +	LIBBPF_OPTS(bpf_prog_load_opts, opts, .token_fd = obj->token_fd);
>  
>  	if (obj->gen_loader)
>  		return 0;
> @@ -4610,9 +4674,9 @@ bpf_object__probe_loading(struct bpf_object *obj)
>  		pr_warn("Failed to bump RLIMIT_MEMLOCK (err = %d), you might need to do it explicitly!\n", ret);
>  
>  	/* make sure basic loading works */
> -	ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, NULL);
> +	ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, &opts);
>  	if (ret < 0)
> -		ret = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, NULL);
> +		ret = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, &opts);
>  	if (ret < 0) {
>  		ret = errno;
>  		cp = libbpf_strerror_r(ret, errmsg, sizeof(errmsg));
> @@ -4635,6 +4699,9 @@ bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id)
>  		 */
>  		return true;
>  
> +	if (obj->token_fd)
> +		return feat_supported(obj->feat_cache, feat_id);

OK that answers feat_supported() non null from earlier patch. Just
was reading in order.

> +
>  	return feat_supported(NULL, feat_id);
>  }

...

>  	btf_fd = bpf_object__btf_fd(obj);
> @@ -7050,10 +7119,10 @@ static int bpf_object_init_progs(struct bpf_object *obj, const struct bpf_object
>  static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf, size_t obj_buf_sz,
>  					  const struct bpf_object_open_opts *opts)
>  {
> -	const char *obj_name, *kconfig, *btf_tmp_path;
> +	const char *obj_name, *kconfig, *btf_tmp_path, *token_path;
>  	struct bpf_object *obj;
>  	char tmp_name[64];
> -	int err;
> +	int err, token_fd;
>  	char *log_buf;
>  	size_t log_size;
>  	__u32 log_level;
> @@ -7087,6 +7156,20 @@ static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf,
>  	if (log_size && !log_buf)
>  		return ERR_PTR(-EINVAL);
>  
> +	token_path = OPTS_GET(opts, bpf_token_path, NULL);
> +	token_fd = OPTS_GET(opts, bpf_token_fd, -1);
> +	/* non-empty token path can't be combined with invalid token FD */
> +	if (token_path && token_path[0] != '\0' && token_fd < 0)
> +		return ERR_PTR(-EINVAL);
> +	if (token_path && token_path[0] == '\0') {
> +		/* empty token path can't be combined with valid token FD */
> +		if (token_fd > 0)
> +			return ERR_PTR(-EINVAL);
> +		/* empty token_path is equivalent to invalid token_fd */
> +		token_path = NULL;
> +		token_fd = -1;
> +	}
> +
>  	obj = bpf_object__new(path, obj_buf, obj_buf_sz, obj_name);
>  	if (IS_ERR(obj))
>  		return obj;
> @@ -7095,6 +7178,23 @@ static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf,
>  	obj->log_size = log_size;
>  	obj->log_level = log_level;
>  
> +	obj->token_fd = token_fd <= 0 ? token_fd : dup_good_fd(token_fd);
> +	if (token_fd > 0 && obj->token_fd < 0) {
> +		err = -errno;
> +		goto out;
> +	}
> +	if (token_path) {
> +		if (strlen(token_path) >= PATH_MAX) {

small nit, might be cleaner to just have this up where the other sanity
checks are done? e.g.

   `token_path[0] !=` `\0` && token_path(token_path) < PATH_MAX`

just to abort earlier. But not sure I care much.

> +			err = -ENAMETOOLONG;
> +			goto out;
> +		}
> +		obj->token_path = strdup(token_path);
> +		if (!obj->token_path) {
> +			err = -ENOMEM;
> +			goto out;
> +		}
> +	}
> +
>  	btf_tmp_path = OPTS_GET(opts, btf_custom_path, NULL);
>  	if (btf_tmp_path) {
>  		if (strlen(btf_tmp_path) >= PATH_MAX) {
> @@ -7605,7 +7705,8 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch
>  	if (obj->gen_loader)
>  		bpf_gen__init(obj->gen_loader, extra_log_level, obj->nr_programs, obj->nr_maps);
>  
> -	err = bpf_object__probe_loading(obj);
> +	err = bpf_object_prepare_token(obj);
> +	err = err ? : bpf_object__probe_loading(obj);
>  	err = err ? : bpf_object__load_vmlinux_btf(obj, false);
>  	err = err ? : bpf_object__resolve_externs(obj, obj->kconfig);
>  	err = err ? : bpf_object__sanitize_and_load_btf(obj);
> @@ -8142,6 +8243,11 @@ void bpf_object__close(struct bpf_object *obj)
>  	}
>  	zfree(&obj->programs);
>  
> +	zfree(&obj->feat_cache);
> +	zfree(&obj->token_path);
> +	if (obj->token_fd > 0)
> +		close(obj->token_fd);
> +
>  	free(obj);
>  }
>  
> diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> index 6cd9c501624f..d3de39b537f3 100644
> --- a/tools/lib/bpf/libbpf.h
> +++ b/tools/lib/bpf/libbpf.h
> @@ -177,10 +177,36 @@ struct bpf_object_open_opts {
>  	 * logs through its print callback.
>  	 */
>  	__u32 kernel_log_level;
> +	/* FD of a BPF token instantiated by user through bpf_token_create()
> +	 * API. BPF object will keep dup()'ed FD internally, so passed token
> +	 * FD can be closed after BPF object/skeleton open step.
> +	 *
> +	 * Setting bpf_token_fd to negative value disables libbpf's automatic
> +	 * attempt to create BPF token from default BPF FS mount point
> +	 * (/sys/fs/bpf), in case this default behavior is undesirable.
> +	 *
> +	 * bpf_token_path and bpf_token_fd are mutually exclusive and only one
> +	 * of those options should be set.
> +	 */
> +	int bpf_token_fd;
> +	/* Path to BPF FS mount point to derive BPF token from.
> +	 *
> +	 * Created BPF token will be used for all bpf() syscall operations
> +	 * that accept BPF token (e.g., map creation, BTF and program loads,
> +	 * etc) automatically within instantiated BPF object.
> +	 *
> +	 * Setting bpf_token_path option to empty string disables libbpf's
> +	 * automatic attempt to create BPF token from default BPF FS mount
> +	 * point (/sys/fs/bpf), in case this default behavior is undesirable.
> +	 *
> +	 * bpf_token_path and bpf_token_fd are mutually exclusive and only one
> +	 * of those options should be set.
> +	 */
> +	const char *bpf_token_path;
>
Andrii Nakryiko Dec. 12, 2023, 12:05 a.m. UTC | #2
On Mon, Dec 11, 2023 at 2:56 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Andrii Nakryiko wrote:
> > Add BPF token support to BPF object-level functionality.
> >
> > BPF token is supported by BPF object logic either as an explicitly
> > provided BPF token from outside (through BPF FS path or explicit BPF
> > token FD), or implicitly (unless prevented through
> > bpf_object_open_opts).
> >
> > Implicit mode is assumed to be the most common one for user namespaced
> > unprivileged workloads. The assumption is that privileged container
> > manager sets up default BPF FS mount point at /sys/fs/bpf with BPF token
> > delegation options (delegate_{cmds,maps,progs,attachs} mount options).
> > BPF object during loading will attempt to create BPF token from
> > /sys/fs/bpf location, and pass it for all relevant operations
> > (currently, map creation, BTF load, and program load).
> >
> > In this implicit mode, if BPF token creation fails due to whatever
> > reason (BPF FS is not mounted, or kernel doesn't support BPF token,
> > etc), this is not considered an error. BPF object loading sequence will
> > proceed with no BPF token.
> >
> > In explicit BPF token mode, user provides explicitly either custom BPF
> > FS mount point path or creates BPF token on their own and just passes
> > token FD directly. In such case, BPF object will either dup() token FD
> > (to not require caller to hold onto it for entire duration of BPF object
> > lifetime) or will attempt to create BPF token from provided BPF FS
> > location. If BPF token creation fails, that is considered a critical
> > error and BPF object load fails with an error.
> >
> > Libbpf provides a way to disable implicit BPF token creation, if it
> > causes any troubles (BPF token is designed to be completely optional and
> > shouldn't cause any problems even if provided, but in the world of BPF
> > LSM, custom security logic can be installed that might change outcome
> > dependin on the presence of BPF token). To disable libbpf's default BPF
> > token creation behavior user should provide either invalid BPF token FD
> > (negative), or empty bpf_token_path option.
> >
> > BPF token presence can influence libbpf's feature probing, so if BPF
> > object has associated BPF token, feature probing is instructed to use
> > BPF object-specific feature detection cache and token FD.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> >  tools/lib/bpf/btf.c             |   7 +-
> >  tools/lib/bpf/libbpf.c          | 120 ++++++++++++++++++++++++++++++--
> >  tools/lib/bpf/libbpf.h          |  28 +++++++-
> >  tools/lib/bpf/libbpf_internal.h |  17 ++++-
> >  4 files changed, 160 insertions(+), 12 deletions(-)
> >
>
> ...
>
> >
> > +static int bpf_object_prepare_token(struct bpf_object *obj)
> > +{
> > +     const char *bpffs_path;
> > +     int bpffs_fd = -1, token_fd, err;
> > +     bool mandatory;
> > +     enum libbpf_print_level level = LIBBPF_DEBUG;
>
> redundant set on level?
>

yep, removed initialization

> > +
> > +     /* token is already set up */
> > +     if (obj->token_fd > 0)
> > +             return 0;
> > +     /* token is explicitly prevented */
> > +     if (obj->token_fd < 0) {
> > +             pr_debug("object '%s': token is prevented, skipping...\n", obj->name);
> > +             /* reset to zero to avoid extra checks during map_create and prog_load steps */
> > +             obj->token_fd = 0;
> > +             return 0;
> > +     }
> > +
> > +     mandatory = obj->token_path != NULL;
> > +     level = mandatory ? LIBBPF_WARN : LIBBPF_DEBUG;
> > +
> > +     bpffs_path = obj->token_path ?: BPF_FS_DEFAULT_PATH;
> > +     bpffs_fd = open(bpffs_path, O_DIRECTORY, O_RDWR);
> > +     if (bpffs_fd < 0) {
> > +             err = -errno;
> > +             __pr(level, "object '%s': failed (%d) to open BPF FS mount at '%s'%s\n",
> > +                  obj->name, err, bpffs_path,
> > +                  mandatory ? "" : ", skipping optional step...");
> > +             return mandatory ? err : 0;
> > +     }
> > +
> > +     token_fd = bpf_token_create(bpffs_fd, 0);
>
> Did this get tested on older kernels? In that case TOKEN_CREATE will
> fail with -EINVAL.

yep, I did actually test, it will generate expected *debug*-level
"failed to create BPF token" message

>
> > +     close(bpffs_fd);
> > +     if (token_fd < 0) {
> > +             if (!mandatory && token_fd == -ENOENT) {
> > +                     pr_debug("object '%s': BPF FS at '%s' doesn't have BPF token delegation set up, skipping...\n",
> > +                              obj->name, bpffs_path);
> > +                     return 0;
> > +             }
>
> Isn't there a case here we should give a warning about?  If BPF_TOKEN_CREATE
> exists and !mandatory, but default BPFFS failed for enomem, or eperm reasons?
> If the user reall/y doesn't want tokens here they should maybe override with
> -1 token? My thought is if you have delegations set up then something on the
> system is trying to configure this and an error might be ok? I'm asking just
> because I paused on it for a bit not sure either way at the moment. I might
> imagine a lazy program not specifying the default bpffs, but also really
> thinking its going to get a valid token.

Interesting perspective! I actually came from the direction that BPF
token is not really all that common and expected thing, and so in
majority of cases (at least for some time) we won't be expecting to
have BPF FS with delegation options. So emitting a warning that
"something something BPF token failed" would be disconcerting to most
users.

What's the worst that would happen if BPF token was expected but we
failed to instantiate it? You'll get a BPF object load failure with
-EPERM, so it will be a pretty clear signal that whatever delegation
was supposed to happen didn't happen.

Also, if a user wants a BPF token for sure, they can explicitly set
bpf_token_path = "/sys/fs/bpf" and then it becomes mandatory.

So tl;dr, my perspective is that most users won't know or care about
BPF tokens. If sysadmin set up BPF FS correctly, it should just work
without the BPF application being aware. But for those rare cases
where a BPF token is expected and necessary, explicit bpf_token_path
or bpf_token_fd is the way to fail early, if something is not set up
the way it is expected.

>
>
> > +             __pr(level, "object '%s': failed (%d) to create BPF token from '%s'%s\n",
> > +                  obj->name, token_fd, bpffs_path,
> > +                  mandatory ? "" : ", skipping optional step...");
> > +             return mandatory ? token_fd : 0;
> > +     }
> > +
> > +     obj->feat_cache = calloc(1, sizeof(*obj->feat_cache));
> > +     if (!obj->feat_cache) {
> > +             close(token_fd);
> > +             return -ENOMEM;
> > +     }
> > +
> > +     obj->token_fd = token_fd;
> > +     obj->feat_cache->token_fd = token_fd;
> > +
> > +     return 0;
> > +}
> > +
> >  static int
> >  bpf_object__probe_loading(struct bpf_object *obj)
> >  {
> > @@ -4601,6 +4664,7 @@ bpf_object__probe_loading(struct bpf_object *obj)
> >               BPF_EXIT_INSN(),
> >       };
> >       int ret, insn_cnt = ARRAY_SIZE(insns);
> > +     LIBBPF_OPTS(bpf_prog_load_opts, opts, .token_fd = obj->token_fd);
> >
> >       if (obj->gen_loader)
> >               return 0;
> > @@ -4610,9 +4674,9 @@ bpf_object__probe_loading(struct bpf_object *obj)
> >               pr_warn("Failed to bump RLIMIT_MEMLOCK (err = %d), you might need to do it explicitly!\n", ret);
> >
> >       /* make sure basic loading works */
> > -     ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, NULL);
> > +     ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, &opts);
> >       if (ret < 0)
> > -             ret = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, NULL);
> > +             ret = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, &opts);
> >       if (ret < 0) {
> >               ret = errno;
> >               cp = libbpf_strerror_r(ret, errmsg, sizeof(errmsg));
> > @@ -4635,6 +4699,9 @@ bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id)
> >                */
> >               return true;
> >
> > +     if (obj->token_fd)
> > +             return feat_supported(obj->feat_cache, feat_id);
>
> OK that answers feat_supported() non null from earlier patch. Just
> was reading in order.
>

yep, no worries, that's what I assumed :)

> > +
> >       return feat_supported(NULL, feat_id);
> >  }
>
> ...
>
> >       btf_fd = bpf_object__btf_fd(obj);
> > @@ -7050,10 +7119,10 @@ static int bpf_object_init_progs(struct bpf_object *obj, const struct bpf_object
> >  static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf, size_t obj_buf_sz,
> >                                         const struct bpf_object_open_opts *opts)
> >  {
> > -     const char *obj_name, *kconfig, *btf_tmp_path;
> > +     const char *obj_name, *kconfig, *btf_tmp_path, *token_path;
> >       struct bpf_object *obj;
> >       char tmp_name[64];
> > -     int err;
> > +     int err, token_fd;
> >       char *log_buf;
> >       size_t log_size;
> >       __u32 log_level;
> > @@ -7087,6 +7156,20 @@ static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf,
> >       if (log_size && !log_buf)
> >               return ERR_PTR(-EINVAL);
> >
> > +     token_path = OPTS_GET(opts, bpf_token_path, NULL);
> > +     token_fd = OPTS_GET(opts, bpf_token_fd, -1);
> > +     /* non-empty token path can't be combined with invalid token FD */
> > +     if (token_path && token_path[0] != '\0' && token_fd < 0)
> > +             return ERR_PTR(-EINVAL);
> > +     if (token_path && token_path[0] == '\0') {
> > +             /* empty token path can't be combined with valid token FD */
> > +             if (token_fd > 0)
> > +                     return ERR_PTR(-EINVAL);
> > +             /* empty token_path is equivalent to invalid token_fd */
> > +             token_path = NULL;
> > +             token_fd = -1;
> > +     }
> > +
> >       obj = bpf_object__new(path, obj_buf, obj_buf_sz, obj_name);
> >       if (IS_ERR(obj))
> >               return obj;
> > @@ -7095,6 +7178,23 @@ static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf,
> >       obj->log_size = log_size;
> >       obj->log_level = log_level;
> >
> > +     obj->token_fd = token_fd <= 0 ? token_fd : dup_good_fd(token_fd);
> > +     if (token_fd > 0 && obj->token_fd < 0) {
> > +             err = -errno;
> > +             goto out;
> > +     }
> > +     if (token_path) {
> > +             if (strlen(token_path) >= PATH_MAX) {
>
> small nit, might be cleaner to just have this up where the other sanity
> checks are done? e.g.
>
>    `token_path[0] !=` `\0` && token_path(token_path) < PATH_MAX`
>
> just to abort earlier. But not sure I care much.

yep, makes sense, I'll move ENAMETOOLONG up

>
> > +                     err = -ENAMETOOLONG;
> > +                     goto out;
> > +             }
> > +             obj->token_path = strdup(token_path);
> > +             if (!obj->token_path) {
> > +                     err = -ENOMEM;
> > +                     goto out;
> > +             }
> > +     }
> > +
> >       btf_tmp_path = OPTS_GET(opts, btf_custom_path, NULL);
> >       if (btf_tmp_path) {
> >               if (strlen(btf_tmp_path) >= PATH_MAX) {

[...]
John Fastabend Dec. 12, 2023, 12:26 a.m. UTC | #3
Andrii Nakryiko wrote:
> On Mon, Dec 11, 2023 at 2:56 PM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > Andrii Nakryiko wrote:
> > > Add BPF token support to BPF object-level functionality.
> > >
> > > BPF token is supported by BPF object logic either as an explicitly
> > > provided BPF token from outside (through BPF FS path or explicit BPF
> > > token FD), or implicitly (unless prevented through
> > > bpf_object_open_opts).
> > >
> > > Implicit mode is assumed to be the most common one for user namespaced
> > > unprivileged workloads. The assumption is that privileged container
> > > manager sets up default BPF FS mount point at /sys/fs/bpf with BPF token
> > > delegation options (delegate_{cmds,maps,progs,attachs} mount options).
> > > BPF object during loading will attempt to create BPF token from
> > > /sys/fs/bpf location, and pass it for all relevant operations
> > > (currently, map creation, BTF load, and program load).
> > >
> > > In this implicit mode, if BPF token creation fails due to whatever
> > > reason (BPF FS is not mounted, or kernel doesn't support BPF token,
> > > etc), this is not considered an error. BPF object loading sequence will
> > > proceed with no BPF token.
> > >
> > > In explicit BPF token mode, user provides explicitly either custom BPF
> > > FS mount point path or creates BPF token on their own and just passes
> > > token FD directly. In such case, BPF object will either dup() token FD
> > > (to not require caller to hold onto it for entire duration of BPF object
> > > lifetime) or will attempt to create BPF token from provided BPF FS
> > > location. If BPF token creation fails, that is considered a critical
> > > error and BPF object load fails with an error.
> > >
> > > Libbpf provides a way to disable implicit BPF token creation, if it
> > > causes any troubles (BPF token is designed to be completely optional and
> > > shouldn't cause any problems even if provided, but in the world of BPF
> > > LSM, custom security logic can be installed that might change outcome
> > > dependin on the presence of BPF token). To disable libbpf's default BPF
> > > token creation behavior user should provide either invalid BPF token FD
> > > (negative), or empty bpf_token_path option.
> > >
> > > BPF token presence can influence libbpf's feature probing, so if BPF
> > > object has associated BPF token, feature probing is instructed to use
> > > BPF object-specific feature detection cache and token FD.
> > >
> > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > ---
> > >  tools/lib/bpf/btf.c             |   7 +-
> > >  tools/lib/bpf/libbpf.c          | 120 ++++++++++++++++++++++++++++++--
> > >  tools/lib/bpf/libbpf.h          |  28 +++++++-
> > >  tools/lib/bpf/libbpf_internal.h |  17 ++++-
> > >  4 files changed, 160 insertions(+), 12 deletions(-)
> > >
> >
> > ...
> >
> > >
> > > +static int bpf_object_prepare_token(struct bpf_object *obj)
> > > +{
> > > +     const char *bpffs_path;
> > > +     int bpffs_fd = -1, token_fd, err;
> > > +     bool mandatory;
> > > +     enum libbpf_print_level level = LIBBPF_DEBUG;
> >
> > redundant set on level?
> >
> 
> yep, removed initialization
> 
> > > +
> > > +     /* token is already set up */
> > > +     if (obj->token_fd > 0)
> > > +             return 0;
> > > +     /* token is explicitly prevented */
> > > +     if (obj->token_fd < 0) {
> > > +             pr_debug("object '%s': token is prevented, skipping...\n", obj->name);
> > > +             /* reset to zero to avoid extra checks during map_create and prog_load steps */
> > > +             obj->token_fd = 0;
> > > +             return 0;
> > > +     }
> > > +
> > > +     mandatory = obj->token_path != NULL;
> > > +     level = mandatory ? LIBBPF_WARN : LIBBPF_DEBUG;
> > > +
> > > +     bpffs_path = obj->token_path ?: BPF_FS_DEFAULT_PATH;
> > > +     bpffs_fd = open(bpffs_path, O_DIRECTORY, O_RDWR);
> > > +     if (bpffs_fd < 0) {
> > > +             err = -errno;
> > > +             __pr(level, "object '%s': failed (%d) to open BPF FS mount at '%s'%s\n",
> > > +                  obj->name, err, bpffs_path,
> > > +                  mandatory ? "" : ", skipping optional step...");
> > > +             return mandatory ? err : 0;
> > > +     }
> > > +
> > > +     token_fd = bpf_token_create(bpffs_fd, 0);
> >
> > Did this get tested on older kernels? In that case TOKEN_CREATE will
> > fail with -EINVAL.
> 
> yep, I did actually test, it will generate expected *debug*-level
> "failed to create BPF token" message

Great.

> 
> >
> > > +     close(bpffs_fd);
> > > +     if (token_fd < 0) {
> > > +             if (!mandatory && token_fd == -ENOENT) {
> > > +                     pr_debug("object '%s': BPF FS at '%s' doesn't have BPF token delegation set up, skipping...\n",
> > > +                              obj->name, bpffs_path);
> > > +                     return 0;
> > > +             }
> >
> > Isn't there a case here we should give a warning about?  If BPF_TOKEN_CREATE
> > exists and !mandatory, but default BPFFS failed for enomem, or eperm reasons?
> > If the user reall/y doesn't want tokens here they should maybe override with
> > -1 token? My thought is if you have delegations set up then something on the
> > system is trying to configure this and an error might be ok? I'm asking just
> > because I paused on it for a bit not sure either way at the moment. I might
> > imagine a lazy program not specifying the default bpffs, but also really
> > thinking its going to get a valid token.
> 
> Interesting perspective! I actually came from the direction that BPF
> token is not really all that common and expected thing, and so in
> majority of cases (at least for some time) we won't be expecting to
> have BPF FS with delegation options. So emitting a warning that
> "something something BPF token failed" would be disconcerting to most
> users.
> 
> What's the worst that would happen if BPF token was expected but we
> failed to instantiate it? You'll get a BPF object load failure with
> -EPERM, so it will be a pretty clear signal that whatever delegation
> was supposed to happen didn't happen.
> 
> Also, if a user wants a BPF token for sure, they can explicitly set
> bpf_token_path = "/sys/fs/bpf" and then it becomes mandatory.
> 
> So tl;dr, my perspective is that most users won't know or care about
> BPF tokens. If sysadmin set up BPF FS correctly, it should just work
> without the BPF application being aware. But for those rare cases
> where a BPF token is expected and necessary, explicit bpf_token_path
> or bpf_token_fd is the way to fail early, if something is not set up
> the way it is expected.

Works for me. I don't have a strong opinion either way.
diff mbox series

Patch

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index ee95fd379d4d..63033c334320 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -1317,7 +1317,9 @@  struct btf *btf__parse_split(const char *path, struct btf *base_btf)
 
 static void *btf_get_raw_data(const struct btf *btf, __u32 *size, bool swap_endian);
 
-int btf_load_into_kernel(struct btf *btf, char *log_buf, size_t log_sz, __u32 log_level)
+int btf_load_into_kernel(struct btf *btf,
+			 char *log_buf, size_t log_sz, __u32 log_level,
+			 int token_fd)
 {
 	LIBBPF_OPTS(bpf_btf_load_opts, opts);
 	__u32 buf_sz = 0, raw_size;
@@ -1367,6 +1369,7 @@  int btf_load_into_kernel(struct btf *btf, char *log_buf, size_t log_sz, __u32 lo
 		opts.log_level = log_level;
 	}
 
+	opts.token_fd = token_fd;
 	btf->fd = bpf_btf_load(raw_data, raw_size, &opts);
 	if (btf->fd < 0) {
 		/* time to turn on verbose mode and try again */
@@ -1394,7 +1397,7 @@  int btf_load_into_kernel(struct btf *btf, char *log_buf, size_t log_sz, __u32 lo
 
 int btf__load_into_kernel(struct btf *btf)
 {
-	return btf_load_into_kernel(btf, NULL, 0, 0);
+	return btf_load_into_kernel(btf, NULL, 0, 0, 0);
 }
 
 int btf__fd(const struct btf *btf)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 1a9fe08179ee..53bf0993b09a 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -59,6 +59,8 @@ 
 #define BPF_FS_MAGIC		0xcafe4a11
 #endif
 
+#define BPF_FS_DEFAULT_PATH "/sys/fs/bpf"
+
 #define BPF_INSN_SZ (sizeof(struct bpf_insn))
 
 /* vsprintf() in __base_pr() uses nonliteral format string. It may break
@@ -694,6 +696,10 @@  struct bpf_object {
 
 	struct usdt_manager *usdt_man;
 
+	struct kern_feature_cache *feat_cache;
+	char *token_path;
+	int token_fd;
+
 	char path[];
 };
 
@@ -2193,7 +2199,7 @@  static int build_map_pin_path(struct bpf_map *map, const char *path)
 	int err;
 
 	if (!path)
-		path = "/sys/fs/bpf";
+		path = BPF_FS_DEFAULT_PATH;
 
 	err = pathname_concat(buf, sizeof(buf), path, bpf_map__name(map));
 	if (err)
@@ -3269,7 +3275,7 @@  static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
 	} else {
 		/* currently BPF_BTF_LOAD only supports log_level 1 */
 		err = btf_load_into_kernel(kern_btf, obj->log_buf, obj->log_size,
-					   obj->log_level ? 1 : 0);
+					   obj->log_level ? 1 : 0, obj->token_fd);
 	}
 	if (sanitize) {
 		if (!err) {
@@ -4592,6 +4598,63 @@  int bpf_map__set_max_entries(struct bpf_map *map, __u32 max_entries)
 	return 0;
 }
 
+static int bpf_object_prepare_token(struct bpf_object *obj)
+{
+	const char *bpffs_path;
+	int bpffs_fd = -1, token_fd, err;
+	bool mandatory;
+	enum libbpf_print_level level = LIBBPF_DEBUG;
+
+	/* token is already set up */
+	if (obj->token_fd > 0)
+		return 0;
+	/* token is explicitly prevented */
+	if (obj->token_fd < 0) {
+		pr_debug("object '%s': token is prevented, skipping...\n", obj->name);
+		/* reset to zero to avoid extra checks during map_create and prog_load steps */
+		obj->token_fd = 0;
+		return 0;
+	}
+
+	mandatory = obj->token_path != NULL;
+	level = mandatory ? LIBBPF_WARN : LIBBPF_DEBUG;
+
+	bpffs_path = obj->token_path ?: BPF_FS_DEFAULT_PATH;
+	bpffs_fd = open(bpffs_path, O_DIRECTORY, O_RDWR);
+	if (bpffs_fd < 0) {
+		err = -errno;
+		__pr(level, "object '%s': failed (%d) to open BPF FS mount at '%s'%s\n",
+		     obj->name, err, bpffs_path,
+		     mandatory ? "" : ", skipping optional step...");
+		return mandatory ? err : 0;
+	}
+
+	token_fd = bpf_token_create(bpffs_fd, 0);
+	close(bpffs_fd);
+	if (token_fd < 0) {
+		if (!mandatory && token_fd == -ENOENT) {
+			pr_debug("object '%s': BPF FS at '%s' doesn't have BPF token delegation set up, skipping...\n",
+				 obj->name, bpffs_path);
+			return 0;
+		}
+		__pr(level, "object '%s': failed (%d) to create BPF token from '%s'%s\n",
+		     obj->name, token_fd, bpffs_path,
+		     mandatory ? "" : ", skipping optional step...");
+		return mandatory ? token_fd : 0;
+	}
+
+	obj->feat_cache = calloc(1, sizeof(*obj->feat_cache));
+	if (!obj->feat_cache) {
+		close(token_fd);
+		return -ENOMEM;
+	}
+
+	obj->token_fd = token_fd;
+	obj->feat_cache->token_fd = token_fd;
+
+	return 0;
+}
+
 static int
 bpf_object__probe_loading(struct bpf_object *obj)
 {
@@ -4601,6 +4664,7 @@  bpf_object__probe_loading(struct bpf_object *obj)
 		BPF_EXIT_INSN(),
 	};
 	int ret, insn_cnt = ARRAY_SIZE(insns);
+	LIBBPF_OPTS(bpf_prog_load_opts, opts, .token_fd = obj->token_fd);
 
 	if (obj->gen_loader)
 		return 0;
@@ -4610,9 +4674,9 @@  bpf_object__probe_loading(struct bpf_object *obj)
 		pr_warn("Failed to bump RLIMIT_MEMLOCK (err = %d), you might need to do it explicitly!\n", ret);
 
 	/* make sure basic loading works */
-	ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, NULL);
+	ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, &opts);
 	if (ret < 0)
-		ret = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, NULL);
+		ret = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, &opts);
 	if (ret < 0) {
 		ret = errno;
 		cp = libbpf_strerror_r(ret, errmsg, sizeof(errmsg));
@@ -4635,6 +4699,9 @@  bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id)
 		 */
 		return true;
 
+	if (obj->token_fd)
+		return feat_supported(obj->feat_cache, feat_id);
+
 	return feat_supported(NULL, feat_id);
 }
 
@@ -4754,6 +4821,7 @@  static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b
 	create_attr.map_flags = def->map_flags;
 	create_attr.numa_node = map->numa_node;
 	create_attr.map_extra = map->map_extra;
+	create_attr.token_fd = obj->token_fd;
 
 	if (bpf_map__is_struct_ops(map))
 		create_attr.btf_vmlinux_value_type_id = map->btf_vmlinux_value_type_id;
@@ -6589,6 +6657,7 @@  static int bpf_object_load_prog(struct bpf_object *obj, struct bpf_program *prog
 	load_attr.attach_btf_id = prog->attach_btf_id;
 	load_attr.kern_version = kern_version;
 	load_attr.prog_ifindex = prog->prog_ifindex;
+	load_attr.token_fd = obj->token_fd;
 
 	/* specify func_info/line_info only if kernel supports them */
 	btf_fd = bpf_object__btf_fd(obj);
@@ -7050,10 +7119,10 @@  static int bpf_object_init_progs(struct bpf_object *obj, const struct bpf_object
 static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf, size_t obj_buf_sz,
 					  const struct bpf_object_open_opts *opts)
 {
-	const char *obj_name, *kconfig, *btf_tmp_path;
+	const char *obj_name, *kconfig, *btf_tmp_path, *token_path;
 	struct bpf_object *obj;
 	char tmp_name[64];
-	int err;
+	int err, token_fd;
 	char *log_buf;
 	size_t log_size;
 	__u32 log_level;
@@ -7087,6 +7156,20 @@  static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf,
 	if (log_size && !log_buf)
 		return ERR_PTR(-EINVAL);
 
+	token_path = OPTS_GET(opts, bpf_token_path, NULL);
+	token_fd = OPTS_GET(opts, bpf_token_fd, -1);
+	/* non-empty token path can't be combined with invalid token FD */
+	if (token_path && token_path[0] != '\0' && token_fd < 0)
+		return ERR_PTR(-EINVAL);
+	if (token_path && token_path[0] == '\0') {
+		/* empty token path can't be combined with valid token FD */
+		if (token_fd > 0)
+			return ERR_PTR(-EINVAL);
+		/* empty token_path is equivalent to invalid token_fd */
+		token_path = NULL;
+		token_fd = -1;
+	}
+
 	obj = bpf_object__new(path, obj_buf, obj_buf_sz, obj_name);
 	if (IS_ERR(obj))
 		return obj;
@@ -7095,6 +7178,23 @@  static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf,
 	obj->log_size = log_size;
 	obj->log_level = log_level;
 
+	obj->token_fd = token_fd <= 0 ? token_fd : dup_good_fd(token_fd);
+	if (token_fd > 0 && obj->token_fd < 0) {
+		err = -errno;
+		goto out;
+	}
+	if (token_path) {
+		if (strlen(token_path) >= PATH_MAX) {
+			err = -ENAMETOOLONG;
+			goto out;
+		}
+		obj->token_path = strdup(token_path);
+		if (!obj->token_path) {
+			err = -ENOMEM;
+			goto out;
+		}
+	}
+
 	btf_tmp_path = OPTS_GET(opts, btf_custom_path, NULL);
 	if (btf_tmp_path) {
 		if (strlen(btf_tmp_path) >= PATH_MAX) {
@@ -7605,7 +7705,8 @@  static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch
 	if (obj->gen_loader)
 		bpf_gen__init(obj->gen_loader, extra_log_level, obj->nr_programs, obj->nr_maps);
 
-	err = bpf_object__probe_loading(obj);
+	err = bpf_object_prepare_token(obj);
+	err = err ? : bpf_object__probe_loading(obj);
 	err = err ? : bpf_object__load_vmlinux_btf(obj, false);
 	err = err ? : bpf_object__resolve_externs(obj, obj->kconfig);
 	err = err ? : bpf_object__sanitize_and_load_btf(obj);
@@ -8142,6 +8243,11 @@  void bpf_object__close(struct bpf_object *obj)
 	}
 	zfree(&obj->programs);
 
+	zfree(&obj->feat_cache);
+	zfree(&obj->token_path);
+	if (obj->token_fd > 0)
+		close(obj->token_fd);
+
 	free(obj);
 }
 
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 6cd9c501624f..d3de39b537f3 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -177,10 +177,36 @@  struct bpf_object_open_opts {
 	 * logs through its print callback.
 	 */
 	__u32 kernel_log_level;
+	/* FD of a BPF token instantiated by user through bpf_token_create()
+	 * API. BPF object will keep dup()'ed FD internally, so passed token
+	 * FD can be closed after BPF object/skeleton open step.
+	 *
+	 * Setting bpf_token_fd to negative value disables libbpf's automatic
+	 * attempt to create BPF token from default BPF FS mount point
+	 * (/sys/fs/bpf), in case this default behavior is undesirable.
+	 *
+	 * bpf_token_path and bpf_token_fd are mutually exclusive and only one
+	 * of those options should be set.
+	 */
+	int bpf_token_fd;
+	/* Path to BPF FS mount point to derive BPF token from.
+	 *
+	 * Created BPF token will be used for all bpf() syscall operations
+	 * that accept BPF token (e.g., map creation, BTF and program loads,
+	 * etc) automatically within instantiated BPF object.
+	 *
+	 * Setting bpf_token_path option to empty string disables libbpf's
+	 * automatic attempt to create BPF token from default BPF FS mount
+	 * point (/sys/fs/bpf), in case this default behavior is undesirable.
+	 *
+	 * bpf_token_path and bpf_token_fd are mutually exclusive and only one
+	 * of those options should be set.
+	 */
+	const char *bpf_token_path;
 
 	size_t :0;
 };
-#define bpf_object_open_opts__last_field kernel_log_level
+#define bpf_object_open_opts__last_field bpf_token_path
 
 /**
  * @brief **bpf_object__open()** creates a bpf_object by opening
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index b45566e428d7..4cda32298c49 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -383,7 +383,9 @@  int parse_cpu_mask_file(const char *fcpu, bool **mask, int *mask_sz);
 int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
 			 const char *str_sec, size_t str_len,
 			 int token_fd);
-int btf_load_into_kernel(struct btf *btf, char *log_buf, size_t log_sz, __u32 log_level);
+int btf_load_into_kernel(struct btf *btf,
+			 char *log_buf, size_t log_sz, __u32 log_level,
+			 int token_fd);
 
 struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf);
 void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
@@ -547,6 +549,17 @@  static inline bool is_ldimm64_insn(struct bpf_insn *insn)
 	return insn->code == (BPF_LD | BPF_IMM | BPF_DW);
 }
 
+/* Unconditionally dup FD, ensuring it doesn't use [0, 2] range.
+ * Original FD is not closed or altered in any other way.
+ * Preserves original FD value, if it's invalid (negative).
+ */
+static inline int dup_good_fd(int fd)
+{
+	if (fd < 0)
+		return fd;
+	return fcntl(fd, F_DUPFD_CLOEXEC, 3);
+}
+
 /* if fd is stdin, stdout, or stderr, dup to a fd greater than 2
  * Takes ownership of the fd passed in, and closes it if calling
  * fcntl(fd, F_DUPFD_CLOEXEC, 3).
@@ -558,7 +571,7 @@  static inline int ensure_good_fd(int fd)
 	if (fd < 0)
 		return fd;
 	if (fd < 3) {
-		fd = fcntl(fd, F_DUPFD_CLOEXEC, 3);
+		fd = dup_good_fd(fd);
 		saved_errno = errno;
 		close(old_fd);
 		errno = saved_errno;