[bpf-next] bpf: Support uid and gid when mounting bpffs

Message ID	20231201094729.1312133-1-jiejiang@chromium.org (mailing list archive)
State	Superseded
Delegated to:	BPF
Headers	Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=chromium.org header.i=@chromium.org header.b="a3ceOl66" From: Jie Jiang <jiejiang@chromium.org> To: bpf@vger.kernel.org Cc: jiejiang@chromium.org, vapier@chromium.org Subject: [PATCH bpf-next] bpf: Support uid and gid when mounting bpffs Date: Fri, 1 Dec 2023 09:47:29 +0000 Message-ID: <20231201094729.1312133-1-jiejiang@chromium.org> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	[bpf-next] bpf: Support uid and gid when mounting bpffs

Context	Check	Description
bpf/vmtest-bpf-next-PR	success	PR summary
netdev/series_format	success	Single patches do not need cover letters
netdev/tree_selection	success	Clearly marked for bpf-next
netdev/ynl	success	SINGLE THREAD; Generated files up to date; no warnings/errors;
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 1117 this patch: 1117
netdev/cc_maintainers	fail	11 maintainers not CCed: haoluo@google.com jolsa@kernel.org kpsingh@kernel.org john.fastabend@gmail.com martin.lau@linux.dev song@kernel.org daniel@iogearbox.net ast@kernel.org andrii@kernel.org sdf@google.com yonghong.song@linux.dev
netdev/build_clang	success	Errors and warnings before: 1143 this patch: 1143
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 1144 this patch: 1144
netdev/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 72 lines checked
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-0	success	Logs for Lint
bpf/vmtest-bpf-next-VM_Test-1	success	Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2	success	Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-3	success	Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8	success	Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-7	success	Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-5	success	Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4	success	Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-6	success	Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-15	success	Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-9	success	Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-14	success	Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-17	success	Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-16	success	Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18	success	Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-20	success	Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22	success	Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21	success	Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-19	success	Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23	success	Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25	success	Logs for x86_64-llvm-16 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-26	success	Logs for x86_64-llvm-16 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27	success	Logs for x86_64-llvm-16 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24	success	Logs for x86_64-llvm-16 / build / build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-28	success	Logs for x86_64-llvm-16 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29	success	Logs for x86_64-llvm-16 / veristat
bpf/vmtest-bpf-next-VM_Test-13	success	Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11	success	Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12	success	Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-10	success	Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc

Jie Jiang Dec. 1, 2023, 9:47 a.m. UTC

Parse uid and gid in bpf_parse_param() so that they can be passed in as
the `data` parameter when mounting bpffs with mount(). This is useful
when we want to control which user/group has control over the mounted
bpffs; otherwise a separate chown() call is needed.

Signed-off-by: Jie Jiang <jiejiang@chromium.org>
---
 kernel/bpf/inode.c | 33 +++++++++++++++++++++++++++++++--
 1 file changed, 31 insertions(+), 2 deletions(-)
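
As an illustration of the use case in the commit message, the new options
could be passed through the `data` argument of mount(2) roughly as shown
below. This is a minimal sketch and not part of the patch: the helper names
bpffs_format_opts() and bpffs_mount_owned() are hypothetical, the mount
call needs sufficient privilege, and it assumes a kernel that carries this
change. Since "mode" is declared with fsparam_u32oct, it is formatted as
octal here.

```c
#include <stdio.h>
#include <sys/mount.h>
#include <sys/types.h>

/*
 * Build the bpffs mount data string, e.g. "uid=1000,gid=1000,mode=750".
 * Returns 0 on success, -1 if the buffer is too small.
 * (Hypothetical helper, for illustration only.)
 */
int bpffs_format_opts(char *buf, size_t len, uid_t uid, gid_t gid,
		      unsigned int mode)
{
	int n = snprintf(buf, len, "uid=%u,gid=%u,mode=%o",
			 (unsigned int)uid, (unsigned int)gid, mode & 07777);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Mount bpffs at @target with the root inode owned by uid:gid, so that no
 * separate chown() is needed afterwards. Returns the mount(2) result.
 */
int bpffs_mount_owned(const char *target, uid_t uid, gid_t gid,
		      unsigned int mode)
{
	char opts[64];

	if (bpffs_format_opts(opts, sizeof(opts), uid, gid, mode) < 0)
		return -1;
	return mount("bpf", target, "bpf", 0, opts);
}
```

A caller would use something like bpffs_mount_owned("/sys/fs/bpf", 1000,
1000, 0750) and check errno on failure; on a kernel without this patch the
uid/gid options are expected to be rejected.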

Mike Frysinger Dec. 1, 2023, 6:43 p.m. UTC | #1

Acked-by: Mike Frysinger <vapier@chromium.org>

Christian Brauner Dec. 5, 2023, 4:31 p.m. UTC | #2

On Fri, Dec 01, 2023 at 09:47:29AM +0000, Jie Jiang wrote:
> Parse uid and gid in bpf_parse_param() so that they can be passed in as
> the `data` parameter when mounting bpffs with mount(). This is useful
> when we want to control which user/group has control over the mounted
> bpffs; otherwise a separate chown() call is needed.
> 
> Signed-off-by: Jie Jiang <jiejiang@chromium.org>
> ---

Sorry, I was asked to take a quick look at this. The patchset looks fine
overall but it will interact with Andrii's patchset which makes bpffs
mountable inside a user namespace (with caveats).

At that point you need additional validation in bpf_parse_param(). The
simplest thing would probably be to just put this into this series or into
@Andrii's series. It's basically a copy-pasta from what I did for tmpfs
(see below).

I plan to move this validation into the VFS so that {g,u}id mount
options are validated consistently for any such filesystem. There is just
some unpleasantness that I have to figure out first.

@Andrii, with the {g,u}id mount option it means that userns root can

fsconfig(..., FSCONFIG_SET_STRING, "uid", "1000", ...)
fsconfig(..., FSCONFIG_SET_STRING, "gid", "1000", ...)
fsconfig(..., FSCONFIG_CMD_CREATE, ...)

If you delegate CAP_BPF in that userns to uid 1000 then an unpriv user
in that userns can create bpf tokens. Currently this would require
userns root to give both CAP_DAC_READ_SEARCH and CAP_BPF to such an
unprivileged user.
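
For reference, the fsconfig() sequence above could be spelled out with the
new mount API roughly as follows. This is only a sketch under assumptions:
the function name bpffs_detached_mount() is hypothetical, raw syscall(2) is
used so that it does not depend on libc wrappers (it does need kernel
headers that define SYS_fsopen and the FSCONFIG_* constants), and the
uid/gid keys are only accepted on a kernel with this patch. It stops at
fsmount(), which yields a detached mount that move_mount() could attach.

```c
#include <errno.h>
#include <linux/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Close @fd without clobbering errno, then hand back @ret. */
static int close_keep_errno(int fd, int ret)
{
	int saved = errno;

	close(fd);
	errno = saved;
	return ret;
}

/*
 * Configure a bpffs superblock with the given uid/gid strings and return
 * a detached mount fd, or -1 with errno set.
 * (Hypothetical helper, for illustration only.)
 */
int bpffs_detached_mount(const char *uid, const char *gid)
{
	int fs_fd, mnt_fd;

	fs_fd = (int)syscall(SYS_fsopen, "bpf", FSOPEN_CLOEXEC);
	if (fs_fd < 0)
		return -1;

	if (syscall(SYS_fsconfig, fs_fd, FSCONFIG_SET_STRING, "uid", uid, 0) < 0 ||
	    syscall(SYS_fsconfig, fs_fd, FSCONFIG_SET_STRING, "gid", gid, 0) < 0 ||
	    syscall(SYS_fsconfig, fs_fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0) < 0)
		return close_keep_errno(fs_fd, -1);

	/* The context fd is no longer needed once the mount fd exists. */
	mnt_fd = (int)syscall(SYS_fsmount, fs_fd, FSMOUNT_CLOEXEC, 0);
	return close_keep_errno(fs_fd, mnt_fd);
}
```

Calling bpffs_detached_mount("1000", "1000") as userns root corresponds to
the scenario described here; without mount privileges the fsopen() step is
expected to fail and the function returns -1.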

Depending on whether or not that's intended you might want to add an
additional check into bpf_token_create() to verify that the caller's
{g,u}id resolves to 0:

if (from_kuid(current_user_ns(), current_fsuid()) != 0)
        return -EINVAL;

That's basically saying you're restricting this to userns root. Idk,
that's up to you. (Note that you currently enforce current_user_ns() ==
token->user_ns == s_user_ns which is why it doesn't matter what userns
you pass here. You'd just error out later.)

>  kernel/bpf/inode.c | 33 +++++++++++++++++++++++++++++++--
>  1 file changed, 31 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
> index 1aafb2ff2e953..826fe48745ee2 100644
> --- a/kernel/bpf/inode.c
> +++ b/kernel/bpf/inode.c
> @@ -599,8 +599,15 @@ EXPORT_SYMBOL(bpf_prog_get_type_path);
>   */
>  static int bpf_show_options(struct seq_file *m, struct dentry *root)
>  {
> -	umode_t mode = d_inode(root)->i_mode & S_IALLUGO & ~S_ISVTX;
> -
> +	struct inode *inode = d_inode(root);
> +	umode_t mode = inode->i_mode & S_IALLUGO & ~S_ISVTX;
> +
> +	if (!uid_eq(inode->i_uid, GLOBAL_ROOT_UID))
> +		seq_printf(m, ",uid=%u",
> +			   from_kuid_munged(&init_user_ns, inode->i_uid));
> +	if (!gid_eq(inode->i_gid, GLOBAL_ROOT_GID))
> +		seq_printf(m, ",gid=%u",
> +			   from_kgid_munged(&init_user_ns, inode->i_gid));
>  	if (mode != S_IRWXUGO)
>  		seq_printf(m, ",mode=%o", mode);
>  	return 0;
> @@ -625,15 +632,21 @@ static const struct super_operations bpf_super_ops = {
>  };
>  
>  enum {
> +	OPT_UID,
> +	OPT_GID,
>  	OPT_MODE,
>  };
>  
>  static const struct fs_parameter_spec bpf_fs_parameters[] = {
> +	fsparam_u32	("gid",				OPT_GID),
>  	fsparam_u32oct	("mode",			OPT_MODE),
> +	fsparam_u32	("uid",				OPT_UID),
>  	{}
>  };
>  
>  struct bpf_mount_opts {
> +	kuid_t uid;
> +	kgid_t gid;
>  	umode_t mode;
>  };
>  
> @@ -641,6 +654,8 @@ static int bpf_parse_param(struct fs_context *fc, struct fs_parameter *param)
>  {
>  	struct bpf_mount_opts *opts = fc->fs_private;
>  	struct fs_parse_result result;
> +	kuid_t uid;
> +	kgid_t gid;
>  	int opt;
>  
>  	opt = fs_parse(fc, bpf_fs_parameters, param, &result);
> @@ -662,6 +677,18 @@ static int bpf_parse_param(struct fs_context *fc, struct fs_parameter *param)
>  	}
>  
>  	switch (opt) {
> +	case OPT_UID:
> +		uid = make_kuid(current_user_ns(), result.uint_32);
> +		if (!uid_valid(uid))
> +			return invalf(fc, "Unknown uid");

		/*
		 * The requested uid must be representable in the
		 * filesystem's idmapping.
		 */
		if (!kuid_has_mapping(fc->user_ns, uid))
			goto bad_value;

> +		opts->uid = uid;
> +		break;
> +	case OPT_GID:
> +		gid = make_kgid(current_user_ns(), result.uint_32);
> +		if (!gid_valid(gid))
> +			return invalf(fc, "Unknown gid");

		/*
		 * The requested gid must be representable in the
		 * filesystem's idmapping.
		 */
		if (!kgid_has_mapping(fc->user_ns, gid))
			goto bad_value;

> +		opts->gid = gid;
> +		break;
>  	case OPT_MODE:
>  		opts->mode = result.uint_32 & S_IALLUGO;
>  		break;
> @@ -750,6 +777,8 @@ static int bpf_fill_super(struct super_block *sb, struct fs_context *fc)
>  	sb->s_op = &bpf_super_ops;
>  
>  	inode = sb->s_root->d_inode;
> +	inode->i_uid = opts->uid;
> +	inode->i_gid = opts->gid;
>  	inode->i_op = &bpf_dir_iops;
>  	inode->i_mode &= ~S_IALLUGO;
>  	populate_bpffs(sb->s_root);
> -- 
> 2.43.0.rc2.451.g8631bc7472-goog
>
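
The two checks suggested inline above (the id must be valid in the caller's
user namespace, and the resulting kernel id must be representable in the
filesystem's user namespace) can be modeled in userspace as below. This is
a deliberately simplified toy, not kernel code: the toy_* names and the
single-level extent table are assumptions made for illustration, and real
idmappings are handled by make_kuid(), uid_valid() and kuid_has_mapping().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_INVALID_ID UINT32_MAX

/* One mapping extent: ns ids [first, first+count) -> kernel ids from lower_first. */
struct id_extent {
	uint32_t first;
	uint32_t lower_first;
	uint32_t count;
};

struct toy_userns {
	const struct id_extent *extents;
	size_t nr;
};

/* Namespace-local id -> kernel id (plays the role of make_kuid()). */
uint32_t toy_make_kid(const struct toy_userns *ns, uint32_t id)
{
	for (size_t i = 0; i < ns->nr; i++) {
		const struct id_extent *e = &ns->extents[i];

		if (id >= e->first && id - e->first < e->count)
			return e->lower_first + (id - e->first);
	}
	return TOY_INVALID_ID;
}

/* Kernel id -> namespace-local id (plays the role of from_kuid()). */
uint32_t toy_from_kid(const struct toy_userns *ns, uint32_t kid)
{
	for (size_t i = 0; i < ns->nr; i++) {
		const struct id_extent *e = &ns->extents[i];

		if (kid >= e->lower_first && kid - e->lower_first < e->count)
			return e->first + (kid - e->lower_first);
	}
	return TOY_INVALID_ID;
}

/* Plays the role of kuid_has_mapping(). */
bool toy_kid_has_mapping(const struct toy_userns *ns, uint32_t kid)
{
	return toy_from_kid(ns, kid) != TOY_INVALID_ID;
}

/*
 * Mirror of the suggested OPT_UID/OPT_GID handling: returns 0 and stores
 * the kernel id in *out, or -1 if either check rejects the value.
 */
int toy_validate_mount_id(const struct toy_userns *caller_ns,
			  const struct toy_userns *fs_ns,
			  uint32_t id, uint32_t *out)
{
	uint32_t kid = toy_make_kid(caller_ns, id);

	if (kid == TOY_INVALID_ID)		/* like !uid_valid(uid) */
		return -1;
	if (!toy_kid_has_mapping(fs_ns, kid))	/* not representable in fs */
		return -1;
	*out = kid;
	return 0;
}
```

With a container namespace mapping ids 0..65535 to kernel ids starting at
100000, a request for uid 1000 from inside the container passes both checks,
while uid 1000 requested from the initial namespace for that container's
filesystem fails the second one because kernel id 1000 has no mapping there.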

Andrii Nakryiko Dec. 5, 2023, 6:28 p.m. UTC | #3

On Tue, Dec 5, 2023 at 8:31 AM Christian Brauner <brauner@kernel.org> wrote:
>
> On Fri, Dec 01, 2023 at 09:47:29AM +0000, Jie Jiang wrote:
> > Parse uid and gid in bpf_parse_param() so that they can be passed in as
> > the `data` parameter when mounting bpffs with mount(). This is useful
> > when we want to control which user/group has control over the mounted
> > bpffs; otherwise a separate chown() call is needed.
> >
> > Signed-off-by: Jie Jiang <jiejiang@chromium.org>
> > ---
>
> Sorry, I was asked to take a quick look at this. The patchset looks fine
> overall but it will interact with Andrii's patchset which makes bpffs
> mountable inside a user namespace (with caveats).
>
> At that point you need additional validation in bpf_parse_param(). The
> simplest thing would probably be to just put this into this series or into
> @Andrii's series. It's basically a copy-pasta from what I did for tmpfs
> (see below).
>
> I plan to move this validation into the VFS so that {g,u}id mount
> options are validated consistently for any such filesystem. There is just
> some unpleasantness that I have to figure out first.
>
> @Andrii, with the {g,u}id mount option it means that userns root can
>
> fsconfig(..., FSCONFIG_SET_STRING, "uid", "1000", ...)
> fsconfig(..., FSCONFIG_SET_STRING, "gid", "1000", ...)
> fsconfig(..., FSCONFIG_CMD_CREATE, ...)
>
> If you delegate CAP_BPF in that userns to uid 1000 then an unpriv user
> in that userns can create bpf tokens. Currently this would require
> userns root to give both CAP_DAC_READ_SEARCH and CAP_BPF to such an
> unprivileged user.

This is probably fine. Basically the only difference is that BPF FS
can be instantiated inside an unpriv namespace, instead of in a
privileged parent namespace, right?

But delegate_xxx options are still guarded by the explicit
capable(CAP_SYS_ADMIN) check, so that unprivileged user won't be able
to grant themselves BPF token-enabling capabilities without a
privileged parent doing it on their behalf.

Is my understanding correct or am I missing some nuance here?

>
> Depending on whether or not that's intended you might want to add an
> additional check into bpf_token_create() to verify that the caller's
> {g,u}id resolves to 0:
>
> if (from_kuid(current_user_ns(), current_fsuid()) != 0)
>         return -EINVAL;
>
> That's basically saying you're restricting this to userns root. Idk,
> that's up to you. (Note that you currently enforce current_user_ns() ==
> token->user_ns == s_user_ns which is why it doesn't matter what userns
> you pass here. You'd just error out later.)
>
> >  kernel/bpf/inode.c | 33 +++++++++++++++++++++++++++++++--
> >  1 file changed, 31 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
> > index 1aafb2ff2e953..826fe48745ee2 100644
> > --- a/kernel/bpf/inode.c
> > +++ b/kernel/bpf/inode.c

[...]

Jie Jiang Dec. 6, 2023, 7:43 a.m. UTC | #4

On Wed, Dec 6, 2023 at 3:28 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Dec 5, 2023 at 8:31 AM Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Fri, Dec 01, 2023 at 09:47:29AM +0000, Jie Jiang wrote:
>> ...
> > Sorry, I was asked to take a quick look at this. The patchset looks fine
> > overall but it will interact with Andrii's patchset which makes bpffs
> > mountable inside a user namespace (with caveats).
> >
> > At that point you need additional validation in bpf_parse_param(). The
> > simplest thing would probably be to just put this into this series or into
> > @Andrii's series. It's basically a copy-pasta from what I did for tmpfs
> > (see below).
> >
> > I plan to move this validation into the VFS so that {g,u}id mount
> > options are validated consistently for any such filesystem. There is just
> > some unpleasantness that I have to figure out first.
> >

Thank you very much for the suggestions and discussions.
I uploaded the v2 version of this patch to include the checks as you suggested.

> > @Andrii, with the {g,u}id mount option it means that userns root can
> > ...
> [...]

Christian Brauner Dec. 6, 2023, 10:57 a.m. UTC | #5

On Tue, Dec 05, 2023 at 10:28:39AM -0800, Andrii Nakryiko wrote:
> On Tue, Dec 5, 2023 at 8:31 AM Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Fri, Dec 01, 2023 at 09:47:29AM +0000, Jie Jiang wrote:
> > > Parse uid and gid in bpf_parse_param() so that they can be passed in as
> > > the `data` parameter when mounting bpffs with mount(). This is useful
> > > when we want to control which user/group has control over the mounted
> > > bpffs; otherwise a separate chown() call is needed.
> > >
> > > Signed-off-by: Jie Jiang <jiejiang@chromium.org>
> > > ---
> >
> > Sorry, I was asked to take a quick look at this. The patchset looks fine
> > overall but it will interact with Andrii's patchset which makes bpffs
> > mountable inside a user namespace (with caveats).
> >
> > At that point you need additional validation in bpf_parse_param(). The
> > simplest thing would probably be to just put this into this series or into
> > @Andrii's series. It's basically a copy-pasta from what I did for tmpfs
> > (see below).
> >
> > I plan to move this validation into the VFS so that {g,u}id mount
> > options are validated consistently for any such filesystem. There is just
> > some unpleasantness that I have to figure out first.
> >
> > @Andrii, with the {g,u}id mount option it means that userns root can
> >
> > fsconfig(..., FSCONFIG_SET_STRING, "uid", "1000", ...)
> > fsconfig(..., FSCONFIG_SET_STRING, "gid", "1000", ...)
> > fsconfig(..., FSCONFIG_CMD_CREATE, ...)
> >
> > If you delegate CAP_BPF in that userns to uid 1000 then an unpriv user
> > in that userns can create bpf tokens. Currently this would require
> > userns root to give both CAP_DAC_READ_SEARCH and CAP_BPF to such an
> > unprivileged user.
> 
> This is probably fine. Basically the only difference is that BPF FS
> can be instantiated inside an unpriv namespace, instead of in a
> privileged parent namespace, right?

Hm, I think this is slightly misphrased but I guess I get what you mean.

Basically, userns root can change what {g,u}id bpffs will use to
instantiate inodes once init_user_ns root creates the superblock. IOW,
the {g,u}id mount option isn't guarded and can thus be changed by userns
root.

> 
> But delegate_xxx options are still guarded by the explicit
> capable(CAP_SYS_ADMIN) check, so that unprivileged user won't be able
> to grant themselves BPF token-enabling capabilities without a
> privileged parent doing it on their behalf.
> 
> Is my understanding correct or am I missing some nuance here?

No, that's correct.

Andrii Nakryiko Dec. 6, 2023, 5:18 p.m. UTC | #6

On Wed, Dec 6, 2023 at 2:58 AM Christian Brauner <brauner@kernel.org> wrote:
>
> On Tue, Dec 05, 2023 at 10:28:39AM -0800, Andrii Nakryiko wrote:
> > On Tue, Dec 5, 2023 at 8:31 AM Christian Brauner <brauner@kernel.org> wrote:
> > >
> > > On Fri, Dec 01, 2023 at 09:47:29AM +0000, Jie Jiang wrote:
> > > > Parse uid and gid in bpf_parse_param() so that they can be passed in as
> > > > the `data` parameter when mounting bpffs with mount(). This is useful
> > > > when we want to control which user/group has control over the mounted
> > > > bpffs; otherwise a separate chown() call is needed.
> > > >
> > > > Signed-off-by: Jie Jiang <jiejiang@chromium.org>
> > > > ---
> > >
> > > Sorry, I was asked to take a quick look at this. The patchset looks fine
> > > overall but it will interact with Andrii's patchset which makes bpffs
> > > mountable inside a user namespace (with caveats).
> > >
> > > At that point you need additional validation in bpf_parse_param(). The
> > > simplest thing would probably be to just put this into this series or into
> > > @Andrii's series. It's basically a copy-pasta from what I did for tmpfs
> > > (see below).
> > >
> > > I plan to move this validation into the VFS so that {g,u}id mount
> > > options are validated consistently for any such filesystem. There is just
> > > some unpleasantness that I have to figure out first.
> > >
> > > @Andrii, with the {g,u}id mount option it means that userns root can
> > >
> > > fsconfig(..., FSCONFIG_SET_STRING, "uid", "1000", ...)
> > > fsconfig(..., FSCONFIG_SET_STRING, "gid", "1000", ...)
> > > fsconfig(..., FSCONFIG_CMD_CREATE, ...)
> > >
> > > If you delegate CAP_BPF in that userns to uid 1000 then an unpriv user
> > > in that userns can create bpf tokens. Currently this would require
> > > userns root to give both CAP_DAC_READ_SEARCH and CAP_BPF to such an
> > > unprivileged user.
> >
> > This is probably fine. Basically the only difference is that BPF FS
> > can be instantiated inside an unpriv namespace, instead of in a
> > privileged parent namespace, right?
>
> Hm, I think this is slightly misphrased but I guess I get what you mean.
>
> Basically, userns root can change what {g,u}id bpffs will use to
> instantiate inodes once init_user_ns root creates the superblock. IOW,
> the {g,u}id mount option isn't guarded and can thus be changed by userns
> root.
>
> >
> > But delegate_xxx options are still guarded by the explicit
> > capable(CAP_SYS_ADMIN) check, so that unprivileged user won't be able
> > to grant themselves BPF token-enabling capabilities without a
> > privileged parent doing it on their behalf.
> >
> > Is my understanding correct or am I missing some nuance here?
>
> No, that's correct.

Ok, thanks, then it seems like it's all good w.r.t. interaction with
delegate_xxx options and BPF token creation.
