[1/7] bpf: Add missing fd modes check for map iterators

Message ID: 20220906170301.256206-2-roberto.sassu@huaweicloud.com
State: Changes Requested
Delegated to: BPF
Series: bpf: Add fd modes check for map iter and extend libbpf

Checks

Context                         Check    Description
bpf/vmtest-bpf-next-VM_Test-1   success  Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12  fail     Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15  success  Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16  success  Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17  success  Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7   success  Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-8   success  Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-9   fail     Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-10  success  Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-11  success  Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-13  success  Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-14  success  Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-PR          fail     PR summary
bpf/vmtest-bpf-next-VM_Test-6   success  Logs for test_maps on s390x with gcc
netdev/tree_selection           success  Guessed tree name to be net-next, async
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Series has a cover letter
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 1355 this patch: 1355
netdev/cc_maintainers           warning  1 maintainers not CCed: joannelkoong@gmail.com
netdev/build_clang              success  Errors and warnings before: 159 this patch: 159
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 1347 this patch: 1347
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 64 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-2   success  Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-3   success  Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-4   success  Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-5   success  Logs for set-matrix

Commit Message

Roberto Sassu Sept. 6, 2022, 5:02 p.m. UTC
From: Roberto Sassu <roberto.sassu@huawei.com>

Commit 6e71b04a82248 ("bpf: Add file mode configuration into bpf maps")
added the BPF_F_RDONLY and BPF_F_WRONLY flags, to let user space specify
whether it will just read or modify a map.

Map access control is done in two steps. First, when user space wants to
obtain a map fd, it passes the eBPF-defined flags to the kernel, which
converts them into open flags and passes those to the security_bpf_map()
security hook for evaluation by LSMs.

Second, if user space successfully obtained an fd, it passes that fd to the
kernel when it requests a map operation (e.g. lookup or update). The kernel
first checks if the fd has the modes required to perform the requested
operation and, if yes, continues the execution and returns the result to
user space.
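
As an illustration of the two steps above, here is a minimal user-space
sketch; it is not kernel code. The MODEL_* constants and all helper names
are hypothetical stand-ins invented for this example, and rejecting both
flags at once is an assumption of the model. It only mirrors the flow
described above: flags are turned into open flags when the fd is obtained,
and each later operation is checked against the modes that fd carries.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the eBPF-defined flags and the fd modes. */
#define MODEL_F_RDONLY   (1U << 0)
#define MODEL_F_WRONLY   (1U << 1)
#define MODEL_CAN_READ   (1U << 0)
#define MODEL_CAN_WRITE  (1U << 1)

/* Step 1: convert the eBPF-defined flags into open flags (or -EINVAL). */
static int model_flags_to_open_flags(unsigned int bpf_flags)
{
	if ((bpf_flags & MODEL_F_RDONLY) && (bpf_flags & MODEL_F_WRONLY))
		return -EINVAL;	/* assumption: contradictory request */
	if (bpf_flags & MODEL_F_RDONLY)
		return O_RDONLY;
	if (bpf_flags & MODEL_F_WRONLY)
		return O_WRONLY;
	return O_RDWR;		/* no flag given: read-write */
}

/* Modes an fd obtained with the given open flags would carry. */
static unsigned int model_open_flags_to_modes(int open_flags)
{
	switch (open_flags & O_ACCMODE) {
	case O_RDONLY:
		return MODEL_CAN_READ;
	case O_WRONLY:
		return MODEL_CAN_WRITE;
	default:
		return MODEL_CAN_READ | MODEL_CAN_WRITE;
	}
}

/* Step 2: a lookup needs read mode, an update needs write mode. */
static bool model_may_lookup(unsigned int modes)
{
	return (modes & MODEL_CAN_READ) != 0;
}

static bool model_may_update(unsigned int modes)
{
	return (modes & MODEL_CAN_WRITE) != 0;
}
```

In this model, an fd obtained with the read-only flag passes
model_may_lookup() but fails model_may_update().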

While the fd modes check was added for map_*_elem() functions, it is
currently missing for map iterators, added more recently with commit
a5cbe05a6673 ("bpf: Implement bpf iterator for map elements"). A map
iterator executes a chosen eBPF program for each key/value pair of a map
and allows that program to read and/or modify them.

Whether a map iterator allows only read or also write depends on whether
the MEM_RDONLY flag in the ctx_arg_info member of the bpf_iter_reg
structure is set. Also, write needs to be supported at verifier level (for
example, it is currently not supported for sock maps).

Since map iterators obtain a map from a user space fd with
bpf_map_get_with_uref(), add the new req_modes parameter to that function,
so that map iterators can provide the required fd modes to access a map. If
the user space fd doesn't include the required modes,
bpf_map_get_with_uref() returns with an error, and the map iterator will
not be created.

If a map iterator marks both the key and value as read-only, it calls
bpf_map_get_with_uref() with FMODE_CAN_READ as value for req_modes. If it
also allows write access to either the key or the value, it calls that
function with FMODE_CAN_READ | FMODE_CAN_WRITE as value for req_modes,
regardless of whether or not the write is supported by the verifier (the
write is intentionally allowed).
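
The selection of req_modes and the resulting check can be sketched in user
space as follows. This is a simplified model under stated assumptions, not
the patch itself: the MODEL_* constants and both function names are
hypothetical, and the real check compares against the modes the kernel
derives from the fd.

```c
#include <errno.h>

/* Hypothetical stand-ins for FMODE_CAN_READ and FMODE_CAN_WRITE. */
#define MODEL_CAN_READ   (1U << 0)
#define MODEL_CAN_WRITE  (1U << 1)

/*
 * Modes a map iterator would request: read always, plus write as soon as
 * either the key or the value is not marked read-only.
 */
static unsigned int model_iter_req_modes(int key_rdonly, int value_rdonly)
{
	unsigned int req = MODEL_CAN_READ;

	if (!key_rdonly || !value_rdonly)
		req |= MODEL_CAN_WRITE;
	return req;
}

/* Returns 0 if fd_modes contain every requested bit, -EPERM otherwise. */
static int model_check_req_modes(unsigned int fd_modes, unsigned int req_modes)
{
	if ((fd_modes & req_modes) != req_modes)
		return -EPERM;
	return 0;
}
```

With a read-only fd, model_check_req_modes() rejects an iterator that
exposes a writable value, while a request of 0 (the bpf_fd_probe_obj()
case) always passes.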

bpf_fd_probe_obj() does not require any fd mode, as the fd is only used for
the purpose of finding the eBPF object type, for pinning the object to the
bpffs filesystem.

Finally, it is worth mentioning that the fd modes check was not added for
the cgroup iterator, although it registers an attach_target method like the
other iterators. The reason is that an fd is not the only way for user
space to reference a cgroup object: it can also be referenced by ID and by
path. For the protection to be effective, all reference methods need to be
evaluated consistently. This work is deferred to a separate patch.

Cc: stable@vger.kernel.org # 5.10.x
Fixes: a5cbe05a6673 ("bpf: Implement bpf iterator for map elements")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
---
 include/linux/bpf.h       | 2 +-
 kernel/bpf/inode.c        | 2 +-
 kernel/bpf/map_iter.c     | 3 ++-
 kernel/bpf/syscall.c      | 8 +++++++-
 net/core/bpf_sk_storage.c | 3 ++-
 net/core/sock_map.c       | 3 ++-
 6 files changed, 15 insertions(+), 6 deletions(-)

Comments

Alexei Starovoitov Sept. 6, 2022, 6:21 p.m. UTC | #1
On Tue, Sep 6, 2022 at 10:04 AM Roberto Sassu
<roberto.sassu@huaweicloud.com> wrote:
> [...]

I think the current behavior is fine.
File permissions don't apply at iterator level or prog level.
fmode_can_read/write are for syscall commands only.
To be fair we've added them to lookup/delete commands
and it was more of a pain to maintain and no confirmed good use.

Roberto Sassu Sept. 7, 2022, 8:02 a.m. UTC | #2
On Tue, 2022-09-06 at 11:21 -0700, Alexei Starovoitov wrote:
> On Tue, Sep 6, 2022 at 10:04 AM Roberto Sassu
> <roberto.sassu@huaweicloud.com> wrote:
> > [...]
> 
> I think the current behavior is fine.
> File permissions don't apply at iterator level or prog level.

+ Chenbo, linux-security-module

Well, if you write a security module to prevent writes on a map, and
user space is able to do it anyway with an iterator, what is the
purpose of the security module then?

> fmode_can_read/write are for syscall commands only.
> To be fair we've added them to lookup/delete commands
> and it was more of a pain to maintain and no confirmed good use.

I think a good use would be requesting the right permission for the
type of operation that needs to be performed, e.g. read-only permission
when you have a read-like operation like a lookup or dump.

By always requesting read-write permission, for all operations,
security modules won't be able to distinguish which operation has to be
denied to satisfy the policy.

One example is that, when a security module prevents writes on maps
(would that be so uncommon?), bpftool is not able to show the full list
of maps, because it asks for read-write permission to get the map info.

Freezing the map is not a solution, if you want to allow certain
subjects to continuously update the protected map at run-time.

Roberto

Alexei Starovoitov Sept. 7, 2022, 4:02 p.m. UTC | #3
On Wed, Sep 7, 2022 at 1:03 AM Roberto Sassu
<roberto.sassu@huaweicloud.com> wrote:
>
> On Tue, 2022-09-06 at 11:21 -0700, Alexei Starovoitov wrote:
> > On Tue, Sep 6, 2022 at 10:04 AM Roberto Sassu
> > <roberto.sassu@huaweicloud.com> wrote:
> > > [...]
> >
> > I think the current behavior is fine.
> > File permissions don't apply at iterator level or prog level.
>
> + Chenbo, linux-security-module
>
> Well, if you write a security module to prevent writes on a map, and
> user space is able to do it anyway with an iterator, what is the
> purpose of the security module then?

sounds like a broken "security module" and nothing else.

> > fmode_can_read/write are for syscall commands only.
> > To be fair we've added them to lookup/delete commands
> > and it was more of a pain to maintain and no confirmed good use.
>
> I think a good use would be requesting the right permission for the
> type of operation that needs to be performed, e.g. read-only permission
> when you have a read-like operation like a lookup or dump.
>
> By always requesting read-write permission, for all operations,
> security modules won't be able to distinguish which operation has to be
> denied to satisfy the policy.
>
> One example is that, when a security module prevents writes on maps
> (would that be so uncommon?),

lsm that prevents writes into bpf maps? That's a convoluted design.
You can try to implement such an lsm, but expect lots of challenges.

> bpftool is not able to show
> the full list of maps because it asks for read-write permission for
> getting the map info.

completely orthogonal issue.

> Freezing the map is not a solution, if you want to allow certain
> subjects to continuously update the protected map at run-time.
>
> Roberto
>

Roberto Sassu Sept. 8, 2022, 1:58 p.m. UTC | #4
On Wed, 2022-09-07 at 09:02 -0700, Alexei Starovoitov wrote:
> 

[...]

> > Well, if you write a security module to prevent writes on a map,
> > and
> > user space is able to do it anyway with an iterator, what is the
> > purpose of the security module then?
> 
> sounds like a broken "security module" and nothing else.

Ok, if a custom security module does not convince you, let me make a
small example with SELinux.

I created a small map iterator that sets every value of a map to 5:

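/* Assumed headers; the original mail did not show them. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

SEC("iter/bpf_map_elem")
int write_bpf_hash_map(struct bpf_iter__bpf_map_elem *ctx)
{
	u32 *key = ctx->key;
	u8 *val = ctx->value;

	/* Both pointers are NULL on the final call, after the last element. */
	if (key == NULL || val == NULL)
		return 0;

	/* Overwrite the value of the current element in place. */
	*val = 5;
	return 0;
}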

I create and pin a map:

# bpftool map create /sys/fs/bpf/map type array key 4 value 1 entries 1 \
    name test

Initially, the content of the map looks like:

# bpftool map dump pinned /sys/fs/bpf/map 
key: 00 00 00 00  value: 00
Found 1 element

I then created a new SELinux type bpftool_test_t, which has only read
permission on maps:

# sesearch -A -s bpftool_test_t -t unconfined_t -c bpf
allow bpftool_test_t unconfined_t:bpf map_read;

So, what I expect is that this type is not able to write to the map.

Indeed, the current bpftool is not able to do it:

# strace -f -etrace=bpf runcon -t bpftool_test_t bpftool iter pin \
    writer.o /sys/fs/bpf/iter map pinned /sys/fs/bpf/map
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/map", bpf_fd=0, file_flags=0}, 144) = -1 EACCES (Permission denied)
Error: bpf obj get (/sys/fs/bpf): Permission denied

This happens because the current bpftool requests to access the map
with read-write permission, and SELinux denies it:

# cat /var/log/audit/audit.log|audit2allow


#============= bpftool_test_t ==============
allow bpftool_test_t unconfined_t:bpf map_write;


The command failed, and the content of the map is still:

# bpftool map dump pinned /sys/fs/bpf/map 
key: 00 00 00 00  value: 00
Found 1 element


Now, what I will do is to use a slightly modified version of bpftool
which requests read-only access to the map instead:

# strace -f -etrace=bpf runcon -t bpftool_test_t ./bpftool iter pin \
    writer.o /sys/fs/bpf/iter map pinned /sys/fs/bpf/map
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/map", bpf_fd=0, file_flags=BPF_F_RDONLY}, 16) = 3
libbpf: elf: skipping unrecognized data section(5) .eh_frame
libbpf: elf: skipping relo section(6) .rel.eh_frame for section(5) .eh_frame

...

bpf(BPF_LINK_CREATE, {link_create={prog_fd=4, target_fd=0, attach_type=BPF_TRACE_ITER, flags=0}, ...}, 48) = 5
bpf(BPF_OBJ_PIN, {pathname="/sys/fs/bpf/iter", bpf_fd=5, file_flags=0}, 16) = 0

That worked, because SELinux grants read-only permission to
bpftool_test_t. However, the map iterator does not check how the fd was
obtained, and thus allows the iterator to be created.

At this point, we have write access, despite not having the right to do
it:

# cat /sys/fs/bpf/iter
# bpftool map dump pinned /sys/fs/bpf/map 
key: 00 00 00 00  value: 05
Found 1 element

The iterator updated the map value.


The patch I'm proposing checks how the map fd was obtained, and if its
modes are compatible with the operations an attached program is allowed
to do. If the fd does not have the required modes, eBPF denies the
creation of the map iterator.

After patching the kernel, I try to run the modified bpftool again:

# strace -f -etrace=bpf runcon -t bpftool_test_t ./bpftool iter pin \
    writer.o /sys/fs/bpf/iter map pinned /sys/fs/bpf/map
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/map", bpf_fd=0, file_flags=BPF_F_RDONLY}, 16) = 3
libbpf: elf: skipping unrecognized data section(5) .eh_frame
libbpf: elf: skipping relo section(6) .rel.eh_frame for section(5) .eh_frame

...

bpf(BPF_LINK_CREATE, {link_create={prog_fd=4, target_fd=0, attach_type=BPF_TRACE_ITER, flags=0}, ...}, 48) = -1 EPERM (Operation not permitted)
libbpf: prog 'write_bpf_hash_map': failed to attach to iterator: Operation not permitted
Error: attach_iter failed for program write_bpf_hash_map

The map iterator cannot be created and the map is not updated:

# bpftool map dump pinned /sys/fs/bpf/map 
key: 00 00 00 00  value: 00
Found 1 element

Roberto

Alexei Starovoitov Sept. 8, 2022, 3:17 p.m. UTC | #5
On Thu, Sep 8, 2022 at 6:59 AM Roberto Sassu
<roberto.sassu@huaweicloud.com> wrote:
>
> On Wed, 2022-09-07 at 09:02 -0700, Alexei Starovoitov wrote:
> >
>
> [...]
>
> > > Well, if you write a security module to prevent writes on a map,
> > > and
> > > user space is able to do it anyway with an iterator, what is the
> > > purpose of the security module then?
> >
> > sounds like a broken "security module" and nothing else.
>
> [...]
>
> That worked, because SELinux grants read-only permission to
> bpftool_test_t. However, the map iterator does not check how the fd was
> obtained, and thus allows the iterator to be created.
>
> At this point, we have write access, despite not having the right to do
> it:

That is a wrong assumption to begin with.
Having an fd to a bpf object (map, link, prog) allows access.
read/write is sort of applicable to maps, but not so much
to progs or links.
That file based read/write flag is only for user processes.
bpf progs always had separate flags for that.
See BPF_F_RDONLY vs BPF_F_RDONLY_PROG.
One doesn't imply the other.
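
The separation between the two kinds of flags can be pictured with a small
user-space model. The MODEL_* names and bit values below are hypothetical
stand-ins, not the kernel's definitions; the sketch only shows that the two
restrictions live in different bits, so setting one says nothing about the
other.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for BPF_F_RDONLY and BPF_F_RDONLY_PROG. */
#define MODEL_F_RDONLY       (1U << 0)	/* limits access through the user space fd */
#define MODEL_F_RDONLY_PROG  (1U << 1)	/* limits access from BPF programs */

/* Write access from a user process through the fd. */
static bool model_syscall_may_write(unsigned int flags)
{
	return (flags & MODEL_F_RDONLY) == 0;
}

/* Write access from a BPF program, governed by a separate bit. */
static bool model_prog_may_write(unsigned int flags)
{
	return (flags & MODEL_F_RDONLY_PROG) == 0;
}
```

In this model, a map carrying only MODEL_F_RDONLY is read-only for the
process holding the fd, yet still writable from a program.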
Patch

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 9c1674973e03..6cd2ca910553 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1628,7 +1628,7 @@  bool bpf_map_equal_kptr_off_tab(const struct bpf_map *map_a, const struct bpf_ma
 void bpf_map_free_kptrs(struct bpf_map *map, void *map_value);
 
 struct bpf_map *bpf_map_get(u32 ufd);
-struct bpf_map *bpf_map_get_with_uref(u32 ufd);
+struct bpf_map *bpf_map_get_with_uref(u32 ufd, fmode_t req_modes);
 struct bpf_map *__bpf_map_get(struct fd f);
 void bpf_map_inc(struct bpf_map *map);
 void bpf_map_inc_with_uref(struct bpf_map *map);
diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index 4f841e16779e..862e1caa8b0f 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -71,7 +71,7 @@  static void *bpf_fd_probe_obj(u32 ufd, enum bpf_type *type)
 {
 	void *raw;
 
-	raw = bpf_map_get_with_uref(ufd);
+	raw = bpf_map_get_with_uref(ufd, 0);
 	if (!IS_ERR(raw)) {
 		*type = BPF_TYPE_MAP;
 		return raw;
diff --git a/kernel/bpf/map_iter.c b/kernel/bpf/map_iter.c
index b0fa190b0979..1143f8960135 100644
--- a/kernel/bpf/map_iter.c
+++ b/kernel/bpf/map_iter.c
@@ -110,7 +110,8 @@  static int bpf_iter_attach_map(struct bpf_prog *prog,
 	if (!linfo->map.map_fd)
 		return -EBADF;
 
-	map = bpf_map_get_with_uref(linfo->map.map_fd);
+	map = bpf_map_get_with_uref(linfo->map.map_fd,
+				    FMODE_CAN_READ | FMODE_CAN_WRITE);
 	if (IS_ERR(map))
 		return PTR_ERR(map);
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 4e9d4622aef7..4a2063d8e99c 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1232,7 +1232,7 @@  struct bpf_map *bpf_map_get(u32 ufd)
 }
 EXPORT_SYMBOL(bpf_map_get);
 
-struct bpf_map *bpf_map_get_with_uref(u32 ufd)
+struct bpf_map *bpf_map_get_with_uref(u32 ufd, fmode_t req_modes)
 {
 	struct fd f = fdget(ufd);
 	struct bpf_map *map;
@@ -1241,7 +1241,13 @@  struct bpf_map *bpf_map_get_with_uref(u32 ufd)
 	if (IS_ERR(map))
 		return map;
 
+	if ((map_get_sys_perms(map, f) & req_modes) != req_modes) {
+		map = ERR_PTR(-EPERM);
+		goto out;
+	}
+
 	bpf_map_inc_with_uref(map);
+out:
 	fdput(f);
 
 	return map;
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index 1b7f385643b4..bf9c6afed8ac 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -897,7 +897,8 @@  static int bpf_iter_attach_map(struct bpf_prog *prog,
 	if (!linfo->map.map_fd)
 		return -EBADF;
 
-	map = bpf_map_get_with_uref(linfo->map.map_fd);
+	map = bpf_map_get_with_uref(linfo->map.map_fd,
+				    FMODE_CAN_READ | FMODE_CAN_WRITE);
 	if (IS_ERR(map))
 		return PTR_ERR(map);
 
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index a660baedd9e7..7f7375dc39b2 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1636,7 +1636,8 @@  static int sock_map_iter_attach_target(struct bpf_prog *prog,
 	if (!linfo->map.map_fd)
 		return -EBADF;
 
-	map = bpf_map_get_with_uref(linfo->map.map_fd);
+	map = bpf_map_get_with_uref(linfo->map.map_fd,
+				    FMODE_CAN_READ | FMODE_CAN_WRITE);
 	if (IS_ERR(map))
 		return PTR_ERR(map);