Message ID | 20230412043300.360803-1-andrii@kernel.org (mailing list archive) |
---|---|
Series | New BPF map and BTF security LSM hooks |

On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which
> are meant to allow highly-granular LSM-based control over the usage of BPF
> subsystem. Specifically, to control the creation of BPF maps and BTF data
> objects, which are fundamental building blocks of any modern BPF application.
>
> These new hooks are able to override default kernel-side CAP_BPF-based (and
> sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to
> implement LSM policies that could granularly enforce more restrictions on
> a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN
> capabilities), but also, importantly, allow to *bypass kernel-side
> enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use
> cases.

One of the hallmarks of the LSM has always been that it is non-authoritative: it cannot unilaterally grant access, it can only restrict what would have been otherwise permitted on a traditional Linux system. Put another way, a LSM should not undermine the Linux discretionary access controls, e.g. capabilities.

If there is a problem with the eBPF capability-based access controls, that problem needs to be addressed in how the core eBPF code implements its capability checks, not by modifying the LSM mechanism to bypass these checks.

On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote:
> On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which
> > are meant to allow highly-granular LSM-based control over the usage of BPF
> > subsystem. Specifically, to control the creation of BPF maps and BTF data
> > objects, which are fundamental building blocks of any modern BPF application.
> >
> > These new hooks are able to override default kernel-side CAP_BPF-based (and
> > sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to
> > implement LSM policies that could granularly enforce more restrictions on
> > a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN
> > capabilities), but also, importantly, allow to *bypass kernel-side
> > enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use
> > cases.
>
> One of the hallmarks of the LSM has always been that it is
> non-authoritative: it cannot unilaterally grant access, it can only
> restrict what would have been otherwise permitted on a traditional
> Linux system. Put another way, a LSM should not undermine the Linux
> discretionary access controls, e.g. capabilities.
>
> If there is a problem with the eBPF capability-based access controls,
> that problem needs to be addressed in how the core eBPF code
> implements its capability checks, not by modifying the LSM mechanism
> to bypass these checks.

I think semantics matter here. I wouldn't view this as _bypassing_ capability enforcement: it's just more fine-grained access control.

For example, in many places we have things like:

    if (!some_check(...) && !capable(...))
        return -EPERM;

I would expect this is a similar logic. An operation can succeed if the access control requirement is met. The mismatch we have throughout the kernel is that capability checks aren't strictly done by LSM hooks. And this series conceptually, I think, doesn't violate that -- it's changing the logic of the capability checks, not the LSM (i.e. there are no LSM hooks yet here).

The reason CAP_BPF was created was because there was nothing else that would be fine-grained enough at the time.
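
For readers following along, the `some_check(...)` / `capable(...)` pattern quoted above can be sketched as a small userspace model. This is only an illustration under stated assumptions: `struct caller`, `check_access()`, and both boolean fields are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of the kernel pattern
 *     if (!some_check(...) && !capable(...)) return -EPERM;
 * The two flags stand in for some_check(...) and capable(...).
 */
struct caller {
    bool passes_some_check; /* models some_check(...) */
    bool has_capability;    /* models capable(...) */
};

/* Returns 0 when either check passes, -EPERM when both fail. */
int check_access(const struct caller *c)
{
    if (!c->passes_some_check && !c->has_capability)
        return -EPERM;
    return 0;
}

/* Usage: either condition alone is sufficient for access. */
void check_access_examples(void)
{
    struct caller neither  = { false, false };
    struct caller check_ok = { true,  false };
    struct caller cap_only = { false, true  };

    assert(check_access(&neither) == -EPERM);
    assert(check_access(&check_ok) == 0);
    assert(check_access(&cap_only) == 0);
}
```

In this model the capability is one of two alternative ways to pass, which is the sense in which the pattern is described as "more fine-grained access control" rather than a bypass.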
On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > > On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > > > > > > Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which > > > are meant to allow highly-granular LSM-based control over the usage of BPF > > > subsytem. Specifically, to control the creation of BPF maps and BTF data > > > objects, which are fundamental building blocks of any modern BPF application. > > > > > > These new hooks are able to override default kernel-side CAP_BPF-based (and > > > sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to > > > implement LSM policies that could granularly enforce more restrictions on > > > a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN > > > capabilities), but also, importantly, allow to *bypass kernel-side > > > enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use > > > cases. > > > > One of the hallmarks of the LSM has always been that it is > > non-authoritative: it cannot unilaterally grant access, it can only > > restrict what would have been otherwise permitted on a traditional > > Linux system. Put another way, a LSM should not undermine the Linux > > discretionary access controls, e.g. capabilities. > > > > If there is a problem with the eBPF capability-based access controls, > > that problem needs to be addressed in how the core eBPF code > > implements its capability checks, not by modifying the LSM mechanism > > to bypass these checks. > > I think semantics matter here. I wouldn't view this as _bypassing_ > capability enforcement: it's just more fine-grained access control. > > For example, in many places we have things like: > > if (!some_check(...) && !capable(...)) > return -EPERM; > > I would expect this is a similar logic. An operation can succeed if the > access control requirement is met. 
The mismatch we have through-out the > kernel is that capability checks aren't strictly done by LSM hooks. And > this series conceptually, I think, doesn't violate that -- it's changing > the logic of the capability checks, not the LSM (i.e. there no LSM hooks > yet here). Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which when it returns a positive value "bypasses kernel checks". The patch isn't based on either Linus' tree or the LSM tree, I'm guessing it is based on a eBPF tree, so I can't say with 100% certainty that it is bypassing a capability check, but the description claims that to be the case. Regardless of how you want to spin this, I'm not supportive of a LSM hook which allows a LSM to bypass a capability check. A LSM hook can be used to provide additional access control restrictions beyond a capability check, but a LSM hook should never be allowed to overrule an access denial due to a capability check. > The reason CAP_BPF was created was because there was nothing else that > would be fine-grained enough at the time. The LSM layer predates CAP_BPF, and one could make a very solid argument that one of the reasons LSMs exist is to provide supplementary controls due to capability-based access controls being a poor fit for many modern use cases. -- paul-moore.com
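
The restrictive-only model described in this reply can be sketched in userspace as follows. This is a hedged illustration, not code from the series: `create_map_restrictive()` and its parameters are hypothetical, with `has_cap_bpf` standing in for a capability check and `lsm_result` for an LSM hook's return value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of a restrictive-only hook: the capability check
 * runs first, and the hook result can only turn an allow into a deny.
 * lsm_result: 0 means "no objection", a negative errno means "deny".
 * Positive values are given no special meaning in this model.
 */
int create_map_restrictive(bool has_cap_bpf, int lsm_result)
{
    if (!has_cap_bpf)
        return -EPERM;     /* a capability denial is final */
    if (lsm_result < 0)
        return lsm_result; /* the LSM may further restrict */
    return 0;
}

/* Usage: no hook result can rescue a caller that lacks the capability. */
void create_map_restrictive_examples(void)
{
    assert(create_map_restrictive(true, 0) == 0);
    assert(create_map_restrictive(true, -EACCES) == -EACCES);
    assert(create_map_restrictive(false, 0) == -EPERM);
    assert(create_map_restrictive(false, 1) == -EPERM);
}
```

The last assertion is the point of contention in the thread: under this model a positive hook result cannot overrule the capability denial.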
On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: > On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > > On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > > > On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > > > > > > > > Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which > > > > are meant to allow highly-granular LSM-based control over the usage of BPF > > > > subsytem. Specifically, to control the creation of BPF maps and BTF data > > > > objects, which are fundamental building blocks of any modern BPF application. > > > > > > > > These new hooks are able to override default kernel-side CAP_BPF-based (and > > > > sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to > > > > implement LSM policies that could granularly enforce more restrictions on > > > > a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN > > > > capabilities), but also, importantly, allow to *bypass kernel-side > > > > enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use > > > > cases. > > > > > > One of the hallmarks of the LSM has always been that it is > > > non-authoritative: it cannot unilaterally grant access, it can only > > > restrict what would have been otherwise permitted on a traditional > > > Linux system. Put another way, a LSM should not undermine the Linux > > > discretionary access controls, e.g. capabilities. > > > > > > If there is a problem with the eBPF capability-based access controls, > > > that problem needs to be addressed in how the core eBPF code > > > implements its capability checks, not by modifying the LSM mechanism > > > to bypass these checks. > > > > I think semantics matter here. I wouldn't view this as _bypassing_ > > capability enforcement: it's just more fine-grained access control. > > > > For example, in many places we have things like: > > > > if (!some_check(...) 
&& !capable(...)) > > return -EPERM; > > > > I would expect this is a similar logic. An operation can succeed if the > > access control requirement is met. The mismatch we have through-out the > > kernel is that capability checks aren't strictly done by LSM hooks. And > > this series conceptually, I think, doesn't violate that -- it's changing > > the logic of the capability checks, not the LSM (i.e. there no LSM hooks > > yet here). > > Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > when it returns a positive value "bypasses kernel checks". The patch > isn't based on either Linus' tree or the LSM tree, I'm guessing it is > based on a eBPF tree, so I can't say with 100% certainty that it is > bypassing a capability check, but the description claims that to be > the case. > > Regardless of how you want to spin this, I'm not supportive of a LSM > hook which allows a LSM to bypass a capability check. A LSM hook can > be used to provide additional access control restrictions beyond a > capability check, but a LSM hook should never be allowed to overrule > an access denial due to a capability check. > > > The reason CAP_BPF was created was because there was nothing else that > > would be fine-grained enough at the time. > > The LSM layer predates CAP_BPF, and one could make a very solid > argument that one of the reasons LSMs exist is to provide > supplementary controls due to capability-based access controls being a > poor fit for many modern use cases. I generally agree with what you say, but we DO have this code pattern: if (!some_check(...) && !capable(...)) return -EPERM; It looks to me like this series can be refactored to do the same. I wouldn't consider that to be a "bypass", but I would agree the current series looks too much like "bypass", and makes reasoning about the effect of the LSM hooks too "special". :)
On 4/12/2023 11:06 AM, Paul Moore wrote: > On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: >> On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: >>> On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: >>>> Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which >>>> are meant to allow highly-granular LSM-based control over the usage of BPF >>>> subsytem. Specifically, to control the creation of BPF maps and BTF data >>>> objects, which are fundamental building blocks of any modern BPF application. >>>> >>>> These new hooks are able to override default kernel-side CAP_BPF-based (and >>>> sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to >>>> implement LSM policies that could granularly enforce more restrictions on >>>> a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN >>>> capabilities), but also, importantly, allow to *bypass kernel-side >>>> enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use >>>> cases. >>> One of the hallmarks of the LSM has always been that it is >>> non-authoritative: it cannot unilaterally grant access, it can only >>> restrict what would have been otherwise permitted on a traditional >>> Linux system. Put another way, a LSM should not undermine the Linux >>> discretionary access controls, e.g. capabilities. >>> >>> If there is a problem with the eBPF capability-based access controls, >>> that problem needs to be addressed in how the core eBPF code >>> implements its capability checks, not by modifying the LSM mechanism >>> to bypass these checks. Agreed. A lot of thought went into this. The LSM mechanism would be vastly different if the hooks were authoritative instead of restrictive. >> I think semantics matter here. I wouldn't view this as _bypassing_ >> capability enforcement: it's just more fine-grained access control. 
>> >> For example, in many places we have things like: >> >> if (!some_check(...) && !capable(...)) >> return -EPERM; >> >> I would expect this is a similar logic. An operation can succeed if the >> access control requirement is met. The mismatch we have through-out the >> kernel is that capability checks aren't strictly done by LSM hooks. And >> this series conceptually, I think, doesn't violate that -- it's changing >> the logic of the capability checks, not the LSM (i.e. there no LSM hooks >> yet here). > Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > when it returns a positive value "bypasses kernel checks". The patch > isn't based on either Linus' tree or the LSM tree, I'm guessing it is > based on a eBPF tree, so I can't say with 100% certainty that it is > bypassing a capability check, but the description claims that to be > the case. > > Regardless of how you want to spin this, I'm not supportive of a LSM > hook which allows a LSM to bypass a capability check. A LSM hook can > be used to provide additional access control restrictions beyond a > capability check, but a LSM hook should never be allowed to overrule > an access denial due to a capability check. > >> The reason CAP_BPF was created was because there was nothing else that >> would be fine-grained enough at the time. There's nothing stopping you from having a fine grained mechanism that further restricts a process with CAP_BPF. SELinux implements many checks that can, policy willing, restrict a process with a capability from doing what the capability permits. > The LSM layer predates CAP_BPF, and one could make a very solid > argument that one of the reasons LSMs exist is to provide > supplementary controls due to capability-based access controls being a > poor fit for many modern use cases. > > -- > paul-moore.com
On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: > On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: > > On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > > > On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > > > > On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > > > > > > > > > > Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which > > > > > are meant to allow highly-granular LSM-based control over the usage of BPF > > > > > subsytem. Specifically, to control the creation of BPF maps and BTF data > > > > > objects, which are fundamental building blocks of any modern BPF application. > > > > > > > > > > These new hooks are able to override default kernel-side CAP_BPF-based (and > > > > > sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to > > > > > implement LSM policies that could granularly enforce more restrictions on > > > > > a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN > > > > > capabilities), but also, importantly, allow to *bypass kernel-side > > > > > enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use > > > > > cases. > > > > > > > > One of the hallmarks of the LSM has always been that it is > > > > non-authoritative: it cannot unilaterally grant access, it can only > > > > restrict what would have been otherwise permitted on a traditional > > > > Linux system. Put another way, a LSM should not undermine the Linux > > > > discretionary access controls, e.g. capabilities. > > > > > > > > If there is a problem with the eBPF capability-based access controls, > > > > that problem needs to be addressed in how the core eBPF code > > > > implements its capability checks, not by modifying the LSM mechanism > > > > to bypass these checks. > > > > > > I think semantics matter here. 
I wouldn't view this as _bypassing_ > > > capability enforcement: it's just more fine-grained access control. > > > > > > For example, in many places we have things like: > > > > > > if (!some_check(...) && !capable(...)) > > > return -EPERM; > > > > > > I would expect this is a similar logic. An operation can succeed if the > > > access control requirement is met. The mismatch we have through-out the > > > kernel is that capability checks aren't strictly done by LSM hooks. And > > > this series conceptually, I think, doesn't violate that -- it's changing > > > the logic of the capability checks, not the LSM (i.e. there no LSM hooks > > > yet here). > > > > Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > > when it returns a positive value "bypasses kernel checks". The patch > > isn't based on either Linus' tree or the LSM tree, I'm guessing it is > > based on a eBPF tree, so I can't say with 100% certainty that it is > > bypassing a capability check, but the description claims that to be > > the case. > > > > Regardless of how you want to spin this, I'm not supportive of a LSM > > hook which allows a LSM to bypass a capability check. A LSM hook can > > be used to provide additional access control restrictions beyond a > > capability check, but a LSM hook should never be allowed to overrule > > an access denial due to a capability check. > > > > > The reason CAP_BPF was created was because there was nothing else that > > > would be fine-grained enough at the time. > > > > The LSM layer predates CAP_BPF, and one could make a very solid > > argument that one of the reasons LSMs exist is to provide > > supplementary controls due to capability-based access controls being a > > poor fit for many modern use cases. > > I generally agree with what you say, but we DO have this code pattern: > > if (!some_check(...) 
&& !capable(...)) > return -EPERM; I think we need to make this more concrete; we don't have a pattern in the upstream kernel where 'some_check(...)' is a LSM hook, right? Simply because there is another kernel access control mechanism which allows a capability check to be skipped doesn't mean I want to allow a LSM hook to be used to skip a capability check. > It looks to me like this series can be refactored to do the same. I > wouldn't consider that to be a "bypass", but I would agree the current > series looks too much like "bypass", and makes reasoning about the > effect of the LSM hooks too "special". :)
On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: > > On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: > > On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: > > > On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > > > > On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > > > > > On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > > > > > > > > > > > > Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which > > > > > > are meant to allow highly-granular LSM-based control over the usage of BPF > > > > > > subsytem. Specifically, to control the creation of BPF maps and BTF data > > > > > > objects, which are fundamental building blocks of any modern BPF application. > > > > > > > > > > > > These new hooks are able to override default kernel-side CAP_BPF-based (and > > > > > > sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to > > > > > > implement LSM policies that could granularly enforce more restrictions on > > > > > > a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN > > > > > > capabilities), but also, importantly, allow to *bypass kernel-side > > > > > > enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use > > > > > > cases. > > > > > > > > > > One of the hallmarks of the LSM has always been that it is > > > > > non-authoritative: it cannot unilaterally grant access, it can only > > > > > restrict what would have been otherwise permitted on a traditional > > > > > Linux system. Put another way, a LSM should not undermine the Linux > > > > > discretionary access controls, e.g. capabilities. > > > > > > > > > > If there is a problem with the eBPF capability-based access controls, > > > > > that problem needs to be addressed in how the core eBPF code > > > > > implements its capability checks, not by modifying the LSM mechanism > > > > > to bypass these checks. 
> > > > > > > > I think semantics matter here. I wouldn't view this as _bypassing_ > > > > capability enforcement: it's just more fine-grained access control. Exactly. One of the motivations for this work was the need to move some production use cases that are only needing extra privileges so that they can use BPF into a more restrictive environment. Granting CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN to all such use cases that need them for BPF usage is too coarse grained. These caps would allow those applications way more than just BPF usage. So the idea here is more finer-grained control of BPF-specific operations, granting *effective* CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN caps dynamically based on custom production logic that would validate the use case. This *is* an attempt to achieve a more secure production approach. > > > > > > > > For example, in many places we have things like: > > > > > > > > if (!some_check(...) && !capable(...)) > > > > return -EPERM; > > > > > > > > I would expect this is a similar logic. An operation can succeed if the > > > > access control requirement is met. The mismatch we have through-out the > > > > kernel is that capability checks aren't strictly done by LSM hooks. And > > > > this series conceptually, I think, doesn't violate that -- it's changing > > > > the logic of the capability checks, not the LSM (i.e. there no LSM hooks > > > > yet here). > > > > > > Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > > > when it returns a positive value "bypasses kernel checks". The patch > > > isn't based on either Linus' tree or the LSM tree, I'm guessing it is > > > based on a eBPF tree, so I can't say with 100% certainty that it is > > > bypassing a capability check, but the description claims that to be > > > the case. > > > > > > Regardless of how you want to spin this, I'm not supportive of a LSM > > > hook which allows a LSM to bypass a capability check. 
A LSM hook can > > > be used to provide additional access control restrictions beyond a > > > capability check, but a LSM hook should never be allowed to overrule > > > an access denial due to a capability check. > > > > > > > The reason CAP_BPF was created was because there was nothing else that > > > > would be fine-grained enough at the time. > > > > > > The LSM layer predates CAP_BPF, and one could make a very solid > > > argument that one of the reasons LSMs exist is to provide > > > supplementary controls due to capability-based access controls being a > > > poor fit for many modern use cases. > > > > I generally agree with what you say, but we DO have this code pattern: > > > > if (!some_check(...) && !capable(...)) > > return -EPERM; > > I think we need to make this more concrete; we don't have a pattern in > the upstream kernel where 'some_check(...)' is a LSM hook, right? > Simply because there is another kernel access control mechanism which > allows a capability check to be skipped doesn't mean I want to allow a > LSM hook to be used to skip a capability check. This work is an attempt to tighten the security of production systems by allowing to drop too coarse-grained and permissive capabilities (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more than production use cases are meant to be able to do) and then grant specific BPF operations on specific BPF programs/maps based on custom LSM security policy, which validates application trustworthiness using custom production-specific logic. Isn't this goal in line with LSMs mission to enhance system security? > > > It looks to me like this series can be refactored to do the same. I > > wouldn't consider that to be a "bypass", but I would agree the current > > series looks too much like "bypass", and makes reasoning about the > > effect of the LSM hooks too "special". :) Sorry, I didn't realize that the current code layout is making things more confusing. 
I'll address feedback to make the intent a bit clearer.

>
> --
> paul-moore.com
On Wed, Apr 12, 2023 at 9:43 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: > > On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: > > > On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: > > > > On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > > > > > On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > > > > > > On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: ... > > > > > For example, in many places we have things like: > > > > > > > > > > if (!some_check(...) && !capable(...)) > > > > > return -EPERM; > > > > > > > > > > I would expect this is a similar logic. An operation can succeed if the > > > > > access control requirement is met. The mismatch we have through-out the > > > > > kernel is that capability checks aren't strictly done by LSM hooks. And > > > > > this series conceptually, I think, doesn't violate that -- it's changing > > > > > the logic of the capability checks, not the LSM (i.e. there no LSM hooks > > > > > yet here). > > > > > > > > Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > > > > when it returns a positive value "bypasses kernel checks". The patch > > > > isn't based on either Linus' tree or the LSM tree, I'm guessing it is > > > > based on a eBPF tree, so I can't say with 100% certainty that it is > > > > bypassing a capability check, but the description claims that to be > > > > the case. > > > > > > > > Regardless of how you want to spin this, I'm not supportive of a LSM > > > > hook which allows a LSM to bypass a capability check. A LSM hook can > > > > be used to provide additional access control restrictions beyond a > > > > capability check, but a LSM hook should never be allowed to overrule > > > > an access denial due to a capability check. 
> > > > > > > > > The reason CAP_BPF was created was because there was nothing else that > > > > > would be fine-grained enough at the time. > > > > > > > > The LSM layer predates CAP_BPF, and one could make a very solid > > > > argument that one of the reasons LSMs exist is to provide > > > > supplementary controls due to capability-based access controls being a > > > > poor fit for many modern use cases. > > > > > > I generally agree with what you say, but we DO have this code pattern: > > > > > > if (!some_check(...) && !capable(...)) > > > return -EPERM; > > > > I think we need to make this more concrete; we don't have a pattern in > > the upstream kernel where 'some_check(...)' is a LSM hook, right? > > Simply because there is another kernel access control mechanism which > > allows a capability check to be skipped doesn't mean I want to allow a > > LSM hook to be used to skip a capability check. > > This work is an attempt to tighten the security of production systems > by allowing to drop too coarse-grained and permissive capabilities > (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more > than production use cases are meant to be able to do) and then grant > specific BPF operations on specific BPF programs/maps based on custom > LSM security policy, which validates application trustworthiness using > custom production-specific logic. There are ways to leverage the LSMs to apply finer grained access control on top of the relatively coarse capabilities that do not require circumventing those capability controls. One grants the capabilities, just as one would do today, and then leverages the security functionality of a LSM to further restrict specific users, applications, etc. with a level of granularity beyond that offered by the capability controls.
On Wed, Apr 12, 2023 at 7:56 PM Paul Moore <paul@paul-moore.com> wrote: > > On Wed, Apr 12, 2023 at 9:43 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: > > > On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: > > > > On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: > > > > > On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > > > > > > On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > > > > > > > On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > > ... > > > > > > > For example, in many places we have things like: > > > > > > > > > > > > if (!some_check(...) && !capable(...)) > > > > > > return -EPERM; > > > > > > > > > > > > I would expect this is a similar logic. An operation can succeed if the > > > > > > access control requirement is met. The mismatch we have through-out the > > > > > > kernel is that capability checks aren't strictly done by LSM hooks. And > > > > > > this series conceptually, I think, doesn't violate that -- it's changing > > > > > > the logic of the capability checks, not the LSM (i.e. there no LSM hooks > > > > > > yet here). > > > > > > > > > > Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > > > > > when it returns a positive value "bypasses kernel checks". The patch > > > > > isn't based on either Linus' tree or the LSM tree, I'm guessing it is > > > > > based on a eBPF tree, so I can't say with 100% certainty that it is > > > > > bypassing a capability check, but the description claims that to be > > > > > the case. > > > > > > > > > > Regardless of how you want to spin this, I'm not supportive of a LSM > > > > > hook which allows a LSM to bypass a capability check. 
A LSM hook can > > > > > be used to provide additional access control restrictions beyond a > > > > > capability check, but a LSM hook should never be allowed to overrule > > > > > an access denial due to a capability check. > > > > > > > > > > > The reason CAP_BPF was created was because there was nothing else that > > > > > > would be fine-grained enough at the time. > > > > > > > > > > The LSM layer predates CAP_BPF, and one could make a very solid > > > > > argument that one of the reasons LSMs exist is to provide > > > > > supplementary controls due to capability-based access controls being a > > > > > poor fit for many modern use cases. > > > > > > > > I generally agree with what you say, but we DO have this code pattern: > > > > > > > > if (!some_check(...) && !capable(...)) > > > > return -EPERM; > > > > > > I think we need to make this more concrete; we don't have a pattern in > > > the upstream kernel where 'some_check(...)' is a LSM hook, right? > > > Simply because there is another kernel access control mechanism which > > > allows a capability check to be skipped doesn't mean I want to allow a > > > LSM hook to be used to skip a capability check. > > > > This work is an attempt to tighten the security of production systems > > by allowing to drop too coarse-grained and permissive capabilities > > (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more > > than production use cases are meant to be able to do) and then grant > > specific BPF operations on specific BPF programs/maps based on custom > > LSM security policy, which validates application trustworthiness using > > custom production-specific logic. > > There are ways to leverage the LSMs to apply finer grained access > control on top of the relatively coarse capabilities that do not > require circumventing those capability controls. 
One grants the > capabilities, just as one would do today, and then leverages the > security functionality of a LSM to further restrict specific users, > applications, etc. with a level of granularity beyond that offered by > the capability controls. Please help me understand something. What you and Casey are proposing, when taken to the logical extreme, is to grant to all processes root permissions and then use LSM to restrict specific actions, do I understand correctly? This strikes me as a less secure and more error-prone way of doing things. If there is some problem with installing LSM policy, it could go unnoticed for a really long time, while the system would be way more vulnerable. Why do you prefer such an approach instead of going with no extra permissions by default, but allowing custom LSM policy to grant few exceptions for known and trusted use cases? By the way, even the above proposal of yours doesn't work for production use cases when user namespaces are involved, as far as I understand. We cannot grant CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN for containers running inside user namespaces, as CAP_BPF in non-init namespace is not enough for bpf() syscall to allow loading BPF maps or BPF program (bpf() doesn't do ns_capable(), it's only using capable()). What solution would you suggest for such production setups? Also, in previous email you said: > Simply because there is another kernel access control mechanism which > allows a capability check to be skipped doesn't mean I want to allow a > LSM hook to be used to skip a capability check. I understand your stated position, but can you please help me understand the reasoning behind it? What would be wrong with some LSM hooks granting effective capabilities? How would that change anything about LSM design? As far as I can see, I'm not doing anything crazy with my LSM hook implementation. It's reusing the standard call_int_hook() mechanism very straightforwardly with a default result of 0. 
And then just interprets 0, <0, and >0 results accordingly. Is that abusing the LSM mechanism itself somehow? Does the above also mean that you'd be fine if we just don't plug into the LSM subsystem at all and instead come up with some ad-hoc solution to allow effectively the same policies? This sounds detrimental both to LSM and BPF subsystems, so I hope we can talk this through before finalizing decisions. Lastly, you mentioned before: > > > I think we need to make this more concrete; we don't have a pattern in > > > the upstream kernel where 'some_check(...)' is a LSM hook, right? Unfortunately I don't have enough familiarity with all LSM hooks, so I can't confirm or disprove the above statement. But earlier someone brought to my attention the case of security_vm_enough_memory_mm(), which seems to be granting effectively CAP_SYS_ADMIN for the purposes of memory accounting. Am I missing something subtle there or does it grant effective caps indeed? > > -- > paul-moore.com
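The three-way interpretation of the hook result described in the message above can be modeled in userspace. This is only a sketch under stated assumptions: `model_map_create_check()` and its parameters are hypothetical names, not code from the patch series, and the reading of a zero result as "fall back to the usual capability check" is an assumption inferred from the discussion.

```c
#include <errno.h>

/*
 * Hypothetical userspace model of the tri-state hook result discussed in
 * the thread. Nothing here is a kernel API.
 *
 *   hook_rc < 0  -> the LSM denies the operation
 *   hook_rc > 0  -> the LSM grants it and the capability check is skipped
 *   hook_rc == 0 -> no opinion; the capability check decides (assumption)
 *
 * has_cap stands in for the outcome of a capable(CAP_BPF)-style check.
 */
static int model_map_create_check(int hook_rc, int has_cap)
{
    if (hook_rc < 0)
        return hook_rc;            /* explicit denial from the LSM */
    if (hook_rc > 0)
        return 0;                  /* explicit grant: capability not consulted */
    return has_cap ? 0 : -EPERM;   /* default path: capability decides */
}
```

Under this model, `model_map_create_check(1, 0)` succeeds even though the capability is absent, which is the authoritative behavior the replies in this thread object to.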
On Thu, Apr 13, 2023 at 1:16 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > On Wed, Apr 12, 2023 at 7:56 PM Paul Moore <paul@paul-moore.com> wrote: > > On Wed, Apr 12, 2023 at 9:43 PM Andrii Nakryiko > > <andrii.nakryiko@gmail.com> wrote: > > > On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: > > > > On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: > > > > > On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: > > > > > > On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > > > > > > > On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > > > > > > > > On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > > > > ... > > > > > > > > > For example, in many places we have things like: > > > > > > > > > > > > > > if (!some_check(...) && !capable(...)) > > > > > > > return -EPERM; > > > > > > > > > > > > > > I would expect this is a similar logic. An operation can succeed if the > > > > > > > access control requirement is met. The mismatch we have through-out the > > > > > > > kernel is that capability checks aren't strictly done by LSM hooks. And > > > > > > > this series conceptually, I think, doesn't violate that -- it's changing > > > > > > > the logic of the capability checks, not the LSM (i.e. there no LSM hooks > > > > > > > yet here). > > > > > > > > > > > > Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > > > > > > when it returns a positive value "bypasses kernel checks". The patch > > > > > > isn't based on either Linus' tree or the LSM tree, I'm guessing it is > > > > > > based on a eBPF tree, so I can't say with 100% certainty that it is > > > > > > bypassing a capability check, but the description claims that to be > > > > > > the case. > > > > > > > > > > > > Regardless of how you want to spin this, I'm not supportive of a LSM > > > > > > hook which allows a LSM to bypass a capability check. 
A LSM hook can > > > > > > be used to provide additional access control restrictions beyond a > > > > > > capability check, but a LSM hook should never be allowed to overrule > > > > > > an access denial due to a capability check. > > > > > > > > > > > > > The reason CAP_BPF was created was because there was nothing else that > > > > > > > would be fine-grained enough at the time. > > > > > > > > > > > > The LSM layer predates CAP_BPF, and one could make a very solid > > > > > > argument that one of the reasons LSMs exist is to provide > > > > > > supplementary controls due to capability-based access controls being a > > > > > > poor fit for many modern use cases. > > > > > > > > > > I generally agree with what you say, but we DO have this code pattern: > > > > > > > > > > if (!some_check(...) && !capable(...)) > > > > > return -EPERM; > > > > > > > > I think we need to make this more concrete; we don't have a pattern in > > > > the upstream kernel where 'some_check(...)' is a LSM hook, right? > > > > Simply because there is another kernel access control mechanism which > > > > allows a capability check to be skipped doesn't mean I want to allow a > > > > LSM hook to be used to skip a capability check. > > > > > > This work is an attempt to tighten the security of production systems > > > by allowing to drop too coarse-grained and permissive capabilities > > > (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more > > > than production use cases are meant to be able to do) and then grant > > > specific BPF operations on specific BPF programs/maps based on custom > > > LSM security policy, which validates application trustworthiness using > > > custom production-specific logic. > > > > There are ways to leverage the LSMs to apply finer grained access > > control on top of the relatively coarse capabilities that do not > > require circumventing those capability controls. 
One grants the > > capabilities, just as one would do today, and then leverages the > > security functionality of a LSM to further restrict specific users, > > applications, etc. with a level of granularity beyond that offered by > > the capability controls. > > Please help me understand something. What you and Casey are proposing, > when taken to the logical extreme, is to grant to all processes root > permissions and then use LSM to restrict specific actions, do I > understand correctly? This strikes me as a less secure and more > error-prone way of doing things. When taken to the "logical extreme" most concepts end up sounding a bit absurd, but that was the point, wasn't it? Here is a fun story which seems relevant ... in the early days of SELinux, one of the community devs set up a system with a SELinux policy which restricted all privileged operations from the root user, put the system on a publicly accessible network, posted the root password for all to see, and invited the public to log in to the system and attempt to exercise root privilege (it's been well over 10 years at this point so the details are a bit fuzzy). Granted, there were some hiccups in the beginning, mostly due to the crude state of policy development/analysis at the time, but after a few policy revisions the system held up quite well. On the more practical side of things, there are several use cases which require, by way of legal or contractual requirements, that full root/admin privileges are decomposed into separate roles: security admin, audit admin, backup admin, etc. These users satisfy these requirements by using LSMs, such as SELinux, to restrict the administrative capabilities based on the SELinux user/role/domain. > By the way, even the above proposal of yours doesn't work for > production use cases when user namespaces are involved, as far as I > understand.
We cannot grant CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN for > containers running inside user namespaces, as CAP_BPF in non-init > namespace is not enough for bpf() syscall to allow loading BPF maps or > BPF program ... Once again, the LSM has always been intended to be a restrictive mechanism, not a privilege-granting mechanism. If an operation is not possible without the LSM layer enabled, it should not be possible with the LSM layer enabled. The LSM is not a mechanism to circumvent other access control mechanisms in the kernel. > Also, in previous email you said: > > > Simply because there is another kernel access control mechanism which > > allows a capability check to be skipped doesn't mean I want to allow a > > LSM hook to be used to skip a capability check. > > I understand your stated position, but can you please help me > understand the reasoning behind it? Keeping the LSM as a restrictive access control mechanism helps ensure some level of sanity and consistency across different Linux installations. If a certain operation requires CAP_SYS_ADMIN on one Linux system, it should require CAP_SYS_ADMIN on another Linux system. Granted, a LSM running on one system might impose additional constraints on that operation, but the CAP_SYS_ADMIN requirement still applies. There is also an issue of safety in knowing that enabling a LSM will not degrade the access controls on a system by potentially granting operations that were previously denied. > Does the above also mean that you'd be fine if we just don't plug into > the LSM subsystem at all and instead come up with some ad-hoc solution > to allow effectively the same policies? This sounds detrimental both > to LSM and BPF subsystems, so I hope we can talk this through before > finalizing decisions.
Based on your patches and our discussion, it seems to me that the problem you are trying to resolve is related more to the capability-based access controls in the eBPF, and possibly other kernel subsystems, and not any LSM-based restrictions. I'm happy to work with you on a solution involving the LSM, but please understand that I'm not going to support a solution which changes a core philosophy of the LSM layer. > Lastly, you mentioned before: > > > > > I think we need to make this more concrete; we don't have a pattern in > > > > the upstream kernel where 'some_check(...)' is a LSM hook, right? > > Unfortunately I don't have enough familiarity with all LSM hooks, so I > can't confirm or disprove the above statement. But earlier someone > brought to my attention the case of security_vm_enough_memory_mm(), > which seems to be granting effectively CAP_SYS_ADMIN for the purposes > of memory accounting. Am I missing something subtle there or does it > grant effective caps indeed? Some of the comments around that hook can be misleading, but if you look at the actual code it starts to make more sense. First, look at the LSM-disabled case and you'll see that the security_vm_enough_memory_mm() hook ends up looking like this: int security_vm_enough_memory_mm(...) { return __vm_enough_memory(mm, pages, cap_vm_enough_memory(mm, pages)); } ... which basically calls into the core capability code to check for CAP_SYS_ADMIN, passing the result onto __vm_enough_memory. If we then look at the LSM-enabled case, things are a little more complicated, but it looks something like this: int security_vm_enough_memory_mm(...) { int cap_admin = 1; for_each_lsm_hook(...) { rc = lsm_hook(...); if (rc <= 0) { cap_admin = 0; break; } } return __vm_enough_memory(mm, pages, cap_admin); } ... which as the comment says, "If all of the modules agree that it should be set it will. If any module thinks it should not be set it won't.". 
However, if we look at which LSMs define vm_enough_memory() hooks we see just two: the capability LSM, and SELinux. The capability LSM[1] just uses cap_vm_enough_memory() so that's straightforward, and the SELinux hook is selinux_vm_enough_memory(), which simply checks the loaded SELinux policy to see if the current task has permission to exercise the CAP_SYS_ADMIN capability. SELinux can't grant CAP_SYS_ADMIN beyond what the capability code permits, it only restricts its use. Put another way, if the capability code does not allow CAP_SYS_ADMIN in a call to security_vm_enough_memory() then CAP_SYS_ADMIN will not be granted regardless of what the other LSMs may decide. I do agree that the security_vm_enough_memory() hook is structured a bit differently than most of the other LSM hooks, but it still operates with the same philosophy: a LSM should only be allowed to restrict access, a LSM should never be allowed to grant access that would otherwise be denied by the traditional Linux access controls. Hopefully that explanation makes sense, but if things are still a bit fuzzy I would encourage you to go look at the code, I'm sure it will make sense once you spend a few minutes figuring out how it works. [1] There is a long and sorta bizarre history with the capability LSM, but just understand it is a bit "special" in many ways, and those "special" behaviors are intentional. -- paul-moore.com
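The aggregation described in the message above, where every registered hook must agree before `cap_admin` stays set, can be modeled in userspace as follows. This is a sketch under stated assumptions: the names `vm_hook_fn`, `model_cap_hook`, `model_policy_allow`, `model_policy_deny`, and `model_cap_admin` are hypothetical stand-ins, not kernel symbols, and the loop only mirrors the pseudocode quoted in the message rather than the real implementation.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
typedef int (*vm_hook_fn)(void);

/* Stand-in for the capability LSM: positive only if the task holds
 * CAP_SYS_ADMIN (modeled here as a plain flag). */
static int task_has_cap_sys_admin = 0;
static int model_cap_hook(void) { return task_has_cap_sys_admin ? 1 : 0; }

/* Stand-ins for a second LSM whose policy permits or forbids use of
 * the capability. */
static int model_policy_allow(void) { return 1; }
static int model_policy_deny(void)  { return 0; }

/*
 * cap_admin starts at 1 and is cleared as soon as any hook returns <= 0.
 * A hook can therefore only take the privilege away; it cannot restore
 * it once the capability hook has said no.
 */
static int model_cap_admin(const vm_hook_fn *hooks, size_t n)
{
    int cap_admin = 1;

    for (size_t i = 0; i < n; i++) {
        if (hooks[i]() <= 0) {
            cap_admin = 0;
            break;
        }
    }
    return cap_admin;
}
```

With the hook list `{ model_cap_hook, model_policy_allow }`, the result is 1 only when `task_has_cap_sys_admin` is set. A permissive second hook does not change the outcome when the capability is missing, which matches the restrict-only behavior described in the message.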
On 4/12/2023 6:43 PM, Andrii Nakryiko wrote: > On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: >> On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: >>> On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: >>>> On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: >>>>> On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: >>>>>> On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: >>>>>>> Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which >>>>>>> are meant to allow highly-granular LSM-based control over the usage of BPF >>>>>>> subsytem. Specifically, to control the creation of BPF maps and BTF data >>>>>>> objects, which are fundamental building blocks of any modern BPF application. >>>>>>> >>>>>>> These new hooks are able to override default kernel-side CAP_BPF-based (and >>>>>>> sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to >>>>>>> implement LSM policies that could granularly enforce more restrictions on >>>>>>> a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN >>>>>>> capabilities), but also, importantly, allow to *bypass kernel-side >>>>>>> enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use >>>>>>> cases. >>>>>> One of the hallmarks of the LSM has always been that it is >>>>>> non-authoritative: it cannot unilaterally grant access, it can only >>>>>> restrict what would have been otherwise permitted on a traditional >>>>>> Linux system. Put another way, a LSM should not undermine the Linux >>>>>> discretionary access controls, e.g. capabilities. >>>>>> >>>>>> If there is a problem with the eBPF capability-based access controls, >>>>>> that problem needs to be addressed in how the core eBPF code >>>>>> implements its capability checks, not by modifying the LSM mechanism >>>>>> to bypass these checks. >>>>> I think semantics matter here. 
I wouldn't view this as _bypassing_ >>>>> capability enforcement: it's just more fine-grained access control. > Exactly. One of the motivations for this work was the need to move > some production use cases that are only needing extra privileges so > that they can use BPF into a more restrictive environment. Granting > CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN to all such use cases that need them > for BPF usage is too coarse grained. These caps would allow those > applications way more than just BPF usage. So the idea here is more > finer-grained control of BPF-specific operations, granting *effective* > CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN caps dynamically based on custom > production logic that would validate the use case. That's an authoritative model which is in direct conflict with the design and implementation of both capabilities and LSM. > > This *is* an attempt to achieve a more secure production approach. > >>>>> For example, in many places we have things like: >>>>> >>>>> if (!some_check(...) && !capable(...)) >>>>> return -EPERM; >>>>> >>>>> I would expect this is a similar logic. An operation can succeed if the >>>>> access control requirement is met. The mismatch we have through-out the >>>>> kernel is that capability checks aren't strictly done by LSM hooks. And >>>>> this series conceptually, I think, doesn't violate that -- it's changing >>>>> the logic of the capability checks, not the LSM (i.e. there no LSM hooks >>>>> yet here). >>>> Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which >>>> when it returns a positive value "bypasses kernel checks". The patch >>>> isn't based on either Linus' tree or the LSM tree, I'm guessing it is >>>> based on a eBPF tree, so I can't say with 100% certainty that it is >>>> bypassing a capability check, but the description claims that to be >>>> the case. >>>> >>>> Regardless of how you want to spin this, I'm not supportive of a LSM >>>> hook which allows a LSM to bypass a capability check. 
A LSM hook can >>>> be used to provide additional access control restrictions beyond a >>>> capability check, but a LSM hook should never be allowed to overrule >>>> an access denial due to a capability check. >>>> >>>>> The reason CAP_BPF was created was because there was nothing else that >>>>> would be fine-grained enough at the time. >>>> The LSM layer predates CAP_BPF, and one could make a very solid >>>> argument that one of the reasons LSMs exist is to provide >>>> supplementary controls due to capability-based access controls being a >>>> poor fit for many modern use cases. >>> I generally agree with what you say, but we DO have this code pattern: >>> >>> if (!some_check(...) && !capable(...)) >>> return -EPERM; >> I think we need to make this more concrete; we don't have a pattern in >> the upstream kernel where 'some_check(...)' is a LSM hook, right? >> Simply because there is another kernel access control mechanism which >> allows a capability check to be skipped doesn't mean I want to allow a >> LSM hook to be used to skip a capability check. > This work is an attempt to tighten the security of production systems > by allowing to drop too coarse-grained and permissive capabilities > (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more > than production use cases are meant to be able to do) The BPF developers are in complete control of what CAP_BPF controls. You can easily address the granularity issue by adding additional restrictions on processes that have CAP_BPF. That is the intended use of LSM. The whole point of having multiple capabilities is so that you can grant just those that are required by the system security policy, and do so safely. That leads to differences of opinion regarding the definition of the system security policy. BPF chose to set itself up as an element of security policy (you need CAP_BPF) rather than define elements such that existing capabilities (CAP_FOWNER, CAP_KILL, CAP_MAC_OVERRIDE, ...) would control.
> and then grant > specific BPF operations on specific BPF programs/maps based on custom > LSM security policy, This is backwards. The correct implementation is to require CAP_BPF and further restrict BPF operations based on a custom LSM security policy. That's how LSM is designed. > which validates application trustworthiness using > custom production-specific logic. > > Isn't this goal in line with LSMs mission to enhance system security? We're not arguing the goal, we're discussing the implementation. >>> It looks to me like this series can be refactored to do the same. I >>> wouldn't consider that to be a "bypass", but I would agree the current >>> series looks too much like "bypass", and makes reasoning about the >>> effect of the LSM hooks too "special". :) > Sorry, I didn't realize that the current code layout is making things > more confusing. I'll address feedback to make the intent a bit > clearer. > >> -- >> paul-moore.com
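The ordering argued for in the message above, where the capability is required first and an LSM may only narrow the result, can be modeled in userspace. This is a sketch under stated assumptions: `model_restrictive_check()` and its parameters are hypothetical and do not correspond to any kernel function.

```c
#include <errno.h>

/*
 * Hypothetical userspace model of a restrictive-only check. Nothing here
 * is a kernel API.
 *
 * has_cap stands in for the outcome of a capable(CAP_BPF)-style check.
 * hook_rc is the LSM's answer: negative means deny, anything else means
 * the LSM has no objection.
 */
static int model_restrictive_check(int hook_rc, int has_cap)
{
    if (!has_cap)
        return -EPERM;     /* capability is always required */
    if (hook_rc < 0)
        return hook_rc;    /* the LSM may add a denial on top */
    return 0;              /* both layers agree */
}
```

In this model a positive hook result carries no extra meaning, so `model_restrictive_check(1, 0)` still fails with `-EPERM`. No hook return value can stand in for a missing capability.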
On 4/12/2023 10:16 PM, Andrii Nakryiko wrote: > On Wed, Apr 12, 2023 at 7:56 PM Paul Moore <paul@paul-moore.com> wrote: >> On Wed, Apr 12, 2023 at 9:43 PM Andrii Nakryiko >> <andrii.nakryiko@gmail.com> wrote: >>> On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: >>>> On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: >>>>> On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: >>>>>> On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: >>>>>>> On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: >>>>>>>> On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: >> ... >> >>>>>>> For example, in many places we have things like: >>>>>>> >>>>>>> if (!some_check(...) && !capable(...)) >>>>>>> return -EPERM; >>>>>>> >>>>>>> I would expect this is a similar logic. An operation can succeed if the >>>>>>> access control requirement is met. The mismatch we have through-out the >>>>>>> kernel is that capability checks aren't strictly done by LSM hooks. And >>>>>>> this series conceptually, I think, doesn't violate that -- it's changing >>>>>>> the logic of the capability checks, not the LSM (i.e. there no LSM hooks >>>>>>> yet here). >>>>>> Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which >>>>>> when it returns a positive value "bypasses kernel checks". The patch >>>>>> isn't based on either Linus' tree or the LSM tree, I'm guessing it is >>>>>> based on a eBPF tree, so I can't say with 100% certainty that it is >>>>>> bypassing a capability check, but the description claims that to be >>>>>> the case. >>>>>> >>>>>> Regardless of how you want to spin this, I'm not supportive of a LSM >>>>>> hook which allows a LSM to bypass a capability check. A LSM hook can >>>>>> be used to provide additional access control restrictions beyond a >>>>>> capability check, but a LSM hook should never be allowed to overrule >>>>>> an access denial due to a capability check. 
>>>>>> >>>>>>> The reason CAP_BPF was created was because there was nothing else that >>>>>>> would be fine-grained enough at the time. >>>>>> The LSM layer predates CAP_BPF, and one could make a very solid >>>>>> argument that one of the reasons LSMs exist is to provide >>>>>> supplementary controls due to capability-based access controls being a >>>>>> poor fit for many modern use cases. >>>>> I generally agree with what you say, but we DO have this code pattern: >>>>> >>>>> if (!some_check(...) && !capable(...)) >>>>> return -EPERM; >>>> I think we need to make this more concrete; we don't have a pattern in >>>> the upstream kernel where 'some_check(...)' is a LSM hook, right? >>>> Simply because there is another kernel access control mechanism which >>>> allows a capability check to be skipped doesn't mean I want to allow a >>>> LSM hook to be used to skip a capability check. >>> This work is an attempt to tighten the security of production systems >>> by allowing to drop too coarse-grained and permissive capabilities >>> (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more >>> than production use cases are meant to be able to do) and then grant >>> specific BPF operations on specific BPF programs/maps based on custom >>> LSM security policy, which validates application trustworthiness using >>> custom production-specific logic. >> There are ways to leverage the LSMs to apply finer grained access >> control on top of the relatively coarse capabilities that do not >> require circumventing those capability controls. One grants the >> capabilities, just as one would do today, and then leverages the >> security functionality of a LSM to further restrict specific users, >> applications, etc. with a level of granularity beyond that offered by >> the capability controls. > Please help me understand something. 
What you and Casey are proposing, > when taken to the logical extreme, is to grant to all processes root > permissions and then use LSM to restrict specific actions, do I > understand correctly? No. You grant a process the capabilities it needs (CAP_BPF, CAP_WHATEVER) and only those capabilities. If you want additional restrictions you include an LSM that implements those restrictions. If you want finer control over the operations controlled by CAP_BPF you include an LSM that implements those controls. > This strikes me as a less secure and more > error-prone way of doing things. If there is some problem with > installing LSM policy, LSMs are not required to have loadable or dynamic policies. That's up to the developer. > it could go unnoticed for a really long time, > while the system would be way more vulnerable. There is no way Paul or I are going to solve the mis-configured system problem. > Why do you prefer such > an approach instead of going with no extra permissions by default, but > allowing custom LSM policy to grant few exceptions for known and > trusted use cases? Because that's not how capabilities work. Capabilities are independent of other controls. If you want to propose a change to how capabilities work, you need to propose that to the capability maintainer. Because that's not how LSMs work. LSMs implement additional restrictions to the existing policy. The restrictive vs. authoritative debate was closed long ago. It's a fundamental property of how LSMs work. > By the way, even the above proposal of yours doesn't work for > production use cases when user namespaces are involved, as far as I > understand. We cannot grant CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN for > containers running inside user namespaces, as CAP_BPF in non-init > namespace is not enough for bpf() syscall to allow loading BPF maps or > BPF program (bpf() doesn't do ns_capable(), it's only using > capable()). What solution would you suggest for such production > setups? 
If user namespaces don't work the way you'd like, you should take that up with the namespace maintainers. Or, since this appears to be an issue with BPF not being namespace aware, fix BPF's use of capable() and ns_capable(). > Also, in previous email you said: > >> Simply because there is another kernel access control mechanism which >> allows a capability check to be skipped doesn't mean I want to allow a >> LSM hook to be used to skip a capability check. > I understand your stated position, but can you please help me > understand the reasoning behind it? What would be wrong with some LSM > hooks granting effective capabilities? You keep asking the question and ignoring the answer. See above. > How would that change anything > about LSM design? As far as I can see, I'm not doing anything crazy > with my LSM hook implementation. You keep asking the question and ignoring the answer. See above. > It's reusing the standard > call_int_hook() mechanism very straightforwardly with a default result > of 0. And then just interprets 0, <0, and >0 results accordingly. Is > that abusing the LSM mechanism itself somehow? > > Does the above also mean that you'd be fine if we just don't plug into > the LSM subsystem at all and instead come up with some ad-hoc solution > to allow effectively the same policies? No, because you would be breaking the capability system in that case. There is an example of a feature that does just what you're suggesting. POSIX ACLs aren't an LSM because they don't just add restrictions, they change the semantics of the file mode bits. Look at that implementation before you seriously consider going that route. > This sounds detrimental both > to LSM and BPF subsystems, so I hope we can talk this through before > finalizing decisions. > > Lastly, you mentioned before: > >>>> I think we need to make this more concrete; we don't have a pattern in >>>> the upstream kernel where 'some_check(...)' is a LSM hook, right? 
> Unfortunately I don't have enough familiarity with all LSM hooks, so I > can't confirm or disprove the above statement. But earlier someone > brought to my attention the case of security_vm_enough_memory_mm(), > which seems to be granting effectively CAP_SYS_ADMIN for the purposes > of memory accounting. Am I missing something subtle there or does it > grant effective caps indeed? > > > > >> -- >> paul-moore.com
Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > Why do you prefer such > an approach instead of going with no extra permissions by default, but > allowing custom LSM policy to grant few exceptions for known and > trusted use cases? Should you be curious, you can find some of the history of the "no authoritative hooks" policy at: https://lwn.net/2001/1108/kernel.php3 It was fairly heatedly discussed at the time. jon
On Wed, Apr 12, 2023 at 10:47:13AM -0700, Kees Cook wrote: Hi, I hope the week is ending well for everyone. > On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > > On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > > > > > > Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which > > > are meant to allow highly-granular LSM-based control over the usage of BPF > > > subsytem. Specifically, to control the creation of BPF maps and BTF data > > > objects, which are fundamental building blocks of any modern BPF application. > > > > > > These new hooks are able to override default kernel-side CAP_BPF-based (and > > > sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to > > > implement LSM policies that could granularly enforce more restrictions on > > > a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN > > > capabilities), but also, importantly, allow to *bypass kernel-side > > > enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use > > > cases. > > > > One of the hallmarks of the LSM has always been that it is > > non-authoritative: it cannot unilaterally grant access, it can only > > restrict what would have been otherwise permitted on a traditional > > Linux system. Put another way, a LSM should not undermine the Linux > > discretionary access controls, e.g. capabilities. > > > > If there is a problem with the eBPF capability-based access controls, > > that problem needs to be addressed in how the core eBPF code > > implements its capability checks, not by modifying the LSM mechanism > > to bypass these checks. > I think semantics matter here. I wouldn't view this as _bypassing_ > capability enforcement: it's just more fine-grained access control. > > For example, in many places we have things like: > > if (!some_check(...) && !capable(...)) > return -EPERM; > > I would expect this is a similar logic.
An operation can succeed if the > access control requirement is met. The mismatch we have through-out the > kernel is that capability checks aren't strictly done by LSM hooks. And > this series conceptually, I think, doesn't violate that -- it's changing > the logic of the capability checks, not the LSM (i.e. there no LSM hooks > yet here). > > The reason CAP_BPF was created was because there was nothing else that > would be fine-grained enough at the time. This was one of the issues, among others, that the TSEM LSM we are working to upstream, was designed to address and may be an avenue forward. TSEM, being narratival rather than deontologically based, provides a framework for security permissions that are based on a characterization of the event itself. So the permissions are as variable as the contents of whatever BPF related information is passed to the bpf* LSM hooks [1]. Currently, the tsem_bpf_* hooks are generically modeled. We would certainly entertain any discussion or suggestions as to what elements of the structures passed to the hooks would be useful with respect to establishing security policies useful and appropriate to the BPF community. We don't want to get in the middle of the restrictive vs. authoritative debate, but it would seem that the jury is conclusively in on that issue and LSM hooks are not going to be allowed to dismiss, or modify, any other security controls. Hopefully the BPF ABI isn't tied to CAP_BPF as that would seem to make it problematic to make controls more granular. > Kees Cook Have a good weekend. As always, Dr. Greg The Quixote Project - Flailing at the Travails of Cybersecurity [1]: Plus developers don't need to write security policies, you test your application in order to get the desired controls for a workload.
On Thu, Apr 13, 2023 at 12:03 PM Jonathan Corbet <corbet@lwn.net> wrote: > > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > > > Why do you prefer such > > an approach instead of going with no extra permissions by default, but > > allowing custom LSM policy to grant few exceptions for known and > > trusted use cases? > > Should you be curious, you can find some of the history of the "no > authoritative hooks" policy at: > > https://lwn.net/2001/1108/kernel.php3 > > It was fairly heatedly discussed at the time. > Thanks, Jonathan! Yes, it was very useful to get a bit of context. > jon
On Thu, Apr 13, 2023 at 8:11 AM Paul Moore <paul@paul-moore.com> wrote: > > On Thu, Apr 13, 2023 at 1:16 AM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > On Wed, Apr 12, 2023 at 7:56 PM Paul Moore <paul@paul-moore.com> wrote: > > > On Wed, Apr 12, 2023 at 9:43 PM Andrii Nakryiko > > > <andrii.nakryiko@gmail.com> wrote: > > > > On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: > > > > > On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: > > > > > > On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: > > > > > > > On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > > > > > > > > On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > > > > > > > > > On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > > > > > > ... > > > > > > > > > > > For example, in many places we have things like: > > > > > > > > > > > > > > > > if (!some_check(...) && !capable(...)) > > > > > > > > return -EPERM; > > > > > > > > > > > > > > > > I would expect this is a similar logic. An operation can succeed if the > > > > > > > > access control requirement is met. The mismatch we have through-out the > > > > > > > > kernel is that capability checks aren't strictly done by LSM hooks. And > > > > > > > > this series conceptually, I think, doesn't violate that -- it's changing > > > > > > > > the logic of the capability checks, not the LSM (i.e. there no LSM hooks > > > > > > > > yet here). > > > > > > > > > > > > > > Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > > > > > > > when it returns a positive value "bypasses kernel checks". The patch > > > > > > > isn't based on either Linus' tree or the LSM tree, I'm guessing it is > > > > > > > based on a eBPF tree, so I can't say with 100% certainty that it is > > > > > > > bypassing a capability check, but the description claims that to be > > > > > > > the case. 
> > > > > > > > > > > > > > Regardless of how you want to spin this, I'm not supportive of a LSM > > > > > > > hook which allows a LSM to bypass a capability check. A LSM hook can > > > > > > > be used to provide additional access control restrictions beyond a > > > > > > > capability check, but a LSM hook should never be allowed to overrule > > > > > > > an access denial due to a capability check. > > > > > > > > > > > > > > > The reason CAP_BPF was created was because there was nothing else that > > > > > > > > would be fine-grained enough at the time. > > > > > > > > > > > > > > The LSM layer predates CAP_BPF, and one could make a very solid > > > > > > > argument that one of the reasons LSMs exist is to provide > > > > > > > supplementary controls due to capability-based access controls being a > > > > > > > poor fit for many modern use cases. > > > > > > > > > > > > I generally agree with what you say, but we DO have this code pattern: > > > > > > > > > > > > if (!some_check(...) && !capable(...)) > > > > > > return -EPERM; > > > > > > > > > > I think we need to make this more concrete; we don't have a pattern in > > > > > the upstream kernel where 'some_check(...)' is a LSM hook, right? > > > > > Simply because there is another kernel access control mechanism which > > > > > allows a capability check to be skipped doesn't mean I want to allow a > > > > > LSM hook to be used to skip a capability check. > > > > > > > > This work is an attempt to tighten the security of production systems > > > > by allowing to drop too coarse-grained and permissive capabilities > > > > (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more > > > > than production use cases are meant to be able to do) and then grant > > > > specific BPF operations on specific BPF programs/maps based on custom > > > > LSM security policy, which validates application trustworthiness using > > > > custom production-specific logic. 
> > > > > > There are ways to leverage the LSMs to apply finer grained access > > > control on top of the relatively coarse capabilities that do not > > > require circumventing those capability controls. One grants the > > > capabilities, just as one would do today, and then leverages the > > > security functionality of a LSM to further restrict specific users, > > > applications, etc. with a level of granularity beyond that offered by > > > the capability controls. > > > > Please help me understand something. What you and Casey are proposing, > > when taken to the logical extreme, is to grant to all processes root > > permissions and then use LSM to restrict specific actions, do I > > understand correctly? This strikes me as a less secure and more > > error-prone way of doing things. > > When taken to the "logical extreme" most concepts end up sounding a > bit absurd, but that was the point, wasn't it? It wasn't my intent to make it sound absurd, sorry. The way I see it, for the sake of example, let's say CAP_BPF allows 20 different operations (each with its own security_xxx hook). And let's say in production I want to allow only 3 of them. Sure, technically it should be possible to deny access at 17 hooks and let it through in just those 3. But if someone adds a 21st operation and I forget to add a 21st restriction, that would be bad (and very probable with such an approach). So my point is that for situations like this, dropping CAP_BPF but allowing only 3 hooks to proceed seems a safer approach, because if we add a 21st hook, it will safely be denied without CAP_BPF *by default*. That's what I tried to point out. But even if we ignore this "safe by default when a new hook is added" behavior, when taking user namespaces into account, the restrictive LSM approach just doesn't seem to work at all for something like CAP_BPF. 
CAP_BPF cannot be "namespaced", just like, say, CAP_SYS_TIME, because we cannot ensure that a given BPF program won't access kernel state "belonging" to another process (as one example). Now, thanks to Jonathan, I get that there was a heated discussion 20 years ago about authoritative vs restrictive LSMs. But if I read a summary at that time ([0]), authoritative hooks were not out of the question *in principle*. Surely, "walk before we can run" makes sense, but it's been a while ago. [0] https://lwn.net/2001/1108/a/no-auth-hooks.php3 > > Here is a fun story which seems relevant ... in the early days of > SELinux, one of the community devs setup up a system with a SELinux > policy which restricted all privileged operations from the root user, > put the system on a publicly accessible network, posted the root > password for all to see, and invited the public to login to the system > and attempt to exercise root privilege (it's been well over 10 years > at this point so the details are a bit fuzzy). Granted, there were > some hiccups in the beginning, mostly due to the crude state of policy > development/analysis at the time, but after a few policy revisions the > system held up quite well. Honest question out of curiosity: was the intent to demonstrate that with LSM one can completely restrict root? Or that root was actually allowed to do something useful? Because I can see how rejecting everything would be rather simple, but actually pretty useless in practice. Restricting only part of the power of the root, while still allowing it to do something useful in production seems like a much harder (but way more valuable) endeavor. Not saying it's impossible, but see my example about missing 21st new CAP_BPF functionality. > > On the more practical side of things, there are several use cases > which require, by way of legal or contractual requirements, that full > root/admin privileges are decomposed into separate roles: security > admin, audit admin, backup admin, etc. 
These users satisfy these > requirements by using LSMs, such as SELinux, to restrict the > administrative capabilities based on the SELinux user/role/domain. > > > By the way, even the above proposal of yours doesn't work for > > production use cases when user namespaces are involved, as far as I > > understand. We cannot grant CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN for > > containers running inside user namespaces, as CAP_BPF in non-init > > namespace is not enough for bpf() syscall to allow loading BPF maps or > > BPF program ... > > Once again, the LSM has always intended to be a restrictive mechanism, > not a privilege granting mechanism. If an operation is not possible Not according to [0] above: > It is our belief that these changes do not belong in the initial version of > LSM (especially given our limited charter and original goals), and should > be proposed as incremental refinements after LSM has been initially > accepted. > ... > It is our belief that the current LSM > will provide a meaningful improvement in the security infrastructure of the > Linux kernel, and that there is plenty of room for future expansion of LSM > in subsequent phases. I don't see "always intended to be a restrictive mechanism" there. > without the LSM layer enabled, it should not be possible with the LSM > layer enabled. The LSM is not a mechanism to circumvent other access > control mechanisms in the kernel. I understand, but it's not like we are proposing to go and bypass all kinds of random kernel security mechanisms. These are targeted hooks, developed by the BPF community for the BPF subsystem to allow trusted unprivileged production use cases. Yes, we currently rely on checking CAP_BPF to grant more dangerous/advanced features, but it's because we can't just allow any unprivileged process to do this. 
But what we really want is to answer the question "can we trust this process to use this advanced functionality", and if there is no specific LSM policy that cares one way (allow) or the other (disallow), fallback to CAP_BPF enforcement. So it's not bypassing kernel checks, but rather augmenting them with more flexible and customizable mechanisms, while still falling back to CAP_BPF if the user didn't install any custom LSM policy. > > > Also, in previous email you said: > > > > > Simply because there is another kernel access control mechanism which > > > allows a capability check to be skipped doesn't mean I want to allow a > > > LSM hook to be used to skip a capability check. > > > > I understand your stated position, but can you please help me > > understand the reasoning behind it? > > Keeping the LSM as a restrictive access control mechanism helps ensure > some level of sanity and consistency across different Linux > installations. If a certain operation requires CAP_SYS_ADMIN on one > Linux system, it should require CAP_SYS_ADMIN on another Linux system. > Granted, a LSM running on one system might impose additional > constraints on that operation, but the CAP_SYS_ADMIN requirement still > applies. > > There is also an issue of safety in knowing that enabling a LSM will > not degrade the access controls on a system by potentially granting > operations that were previously denied. > > > Does the above also mean that you'd be fine if we just don't plug into > > the LSM subsystem at all and instead come up with some ad-hoc solution > > to allow effectively the same policies? This sounds detrimental both > > to LSM and BPF subsystems, so I hope we can talk this through before > > finalizing decisions. > > Based on your patches and our discussion, it seems to me that the > problem you are trying to resolve is related more to the > capability-based access controls in the eBPF, and possibly other > kernel subsystems, and not any LSM-based restrictions. 
I'm happy to > work with you on a solution involving the LSM, but please understand > that I'm not going to support a solution which changes a core > philosophy of the LSM layer. Great, I'd really appreciate help and suggestions on how to solve the following problem. We have a BPF subsystem that allows loading BPF programs. Those BPF programs cannot be contained within a particular namespace just by its system-wide tracing nature (it can safely read kernel and user memory and we can't restrict whether that memory belongs to a particular namespace), so it's like CAP_SYS_TIME, just with much broader API surface. The other piece of a puzzle is user namespaces. We do want to run applications inside user namespaces, but allow them to use BPF programs. As far as I can tell, there is no way to grant real CAP_BPF that will be recognized by capable(CAP_BPF) (not ns_capable, see above about system-wide nature of BPF). If there is, please help me understand how. All my local experiments failed, and looking at cap_capable() implementation it is not intended to even check the initial namespace's capability if the process is running in the user namespace. So, given that a) we can't make CAP_BPF namespace-aware and b) we can't grant real CAP_BPF to processes in user namespace, how could we allow user namespaced applications to do useful work with BPF? > > > Lastly, you mentioned before: > > > > > > > I think we need to make this more concrete; we don't have a pattern in > > > > > the upstream kernel where 'some_check(...)' is a LSM hook, right? > > > > Unfortunately I don't have enough familiarity with all LSM hooks, so I > > can't confirm or disprove the above statement. But earlier someone > > brought to my attention the case of security_vm_enough_memory_mm(), > > which seems to be granting effectively CAP_SYS_ADMIN for the purposes > > of memory accounting. Am I missing something subtle there or does it > > grant effective caps indeed? 
> > Some of the comments around that hook can be misleading, but if you > look at the actual code it starts to make more sense. > [...] > > I do agree that the security_vm_enough_memory() hook is structured a > bit differently than most of the other LSM hooks, but it still > operates with the same philosophy: a LSM should only be allowed to > restrict access, a LSM should never be allowed to grant access that > would otherwise be denied by the traditional Linux access controls. > > Hopefully that explanation makes sense, but if things are still a bit > fuzzy I would encourage you to go look at the code, I'm sure it will > make sense once you spend a few minutes figuring out how it works. > Yep, thanks a lot, it's much clearer after grokking the relevant pieces of the LSM code you pointed out and the LSM infrastructure in general. The "capabilities" LSM is non-negotiable, so it effectively always restricts a small subset of hooks, including vm_enough_memory and capable. Still, the problem stands. How do we marry BPF and user namespaces? I'd really appreciate suggestions. Thank you! > [1] There is a long and sorta bizarre history with the capability LSM, > but just understand it is a bit "special" in many ways, and those > "special" behaviors are intentional. > > -- > paul-moore.com
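The namespace problem described in the message above (capable(CAP_BPF) versus ns_capable(), and the observation about cap_capable()) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: the struct and function names (user_ns_model, cred_model, cap_capable_model, ns_capable_model) are hypothetical, and the walk up the namespace hierarchy is a simplified reading of what cap_capable() appears to do. The one value taken from the kernel headers is CAP_BPF's number, 39.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace model, NOT kernel code. All names are hypothetical. */
struct user_ns_model {
    const struct user_ns_model *parent; /* NULL for the initial namespace */
    int level;                          /* 0 for the initial namespace */
    unsigned int owner_uid;             /* uid, in the parent, that created it */
};

struct cred_model {
    const struct user_ns_model *user_ns; /* namespace the task runs in */
    unsigned int euid;
    unsigned long long cap_effective;    /* bitmask of effective capabilities */
};

#define MODEL_CAP_BPF 39 /* numeric value of CAP_BPF in linux/capability.h */

/* Returns 0 if 'cred' holds 'cap' with respect to 'target', else -EPERM. */
static int cap_capable_model(const struct cred_model *cred,
                             const struct user_ns_model *target, int cap)
{
    const struct user_ns_model *ns = target;

    assert(cred != NULL && cred->user_ns != NULL && target != NULL);
    for (;;) {
        /* Same namespace: only the task's own effective set counts. */
        if (ns == cred->user_ns)
            return ((cred->cap_effective >> cap) & 1) ? 0 : -EPERM;
        /* Target is not a descendant of the task's namespace: deny. */
        if (ns->level <= cred->user_ns->level)
            return -EPERM;
        /* The creator of a child namespace holds all caps over it. */
        if (ns->parent == cred->user_ns && ns->owner_uid == cred->euid)
            return 0;
        ns = ns->parent;
    }
}

/* ns_capable(ns, cap) in this model; capable(cap) is the init-ns case. */
static bool ns_capable_model(const struct cred_model *cred,
                             const struct user_ns_model *ns, int cap)
{
    return cap_capable_model(cred, ns, cap) == 0;
}
```

In this model a task inside a child user namespace can hold CAP_BPF with respect to its own namespace and still fail a check made against the initial namespace, which is the situation the message describes for bpf() and capable(CAP_BPF).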
On Thu, Apr 13, 2023 at 9:27 AM Casey Schaufler <casey@schaufler-ca.com> wrote: > > On 4/12/2023 6:43 PM, Andrii Nakryiko wrote: > > On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: > >> On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: > >>> On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: > >>>> On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > >>>>> On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > >>>>>> On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > >>>>>>> Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which > >>>>>>> are meant to allow highly-granular LSM-based control over the usage of BPF > >>>>>>> subsytem. Specifically, to control the creation of BPF maps and BTF data > >>>>>>> objects, which are fundamental building blocks of any modern BPF application. > >>>>>>> > >>>>>>> These new hooks are able to override default kernel-side CAP_BPF-based (and > >>>>>>> sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to > >>>>>>> implement LSM policies that could granularly enforce more restrictions on > >>>>>>> a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN > >>>>>>> capabilities), but also, importantly, allow to *bypass kernel-side > >>>>>>> enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use > >>>>>>> cases. > >>>>>> One of the hallmarks of the LSM has always been that it is > >>>>>> non-authoritative: it cannot unilaterally grant access, it can only > >>>>>> restrict what would have been otherwise permitted on a traditional > >>>>>> Linux system. Put another way, a LSM should not undermine the Linux > >>>>>> discretionary access controls, e.g. capabilities. 
> >>>>>> > >>>>>> If there is a problem with the eBPF capability-based access controls, > >>>>>> that problem needs to be addressed in how the core eBPF code > >>>>>> implements its capability checks, not by modifying the LSM mechanism > >>>>>> to bypass these checks. > >>>>> I think semantics matter here. I wouldn't view this as _bypassing_ > >>>>> capability enforcement: it's just more fine-grained access control. > > Exactly. One of the motivations for this work was the need to move > > some production use cases that are only needing extra privileges so > > that they can use BPF into a more restrictive environment. Granting > > CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN to all such use cases that need them > > for BPF usage is too coarse grained. These caps would allow those > > applications way more than just BPF usage. So the idea here is more > > finer-grained control of BPF-specific operations, granting *effective* > > CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN caps dynamically based on custom > > production logic that would validate the use case. > > That's an authoritative model which is in direct conflict with the > design and implementation of both capabilities and LSM. > > > > > This *is* an attempt to achieve a more secure production approach. > > > >>>>> For example, in many places we have things like: > >>>>> > >>>>> if (!some_check(...) && !capable(...)) > >>>>> return -EPERM; > >>>>> > >>>>> I would expect this is a similar logic. An operation can succeed if the > >>>>> access control requirement is met. The mismatch we have through-out the > >>>>> kernel is that capability checks aren't strictly done by LSM hooks. And > >>>>> this series conceptually, I think, doesn't violate that -- it's changing > >>>>> the logic of the capability checks, not the LSM (i.e. there no LSM hooks > >>>>> yet here). > >>>> Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > >>>> when it returns a positive value "bypasses kernel checks". 
The patch > >>>> isn't based on either Linus' tree or the LSM tree, I'm guessing it is > >>>> based on a eBPF tree, so I can't say with 100% certainty that it is > >>>> bypassing a capability check, but the description claims that to be > >>>> the case. > >>>> > >>>> Regardless of how you want to spin this, I'm not supportive of a LSM > >>>> hook which allows a LSM to bypass a capability check. A LSM hook can > >>>> be used to provide additional access control restrictions beyond a > >>>> capability check, but a LSM hook should never be allowed to overrule > >>>> an access denial due to a capability check. > >>>> > >>>>> The reason CAP_BPF was created was because there was nothing else that > >>>>> would be fine-grained enough at the time. > >>>> The LSM layer predates CAP_BPF, and one could make a very solid > >>>> argument that one of the reasons LSMs exist is to provide > >>>> supplementary controls due to capability-based access controls being a > >>>> poor fit for many modern use cases. > >>> I generally agree with what you say, but we DO have this code pattern: > >>> > >>> if (!some_check(...) && !capable(...)) > >>> return -EPERM; > >> I think we need to make this more concrete; we don't have a pattern in > >> the upstream kernel where 'some_check(...)' is a LSM hook, right? > >> Simply because there is another kernel access control mechanism which > >> allows a capability check to be skipped doesn't mean I want to allow a > >> LSM hook to be used to skip a capability check. > > This work is an attempt to tighten the security of production systems > > by allowing to drop too coarse-grained and permissive capabilities > > (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more > > than production use cases are meant to be able to do) > > The BPF developers are in complete control of what CAP_BPF controls. > You can easily address the granularity issue by adding addition restrictions > on processes that have CAP_BPF. 
That is the intended use of LSM. > The whole point of having multiple capabilities is so that you can > grant just those that are required by the system security policy, and > do so safely. That leads to differences of opinion regarding the definition > of the system security policy. BPF chose to set itself up as an element > of security policy (you need CAP_BPF) rather than define elements such that > existing capabilities (CAP_FOWNER, CAP_KILL, CAP_MAC_OVERRIDE, ...) would > control. Please see my reply to Paul, where I explain CAP_BPF's system-wide nature and problem with user namespaces. I don't think the problem is in the granularity of CAP_BPF, it's more of a "non-namespaceable" nature of the BPF subsystem in general. > > > and then grant > > specific BPF operations on specific BPF programs/maps based on custom > > LSM security policy, > > This is backwards. The correct implementation is to require CAP_BPF and > further restrict BPF operations based on a custom LSM security policy. > That's how LSM is designed. Please see my reply to Paul, we can't grant real CAP_BPF for applications in user namespace (unless there is some trick that I don't know, so please do point it out). Let's converge the discussion in that email thread branch to not discuss the same topic multiple times. > > > which validates application trustworthiness using > > custom production-specific logic. > > > > Isn't this goal in line with LSMs mission to enhance system security? > > We're not arguing the goal, we're discussing the implementation. > > >>> It looks to me like this series can be refactored to do the same. I > >>> wouldn't consider that to be a "bypass", but I would agree the current > >>> series looks too much like "bypass", and makes reasoning about the > >>> effect of the LSM hooks too "special". :) > > Sorry, I didn't realize that the current code layout is making things > > more confusing. I'll address feedback to make the intent a bit > > clearer. 
> > > >> -- > >> paul-moore.com
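The two positions argued in this exchange can be put side by side as decision functions. The sketch below is a userspace model, not the patch itself: restrictive_check and proposed_check are hypothetical names, and the only behavior taken from the thread is that a positive hook return "bypasses kernel checks", a negative return denies, and no opinion falls back to the CAP_BPF check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Restrictive model (the position of the LSM maintainers in this thread):
 * the capability is always required, and the hook can only add a denial.
 * hook_ret: 0 = no objection, <0 = deny with that error code.
 */
static int restrictive_check(int hook_ret, bool has_cap_bpf)
{
    assert(hook_ret <= 0); /* a restrictive hook has no "grant" result */
    if (!has_cap_bpf)
        return -EPERM;
    return hook_ret; /* 0 allows, a negative value denies */
}

/*
 * Proposed model as described in the thread (hypothetical rendering):
 * hook_ret: <0 = deny, >0 = allow without the capability,
 *           0 = no opinion, fall back to the capability check.
 */
static int proposed_check(int hook_ret, bool has_cap_bpf)
{
    if (hook_ret < 0)
        return hook_ret;
    if (hook_ret > 0)
        return 0;
    return has_cap_bpf ? 0 : -EPERM;
}
```

The two functions agree whenever the hook returns 0 or a negative value. They differ only in the case under dispute: a positive hook result for a task without CAP_BPF, which proposed_check allows and which the restrictive model cannot express.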
On Thu, Apr 13, 2023 at 9:54 AM Casey Schaufler <casey@schaufler-ca.com> wrote: > > On 4/12/2023 10:16 PM, Andrii Nakryiko wrote: > > On Wed, Apr 12, 2023 at 7:56 PM Paul Moore <paul@paul-moore.com> wrote: > >> On Wed, Apr 12, 2023 at 9:43 PM Andrii Nakryiko > >> <andrii.nakryiko@gmail.com> wrote: > >>> On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: > >>>> On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: > >>>>> On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: > >>>>>> On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > >>>>>>> On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > >>>>>>>> On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > >> ... > >> > >>>>>>> For example, in many places we have things like: > >>>>>>> > >>>>>>> if (!some_check(...) && !capable(...)) > >>>>>>> return -EPERM; > >>>>>>> > >>>>>>> I would expect this is a similar logic. An operation can succeed if the > >>>>>>> access control requirement is met. The mismatch we have through-out the > >>>>>>> kernel is that capability checks aren't strictly done by LSM hooks. And > >>>>>>> this series conceptually, I think, doesn't violate that -- it's changing > >>>>>>> the logic of the capability checks, not the LSM (i.e. there no LSM hooks > >>>>>>> yet here). > >>>>>> Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > >>>>>> when it returns a positive value "bypasses kernel checks". The patch > >>>>>> isn't based on either Linus' tree or the LSM tree, I'm guessing it is > >>>>>> based on a eBPF tree, so I can't say with 100% certainty that it is > >>>>>> bypassing a capability check, but the description claims that to be > >>>>>> the case. > >>>>>> > >>>>>> Regardless of how you want to spin this, I'm not supportive of a LSM > >>>>>> hook which allows a LSM to bypass a capability check. 
A LSM hook can > >>>>>> be used to provide additional access control restrictions beyond a > >>>>>> capability check, but a LSM hook should never be allowed to overrule > >>>>>> an access denial due to a capability check. > >>>>>> > >>>>>>> The reason CAP_BPF was created was because there was nothing else that > >>>>>>> would be fine-grained enough at the time. > >>>>>> The LSM layer predates CAP_BPF, and one could make a very solid > >>>>>> argument that one of the reasons LSMs exist is to provide > >>>>>> supplementary controls due to capability-based access controls being a > >>>>>> poor fit for many modern use cases. > >>>>> I generally agree with what you say, but we DO have this code pattern: > >>>>> > >>>>> if (!some_check(...) && !capable(...)) > >>>>> return -EPERM; > >>>> I think we need to make this more concrete; we don't have a pattern in > >>>> the upstream kernel where 'some_check(...)' is a LSM hook, right? > >>>> Simply because there is another kernel access control mechanism which > >>>> allows a capability check to be skipped doesn't mean I want to allow a > >>>> LSM hook to be used to skip a capability check. > >>> This work is an attempt to tighten the security of production systems > >>> by allowing to drop too coarse-grained and permissive capabilities > >>> (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more > >>> than production use cases are meant to be able to do) and then grant > >>> specific BPF operations on specific BPF programs/maps based on custom > >>> LSM security policy, which validates application trustworthiness using > >>> custom production-specific logic. > >> There are ways to leverage the LSMs to apply finer grained access > >> control on top of the relatively coarse capabilities that do not > >> require circumventing those capability controls. 
One grants the > >> capabilities, just as one would do today, and then leverages the > >> security functionality of a LSM to further restrict specific users, > >> applications, etc. with a level of granularity beyond that offered by > >> the capability controls. > > Please help me understand something. What you and Casey are proposing, > > when taken to the logical extreme, is to grant to all processes root > > permissions and then use LSM to restrict specific actions, do I > > understand correctly? > > No. You grant a process the capabilities it needs (CAP_BPF, CAP_WHATEVER) > and only those capabilities. If you want additional restrictions you include > an LSM that implements those restrictions. If you want finer control over > the operations controlled by CAP_BPF you include an LSM that implements > those controls. > See previous replies. We can't grant CAP_BPF, even if we wanted to, if the process is in a user namespace. > > This strikes me as a less secure and more > > error-prone way of doing things. If there is some problem with > > installing LSM policy, > > LSMs are not required to have loadable or dynamic policies. That's > up to the developer. > Sure, but having a more dynamic policy is a very attractive feature and one of the reasons for people to use BPF LSM. So it might not be required, but it's something that people are using in practice, so if we can make all this less error-prone, that would be better for everyone. > > it could go unnoticed for a really long time, > > while the system would be way more vulnerable. > > There is no way Paul or I are going to solve the mis-configured system > problem. > Please see my example about (hypothetical) 21st added hook that is very easy to miss, because the kernel is big and there are tons of people doing development, and so it's no wonder that users might miss a new hook they are supposed to restrict. But again, even with all that said, granting CAP_BPF is impossible for user namespaced applications. 
> > Why do you prefer such > > an approach instead of going with no extra permissions by default, but > > allowing custom LSM policy to grant few exceptions for known and > > trusted use cases? > > Because that's not how capabilities work. Capabilities are independent > of other controls. If you want to propose a change to how capabilities > work, you need to propose that to the capability maintainer. > > Because that's not how LSMs work. LSMs implement additional restrictions > to the existing policy. The restrictive vs. authoritative debate was closed > long ago. It's a fundamental property of how LSMs work. There doesn't seem to be anything fundamentally and technically preventing an LSM hook from saying "yep, looks good, no need to fall back to CAP_BPF checks due to lack of other signal". [0] also said explicitly that authoritative hooks could be the next step, and didn't reject them outright. [0] https://lwn.net/2001/1108/a/no-auth-hooks.php3 > > > By the way, even the above proposal of yours doesn't work for > > production use cases when user namespaces are involved, as far as I > > understand. We cannot grant CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN for > > containers running inside user namespaces, as CAP_BPF in non-init > > namespace is not enough for bpf() syscall to allow loading BPF maps or > > BPF program (bpf() doesn't do ns_capable(), it's only using > > capable()). What solution would you suggest for such production > > setups? > > If user namespaces don't work the way you'd like, you should take that > up with the namespace maintainers. Or, since this appears to be an issue > with BPF not being namespace aware, fix BPF's use of capable() and ns_capable(). Can't be fixed on the BPF side, unfortunately. I don't know enough about namespaces to tell whether it's a bug or a feature that root CAP_BPF can't be checked from inside a userns. So yep, I should perhaps ask. 
> > > Also, in previous email you said: > > > >> Simply because there is another kernel access control mechanism which > >> allows a capability check to be skipped doesn't mean I want to allow a > >> LSM hook to be used to skip a capability check. > > I understand your stated position, but can you please help me > > understand the reasoning behind it? What would be wrong with some LSM > > hooks granting effective capabilities? > > You keep asking the question and ignoring the answer. See above. > > > How would that change anything > > about LSM design? As far as I can see, I'm not doing anything crazy > > with my LSM hook implementation. > > You keep asking the question and ignoring the answer. See above. > > > > It's reusing the standard > > call_int_hook() mechanism very straightforwardly with a default result > > of 0. And then just interprets 0, <0, and >0 results accordingly. Is > > that abusing the LSM mechanism itself somehow? > > > > Does the above also mean that you'd be fine if we just don't plug into > > the LSM subsystem at all and instead come up with some ad-hoc solution > > to allow effectively the same policies? > > No, because you would be breaking the capability system in that case. > > There is an example of a feature that does just what you're suggesting. > POSIX ACLs aren't an LSM because they don't just add restrictions, they > change the semantics of the file mode bits. Look at that implementation > before you seriously consider going that route. Are you referring to posix_acl_permission() and fs/posix_acl.c? I'll take a look, not familiar. Thanks for the suggestion! I'd still prefer to avoid building a new access control system just for BPF, of course. But let me take a look at the code and see what you are referring to. > > > This sounds detrimental both > > to LSM and BPF subsystems, so I hope we can talk this through before > > finalizing decisions. 
> > > > Lastly, you mentioned before: > > > >>>> I think we need to make this more concrete; we don't have a pattern in > >>>> the upstream kernel where 'some_check(...)' is a LSM hook, right? > > Unfortunately I don't have enough familiarity with all LSM hooks, so I > > can't confirm or disprove the above statement. But earlier someone > > brought to my attention the case of security_vm_enough_memory_mm(), > > which seems to be granting effectively CAP_SYS_ADMIN for the purposes > > of memory accounting. Am I missing something subtle there or does it > > grant effective caps indeed? > > > > > > > > > >> -- > >> paul-moore.com
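For readers following the disagreement above, the two readings of a hook's return value can be sketched in plain userspace C. This is a simplified model, not code from the series: restrictive_check() and three_way_check() are hypothetical names, and hook_result only stands in for what call_int_hook() would hand back (0 for "no opinion", <0 for deny, >0 for the proposed "allow without the capability").

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model, not kernel code.
 *
 * Restrictive reading: the capability check is always required and the
 * hook can only add a denial on top of it. A positive hook result has
 * no special meaning here and is treated like 0.
 */
static int restrictive_check(int hook_result, bool has_cap)
{
	if (!has_cap)
		return -EPERM;		/* capability denial is final */
	return hook_result < 0 ? hook_result : 0;
}

/*
 * Three-way reading described in the thread: <0 denies, >0 grants
 * without consulting the capability, 0 falls back to the capability.
 */
static int three_way_check(int hook_result, bool has_cap)
{
	if (hook_result < 0)
		return hook_result;	/* policy denies */
	if (hook_result > 0)
		return 0;		/* policy grants, capability not checked */
	return has_cap ? 0 : -EPERM;	/* no opinion: capability decides */
}
```

The two functions differ in exactly one case, a positive hook result for a task without the capability, and that single case is what the authoritative vs. restrictive argument is about.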
On Fri, Apr 14, 2023 at 1:24 PM Dr. Greg <greg@enjellic.com> wrote: > > On Wed, Apr 12, 2023 at 10:47:13AM -0700, Kees Cook wrote: > > Hi, I hope the week is ending well for everyone. > > > On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > > > On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > > > > > > > > Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which > > > > are meant to allow highly-granular LSM-based control over the usage of BPF > > > > subsytem. Specifically, to control the creation of BPF maps and BTF data > > > > objects, which are fundamental building blocks of any modern BPF application. > > > > > > > > These new hooks are able to override default kernel-side CAP_BPF-based (and > > > > sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to > > > > implement LSM policies that could granularly enforce more restrictions on > > > > a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN > > > > capabilities), but also, importantly, allow to *bypass kernel-side > > > > enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use > > > > cases. > > > > > > One of the hallmarks of the LSM has always been that it is > > > non-authoritative: it cannot unilaterally grant access, it can only > > > restrict what would have been otherwise permitted on a traditional > > > Linux system. Put another way, a LSM should not undermine the Linux > > > discretionary access controls, e.g. capabilities. > > > > > > If there is a problem with the eBPF capability-based access controls, > > > that problem needs to be addressed in how the core eBPF code > > > implements its capability checks, not by modifying the LSM mechanism > > > to bypass these checks. > > > I think semantics matter here. I wouldn't view this as _bypassing_ > > capability enforcement: it's just more fine-grained access control. 
> > > > For example, in many places we have things like: > > > > if (!some_check(...) && !capable(...)) > > return -EPERM; > > > > I would expect this is a similar logic. An operation can succeed if the > > access control requirement is met. The mismatch we have through-out the > > kernel is that capability checks aren't strictly done by LSM hooks. And > > this series conceptually, I think, doesn't violate that -- it's changing > > the logic of the capability checks, not the LSM (i.e. there no LSM hooks > > yet here). > > > > The reason CAP_BPF was created was because there was nothing else that > > would be fine-grained enough at the time. > > This was one of the issues, among others, that the TSEM LSM we are > working to upstream, was designed to address and may be an avenue > forward. > > TSEM, being narratival rather than deontologically based, provides a > framework for security permissions that are based on a > characterization of the event itself. So the permissions are as > variable as the contents of whatever BPF related information is passed > to the bpf* LSM hooks [1]. > > Currently, the tsem_bpf_* hooks are generically modeled. We would > certainly entertain any discussion or suggestions as to what elements > of the structures passed to the hooks would be useful with respect > to establishing security policies useful and appropriate to the BPF > community. Could you please provide some links to get a bit more context and information? I'd like to understand at least "narratival rather than deontologically based" part of this. > > We don't want to get in the middle of the restrictive > vs. authoritative debate, but it would seem that the jury is > conclusively in on that issue and LSM hooks are not going to be > allowed to dismiss, or modify, any other security controls. > > Hopefully the BPF ABI isn't tied to CAP_BPF as that would seem to make > it problematic to make controls more granular. > > > Kees Cook > > Have a good weekend. > > As always, > Dr. 
Greg > > The Quixote Project - Flailing at the Travails of Cybersecurity > > [1]: Plus developers don't need to write security policies, you test > your application in order to get the desired controls for a workload.
On 4/17/2023 4:31 PM, Andrii Nakryiko wrote: > On Thu, Apr 13, 2023 at 9:27 AM Casey Schaufler <casey@schaufler-ca.com> wrote: >> On 4/12/2023 6:43 PM, Andrii Nakryiko wrote: >>> On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: >>>> On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: >>>>> On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: >>>>>> On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: >>>>>>> On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: >>>>>>>> On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: >>>>>>>>> Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which >>>>>>>>> are meant to allow highly-granular LSM-based control over the usage of BPF >>>>>>>>> subsytem. Specifically, to control the creation of BPF maps and BTF data >>>>>>>>> objects, which are fundamental building blocks of any modern BPF application. >>>>>>>>> >>>>>>>>> These new hooks are able to override default kernel-side CAP_BPF-based (and >>>>>>>>> sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to >>>>>>>>> implement LSM policies that could granularly enforce more restrictions on >>>>>>>>> a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN >>>>>>>>> capabilities), but also, importantly, allow to *bypass kernel-side >>>>>>>>> enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use >>>>>>>>> cases. >>>>>>>> One of the hallmarks of the LSM has always been that it is >>>>>>>> non-authoritative: it cannot unilaterally grant access, it can only >>>>>>>> restrict what would have been otherwise permitted on a traditional >>>>>>>> Linux system. Put another way, a LSM should not undermine the Linux >>>>>>>> discretionary access controls, e.g. capabilities. 
>>>>>>>> >>>>>>>> If there is a problem with the eBPF capability-based access controls, >>>>>>>> that problem needs to be addressed in how the core eBPF code >>>>>>>> implements its capability checks, not by modifying the LSM mechanism >>>>>>>> to bypass these checks. >>>>>>> I think semantics matter here. I wouldn't view this as _bypassing_ >>>>>>> capability enforcement: it's just more fine-grained access control. >>> Exactly. One of the motivations for this work was the need to move >>> some production use cases that are only needing extra privileges so >>> that they can use BPF into a more restrictive environment. Granting >>> CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN to all such use cases that need them >>> for BPF usage is too coarse grained. These caps would allow those >>> applications way more than just BPF usage. So the idea here is more >>> finer-grained control of BPF-specific operations, granting *effective* >>> CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN caps dynamically based on custom >>> production logic that would validate the use case. >> That's an authoritative model which is in direct conflict with the >> design and implementation of both capabilities and LSM. >> >>> This *is* an attempt to achieve a more secure production approach. >>> >>>>>>> For example, in many places we have things like: >>>>>>> >>>>>>> if (!some_check(...) && !capable(...)) >>>>>>> return -EPERM; >>>>>>> >>>>>>> I would expect this is a similar logic. An operation can succeed if the >>>>>>> access control requirement is met. The mismatch we have through-out the >>>>>>> kernel is that capability checks aren't strictly done by LSM hooks. And >>>>>>> this series conceptually, I think, doesn't violate that -- it's changing >>>>>>> the logic of the capability checks, not the LSM (i.e. there no LSM hooks >>>>>>> yet here). >>>>>> Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which >>>>>> when it returns a positive value "bypasses kernel checks". 
The patch >>>>>> isn't based on either Linus' tree or the LSM tree, I'm guessing it is >>>>>> based on a eBPF tree, so I can't say with 100% certainty that it is >>>>>> bypassing a capability check, but the description claims that to be >>>>>> the case. >>>>>> >>>>>> Regardless of how you want to spin this, I'm not supportive of a LSM >>>>>> hook which allows a LSM to bypass a capability check. A LSM hook can >>>>>> be used to provide additional access control restrictions beyond a >>>>>> capability check, but a LSM hook should never be allowed to overrule >>>>>> an access denial due to a capability check. >>>>>> >>>>>>> The reason CAP_BPF was created was because there was nothing else that >>>>>>> would be fine-grained enough at the time. >>>>>> The LSM layer predates CAP_BPF, and one could make a very solid >>>>>> argument that one of the reasons LSMs exist is to provide >>>>>> supplementary controls due to capability-based access controls being a >>>>>> poor fit for many modern use cases. >>>>> I generally agree with what you say, but we DO have this code pattern: >>>>> >>>>> if (!some_check(...) && !capable(...)) >>>>> return -EPERM; >>>> I think we need to make this more concrete; we don't have a pattern in >>>> the upstream kernel where 'some_check(...)' is a LSM hook, right? >>>> Simply because there is another kernel access control mechanism which >>>> allows a capability check to be skipped doesn't mean I want to allow a >>>> LSM hook to be used to skip a capability check. >>> This work is an attempt to tighten the security of production systems >>> by allowing to drop too coarse-grained and permissive capabilities >>> (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more >>> than production use cases are meant to be able to do) >> The BPF developers are in complete control of what CAP_BPF controls. >> You can easily address the granularity issue by adding addition restrictions >> on processes that have CAP_BPF. 
That is the intended use of LSM. >> The whole point of having multiple capabilities is so that you can >> grant just those that are required by the system security policy, and >> do so safely. That leads to differences of opinion regarding the definition >> of the system security policy. BPF chose to set itself up as an element >> of security policy (you need CAP_BPF) rather than define elements such that >> existing capabilities (CAP_FOWNER, CAP_KILL, CAP_MAC_OVERRIDE, ...) would >> control. > Please see my reply to Paul, where I explain CAP_BPF's system-wide > nature and problem with user namespaces. I don't think the problem is > in the granularity of CAP_BPF, it's more of a "non-namespaceable" > nature of the BPF subsystem in general. Paul is approaching this from a different angle. Your response to Paul does not address the issue I have raised. >>> and then grant >>> specific BPF operations on specific BPF programs/maps based on custom >>> LSM security policy, >> This is backwards. The correct implementation is to require CAP_BPF and >> further restrict BPF operations based on a custom LSM security policy. >> That's how LSM is designed. > Please see my reply to Paul, we can't grant real CAP_BPF for > applications in user namespace (unless there is some trick that I > don't know, so please do point it out). Let's converge the discussion > in that email thread branch to not discuss the same topic multiple > times. I saw your reply to Paul. Paul's points are not my points. If they were, I wouldn't have taken my or your time to present them. >>> which validates application trustworthiness using >>> custom production-specific logic. >>> >>> Isn't this goal in line with LSM's mission to enhance system security? >> We're not arguing the goal, we're discussing the implementation. >> >>>>> It looks to me like this series can be refactored to do the same. 
I >>>>> wouldn't consider that to be a "bypass", but I would agree the current >>>>> series looks too much like "bypass", and makes reasoning about the >>>>> effect of the LSM hooks too "special". :) >>> Sorry, I didn't realize that the current code layout is making things >>> more confusing. I'll address feedback to make the intent a bit >>> clearer. >>> >>>> -- >>>> paul-moore.com
On Mon, Apr 17, 2023 at 4:53 PM Casey Schaufler <casey@schaufler-ca.com> wrote: > > On 4/17/2023 4:31 PM, Andrii Nakryiko wrote: > > On Thu, Apr 13, 2023 at 9:27 AM Casey Schaufler <casey@schaufler-ca.com> wrote: > >> On 4/12/2023 6:43 PM, Andrii Nakryiko wrote: > >>> On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: > >>>> On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: > >>>>> On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: > >>>>>> On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > >>>>>>> On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > >>>>>>>> On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > >>>>>>>>> Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which > >>>>>>>>> are meant to allow highly-granular LSM-based control over the usage of BPF > >>>>>>>>> subsytem. Specifically, to control the creation of BPF maps and BTF data > >>>>>>>>> objects, which are fundamental building blocks of any modern BPF application. > >>>>>>>>> > >>>>>>>>> These new hooks are able to override default kernel-side CAP_BPF-based (and > >>>>>>>>> sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to > >>>>>>>>> implement LSM policies that could granularly enforce more restrictions on > >>>>>>>>> a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN > >>>>>>>>> capabilities), but also, importantly, allow to *bypass kernel-side > >>>>>>>>> enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use > >>>>>>>>> cases. > >>>>>>>> One of the hallmarks of the LSM has always been that it is > >>>>>>>> non-authoritative: it cannot unilaterally grant access, it can only > >>>>>>>> restrict what would have been otherwise permitted on a traditional > >>>>>>>> Linux system. Put another way, a LSM should not undermine the Linux > >>>>>>>> discretionary access controls, e.g. capabilities. 
> >>>>>>>> > >>>>>>>> If there is a problem with the eBPF capability-based access controls, > >>>>>>>> that problem needs to be addressed in how the core eBPF code > >>>>>>>> implements its capability checks, not by modifying the LSM mechanism > >>>>>>>> to bypass these checks. > >>>>>>> I think semantics matter here. I wouldn't view this as _bypassing_ > >>>>>>> capability enforcement: it's just more fine-grained access control. > >>> Exactly. One of the motivations for this work was the need to move > >>> some production use cases that are only needing extra privileges so > >>> that they can use BPF into a more restrictive environment. Granting > >>> CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN to all such use cases that need them > >>> for BPF usage is too coarse grained. These caps would allow those > >>> applications way more than just BPF usage. So the idea here is more > >>> finer-grained control of BPF-specific operations, granting *effective* > >>> CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN caps dynamically based on custom > >>> production logic that would validate the use case. > >> That's an authoritative model which is in direct conflict with the > >> design and implementation of both capabilities and LSM. > >> > >>> This *is* an attempt to achieve a more secure production approach. > >>> > >>>>>>> For example, in many places we have things like: > >>>>>>> > >>>>>>> if (!some_check(...) && !capable(...)) > >>>>>>> return -EPERM; > >>>>>>> > >>>>>>> I would expect this is a similar logic. An operation can succeed if the > >>>>>>> access control requirement is met. The mismatch we have through-out the > >>>>>>> kernel is that capability checks aren't strictly done by LSM hooks. And > >>>>>>> this series conceptually, I think, doesn't violate that -- it's changing > >>>>>>> the logic of the capability checks, not the LSM (i.e. there no LSM hooks > >>>>>>> yet here). 
> >>>>>> Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > >>>>>> when it returns a positive value "bypasses kernel checks". The patch > >>>>>> isn't based on either Linus' tree or the LSM tree, I'm guessing it is > >>>>>> based on a eBPF tree, so I can't say with 100% certainty that it is > >>>>>> bypassing a capability check, but the description claims that to be > >>>>>> the case. > >>>>>> > >>>>>> Regardless of how you want to spin this, I'm not supportive of a LSM > >>>>>> hook which allows a LSM to bypass a capability check. A LSM hook can > >>>>>> be used to provide additional access control restrictions beyond a > >>>>>> capability check, but a LSM hook should never be allowed to overrule > >>>>>> an access denial due to a capability check. > >>>>>> > >>>>>>> The reason CAP_BPF was created was because there was nothing else that > >>>>>>> would be fine-grained enough at the time. > >>>>>> The LSM layer predates CAP_BPF, and one could make a very solid > >>>>>> argument that one of the reasons LSMs exist is to provide > >>>>>> supplementary controls due to capability-based access controls being a > >>>>>> poor fit for many modern use cases. > >>>>> I generally agree with what you say, but we DO have this code pattern: > >>>>> > >>>>> if (!some_check(...) && !capable(...)) > >>>>> return -EPERM; > >>>> I think we need to make this more concrete; we don't have a pattern in > >>>> the upstream kernel where 'some_check(...)' is a LSM hook, right? > >>>> Simply because there is another kernel access control mechanism which > >>>> allows a capability check to be skipped doesn't mean I want to allow a > >>>> LSM hook to be used to skip a capability check. 
> >>> This work is an attempt to tighten the security of production systems > >>> by allowing to drop too coarse-grained and permissive capabilities > >>> (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more > >>> than production use cases are meant to be able to do) > >> The BPF developers are in complete control of what CAP_BPF controls. > >> You can easily address the granularity issue by adding addition restrictions > >> on processes that have CAP_BPF. That is the intended use of LSM. > >> The whole point of having multiple capabilities is so that you can > >> grant just those that are required by the system security policy, and > >> do so safely. That leads to differences of opinion regarding the definition > >> of the system security policy. BPF chose to set itself up as an element > >> of security policy (you need CAP_BPF) rather than define elements such that > >> existing capabilities (CAP_FOWNER, CAP_KILL, CAP_MAC_OVERRIDE, ...) would > >> control. > > Please see my reply to Paul, where I explain CAP_BPF's system-wide > > nature and problem with user namespaces. I don't think the problem is > > in the granularity of CAP_BPF, it's more of a "non-namespaceable" > > nature of the BPF subsystem in general. > > Paul is approaching this from a different angle. Your response to Paul > does not address the issue I have raised. I see, I definitely missed this. Re-reading your reply, I still am not clear on what you are proposing, tbh. Can you please elaborate what you have in mind? > > >>> and then grant > >>> specific BPF operations on specific BPF programs/maps based on custom > >>> LSM security policy, > >> This is backwards. The correct implementation is to require CAP_BPF and > >> further restrict BPF operations based on a custom LSM security policy. > >> That's how LSM is designed. 
> > Please see my reply to Paul, we can't grant real CAP_BPF for > > applications in user namespace (unless there is some trick that I > > don't know, so please do point it out). Let's converge the discussion > > in that email thread branch to not discuss the same topic multiple > > times. > > I saw your reply to Paul. Paul's points are not my points. If they were, > I wouldn't have taken my or your time to present them. Sure, sorry about that. What do you have in mind then? > > >>> which validates application trustworthiness using > >>> custom production-specific logic. > >>> > >>> Isn't this goal in line with LSM's mission to enhance system security? > >> We're not arguing the goal, we're discussing the implementation. > >> > >>>>> It looks to me like this series can be refactored to do the same. I > >>>>> wouldn't consider that to be a "bypass", but I would agree the current > >>>>> series looks too much like "bypass", and makes reasoning about the > >>>>> effect of the LSM hooks too "special". :) > >>> Sorry, I didn't realize that the current code layout is making things > >>> more confusing. I'll address feedback to make the intent a bit > >>> clearer. > >>> > >>>> -- > >>>> paul-moore.com
On 4/17/2023 4:29 PM, Andrii Nakryiko wrote: > On Thu, Apr 13, 2023 at 8:11 AM Paul Moore <paul@paul-moore.com> wrote: >> On Thu, Apr 13, 2023 at 1:16 AM Andrii Nakryiko >> <andrii.nakryiko@gmail.com> wrote: >>> On Wed, Apr 12, 2023 at 7:56 PM Paul Moore <paul@paul-moore.com> wrote: >>>> On Wed, Apr 12, 2023 at 9:43 PM Andrii Nakryiko >>>> <andrii.nakryiko@gmail.com> wrote: >>>>> On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: >>>>>> On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: >>>>>>> On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: >>>>>>>> On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: >>>>>>>>> On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: >>>>>>>>>> On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: >>>> ... >>>> >>>>>>>>> For example, in many places we have things like: >>>>>>>>> >>>>>>>>> if (!some_check(...) && !capable(...)) >>>>>>>>> return -EPERM; >>>>>>>>> >>>>>>>>> I would expect this is a similar logic. An operation can succeed if the >>>>>>>>> access control requirement is met. The mismatch we have through-out the >>>>>>>>> kernel is that capability checks aren't strictly done by LSM hooks. And >>>>>>>>> this series conceptually, I think, doesn't violate that -- it's changing >>>>>>>>> the logic of the capability checks, not the LSM (i.e. there no LSM hooks >>>>>>>>> yet here). >>>>>>>> Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which >>>>>>>> when it returns a positive value "bypasses kernel checks". The patch >>>>>>>> isn't based on either Linus' tree or the LSM tree, I'm guessing it is >>>>>>>> based on a eBPF tree, so I can't say with 100% certainty that it is >>>>>>>> bypassing a capability check, but the description claims that to be >>>>>>>> the case. 
>>>>>>>> >>>>>>>> Regardless of how you want to spin this, I'm not supportive of a LSM >>>>>>>> hook which allows a LSM to bypass a capability check. A LSM hook can >>>>>>>> be used to provide additional access control restrictions beyond a >>>>>>>> capability check, but a LSM hook should never be allowed to overrule >>>>>>>> an access denial due to a capability check. >>>>>>>> >>>>>>>>> The reason CAP_BPF was created was because there was nothing else that >>>>>>>>> would be fine-grained enough at the time. >>>>>>>> The LSM layer predates CAP_BPF, and one could make a very solid >>>>>>>> argument that one of the reasons LSMs exist is to provide >>>>>>>> supplementary controls due to capability-based access controls being a >>>>>>>> poor fit for many modern use cases. >>>>>>> I generally agree with what you say, but we DO have this code pattern: >>>>>>> >>>>>>> if (!some_check(...) && !capable(...)) >>>>>>> return -EPERM; >>>>>> I think we need to make this more concrete; we don't have a pattern in >>>>>> the upstream kernel where 'some_check(...)' is a LSM hook, right? >>>>>> Simply because there is another kernel access control mechanism which >>>>>> allows a capability check to be skipped doesn't mean I want to allow a >>>>>> LSM hook to be used to skip a capability check. >>>>> This work is an attempt to tighten the security of production systems >>>>> by allowing to drop too coarse-grained and permissive capabilities >>>>> (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more >>>>> than production use cases are meant to be able to do) and then grant >>>>> specific BPF operations on specific BPF programs/maps based on custom >>>>> LSM security policy, which validates application trustworthiness using >>>>> custom production-specific logic. >>>> There are ways to leverage the LSMs to apply finer grained access >>>> control on top of the relatively coarse capabilities that do not >>>> require circumventing those capability controls. 
One grants the >>>> capabilities, just as one would do today, and then leverages the >>>> security functionality of a LSM to further restrict specific users, >>>> applications, etc. with a level of granularity beyond that offered by >>>> the capability controls. >>> Please help me understand something. What you and Casey are proposing, >>> when taken to the logical extreme, is to grant to all processes root >>> permissions and then use LSM to restrict specific actions, do I >>> understand correctly? This strikes me as a less secure and more >>> error-prone way of doing things. >> When taken to the "logical extreme" most concepts end up sounding a >> bit absurd, but that was the point, wasn't it? > Wasn't my intent to make it sound absurd, sorry. The way I see it, for > the sake of example, let's say CAP_BPF allows 20 different operations > (each with its own security_xxx hook). And let's say in production I > want to only allow 3 of them. Sure, technically it should be possible > to deny access at 17 hooks and let it through in just those 3. But if > someone adds 21st and I forget to add 21st restriction, that would be > bad (but very probable with such approach). That would be a flaw in the implementation of the 21st, not a problem with the capabilities or LSM model. For the LSM model to be sufficiently flexible it cannot be required to prevent or detect coding errors. > So my point is that for situations like this, dropping CAP_BPF, but > allowing only 3 hooks to proceed seems a safer approach, because if we > add 21st hook, it will safely be denied without CAP_BPF *by default*. > That's what I tried to point out. When you're creating security relevant or enforcing mechanisms there has to be a level of expectation regarding the care with which they're developed. My expectation is that the 21st hook won't go in without adequate review. 
> But even if we ignore this "safe by default when a new hook is added" > behavior, when taking user namespaces into account, the restrictive > LSM approach just doesn't seem to work at all for something like > CAP_BPF. CAP_BPF cannot be "namespaced", just like, say, CAP_SYS_TIME, > because we cannot ensure that a given BPF program won't access kernel > state "belonging" to another process (as one example). Time namespaces have been proposed. I would be surprised if there aren't people working on BPF namespaces somewhere. There's a difference between "can't" and "haven't been". > Now, thanks to Jonathan, I get that there was a heated discussion 20 > years ago about authoritative vs restrictive LSMs. But if I read a > summary at that time ([0]), authoritative hooks were not out of the > question *in principle*. Surely, "walk before we can run" makes sense, > but it's been a while ago. Certainly. The SGI comment was mine, by the way. I wanted authoritative hooks for cases like POSIX ACLs and systems without root. While I would have liked the decision to go the other way, there's no way I would endorse a hybrid, where some hooks are restrictive and others authoritative. > [0] https://lwn.net/2001/1108/a/no-auth-hooks.php3 > > >> Here is a fun story which seems relevant ... in the early days of >> SELinux, one of the community devs setup up a system with a SELinux >> policy which restricted all privileged operations from the root user, >> put the system on a publicly accessible network, posted the root >> password for all to see, and invited the public to login to the system >> and attempt to exercise root privilege (it's been well over 10 years >> at this point so the details are a bit fuzzy). Granted, there were >> some hiccups in the beginning, mostly due to the crude state of policy >> development/analysis at the time, but after a few policy revisions the >> system held up quite well. 
> Honest question out of curiosity: was the intent to demonstrate that > with LSM one can completely restrict root? Or that root was actually > allowed to do something useful? Because I can see how rejecting > everything would be rather simple, but actually pretty useless in > practice. Restricting only part of the power of the root, while still > allowing it to do something useful in production seems like a much > harder (but way more valuable) endeavor. Not saying it's impossible, > but see my example about missing 21st new CAP_BPF functionality. Capabilities are sufficient to implement a rootless system. It's been done. Someone will point out that CAP_SYS_ADMIN is effectively root, and there's some truth to that. >> On the more practical side of things, there are several use cases >> which require, by way of legal or contractual requirements, that full >> root/admin privileges are decomposed into separate roles: security >> admin, audit admin, backup admin, etc. These users satisfy these >> requirements by using LSMs, such as SELinux, to restrict the >> administrative capabilities based on the SELinux user/role/domain. >> >>> By the way, even the above proposal of yours doesn't work for >>> production use cases when user namespaces are involved, as far as I >>> understand. We cannot grant CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN for >>> containers running inside user namespaces, as CAP_BPF in non-init >>> namespace is not enough for bpf() syscall to allow loading BPF maps or >>> BPF program ... >> Once again, the LSM has always intended to be a restrictive mechanism, >> not a privilege granting mechanism. If an operation is not possible > Not according to [0] above: > > > It is our belief that these changes do not belong in the initial version of > > LSM (especially given our limited charter and original goals), and should > > be proposed as incremental refinements after LSM has been initially > > accepted. > > ... 
> > It is our belief that the current LSM > > will provide a meaningful improvement in the security infrastructure of the > > Linux kernel, and that there is plenty of room for future expansion of LSM > > in subsequent phases. > > I don't see "always intended to be a restrictive mechanism" there. Having been on the other side of the argument, the system that was accepted was in fact "always intended to be a restrictive mechanism". The quote above is a "never say never" statement. >> without the LSM layer enabled, it should not be possible with the LSM >> layer enabled. The LSM is not a mechanism to circumvent other access >> control mechanisms in the kernel. > I understand, but it's not like we are proposing to go and bypass all > kinds of random kernel security mechanisms. These are targeted hooks, > developed by the BPF community for the BPF subsystem to allow trusted > unprivileged production use cases. Yes, we currently rely on checking > CAP_BPF to grant more dangerous/advanced features, but it's because we > can't just allow any unprivileged process to do this. But what we > really want is to answer the question "can we trust this process to > use this advanced functionality", and if there is no specific LSM > policy that cares one way (allow) or the other (disallow), fallback to > CAP_BPF enforcement. > > So it's not bypassing kernel checks, but rather augmenting them with > more flexible and customizable mechanisms, while still falling back to > CAP_BPF if the user didn't install any custom LSM policy. That would make CAP_BPF behave differently from all other capabilities. Capabilities are hard enough to use correctly as it is. If each capability defined its own semantics they would be completely unusable. >>> Also, in previous email you said: >>> >>>> Simply because there is another kernel access control mechanism which >>>> allows a capability check to be skipped doesn't mean I want to allow a >>>> LSM hook to be used to skip a capability check. 
>>> I understand your stated position, but can you please help me >>> understand the reasoning behind it? >> Keeping the LSM as a restrictive access control mechanism helps ensure >> some level of sanity and consistency across different Linux >> installations. If a certain operation requires CAP_SYS_ADMIN on one >> Linux system, it should require CAP_SYS_ADMIN on another Linux system. >> Granted, a LSM running on one system might impose additional >> constraints on that operation, but the CAP_SYS_ADMIN requirement still >> applies. >> >> There is also an issue of safety in knowing that enabling a LSM will >> not degrade the access controls on a system by potentially granting >> operations that were previously denied. >> >>> Does the above also mean that you'd be fine if we just don't plug into >>> the LSM subsystem at all and instead come up with some ad-hoc solution >>> to allow effectively the same policies? This sounds detrimental both >>> to LSM and BPF subsystems, so I hope we can talk this through before >>> finalizing decisions. >> Based on your patches and our discussion, it seems to me that the >> problem you are trying to resolve is related more to the >> capability-based access controls in the eBPF, and possibly other >> kernel subsystems, and not any LSM-based restrictions. I'm happy to >> work with you on a solution involving the LSM, but please understand >> that I'm not going to support a solution which changes a core >> philosophy of the LSM layer. > Great, I'd really appreciate help and suggestions on how to solve the > following problem. > > We have a BPF subsystem that allows loading BPF programs. Those BPF > programs cannot be contained within a particular namespace just by its > system-wide tracing nature (it can safely read kernel and user memory > and we can't restrict whether that memory belongs to a particular > namespace), so it's like CAP_SYS_TIME, just with much broader API > surface. 
This doesn't sound like a problem, it sounds like BPF is explicitly designed to prevent interference by namespaces. But in some cases you now want to limit it by namespaces. It appears that the desired uses of BPF are no longer compatible with its original security model. That's unfortunate, and likely to require a significant change to the implementation of BPF. > > The other piece of a puzzle is user namespaces. We do want to run > applications inside user namespaces, but allow them to use BPF > programs. As far as I can tell, there is no way to grant real CAP_BPF > that will be recognized by capable(CAP_BPF) (not ns_capable, see above > about system-wide nature of BPF). If there is, please help me > understand how. All my local experiments failed, and looking at > cap_capable() implementation it is not intended to even check the > initial namespace's capability if the process is running in the user > namespace. > > > So, given that a) we can't make CAP_BPF namespace-aware and b) we > can't grant real CAP_BPF to processes in user namespace, how could we > allow user namespaced applications to do useful work with BPF? > >>> Lastly, you mentioned before: >>> >>>>>> I think we need to make this more concrete; we don't have a pattern in >>>>>> the upstream kernel where 'some_check(...)' is a LSM hook, right? >>> Unfortunately I don't have enough familiarity with all LSM hooks, so I >>> can't confirm or disprove the above statement. But earlier someone >>> brought to my attention the case of security_vm_enough_memory_mm(), >>> which seems to be granting effectively CAP_SYS_ADMIN for the purposes >>> of memory accounting. Am I missing something subtle there or does it >>> grant effective caps indeed? >> Some of the comments around that hook can be misleading, but if you >> look at the actual code it starts to make more sense. >> > [...] 
> >> I do agree that the security_vm_enough_memory() hook is structured a >> bit differently than most of the other LSM hooks, but it still >> operates with the same philosophy: a LSM should only be allowed to >> restrict access, a LSM should never be allowed to grant access that >> would otherwise be denied by the traditional Linux access controls. >> >> Hopefully that explanation makes sense, but if things are still a bit >> fuzzy I would encourage you to go look at the code, I'm sure it will >> make sense once you spend a few minutes figuring out how it works. >> > Yep, thanks a lot, it's way more clear after grokking relevant pieces > of LSM the code you pointed out and LSM infrastructure in general. > "capabilities" LSM is non-negotiable, so it effectively always > restricts a small subset of hooks, including vm_enough_memory and > capable. > > Still, the problem still stands. How do we marry BPF and user > namespaces? I'd really appreciate suggestions. Thank you! > > >> [1] There is a long and sorta bizarre history with the capability LSM, >> but just understand it is a bit "special" in many ways, and those >> "special" behaviors are intentional. >> >> -- >> paul-moore.com
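The user-namespace point raised above (that capable(CAP_BPF) fails for a process inside a non-init user namespace even when CAP_BPF is in its effective set) can be illustrated with a small user-space model. This is only a sketch that loosely follows the namespace walk in the kernel's cap_capable(); the model_* structs and functions are hypothetical names invented for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_CAP_BPF 39 /* numeric value of CAP_BPF in the uapi header */

/* Hypothetical, heavily simplified stand-ins for kernel structures. */
struct model_user_ns {
    const struct model_user_ns *parent; /* NULL for the initial namespace */
    int level;                          /* 0 for the initial namespace */
    unsigned int owner_uid;             /* uid that created the namespace */
};

struct model_cred {
    const struct model_user_ns *user_ns; /* namespace the task lives in */
    unsigned int euid;
    uint64_t cap_effective;              /* bitmask of effective capabilities */
};

static const struct model_user_ns model_init_ns = { NULL, 0, 0 };

/* Does cred hold cap with respect to the target namespace? */
static int model_ns_capable(const struct model_cred *cred,
                            const struct model_user_ns *target, int cap)
{
    const struct model_user_ns *ns = target;

    assert(cred && cred->user_ns && target);
    for (;;) {
        /* Same namespace: the effective set decides. */
        if (ns == cred->user_ns)
            return ((cred->cap_effective >> cap) & 1) ? 0 : -EPERM;
        /* Target is not a descendant of the task's namespace: deny. */
        if (ns->level <= cred->user_ns->level)
            return -EPERM;
        /* The owner of a child namespace holds all caps inside it. */
        if (ns->parent == cred->user_ns && ns->owner_uid == cred->euid)
            return 0;
        ns = ns->parent;
    }
}

/* Model of capable(): always evaluated against the initial namespace. */
static int model_capable(const struct model_cred *cred, int cap)
{
    return model_ns_capable(cred, &model_init_ns, cap);
}
```

In this model a task whose user_ns is a child of the initial namespace gets -EPERM from model_capable() no matter what its effective set contains, while the same task passes model_ns_capable() against its own namespace. That is the gap described in the quoted text: BPF checks are made against the initial namespace because BPF programs are system-wide.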
On 4/17/2023 5:28 PM, Andrii Nakryiko wrote: > On Mon, Apr 17, 2023 at 4:53 PM Casey Schaufler <casey@schaufler-ca.com> wrote: >> ... >> >> The BPF developers are in complete control of what CAP_BPF controls. >> You can easily address the granularity issue by adding addition restrictions >> on processes that have CAP_BPF. That is the intended use of LSM. >> The whole point of having multiple capabilities is so that you can >> grant just those that are required by the system security policy, and >> do so safely. That leads to differences of opinion regarding the definition >> of the system security policy. BPF chose to set itself up as an element >> of security policy (you need CAP_BPF) rather than define elements such that >> existing capabilities (CAP_FOWNER, CAP_KILL, CAP_MAC_OVERRIDE, ...) would >> control. >>> Please see my reply to Paul, where I explain CAP_BPF's system-wide >>> nature and problem with user namespaces. I don't think the problem is >>> in the granularity of CAP_BPF, it's more of a "non-namespaceable" >>> nature of the BPF subsystem in general. >> Paul is approaching this from a different angle. Your response to Paul >> does not address the issue I have raised. > I see, I definitely missed this. Re-reading your reply, I still am not > clear on what you are proposing, tbh. Can you please elaborate what > you have in mind? As requested, I've moved over to the "other" thread.
On Mon, Apr 17, 2023 at 7:29 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > On Thu, Apr 13, 2023 at 8:11 AM Paul Moore <paul@paul-moore.com> wrote: > > On Thu, Apr 13, 2023 at 1:16 AM Andrii Nakryiko > > <andrii.nakryiko@gmail.com> wrote: > > > On Wed, Apr 12, 2023 at 7:56 PM Paul Moore <paul@paul-moore.com> wrote: > > > > On Wed, Apr 12, 2023 at 9:43 PM Andrii Nakryiko > > > > <andrii.nakryiko@gmail.com> wrote: > > > > > On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: > > > > > > On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: > > > > > > > On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: > > > > > > > > On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > > > > > > > > > On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > > > > > > > > > > On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > > > > > > > > ... > > > > > > > > > > > > > For example, in many places we have things like: > > > > > > > > > > > > > > > > > > if (!some_check(...) && !capable(...)) > > > > > > > > > return -EPERM; > > > > > > > > > > > > > > > > > > I would expect this is a similar logic. An operation can succeed if the > > > > > > > > > access control requirement is met. The mismatch we have through-out the > > > > > > > > > kernel is that capability checks aren't strictly done by LSM hooks. And > > > > > > > > > this series conceptually, I think, doesn't violate that -- it's changing > > > > > > > > > the logic of the capability checks, not the LSM (i.e. there no LSM hooks > > > > > > > > > yet here). > > > > > > > > > > > > > > > > Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > > > > > > > > when it returns a positive value "bypasses kernel checks". 
The patch > > > > > > > > isn't based on either Linus' tree or the LSM tree, I'm guessing it is > > > > > > > > based on a eBPF tree, so I can't say with 100% certainty that it is > > > > > > > > bypassing a capability check, but the description claims that to be > > > > > > > > the case. > > > > > > > > > > > > > > > > Regardless of how you want to spin this, I'm not supportive of a LSM > > > > > > > > hook which allows a LSM to bypass a capability check. A LSM hook can > > > > > > > > be used to provide additional access control restrictions beyond a > > > > > > > > capability check, but a LSM hook should never be allowed to overrule > > > > > > > > an access denial due to a capability check. > > > > > > > > > > > > > > > > > The reason CAP_BPF was created was because there was nothing else that > > > > > > > > > would be fine-grained enough at the time. > > > > > > > > > > > > > > > > The LSM layer predates CAP_BPF, and one could make a very solid > > > > > > > > argument that one of the reasons LSMs exist is to provide > > > > > > > > supplementary controls due to capability-based access controls being a > > > > > > > > poor fit for many modern use cases. > > > > > > > > > > > > > > I generally agree with what you say, but we DO have this code pattern: > > > > > > > > > > > > > > if (!some_check(...) && !capable(...)) > > > > > > > return -EPERM; > > > > > > > > > > > > I think we need to make this more concrete; we don't have a pattern in > > > > > > the upstream kernel where 'some_check(...)' is a LSM hook, right? > > > > > > Simply because there is another kernel access control mechanism which > > > > > > allows a capability check to be skipped doesn't mean I want to allow a > > > > > > LSM hook to be used to skip a capability check. 
> > > > > > > > > > This work is an attempt to tighten the security of production systems > > > > > by allowing to drop too coarse-grained and permissive capabilities > > > > > (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more > > > > > than production use cases are meant to be able to do) and then grant > > > > > specific BPF operations on specific BPF programs/maps based on custom > > > > > LSM security policy, which validates application trustworthiness using > > > > > custom production-specific logic. > > > > > > > > There are ways to leverage the LSMs to apply finer grained access > > > > control on top of the relatively coarse capabilities that do not > > > > require circumventing those capability controls. One grants the > > > > capabilities, just as one would do today, and then leverages the > > > > security functionality of a LSM to further restrict specific users, > > > > applications, etc. with a level of granularity beyond that offered by > > > > the capability controls. > > > > > > Please help me understand something. What you and Casey are proposing, > > > when taken to the logical extreme, is to grant to all processes root > > > permissions and then use LSM to restrict specific actions, do I > > > understand correctly? This strikes me as a less secure and more > > > error-prone way of doing things. > > > > When taken to the "logical extreme" most concepts end up sounding a > > bit absurd, but that was the point, wasn't it? > > Wasn't my intent to make it sound absurd, sorry. The way I see it, for > the sake of example, let's say CAP_BPF allows 20 different operations > (each with its own security_xxx hook). And let's say in production I > want to only allow 3 of them. Sure, technically it should be possible > to deny access at 17 hooks and let it through in just those 3. But if > someone adds 21st and I forget to add 21st restriction, that would be > bad (but very probably with such approach). 
Welcome to the challenges of maintaining access controls within the Linux Kernel, LSM or otherwise. As we all know, the Linux Kernel moves forward at a staggering pace sometimes, and it is not uncommon for new features/subsystems to be added without consulting all of the different folks who worry about access controls. In many cases it can be a simple misunderstanding, but in some cases it's a willful rejection of a particular form of access control, the LSM being a prime example. Thankfully in almost all of those cases we have been moderately successful in retrofitting the necessary access controls, sometimes they are not as good/capable/granular/etc. as we would like because of design limitations, but such is life. I say this not because I believe this is a valid argument for authoritative LSM hooks, I say this simply to acknowledge that this *is* a problem. > So my point is that for situations like this, dropping CAP_BPF, but > allowing only 3 hooks to proceed seems a safer approach, because if we > add 21st hook, it will safely be denied without CAP_BPF *by default*. > That's what I tried to point out. I believe I understand your point, I just disagree with you on accepting authoritative LSM hooks in the upstream Linux Kernel; I believe it would be a *big* mistake to move away from the restrictive LSM hook philosophy at this point in time. > But even if we ignore this "safe by default when a new hook is added" > behavior, when taking user namespaces into account, the restrictive > LSM approach just doesn't seem to work at all for something like > CAP_BPF. CAP_BPF cannot be "namespaced", just like, say, CAP_SYS_TIME, > because we cannot ensure that a given BPF program won't access kernel > state "belonging" to another process (as one example). Once again, the root of this problem lies in the capabilities and/or namespace mechanisms, not the LSM; if you want to fix this properly you should be looking at how eBPF leverages capabilities for access control. 
Changing the very core behavior of the LSM layer in order to work around an issue with another access control mechanism is a non-starter. I can't say this enough. > Now, thanks to Jonathan, I get that there was a heated discussion 20 > years ago about authoritative vs restrictive LSMs. But if I read a > summary at that time ([0]), authoritative hooks were not out of the > question *in principle*. Surely, "walk before we can run" makes sense, > but it's been a while ago. ... and once again, the restrictive approach has proven to work reasonably well over the past ~20 years, why would we abandon that simply to work around a problem with a different access control mechanism. Don't break the LSM layer to fix something else. > > Here is a fun story which seems relevant ... in the early days of > > SELinux, one of the community devs setup up a system with a SELinux > > policy which restricted all privileged operations from the root user, > > put the system on a publicly accessible network, posted the root > > password for all to see, and invited the public to login to the system > > and attempt to exercise root privilege (it's been well over 10 years > > at this point so the details are a bit fuzzy). Granted, there were > > some hiccups in the beginning, mostly due to the crude state of policy > > development/analysis at the time, but after a few policy revisions the > > system held up quite well. > > Honest question out of curiosity: was the intent to demonstrate that > with LSM one can completely restrict root? Or that root was actually > allowed to do something useful? The intent was to show that it is possible to restrict capability-based access controls with the LSM layer; it was the best example of the "logical extreme" carried out in the real world that I could think of when writing my response. 
> > On the more practical side of things, there are several use cases > > which require, by way of legal or contractual requirements, that full > > root/admin privileges are decomposed into separate roles: security > > admin, audit admin, backup admin, etc. These users satisfy these > > requirements by using LSMs, such as SELinux, to restrict the > > administrative capabilities based on the SELinux user/role/domain. > > > > > By the way, even the above proposal of yours doesn't work for > > > production use cases when user namespaces are involved, as far as I > > > understand. We cannot grant CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN for > > > containers running inside user namespaces, as CAP_BPF in non-init > > > namespace is not enough for bpf() syscall to allow loading BPF maps or > > > BPF program ... > > > > Once again, the LSM has always intended to be a restrictive mechanism, > > not a privilege granting mechanism. If an operation is not possible > > Not according to [0] above: When one considers what has been present in Linus' tree, then yes. The idea of authoritative LSM hooks has been rejected for ~20 years and I've seen nothing in this thread to make me believe that we should change that now, and for this use case. > > Based on your patches and our discussion, it seems to me that the > > problem you are trying to resolve is related more to the > > capability-based access controls in the eBPF, and possibly other > > kernel subsystems, and not any LSM-based restrictions. I'm happy to > > work with you on a solution involving the LSM, but please understand > > that I'm not going to support a solution which changes a core > > philosophy of the LSM layer. > > Great, I'd really appreciate help and suggestions on how to solve the > following problem. > > We have a BPF subsystem that allows loading BPF programs. 
Those BPF > programs cannot be contained within a particular namespace just by its > system-wide tracing nature (it can safely read kernel and user memory > and we can't restrict whether that memory belongs to a particular > namespace), so it's like CAP_SYS_TIME, just with much broader API > surface. > > The other piece of a puzzle is user namespaces. We do want to run > applications inside user namespaces, but allow them to use BPF > programs. As far as I can tell, there is no way to grant real CAP_BPF > that will be recognized by capable(CAP_BPF) (not ns_capable, see above > about system-wide nature of BPF). If there is, please help me > understand how. All my local experiments failed, and looking at > cap_capable() implementation it is not intended to even check the > initial namespace's capability if the process is running in the user > namespace. > > So, given that a) we can't make CAP_BPF namespace-aware and b) we > can't grant real CAP_BPF to processes in user namespace, how could we > allow user namespaced applications to do useful work with BPF? I would start by talking with the user namespace folks. I may be misunderstanding the problem as you've described it, but it seems like the core issue is how capabilities, specifically CAP_BPF, are handled in user namespaces. To be honest, I'm not sure how much luck you'll have there, but you stand a better chance in changing how capabilities are handled across user namespaces than you do in getting an authoritative LSM hook merged. Regardless, my offer still stands, if you have a solution which sticks to a restrictive LSM model, I'm happy to work with you further to sort out the details and try to make that work. I don't have any great ideas there at the moment, but there are plenty of smart people on this mailing list and others who might have something clever in mind.
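The disagreement in this message comes down to how a hook verdict is combined with a capability check. The sketch below contrasts the two compositions in plain user-space C, assuming a three-valued hook result (deny, no opinion, allow) as the thread describes for the proposed hooks. The enum and both functions are hypothetical illustrations, not code from the patch series or from the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical three-valued hook verdict, for illustration only. */
enum model_verdict {
    MODEL_DENY = -1,      /* policy rejects the operation */
    MODEL_NO_OPINION = 0, /* no policy cares either way */
    MODEL_ALLOW = 1       /* policy explicitly trusts the caller */
};

/*
 * Restrictive composition: the capability is always required and the
 * hook can only add a denial on top of it.
 */
static int model_check_restrictive(bool has_cap, enum model_verdict v)
{
    if (!has_cap)
        return -EPERM;
    if (v == MODEL_DENY)
        return -EPERM;
    return 0;
}

/*
 * Authoritative composition, as the thread describes the proposal:
 * a positive verdict skips the capability check, and "no opinion"
 * falls back to the capability.
 */
static int model_check_authoritative(bool has_cap, enum model_verdict v)
{
    if (v == MODEL_DENY)
        return -EPERM;
    if (v == MODEL_ALLOW)
        return 0;
    return has_cap ? 0 : -EPERM;
}

/* The only inputs on which the two compositions disagree. */
static void model_semantics_demo(void)
{
    assert(model_check_restrictive(false, MODEL_ALLOW) == -EPERM);
    assert(model_check_authoritative(false, MODEL_ALLOW) == 0);
}
```

The two functions differ only when the caller lacks the capability and the hook says allow. In the restrictive model, enabling a policy can never make a previously denied operation succeed, which is the property being defended above.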
On Mon, Apr 17, 2023 at 04:31:31PM -0700, Andrii Nakryiko wrote: Hi, I hope the week is going well for everyone. > On Fri, Apr 14, 2023 at 1:24 PM Dr. Greg <greg@enjellic.com> wrote: > > > > On Wed, Apr 12, 2023 at 10:47:13AM -0700, Kees Cook wrote: > > > > Hi, I hope the week is ending well for everyone. > > > > > On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > > > > On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > > > > > > > > > > Add new LSM hooks, bpf_map_create_security and bpf_btf_load_security, which > > > > > are meant to allow highly-granular LSM-based control over the usage of BPF > > > > > subsystem. Specifically, to control the creation of BPF maps and BTF data > > > > > objects, which are fundamental building blocks of any modern BPF application. > > > > > > > > > > These new hooks are able to override default kernel-side CAP_BPF-based (and > > > > > sometimes CAP_NET_ADMIN-based) permission checks. It is now possible to > > > > > implement LSM policies that could granularly enforce more restrictions on > > > > > a per-BPF map basis (beyond checking coarse CAP_BPF/CAP_NET_ADMIN > > > > > capabilities), but also, importantly, allow to *bypass kernel-side > > > > > enforcement* of CAP_BPF/CAP_NET_ADMIN checks for trusted applications and use > > > > > cases. > > > > > > > > One of the hallmarks of the LSM has always been that it is > > > > non-authoritative: it cannot unilaterally grant access, it can only > > > > restrict what would have been otherwise permitted on a traditional > > > > Linux system. Put another way, a LSM should not undermine the Linux > > > > discretionary access controls, e.g. capabilities. > > > > > > > > If there is a problem with the eBPF capability-based access controls, > > > > that problem needs to be addressed in how the core eBPF code > > > > implements its capability checks, not by modifying the LSM mechanism > > > > to bypass these checks.
> > > > > I think semantics matter here. I wouldn't view this as _bypassing_ > > > capability enforcement: it's just more fine-grained access control. > > > > > > For example, in many places we have things like: > > > > > > if (!some_check(...) && !capable(...)) > > > return -EPERM; > > > > > > I would expect this is a similar logic. An operation can succeed if the > > > access control requirement is met. The mismatch we have through-out the > > > kernel is that capability checks aren't strictly done by LSM hooks. And > > > this series conceptually, I think, doesn't violate that -- it's changing > > > the logic of the capability checks, not the LSM (i.e. there no LSM hooks > > > yet here). > > > > > > The reason CAP_BPF was created was because there was nothing else that > > > would be fine-grained enough at the time. > > This was one of the issues, among others, that the TSEM LSM we are > > working to upstream, was designed to address and may be an avenue > > forward. > > > > TSEM, being narratival rather than deontologically based, provides a > > framework for security permissions that are based on a > > characterization of the event itself. So the permissions are as > > variable as the contents of whatever BPF related information is passed > > to the bpf* LSM hooks [1]. > > > > Currently, the tsem_bpf_* hooks are generically modeled. We would > > certainly entertain any discussion or suggestions as to what elements > > of the structures passed to the hooks would be useful with respect > > to establishing security policies useful and appropriate to the BPF > > community. > Could you please provide some links to get a bit more context and > information? I'd like to understand at least "narratival rather than > deontologically based" part of this. We don't have much in the way of links, hopefully some simple prose will be helpful. 'Narratival vs deontological' contrasts the logic philosophy that is being used in the design of a security architecture. 
Deontological implies that the security architecture is 'rules' based. A concept embraced by the classic mandatory access control architectures such as SELinux. Narratival, the logic predicate embraced by TSEM, implies that the security architecture is events based and is constructed from a narration of a known good workload by unit testing. At the risk of indulging in further philosophical wonkiness, the two bodies of logic arise from the contrasting philosophies espoused by Immanuel Kant and Georg Wilhelm Friedrich Hegel. It is somewhat less precise, but a security architecture that is rules based would be considered 'Kantian' motivated while an events based architecture would be considered 'Hegelian' inspired. So, departing from epistemology, what does all of this mean with respect to security? In a policy based architecture, the security decision is a product of the rules, in the case of SELinux a rather complex corpus, that have been established to regulate the interaction of a role, subject and object label. In an events based architecture, the security decision is a product of the characteristics of the event. From a granularity perspective, which seems to be an issue in this BPF/BTF discussion, the granularity of the security decision can be as variable as any of the characteristics that are used to describe the LSM event at the operating system level. In TSEM, the characteristics of the event are used to generate a unique numeric coefficient specific to the event. The TSEM documentation discusses the functional generation of these coefficients. In the case of the three bpf LSM hooks that are in 6.5, this would be any of the characteristics embodied in the following variables.
bpf command bpf_attributes bpf_map fmode_t bpf_prog With respect to your problem at hand; Paul Moore suggested elsewhere in this thread that there were smart people hanging around on the list that might be able to comment on the challenge of CAP_BPF lacking granularity and being unavailable in a user namespace. I can't claim to being very smart, but I did hook up the big screen TV at our lake place in west-central Minnesota and it worked the first time, so here goes some thoughts. I can't claim a great deal of experience with BPF, but I'm assuming that any of the characteristics above, or that would be passed to the proposed BPF LSM hooks, would embody sufficient information about a BPF program to fully characterize it from a security perspective. I'm also assuming that the BPF implementation in the Linux kernel is now sufficiently featureful for a BPF program to assist in making a security decision by analyzing any of the attributes passed to an LSM hook for a subsequent and subordinate BPF program. We currently don't have support in TSEM for connecting a BPF program to an in kernel Trusted Modeling Agent (TMA), but it is on our radar screen, desperately seeking attention cycles. With such hypothetical support in place, I would propose gating the ability to attach a BPF program to a TMA with CAP_BPF. Said program would then assume the role of assisting the TMA in generating the security coefficients for subsequent BPF related security events in the modeling namespace. At that point, the security behavior of subsequent BPF programs will be under the control of the security model being run by the TMA assigned to that security namespace. It can be as granular and restrictive as any security characteristics that would be described as being relevant to BPF. From a security perspective, you don't write any security policy, you unit test the BPF application and the trust orchestrator generates the security model that would be subsequently enforced. 
With this model, you don't override any existing security controls and the LSM implementation remains purely restrictive. CAP_BPF regulates whether the BPF infrastructure can be accessed and BPF itself becomes responsible for defining the permissible security behavior of any subordinate BPF applications. There are undoubtedly considerations needed in the BPF implementation to support this model but I haven't had time to look at those particulars. There is further discussion of the concepts involved in the 18+ page documentation file that was included in the V0 release of TSEM. Here is the lore link for the original series: https://lore.kernel.org/linux-security-module/20230204050954.11583-1-greg@enjellic.com/#t The V1 release, currently being finalized, is a significantly enhanced implementation but the architectural and security concepts discussed are all still relevant, if there is a desire to dig into this further. With respect to the thinking and writings of Kant and Hegel, Wikipedia is your friend.... :-) To conclude in a big picture context, if it hasn't already jumped out at people: while TSEM operates practically from a narratival design perspective, it is designed to do so by applying either deterministic or machine learning models to the characterization and enforcement of the security behavior of a platform. The reason we have a somewhat intense interest in BPF is that HIDS based machine learning models need to do characteristic screening in order to be properly trained for anomaly detection. BPF is a pathway to achieving this with a single kernel based trusted modeling agent implementation. Now, back to figuring out how to hook up the stereo/hifi. Have a good remainder of the week. As always, Dr. Greg The Quixote Project - Flailing at the Travails of Cybersecurity
On Mon, Apr 17, 2023 at 5:48 PM Casey Schaufler <casey@schaufler-ca.com> wrote: > > On 4/17/2023 4:29 PM, Andrii Nakryiko wrote: > > On Thu, Apr 13, 2023 at 8:11 AM Paul Moore <paul@paul-moore.com> wrote: > >> On Thu, Apr 13, 2023 at 1:16 AM Andrii Nakryiko > >> <andrii.nakryiko@gmail.com> wrote: > >>> On Wed, Apr 12, 2023 at 7:56 PM Paul Moore <paul@paul-moore.com> wrote: > >>>> On Wed, Apr 12, 2023 at 9:43 PM Andrii Nakryiko > >>>> <andrii.nakryiko@gmail.com> wrote: > >>>>> On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: > >>>>>> On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: > >>>>>>> On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: > >>>>>>>> On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > >>>>>>>>> On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > >>>>>>>>>> On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > >>>> ... > >>>> > >>>>>>>>> For example, in many places we have things like: > >>>>>>>>> > >>>>>>>>> if (!some_check(...) && !capable(...)) > >>>>>>>>> return -EPERM; > >>>>>>>>> > >>>>>>>>> I would expect this is a similar logic. An operation can succeed if the > >>>>>>>>> access control requirement is met. The mismatch we have through-out the > >>>>>>>>> kernel is that capability checks aren't strictly done by LSM hooks. And > >>>>>>>>> this series conceptually, I think, doesn't violate that -- it's changing > >>>>>>>>> the logic of the capability checks, not the LSM (i.e. there no LSM hooks > >>>>>>>>> yet here). > >>>>>>>> Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > >>>>>>>> when it returns a positive value "bypasses kernel checks". The patch > >>>>>>>> isn't based on either Linus' tree or the LSM tree, I'm guessing it is > >>>>>>>> based on a eBPF tree, so I can't say with 100% certainty that it is > >>>>>>>> bypassing a capability check, but the description claims that to be > >>>>>>>> the case. 
> >>>>>>>> > >>>>>>>> Regardless of how you want to spin this, I'm not supportive of a LSM > >>>>>>>> hook which allows a LSM to bypass a capability check. A LSM hook can > >>>>>>>> be used to provide additional access control restrictions beyond a > >>>>>>>> capability check, but a LSM hook should never be allowed to overrule > >>>>>>>> an access denial due to a capability check. > >>>>>>>> > >>>>>>>>> The reason CAP_BPF was created was because there was nothing else that > >>>>>>>>> would be fine-grained enough at the time. > >>>>>>>> The LSM layer predates CAP_BPF, and one could make a very solid > >>>>>>>> argument that one of the reasons LSMs exist is to provide > >>>>>>>> supplementary controls due to capability-based access controls being a > >>>>>>>> poor fit for many modern use cases. > >>>>>>> I generally agree with what you say, but we DO have this code pattern: > >>>>>>> > >>>>>>> if (!some_check(...) && !capable(...)) > >>>>>>> return -EPERM; > >>>>>> I think we need to make this more concrete; we don't have a pattern in > >>>>>> the upstream kernel where 'some_check(...)' is a LSM hook, right? > >>>>>> Simply because there is another kernel access control mechanism which > >>>>>> allows a capability check to be skipped doesn't mean I want to allow a > >>>>>> LSM hook to be used to skip a capability check. > >>>>> This work is an attempt to tighten the security of production systems > >>>>> by allowing to drop too coarse-grained and permissive capabilities > >>>>> (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more > >>>>> than production use cases are meant to be able to do) and then grant > >>>>> specific BPF operations on specific BPF programs/maps based on custom > >>>>> LSM security policy, which validates application trustworthiness using > >>>>> custom production-specific logic. 
> >>>> There are ways to leverage the LSMs to apply finer grained access > >>>> control on top of the relatively coarse capabilities that do not > >>>> require circumventing those capability controls. One grants the > >>>> capabilities, just as one would do today, and then leverages the > >>>> security functionality of a LSM to further restrict specific users, > >>>> applications, etc. with a level of granularity beyond that offered by > >>>> the capability controls. > >>> Please help me understand something. What you and Casey are proposing, > >>> when taken to the logical extreme, is to grant to all processes root > >>> permissions and then use LSM to restrict specific actions, do I > >>> understand correctly? This strikes me as a less secure and more > >>> error-prone way of doing things. > >> When taken to the "logical extreme" most concepts end up sounding a > >> bit absurd, but that was the point, wasn't it? > > Wasn't my intent to make it sound absurd, sorry. The way I see it, for > > the sake of example, let's say CAP_BPF allows 20 different operations > > (each with its own security_xxx hook). And let's say in production I > > want to only allow 3 of them. Sure, technically it should be possible > > to deny access at 17 hooks and let it through in just those 3. But if > > someone adds a 21st and I forget to add a 21st restriction, that would be > > bad (but very probable with such an approach). > > That would be a flaw in the implementation of the 21st, not a problem > with the capabilities or LSM model. For the LSM model to be sufficiently > flexible it cannot be required to prevent or detect coding errors. > > > So my point is that for situations like this, dropping CAP_BPF, but > > allowing only 3 hooks to proceed seems a safer approach, because if we > > add a 21st hook, it will safely be denied without CAP_BPF *by default*. > > That's what I tried to point out.
> > When you're creating security relevant or enforcing mechanisms there has > to be a level of expectation regarding the care with which they're > developed. My expectation is that the 21st hook won't go in without > adequate review. > That's not how it works with BPF LSM, but there is no point in arguing about this. I agree that LSM shouldn't be prevented from adding new hooks just because of some particular LSM implementation. > > But even if we ignore this "safe by default when a new hook is added" > > behavior, when taking user namespaces into account, the restrictive > > LSM approach just doesn't seem to work at all for something like > > CAP_BPF. CAP_BPF cannot be "namespaced", just like, say, CAP_SYS_TIME, > > because we cannot ensure that a given BPF program won't access kernel > > state "belonging" to another process (as one example). > > Time namespaces have been proposed. I would be surprised if there aren't > people working on BPF namespaces somewhere. There's a difference between > "can't" and "haven't been". > It really is "can't" for BPF, as it allows tracing of kernel internals. > > Now, thanks to Jonathan, I get that there was a heated discussion 20 > > years ago about authoritative vs restrictive LSMs. But if I read a > > summary from that time ([0]), authoritative hooks were not out of the > > question *in principle*. Surely, "walk before we can run" makes sense, > > but that was a while ago. > > Certainly. The SGI comment was mine, by the way. I wanted authoritative > hooks for cases like POSIX ACLs and systems without root. While I would > have liked the decision to go the other way, there's no way I would endorse > a hybrid, where some hooks are restrictive and others authoritative. > Yep, saw your comments as well. Can't say I get what would be wrong with having authoritative hooks together with restrictive ones, but oh well. > > [0] https://lwn.net/2001/1108/a/no-auth-hooks.php3 > > > > > >> Here is a fun story which seems relevant ...
in the early days of > >> SELinux, one of the community devs setup up a system with a SELinux > >> policy which restricted all privileged operations from the root user, > >> put the system on a publicly accessible network, posted the root > >> password for all to see, and invited the public to login to the system > >> and attempt to exercise root privilege (it's been well over 10 years > >> at this point so the details are a bit fuzzy). Granted, there were > >> some hiccups in the beginning, mostly due to the crude state of policy > >> development/analysis at the time, but after a few policy revisions the > >> system held up quite well. > > Honest question out of curiosity: was the intent to demonstrate that > > with LSM one can completely restrict root? Or that root was actually > > allowed to do something useful? Because I can see how rejecting > > everything would be rather simple, but actually pretty useless in > > practice. Restricting only part of the power of the root, while still > > allowing it to do something useful in production seems like a much > > harder (but way more valuable) endeavor. Not saying it's impossible, > > but see my example about missing 21st new CAP_BPF functionality. > > Capabilities are sufficient to implement a rootless system. It's been done. > Someone will point out that CAP_SYS_ADMIN is effectively root, and there's > some truth to that. > > >> On the more practical side of things, there are several use cases > >> which require, by way of legal or contractual requirements, that full > >> root/admin privileges are decomposed into separate roles: security > >> admin, audit admin, backup admin, etc. These users satisfy these > >> requirements by using LSMs, such as SELinux, to restrict the > >> administrative capabilities based on the SELinux user/role/domain. > >> > >>> By the way, even the above proposal of yours doesn't work for > >>> production use cases when user namespaces are involved, as far as I > >>> understand. 
We cannot grant CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN for > >>> containers running inside user namespaces, as CAP_BPF in non-init > >>> namespace is not enough for bpf() syscall to allow loading BPF maps or > >>> BPF program ... > >> Once again, the LSM has always intended to be a restrictive mechanism, > >> not a privilege granting mechanism. If an operation is not possible > > Not according to [0] above: > > > > > It is our belief that these changes do not belong in the initial version of > > > LSM (especially given our limited charter and original goals), and should > > > be proposed as incremental refinements after LSM has been initially > > > accepted. > > > ... > > > It is our belief that the current LSM > > > will provide a meaningful improvement in the security infrastructure of the > > > Linux kernel, and that there is plenty of room for future expansion of LSM > > > in subsequent phases. > > > > I don't see "always intended to be a restrictive mechanism" there. > > Having been on the other side of the argument, the system that was accepted > was in fact "always intended to be a restrictive mechanism". The quote above > is a "never say never" statement. > > >> without the LSM layer enabled, it should not be possible with the LSM > >> layer enabled. The LSM is not a mechanism to circumvent other access > >> control mechanisms in the kernel. > > I understand, but it's not like we are proposing to go and bypass all > > kinds of random kernel security mechanisms. These are targeted hooks, > > developed by the BPF community for the BPF subsystem to allow trusted > > unprivileged production use cases. Yes, we currently rely on checking > > CAP_BPF to grant more dangerous/advanced features, but it's because we > > can't just allow any unprivileged process to do this. 
But what we > > really want is to answer the question "can we trust this process to > > use this advanced functionality", and if there is no specific LSM > > policy that cares one way (allow) or the other (disallow), fallback to > > CAP_BPF enforcement. > > > > So it's not bypassing kernel checks, but rather augmenting them with > > more flexible and customizable mechanisms, while still falling back to > > CAP_BPF if the user didn't install any custom LSM policy. > > That would make CAP_BPF behave differently from all other capabilities. > Capabilities are hard enough to use correctly as it is. If each capability > defined its own semantics they would be completely unusable. > > >>> Also, in previous email you said: > >>> > >>>> Simply because there is another kernel access control mechanism which > >>>> allows a capability check to be skipped doesn't mean I want to allow a > >>>> LSM hook to be used to skip a capability check. > >>> I understand your stated position, but can you please help me > >>> understand the reasoning behind it? > >> Keeping the LSM as a restrictive access control mechanism helps ensure > >> some level of sanity and consistency across different Linux > >> installations. If a certain operation requires CAP_SYS_ADMIN on one > >> Linux system, it should require CAP_SYS_ADMIN on another Linux system. > >> Granted, a LSM running on one system might impose additional > >> constraints on that operation, but the CAP_SYS_ADMIN requirement still > >> applies. > >> > >> There is also an issue of safety in knowing that enabling a LSM will > >> not degrade the access controls on a system by potentially granting > >> operations that were previously denied. > >> > >>> Does the above also mean that you'd be fine if we just don't plug into > >>> the LSM subsystem at all and instead come up with some ad-hoc solution > >>> to allow effectively the same policies? 
This sounds detrimental both > >>> to LSM and BPF subsystems, so I hope we can talk this through before > >>> finalizing decisions. > >> Based on your patches and our discussion, it seems to me that the > >> problem you are trying to resolve is related more to the > >> capability-based access controls in the eBPF, and possibly other > >> kernel subsystems, and not any LSM-based restrictions. I'm happy to > >> work with you on a solution involving the LSM, but please understand > >> that I'm not going to support a solution which changes a core > >> philosophy of the LSM layer. > > Great, I'd really appreciate help and suggestions on how to solve the > > following problem. > > > > We have a BPF subsystem that allows loading BPF programs. Those BPF > > programs cannot be contained within a particular namespace just by its > > system-wide tracing nature (it can safely read kernel and user memory > > and we can't restrict whether that memory belongs to a particular > > namespace), so it's like CAP_SYS_TIME, just with much broader API > > surface. > > This doesn't sound like a problem, it sounds like BPF is explicitly > designed to prevent interference by namespaces. But in some cases you > now want to limit it by namespaces. > > It appears that the desired uses of BPF are no longer compatible with > its original security model. That's unfortunate, and likely to require > a significant change to the implementation of BPF. > I have some new ideas, so hopefully not as significant. While I still think that authoritative LSM hooks would be great, I'll stop arguing. I'll get back with a different proposal that would allow BPF usage within user namespaces. We still will want LSM hooks for fine-grained control, but I think we'll be able to make them restrictive-only. > > > > The other piece of a puzzle is user namespaces. We do want to run > > applications inside user namespaces, but allow them to use BPF > > programs. 
As far as I can tell, there is no way to grant real CAP_BPF > > that will be recognized by capable(CAP_BPF) (not ns_capable, see above > > about system-wide nature of BPF). If there is, please help me > > understand how. All my local experiments failed, and looking at > > cap_capable() implementation it is not intended to even check the > > initial namespace's capability if the process is running in the user > > namespace. > > > > > > So, given that a) we can't make CAP_BPF namespace-aware and b) we > > can't grant real CAP_BPF to processes in user namespace, how could we > > allow user namespaced applications to do useful work with BPF? > > > >>> Lastly, you mentioned before: > >>> > >>>>>> I think we need to make this more concrete; we don't have a pattern in > >>>>>> the upstream kernel where 'some_check(...)' is a LSM hook, right? > >>> Unfortunately I don't have enough familiarity with all LSM hooks, so I > >>> can't confirm or disprove the above statement. But earlier someone > >>> brought to my attention the case of security_vm_enough_memory_mm(), > >>> which seems to be granting effectively CAP_SYS_ADMIN for the purposes > >>> of memory accounting. Am I missing something subtle there or does it > >>> grant effective caps indeed? > >> Some of the comments around that hook can be misleading, but if you > >> look at the actual code it starts to make more sense. > >> > > [...] > > > >> I do agree that the security_vm_enough_memory() hook is structured a > >> bit differently than most of the other LSM hooks, but it still > >> operates with the same philosophy: a LSM should only be allowed to > >> restrict access, a LSM should never be allowed to grant access that > >> would otherwise be denied by the traditional Linux access controls. > >> > >> Hopefully that explanation makes sense, but if things are still a bit > >> fuzzy I would encourage you to go look at the code, I'm sure it will > >> make sense once you spend a few minutes figuring out how it works. 
> >> > > Yep, thanks a lot, it's much clearer after grokking the relevant pieces > > of the LSM code you pointed out and the LSM infrastructure in general. > > The "capabilities" LSM is non-negotiable, so it effectively always > > restricts a small subset of hooks, including vm_enough_memory and > > capable. > > > > Still, the problem stands. How do we marry BPF and user > > namespaces? I'd really appreciate suggestions. Thank you! > > > > > >> [1] There is a long and sorta bizarre history with the capability LSM, > >> but just understand it is a bit "special" in many ways, and those > >> "special" behaviors are intentional. > >> > >> -- > >> paul-moore.com
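The two ways of combining a capability check with an LSM hook that are debated above can be sketched in plain user-space C. This is only an illustrative model, not kernel code: `enum lsm_verdict`, `check_restrictive()` and `check_authoritative()` are hypothetical names, and the three-way verdict merely approximates the "positive return value bypasses kernel checks" behavior described for the proposed hooks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical verdicts a toy LSM policy can return for one operation. */
enum lsm_verdict {
	LSM_DENY = -1,       /* policy rejects the operation */
	LSM_NO_OPINION = 0,  /* no policy installed, or policy does not care */
	LSM_ALLOW = 1,       /* policy vouches for the caller */
};

/*
 * Restrictive composition: the capability check always applies, and the
 * hook can only turn a "yes" into a "no". LSM_ALLOW carries no extra
 * weight here; it is treated like LSM_NO_OPINION.
 */
static int check_restrictive(bool has_cap, enum lsm_verdict v)
{
	if (!has_cap)
		return -EPERM;
	if (v == LSM_DENY)
		return -EPERM;
	return 0;
}

/*
 * Authoritative composition, as the thread describes the proposal:
 * an explicit allow skips the capability check, an explicit deny
 * rejects, and no opinion falls back to the capability check.
 */
static int check_authoritative(bool has_cap, enum lsm_verdict v)
{
	if (v == LSM_DENY)
		return -EPERM;
	if (v == LSM_ALLOW)
		return 0;
	return has_cap ? 0 : -EPERM;
}
```

In this model the "21st operation" case corresponds to `LSM_NO_OPINION`: under the restrictive scheme a task that was granted the capability gets `0` (allowed unless the policy is updated), while under the authoritative scheme a task that dropped the capability gets `-EPERM` (denied by default). The disagreement in the thread is about `check_authoritative(false, LSM_ALLOW)` returning `0`, which the restrictive model never does.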
On Tue, Apr 18, 2023 at 7:21 AM Paul Moore <paul@paul-moore.com> wrote: > > On Mon, Apr 17, 2023 at 7:29 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > On Thu, Apr 13, 2023 at 8:11 AM Paul Moore <paul@paul-moore.com> wrote: > > > On Thu, Apr 13, 2023 at 1:16 AM Andrii Nakryiko > > > <andrii.nakryiko@gmail.com> wrote: > > > > On Wed, Apr 12, 2023 at 7:56 PM Paul Moore <paul@paul-moore.com> wrote: > > > > > On Wed, Apr 12, 2023 at 9:43 PM Andrii Nakryiko > > > > > <andrii.nakryiko@gmail.com> wrote: > > > > > > On Wed, Apr 12, 2023 at 12:07 PM Paul Moore <paul@paul-moore.com> wrote: > > > > > > > On Wed, Apr 12, 2023 at 2:28 PM Kees Cook <keescook@chromium.org> wrote: > > > > > > > > On Wed, Apr 12, 2023 at 02:06:23PM -0400, Paul Moore wrote: > > > > > > > > > On Wed, Apr 12, 2023 at 1:47 PM Kees Cook <keescook@chromium.org> wrote: > > > > > > > > > > On Wed, Apr 12, 2023 at 12:49:06PM -0400, Paul Moore wrote: > > > > > > > > > > > On Wed, Apr 12, 2023 at 12:33 AM Andrii Nakryiko <andrii@kernel.org> wrote: > > > > > > > > > > ... > > > > > > > > > > > > > > > For example, in many places we have things like: > > > > > > > > > > > > > > > > > > > > if (!some_check(...) && !capable(...)) > > > > > > > > > > return -EPERM; > > > > > > > > > > > > > > > > > > > > I would expect this is a similar logic. An operation can succeed if the > > > > > > > > > > access control requirement is met. The mismatch we have through-out the > > > > > > > > > > kernel is that capability checks aren't strictly done by LSM hooks. And > > > > > > > > > > this series conceptually, I think, doesn't violate that -- it's changing > > > > > > > > > > the logic of the capability checks, not the LSM (i.e. there no LSM hooks > > > > > > > > > > yet here). > > > > > > > > > > > > > > > > > > Patch 04/08 creates a new LSM hook, security_bpf_map_create(), which > > > > > > > > > when it returns a positive value "bypasses kernel checks". 
The patch > > > > > > > > > isn't based on either Linus' tree or the LSM tree, I'm guessing it is > > > > > > > > > based on a eBPF tree, so I can't say with 100% certainty that it is > > > > > > > > > bypassing a capability check, but the description claims that to be > > > > > > > > > the case. > > > > > > > > > > > > > > > > > > Regardless of how you want to spin this, I'm not supportive of a LSM > > > > > > > > > hook which allows a LSM to bypass a capability check. A LSM hook can > > > > > > > > > be used to provide additional access control restrictions beyond a > > > > > > > > > capability check, but a LSM hook should never be allowed to overrule > > > > > > > > > an access denial due to a capability check. > > > > > > > > > > > > > > > > > > > The reason CAP_BPF was created was because there was nothing else that > > > > > > > > > > would be fine-grained enough at the time. > > > > > > > > > > > > > > > > > > The LSM layer predates CAP_BPF, and one could make a very solid > > > > > > > > > argument that one of the reasons LSMs exist is to provide > > > > > > > > > supplementary controls due to capability-based access controls being a > > > > > > > > > poor fit for many modern use cases. > > > > > > > > > > > > > > > > I generally agree with what you say, but we DO have this code pattern: > > > > > > > > > > > > > > > > if (!some_check(...) && !capable(...)) > > > > > > > > return -EPERM; > > > > > > > > > > > > > > I think we need to make this more concrete; we don't have a pattern in > > > > > > > the upstream kernel where 'some_check(...)' is a LSM hook, right? > > > > > > > Simply because there is another kernel access control mechanism which > > > > > > > allows a capability check to be skipped doesn't mean I want to allow a > > > > > > > LSM hook to be used to skip a capability check. 
> > > > > > > > > > > > This work is an attempt to tighten the security of production systems > > > > > > by allowing to drop too coarse-grained and permissive capabilities > > > > > > (like CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, which inevitable allow more > > > > > > than production use cases are meant to be able to do) and then grant > > > > > > specific BPF operations on specific BPF programs/maps based on custom > > > > > > LSM security policy, which validates application trustworthiness using > > > > > > custom production-specific logic. > > > > > > > > > > There are ways to leverage the LSMs to apply finer grained access > > > > > control on top of the relatively coarse capabilities that do not > > > > > require circumventing those capability controls. One grants the > > > > > capabilities, just as one would do today, and then leverages the > > > > > security functionality of a LSM to further restrict specific users, > > > > > applications, etc. with a level of granularity beyond that offered by > > > > > the capability controls. > > > > > > > > Please help me understand something. What you and Casey are proposing, > > > > when taken to the logical extreme, is to grant to all processes root > > > > permissions and then use LSM to restrict specific actions, do I > > > > understand correctly? This strikes me as a less secure and more > > > > error-prone way of doing things. > > > > > > When taken to the "logical extreme" most concepts end up sounding a > > > bit absurd, but that was the point, wasn't it? > > > > Wasn't my intent to make it sound absurd, sorry. The way I see it, for > > the sake of example, let's say CAP_BPF allows 20 different operations > > (each with its own security_xxx hook). And let's say in production I > > want to only allow 3 of them. Sure, technically it should be possible > > to deny access at 17 hooks and let it through in just those 3. 
But if > > someone adds a 21st and I forget to add a 21st restriction, that would be > > bad (but very probable with such an approach). > > Welcome to the challenges of maintaining access controls within the > Linux Kernel, LSM or otherwise. As we all know, the Linux Kernel > moves forward at a staggering pace sometimes, and it is not uncommon > for new features/subsystems to be added without consulting all of the > different folks who worry about access controls. In many cases it can > be a simple misunderstanding, but in some cases it's a willful > rejection of a particular form of access control, the LSM being a > prime example. Thankfully in almost all of those cases we have been > moderately successful in retrofitting the necessary access controls, > sometimes they are not as good/capable/granular/etc. as we would like > because of design limitations, but such is life. > > I say this not because I believe this is a valid argument for > authoritative LSM hooks, I say this simply to acknowledge that this > *is* a problem. > Ack, thanks. > > So my point is that for situations like this, dropping CAP_BPF, but > > allowing only 3 hooks to proceed seems a safer approach, because if we > > add a 21st hook, it will safely be denied without CAP_BPF *by default*. > > That's what I tried to point out. > > I believe I understand your point, I just disagree with you on > accepting authoritative LSM hooks in the upstream Linux Kernel; I > believe it would be a *big* mistake to move away from the restrictive > LSM hook philosophy at this point in time. Ok, understood. While unfortunate, I'll stop pushing for authoritative LSMs. > > > But even if we ignore this "safe by default when a new hook is added" > > behavior, when taking user namespaces into account, the restrictive > > LSM approach just doesn't seem to work at all for something like > > CAP_BPF.
CAP_BPF cannot be "namespaced", just like, say, CAP_SYS_TIME, > > because we cannot ensure that a given BPF program won't access kernel > > state "belonging" to another process (as one example). > > Once again, the root of this problem lies in the capabilities and/or > namespace mechanisms, not the LSM; if you want to fix this properly > you should be looking at how eBPF leverages capabilities for access > control. Changing the very core behavior of the LSM layer in order to > work around an issue with another access control mechanism is a > non-starter. I can't say this enough. Alright. I now do have an alternative approach in mind that will only use restrictive LSMs and will still allow BPF usage within user namespaces. > > > Now, thanks to Jonathan, I get that there was a heated discussion 20 > > years ago about authoritative vs restrictive LSMs. But if I read a > > summary at that time ([0]), authoritative hooks were not out of the > > question *in principle*. Surely, "walk before we can run" makes sense, > > but it's been a while ago. > > ... and once again, the restrictive approach has proven to work > reasonably well over the past ~20 years, why would we abandon that > simply to work around a problem with a different access control > mechanism. Don't break the LSM layer to fix something else. There was no breakage introduced, let's call things by their proper names. Surely, new hooks were authoritative, but they don't really break anything, right? I understand that they go against your restrictive-only LSM philosophy, but it's not a breakage in any proper sense of that word. All existing hooks continue to work. New hooks would work properly as well. It's not a breakage. I'm not saying this to try to convince you, but let's not misrepresent what I tried to do in this patch set. > > > > Here is a fun story which seems relevant ... 
in the early days of > > > SELinux, one of the community devs setup up a system with a SELinux > > > policy which restricted all privileged operations from the root user, > > > put the system on a publicly accessible network, posted the root > > > password for all to see, and invited the public to login to the system > > > and attempt to exercise root privilege (it's been well over 10 years > > > at this point so the details are a bit fuzzy). Granted, there were > > > some hiccups in the beginning, mostly due to the crude state of policy > > > development/analysis at the time, but after a few policy revisions the > > > system held up quite well. > > > > Honest question out of curiosity: was the intent to demonstrate that > > with LSM one can completely restrict root? Or that root was actually > > allowed to do something useful? > > The intent was to show that it is possible to restrict > capability-based access controls with the LSM layer; it was the best > example of the "logical extreme" carried out in the real world that I > could think of when writing my response. > > > > On the more practical side of things, there are several use cases > > > which require, by way of legal or contractual requirements, that full > > > root/admin privileges are decomposed into separate roles: security > > > admin, audit admin, backup admin, etc. These users satisfy these > > > requirements by using LSMs, such as SELinux, to restrict the > > > administrative capabilities based on the SELinux user/role/domain. > > > > > > > By the way, even the above proposal of yours doesn't work for > > > > production use cases when user namespaces are involved, as far as I > > > > understand. We cannot grant CAP_BPF+CAP_PERFMON+CAP_NET_ADMIN for > > > > containers running inside user namespaces, as CAP_BPF in non-init > > > > namespace is not enough for bpf() syscall to allow loading BPF maps or > > > > BPF program ... 
> > > > > > Once again, the LSM has always intended to be a restrictive mechanism, > > > not a privilege granting mechanism. If an operation is not possible > > > > Not according to [0] above: > > When one considers what has been present in Linus' tree, then yes. > The idea of authoritative LSM hooks has been rejected for ~20 years > and I've seen nothing in this thread to make me believe that we should > change that now, and for this use case. Ack. > > > > Based on your patches and our discussion, it seems to me that the > > > problem you are trying to resolve is related more to the > > > capability-based access controls in the eBPF, and possibly other > > > kernel subsystems, and not any LSM-based restrictions. I'm happy to > > > work with you on a solution involving the LSM, but please understand > > > that I'm not going to support a solution which changes a core > > > philosophy of the LSM layer. > > > > Great, I'd really appreciate help and suggestions on how to solve the > > following problem. > > > > We have a BPF subsystem that allows loading BPF programs. Those BPF > > programs cannot be contained within a particular namespace just by its > > system-wide tracing nature (it can safely read kernel and user memory > > and we can't restrict whether that memory belongs to a particular > > namespace), so it's like CAP_SYS_TIME, just with much broader API > > surface. > > > > The other piece of a puzzle is user namespaces. We do want to run > > applications inside user namespaces, but allow them to use BPF > > programs. As far as I can tell, there is no way to grant real CAP_BPF > > that will be recognized by capable(CAP_BPF) (not ns_capable, see above > > about system-wide nature of BPF). If there is, please help me > > understand how. All my local experiments failed, and looking at > > cap_capable() implementation it is not intended to even check the > > initial namespace's capability if the process is running in the user > > namespace. 
> > > > So, given that a) we can't make CAP_BPF namespace-aware and b) we > > can't grant real CAP_BPF to processes in user namespace, how could we > > allow user namespaced applications to do useful work with BPF? > > I would start by talking with the user namespace folks. I may be > misunderstanding the problem as you've described it, but it seems like > the core issue is how capabilities, specifically CAP_BPF, are handled > in user namespaces. To be honest, I'm not sure how much luck you'll > have there, but you stand a better chance in changing how capabilities > are handled across user namespaces than you do in getting an > authoritative LSM hook merged. > You made it very clear, yes. > Regardless, my offer still stands, if you have a solution which sticks > to a restrictive LSM model, I'm happy to work with you further to sort > out the details and try to make that work. I don't have any great > ideas there at the moment, but there are plenty of smart people on > this mailing list and others who might have something clever in mind. I do have a solution in mind. Stay tuned. > > -- > paul-moore.com
On Thu, Apr 20, 2023 at 05:00:55PM -0700, Andrii Nakryiko wrote: > Alright. I now do have an alternative approach in mind that will only > use restrictive LSMs and will still allow BPF usage within user > namespaces. It seems the problem with the existing kernel is that bpf_capable() is rather inflexible. In only one place is sysctl_unprivileged_bpf_disabled checked (outside the unprivileged_ebpf_enabled() checks in CPU errata fixes). Should CAP_BPF be per-namespace?