diff mbox series

selinux: allow to opt-out from skipping kernel sockets in sock_has_perm()

Message ID 20230215141709.305399-1-omosnace@redhat.com (mailing list archive)
State Rejected
Delegated to: Paul Moore
Headers show
Series selinux: allow to opt-out from skipping kernel sockets in sock_has_perm() | expand

Commit Message

Ondrej Mosnacek Feb. 15, 2023, 2:17 p.m. UTC
There is currently a very suspicious condition in sock_has_perm() that
basically skips any permission checking in sock_has_perm() if the target
socket is labeled with SECINITSID_KERNEL. This seems bad for several
reasons:
1. If a user-space process somehow gets its hands on such a socket, it
   will be able to don any socket operation on it without any SELinux
   restriction, even if it is not allowed to do such operations by the
   policy.
2. The exception is inconsistent, since one can e.g. write to a stream
   socket not only via send(2)/sendto(2)/..., but also via write(2),
   which doesn't go through sock_has_perm() and isn't subject to the
   SECINITSID_KERNEL exception.
3. The exception also allows operations on sockets that were created
   before the SELinux policy was loaded (even by user space), since
   these will always inherit the SECINITSID_KERNEL label.

Additionally, it's unclear what is the rationale behind this exception.
For sockets created by the kernel that are expected to be passed to
user space, it seems better to let them undergo normal access checks to
avoid misuse. A possible rationale is to skip checking on operations on
sockets created with kern=1 passed to __sock_create(), which can happen
under user-space credentials even thogh executed internally by the
kernel - notice that such sockets are always labeled with
SECINITSID_KERNEL. However, the operations on these sockets already
normally bypass LSM checks entirely, so arguably this not necessary. On
the contrary, it's better if actual user-space operations on these
sockets go through SELinux checks, since there may be a possibility that
such a socket accidentally leaks into user space and we definitely want
SELinux to detect that and prevent privilege escalation.

Since removing this condition could lead to regressions (notably due to
bullet point (3.) above), add a new policy capability that allows the
policy to opt-out from the condition. This allows policy writers or
distributors to test for impact, add any missing rules, and then enable
the capability.

I tested a kernel with the condition removed on my Fedora workstation
and noted only one new denial, related to a user-space socket created by
systemd-journald before the policy is loaded, which is then continued to
be used by systemd-journald while the system is running.

Also selinux-testsuite is passing without new denials when the check is
removed.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 security/selinux/hooks.c                   | 4 +++-
 security/selinux/include/policycap.h       | 1 +
 security/selinux/include/policycap_names.h | 3 ++-
 security/selinux/include/security.h        | 7 +++++++
 4 files changed, 13 insertions(+), 2 deletions(-)

Comments

Stephen Smalley Feb. 15, 2023, 2:30 p.m. UTC | #1
On Wed, Feb 15, 2023 at 9:17 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>
> There is currently a very suspicious condition in sock_has_perm() that
> basically skips any permission checking in sock_has_perm() if the target
> socket is labeled with SECINITSID_KERNEL. This seems bad for several
> reasons:
> 1. If a user-space process somehow gets its hands on such a socket, it
>    will be able to don any socket operation on it without any SELinux
>    restriction, even if it is not allowed to do such operations by the
>    policy.
> 2. The exception is inconsistent, since one can e.g. write to a stream
>    socket not only via send(2)/sendto(2)/..., but also via write(2),
>    which doesn't go through sock_has_perm() and isn't subject to the
>    SECINITSID_KERNEL exception.
> 3. The exception also allows operations on sockets that were created
>    before the SELinux policy was loaded (even by user space), since
>    these will always inherit the SECINITSID_KERNEL label.
>
> Additionally, it's unclear what is the rationale behind this exception.
> For sockets created by the kernel that are expected to be passed to
> user space, it seems better to let them undergo normal access checks to
> avoid misuse. A possible rationale is to skip checking on operations on
> sockets created with kern=1 passed to __sock_create(), which can happen
> under user-space credentials even thogh executed internally by the
> kernel - notice that such sockets are always labeled with
> SECINITSID_KERNEL. However, the operations on these sockets already
> normally bypass LSM checks entirely, so arguably this not necessary. On
> the contrary, it's better if actual user-space operations on these
> sockets go through SELinux checks, since there may be a possibility that
> such a socket accidentally leaks into user space and we definitely want
> SELinux to detect that and prevent privilege escalation.
>
> Since removing this condition could lead to regressions (notably due to
> bullet point (3.) above), add a new policy capability that allows the
> policy to opt-out from the condition. This allows policy writers or
> distributors to test for impact, add any missing rules, and then enable
> the capability.
>
> I tested a kernel with the condition removed on my Fedora workstation
> and noted only one new denial, related to a user-space socket created by
> systemd-journald before the policy is loaded, which is then continued to
> be used by systemd-journald while the system is running.
>
> Also selinux-testsuite is passing without new denials when the check is
> removed.

I'll have to dig a bit in history to fully recover the original
motivation/background but IIRC, this had to do with kernel-internal
sockets created and used by the kernel on behalf of userspace.  For
example, sockets associated with NFS mounting and subsequent RPC. In
these cases, we don't necessarily want to allow the userspace process
to directly create/use such sockets and from the userspace
perspective, it is just acting on NFS files and not performing any
socket I/O. Open to doing it in a better way, or finding out that it
is no longer necessary.

>
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> ---
>  security/selinux/hooks.c                   | 4 +++-
>  security/selinux/include/policycap.h       | 1 +
>  security/selinux/include/policycap_names.h | 3 ++-
>  security/selinux/include/security.h        | 7 +++++++
>  4 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index 3c5be76a91991..cf7418df3d4c0 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -4561,7 +4561,9 @@ static int sock_has_perm(struct sock *sk, u32 perms)
>         struct common_audit_data ad;
>         struct lsm_network_audit net = {0,};
>
> -       if (sksec->sid == SECINITSID_KERNEL)
> +       /* legacy behavior was to not perform checks on kernel sockets */
> +       if (!selinux_policycap_check_kernel_sockets() &&
> +           sksec->sid == SECINITSID_KERNEL)
>                 return 0;
>
>         ad.type = LSM_AUDIT_DATA_NET;
> diff --git a/security/selinux/include/policycap.h b/security/selinux/include/policycap.h
> index f35d3458e71de..814db520b9d8b 100644
> --- a/security/selinux/include/policycap.h
> +++ b/security/selinux/include/policycap.h
> @@ -12,6 +12,7 @@ enum {
>         POLICYDB_CAP_NNP_NOSUID_TRANSITION,
>         POLICYDB_CAP_GENFS_SECLABEL_SYMLINKS,
>         POLICYDB_CAP_IOCTL_SKIP_CLOEXEC,
> +       POLICYDB_CAP_CHECK_KERNEL_SOCKETS,
>         __POLICYDB_CAP_MAX
>  };
>  #define POLICYDB_CAP_MAX (__POLICYDB_CAP_MAX - 1)
> diff --git a/security/selinux/include/policycap_names.h b/security/selinux/include/policycap_names.h
> index 2a87fc3702b81..62de8262f90fe 100644
> --- a/security/selinux/include/policycap_names.h
> +++ b/security/selinux/include/policycap_names.h
> @@ -13,7 +13,8 @@ const char *const selinux_policycap_names[__POLICYDB_CAP_MAX] = {
>         "cgroup_seclabel",
>         "nnp_nosuid_transition",
>         "genfs_seclabel_symlinks",
> -       "ioctl_skip_cloexec"
> +       "ioctl_skip_cloexec",
> +       "check_kernel_sockets",
>  };
>
>  #endif /* _SELINUX_POLICYCAP_NAMES_H_ */
> diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
> index 393aff41d3ef8..1e57c71d067fb 100644
> --- a/security/selinux/include/security.h
> +++ b/security/selinux/include/security.h
> @@ -230,6 +230,13 @@ static inline bool selinux_policycap_ioctl_skip_cloexec(void)
>         return READ_ONCE(state->policycap[POLICYDB_CAP_IOCTL_SKIP_CLOEXEC]);
>  }
>
> +static inline bool selinux_policycap_check_kernel_sockets(void)
> +{
> +       struct selinux_state *state = &selinux_state;
> +
> +       return READ_ONCE(state->policycap[POLICYDB_CAP_CHECK_KERNEL_SOCKETS]);
> +}
> +
>  struct selinux_policy_convert_data;
>
>  struct selinux_load_state {
> --
> 2.39.1
>
Paul Moore Feb. 15, 2023, 3:07 p.m. UTC | #2
On Wed, Feb 15, 2023 at 9:30 AM Stephen Smalley
<stephen.smalley.work@gmail.com> wrote:
> On Wed, Feb 15, 2023 at 9:17 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> >
> > There is currently a very suspicious condition in sock_has_perm() that
> > basically skips any permission checking in sock_has_perm() if the target
> > socket is labeled with SECINITSID_KERNEL. This seems bad for several
> > reasons:
> > 1. If a user-space process somehow gets its hands on such a socket, it
> >    will be able to don any socket operation on it without any SELinux
> >    restriction, even if it is not allowed to do such operations by the
> >    policy.
> > 2. The exception is inconsistent, since one can e.g. write to a stream
> >    socket not only via send(2)/sendto(2)/..., but also via write(2),
> >    which doesn't go through sock_has_perm() and isn't subject to the
> >    SECINITSID_KERNEL exception.
> > 3. The exception also allows operations on sockets that were created
> >    before the SELinux policy was loaded (even by user space), since
> >    these will always inherit the SECINITSID_KERNEL label.
> >
> > Additionally, it's unclear what is the rationale behind this exception.
> > For sockets created by the kernel that are expected to be passed to
> > user space, it seems better to let them undergo normal access checks to
> > avoid misuse. A possible rationale is to skip checking on operations on
> > sockets created with kern=1 passed to __sock_create(), which can happen
> > under user-space credentials even thogh executed internally by the
> > kernel - notice that such sockets are always labeled with
> > SECINITSID_KERNEL. However, the operations on these sockets already
> > normally bypass LSM checks entirely, so arguably this not necessary. On
> > the contrary, it's better if actual user-space operations on these
> > sockets go through SELinux checks, since there may be a possibility that
> > such a socket accidentally leaks into user space and we definitely want
> > SELinux to detect that and prevent privilege escalation.
> >
> > Since removing this condition could lead to regressions (notably due to
> > bullet point (3.) above), add a new policy capability that allows the
> > policy to opt-out from the condition. This allows policy writers or
> > distributors to test for impact, add any missing rules, and then enable
> > the capability.
> >
> > I tested a kernel with the condition removed on my Fedora workstation
> > and noted only one new denial, related to a user-space socket created by
> > systemd-journald before the policy is loaded, which is then continued to
> > be used by systemd-journald while the system is running.
> >
> > Also selinux-testsuite is passing without new denials when the check is
> > removed.
>
> I'll have to dig a bit in history to fully recover the original
> motivation/background but IIRC, this had to do with kernel-internal
> sockets created and used by the kernel on behalf of userspace.  For
> example, sockets associated with NFS mounting and subsequent RPC. In
> these cases, we don't necessarily want to allow the userspace process
> to directly create/use such sockets and from the userspace
> perspective, it is just acting on NFS files and not performing any
> socket I/O. Open to doing it in a better way, or finding out that it
> is no longer necessary.

I know that some network filesystems and netlink use kernel sockets
for their in-kernel endpoints.  Looking quickly at the code I see
several other potential candidates include the Plan 9 filesystem and
RDS.

I'm not opposed to removing the concept of a kernel socket, or
improving the consistency of how we enforce policy with respect to
kernel sockets, but I'm going to need to see a lot more justification
and testing than booting a system and running the test suite.
diff mbox series

Patch

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 3c5be76a91991..cf7418df3d4c0 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -4561,7 +4561,9 @@  static int sock_has_perm(struct sock *sk, u32 perms)
 	struct common_audit_data ad;
 	struct lsm_network_audit net = {0,};
 
-	if (sksec->sid == SECINITSID_KERNEL)
+	/* legacy behavior was to not perform checks on kernel sockets */
+	if (!selinux_policycap_check_kernel_sockets() &&
+	    sksec->sid == SECINITSID_KERNEL)
 		return 0;
 
 	ad.type = LSM_AUDIT_DATA_NET;
diff --git a/security/selinux/include/policycap.h b/security/selinux/include/policycap.h
index f35d3458e71de..814db520b9d8b 100644
--- a/security/selinux/include/policycap.h
+++ b/security/selinux/include/policycap.h
@@ -12,6 +12,7 @@  enum {
 	POLICYDB_CAP_NNP_NOSUID_TRANSITION,
 	POLICYDB_CAP_GENFS_SECLABEL_SYMLINKS,
 	POLICYDB_CAP_IOCTL_SKIP_CLOEXEC,
+	POLICYDB_CAP_CHECK_KERNEL_SOCKETS,
 	__POLICYDB_CAP_MAX
 };
 #define POLICYDB_CAP_MAX (__POLICYDB_CAP_MAX - 1)
diff --git a/security/selinux/include/policycap_names.h b/security/selinux/include/policycap_names.h
index 2a87fc3702b81..62de8262f90fe 100644
--- a/security/selinux/include/policycap_names.h
+++ b/security/selinux/include/policycap_names.h
@@ -13,7 +13,8 @@  const char *const selinux_policycap_names[__POLICYDB_CAP_MAX] = {
 	"cgroup_seclabel",
 	"nnp_nosuid_transition",
 	"genfs_seclabel_symlinks",
-	"ioctl_skip_cloexec"
+	"ioctl_skip_cloexec",
+	"check_kernel_sockets",
 };
 
 #endif /* _SELINUX_POLICYCAP_NAMES_H_ */
diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
index 393aff41d3ef8..1e57c71d067fb 100644
--- a/security/selinux/include/security.h
+++ b/security/selinux/include/security.h
@@ -230,6 +230,13 @@  static inline bool selinux_policycap_ioctl_skip_cloexec(void)
 	return READ_ONCE(state->policycap[POLICYDB_CAP_IOCTL_SKIP_CLOEXEC]);
 }
 
+static inline bool selinux_policycap_check_kernel_sockets(void)
+{
+	struct selinux_state *state = &selinux_state;
+
+	return READ_ONCE(state->policycap[POLICYDB_CAP_CHECK_KERNEL_SOCKETS]);
+}
+
 struct selinux_policy_convert_data;
 
 struct selinux_load_state {