selinux: introduce an initial SID for early boot processes

Message ID 20230612090145.1059245-1-omosnace@redhat.com
State Changes Requested
Delegated to: Paul Moore
Series selinux: introduce an initial SID for early boot processes

Commit Message

Ondrej Mosnacek June 12, 2023, 9:01 a.m. UTC
Currently, SELinux doesn't allow distinguishing between kernel threads
and userspace processes that are started before the policy is first
loaded - both get the label corresponding to the kernel SID. The only
way a process that persists from early boot can get a meaningful label
is by doing a voluntary dyntransition or re-executing itself.

Reusing the kernel label for userspace processes is problematic for
several reasons:
1. The kernel is considered to be a privileged domain and generally
   needs to have a wide range of permissions allowed to work correctly,
   which prevents the policy writer from effectively hardening against
   early boot processes that might remain running unintentionally after
   the policy is loaded (they represent a potential extra attack surface
   that should be mitigated).
2. Despite the kernel being treated as a privileged domain, the policy
   writer may want to impose certain special limitations on kernel
   threads that may conflict with the requirements of intentional early
   boot processes. For example, it is a good hardening practice to limit
   what executables the kernel can execute as usermode helpers and to
   confine the resulting usermode helper processes. However, a
   (legitimate) process surviving from early boot may need to execute a
   different set of executables.
3. As currently implemented, overlayfs remembers the security context of
   the process that created an overlayfs mount and uses it to bound
   subsequent operations on files using this context. If an overlayfs
   mount is created before the SELinux policy is loaded, these "mounter"
   checks are made against the kernel context, which may clash with
   restrictions on the kernel domain (see 2.).

To resolve this, introduce a new initial SID (reusing the slot of the
former "init" initial SID) that will be assigned to any userspace
process started before the policy is first loaded. This is easy to do,
as we can simply label any process that goes through the
bprm_creds_for_exec LSM hook with the new init-SID instead of
propagating the kernel SID from the parent.

To provide backwards compatibility for existing policies that are
unaware of this new semantic of the "init" initial SID, introduce a new
policy capability "userspace_initial_context" and set the "init" SID to
the same context as the "kernel" SID unless this capability is set by
the policy.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 security/selinux/hooks.c                      | 27 +++++++++++++++++++
 .../selinux/include/initial_sid_to_string.h   |  2 +-
 security/selinux/include/policycap.h          |  1 +
 security/selinux/include/policycap_names.h    |  3 ++-
 security/selinux/include/security.h           |  7 +++++
 security/selinux/ss/policydb.c                | 27 +++++++++++++++++++
 6 files changed, 65 insertions(+), 2 deletions(-)
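
The labeling rule and the compatibility fallback described in the commit message can be sketched as a small user-space model. This is only an illustration, not the kernel code: the `toy_` names, the SID numbers, and the reduced task structure are assumptions made for the example, and the post-policy-load path is simplified to plain inheritance (the real hook computes a transition).

```c
#include <stdbool.h>

/* Illustrative SID numbers; the kernel defines its own values. */
enum toy_sid { TOY_SID_KERNEL = 1, TOY_SID_INIT = 2 };

/* Hypothetical, reduced view of a task's security state. */
struct toy_tsec {
	unsigned int sid;
	unsigned int exec_sid;
};

/*
 * Model of the exec-time decision: before the first policy load,
 * every exec'd (userspace) task gets the "init" initial SID instead
 * of inheriting the parent's (kernel) SID.
 */
static void toy_label_on_exec(bool policy_loaded,
			      const struct toy_tsec *old_tsec,
			      struct toy_tsec *new_tsec)
{
	/* Simplified default: inherit the parent's SID. */
	new_tsec->sid = old_tsec->sid;
	new_tsec->exec_sid = 0;

	if (!policy_loaded) {
		new_tsec->sid = TOY_SID_INIT;
		return;
	}

	/* An explicitly requested exec SID wins once policy is loaded. */
	if (old_tsec->exec_sid)
		new_tsec->sid = old_tsec->exec_sid;
}

/*
 * Model of the compatibility rule at policy load: without the
 * "userspace_initial_context" capability, the "init" SID resolves to
 * the same context as the "kernel" SID.
 */
static const char *toy_init_context(bool cap_userspace_initial_context,
				    const char *kernel_ctx,
				    const char *init_ctx)
{
	return cap_userspace_initial_context ? init_ctx : kernel_ctx;
}
```

In this model, a policy that does not set the capability sees no change in behavior: early boot tasks carry a different SID number, but it maps to the kernel context.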

Comments

Paul Moore June 16, 2023, 8:43 p.m. UTC | #1
On Mon, Jun 12, 2023 at 5:01 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>
> Currently, SELinux doesn't allow distinguishing between kernel threads
> and userspace processes that are started before the policy is first
> loaded - both get the label corresponding to the kernel SID. The only
> way a process that persists from early boot can get a meaningful label
> is by doing a voluntary dyntransition or re-executing itself.
>
> Reusing the kernel label for userspace processes is problematic for
> several reasons:
> 1. The kernel is considered to be a privileged domain and generally
>    needs to have a wide range of permissions allowed to work correctly,
>    which prevents the policy writer from effectively hardening against
>    early boot processes that might remain running unintentionally after
>    the policy is loaded (they represent a potential extra attack surface
>    that should be mitigated).
> 2. Despite the kernel being treated as a privileged domain, the policy
>    writer may want to impose certain special limitations on kernel
>    threads that may conflict with the requirements of intentional early
>    boot processes. For example, it is a good hardening practice to limit
>    what executables the kernel can execute as usermode helpers and to
>    confine the resulting usermode helper processes. However, a
>    (legitimate) process surviving from early boot may need to execute a
>    different set of executables.
> 3. As currently implemented, overlayfs remembers the security context of
>    the process that created an overlayfs mount and uses it to bound
>    subsequent operations on files using this context. If an overlayfs
>    mount is created before the SELinux policy is loaded, these "mounter"
>    checks are made against the kernel context, which may clash with
>    restrictions on the kernel domain (see 2.).
>
> To resolve this, introduce a new initial SID (reusing the slot of the
> former "init" initial SID) that will be assigned to any userspace
> process started before the policy is first loaded. This is easy to do,
> as we can simply label any process that goes through the
> bprm_creds_for_exec LSM hook with the new init-SID instead of
> propagating the kernel SID from the parent.
>
> To provide backwards compatibility for existing policies that are
> unaware of this new semantic of the "init" initial SID, introduce a new
> policy capability "userspace_initial_context" and set the "init" SID to
> the same context as the "kernel" SID unless this capability is set by
> the policy.
>
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> ---
>  security/selinux/hooks.c                      | 27 +++++++++++++++++++
>  .../selinux/include/initial_sid_to_string.h   |  2 +-
>  security/selinux/include/policycap.h          |  1 +
>  security/selinux/include/policycap_names.h    |  3 ++-
>  security/selinux/include/security.h           |  7 +++++
>  security/selinux/ss/policydb.c                | 27 +++++++++++++++++++
>  6 files changed, 65 insertions(+), 2 deletions(-)

Thanks Ondrej, this looks pretty good to me.  There is some minor
nitpicky stuff below, but those comments are mainly FYIs for future
reference.  Given where we are at in the -rcX cycle, I am going to
hold off on merging this until after the upcoming merge window.

> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index 99ded60a6b911..dd410ceb178cb 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -2264,6 +2264,18 @@ static int selinux_bprm_creds_for_exec(struct linux_binprm *bprm)
>         new_tsec->keycreate_sid = 0;
>         new_tsec->sockcreate_sid = 0;
>
> +       /*
> +        * Before policy is loaded, label any task outside kernel space
> +        * as SECINITSID_INIT, so that any userspace tasks surviving from
> +        * early boot end up with a label different from SECINITSID_KERNEL
> +        * (if the policy chooses to set SECINITSID_INIT != SECINITSID_KERNEL).
> +        */
> +       if (!selinux_initialized()) {
> +               new_tsec->sid = SECINITSID_INIT;
> +               new_tsec->exec_sid = 0; /* just in case */

Style nit, I don't like placing trailing comments on the same line as
code.  Don't respin this patch just for this, but remember this for
future submissions.

> +               return 0;
> +       }
> +
>         if (old_tsec->exec_sid) {
>                 new_tsec->sid = old_tsec->exec_sid;
>                 /* Reset exec SID on execve. */

...

> diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
> index 97c0074f9312a..240e0fb1d57f9 100644
> --- a/security/selinux/ss/policydb.c
> +++ b/security/selinux/ss/policydb.c
> @@ -863,6 +863,8 @@ void policydb_destroy(struct policydb *p)
>  int policydb_load_isids(struct policydb *p, struct sidtab *s)
>  {
>         struct ocontext *head, *c;
> +       bool secsid_init_supported = ebitmap_get_bit(&p->policycaps,
> +                                                    POLICYDB_CAP_USERSPACE_INITIAL_CONTEXT);

This is another "please don't respin for this", but if you have to
respin for any reason can you change the variable name to
"isid_init_supported" or something similar?  The "secsid" portion of
the name looks wrong to me.

Ondrej Mosnacek June 17, 2023, 9:29 a.m. UTC | #2
On Fri, Jun 16, 2023 at 10:43 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Mon, Jun 12, 2023 at 5:01 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:

...

> > diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> > index 99ded60a6b911..dd410ceb178cb 100644
> > --- a/security/selinux/hooks.c
> > +++ b/security/selinux/hooks.c
> > @@ -2264,6 +2264,18 @@ static int selinux_bprm_creds_for_exec(struct linux_binprm *bprm)
> >         new_tsec->keycreate_sid = 0;
> >         new_tsec->sockcreate_sid = 0;
> >
> > +       /*
> > +        * Before policy is loaded, label any task outside kernel space
> > +        * as SECINITSID_INIT, so that any userspace tasks surviving from
> > +        * early boot end up with a label different from SECINITSID_KERNEL
> > +        * (if the policy chooses to set SECINITSID_INIT != SECINITSID_KERNEL).
> > +        */
> > +       if (!selinux_initialized()) {
> > +               new_tsec->sid = SECINITSID_INIT;
> > +               new_tsec->exec_sid = 0; /* just in case */
>
> Style nit, I don't like placing trailing comments on the same line as
> code.  Don't respin this patch just for this, but remember this for
> future submissions.

Ack.

> > +               return 0;
> > +       }
> > +
> >         if (old_tsec->exec_sid) {
> >                 new_tsec->sid = old_tsec->exec_sid;
> >                 /* Reset exec SID on execve. */
>
> ...
>
> > diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
> > index 97c0074f9312a..240e0fb1d57f9 100644
> > --- a/security/selinux/ss/policydb.c
> > +++ b/security/selinux/ss/policydb.c
> > @@ -863,6 +863,8 @@ void policydb_destroy(struct policydb *p)
> >  int policydb_load_isids(struct policydb *p, struct sidtab *s)
> >  {
> >         struct ocontext *head, *c;
> > +       bool secsid_init_supported = ebitmap_get_bit(&p->policycaps,
> > +                                                    POLICYDB_CAP_USERSPACE_INITIAL_CONTEXT);
>
> This is another "please don't respin for this", but if you have to
> respin for any reason can you change the variable name to
> "isid_init_supported" or something similar?  The "secsid" portion of
> the name looks wrong to me.

It was supposed to be "secinitsid_init_supported" but I botched it :)
Though that name is very long, so if I were to change it, I would go
with your suggestion.

Paul Moore June 19, 2023, 9:10 p.m. UTC | #3
On Sat, Jun 17, 2023 at 5:30 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> On Fri, Jun 16, 2023 at 10:43 PM Paul Moore <paul@paul-moore.com> wrote:
> > On Mon, Jun 12, 2023 at 5:01 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>
> ...
>
> > > diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> > > index 99ded60a6b911..dd410ceb178cb 100644
> > > --- a/security/selinux/hooks.c
> > > +++ b/security/selinux/hooks.c
> > > @@ -2264,6 +2264,18 @@ static int selinux_bprm_creds_for_exec(struct linux_binprm *bprm)
> > >         new_tsec->keycreate_sid = 0;
> > >         new_tsec->sockcreate_sid = 0;
> > >
> > > +       /*
> > > +        * Before policy is loaded, label any task outside kernel space
> > > +        * as SECINITSID_INIT, so that any userspace tasks surviving from
> > > +        * early boot end up with a label different from SECINITSID_KERNEL
> > > +        * (if the policy chooses to set SECINITSID_INIT != SECINITSID_KERNEL).
> > > +        */
> > > +       if (!selinux_initialized()) {
> > > +               new_tsec->sid = SECINITSID_INIT;
> > > +               new_tsec->exec_sid = 0; /* just in case */
> >
> > Style nit, I don't like placing trailing comments on the same line as
> > code.  Don't respin this patch just for this, but remember this for
> > future submissions.
>
> Ack.
>
> > > +               return 0;
> > > +       }
> > > +
> > >         if (old_tsec->exec_sid) {
> > >                 new_tsec->sid = old_tsec->exec_sid;
> > >                 /* Reset exec SID on execve. */
> >
> > ...
> >
> > > diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
> > > index 97c0074f9312a..240e0fb1d57f9 100644
> > > --- a/security/selinux/ss/policydb.c
> > > +++ b/security/selinux/ss/policydb.c
> > > @@ -863,6 +863,8 @@ void policydb_destroy(struct policydb *p)
> > >  int policydb_load_isids(struct policydb *p, struct sidtab *s)
> > >  {
> > >         struct ocontext *head, *c;
> > > +       bool secsid_init_supported = ebitmap_get_bit(&p->policycaps,
> > > +                                                    POLICYDB_CAP_USERSPACE_INITIAL_CONTEXT);
> >
> > This is another "please don't respin for this", but if you have to
> > respin for any reason can you change the variable name to
> > "isid_init_supported" or something similar?  The "secsid" portion of
> > the name looks wrong to me.
>
> It was supposed to be "secinitsid_init_supported" but I botched it :)
> Though that name is very long, so if I were to change it, I would go
> with your suggestion.

 :)

Since we've got a couple of weeks (-rc7 + merge window), why not go
ahead and do the respin to fix up those small things and simplify the
policycap accessor (see the other patch I posted) - does that sound
reasonable?

Ondrej Mosnacek June 20, 2023, 9:24 a.m. UTC | #4
On Mon, Jun 19, 2023 at 11:10 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Sat, Jun 17, 2023 at 5:30 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> > On Fri, Jun 16, 2023 at 10:43 PM Paul Moore <paul@paul-moore.com> wrote:
> > > On Mon, Jun 12, 2023 at 5:01 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> >
> > ...
> >
> > > > diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> > > > index 99ded60a6b911..dd410ceb178cb 100644
> > > > --- a/security/selinux/hooks.c
> > > > +++ b/security/selinux/hooks.c
> > > > @@ -2264,6 +2264,18 @@ static int selinux_bprm_creds_for_exec(struct linux_binprm *bprm)
> > > >         new_tsec->keycreate_sid = 0;
> > > >         new_tsec->sockcreate_sid = 0;
> > > >
> > > > +       /*
> > > > +        * Before policy is loaded, label any task outside kernel space
> > > > +        * as SECINITSID_INIT, so that any userspace tasks surviving from
> > > > +        * early boot end up with a label different from SECINITSID_KERNEL
> > > > +        * (if the policy chooses to set SECINITSID_INIT != SECINITSID_KERNEL).
> > > > +        */
> > > > +       if (!selinux_initialized()) {
> > > > +               new_tsec->sid = SECINITSID_INIT;
> > > > +               new_tsec->exec_sid = 0; /* just in case */
> > >
> > > Style nit, I don't like placing trailing comments on the same line as
> > > code.  Don't respin this patch just for this, but remember this for
> > > future submissions.
> >
> > Ack.
> >
> > > > +               return 0;
> > > > +       }
> > > > +
> > > >         if (old_tsec->exec_sid) {
> > > >                 new_tsec->sid = old_tsec->exec_sid;
> > > >                 /* Reset exec SID on execve. */
> > >
> > > ...
> > >
> > > > diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
> > > > index 97c0074f9312a..240e0fb1d57f9 100644
> > > > --- a/security/selinux/ss/policydb.c
> > > > +++ b/security/selinux/ss/policydb.c
> > > > @@ -863,6 +863,8 @@ void policydb_destroy(struct policydb *p)
> > > >  int policydb_load_isids(struct policydb *p, struct sidtab *s)
> > > >  {
> > > >         struct ocontext *head, *c;
> > > > +       bool secsid_init_supported = ebitmap_get_bit(&p->policycaps,
> > > > +                                                    POLICYDB_CAP_USERSPACE_INITIAL_CONTEXT);
> > >
> > > This is another "please don't respin for this", but if you have to
> > > respin for any reason can you change the variable name to
> > > "isid_init_supported" or something similar?  The "secsid" portion of
> > > the name looks wrong to me.
> >
> > It was supposed to be "secinitsid_init_supported" but I botched it :)
> > Though that name is very long, so if I were to change it, I would go
> > with your suggestion.
>
>  :)
>
> Since we've got a couple of weeks (-rc7 + merge window), why not go
> ahead and do the respin to fix up those small things and simplify the
> policycap accessor (see the other patch I posted) - does that sound
> reasonable?

Sure, will do.

Paul Moore Oct. 18, 2023, 8:13 p.m. UTC | #5
On Mon, Jun 12, 2023 at 5:01 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>
> Currently, SELinux doesn't allow distinguishing between kernel threads
> and userspace processes that are started before the policy is first
> loaded - both get the label corresponding to the kernel SID. The only
> way a process that persists from early boot can get a meaningful label
> is by doing a voluntary dyntransition or re-executing itself.
>
> Reusing the kernel label for userspace processes is problematic for
> several reasons:
> 1. The kernel is considered to be a privileged domain and generally
>    needs to have a wide range of permissions allowed to work correctly,
>    which prevents the policy writer from effectively hardening against
>    early boot processes that might remain running unintentionally after
>    the policy is loaded (they represent a potential extra attack surface
>    that should be mitigated).
> 2. Despite the kernel being treated as a privileged domain, the policy
>    writer may want to impose certain special limitations on kernel
>    threads that may conflict with the requirements of intentional early
>    boot processes. For example, it is a good hardening practice to limit
>    what executables the kernel can execute as usermode helpers and to
>    confine the resulting usermode helper processes. However, a
>    (legitimate) process surviving from early boot may need to execute a
>    different set of executables.
> 3. As currently implemented, overlayfs remembers the security context of
>    the process that created an overlayfs mount and uses it to bound
>    subsequent operations on files using this context. If an overlayfs
>    mount is created before the SELinux policy is loaded, these "mounter"
>    checks are made against the kernel context, which may clash with
>    restrictions on the kernel domain (see 2.).
>
> To resolve this, introduce a new initial SID (reusing the slot of the
> former "init" initial SID) that will be assigned to any userspace
> process started before the policy is first loaded. This is easy to do,
> as we can simply label any process that goes through the
> bprm_creds_for_exec LSM hook with the new init-SID instead of
> propagating the kernel SID from the parent.
>
> To provide backwards compatibility for existing policies that are
> unaware of this new semantic of the "init" initial SID, introduce a new
> policy capability "userspace_initial_context" and set the "init" SID to
> the same context as the "kernel" SID unless this capability is set by
> the policy.
>
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> ---
>  security/selinux/hooks.c                      | 27 +++++++++++++++++++
>  .../selinux/include/initial_sid_to_string.h   |  2 +-
>  security/selinux/include/policycap.h          |  1 +
>  security/selinux/include/policycap_names.h    |  3 ++-
>  security/selinux/include/security.h           |  7 +++++
>  security/selinux/ss/policydb.c                | 27 +++++++++++++++++++
>  6 files changed, 65 insertions(+), 2 deletions(-)

Unfortunately we had to revert this due to compatibility issues, but I
was hoping there might be a new, fixed version by now; any updates
Ondrej?

Ondrej Mosnacek Oct. 20, 2023, 2:55 p.m. UTC | #6
On Wed, Oct 18, 2023 at 10:13 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Mon, Jun 12, 2023 at 5:01 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> >
> > Currently, SELinux doesn't allow distinguishing between kernel threads
> > and userspace processes that are started before the policy is first
> > loaded - both get the label corresponding to the kernel SID. The only
> > way a process that persists from early boot can get a meaningful label
> > is by doing a voluntary dyntransition or re-executing itself.
> >
> > Reusing the kernel label for userspace processes is problematic for
> > several reasons:
> > 1. The kernel is considered to be a privileged domain and generally
> >    needs to have a wide range of permissions allowed to work correctly,
> >    which prevents the policy writer from effectively hardening against
> >    early boot processes that might remain running unintentionally after
> >    the policy is loaded (they represent a potential extra attack surface
> >    that should be mitigated).
> > 2. Despite the kernel being treated as a privileged domain, the policy
> >    writer may want to impose certain special limitations on kernel
> >    threads that may conflict with the requirements of intentional early
> >    boot processes. For example, it is a good hardening practice to limit
> >    what executables the kernel can execute as usermode helpers and to
> >    confine the resulting usermode helper processes. However, a
> >    (legitimate) process surviving from early boot may need to execute a
> >    different set of executables.
> > 3. As currently implemented, overlayfs remembers the security context of
> >    the process that created an overlayfs mount and uses it to bound
> >    subsequent operations on files using this context. If an overlayfs
> >    mount is created before the SELinux policy is loaded, these "mounter"
> >    checks are made against the kernel context, which may clash with
> >    restrictions on the kernel domain (see 2.).
> >
> > To resolve this, introduce a new initial SID (reusing the slot of the
> > former "init" initial SID) that will be assigned to any userspace
> > process started before the policy is first loaded. This is easy to do,
> > as we can simply label any process that goes through the
> > bprm_creds_for_exec LSM hook with the new init-SID instead of
> > propagating the kernel SID from the parent.
> >
> > To provide backwards compatibility for existing policies that are
> > unaware of this new semantic of the "init" initial SID, introduce a new
> > policy capability "userspace_initial_context" and set the "init" SID to
> > the same context as the "kernel" SID unless this capability is set by
> > the policy.
> >
> > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> > ---
> >  security/selinux/hooks.c                      | 27 +++++++++++++++++++
> >  .../selinux/include/initial_sid_to_string.h   |  2 +-
> >  security/selinux/include/policycap.h          |  1 +
> >  security/selinux/include/policycap_names.h    |  3 ++-
> >  security/selinux/include/security.h           |  7 +++++
> >  security/selinux/ss/policydb.c                | 27 +++++++++++++++++++
> >  6 files changed, 65 insertions(+), 2 deletions(-)
>
> Unfortunately we had to revert this due to compatibility issues, but I
> was hoping there might be a new, fixed version by now; any updates
> Ondrej?

Not yet, sorry... Haven't had time to sit down to it yet, but it's one
of my top priorities this quarter and I hope to have a patch posted
around early November (at worst).

--
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.

Paul Moore Oct. 20, 2023, 3:11 p.m. UTC | #7
On Fri, Oct 20, 2023 at 10:55 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> On Wed, Oct 18, 2023 at 10:13 PM Paul Moore <paul@paul-moore.com> wrote:
> >
> > Unfortunately we had to revert this due to compatibility issues, but I
> > was hoping there might be a new, fixed version by now; any updates
> > Ondrej?
>
> Not yet, sorry... Haven't had time to sit down to it yet, but it's one
> of my top priorities this quarter and I hope to have a patch posted
> around early November (at worst).

No worries, we all understand how busy things can get, I was just
curious where things were at with the patchset.  I'll look forward to
seeing the next revision.

Patch

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 99ded60a6b911..dd410ceb178cb 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -2264,6 +2264,18 @@ static int selinux_bprm_creds_for_exec(struct linux_binprm *bprm)
 	new_tsec->keycreate_sid = 0;
 	new_tsec->sockcreate_sid = 0;
 
+	/*
+	 * Before policy is loaded, label any task outside kernel space
+	 * as SECINITSID_INIT, so that any userspace tasks surviving from
+	 * early boot end up with a label different from SECINITSID_KERNEL
+	 * (if the policy chooses to set SECINITSID_INIT != SECINITSID_KERNEL).
+	 */
+	if (!selinux_initialized()) {
+		new_tsec->sid = SECINITSID_INIT;
+		new_tsec->exec_sid = 0; /* just in case */
+		return 0;
+	}
+
 	if (old_tsec->exec_sid) {
 		new_tsec->sid = old_tsec->exec_sid;
 		/* Reset exec SID on execve. */
@@ -4480,6 +4492,21 @@ static int sock_has_perm(struct sock *sk, u32 perms)
 	if (sksec->sid == SECINITSID_KERNEL)
 		return 0;
 
+	/*
+	 * Before POLICYDB_CAP_USERSPACE_INITIAL_CONTEXT, sockets that
+	 * inherited the kernel context from early boot used to be skipped
+	 * here, so preserve that behavior unless the capability is set.
+	 *
+	 * By setting the capability the policy signals that it is ready
+	 * for this quirk to be fixed. Note that sockets created by a kernel
+	 * thread or a usermode helper executed without a transition will
+	 * still be skipped in this check regardless of the policycap
+	 * setting.
+	 */
+	if (!selinux_policycap_userspace_initial_context() &&
+	    sksec->sid == SECINITSID_INIT)
+		return 0;
+
 	ad.type = LSM_AUDIT_DATA_NET;
 	ad.u.net = &net;
 	ad.u.net->sk = sk;
diff --git a/security/selinux/include/initial_sid_to_string.h b/security/selinux/include/initial_sid_to_string.h
index 60820517aa438..6d450669e9c68 100644
--- a/security/selinux/include/initial_sid_to_string.h
+++ b/security/selinux/include/initial_sid_to_string.h
@@ -7,7 +7,7 @@ static const char *const initial_sid_to_string[] = {
 	NULL,
 	"file",
 	NULL,
-	NULL,
+	"init",
 	"any_socket",
 	"port",
 	"netif",
diff --git a/security/selinux/include/policycap.h b/security/selinux/include/policycap.h
index f35d3458e71de..c7373e6effe5d 100644
--- a/security/selinux/include/policycap.h
+++ b/security/selinux/include/policycap.h
@@ -12,6 +12,7 @@ enum {
 	POLICYDB_CAP_NNP_NOSUID_TRANSITION,
 	POLICYDB_CAP_GENFS_SECLABEL_SYMLINKS,
 	POLICYDB_CAP_IOCTL_SKIP_CLOEXEC,
+	POLICYDB_CAP_USERSPACE_INITIAL_CONTEXT,
 	__POLICYDB_CAP_MAX
 };
 #define POLICYDB_CAP_MAX (__POLICYDB_CAP_MAX - 1)
diff --git a/security/selinux/include/policycap_names.h b/security/selinux/include/policycap_names.h
index 2a87fc3702b81..28e4c9ee23997 100644
--- a/security/selinux/include/policycap_names.h
+++ b/security/selinux/include/policycap_names.h
@@ -13,7 +13,8 @@ const char *const selinux_policycap_names[__POLICYDB_CAP_MAX] = {
 	"cgroup_seclabel",
 	"nnp_nosuid_transition",
 	"genfs_seclabel_symlinks",
-	"ioctl_skip_cloexec"
+	"ioctl_skip_cloexec",
+	"userspace_initial_context",
 };
 
 #endif /* _SELINUX_POLICYCAP_NAMES_H_ */
diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
index 8746fafeb7789..2d4ca1b9c5913 100644
--- a/security/selinux/include/security.h
+++ b/security/selinux/include/security.h
@@ -201,6 +201,13 @@ static inline bool selinux_policycap_ioctl_skip_cloexec(void)
 	return READ_ONCE(state->policycap[POLICYDB_CAP_IOCTL_SKIP_CLOEXEC]);
 }
 
+static inline bool selinux_policycap_userspace_initial_context(void)
+{
+	struct selinux_state *state = &selinux_state;
+
+	return READ_ONCE(state->policycap[POLICYDB_CAP_USERSPACE_INITIAL_CONTEXT]);
+}
+
 struct selinux_policy_convert_data;
 
 struct selinux_load_state {
diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
index 97c0074f9312a..240e0fb1d57f9 100644
--- a/security/selinux/ss/policydb.c
+++ b/security/selinux/ss/policydb.c
@@ -863,6 +863,8 @@ void policydb_destroy(struct policydb *p)
 int policydb_load_isids(struct policydb *p, struct sidtab *s)
 {
 	struct ocontext *head, *c;
+	bool secsid_init_supported = ebitmap_get_bit(&p->policycaps,
+						     POLICYDB_CAP_USERSPACE_INITIAL_CONTEXT);
 	int rc;
 
 	rc = sidtab_init(s);
@@ -886,6 +888,13 @@ int policydb_load_isids(struct policydb *p, struct sidtab *s)
 		if (!name)
 			continue;
 
+		/*
+		 * Also ignore SECINITSID_INIT if the policy doesn't declare
+		 * support for it
+		 */
+		if (sid == SECINITSID_INIT && !secsid_init_supported)
+			continue;
+
 		rc = sidtab_set_initial(s, sid, &c->context[0]);
 		if (rc) {
 			pr_err("SELinux:  unable to load initial SID %s.\n",
@@ -893,6 +902,24 @@ int policydb_load_isids(struct policydb *p, struct sidtab *s)
 			sidtab_destroy(s);
 			return rc;
 		}
+
+		/*
+		 * If the policy doesn't support the "userspace_initial_context"
+		 * capability, set SECINITSID_INIT to the same context as
+		 * SECINITSID_KERNEL. This ensures the same behavior as before
+		 * the reintroduction of SECINITSID_INIT, where all tasks
+		 * started before policy load would initially get the context
+		 * corresponding to SECINITSID_KERNEL.
+		 */
+		if (sid == SECINITSID_KERNEL && !secsid_init_supported) {
+			rc = sidtab_set_initial(s, SECINITSID_INIT, &c->context[0]);
+			if (rc) {
+				pr_err("SELinux:  unable to load initial SID %s.\n",
+				       name);
+				sidtab_destroy(s);
+				return rc;
+			}
+		}
 	}
 	return 0;
 }
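
The socket permission shortcut added to sock_has_perm() in the hooks.c hunk can be read as a small truth table. The following is a hedged user-space sketch of that decision only; `demo_sock_check_skipped` and the SID numbers are hypothetical names and values chosen for the example, not kernel identifiers.

```c
#include <stdbool.h>

/* Illustrative SID numbers; the kernel defines its own values. */
enum demo_sid { DEMO_SID_KERNEL = 1, DEMO_SID_INIT = 2, DEMO_SID_OTHER = 3 };

/*
 * Returns true when the permission check would be skipped for a socket
 * with the given SID.
 *
 * - Kernel-labeled sockets are always skipped.
 * - Init-labeled sockets are skipped only while the policy has NOT set
 *   the userspace_initial_context capability, which preserves the old
 *   behavior for policies unaware of the new initial SID.
 */
static bool demo_sock_check_skipped(unsigned int sk_sid, bool cap_set)
{
	if (sk_sid == DEMO_SID_KERNEL)
		return true;

	if (!cap_set && sk_sid == DEMO_SID_INIT)
		return true;

	return false;
}
```

Under these assumptions, setting the capability is what makes sockets created by early boot userspace subject to the normal check, while sockets that keep the kernel SID remain exempt either way.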