[1/2] xsm: add ability to elevate a domain to privileged

Message ID	20220330230549.26074-2-dpsmith@apertussolutions.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <xen-devel-bounces@lists.xenproject.org> Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" <xen-devel-bounces@lists.xenproject.org> From: "Daniel P. Smith" <dpsmith@apertussolutions.com> To: xen-devel@lists.xenproject.org Cc: "Daniel P. Smith" <dpsmith@apertussolutions.com>, scott.davis@starlab.io, jandryuk@gmail.com, Daniel De Graaf <dgdegra@tycho.nsa.gov> Subject: [PATCH 1/2] xsm: add ability to elevate a domain to privileged Date: Wed, 30 Mar 2022 19:05:48 -0400 Message-Id: <20220330230549.26074-2-dpsmith@apertussolutions.com> In-Reply-To: <20220330230549.26074-1-dpsmith@apertussolutions.com> References: <20220330230549.26074-1-dpsmith@apertussolutions.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	Introduce XSM ability for domain privilege escalation \| expand [0/2] Introduce XSM ability for domain privilege escalation [1/2] xsm: add ability to elevate a domain to privileged [2/2] arch: ensure idle domain is not left privileged

Daniel P. Smith March 30, 2022, 11:05 p.m. UTC

There are now instances where internal hypervisor logic needs to make resource
allocation calls that are protected by XSM checks. The internal hypervisor logic
is represented a number of system domains which by designed are represented by
non-privileged struct domain instances. To enable these logic blocks to
function correctly but in a controlled manner, this commit introduces a pair
of privilege escalation and demotion functions that will make a system domain
privileged and then remove that privilege.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
---
 xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

Roger Pau Monné March 31, 2022, 12:36 p.m. UTC | #1

On Wed, Mar 30, 2022 at 07:05:48PM -0400, Daniel P. Smith wrote:
> There are now instances where internal hypervisor logic needs to make resource
> allocation calls that are protected by XSM checks. The internal hypervisor logic
> is represented a number of system domains which by designed are represented by
> non-privileged struct domain instances. To enable these logic blocks to
> function correctly but in a controlled manner, this commit introduces a pair
> of privilege escalation and demotion functions that will make a system domain
> privileged and then remove that privilege.
> 
> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> ---
>  xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++

I'm not sure this needs to be in xsm code, AFAICT it could live in a
more generic file.

>  1 file changed, 22 insertions(+)
> 
> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
> index e22d6160b5..157e57151e 100644
> --- a/xen/include/xsm/xsm.h
> +++ b/xen/include/xsm/xsm.h
> @@ -189,6 +189,28 @@ struct xsm_operations {
>  #endif
>  };
>  
> +static always_inline int xsm_elevate_priv(struct domain *d)

I don't think it needs to be always_inline, using just inline would be
fine IMO.

Also this needs to be __init.

> +{
> +    if ( is_system_domain(d) )
> +    {
> +        d->is_privileged = true;
> +        return 0;
> +    }
> +

I would add an ASSERT_UNREACHABLE(); here, I don't think we have any
use case for trying to elevate the privileges of a non-system domain.

Thanks, Roger.

Jason Andryuk March 31, 2022, 1:16 p.m. UTC | #2

On Wed, Mar 30, 2022 at 3:05 PM Daniel P. Smith
<dpsmith@apertussolutions.com> wrote:
>
> There are now instances where internal hypervisor logic needs to make resource
> allocation calls that are protected by XSM checks. The internal hypervisor logic
> is represented a number of system domains which by designed are represented by
> non-privileged struct domain instances. To enable these logic blocks to
> function correctly but in a controlled manner, this commit introduces a pair
> of privilege escalation and demotion functions that will make a system domain
> privileged and then remove that privilege.
>
> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> ---
>  xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
> index e22d6160b5..157e57151e 100644
> --- a/xen/include/xsm/xsm.h
> +++ b/xen/include/xsm/xsm.h
> @@ -189,6 +189,28 @@ struct xsm_operations {
>  #endif
>  };
>
> +static always_inline int xsm_elevate_priv(struct domain *d)
> +{
> +    if ( is_system_domain(d) )
> +    {
> +        d->is_privileged = true;
> +        return 0;
> +    }
> +
> +    return -EPERM;
> +}

These look sufficient for the default policy, but they don't seem
sufficient for Flask.  I think you need to create a new XSM hook.  For
Flask, you would want the demote hook to transition xen_boot_t ->
xen_t.  That would start xen_boot_t with privileges that are dropped
in a one-way transition.  Does that require all policies to then have
xen_boot_t and xen_t?  I guess it does unless the hook code has some
logic to skip the transition.

For the default policy, you could start by creating the system domains
as privileged and just have a single hook to drop privs.  Then you
don't have to worry about the "elevate" hook existing.  The patch 2
asserts could instead become the location of xsm_drop_privs calls to
have a clear demarcation point.  That expands the window with
privileges though.  It's a little simpler, but maybe you don't want
that.  However, it seems like you can only depriv once for the Flask
case since you want it to be one-way.

Regards,
Jason

Julien Grall April 1, 2022, 5:52 p.m. UTC | #3

Hi,

On 31/03/2022 13:36, Roger Pau Monné wrote:
> On Wed, Mar 30, 2022 at 07:05:48PM -0400, Daniel P. Smith wrote:
>> There are now instances where internal hypervisor logic needs to make resource
>> allocation calls that are protected by XSM checks. The internal hypervisor logic
>> is represented a number of system domains which by designed are represented by
>> non-privileged struct domain instances. To enable these logic blocks to
>> function correctly but in a controlled manner, this commit introduces a pair
>> of privilege escalation and demotion functions that will make a system domain
>> privileged and then remove that privilege.
>>
>> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
>> ---
>>   xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++
> 
> I'm not sure this needs to be in xsm code, AFAICT it could live in a
> more generic file.
> 
>>   1 file changed, 22 insertions(+)
>>
>> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
>> index e22d6160b5..157e57151e 100644
>> --- a/xen/include/xsm/xsm.h
>> +++ b/xen/include/xsm/xsm.h
>> @@ -189,6 +189,28 @@ struct xsm_operations {
>>   #endif
>>   };
>>   
>> +static always_inline int xsm_elevate_priv(struct domain *d)
> 
> I don't think it needs to be always_inline, using just inline would be
> fine IMO.
> 
> Also this needs to be __init.

Hmmm.... I thought adding __init on function defined in header was 
pointless. In particular, if the compiler decides to inline it.

In any case, I think it would be good to check that the system_state < 
SYS_state_active (could potentially be an ASSERT()) to prevent any misuse.

Cheers,

Julien Grall April 1, 2022, 5:53 p.m. UTC | #4

Hi Daniel,

On 31/03/2022 00:05, Daniel P. Smith wrote:
> There are now instances where internal hypervisor logic needs to make resource
> allocation calls that are protected by XSM checks. The internal hypervisor logic
> is represented a number of system domains which by designed are represented by
> non-privileged struct domain instances. To enable these logic blocks to
> function correctly but in a controlled manner, this commit introduces a pair
> of privilege escalation and demotion functions that will make a system domain
> privileged and then remove that privilege.
> 
> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> ---
>   xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++
>   1 file changed, 22 insertions(+)
> 
> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
> index e22d6160b5..157e57151e 100644
> --- a/xen/include/xsm/xsm.h
> +++ b/xen/include/xsm/xsm.h
> @@ -189,6 +189,28 @@ struct xsm_operations {
>   #endif
>   };
>   
> +static always_inline int xsm_elevate_priv(struct domain *d)
> +{
> +    if ( is_system_domain(d) )
> +    {
> +        d->is_privileged = true;

The call for xsm_elevate_priv() cannot be nested. So I would suggest to 
check if d->is_privileged is already true and return -EBUSY in this case.

> +        return 0;
> +    }
> +
> +    return -EPERM;
> +}
> +
> +static always_inline int xsm_demote_priv(struct domain *d)
> +{
> +    if ( is_system_domain(d) )
> +    {
> +        d->is_privileged = false;
> +        return 0;
> +    }
> +
> +    return -EPERM;
> +}
> +
>   #ifdef CONFIG_XSM
>   
>   extern struct xsm_operations *xsm_ops;

Cheers,

Roger Pau Monné April 4, 2022, 8:08 a.m. UTC | #5

On Fri, Apr 01, 2022 at 06:52:46PM +0100, Julien Grall wrote:
> Hi,
> 
> On 31/03/2022 13:36, Roger Pau Monné wrote:
> > On Wed, Mar 30, 2022 at 07:05:48PM -0400, Daniel P. Smith wrote:
> > > There are now instances where internal hypervisor logic needs to make resource
> > > allocation calls that are protected by XSM checks. The internal hypervisor logic
> > > is represented a number of system domains which by designed are represented by
> > > non-privileged struct domain instances. To enable these logic blocks to
> > > function correctly but in a controlled manner, this commit introduces a pair
> > > of privilege escalation and demotion functions that will make a system domain
> > > privileged and then remove that privilege.
> > > 
> > > Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> > > ---
> > >   xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++
> > 
> > I'm not sure this needs to be in xsm code, AFAICT it could live in a
> > more generic file.
> > 
> > >   1 file changed, 22 insertions(+)
> > > 
> > > diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
> > > index e22d6160b5..157e57151e 100644
> > > --- a/xen/include/xsm/xsm.h
> > > +++ b/xen/include/xsm/xsm.h
> > > @@ -189,6 +189,28 @@ struct xsm_operations {
> > >   #endif
> > >   };
> > > +static always_inline int xsm_elevate_priv(struct domain *d)
> > 
> > I don't think it needs to be always_inline, using just inline would be
> > fine IMO.
> > 
> > Also this needs to be __init.
> 
> Hmmm.... I thought adding __init on function defined in header was
> pointless. In particular, if the compiler decides to inline it.

Indeed, I didn't realize, thanks for pointing this out.

> In any case, I think it would be good to check that the system_state <
> SYS_state_active (could potentially be an ASSERT()) to prevent any misuse.

My preference would be to make those non-inline then I think, we could
place them in common/domain.c maybe? There's no performance reason to
have those helpers as inline.

Another option would be what Jason suggested about creating the idle
domain as privileged and then dropping the privileges before starting
any domains (ie: before setting the system as ACTIVE).

This expands the duration the idle domain context is marked as
privileged, but OTOH we don't need to add a hook to set
is_privileged.

Thanks, Roger.

Jan Beulich April 4, 2022, 12:24 p.m. UTC | #6

On 04.04.2022 10:08, Roger Pau Monné wrote:
> On Fri, Apr 01, 2022 at 06:52:46PM +0100, Julien Grall wrote:
>> Hi,
>>
>> On 31/03/2022 13:36, Roger Pau Monné wrote:
>>> On Wed, Mar 30, 2022 at 07:05:48PM -0400, Daniel P. Smith wrote:
>>>> There are now instances where internal hypervisor logic needs to make resource
>>>> allocation calls that are protected by XSM checks. The internal hypervisor logic
>>>> is represented a number of system domains which by designed are represented by
>>>> non-privileged struct domain instances. To enable these logic blocks to
>>>> function correctly but in a controlled manner, this commit introduces a pair
>>>> of privilege escalation and demotion functions that will make a system domain
>>>> privileged and then remove that privilege.
>>>>
>>>> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
>>>> ---
>>>>   xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++
>>>
>>> I'm not sure this needs to be in xsm code, AFAICT it could live in a
>>> more generic file.
>>>
>>>>   1 file changed, 22 insertions(+)
>>>>
>>>> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
>>>> index e22d6160b5..157e57151e 100644
>>>> --- a/xen/include/xsm/xsm.h
>>>> +++ b/xen/include/xsm/xsm.h
>>>> @@ -189,6 +189,28 @@ struct xsm_operations {
>>>>   #endif
>>>>   };
>>>> +static always_inline int xsm_elevate_priv(struct domain *d)
>>>
>>> I don't think it needs to be always_inline, using just inline would be
>>> fine IMO.
>>>
>>> Also this needs to be __init.
>>
>> Hmmm.... I thought adding __init on function defined in header was
>> pointless. In particular, if the compiler decides to inline it.
> 
> Indeed, I didn't realize, thanks for pointing this out.

The question isn't header or not, but declaration or definition.
Attributes like this one are meaningless on declarations (at least
on all the arches we care about; there may be subtleties), but
meaningful for definitions. Iirc even with always_inline the
compiler may find reasons why a function cannot be inlined, and
hence the intended section should be specified. Plus such an
annotation serves a documentation purpose.

Jan

Daniel P. Smith April 4, 2022, 2:21 p.m. UTC | #7

On 3/31/22 08:36, Roger Pau Monné wrote:
> On Wed, Mar 30, 2022 at 07:05:48PM -0400, Daniel P. Smith wrote:
>> There are now instances where internal hypervisor logic needs to make resource
>> allocation calls that are protected by XSM checks. The internal hypervisor logic
>> is represented a number of system domains which by designed are represented by
>> non-privileged struct domain instances. To enable these logic blocks to
>> function correctly but in a controlled manner, this commit introduces a pair
>> of privilege escalation and demotion functions that will make a system domain
>> privileged and then remove that privilege.
>>
>> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
>> ---
>>  xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++
> 
> I'm not sure this needs to be in xsm code, AFAICT it could live in a
> more generic file.

From my perspective this is access control logic, thus why I advocate
for it to be under XSM. A personal goal is to try to get all security,
i.e. access control, centralized to the extent that it doing so does not
make the code base unnecessarily complicated.

>>  1 file changed, 22 insertions(+)
>>
>> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
>> index e22d6160b5..157e57151e 100644
>> --- a/xen/include/xsm/xsm.h
>> +++ b/xen/include/xsm/xsm.h
>> @@ -189,6 +189,28 @@ struct xsm_operations {
>>  #endif
>>  };
>>  
>> +static always_inline int xsm_elevate_priv(struct domain *d)
> 
> I don't think it needs to be always_inline, using just inline would be
> fine IMO.
> 
> Also this needs to be __init.

AIUI always_inline is likely the best way to preserve the speculation
safety brought in by the call to is_system_domain().

>> +{
>> +    if ( is_system_domain(d) )
>> +    {
>> +        d->is_privileged = true;
>> +        return 0;
>> +    }
>> +
> 
> I would add an ASSERT_UNREACHABLE(); here, I don't think we have any
> use case for trying to elevate the privileges of a non-system domain.

Ack.

v/r
dps

Roger Pau Monné April 4, 2022, 3:12 p.m. UTC | #8

On Mon, Apr 04, 2022 at 10:21:18AM -0400, Daniel P. Smith wrote:
> On 3/31/22 08:36, Roger Pau Monné wrote:
> > On Wed, Mar 30, 2022 at 07:05:48PM -0400, Daniel P. Smith wrote:
> >> There are now instances where internal hypervisor logic needs to make resource
> >> allocation calls that are protected by XSM checks. The internal hypervisor logic
> >> is represented a number of system domains which by designed are represented by
> >> non-privileged struct domain instances. To enable these logic blocks to
> >> function correctly but in a controlled manner, this commit introduces a pair
> >> of privilege escalation and demotion functions that will make a system domain
> >> privileged and then remove that privilege.
> >>
> >> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> >> ---
> >>  xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++
> > 
> > I'm not sure this needs to be in xsm code, AFAICT it could live in a
> > more generic file.
> 
> From my perspective this is access control logic, thus why I advocate
> for it to be under XSM. A personal goal is to try to get all security,
> i.e. access control, centralized to the extent that it doing so does not
> make the code base unnecessarily complicated.

Maybe others have a different opinion, but IMO setting is_privileged
is not XSM specific. It happens to solve an XSM issue, but that's no
reason to place it in the xsm code base.

> >>  1 file changed, 22 insertions(+)
> >>
> >> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
> >> index e22d6160b5..157e57151e 100644
> >> --- a/xen/include/xsm/xsm.h
> >> +++ b/xen/include/xsm/xsm.h
> >> @@ -189,6 +189,28 @@ struct xsm_operations {
> >>  #endif
> >>  };
> >>  
> >> +static always_inline int xsm_elevate_priv(struct domain *d)
> > 
> > I don't think it needs to be always_inline, using just inline would be
> > fine IMO.
> > 
> > Also this needs to be __init.
> 
> AIUI always_inline is likely the best way to preserve the speculation
> safety brought in by the call to is_system_domain().

There's nothing related to speculation safety in is_system_domain()
AFAICT. It's just a plain check against d->domain_id. It's my
understanding there's no need for any speculation barrier there
because d->domain_id is not an external input.

In any case this function should be __init only, at which point there
are no untrusted inputs to Xen.

Thanks, Roger.

Jan Beulich April 4, 2022, 3:17 p.m. UTC | #9

On 04.04.2022 17:12, Roger Pau Monné wrote:
> On Mon, Apr 04, 2022 at 10:21:18AM -0400, Daniel P. Smith wrote:
>> On 3/31/22 08:36, Roger Pau Monné wrote:
>>> On Wed, Mar 30, 2022 at 07:05:48PM -0400, Daniel P. Smith wrote:
>>>> There are now instances where internal hypervisor logic needs to make resource
>>>> allocation calls that are protected by XSM checks. The internal hypervisor logic
>>>> is represented a number of system domains which by designed are represented by
>>>> non-privileged struct domain instances. To enable these logic blocks to
>>>> function correctly but in a controlled manner, this commit introduces a pair
>>>> of privilege escalation and demotion functions that will make a system domain
>>>> privileged and then remove that privilege.
>>>>
>>>> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
>>>> ---
>>>>  xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++
>>>
>>> I'm not sure this needs to be in xsm code, AFAICT it could live in a
>>> more generic file.
>>
>> From my perspective this is access control logic, thus why I advocate
>> for it to be under XSM. A personal goal is to try to get all security,
>> i.e. access control, centralized to the extent that it doing so does not
>> make the code base unnecessarily complicated.
> 
> Maybe others have a different opinion, but IMO setting is_privileged
> is not XSM specific. It happens to solve an XSM issue, but that's no
> reason to place it in the xsm code base.

To be honest I can see justification for either placement.

>>>>  1 file changed, 22 insertions(+)
>>>>
>>>> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
>>>> index e22d6160b5..157e57151e 100644
>>>> --- a/xen/include/xsm/xsm.h
>>>> +++ b/xen/include/xsm/xsm.h
>>>> @@ -189,6 +189,28 @@ struct xsm_operations {
>>>>  #endif
>>>>  };
>>>>  
>>>> +static always_inline int xsm_elevate_priv(struct domain *d)
>>>
>>> I don't think it needs to be always_inline, using just inline would be
>>> fine IMO.
>>>
>>> Also this needs to be __init.
>>
>> AIUI always_inline is likely the best way to preserve the speculation
>> safety brought in by the call to is_system_domain().
> 
> There's nothing related to speculation safety in is_system_domain()
> AFAICT. It's just a plain check against d->domain_id. It's my
> understanding there's no need for any speculation barrier there
> because d->domain_id is not an external input.

Whether is_system_domain() wants hardening really depends on where
it's used. It doesn't matter as much what specific check an is_...()
function performs, but what code paths it sits on where taking the
wrong branch of a conditional could reveal data which shouldn't be
visible.

Jan

Daniel P. Smith April 4, 2022, 3:33 p.m. UTC | #10

On 3/31/22 09:16, Jason Andryuk wrote:
> On Wed, Mar 30, 2022 at 3:05 PM Daniel P. Smith
> <dpsmith@apertussolutions.com> wrote:
>>
>> There are now instances where internal hypervisor logic needs to make resource
>> allocation calls that are protected by XSM checks. The internal hypervisor logic
>> is represented a number of system domains which by designed are represented by
>> non-privileged struct domain instances. To enable these logic blocks to
>> function correctly but in a controlled manner, this commit introduces a pair
>> of privilege escalation and demotion functions that will make a system domain
>> privileged and then remove that privilege.
>>
>> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
>> ---
>>  xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++
>>  1 file changed, 22 insertions(+)
>>
>> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
>> index e22d6160b5..157e57151e 100644
>> --- a/xen/include/xsm/xsm.h
>> +++ b/xen/include/xsm/xsm.h
>> @@ -189,6 +189,28 @@ struct xsm_operations {
>>  #endif
>>  };
>>
>> +static always_inline int xsm_elevate_priv(struct domain *d)
>> +{
>> +    if ( is_system_domain(d) )
>> +    {
>> +        d->is_privileged = true;
>> +        return 0;
>> +    }
>> +
>> +    return -EPERM;
>> +}
> 
> These look sufficient for the default policy, but they don't seem
> sufficient for Flask.  I think you need to create a new XSM hook.  For
> Flask, you would want the demote hook to transition xen_boot_t ->
> xen_t.  That would start xen_boot_t with privileges that are dropped
> in a one-way transition.  Does that require all policies to then have
> xen_boot_t and xen_t?  I guess it does unless the hook code has some
> logic to skip the transition.

I am still thinking this through but my initial concern for Flask is
that I don't think we want dedicated domain transitions directly in
code. My current thinking would be to use a Kconfig to use xen_boot_t
type as the initial sid for the idle domain which would then require the
default policy to include an allowed transition from xen_boot_t to
xen_t. Then rely upon a boot domain to issue an xsm_op to do a relabel
transition for the idle domain with an assertion that the idle domain is
no longer labeled with its initial sid before Xen transitions its state
to SYS_STATE_active. The one wrinkle to this is whether I will be able
to schedule the boot domain before allowing Xen to transition into
SYS_STATE_active.

> For the default policy, you could start by creating the system domains
> as privileged and just have a single hook to drop privs.  Then you
> don't have to worry about the "elevate" hook existing.  The patch 2
> asserts could instead become the location of xsm_drop_privs calls to
> have a clear demarcation point.  That expands the window with
> privileges though.  It's a little simpler, but maybe you don't want
> that.  However, it seems like you can only depriv once for the Flask
> case since you want it to be one-way.

This does simplify the solution and since today we cannot differentiate
between hypervisor setup and hypervisor initiated domain construction
contexts, it does not run counter to what I have proposed. As for flask,
again I do not believe codifying a domain transition bound to a new XSM
op is the appropriate approach.

v/r,
dps

Daniel P. Smith April 4, 2022, 4:08 p.m. UTC | #11

On 4/4/22 11:12, Roger Pau Monné wrote:
> On Mon, Apr 04, 2022 at 10:21:18AM -0400, Daniel P. Smith wrote:
>> On 3/31/22 08:36, Roger Pau Monné wrote:
>>> On Wed, Mar 30, 2022 at 07:05:48PM -0400, Daniel P. Smith wrote:
>>>> There are now instances where internal hypervisor logic needs to make resource
>>>> allocation calls that are protected by XSM checks. The internal hypervisor logic
>>>> is represented a number of system domains which by designed are represented by
>>>> non-privileged struct domain instances. To enable these logic blocks to
>>>> function correctly but in a controlled manner, this commit introduces a pair
>>>> of privilege escalation and demotion functions that will make a system domain
>>>> privileged and then remove that privilege.
>>>>
>>>> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
>>>> ---
>>>>  xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++
>>>
>>> I'm not sure this needs to be in xsm code, AFAICT it could live in a
>>> more generic file.
>>
>> From my perspective this is access control logic, thus why I advocate
>> for it to be under XSM. A personal goal is to try to get all security,
>> i.e. access control, centralized to the extent that it doing so does not
>> make the code base unnecessarily complicated.
> 
> Maybe others have a different opinion, but IMO setting is_privileged
> is not XSM specific. It happens to solve an XSM issue, but that's no
> reason to place it in the xsm code base.

I think this deserves a separate more dedicated thread as I would take
the position that Xen privilege model is at best fractured because
is_control_domain() (and underlying is_privileged),
is_hardware_domain,() and is_xenstore_domain() are used independent of
XSM. Perhaps the discussion can be had when I get back to the XSM Roles
patches to enable disaggregating Xen with hyperlaunch (or at a Xen
Summit design session).

>>>>  1 file changed, 22 insertions(+)
>>>>
>>>> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
>>>> index e22d6160b5..157e57151e 100644
>>>> --- a/xen/include/xsm/xsm.h
>>>> +++ b/xen/include/xsm/xsm.h
>>>> @@ -189,6 +189,28 @@ struct xsm_operations {
>>>>  #endif
>>>>  };
>>>>  
>>>> +static always_inline int xsm_elevate_priv(struct domain *d)
>>>
>>> I don't think it needs to be always_inline, using just inline would be
>>> fine IMO.
>>>
>>> Also this needs to be __init.
>>
>> AIUI always_inline is likely the best way to preserve the speculation
>> safety brought in by the call to is_system_domain().
> 
> There's nothing related to speculation safety in is_system_domain()
> AFAICT. It's just a plain check against d->domain_id. It's my
> understanding there's no need for any speculation barrier there
> because d->domain_id is not an external input.

Hmmm, this actually raises a good question. Why is is_control_domain(),
is_hardware_domain, and others all have evaluate_nospec() wrapping the
check of a struct domain element while is_system_domain() does not?

> In any case this function should be __init only, at which point there
> are no untrusted inputs to Xen.

I thought it was agreed that __init on inline functions in headers had
no meaning?

Roger Pau Monné April 5, 2022, 7:42 a.m. UTC | #12

On Mon, Apr 04, 2022 at 12:08:25PM -0400, Daniel P. Smith wrote:
> On 4/4/22 11:12, Roger Pau Monné wrote:
> > On Mon, Apr 04, 2022 at 10:21:18AM -0400, Daniel P. Smith wrote:
> >> On 3/31/22 08:36, Roger Pau Monné wrote:
> >>> On Wed, Mar 30, 2022 at 07:05:48PM -0400, Daniel P. Smith wrote:
> >>>> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
> >>>> index e22d6160b5..157e57151e 100644
> >>>> --- a/xen/include/xsm/xsm.h
> >>>> +++ b/xen/include/xsm/xsm.h
> >>>> @@ -189,6 +189,28 @@ struct xsm_operations {
> >>>>  #endif
> >>>>  };
> >>>>  
> >>>> +static always_inline int xsm_elevate_priv(struct domain *d)
> >>>
> >>> I don't think it needs to be always_inline, using just inline would be
> >>> fine IMO.
> >>>
> >>> Also this needs to be __init.
> >>
> >> AIUI always_inline is likely the best way to preserve the speculation
> >> safety brought in by the call to is_system_domain().
> > 
> > There's nothing related to speculation safety in is_system_domain()
> > AFAICT. It's just a plain check against d->domain_id. It's my
> > understanding there's no need for any speculation barrier there
> > because d->domain_id is not an external input.
> 
> Hmmm, this actually raises a good question. Why is is_control_domain(),
> is_hardware_domain, and others all have evaluate_nospec() wrapping the
> check of a struct domain element while is_system_domain() does not?

Jan replied to this regard, see:

https://lore.kernel.org/xen-devel/54272d08-7ce1-b162-c8e9-1955b780ca11@suse.com/

> > In any case this function should be __init only, at which point there
> > are no untrusted inputs to Xen.
> 
> I thought it was agreed that __init on inline functions in headers had
> no meaning?

In a different reply I already noted my preference would be for the
function to not reside in a header and not be inline, simply because
it would be gone after initialization and we won't have to worry about
any stray calls when the system is active.

Thanks, Roger.

Daniel P. Smith April 5, 2022, 12:06 p.m. UTC | #13

On 4/5/22 03:42, Roger Pau Monné wrote:
> On Mon, Apr 04, 2022 at 12:08:25PM -0400, Daniel P. Smith wrote:
>> On 4/4/22 11:12, Roger Pau Monné wrote:
>>> On Mon, Apr 04, 2022 at 10:21:18AM -0400, Daniel P. Smith wrote:
>>>> On 3/31/22 08:36, Roger Pau Monné wrote:
>>>>> On Wed, Mar 30, 2022 at 07:05:48PM -0400, Daniel P. Smith wrote:
>>>>>> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
>>>>>> index e22d6160b5..157e57151e 100644
>>>>>> --- a/xen/include/xsm/xsm.h
>>>>>> +++ b/xen/include/xsm/xsm.h
>>>>>> @@ -189,6 +189,28 @@ struct xsm_operations {
>>>>>>  #endif
>>>>>>  };
>>>>>>  
>>>>>> +static always_inline int xsm_elevate_priv(struct domain *d)
>>>>>
>>>>> I don't think it needs to be always_inline, using just inline would be
>>>>> fine IMO.
>>>>>
>>>>> Also this needs to be __init.
>>>>
>>>> AIUI always_inline is likely the best way to preserve the speculation
>>>> safety brought in by the call to is_system_domain().
>>>
>>> There's nothing related to speculation safety in is_system_domain()
>>> AFAICT. It's just a plain check against d->domain_id. It's my
>>> understanding there's no need for any speculation barrier there
>>> because d->domain_id is not an external input.
>>
>> Hmmm, this actually raises a good question. Why is is_control_domain(),
>> is_hardware_domain, and others all have evaluate_nospec() wrapping the
>> check of a struct domain element while is_system_domain() does not?
> 
> Jan replied to this regard, see:
> 
> https://lore.kernel.org/xen-devel/54272d08-7ce1-b162-c8e9-1955b780ca11@suse.com/

Jan can correct me if I misunderstood, but his point is with respect to
where the inline function will be expanded into and I would think you
would want to ensure that if anyone were to use is_system_domain(), then
the inline expansion of this new location could create a potential
speculation-able branch. Basically my concern is not putting the guards
in place today just because there is not currently any location where
is_system_domain() is expanded to create a speculation opportunity does
not mean there is not an opening for the opportunity down the road for a
future unprotected use.

>>> In any case this function should be __init only, at which point there
>>> are no untrusted inputs to Xen.
>>
>> I thought it was agreed that __init on inline functions in headers had
>> no meaning?
> 
> In a different reply I already noted my preference would be for the
> function to not reside in a header and not be inline, simply because
> it would be gone after initialization and we won't have to worry about
> any stray calls when the system is active.

If an inline function is only used by __init code, how would be
available for stray calls when the system is active? I would concede
that it is possible for someone to explicitly use in not __init code but
I would like to believe any usage in a submitted code change would be
questioned by the reviewers.

With that said, if we consider Jason's suggestion would this remove your
concern since that would only introduce a de-privilege function and
there would be no piv escalation that could be erroneously called at
anytime?

v/r
dps

Roger Pau Monné April 5, 2022, 3:01 p.m. UTC | #14

On Tue, Apr 05, 2022 at 08:06:31AM -0400, Daniel P. Smith wrote:
> On 4/5/22 03:42, Roger Pau Monné wrote:
> > On Mon, Apr 04, 2022 at 12:08:25PM -0400, Daniel P. Smith wrote:
> >> On 4/4/22 11:12, Roger Pau Monné wrote:
> >>> On Mon, Apr 04, 2022 at 10:21:18AM -0400, Daniel P. Smith wrote:
> >>>> On 3/31/22 08:36, Roger Pau Monné wrote:
> >>>>> On Wed, Mar 30, 2022 at 07:05:48PM -0400, Daniel P. Smith wrote:
> >>>>>> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
> >>>>>> index e22d6160b5..157e57151e 100644
> >>>>>> --- a/xen/include/xsm/xsm.h
> >>>>>> +++ b/xen/include/xsm/xsm.h
> >>>>>> @@ -189,6 +189,28 @@ struct xsm_operations {
> >>>>>>  #endif
> >>>>>>  };
> >>>>>>  
> >>>>>> +static always_inline int xsm_elevate_priv(struct domain *d)
> >>>>>
> >>>>> I don't think it needs to be always_inline, using just inline would be
> >>>>> fine IMO.
> >>>>>
> >>>>> Also this needs to be __init.
> >>>>
> >>>> AIUI always_inline is likely the best way to preserve the speculation
> >>>> safety brought in by the call to is_system_domain().
> >>>
> >>> There's nothing related to speculation safety in is_system_domain()
> >>> AFAICT. It's just a plain check against d->domain_id. It's my
> >>> understanding there's no need for any speculation barrier there
> >>> because d->domain_id is not an external input.
> >>
> >> Hmmm, this actually raises a good question. Why is is_control_domain(),
> >> is_hardware_domain, and others all have evaluate_nospec() wrapping the
> >> check of a struct domain element while is_system_domain() does not?
> > 
> > Jan replied to this regard, see:
> > 
> > https://lore.kernel.org/xen-devel/54272d08-7ce1-b162-c8e9-1955b780ca11@suse.com/
> 
> Jan can correct me if I misunderstood, but his point is with respect to
> where the inline function will be expanded into and I would think you
> would want to ensure that if anyone were to use is_system_domain(), then
> the inline expansion of this new location could create a potential
> speculation-able branch. Basically my concern is not putting the guards
> in place today just because there is not currently any location where
> is_system_domain() is expanded to create a speculation opportunity does
> not mean there is not an opening for the opportunity down the road for a
> future unprotected use.
> 
> >>> In any case this function should be __init only, at which point there
> >>> are no untrusted inputs to Xen.
> >>
> >> I thought it was agreed that __init on inline functions in headers had
> >> no meaning?
> > 
> > In a different reply I already noted my preference would be for the
> > function to not reside in a header and not be inline, simply because
> > it would be gone after initialization and we won't have to worry about
> > any stray calls when the system is active.
> 
> If an inline function is only used by __init code, how would be
> available for stray calls when the system is active? I would concede
> that it is possible for someone to explicitly use in not __init code but
> I would like to believe any usage in a submitted code change would be
> questioned by the reviewers.

Right, it's IMO easier when things just explode when not used
correctly, hence my suggestion to make it __init.

> With that said, if we consider Jason's suggestion would this remove your
> concern since that would only introduce a de-privilege function and
> there would be no piv escalation that could be erroneously called at
> anytime?

Indeed.  IMO everything that happens before the system switches to the
active state should be considered to be running in a privileged
context anyway.  Maybe others have different opinions.  Or maybe there
are use-cases I'm not aware of where this is not true.

Thanks, Roger.

Jason Andryuk April 5, 2022, 5:17 p.m. UTC | #15

On Mon, Apr 4, 2022 at 11:34 AM Daniel P. Smith
<dpsmith@apertussolutions.com> wrote:
>
> On 3/31/22 09:16, Jason Andryuk wrote:
> > On Wed, Mar 30, 2022 at 3:05 PM Daniel P. Smith
> > <dpsmith@apertussolutions.com> wrote:
> >>
> >> There are now instances where internal hypervisor logic needs to make resource
> >> allocation calls that are protected by XSM checks. The internal hypervisor logic
> >> is represented a number of system domains which by designed are represented by
> >> non-privileged struct domain instances. To enable these logic blocks to
> >> function correctly but in a controlled manner, this commit introduces a pair
> >> of privilege escalation and demotion functions that will make a system domain
> >> privileged and then remove that privilege.
> >>
> >> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
> >> ---
> >>  xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++
> >>  1 file changed, 22 insertions(+)
> >>
> >> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
> >> index e22d6160b5..157e57151e 100644
> >> --- a/xen/include/xsm/xsm.h
> >> +++ b/xen/include/xsm/xsm.h
> >> @@ -189,6 +189,28 @@ struct xsm_operations {
> >>  #endif
> >>  };
> >>
> >> +static always_inline int xsm_elevate_priv(struct domain *d)
> >> +{
> >> +    if ( is_system_domain(d) )
> >> +    {
> >> +        d->is_privileged = true;
> >> +        return 0;
> >> +    }
> >> +
> >> +    return -EPERM;
> >> +}
> >
> > These look sufficient for the default policy, but they don't seem
> > sufficient for Flask.  I think you need to create a new XSM hook.  For
> > Flask, you would want the demote hook to transition xen_boot_t ->
> > xen_t.  That would start xen_boot_t with privileges that are dropped
> > in a one-way transition.  Does that require all policies to then have
> > xen_boot_t and xen_t?  I guess it does unless the hook code has some
> > logic to skip the transition.
>
> I am still thinking this through but my initial concern for Flask is
> that I don't think we want dedicated domain transitions directly in
> code. My current thinking would be to use a Kconfig to use xen_boot_t
> type as the initial sid for the idle domain which would then require the
> default policy to include an allowed transition from xen_boot_t to
> xen_t. Then rely upon a boot domain to issue an xsm_op to do a relabel
> transition for the idle domain with an assertion that the idle domain is
> no longer labeled with its initial sid before Xen transitions its state
> to SYS_STATE_active. The one wrinkle to this is whether I will be able
> to schedule the boot domain before allowing Xen to transition into
> SYS_STATE_active.

That is an interesting approach.  While it would work, I find it
unusual that a domain would relabel Xen.  I think Xen should be
responsible for itself and not rely on a domain for this operation.

> > For the default policy, you could start by creating the system domains
> > as privileged and just have a single hook to drop privs.  Then you
> > don't have to worry about the "elevate" hook existing.  The patch 2
> > asserts could instead become the location of xsm_drop_privs calls to
> > have a clear demarcation point.  That expands the window with
> > privileges though.  It's a little simpler, but maybe you don't want
> > that.  However, it seems like you can only depriv once for the Flask
> > case since you want it to be one-way.
>
> This does simplify the solution and since today we cannot differentiate
> between hypervisor setup and hypervisor initiated domain construction
> contexts, it does not run counter to what I have proposed. As for flask,
> again I do not believe codifying a domain transition bound to a new XSM
> op is the appropriate approach.

This hard coded domain transition does feel a little weird.  But it
seems like a natural consequence of trying to use Flask to
deprivilege.  I guess the transition could be behind a
dom0less/hyperlaunch Kconfig option.  I just don't see a way around it
in some fashion with Flask enforcing.

Another idea: Flask could start in permissive and only transition to
enforcing at the deprivilege point.  Kinda gross, but it works without
needing a transition.

To reiterate, XSM isn't really appropriate to enforce anything
internal to Xen.  We are working around the need to go through hook
points during correct operation.  Code exec in Xen means all bets are
off.  Memory writes to Xen data mean the XSM checks can be disabled
(flip Flask to permissive) or bypassed (set d->is_privileged or change
d->ssid).  We shouldn't lose sight of this when we talk about
deprivileging the idle domain.

Regards,
Jason

Daniel P. Smith April 5, 2022, 7:05 p.m. UTC | #16

On 4/5/22 13:17, Jason Andryuk wrote:
> On Mon, Apr 4, 2022 at 11:34 AM Daniel P. Smith
> <dpsmith@apertussolutions.com> wrote:
>>
>> On 3/31/22 09:16, Jason Andryuk wrote:
>>> On Wed, Mar 30, 2022 at 3:05 PM Daniel P. Smith
>>> <dpsmith@apertussolutions.com> wrote:
>>>>
>>>> There are now instances where internal hypervisor logic needs to make resource
>>>> allocation calls that are protected by XSM checks. The internal hypervisor logic
>>>> is represented a number of system domains which by designed are represented by
>>>> non-privileged struct domain instances. To enable these logic blocks to
>>>> function correctly but in a controlled manner, this commit introduces a pair
>>>> of privilege escalation and demotion functions that will make a system domain
>>>> privileged and then remove that privilege.
>>>>
>>>> Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
>>>> ---
>>>>  xen/include/xsm/xsm.h | 22 ++++++++++++++++++++++
>>>>  1 file changed, 22 insertions(+)
>>>>
>>>> diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
>>>> index e22d6160b5..157e57151e 100644
>>>> --- a/xen/include/xsm/xsm.h
>>>> +++ b/xen/include/xsm/xsm.h
>>>> @@ -189,6 +189,28 @@ struct xsm_operations {
>>>>  #endif
>>>>  };
>>>>
>>>> +static always_inline int xsm_elevate_priv(struct domain *d)
>>>> +{
>>>> +    if ( is_system_domain(d) )
>>>> +    {
>>>> +        d->is_privileged = true;
>>>> +        return 0;
>>>> +    }
>>>> +
>>>> +    return -EPERM;
>>>> +}
>>>
>>> These look sufficient for the default policy, but they don't seem
>>> sufficient for Flask.  I think you need to create a new XSM hook.  For
>>> Flask, you would want the demote hook to transition xen_boot_t ->
>>> xen_t.  That would start xen_boot_t with privileges that are dropped
>>> in a one-way transition.  Does that require all policies to then have
>>> xen_boot_t and xen_t?  I guess it does unless the hook code has some
>>> logic to skip the transition.
>>
>> I am still thinking this through but my initial concern for Flask is
>> that I don't think we want dedicated domain transitions directly in
>> code. My current thinking would be to use a Kconfig to use xen_boot_t
>> type as the initial sid for the idle domain which would then require the
>> default policy to include an allowed transition from xen_boot_t to
>> xen_t. Then rely upon a boot domain to issue an xsm_op to do a relabel
>> transition for the idle domain with an assertion that the idle domain is
>> no longer labeled with its initial sid before Xen transitions its state
>> to SYS_STATE_active. The one wrinkle to this is whether I will be able
>> to schedule the boot domain before allowing Xen to transition into
>> SYS_STATE_active.
> 
> That is an interesting approach.  While it would work, I find it
> unusual that a domain would relabel Xen.  I think Xen should be
> responsible for itself and not rely on a domain for this operation.

The boot domain is not a general domain as no domain can/should be
created with its domid or flask label post transition to
SYS_STATE_active. Its purpose was specifically meant to be a natural way
to push out complicated pre-execution domain configuration from having
to be in they hypervisor code. Therefore in a way it can be considered a
user provided de-privileged part of the hypervisor.

With that said, I just realized a flaw in the basis of my position. What
is the difference between codifying a check that the idle domain is not
the boot label versus codifying a transition from the boot label to the
running label? None really, both will require some knowledge that there
is a boot label and some running label. Combine with the fact that the
idle domain really shouldn't have any other label than xen_t. I will
work out how to incorporate the domain transition.

>>> For the default policy, you could start by creating the system domains
>>> as privileged and just have a single hook to drop privs.  Then you
>>> don't have to worry about the "elevate" hook existing.  The patch 2
>>> asserts could instead become the location of xsm_drop_privs calls to
>>> have a clear demarcation point.  That expands the window with
>>> privileges though.  It's a little simpler, but maybe you don't want
>>> that.  However, it seems like you can only depriv once for the Flask
>>> case since you want it to be one-way.
>>
>> This does simplify the solution and since today we cannot differentiate
>> between hypervisor setup and hypervisor initiated domain construction
>> contexts, it does not run counter to what I have proposed. As for flask,
>> again I do not believe codifying a domain transition bound to a new XSM
>> op is the appropriate approach.
> 
> This hard coded domain transition does feel a little weird.  But it
> seems like a natural consequence of trying to use Flask to
> deprivilege.  I guess the transition could be behind a
> dom0less/hyperlaunch Kconfig option.  I just don't see a way around it
> in some fashion with Flask enforcing.
> 
> Another idea: Flask could start in permissive and only transition to
> enforcing at the deprivilege point.  Kinda gross, but it works without
> needing a transition.
> 
> To reiterate, XSM isn't really appropriate to enforce anything
> internal to Xen.  We are working around the need to go through hook
> points during correct operation.  Code exec in Xen means all bets are
> off.  Memory writes to Xen data mean the XSM checks can be disabled
> (flip Flask to permissive) or bypassed (set d->is_privileged or change
> d->ssid).  We shouldn't lose sight of this when we talk about
> deprivileging the idle domain.

I would far prefer a transition than trying to reason about whether
flask should be in permissive or enforcing based on this state change in
combination of command line setting.

v/r
dps

Jan Beulich April 6, 2022, 7:06 a.m. UTC | #17

On 05.04.2022 19:17, Jason Andryuk wrote:
> On Mon, Apr 4, 2022 at 11:34 AM Daniel P. Smith <dpsmith@apertussolutions.com> wrote:
>> On 3/31/22 09:16, Jason Andryuk wrote:
>>> For the default policy, you could start by creating the system domains
>>> as privileged and just have a single hook to drop privs.  Then you
>>> don't have to worry about the "elevate" hook existing.  The patch 2
>>> asserts could instead become the location of xsm_drop_privs calls to
>>> have a clear demarcation point.  That expands the window with
>>> privileges though.  It's a little simpler, but maybe you don't want
>>> that.  However, it seems like you can only depriv once for the Flask
>>> case since you want it to be one-way.
>>
>> This does simplify the solution and since today we cannot differentiate
>> between hypervisor setup and hypervisor initiated domain construction
>> contexts, it does not run counter to what I have proposed. As for flask,
>> again I do not believe codifying a domain transition bound to a new XSM
>> op is the appropriate approach.
> 
> This hard coded domain transition does feel a little weird.  But it
> seems like a natural consequence of trying to use Flask to
> deprivilege.  I guess the transition could be behind a
> dom0less/hyperlaunch Kconfig option.  I just don't see a way around it
> in some fashion with Flask enforcing.
> 
> Another idea: Flask could start in permissive and only transition to
> enforcing at the deprivilege point.  Kinda gross, but it works without
> needing a transition.

I don't think that would be right. Logically such behavior ought to be
mirrored to SILO, and I'll take that for the example for being the
simpler model: Suppose an admin wants to disallow communication
between DomU-s created by Xen. Such would want enforcing when creating
those DomU-s, despite the creator (Xen) being all powerful. If the
device tree information said something different (e.g. directing for
an event channel to be established between two such DomU-s), this
should be flagged as an error, not be silently permitted.

Jan

Roger Pau Monné April 6, 2022, 8:46 a.m. UTC | #18

On Wed, Apr 06, 2022 at 09:06:59AM +0200, Jan Beulich wrote:
> On 05.04.2022 19:17, Jason Andryuk wrote:
> > On Mon, Apr 4, 2022 at 11:34 AM Daniel P. Smith <dpsmith@apertussolutions.com> wrote:
> >> On 3/31/22 09:16, Jason Andryuk wrote:
> >>> For the default policy, you could start by creating the system domains
> >>> as privileged and just have a single hook to drop privs.  Then you
> >>> don't have to worry about the "elevate" hook existing.  The patch 2
> >>> asserts could instead become the location of xsm_drop_privs calls to
> >>> have a clear demarcation point.  That expands the window with
> >>> privileges though.  It's a little simpler, but maybe you don't want
> >>> that.  However, it seems like you can only depriv once for the Flask
> >>> case since you want it to be one-way.
> >>
> >> This does simplify the solution and since today we cannot differentiate
> >> between hypervisor setup and hypervisor initiated domain construction
> >> contexts, it does not run counter to what I have proposed. As for flask,
> >> again I do not believe codifying a domain transition bound to a new XSM
> >> op is the appropriate approach.
> > 
> > This hard coded domain transition does feel a little weird.  But it
> > seems like a natural consequence of trying to use Flask to
> > deprivilege.  I guess the transition could be behind a
> > dom0less/hyperlaunch Kconfig option.  I just don't see a way around it
> > in some fashion with Flask enforcing.
> > 
> > Another idea: Flask could start in permissive and only transition to
> > enforcing at the deprivilege point.  Kinda gross, but it works without
> > needing a transition.
> 
> I don't think that would be right. Logically such behavior ought to be
> mirrored to SILO, and I'll take that for the example for being the
> simpler model: Suppose an admin wants to disallow communication
> between DomU-s created by Xen. Such would want enforcing when creating
> those DomU-s, despite the creator (Xen) being all powerful. If the
> device tree information said something different (e.g. directing for
> an event channel to be established between two such DomU-s), this
> should be flagged as an error, not be silently permitted.

I could also see this argument the other way around: what if an admin
wants to disallow domUs freely communicating between them, but would
still like some controlled domU communication to be possible by
setting up those channels at domain creation?

Thanks, Roger.

Jan Beulich April 6, 2022, 8:48 a.m. UTC | #19

On 06.04.2022 10:46, Roger Pau Monné wrote:
> On Wed, Apr 06, 2022 at 09:06:59AM +0200, Jan Beulich wrote:
>> On 05.04.2022 19:17, Jason Andryuk wrote:
>>> On Mon, Apr 4, 2022 at 11:34 AM Daniel P. Smith <dpsmith@apertussolutions.com> wrote:
>>>> On 3/31/22 09:16, Jason Andryuk wrote:
>>>>> For the default policy, you could start by creating the system domains
>>>>> as privileged and just have a single hook to drop privs.  Then you
>>>>> don't have to worry about the "elevate" hook existing.  The patch 2
>>>>> asserts could instead become the location of xsm_drop_privs calls to
>>>>> have a clear demarcation point.  That expands the window with
>>>>> privileges though.  It's a little simpler, but maybe you don't want
>>>>> that.  However, it seems like you can only depriv once for the Flask
>>>>> case since you want it to be one-way.
>>>>
>>>> This does simplify the solution and since today we cannot differentiate
>>>> between hypervisor setup and hypervisor initiated domain construction
>>>> contexts, it does not run counter to what I have proposed. As for flask,
>>>> again I do not believe codifying a domain transition bound to a new XSM
>>>> op is the appropriate approach.
>>>
>>> This hard coded domain transition does feel a little weird.  But it
>>> seems like a natural consequence of trying to use Flask to
>>> deprivilege.  I guess the transition could be behind a
>>> dom0less/hyperlaunch Kconfig option.  I just don't see a way around it
>>> in some fashion with Flask enforcing.
>>>
>>> Another idea: Flask could start in permissive and only transition to
>>> enforcing at the deprivilege point.  Kinda gross, but it works without
>>> needing a transition.
>>
>> I don't think that would be right. Logically such behavior ought to be
>> mirrored to SILO, and I'll take that for the example for being the
>> simpler model: Suppose an admin wants to disallow communication
>> between DomU-s created by Xen. Such would want enforcing when creating
>> those DomU-s, despite the creator (Xen) being all powerful. If the
>> device tree information said something different (e.g. directing for
>> an event channel to be established between two such DomU-s), this
>> should be flagged as an error, not be silently permitted.
> 
> I could also see this argument the other way around: what if an admin
> wants to disallow domUs freely communicating between them, but would
> still like some controlled domU communication to be possible by
> setting up those channels at domain creation?

Well, imo that would require a proper (Flask) policy then, not SILO mode.

Jan

Roger Pau Monné April 6, 2022, 9:09 a.m. UTC | #20

On Wed, Apr 06, 2022 at 10:48:23AM +0200, Jan Beulich wrote:
> On 06.04.2022 10:46, Roger Pau Monné wrote:
> > On Wed, Apr 06, 2022 at 09:06:59AM +0200, Jan Beulich wrote:
> >> On 05.04.2022 19:17, Jason Andryuk wrote:
> >>> On Mon, Apr 4, 2022 at 11:34 AM Daniel P. Smith <dpsmith@apertussolutions.com> wrote:
> >>>> On 3/31/22 09:16, Jason Andryuk wrote:
> >>>>> For the default policy, you could start by creating the system domains
> >>>>> as privileged and just have a single hook to drop privs.  Then you
> >>>>> don't have to worry about the "elevate" hook existing.  The patch 2
> >>>>> asserts could instead become the location of xsm_drop_privs calls to
> >>>>> have a clear demarcation point.  That expands the window with
> >>>>> privileges though.  It's a little simpler, but maybe you don't want
> >>>>> that.  However, it seems like you can only depriv once for the Flask
> >>>>> case since you want it to be one-way.
> >>>>
> >>>> This does simplify the solution and since today we cannot differentiate
> >>>> between hypervisor setup and hypervisor initiated domain construction
> >>>> contexts, it does not run counter to what I have proposed. As for flask,
> >>>> again I do not believe codifying a domain transition bound to a new XSM
> >>>> op is the appropriate approach.
> >>>
> >>> This hard coded domain transition does feel a little weird.  But it
> >>> seems like a natural consequence of trying to use Flask to
> >>> deprivilege.  I guess the transition could be behind a
> >>> dom0less/hyperlaunch Kconfig option.  I just don't see a way around it
> >>> in some fashion with Flask enforcing.
> >>>
> >>> Another idea: Flask could start in permissive and only transition to
> >>> enforcing at the deprivilege point.  Kinda gross, but it works without
> >>> needing a transition.
> >>
> >> I don't think that would be right. Logically such behavior ought to be
> >> mirrored to SILO, and I'll take that for the example for being the
> >> simpler model: Suppose an admin wants to disallow communication
> >> between DomU-s created by Xen. Such would want enforcing when creating
> >> those DomU-s, despite the creator (Xen) being all powerful. If the
> >> device tree information said something different (e.g. directing for
> >> an event channel to be established between two such DomU-s), this
> >> should be flagged as an error, not be silently permitted.
> > 
> > I could also see this argument the other way around: what if an admin
> > wants to disallow domUs freely communicating between them, but would
> > still like some controlled domU communication to be possible by
> > setting up those channels at domain creation?
> 
> Well, imo that would require a proper (Flask) policy then, not SILO mode.

But when creating such domains in SILO mode from dom0, dom0 would be
allowed to create and bind event channels to the created domains, even
if the domain themselves are not allowed the operation.

That's because the check in evtchn_bind_interdomain() is done against
'current' and not the domain where the event channel will be bound.
Maybe such check should instead take 3 parameters: current context
domain, domain to bind the event channel to and remote domain on the
other end of the event channel.

Thanks, Roger.

Jan Beulich April 6, 2022, 9:16 a.m. UTC | #21

On 06.04.2022 11:09, Roger Pau Monné wrote:
> On Wed, Apr 06, 2022 at 10:48:23AM +0200, Jan Beulich wrote:
>> On 06.04.2022 10:46, Roger Pau Monné wrote:
>>> On Wed, Apr 06, 2022 at 09:06:59AM +0200, Jan Beulich wrote:
>>>> On 05.04.2022 19:17, Jason Andryuk wrote:
>>>>> On Mon, Apr 4, 2022 at 11:34 AM Daniel P. Smith <dpsmith@apertussolutions.com> wrote:
>>>>>> On 3/31/22 09:16, Jason Andryuk wrote:
>>>>>>> For the default policy, you could start by creating the system domains
>>>>>>> as privileged and just have a single hook to drop privs.  Then you
>>>>>>> don't have to worry about the "elevate" hook existing.  The patch 2
>>>>>>> asserts could instead become the location of xsm_drop_privs calls to
>>>>>>> have a clear demarcation point.  That expands the window with
>>>>>>> privileges though.  It's a little simpler, but maybe you don't want
>>>>>>> that.  However, it seems like you can only depriv once for the Flask
>>>>>>> case since you want it to be one-way.
>>>>>>
>>>>>> This does simplify the solution and since today we cannot differentiate
>>>>>> between hypervisor setup and hypervisor initiated domain construction
>>>>>> contexts, it does not run counter to what I have proposed. As for flask,
>>>>>> again I do not believe codifying a domain transition bound to a new XSM
>>>>>> op is the appropriate approach.
>>>>>
>>>>> This hard coded domain transition does feel a little weird.  But it
>>>>> seems like a natural consequence of trying to use Flask to
>>>>> deprivilege.  I guess the transition could be behind a
>>>>> dom0less/hyperlaunch Kconfig option.  I just don't see a way around it
>>>>> in some fashion with Flask enforcing.
>>>>>
>>>>> Another idea: Flask could start in permissive and only transition to
>>>>> enforcing at the deprivilege point.  Kinda gross, but it works without
>>>>> needing a transition.
>>>>
>>>> I don't think that would be right. Logically such behavior ought to be
>>>> mirrored to SILO, and I'll take that for the example for being the
>>>> simpler model: Suppose an admin wants to disallow communication
>>>> between DomU-s created by Xen. Such would want enforcing when creating
>>>> those DomU-s, despite the creator (Xen) being all powerful. If the
>>>> device tree information said something different (e.g. directing for
>>>> an event channel to be established between two such DomU-s), this
>>>> should be flagged as an error, not be silently permitted.
>>>
>>> I could also see this argument the other way around: what if an admin
>>> wants to disallow domUs freely communicating between them, but would
>>> still like some controlled domU communication to be possible by
>>> setting up those channels at domain creation?
>>
>> Well, imo that would require a proper (Flask) policy then, not SILO mode.
> 
> But when creating such domains in SILO mode from dom0, dom0 would be
> allowed to create and bind event channels to the created domains, even
> if the domain themselves are not allowed the operation.
> 
> That's because the check in evtchn_bind_interdomain() is done against
> 'current' and not the domain where the event channel will be bound.

Yes and no - the check is against current, but that's because
communication is established between current ( == ld) and rd. The
function in its present shape simply can't establish a channel
between two arbitrary domains.

Jan

> Maybe such check should instead take 3 parameters: current context
> domain, domain to bind the event channel to and remote domain on the
> other end of the event channel.
> 
> Thanks, Roger.
>

Roger Pau Monné April 6, 2022, 9:40 a.m. UTC | #22

On Wed, Apr 06, 2022 at 11:16:10AM +0200, Jan Beulich wrote:
> On 06.04.2022 11:09, Roger Pau Monné wrote:
> > On Wed, Apr 06, 2022 at 10:48:23AM +0200, Jan Beulich wrote:
> >> On 06.04.2022 10:46, Roger Pau Monné wrote:
> >>> On Wed, Apr 06, 2022 at 09:06:59AM +0200, Jan Beulich wrote:
> >>>> On 05.04.2022 19:17, Jason Andryuk wrote:
> >>>>> On Mon, Apr 4, 2022 at 11:34 AM Daniel P. Smith <dpsmith@apertussolutions.com> wrote:
> >>>>>> On 3/31/22 09:16, Jason Andryuk wrote:
> >>>>>>> For the default policy, you could start by creating the system domains
> >>>>>>> as privileged and just have a single hook to drop privs.  Then you
> >>>>>>> don't have to worry about the "elevate" hook existing.  The patch 2
> >>>>>>> asserts could instead become the location of xsm_drop_privs calls to
> >>>>>>> have a clear demarcation point.  That expands the window with
> >>>>>>> privileges though.  It's a little simpler, but maybe you don't want
> >>>>>>> that.  However, it seems like you can only depriv once for the Flask
> >>>>>>> case since you want it to be one-way.
> >>>>>>
> >>>>>> This does simplify the solution and since today we cannot differentiate
> >>>>>> between hypervisor setup and hypervisor initiated domain construction
> >>>>>> contexts, it does not run counter to what I have proposed. As for flask,
> >>>>>> again I do not believe codifying a domain transition bound to a new XSM
> >>>>>> op is the appropriate approach.
> >>>>>
> >>>>> This hard coded domain transition does feel a little weird.  But it
> >>>>> seems like a natural consequence of trying to use Flask to
> >>>>> deprivilege.  I guess the transition could be behind a
> >>>>> dom0less/hyperlaunch Kconfig option.  I just don't see a way around it
> >>>>> in some fashion with Flask enforcing.
> >>>>>
> >>>>> Another idea: Flask could start in permissive and only transition to
> >>>>> enforcing at the deprivilege point.  Kinda gross, but it works without
> >>>>> needing a transition.
> >>>>
> >>>> I don't think that would be right. Logically such behavior ought to be
> >>>> mirrored to SILO, and I'll take that for the example for being the
> >>>> simpler model: Suppose an admin wants to disallow communication
> >>>> between DomU-s created by Xen. Such would want enforcing when creating
> >>>> those DomU-s, despite the creator (Xen) being all powerful. If the
> >>>> device tree information said something different (e.g. directing for
> >>>> an event channel to be established between two such DomU-s), this
> >>>> should be flagged as an error, not be silently permitted.
> >>>
> >>> I could also see this argument the other way around: what if an admin
> >>> wants to disallow domUs freely communicating between them, but would
> >>> still like some controlled domU communication to be possible by
> >>> setting up those channels at domain creation?
> >>
> >> Well, imo that would require a proper (Flask) policy then, not SILO mode.
> > 
> > But when creating such domains in SILO mode from dom0, dom0 would be
> > allowed to create and bind event channels to the created domains, even
> > if the domain themselves are not allowed the operation.
> > 
> > That's because the check in evtchn_bind_interdomain() is done against
> > 'current' and not the domain where the event channel will be bound.
> 
> Yes and no - the check is against current, but that's because
> communication is established between current ( == ld) and rd. The
> function in its present shape simply can't establish a channel
> between two arbitrary domains.

I've got confused with evtchn_alloc_unbound() that does take a local
and remote domains as parameters, but in that case the xsm check is
done against ld (which might not be current) and rd.

Thanks, Roger.

Jason Andryuk April 6, 2022, 12:31 p.m. UTC | #23

On Wed, Apr 6, 2022 at 3:07 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 05.04.2022 19:17, Jason Andryuk wrote:
> > On Mon, Apr 4, 2022 at 11:34 AM Daniel P. Smith <dpsmith@apertussolutions.com> wrote:
> >> On 3/31/22 09:16, Jason Andryuk wrote:
> >>> For the default policy, you could start by creating the system domains
> >>> as privileged and just have a single hook to drop privs.  Then you
> >>> don't have to worry about the "elevate" hook existing.  The patch 2
> >>> asserts could instead become the location of xsm_drop_privs calls to
> >>> have a clear demarcation point.  That expands the window with
> >>> privileges though.  It's a little simpler, but maybe you don't want
> >>> that.  However, it seems like you can only depriv once for the Flask
> >>> case since you want it to be one-way.
> >>
> >> This does simplify the solution and since today we cannot differentiate
> >> between hypervisor setup and hypervisor initiated domain construction
> >> contexts, it does not run counter to what I have proposed. As for flask,
> >> again I do not believe codifying a domain transition bound to a new XSM
> >> op is the appropriate approach.
> >
> > This hard coded domain transition does feel a little weird.  But it
> > seems like a natural consequence of trying to use Flask to
> > deprivilege.  I guess the transition could be behind a
> > dom0less/hyperlaunch Kconfig option.  I just don't see a way around it
> > in some fashion with Flask enforcing.
> >
> > Another idea: Flask could start in permissive and only transition to
> > enforcing at the deprivilege point.  Kinda gross, but it works without
> > needing a transition.
>
> I don't think that would be right. Logically such behavior ought to be
> mirrored to SILO, and I'll take that for the example for being the
> simpler model: Suppose an admin wants to disallow communication
> between DomU-s created by Xen. Such would want enforcing when creating
> those DomU-s, despite the creator (Xen) being all powerful. If the
> device tree information said something different (e.g. directing for
> an event channel to be established between two such DomU-s), this
> should be flagged as an error, not be silently permitted.

Yes, you are right.  As you say, you want the policy enforced when
creating the DomU-s.

Regards,
Jason

[1/2] xsm: add ability to elevate a domain to privileged

Commit Message

Comments

Patch