diff mbox series

[v2,15/21] libs/guest: obtain a compatible cpu policy from two input ones

Message ID 20210413140140.73690-16-roger.pau@citrix.com (mailing list archive)
State Superseded
Headers show
Series libs/guest: new CPUID/MSR interface | expand

Commit Message

Roger Pau Monné April 13, 2021, 2:01 p.m. UTC
Introduce a helper to obtain a compatible cpu policy based on two
input cpu policies. Currently this is done by and'ing all CPUID leaves
and MSR entries, except for MSR_ARCH_CAPABILITIES which has the RSBA
bit or'ed.

The _AC macro is pulled from libxl_internal.h into xen-tools/libs.h
since it's required in order to use the msr-index.h header.

Note there's no need to place this helper in libx86, since the
calculation of a compatible policy shouldn't be done from the
hypervisor.

No callers of the interface introduced.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Only AND the feature parts of cpuid.
 - Use a binary search to find the matching leaves and msr entries.
 - Remove default case from MSR level function.
---
 tools/include/xen-tools/libs.h    |   5 ++
 tools/include/xenctrl.h           |   4 +
 tools/libs/guest/xg_cpuid_x86.c   | 128 ++++++++++++++++++++++++++++++
 tools/libs/light/libxl_internal.h |   2 -
 4 files changed, 137 insertions(+), 2 deletions(-)

Comments

Jan Beulich April 14, 2021, 1:49 p.m. UTC | #1
On 13.04.2021 16:01, Roger Pau Monne wrote:
> @@ -944,3 +945,130 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, const xc_cpu_policy_t host,
>  
>      return false;
>  }
> +
> +static uint64_t level_msr(unsigned int index, uint64_t val1, uint64_t val2)
> +{
> +    uint64_t val = val1 & val2;;

For arbitrary MSRs this isn't going to do any good. If only very
specific MSRs are assumed to make it here, I think this wants
commenting on.

Also, nit: stray semicolon.

> +    switch ( index )
> +    {
> +    case MSR_ARCH_CAPABILITIES:
> +        /*
> +         * Set RSBA if present on any of the input values to notice the guest
> +         * might run on vulnerable hardware at some point.
> +         */
> +        val |= (val1 | val2) & ARCH_CAPS_RSBA;
> +        break;
> +    }
> +
> +    return val;
> +}
> +
> +static bool level_leaf(xen_cpuid_leaf_t *l1, xen_cpuid_leaf_t *l2,

const (twice)?

> +                       xen_cpuid_leaf_t *out)
> +{
> +    *out = (xen_cpuid_leaf_t){ };
> +
> +    switch ( l1->leaf )
> +    {
> +    case 0x1:
> +    case 0x80000001:
> +        out->c = l1->c & l2->c;
> +        out->d = l1->d & l2->d;
> +        return true;
> +
> +    case 0xd:
> +        if ( l1->subleaf != 1 )
> +            break;
> +        out->a = l1->a & l2->a;
> +        return true;

Could you explain your thinking behind this (a code comment would
likely help)? You effectively discard everything except subleaf 1
by returning false in that case, don't you?

> +    case 0x7:
> +        switch ( l1->subleaf )
> +        {
> +        case 0:
> +            out->b = l1->b & l2->b;
> +            out->c = l1->c & l2->c;
> +            out->d = l1->d & l2->d;
> +            return true;
> +
> +        case 1:
> +            out->a = l1->a & l2->a;
> +            return true;
> +        }
> +        break;

Can we perhaps assume all subleaves here are going to hold flags,
and hence and both sides together without regard to what subleaf
we're actually dealing with (subleaf 1 remaining special as to
EAX of course)? This would avoid having to remember to make yet
another mechanical change when enabling a new subleaf.

> +    case 0x80000007:
> +        out->d = l1->d & l2->d;
> +        return true;
> +
> +    case 0x80000008:
> +        out->b = l1->b & l2->b;
> +        return true;
> +    }
> +
> +    return false;
> +}

Considering your LFENCE-always-serializing patch, I assume
whichever ends up going in last will take care of adding handling
of that leaf here?

Jan
Wei Liu April 21, 2021, 10:22 a.m. UTC | #2
On Tue, Apr 13, 2021 at 04:01:33PM +0200, Roger Pau Monne wrote:
> Introduce a helper to obtain a compatible cpu policy based on two
> input cpu policies. Currently this is done by and'ing all CPUID leaves
> and MSR entries, except for MSR_ARCH_CAPABILITIES which has the RSBA
> bit or'ed.
> 

I thought canonical source for compatibility was to be put into the
hypervisor, thus all this functionality would be in the hypervisor. Am I
misremembering?

Wei.
Roger Pau Monné April 21, 2021, 11:26 a.m. UTC | #3
On Wed, Apr 21, 2021 at 10:22:39AM +0000, Wei Liu wrote:
> On Tue, Apr 13, 2021 at 04:01:33PM +0200, Roger Pau Monne wrote:
> > Introduce a helper to obtain a compatible cpu policy based on two
> > input cpu policies. Currently this is done by and'ing all CPUID leaves
> > and MSR entries, except for MSR_ARCH_CAPABILITIES which has the RSBA
> > bit or'ed.
> > 
> 
> I thought canonical source for compatibility was to be put into the
> hypervisor, thus all this functionality would be in the hypervisor. Am I
> misremembering?

Andrew said something similar on v1, but I'm not able to figure how
this would be used by the hypervisor.

It's my understating that the toolstack will attempt to generate a CPU
policy and forward it to the hypervisor, which will either accept or
reject it based on the capabilities of the system. I'm not sure I see
why we would need to give the hypervisor two policies in order to
generate a resulting compatible one - it should all be done by the
toolstack AFAICT.

If there's a use case for this being in the hypervisor I'm happy to
add it there, but so far I haven't been able to come up with one
myself, and hence I don't see the need to make the code available.

Thanks, Roger.
Wei Liu April 21, 2021, 4:42 p.m. UTC | #4
On Wed, Apr 21, 2021 at 01:26:49PM +0200, Roger Pau Monné wrote:
> On Wed, Apr 21, 2021 at 10:22:39AM +0000, Wei Liu wrote:
> > On Tue, Apr 13, 2021 at 04:01:33PM +0200, Roger Pau Monne wrote:
> > > Introduce a helper to obtain a compatible cpu policy based on two
> > > input cpu policies. Currently this is done by and'ing all CPUID leaves
> > > and MSR entries, except for MSR_ARCH_CAPABILITIES which has the RSBA
> > > bit or'ed.
> > > 
> > 
> > I thought canonical source for compatibility was to be put into the
> > hypervisor, thus all this functionality would be in the hypervisor. Am I
> > misremembering?
> 
> Andrew said something similar on v1, but I'm not able to figure how
> this would be used by the hypervisor.
> 
> It's my understating that the toolstack will attempt to generate a CPU
> policy and forward it to the hypervisor, which will either accept or
> reject it based on the capabilities of the system. I'm not sure I see
> why we would need to give the hypervisor two policies in order to
> generate a resulting compatible one - it should all be done by the
> toolstack AFAICT.
> 
> If there's a use case for this being in the hypervisor I'm happy to
> add it there, but so far I haven't been able to come up with one
> myself, and hence I don't see the need to make the code available.

I have no opinion here. Just wanted to get on the same page.

Thanks for the explanation.

Wei.

> 
> Thanks, Roger.
Roger Pau Monné April 22, 2021, 9:42 a.m. UTC | #5
On Wed, Apr 14, 2021 at 03:49:02PM +0200, Jan Beulich wrote:
> On 13.04.2021 16:01, Roger Pau Monne wrote:
> > @@ -944,3 +945,130 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, const xc_cpu_policy_t host,
> >  
> >      return false;
> >  }
> > +
> > +static uint64_t level_msr(unsigned int index, uint64_t val1, uint64_t val2)
> > +{
> > +    uint64_t val = val1 & val2;;
> 
> For arbitrary MSRs this isn't going to do any good. If only very
> specific MSRs are assumed to make it here, I think this wants
> commenting on.

I've added: "MSRs passed to level_msr are expected to be bitmaps of
features"

> > +                       xen_cpuid_leaf_t *out)
> > +{
> > +    *out = (xen_cpuid_leaf_t){ };
> > +
> > +    switch ( l1->leaf )
> > +    {
> > +    case 0x1:
> > +    case 0x80000001:
> > +        out->c = l1->c & l2->c;
> > +        out->d = l1->d & l2->d;
> > +        return true;
> > +
> > +    case 0xd:
> > +        if ( l1->subleaf != 1 )
> > +            break;
> > +        out->a = l1->a & l2->a;
> > +        return true;
> 
> Could you explain your thinking behind this (a code comment would
> likely help)? You effectively discard everything except subleaf 1
> by returning false in that case, don't you?

Yes, the intent is to only level the features bitfield found in
subleaf 1.

I was planning for level_leaf so far in this series to deal with the
feature leaves part of the featureset only. I guess you would also
like to leverage other parts of the xstate leaf, like the max_size or
the supported bits in xss_{low,high}?

> > +    case 0x7:
> > +        switch ( l1->subleaf )
> > +        {
> > +        case 0:
> > +            out->b = l1->b & l2->b;
> > +            out->c = l1->c & l2->c;
> > +            out->d = l1->d & l2->d;
> > +            return true;
> > +
> > +        case 1:
> > +            out->a = l1->a & l2->a;
> > +            return true;
> > +        }
> > +        break;
> 
> Can we perhaps assume all subleaves here are going to hold flags,
> and hence and both sides together without regard to what subleaf
> we're actually dealing with (subleaf 1 remaining special as to

I think you meant subleaf 0 EAX (the max subleaf value)?

subleaf 1 EAX is just a normal bitfield AFAIK.

> EAX of course)? This would avoid having to remember to make yet
> another mechanical change when enabling a new subleaf.

Sure, seems like a fine approach since leaf 7 will only contain
feature bitmaps.

> > +    case 0x80000007:
> > +        out->d = l1->d & l2->d;
> > +        return true;
> > +
> > +    case 0x80000008:
> > +        out->b = l1->b & l2->b;
> > +        return true;
> > +    }
> > +
> > +    return false;
> > +}
> 
> Considering your LFENCE-always-serializing patch, I assume
> whichever ends up going in last will take care of adding handling
> of that leaf here?

Indeed.

Thanks, Roger.
Jan Beulich April 22, 2021, 9:58 a.m. UTC | #6
On 22.04.2021 11:42, Roger Pau Monné wrote:
> On Wed, Apr 14, 2021 at 03:49:02PM +0200, Jan Beulich wrote:
>> On 13.04.2021 16:01, Roger Pau Monne wrote:
>>> @@ -944,3 +945,130 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, const xc_cpu_policy_t host,
>>>  
>>>      return false;
>>>  }
>>> +
>>> +static uint64_t level_msr(unsigned int index, uint64_t val1, uint64_t val2)
>>> +{
>>> +    uint64_t val = val1 & val2;;
>>
>> For arbitrary MSRs this isn't going to do any good. If only very
>> specific MSRs are assumed to make it here, I think this wants
>> commenting on.
> 
> I've added: "MSRs passed to level_msr are expected to be bitmaps of
> features"

How does such a comment help? I.e. how does the caller tell which MSRs
to pass here and which to deal with anyother way?

>>> +                       xen_cpuid_leaf_t *out)
>>> +{
>>> +    *out = (xen_cpuid_leaf_t){ };
>>> +
>>> +    switch ( l1->leaf )
>>> +    {
>>> +    case 0x1:
>>> +    case 0x80000001:
>>> +        out->c = l1->c & l2->c;
>>> +        out->d = l1->d & l2->d;
>>> +        return true;
>>> +
>>> +    case 0xd:
>>> +        if ( l1->subleaf != 1 )
>>> +            break;
>>> +        out->a = l1->a & l2->a;
>>> +        return true;
>>
>> Could you explain your thinking behind this (a code comment would
>> likely help)? You effectively discard everything except subleaf 1
>> by returning false in that case, don't you?
> 
> Yes, the intent is to only level the features bitfield found in
> subleaf 1.
> 
> I was planning for level_leaf so far in this series to deal with the
> feature leaves part of the featureset only. I guess you would also
> like to leverage other parts of the xstate leaf, like the max_size or
> the supported bits in xss_{low,high}?

The latter is clearly one of the things to consider, yes (alongside
the respective bits in sub-leaf 0 for XCR0). Sub-leaves > 1 may also
need dealing with ECX. Yet then again some or all of this may need
handling elsewhere, not the least because of the unusual handling of
leaf 0xd in the hypervisor. What gets checked and/or adjusted where
needs to be settled upon, and then the different parts of code would
imo better cross-reference each other.

>>> +    case 0x7:
>>> +        switch ( l1->subleaf )
>>> +        {
>>> +        case 0:
>>> +            out->b = l1->b & l2->b;
>>> +            out->c = l1->c & l2->c;
>>> +            out->d = l1->d & l2->d;
>>> +            return true;
>>> +
>>> +        case 1:
>>> +            out->a = l1->a & l2->a;
>>> +            return true;
>>> +        }
>>> +        break;
>>
>> Can we perhaps assume all subleaves here are going to hold flags,
>> and hence and both sides together without regard to what subleaf
>> we're actually dealing with (subleaf 1 remaining special as to
> 
> I think you meant subleaf 0 EAX (the max subleaf value)?
> 
> subleaf 1 EAX is just a normal bitfield AFAIK.

Indeed I did.

Jan

>> EAX of course)? This would avoid having to remember to make yet
>> another mechanical change when enabling a new subleaf.
> 
> Sure, seems like a fine approach since leaf 7 will only contain
> feature bitmaps.
Roger Pau Monné April 22, 2021, 10:34 a.m. UTC | #7
On Thu, Apr 22, 2021 at 11:58:45AM +0200, Jan Beulich wrote:
> On 22.04.2021 11:42, Roger Pau Monné wrote:
> > On Wed, Apr 14, 2021 at 03:49:02PM +0200, Jan Beulich wrote:
> >> On 13.04.2021 16:01, Roger Pau Monne wrote:
> >>> @@ -944,3 +945,130 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, const xc_cpu_policy_t host,
> >>>  
> >>>      return false;
> >>>  }
> >>> +
> >>> +static uint64_t level_msr(unsigned int index, uint64_t val1, uint64_t val2)
> >>> +{
> >>> +    uint64_t val = val1 & val2;;
> >>
> >> For arbitrary MSRs this isn't going to do any good. If only very
> >> specific MSRs are assumed to make it here, I think this wants
> >> commenting on.
> > 
> > I've added: "MSRs passed to level_msr are expected to be bitmaps of
> > features"
> 
> How does such a comment help? I.e. how does the caller tell which MSRs
> to pass here and which to deal with anyother way?

All MSRs should be passed to level_msr, but it's handling logic would
need to be expanded to support MSRs that are not feature bitmaps.

It might be best to restore the previous switch and handle each MSR
specifically?

From your reply to v1 I wrongly misunderstood that you initially
wanted to handle all MSRs as bitmaps:

https://lore.kernel.org/xen-devel/f66e61d5-e4a0-cba3-f15c-73ca54ac4964@suse.com/

> >>> +                       xen_cpuid_leaf_t *out)
> >>> +{
> >>> +    *out = (xen_cpuid_leaf_t){ };
> >>> +
> >>> +    switch ( l1->leaf )
> >>> +    {
> >>> +    case 0x1:
> >>> +    case 0x80000001:
> >>> +        out->c = l1->c & l2->c;
> >>> +        out->d = l1->d & l2->d;
> >>> +        return true;
> >>> +
> >>> +    case 0xd:
> >>> +        if ( l1->subleaf != 1 )
> >>> +            break;
> >>> +        out->a = l1->a & l2->a;
> >>> +        return true;
> >>
> >> Could you explain your thinking behind this (a code comment would
> >> likely help)? You effectively discard everything except subleaf 1
> >> by returning false in that case, don't you?
> > 
> > Yes, the intent is to only level the features bitfield found in
> > subleaf 1.
> > 
> > I was planning for level_leaf so far in this series to deal with the
> > feature leaves part of the featureset only. I guess you would also
> > like to leverage other parts of the xstate leaf, like the max_size or
> > the supported bits in xss_{low,high}?
> 
> The latter is clearly one of the things to consider, yes (alongside
> the respective bits in sub-leaf 0 for XCR0). Sub-leaves > 1 may also
> need dealing with ECX. Yet then again some or all of this may need
> handling elsewhere, not the least because of the unusual handling of
> leaf 0xd in the hypervisor. What gets checked and/or adjusted where
> needs to be settled upon, and then the different parts of code would
> imo better cross-reference each other.

There's a comment in recalculate_xstate that mentions that Da1 leaf is
the only piece of information preserved, and that everything else is
derived from feature state. I don't think it makes sense to try to
level anything apart from Da1 if it's going to be discarded by
recalculate_xstate anyway?

I can add a comment here regarding why only Da1 is taken into account
for leveling so far.

Thanks, Roger.
Jan Beulich April 22, 2021, 10:48 a.m. UTC | #8
On 22.04.2021 12:34, Roger Pau Monné wrote:
> On Thu, Apr 22, 2021 at 11:58:45AM +0200, Jan Beulich wrote:
>> On 22.04.2021 11:42, Roger Pau Monné wrote:
>>> On Wed, Apr 14, 2021 at 03:49:02PM +0200, Jan Beulich wrote:
>>>> On 13.04.2021 16:01, Roger Pau Monne wrote:
>>>>> @@ -944,3 +945,130 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, const xc_cpu_policy_t host,
>>>>>  
>>>>>      return false;
>>>>>  }
>>>>> +
>>>>> +static uint64_t level_msr(unsigned int index, uint64_t val1, uint64_t val2)
>>>>> +{
>>>>> +    uint64_t val = val1 & val2;;
>>>>
>>>> For arbitrary MSRs this isn't going to do any good. If only very
>>>> specific MSRs are assumed to make it here, I think this wants
>>>> commenting on.
>>>
>>> I've added: "MSRs passed to level_msr are expected to be bitmaps of
>>> features"
>>
>> How does such a comment help? I.e. how does the caller tell which MSRs
>> to pass here and which to deal with anyother way?
> 
> All MSRs should be passed to level_msr, but it's handling logic would
> need to be expanded to support MSRs that are not feature bitmaps.
> 
> It might be best to restore the previous switch and handle each MSR
> specifically?

I think so, yes. We need to be very careful with what a possible
default case does there, though.

>>>>> +                       xen_cpuid_leaf_t *out)
>>>>> +{
>>>>> +    *out = (xen_cpuid_leaf_t){ };
>>>>> +
>>>>> +    switch ( l1->leaf )
>>>>> +    {
>>>>> +    case 0x1:
>>>>> +    case 0x80000001:
>>>>> +        out->c = l1->c & l2->c;
>>>>> +        out->d = l1->d & l2->d;
>>>>> +        return true;
>>>>> +
>>>>> +    case 0xd:
>>>>> +        if ( l1->subleaf != 1 )
>>>>> +            break;
>>>>> +        out->a = l1->a & l2->a;
>>>>> +        return true;
>>>>
>>>> Could you explain your thinking behind this (a code comment would
>>>> likely help)? You effectively discard everything except subleaf 1
>>>> by returning false in that case, don't you?
>>>
>>> Yes, the intent is to only level the features bitfield found in
>>> subleaf 1.
>>>
>>> I was planning for level_leaf so far in this series to deal with the
>>> feature leaves part of the featureset only. I guess you would also
>>> like to leverage other parts of the xstate leaf, like the max_size or
>>> the supported bits in xss_{low,high}?
>>
>> The latter is clearly one of the things to consider, yes (alongside
>> the respective bits in sub-leaf 0 for XCR0). Sub-leaves > 1 may also
>> need dealing with ECX. Yet then again some or all of this may need
>> handling elsewhere, not the least because of the unusual handling of
>> leaf 0xd in the hypervisor. What gets checked and/or adjusted where
>> needs to be settled upon, and then the different parts of code would
>> imo better cross-reference each other.
> 
> There's a comment in recalculate_xstate that mentions that Da1 leaf is
> the only piece of information preserved, and that everything else is
> derived from feature state. I don't think it makes sense to try to
> level anything apart from Da1 if it's going to be discarded by
> recalculate_xstate anyway?
> 
> I can add a comment here regarding why only Da1 is taken into account
> for leveling so far.

Yes, this would help. Thanks.

Jan
Roger Pau Monné April 22, 2021, 10:56 a.m. UTC | #9
On Thu, Apr 22, 2021 at 12:48:36PM +0200, Jan Beulich wrote:
> On 22.04.2021 12:34, Roger Pau Monné wrote:
> > On Thu, Apr 22, 2021 at 11:58:45AM +0200, Jan Beulich wrote:
> >> On 22.04.2021 11:42, Roger Pau Monné wrote:
> >>> On Wed, Apr 14, 2021 at 03:49:02PM +0200, Jan Beulich wrote:
> >>>> On 13.04.2021 16:01, Roger Pau Monne wrote:
> >>>>> @@ -944,3 +945,130 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, const xc_cpu_policy_t host,
> >>>>>  
> >>>>>      return false;
> >>>>>  }
> >>>>> +
> >>>>> +static uint64_t level_msr(unsigned int index, uint64_t val1, uint64_t val2)
> >>>>> +{
> >>>>> +    uint64_t val = val1 & val2;;
> >>>>
> >>>> For arbitrary MSRs this isn't going to do any good. If only very
> >>>> specific MSRs are assumed to make it here, I think this wants
> >>>> commenting on.
> >>>
> >>> I've added: "MSRs passed to level_msr are expected to be bitmaps of
> >>> features"
> >>
> >> How does such a comment help? I.e. how does the caller tell which MSRs
> >> to pass here and which to deal with anyother way?
> > 
> > All MSRs should be passed to level_msr, but it's handling logic would
> > need to be expanded to support MSRs that are not feature bitmaps.
> > 
> > It might be best to restore the previous switch and handle each MSR
> > specifically?
> 
> I think so, yes. We need to be very careful with what a possible
> default case does there, though.

Maybe it would be better to handle level_msr in a way similar to
level_leaf: return true/false to notice whether the MSR should be
added to the resulting compatible policy?

And then make the default case in level_msr just return false in order
to prevent adding MSRs not properly leveled to the policy?

Thanks, Roger.
Jan Beulich April 22, 2021, 11:05 a.m. UTC | #10
On 22.04.2021 12:56, Roger Pau Monné wrote:
> On Thu, Apr 22, 2021 at 12:48:36PM +0200, Jan Beulich wrote:
>> On 22.04.2021 12:34, Roger Pau Monné wrote:
>>> On Thu, Apr 22, 2021 at 11:58:45AM +0200, Jan Beulich wrote:
>>>> On 22.04.2021 11:42, Roger Pau Monné wrote:
>>>>> On Wed, Apr 14, 2021 at 03:49:02PM +0200, Jan Beulich wrote:
>>>>>> On 13.04.2021 16:01, Roger Pau Monne wrote:
>>>>>>> @@ -944,3 +945,130 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, const xc_cpu_policy_t host,
>>>>>>>  
>>>>>>>      return false;
>>>>>>>  }
>>>>>>> +
>>>>>>> +static uint64_t level_msr(unsigned int index, uint64_t val1, uint64_t val2)
>>>>>>> +{
>>>>>>> +    uint64_t val = val1 & val2;;
>>>>>>
>>>>>> For arbitrary MSRs this isn't going to do any good. If only very
>>>>>> specific MSRs are assumed to make it here, I think this wants
>>>>>> commenting on.
>>>>>
>>>>> I've added: "MSRs passed to level_msr are expected to be bitmaps of
>>>>> features"
>>>>
>>>> How does such a comment help? I.e. how does the caller tell which MSRs
>>>> to pass here and which to deal with anyother way?
>>>
>>> All MSRs should be passed to level_msr, but it's handling logic would
>>> need to be expanded to support MSRs that are not feature bitmaps.
>>>
>>> It might be best to restore the previous switch and handle each MSR
>>> specifically?
>>
>> I think so, yes. We need to be very careful with what a possible
>> default case does there, though.
> 
> Maybe it would be better to handle level_msr in a way similar to
> level_leaf: return true/false to notice whether the MSR should be
> added to the resulting compatible policy?
> 
> And then make the default case in level_msr just return false in order
> to prevent adding MSRs not properly leveled to the policy?

I'm afraid I'm not clear about the implications. What again is the
(planned?) final effect of an MSR not getting added there?

Jan
Roger Pau Monné April 22, 2021, 11:37 a.m. UTC | #11
On Thu, Apr 22, 2021 at 01:05:23PM +0200, Jan Beulich wrote:
> On 22.04.2021 12:56, Roger Pau Monné wrote:
> > On Thu, Apr 22, 2021 at 12:48:36PM +0200, Jan Beulich wrote:
> >> On 22.04.2021 12:34, Roger Pau Monné wrote:
> >>> On Thu, Apr 22, 2021 at 11:58:45AM +0200, Jan Beulich wrote:
> >>>> On 22.04.2021 11:42, Roger Pau Monné wrote:
> >>>>> On Wed, Apr 14, 2021 at 03:49:02PM +0200, Jan Beulich wrote:
> >>>>>> On 13.04.2021 16:01, Roger Pau Monne wrote:
> >>>>>>> @@ -944,3 +945,130 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, const xc_cpu_policy_t host,
> >>>>>>>  
> >>>>>>>      return false;
> >>>>>>>  }
> >>>>>>> +
> >>>>>>> +static uint64_t level_msr(unsigned int index, uint64_t val1, uint64_t val2)
> >>>>>>> +{
> >>>>>>> +    uint64_t val = val1 & val2;;
> >>>>>>
> >>>>>> For arbitrary MSRs this isn't going to do any good. If only very
> >>>>>> specific MSRs are assumed to make it here, I think this wants
> >>>>>> commenting on.
> >>>>>
> >>>>> I've added: "MSRs passed to level_msr are expected to be bitmaps of
> >>>>> features"
> >>>>
> >>>> How does such a comment help? I.e. how does the caller tell which MSRs
> >>>> to pass here and which to deal with anyother way?
> >>>
> >>> All MSRs should be passed to level_msr, but it's handling logic would
> >>> need to be expanded to support MSRs that are not feature bitmaps.
> >>>
> >>> It might be best to restore the previous switch and handle each MSR
> >>> specifically?
> >>
> >> I think so, yes. We need to be very careful with what a possible
> >> default case does there, though.
> > 
> > Maybe it would be better to handle level_msr in a way similar to
> > level_leaf: return true/false to notice whether the MSR should be
> > added to the resulting compatible policy?
> > 
> > And then make the default case in level_msr just return false in order
> > to prevent adding MSRs not properly leveled to the policy?
> 
> I'm afraid I'm not clear about the implications. What again is the
> (planned?) final effect of an MSR not getting added there?

Adding the MSR with a 0 value will zero out any previous value on the
'out' policy, while not adding it would leave the previous value
there given the current code in xc_cpu_policy_calc_compatible added by
this patch.

I would expect callers of xc_cpu_policy_calc_compatible to pass a
zeroed 'out' policy, so I think the end result should be the same.

Maybe I should also clear 'out' in xc_cpu_policy_calc_compatible, at
which point both options will get the same exact result.

Thanks, Roger.
Jan Beulich April 22, 2021, 11:42 a.m. UTC | #12
On 22.04.2021 13:37, Roger Pau Monné wrote:
> On Thu, Apr 22, 2021 at 01:05:23PM +0200, Jan Beulich wrote:
>> On 22.04.2021 12:56, Roger Pau Monné wrote:
>>> On Thu, Apr 22, 2021 at 12:48:36PM +0200, Jan Beulich wrote:
>>>> On 22.04.2021 12:34, Roger Pau Monné wrote:
>>>>> On Thu, Apr 22, 2021 at 11:58:45AM +0200, Jan Beulich wrote:
>>>>>> On 22.04.2021 11:42, Roger Pau Monné wrote:
>>>>>>> On Wed, Apr 14, 2021 at 03:49:02PM +0200, Jan Beulich wrote:
>>>>>>>> On 13.04.2021 16:01, Roger Pau Monne wrote:
>>>>>>>>> @@ -944,3 +945,130 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, const xc_cpu_policy_t host,
>>>>>>>>>  
>>>>>>>>>      return false;
>>>>>>>>>  }
>>>>>>>>> +
>>>>>>>>> +static uint64_t level_msr(unsigned int index, uint64_t val1, uint64_t val2)
>>>>>>>>> +{
>>>>>>>>> +    uint64_t val = val1 & val2;;
>>>>>>>>
>>>>>>>> For arbitrary MSRs this isn't going to do any good. If only very
>>>>>>>> specific MSRs are assumed to make it here, I think this wants
>>>>>>>> commenting on.
>>>>>>>
>>>>>>> I've added: "MSRs passed to level_msr are expected to be bitmaps of
>>>>>>> features"
>>>>>>
>>>>>> How does such a comment help? I.e. how does the caller tell which MSRs
>>>>>> to pass here and which to deal with anyother way?
>>>>>
>>>>> All MSRs should be passed to level_msr, but it's handling logic would
>>>>> need to be expanded to support MSRs that are not feature bitmaps.
>>>>>
>>>>> It might be best to restore the previous switch and handle each MSR
>>>>> specifically?
>>>>
>>>> I think so, yes. We need to be very careful with what a possible
>>>> default case does there, though.
>>>
>>> Maybe it would be better to handle level_msr in a way similar to
>>> level_leaf: return true/false to notice whether the MSR should be
>>> added to the resulting compatible policy?
>>>
>>> And then make the default case in level_msr just return false in order
>>> to prevent adding MSRs not properly leveled to the policy?
>>
>> I'm afraid I'm not clear about the implications. What again is the
>> (planned?) final effect of an MSR not getting added there?
> 
> Adding the MSR with a 0 value will zero out any previous value on the
> 'out' policy, while not adding it would leave the previous value
> there given the current code in xc_cpu_policy_calc_compatible added by
> this patch.
> 
> I would expect callers of xc_cpu_policy_calc_compatible to pass a
> zeroed 'out' policy, so I think the end result should be the same.

But we're not talking about actual MSR values here, as this is all
about policy. So in the end we'll have to see how things need to
be once we have the first non-feature-flag-like entries there. It
feels as if simply zeroing can't be generally the right thing in
such a case. It may e.g. be that min() is wanted instead.

Jan
Roger Pau Monné April 22, 2021, 12:07 p.m. UTC | #13
On Thu, Apr 22, 2021 at 01:42:45PM +0200, Jan Beulich wrote:
> On 22.04.2021 13:37, Roger Pau Monné wrote:
> > On Thu, Apr 22, 2021 at 01:05:23PM +0200, Jan Beulich wrote:
> >> On 22.04.2021 12:56, Roger Pau Monné wrote:
> >>> On Thu, Apr 22, 2021 at 12:48:36PM +0200, Jan Beulich wrote:
> >>>> On 22.04.2021 12:34, Roger Pau Monné wrote:
> >>>>> On Thu, Apr 22, 2021 at 11:58:45AM +0200, Jan Beulich wrote:
> >>>>>> On 22.04.2021 11:42, Roger Pau Monné wrote:
> >>>>>>> On Wed, Apr 14, 2021 at 03:49:02PM +0200, Jan Beulich wrote:
> >>>>>>>> On 13.04.2021 16:01, Roger Pau Monne wrote:
> >>>>>>>>> @@ -944,3 +945,130 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, const xc_cpu_policy_t host,
> >>>>>>>>>  
> >>>>>>>>>      return false;
> >>>>>>>>>  }
> >>>>>>>>> +
> >>>>>>>>> +static uint64_t level_msr(unsigned int index, uint64_t val1, uint64_t val2)
> >>>>>>>>> +{
> >>>>>>>>> +    uint64_t val = val1 & val2;;
> >>>>>>>>
> >>>>>>>> For arbitrary MSRs this isn't going to do any good. If only very
> >>>>>>>> specific MSRs are assumed to make it here, I think this wants
> >>>>>>>> commenting on.
> >>>>>>>
> >>>>>>> I've added: "MSRs passed to level_msr are expected to be bitmaps of
> >>>>>>> features"
> >>>>>>
> >>>>>> How does such a comment help? I.e. how does the caller tell which MSRs
> >>>>>> to pass here and which to deal with anyother way?
> >>>>>
> >>>>> All MSRs should be passed to level_msr, but it's handling logic would
> >>>>> need to be expanded to support MSRs that are not feature bitmaps.
> >>>>>
> >>>>> It might be best to restore the previous switch and handle each MSR
> >>>>> specifically?
> >>>>
> >>>> I think so, yes. We need to be very careful with what a possible
> >>>> default case does there, though.
> >>>
> >>> Maybe it would be better to handle level_msr in a way similar to
> >>> level_leaf: return true/false to notice whether the MSR should be
> >>> added to the resulting compatible policy?
> >>>
> >>> And then make the default case in level_msr just return false in order
> >>> to prevent adding MSRs not properly leveled to the policy?
> >>
> >> I'm afraid I'm not clear about the implications. What again is the
> >> (planned?) final effect of an MSR not getting added there?
> > 
> > Adding the MSR with a 0 value will zero out any previous value on the
> > 'out' policy, while not adding it would leave the previous value
> > there given the current code in xc_cpu_policy_calc_compatible added by
> > this patch.
> > 
> > I would expect callers of xc_cpu_policy_calc_compatible to pass a
> > zeroed 'out' policy, so I think the end result should be the same.
> 
> But we're not talking about actual MSR values here, as this is all
> about policy. So in the end we'll have to see how things need to
> be once we have the first non-feature-flag-like entries there. It
> feels as if simply zeroing can't be generally the right thing in
> such a case. It may e.g. be that min() is wanted instead.

Maybe level_msr should return an error for MSRs not explicitly
handled, that's propagated to the caller of
xc_cpu_policy_calc_compatible.

That way addition of new MSRs are not likely to miss adding the
required handling in level_msr?

Thanks, Roger.
Jan Beulich April 22, 2021, 12:08 p.m. UTC | #14
On 22.04.2021 14:07, Roger Pau Monné wrote:
> On Thu, Apr 22, 2021 at 01:42:45PM +0200, Jan Beulich wrote:
>> On 22.04.2021 13:37, Roger Pau Monné wrote:
>>> On Thu, Apr 22, 2021 at 01:05:23PM +0200, Jan Beulich wrote:
>>>> On 22.04.2021 12:56, Roger Pau Monné wrote:
>>>>> On Thu, Apr 22, 2021 at 12:48:36PM +0200, Jan Beulich wrote:
>>>>>> On 22.04.2021 12:34, Roger Pau Monné wrote:
>>>>>>> On Thu, Apr 22, 2021 at 11:58:45AM +0200, Jan Beulich wrote:
>>>>>>>> On 22.04.2021 11:42, Roger Pau Monné wrote:
>>>>>>>>> On Wed, Apr 14, 2021 at 03:49:02PM +0200, Jan Beulich wrote:
>>>>>>>>>> On 13.04.2021 16:01, Roger Pau Monne wrote:
>>>>>>>>>>> @@ -944,3 +945,130 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, const xc_cpu_policy_t host,
>>>>>>>>>>>  
>>>>>>>>>>>      return false;
>>>>>>>>>>>  }
>>>>>>>>>>> +
>>>>>>>>>>> +static uint64_t level_msr(unsigned int index, uint64_t val1, uint64_t val2)
>>>>>>>>>>> +{
>>>>>>>>>>> +    uint64_t val = val1 & val2;;
>>>>>>>>>>
>>>>>>>>>> For arbitrary MSRs this isn't going to do any good. If only very
>>>>>>>>>> specific MSRs are assumed to make it here, I think this wants
>>>>>>>>>> commenting on.
>>>>>>>>>
>>>>>>>>> I've added: "MSRs passed to level_msr are expected to be bitmaps of
>>>>>>>>> features"
>>>>>>>>
>>>>>>>> How does such a comment help? I.e. how does the caller tell which MSRs
>>>>>>>> to pass here and which to deal with anyother way?
>>>>>>>
>>>>>>> All MSRs should be passed to level_msr, but it's handling logic would
>>>>>>> need to be expanded to support MSRs that are not feature bitmaps.
>>>>>>>
>>>>>>> It might be best to restore the previous switch and handle each MSR
>>>>>>> specifically?
>>>>>>
>>>>>> I think so, yes. We need to be very careful with what a possible
>>>>>> default case does there, though.
>>>>>
>>>>> Maybe it would be better to handle level_msr in a way similar to
>>>>> level_leaf: return true/false to notice whether the MSR should be
>>>>> added to the resulting compatible policy?
>>>>>
>>>>> And then make the default case in level_msr just return false in order
>>>>> to prevent adding MSRs not properly leveled to the policy?
>>>>
>>>> I'm afraid I'm not clear about the implications. What again is the
>>>> (planned?) final effect of an MSR not getting added there?
>>>
>>> Adding the MSR with a 0 value will zero out any previous value on the
>>> 'out' policy, while not adding it would leave the previous value
>>> there given the current code in xc_cpu_policy_calc_compatible added by
>>> this patch.
>>>
>>> I would expect callers of xc_cpu_policy_calc_compatible to pass a
>>> zeroed 'out' policy, so I think the end result should be the same.
>>
>> But we're not talking about actual MSR values here, as this is all
>> about policy. So in the end we'll have to see how things need to
>> be once we have the first non-feature-flag-like entries there. It
>> feels as if simply zeroing can't be generally the right thing in
>> such a case. It may e.g. be that min() is wanted instead.
> 
> Maybe level_msr should return an error for MSRs not explicitly
> handled, that's propagated to the caller of
> xc_cpu_policy_calc_compatible.
> 
> That way addition of new MSRs are not likely to miss adding the
> required handling in level_msr?

Perhaps, yes.

Jan
diff mbox series

Patch

diff --git a/tools/include/xen-tools/libs.h b/tools/include/xen-tools/libs.h
index a16e0c38070..b9e89f9a711 100644
--- a/tools/include/xen-tools/libs.h
+++ b/tools/include/xen-tools/libs.h
@@ -63,4 +63,9 @@ 
 #define ROUNDUP(_x,_w) (((unsigned long)(_x)+(1UL<<(_w))-1) & ~((1UL<<(_w))-1))
 #endif
 
+#ifndef _AC
+#define __AC(X,Y)   (X##Y)
+#define _AC(X,Y)    __AC(X,Y)
+#endif
+
 #endif	/* __XEN_TOOLS_LIBS__ */
diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
index 5f699c09509..c41d794683c 100644
--- a/tools/include/xenctrl.h
+++ b/tools/include/xenctrl.h
@@ -2622,6 +2622,10 @@  int xc_cpu_policy_update_msrs(xc_interface *xch, xc_cpu_policy_t policy,
 /* Compatibility calculations. */
 bool xc_cpu_policy_is_compatible(xc_interface *xch, const xc_cpu_policy_t host,
                                  const xc_cpu_policy_t guest);
+int xc_cpu_policy_calc_compatible(xc_interface *xch,
+                                  const xc_cpu_policy_t p1,
+                                  const xc_cpu_policy_t p2,
+                                  xc_cpu_policy_t out);
 
 int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps);
 int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
diff --git a/tools/libs/guest/xg_cpuid_x86.c b/tools/libs/guest/xg_cpuid_x86.c
index bd2f31dd87f..6cfa4cb39d1 100644
--- a/tools/libs/guest/xg_cpuid_x86.c
+++ b/tools/libs/guest/xg_cpuid_x86.c
@@ -32,6 +32,7 @@  enum {
 #include <xen/arch-x86/cpufeatureset.h>
 };
 
+#include <xen/asm/msr-index.h>
 #include <xen/asm/x86-vendors.h>
 
 #include <xen/lib/x86/cpu-policy.h>
@@ -944,3 +945,130 @@  bool xc_cpu_policy_is_compatible(xc_interface *xch, const xc_cpu_policy_t host,
 
     return false;
 }
+
+static uint64_t level_msr(unsigned int index, uint64_t val1, uint64_t val2)
+{
+    uint64_t val = val1 & val2;;
+
+    switch ( index )
+    {
+    case MSR_ARCH_CAPABILITIES:
+        /*
+         * Set RSBA if present on any of the input values to notice the guest
+         * might run on vulnerable hardware at some point.
+         */
+        val |= (val1 | val2) & ARCH_CAPS_RSBA;
+        break;
+    }
+
+    return val;
+}
+
+static bool level_leaf(xen_cpuid_leaf_t *l1, xen_cpuid_leaf_t *l2,
+                       xen_cpuid_leaf_t *out)
+{
+    *out = (xen_cpuid_leaf_t){ };
+
+    switch ( l1->leaf )
+    {
+    case 0x1:
+    case 0x80000001:
+        out->c = l1->c & l2->c;
+        out->d = l1->d & l2->d;
+        return true;
+
+    case 0xd:
+        if ( l1->subleaf != 1 )
+            break;
+        out->a = l1->a & l2->a;
+        return true;
+
+    case 0x7:
+        switch ( l1->subleaf )
+        {
+        case 0:
+            out->b = l1->b & l2->b;
+            out->c = l1->c & l2->c;
+            out->d = l1->d & l2->d;
+            return true;
+
+        case 1:
+            out->a = l1->a & l2->a;
+            return true;
+        }
+        break;
+
+    case 0x80000007:
+        out->d = l1->d & l2->d;
+        return true;
+
+    case 0x80000008:
+        out->b = l1->b & l2->b;
+        return true;
+    }
+
+    return false;
+}
+
+int xc_cpu_policy_calc_compatible(xc_interface *xch,
+                                  const xc_cpu_policy_t p1,
+                                  const xc_cpu_policy_t p2,
+                                  xc_cpu_policy_t out)
+{
+    unsigned int nr_leaves, nr_msrs, i, index;
+    unsigned int p1_nr_leaves, p2_nr_leaves;
+    unsigned int p1_nr_entries, p2_nr_entries;
+    int rc;
+
+    p1_nr_leaves = p2_nr_leaves = ARRAY_SIZE(p1->leaves);
+    p1_nr_entries = p2_nr_entries = ARRAY_SIZE(p1->entries);
+
+    rc = xc_cpu_policy_serialise(xch, p1, p1->leaves, &p1_nr_leaves,
+                                 p1->entries, &p1_nr_entries);
+    if ( rc )
+        return rc;
+    rc = xc_cpu_policy_serialise(xch, p2, p2->leaves, &p2_nr_leaves,
+                                 p2->entries, &p2_nr_entries);
+    if ( rc )
+        return rc;
+
+    index = 0;
+    for ( i = 0; i < p1_nr_leaves; i++ )
+    {
+        xen_cpuid_leaf_t *l1 = &p1->leaves[i];
+        xen_cpuid_leaf_t *l2 = find_leaf(p2->leaves, p2_nr_leaves,
+                                         l1->leaf, l1->subleaf);
+
+        if ( l2 && level_leaf(&out->leaves[index], l1, l2) )
+        {
+            out->leaves[index].leaf = l1->leaf;
+            out->leaves[index].subleaf = l1->subleaf;
+            index++;
+        }
+    }
+    nr_leaves = index;
+
+    index = 0;
+    for ( i = 0; i < p1_nr_entries; i++ )
+    {
+        xen_msr_entry_t *l1 = &p1->entries[i];
+        xen_msr_entry_t *l2 = find_entry(p2->entries, p2_nr_entries, l1->idx);
+
+        if ( !l2 )
+            continue;
+
+        out->entries[index].idx = l1->idx;
+        out->entries[index].val = level_msr(l1->idx, l1->val, l2->val);
+        index++;
+    }
+    nr_msrs = index;
+
+    rc = deserialize_policy(xch, out, nr_leaves, nr_msrs);
+    if ( rc )
+    {
+        errno = -rc;
+        rc = -1;
+    }
+
+    return rc;
+}
diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h
index 44a2f3c8fe3..5709bcb93fa 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -126,8 +126,6 @@ 
 #define PVSHIM_CMDLINE "pv-shim console=xen,pv"
 
 /* Size macros. */
-#define __AC(X,Y)   (X##Y)
-#define _AC(X,Y)    __AC(X,Y)
 #define MB(_mb)     (_AC(_mb, ULL) << 20)
 #define GB(_gb)     (_AC(_gb, ULL) << 30)