
[v2,3/8] x86/msr: explicitly handle AMD DE_CFG

Message ID 20200820150835.27440-4-roger.pau@citrix.com (mailing list archive)
State Superseded
Series x86: switch default MSR behavior

Commit Message

Roger Pau Monné Aug. 20, 2020, 3:08 p.m. UTC
Report the hardware value of DE_CFG on AMD hardware and silently drop
writes.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - New in this version.
---
 xen/arch/x86/msr.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

Comments

Andrew Cooper Aug. 20, 2020, 5:08 p.m. UTC | #1
On 20/08/2020 16:08, Roger Pau Monne wrote:

> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> index ca4307e19f..a890cb9976 100644
> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -274,6 +274,14 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>          *val = msrs->tsc_aux;
>          break;
>  
> +    case MSR_AMD64_DE_CFG:
> +        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
> +             !(boot_cpu_data.x86_vendor & (X86_VENDOR_AMD |
> +                                           X86_VENDOR_HYGON)) ||
> +             rdmsr_safe(MSR_AMD64_DE_CFG, *val) )
> +            goto gp_fault;
> +        break;

Ah.  What I intended was to read just bit 2 and nothing else.

Leaking the full value is non-ideal from a migration point of view, and
in this case, you can avoid querying hardware entirely.

Just return AMD64_DE_CFG_LFENCE_SERIALISE here.  The only case where it
won't be true is when the hypervisor running us (i.e. Xen) failed to set
it up, and the CPU boot path failed to adjust it, at which point the
whole system has much bigger problems.

> +
>      case MSR_AMD64_DR0_ADDRESS_MASK:
>      case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
>          if ( !cp->extd.dbext )
> @@ -499,6 +507,12 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
>              wrmsr_tsc_aux(val);
>          break;
>  
> +    case MSR_AMD64_DE_CFG:
> +        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
> +             !(boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
> +            goto gp_fault;
> +        break;

There should be no problem yielding #GP here (i.e. dropping this hunk).

IIRC, it was the behaviour of certain hypervisors when Spectre hit, so
all guests ought to cope.  (And indeed, not try to redundantly set the
bit to start with).

~Andrew

> +
>      case MSR_AMD64_DR0_ADDRESS_MASK:
>      case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
>          if ( !cp->extd.dbext || val != (uint32_t)val )
Roger Pau Monné Aug. 21, 2020, 11:52 a.m. UTC | #2
On Thu, Aug 20, 2020 at 06:08:53PM +0100, Andrew Cooper wrote:
> On 20/08/2020 16:08, Roger Pau Monne wrote:
> 
> > diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> > index ca4307e19f..a890cb9976 100644
> > --- a/xen/arch/x86/msr.c
> > +++ b/xen/arch/x86/msr.c
> > @@ -274,6 +274,14 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
> >          *val = msrs->tsc_aux;
> >          break;
> >  
> > +    case MSR_AMD64_DE_CFG:
> > +        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
> > +             !(boot_cpu_data.x86_vendor & (X86_VENDOR_AMD |
> > +                                           X86_VENDOR_HYGON)) ||
> > +             rdmsr_safe(MSR_AMD64_DE_CFG, *val) )
> > +            goto gp_fault;
> > +        break;
> 
> Ah.  What I intended was to read just bit 2 and nothing else.
> 
> Leaking the full value is non-ideal from a migration point of view, and
> in this case, you can avoid querying hardware entirely.
> 
> Just return AMD64_DE_CFG_LFENCE_SERIALISE here.  The only case where it
> won't be true is when the hypervisor running us (i.e. Xen) failed to set
> it up, and the CPU boot path failed to adjust it, at which point the
> whole system has much bigger problems.

Right, the rest are just model specific workarounds AFAICT, so it's
safe to not display them. A guest might attempt to set them, but we
should simply drop the write, see below.

> 
> > +
> >      case MSR_AMD64_DR0_ADDRESS_MASK:
> >      case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
> >          if ( !cp->extd.dbext )
> > @@ -499,6 +507,12 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
> >              wrmsr_tsc_aux(val);
> >          break;
> >  
> > +    case MSR_AMD64_DE_CFG:
> > +        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
> > +             !(boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
> > +            goto gp_fault;
> > +        break;
> 
> There should be no problem yielding #GP here (i.e. dropping this hunk).
> 
> IIRC, it was the behaviour of certain hypervisors when Spectre hit, so
> all guests ought to cope.  (And indeed, not try to redundantly set the
> bit to start with).

It seems like OpenBSD will try to do so unconditionally, see:

https://www.illumos.org/issues/12998

According to the report there, returning #GP when trying to WRMSR
DE_CFG will cause OpenBSD to panic, so I think we need to keep this
behavior of silently dropping writes.

Thanks, Roger.
Andrew Cooper Aug. 21, 2020, 2:03 p.m. UTC | #3
On 21/08/2020 12:52, Roger Pau Monné wrote:
> On Thu, Aug 20, 2020 at 06:08:53PM +0100, Andrew Cooper wrote:
>> On 20/08/2020 16:08, Roger Pau Monne wrote:
>>
>>> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
>>> index ca4307e19f..a890cb9976 100644
>>> --- a/xen/arch/x86/msr.c
>>> +++ b/xen/arch/x86/msr.c
>>> @@ -274,6 +274,14 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>>>          *val = msrs->tsc_aux;
>>>          break;
>>>  
>>> +    case MSR_AMD64_DE_CFG:
>>> +        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
>>> +             !(boot_cpu_data.x86_vendor & (X86_VENDOR_AMD |
>>> +                                           X86_VENDOR_HYGON)) ||
>>> +             rdmsr_safe(MSR_AMD64_DE_CFG, *val) )
>>> +            goto gp_fault;
>>> +        break;
>> Ah.  What I intended was to read just bit 2 and nothing else.
>>
>> Leaking the full value is non-ideal from a migration point of view, and
>> in this case, you can avoid querying hardware entirely.
>>
>> Just return AMD64_DE_CFG_LFENCE_SERIALISE here.  The only case where it
>> won't be true is when the hypervisor running us (i.e. Xen) failed to set
>> it up, and the CPU boot path failed to adjust it, at which point the
>> whole system has much bigger problems.
> Right, the rest are just model specific workarounds AFAICT, so it's
> safe to not display them. A guest might attempt to set them, but we
> should simply drop the write, see below.

Most of the layout is model specific.  It's only by chance that the
LFENCE bits line up in all generations.

The bit used to work around Speculative Store Bypass in LS_CFG doesn't
line up across generations.

>>> +
>>>      case MSR_AMD64_DR0_ADDRESS_MASK:
>>>      case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
>>>          if ( !cp->extd.dbext )
>>> @@ -499,6 +507,12 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
>>>              wrmsr_tsc_aux(val);
>>>          break;
>>>  
>>> +    case MSR_AMD64_DE_CFG:
>>> +        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
>>> +             !(boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
>>> +            goto gp_fault;
>>> +        break;
>> There should be no problem yielding #GP here (i.e. dropping this hunk).
>>
>> IIRC, it was the behaviour of certain hypervisors when Spectre hit, so
>> all guests ought to cope.  (And indeed, not try to redundantly set the
>> bit to start with).
> It seems like OpenBSD will try to do so unconditionally, see:
>
> https://www.illumos.org/issues/12998
>
> According to the report there, returning #GP when trying to WRMSR
> DE_CFG will cause OpenBSD to panic, so I think we need to keep this
> behavior of silently dropping writes.

/sigh - there is always one.  Comment please, and let's leave it as
write-discard.

As for the vendor-ness, drop the checks to just cp->x86_vendor.  There
is no boot_cpu_data interaction now that you've taken the rdmsr() out.

~Andrew
Roger Pau Monné Aug. 21, 2020, 2:09 p.m. UTC | #4
On Fri, Aug 21, 2020 at 03:03:08PM +0100, Andrew Cooper wrote:
> On 21/08/2020 12:52, Roger Pau Monné wrote:
> > On Thu, Aug 20, 2020 at 06:08:53PM +0100, Andrew Cooper wrote:
> >> On 20/08/2020 16:08, Roger Pau Monne wrote:
> >>
> >>> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> >>> index ca4307e19f..a890cb9976 100644
> >>> --- a/xen/arch/x86/msr.c
> >>> +++ b/xen/arch/x86/msr.c
> >>> @@ -274,6 +274,14 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
> >>>          *val = msrs->tsc_aux;
> >>>          break;
> >>>  
> >>> +    case MSR_AMD64_DE_CFG:
> >>> +        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
> >>> +             !(boot_cpu_data.x86_vendor & (X86_VENDOR_AMD |
> >>> +                                           X86_VENDOR_HYGON)) ||
> >>> +             rdmsr_safe(MSR_AMD64_DE_CFG, *val) )
> >>> +            goto gp_fault;
> >>> +        break;
> >> Ah.  What I intended was to read just bit 2 and nothing else.
> >>
> >> Leaking the full value is non-ideal from a migration point of view, and
> >> in this case, you can avoid querying hardware entirely.
> >>
> >> Just return AMD64_DE_CFG_LFENCE_SERIALISE here.  The only case where it
> >> won't be true is when the hypervisor running us (i.e. Xen) failed to set
> >> it up, and the CPU boot path failed to adjust it, at which point the
> >> whole system has much bigger problems.
> > Right, the rest are just model specific workarounds AFAICT, so it's
> > safe to not display them. A guest might attempt to set them, but we
> > should simply drop the write, see below.
> 
> Most of the layout is model specific.  It's only by chance that the
> LFENCE bits line up in all generations.
> 
> The bit used to work around Speculative Store Bypass in LS_CFG doesn't
> line up across generations.
> 
> >>> +
> >>>      case MSR_AMD64_DR0_ADDRESS_MASK:
> >>>      case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
> >>>          if ( !cp->extd.dbext )
> >>> @@ -499,6 +507,12 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
> >>>              wrmsr_tsc_aux(val);
> >>>          break;
> >>>  
> >>> +    case MSR_AMD64_DE_CFG:
> >>> +        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
> >>> +             !(boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
> >>> +            goto gp_fault;
> >>> +        break;
> >> There should be no problem yielding #GP here (i.e. dropping this hunk).
> >>
> >> IIRC, it was the behaviour of certain hypervisors when Spectre hit, so
> >> all guests ought to cope.  (And indeed, not try to redundantly set the
> >> bit to start with).
> > It seems like OpenBSD will try to do so unconditionally, see:
> >
> > https://www.illumos.org/issues/12998
> >
> > According to the report there, returning #GP when trying to WRMSR
> > DE_CFG will cause OpenBSD to panic, so I think we need to keep this
> > behavior of silently dropping writes.
> 
> /sigh - there is always one.  Comment please, and let's leave it as
> write-discard.
> 
> As for the vendor-ness, drop the checks to just cp->x86_vendor.  There
> is no boot_cpu_data interaction now that you've taken the rdmsr() out.

Sure, will wait for comments on other patches before sending the
updated version.

Thanks, Roger.

Patch

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index ca4307e19f..a890cb9976 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -274,6 +274,14 @@  int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
         *val = msrs->tsc_aux;
         break;
 
+    case MSR_AMD64_DE_CFG:
+        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
+             !(boot_cpu_data.x86_vendor & (X86_VENDOR_AMD |
+                                           X86_VENDOR_HYGON)) ||
+             rdmsr_safe(MSR_AMD64_DE_CFG, *val) )
+            goto gp_fault;
+        break;
+
     case MSR_AMD64_DR0_ADDRESS_MASK:
     case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
         if ( !cp->extd.dbext )
@@ -499,6 +507,12 @@  int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
             wrmsr_tsc_aux(val);
         break;
 
+    case MSR_AMD64_DE_CFG:
+        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
+             !(boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
+            goto gp_fault;
+        break;
+
     case MSR_AMD64_DR0_ADDRESS_MASK:
     case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
         if ( !cp->extd.dbext || val != (uint32_t)val )