diff mbox series

[1/3] x86: Reject bad %dr6/%dr7 values when loading guest state

Message ID 20230829134333.3551243-2-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show
Series x86: Debug Regs fixes, part 1 | expand

Commit Message

Andrew Cooper Aug. 29, 2023, 1:43 p.m. UTC
Right now, bad PV state is silently dropped and zeroed, while bad HVM state is
passed directly to hardware and can trigger VMEntry/VMRUN failures.  e.g.

  (XEN) d12v0 vmentry failure (reason 0x80000021): Invalid guest state (0)
  ...
  (XEN) RFLAGS=0x00000002 (0x00000002)  DR7 = 0x4000000000000001

Furthermore, prior to c/s 30f43f4aa81e ("x86: Reorganise and rename debug
register fields in struct vcpu") in Xen 4.11 where v->arch.dr6 was reduced in
width, the toolstack can cause a host crash by loading a bad %dr6 value on
VT-x hardware.

Reject any %dr6/7 values with upper bits set.  For PV guests, also audit
%dr0..3 so they aren't silently zeroed later in the function.  Leave a comment
behind explaing how %dr4/5 handling changed, and why they're ignored now.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jinoh Kang <jinoh.kang.kr@gmail.com>
---
 xen/arch/x86/domain.c  | 19 +++++++++++++++++++
 xen/arch/x86/hvm/hvm.c |  8 ++++++++
 2 files changed, 27 insertions(+)

Comments

Jan Beulich Aug. 29, 2023, 2:08 p.m. UTC | #1
On 29.08.2023 15:43, Andrew Cooper wrote:
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1074,8 +1074,27 @@ int arch_set_info_guest(
>  #endif
>      flags = c(flags);
>  
> +    if ( !compat )
> +    {
> +        if ( c(debugreg[6]) != (uint32_t)c(debugreg[6]) ||
> +             c(debugreg[7]) != (uint32_t)c(debugreg[7]) )
> +            return -EINVAL;
> +    }
> +
>      if ( is_pv_domain(d) )
>      {
> +        /*
> +         * Prior to Xen 4.11, dr5 was used to hold the emulated-only
> +         * subset of dr7, and dr4 was unused.
> +         *
> +         * In Xen 4.11 and later, dr4/5 are written as zero, ignored for
> +         * backwards compatibility, and dr7 emulation is handled
> +         * internally.
> +         */
> +        for ( i = 0; i < ARRAY_SIZE(v->arch.dr); i++ )
> +            if ( !access_ok(c(debugreg[i]), sizeof(long)) )

Don't you mean __addr_ok() here, i.e. not including the
is_compat_arg_xlat_range() check? (Else I would have asked why
sizeof(long), but that question resolves itself with using the other
macro.)

Jan
Jan Beulich Aug. 30, 2023, 6:46 a.m. UTC | #2
On 29.08.2023 15:43, Andrew Cooper wrote:
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1074,8 +1074,27 @@ int arch_set_info_guest(
>  #endif
>      flags = c(flags);
>  
> +    if ( !compat )
> +    {
> +        if ( c(debugreg[6]) != (uint32_t)c(debugreg[6]) ||
> +             c(debugreg[7]) != (uint32_t)c(debugreg[7]) )
> +            return -EINVAL;
> +    }
> +
>      if ( is_pv_domain(d) )
>      {
> +        /*
> +         * Prior to Xen 4.11, dr5 was used to hold the emulated-only
> +         * subset of dr7, and dr4 was unused.
> +         *
> +         * In Xen 4.11 and later, dr4/5 are written as zero, ignored for
> +         * backwards compatibility, and dr7 emulation is handled
> +         * internally.
> +         */
> +        for ( i = 0; i < ARRAY_SIZE(v->arch.dr); i++ )
> +            if ( !access_ok(c(debugreg[i]), sizeof(long)) )
> +                return -EINVAL;
> +
>          if ( !compat )
>          {
>              if ( !is_canonical_address(c.nat->user_regs.rip) ||

One more thing here: v->arch.dr is an array of 4 elements, i.e. doesn't
cover %dr4 and up. That's not directly visible here, though, so the
comment ahead of the loop talking about those other 4 registers is a
little misleading. Would you mind moving it below the loop?

Jan
Andrew Cooper Aug. 30, 2023, 2:35 p.m. UTC | #3
On 29/08/2023 3:08 pm, Jan Beulich wrote:
> On 29.08.2023 15:43, Andrew Cooper wrote:
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -1074,8 +1074,27 @@ int arch_set_info_guest(
>>  #endif
>>      flags = c(flags);
>>  
>> +    if ( !compat )
>> +    {
>> +        if ( c(debugreg[6]) != (uint32_t)c(debugreg[6]) ||
>> +             c(debugreg[7]) != (uint32_t)c(debugreg[7]) )
>> +            return -EINVAL;
>> +    }
>> +
>>      if ( is_pv_domain(d) )
>>      {
>> +        /*
>> +         * Prior to Xen 4.11, dr5 was used to hold the emulated-only
>> +         * subset of dr7, and dr4 was unused.
>> +         *
>> +         * In Xen 4.11 and later, dr4/5 are written as zero, ignored for
>> +         * backwards compatibility, and dr7 emulation is handled
>> +         * internally.
>> +         */
>> +        for ( i = 0; i < ARRAY_SIZE(v->arch.dr); i++ )
>> +            if ( !access_ok(c(debugreg[i]), sizeof(long)) )
> Don't you mean __addr_ok() here, i.e. not including the
> is_compat_arg_xlat_range() check? (Else I would have asked why
> sizeof(long), but that question resolves itself with using the other
> macro.)

For now, I'm simply moving a check from set_debugreg() earlier in
arch_set_info_guest().

I think it would be beneficial to keep that change independent.

~Andrew
Andrew Cooper Aug. 30, 2023, 2:39 p.m. UTC | #4
On 30/08/2023 7:46 am, Jan Beulich wrote:
> On 29.08.2023 15:43, Andrew Cooper wrote:
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -1074,8 +1074,27 @@ int arch_set_info_guest(
>>  #endif
>>      flags = c(flags);
>>  
>> +    if ( !compat )
>> +    {
>> +        if ( c(debugreg[6]) != (uint32_t)c(debugreg[6]) ||
>> +             c(debugreg[7]) != (uint32_t)c(debugreg[7]) )
>> +            return -EINVAL;
>> +    }
>> +
>>      if ( is_pv_domain(d) )
>>      {
>> +        /*
>> +         * Prior to Xen 4.11, dr5 was used to hold the emulated-only
>> +         * subset of dr7, and dr4 was unused.
>> +         *
>> +         * In Xen 4.11 and later, dr4/5 are written as zero, ignored for
>> +         * backwards compatibility, and dr7 emulation is handled
>> +         * internally.
>> +         */
>> +        for ( i = 0; i < ARRAY_SIZE(v->arch.dr); i++ )
>> +            if ( !access_ok(c(debugreg[i]), sizeof(long)) )
>> +                return -EINVAL;
>> +
>>          if ( !compat )
>>          {
>>              if ( !is_canonical_address(c.nat->user_regs.rip) ||
> One more thing here: v->arch.dr is an array of 4 elements, i.e. doesn't
> cover %dr4 and up.

Correct (as of the same changeset relevant in this comment).

> That's not directly visible here, though, so the
> comment ahead of the loop talking about those other 4 registers is a
> little misleading. Would you mind moving it below the loop?

Can do.

~Andrew
Jan Beulich Aug. 30, 2023, 3:12 p.m. UTC | #5
On 30.08.2023 16:35, Andrew Cooper wrote:
> On 29/08/2023 3:08 pm, Jan Beulich wrote:
>> On 29.08.2023 15:43, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/domain.c
>>> +++ b/xen/arch/x86/domain.c
>>> @@ -1074,8 +1074,27 @@ int arch_set_info_guest(
>>>  #endif
>>>      flags = c(flags);
>>>  
>>> +    if ( !compat )
>>> +    {
>>> +        if ( c(debugreg[6]) != (uint32_t)c(debugreg[6]) ||
>>> +             c(debugreg[7]) != (uint32_t)c(debugreg[7]) )
>>> +            return -EINVAL;
>>> +    }
>>> +
>>>      if ( is_pv_domain(d) )
>>>      {
>>> +        /*
>>> +         * Prior to Xen 4.11, dr5 was used to hold the emulated-only
>>> +         * subset of dr7, and dr4 was unused.
>>> +         *
>>> +         * In Xen 4.11 and later, dr4/5 are written as zero, ignored for
>>> +         * backwards compatibility, and dr7 emulation is handled
>>> +         * internally.
>>> +         */
>>> +        for ( i = 0; i < ARRAY_SIZE(v->arch.dr); i++ )
>>> +            if ( !access_ok(c(debugreg[i]), sizeof(long)) )
>> Don't you mean __addr_ok() here, i.e. not including the
>> is_compat_arg_xlat_range() check? (Else I would have asked why
>> sizeof(long), but that question resolves itself with using the other
>> macro.)
> 
> For now, I'm simply moving a check from set_debugreg() earlier in
> arch_set_info_guest().
> 
> I think it would be beneficial to keep that change independent.

Hmm, difficult. I'd be okay if you indeed moved the other check. But
you duplicate it here, and duplicating questionable code is, well,
questionable.

Jan
Andrew Cooper Aug. 30, 2023, 3:28 p.m. UTC | #6
On 30/08/2023 4:12 pm, Jan Beulich wrote:
> On 30.08.2023 16:35, Andrew Cooper wrote:
>> On 29/08/2023 3:08 pm, Jan Beulich wrote:
>>> On 29.08.2023 15:43, Andrew Cooper wrote:
>>>> --- a/xen/arch/x86/domain.c
>>>> +++ b/xen/arch/x86/domain.c
>>>> @@ -1074,8 +1074,27 @@ int arch_set_info_guest(
>>>>  #endif
>>>>      flags = c(flags);
>>>>  
>>>> +    if ( !compat )
>>>> +    {
>>>> +        if ( c(debugreg[6]) != (uint32_t)c(debugreg[6]) ||
>>>> +             c(debugreg[7]) != (uint32_t)c(debugreg[7]) )
>>>> +            return -EINVAL;
>>>> +    }
>>>> +
>>>>      if ( is_pv_domain(d) )
>>>>      {
>>>> +        /*
>>>> +         * Prior to Xen 4.11, dr5 was used to hold the emulated-only
>>>> +         * subset of dr7, and dr4 was unused.
>>>> +         *
>>>> +         * In Xen 4.11 and later, dr4/5 are written as zero, ignored for
>>>> +         * backwards compatibility, and dr7 emulation is handled
>>>> +         * internally.
>>>> +         */
>>>> +        for ( i = 0; i < ARRAY_SIZE(v->arch.dr); i++ )
>>>> +            if ( !access_ok(c(debugreg[i]), sizeof(long)) )
>>> Don't you mean __addr_ok() here, i.e. not including the
>>> is_compat_arg_xlat_range() check? (Else I would have asked why
>>> sizeof(long), but that question resolves itself with using the other
>>> macro.)
>> For now, I'm simply moving a check from set_debugreg() earlier in
>> arch_set_info_guest().
>>
>> I think it would be beneficial to keep that change independent.
> Hmm, difficult. I'd be okay if you indeed moved the other check. But
> you duplicate it here, and duplicating questionable code is, well,
> questionable.

It can't be removed in set_debugreg() because that's used in other paths
too.

And the error from set_debugreg() can't fail arch_set_info_guest()
because that introduces a failure after mutation of the vCPU state.

This isn't a fastpath.  It's used approximately once per vCPU lifetime.

~Andrew
Jan Beulich Aug. 30, 2023, 4:13 p.m. UTC | #7
On 30.08.2023 17:28, Andrew Cooper wrote:
> On 30/08/2023 4:12 pm, Jan Beulich wrote:
>> On 30.08.2023 16:35, Andrew Cooper wrote:
>>> On 29/08/2023 3:08 pm, Jan Beulich wrote:
>>>> On 29.08.2023 15:43, Andrew Cooper wrote:
>>>>> --- a/xen/arch/x86/domain.c
>>>>> +++ b/xen/arch/x86/domain.c
>>>>> @@ -1074,8 +1074,27 @@ int arch_set_info_guest(
>>>>>  #endif
>>>>>      flags = c(flags);
>>>>>  
>>>>> +    if ( !compat )
>>>>> +    {
>>>>> +        if ( c(debugreg[6]) != (uint32_t)c(debugreg[6]) ||
>>>>> +             c(debugreg[7]) != (uint32_t)c(debugreg[7]) )
>>>>> +            return -EINVAL;
>>>>> +    }
>>>>> +
>>>>>      if ( is_pv_domain(d) )
>>>>>      {
>>>>> +        /*
>>>>> +         * Prior to Xen 4.11, dr5 was used to hold the emulated-only
>>>>> +         * subset of dr7, and dr4 was unused.
>>>>> +         *
>>>>> +         * In Xen 4.11 and later, dr4/5 are written as zero, ignored for
>>>>> +         * backwards compatibility, and dr7 emulation is handled
>>>>> +         * internally.
>>>>> +         */
>>>>> +        for ( i = 0; i < ARRAY_SIZE(v->arch.dr); i++ )
>>>>> +            if ( !access_ok(c(debugreg[i]), sizeof(long)) )
>>>> Don't you mean __addr_ok() here, i.e. not including the
>>>> is_compat_arg_xlat_range() check? (Else I would have asked why
>>>> sizeof(long), but that question resolves itself with using the other
>>>> macro.)
>>> For now, I'm simply moving a check from set_debugreg() earlier in
>>> arch_set_info_guest().
>>>
>>> I think it would be beneficial to keep that change independent.
>> Hmm, difficult. I'd be okay if you indeed moved the other check. But
>> you duplicate it here, and duplicating questionable code is, well,
>> questionable.
> 
> It can't be removed in set_debugreg() because that's used in other paths
> too.

Sure, I understand that.

> And the error from set_debugreg() can't fail arch_set_info_guest()
> because that introduces a failure after mutation of the vCPU state.
> 
> This isn't a fastpath.  It's used approximately once per vCPU lifetime.

But fast or not isn't the point here. The point is that both the use
of access_ok() and the use of sizeof(long) are bogus in this context.
Switching to __addr_ok() will tighten the check here (and hence not
risk set_debugreg() later raising an error).

Jan
Andrew Cooper Aug. 30, 2023, 5:02 p.m. UTC | #8
On 30/08/2023 5:13 pm, Jan Beulich wrote:
> On 30.08.2023 17:28, Andrew Cooper wrote:
>> On 30/08/2023 4:12 pm, Jan Beulich wrote:
>>> On 30.08.2023 16:35, Andrew Cooper wrote:
>>>> On 29/08/2023 3:08 pm, Jan Beulich wrote:
>>>>> On 29.08.2023 15:43, Andrew Cooper wrote:
>>>>>> --- a/xen/arch/x86/domain.c
>>>>>> +++ b/xen/arch/x86/domain.c
>>>>>> @@ -1074,8 +1074,27 @@ int arch_set_info_guest(
>>>>>>  #endif
>>>>>>      flags = c(flags);
>>>>>>  
>>>>>> +    if ( !compat )
>>>>>> +    {
>>>>>> +        if ( c(debugreg[6]) != (uint32_t)c(debugreg[6]) ||
>>>>>> +             c(debugreg[7]) != (uint32_t)c(debugreg[7]) )
>>>>>> +            return -EINVAL;
>>>>>> +    }
>>>>>> +
>>>>>>      if ( is_pv_domain(d) )
>>>>>>      {
>>>>>> +        /*
>>>>>> +         * Prior to Xen 4.11, dr5 was used to hold the emulated-only
>>>>>> +         * subset of dr7, and dr4 was unused.
>>>>>> +         *
>>>>>> +         * In Xen 4.11 and later, dr4/5 are written as zero, ignored for
>>>>>> +         * backwards compatibility, and dr7 emulation is handled
>>>>>> +         * internally.
>>>>>> +         */
>>>>>> +        for ( i = 0; i < ARRAY_SIZE(v->arch.dr); i++ )
>>>>>> +            if ( !access_ok(c(debugreg[i]), sizeof(long)) )
>>>>> Don't you mean __addr_ok() here, i.e. not including the
>>>>> is_compat_arg_xlat_range() check? (Else I would have asked why
>>>>> sizeof(long), but that question resolves itself with using the other
>>>>> macro.)
>>>> For now, I'm simply moving a check from set_debugreg() earlier in
>>>> arch_set_info_guest().
>>>>
>>>> I think it would be beneficial to keep that change independent.
>>> Hmm, difficult. I'd be okay if you indeed moved the other check. But
>>> you duplicate it here, and duplicating questionable code is, well,
>>> questionable.
>> It can't be removed in set_debugreg() because that's used in other paths
>> too.
> Sure, I understand that.
>
>> And the error from set_debugreg() can't fail arch_set_info_guest()
>> because that introduces a failure after mutation of the vCPU state.
>>
>> This isn't a fastpath.  It's used approximately once per vCPU lifetime.
> But fast or not isn't the point here.

No.  The point is no change from the existing code.

If you think it's wrong, it in a separate change and don't block this fix.

~Andrew
Jan Beulich Aug. 31, 2023, 6:08 a.m. UTC | #9
On 30.08.2023 19:02, Andrew Cooper wrote:
> On 30/08/2023 5:13 pm, Jan Beulich wrote:
>> On 30.08.2023 17:28, Andrew Cooper wrote:
>>> On 30/08/2023 4:12 pm, Jan Beulich wrote:
>>>> On 30.08.2023 16:35, Andrew Cooper wrote:
>>>>> On 29/08/2023 3:08 pm, Jan Beulich wrote:
>>>>>> On 29.08.2023 15:43, Andrew Cooper wrote:
>>>>>>> --- a/xen/arch/x86/domain.c
>>>>>>> +++ b/xen/arch/x86/domain.c
>>>>>>> @@ -1074,8 +1074,27 @@ int arch_set_info_guest(
>>>>>>>  #endif
>>>>>>>      flags = c(flags);
>>>>>>>  
>>>>>>> +    if ( !compat )
>>>>>>> +    {
>>>>>>> +        if ( c(debugreg[6]) != (uint32_t)c(debugreg[6]) ||
>>>>>>> +             c(debugreg[7]) != (uint32_t)c(debugreg[7]) )
>>>>>>> +            return -EINVAL;
>>>>>>> +    }
>>>>>>> +
>>>>>>>      if ( is_pv_domain(d) )
>>>>>>>      {
>>>>>>> +        /*
>>>>>>> +         * Prior to Xen 4.11, dr5 was used to hold the emulated-only
>>>>>>> +         * subset of dr7, and dr4 was unused.
>>>>>>> +         *
>>>>>>> +         * In Xen 4.11 and later, dr4/5 are written as zero, ignored for
>>>>>>> +         * backwards compatibility, and dr7 emulation is handled
>>>>>>> +         * internally.
>>>>>>> +         */
>>>>>>> +        for ( i = 0; i < ARRAY_SIZE(v->arch.dr); i++ )
>>>>>>> +            if ( !access_ok(c(debugreg[i]), sizeof(long)) )
>>>>>> Don't you mean __addr_ok() here, i.e. not including the
>>>>>> is_compat_arg_xlat_range() check? (Else I would have asked why
>>>>>> sizeof(long), but that question resolves itself with using the other
>>>>>> macro.)
>>>>> For now, I'm simply moving a check from set_debugreg() earlier in
>>>>> arch_set_info_guest().
>>>>>
>>>>> I think it would be beneficial to keep that change independent.
>>>> Hmm, difficult. I'd be okay if you indeed moved the other check. But
>>>> you duplicate it here, and duplicating questionable code is, well,
>>>> questionable.
>>> It can't be removed in set_debugreg() because that's used in other paths
>>> too.
>> Sure, I understand that.
>>
>>> And the error from set_debugreg() can't fail arch_set_info_guest()
>>> because that introduces a failure after mutation of the vCPU state.
>>>
>>> This isn't a fastpath.  It's used approximately once per vCPU lifetime.
>> But fast or not isn't the point here.
> 
> No.  The point is no change from the existing code.

Having thought about it over night: It's not nice but okay to duplicate
the bogus check here, but then please say that and why you do so in the
description. With that suitably added
Acked-by: Jan Beulich <jbeulich@suse.com>

> If you think it's wrong, it in a separate change and don't block this fix.

I would like to ask you to think about the opposite case occurring: I'm
pretty sure you wouldn't let me get away. Either - like so often - you'd
simply not reply anymore at a certain point, or - like here - you'd
expect me to adjust to your expectations.

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index fe86a7f8530f..0698e6d486fe 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1074,8 +1074,27 @@  int arch_set_info_guest(
 #endif
     flags = c(flags);
 
+    if ( !compat )
+    {
+        if ( c(debugreg[6]) != (uint32_t)c(debugreg[6]) ||
+             c(debugreg[7]) != (uint32_t)c(debugreg[7]) )
+            return -EINVAL;
+    }
+
     if ( is_pv_domain(d) )
     {
+        /*
+         * Prior to Xen 4.11, dr5 was used to hold the emulated-only
+         * subset of dr7, and dr4 was unused.
+         *
+         * In Xen 4.11 and later, dr4/5 are written as zero, ignored for
+         * backwards compatibility, and dr7 emulation is handled
+         * internally.
+         */
+        for ( i = 0; i < ARRAY_SIZE(v->arch.dr); i++ )
+            if ( !access_ok(c(debugreg[i]), sizeof(long)) )
+                return -EINVAL;
+
         if ( !compat )
         {
             if ( !is_canonical_address(c.nat->user_regs.rip) ||
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 3a99c0ff20be..3dc2019eca67 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1032,6 +1032,14 @@  static int cf_check hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
         return -EINVAL;
     }
 
+    if ( ctxt.dr6 != (uint32_t)ctxt.dr6 ||
+         ctxt.dr7 != (uint32_t)ctxt.dr7 )
+    {
+        printk(XENLOG_G_ERR "%pv: HVM restore: bad DR6 %#"PRIx64" or DR7 %#"PRIx64"\n",
+               v, ctxt.dr6, ctxt.dr7);
+        return -EINVAL;
+    }
+
     if ( ctxt.cr3 >> d->arch.cpuid->extd.maxphysaddr )
     {
         printk(XENLOG_G_ERR "HVM%d restore: bad CR3 %#" PRIx64 "\n",