diff mbox

[2/2] x86/vmx: conditionally disable LBR support due to TSX format quirk

Message ID 1470812378-4513-2-git-send-email-msw@amzn.com (mailing list archive)
State New, archived
Headers show

Commit Message

Matt Wilson Aug. 10, 2016, 6:59 a.m. UTC
From: Matt Wilson <msw@amazon.com>

Systems that support LBR formats that include TSX information but do
not support TSX require special handling when saving and restoring MSR
values. For example, see the Linux kernel quirks[1, 2] in the MSR
context switching code. As a wrmsr with certain values under these
conditions causes a #GP, VM entry will fail due to MSR loading (see
last bullet of SDM 26.4 "Loading MSRS").

This failure can be triggered on a Haswell-EP system with the
following test Linux kernel module:

In domU:
  $ cat > lbr.c << EOF

  static int __init
  lbr_init(void)
  {
          u64 val;

          rdmsrl(MSR_IA32_DEBUGCTLMSR, val);
          val |= DEBUGCTLMSR_LBR;
          wrmsrl(MSR_IA32_DEBUGCTLMSR, val);

          return 0;
  }

  static void __exit
  lbr_cleanup(void)
  {
  }

  module_init(lbr_init);
  module_exit(lbr_cleanup);

  MODULE_DESCRIPTION("lbr test");
  MODULE_LICENSE("GPL");
  EOF
  $ echo "obj-m += lbr.o" > Makefile
  $ make -C /lib/modules/`uname -r`/build M=`pwd` modules
  make: Entering directory `/home/ec2-user/linux'
    CC [M]  /home/ec2-user/lbr.o
    Building modules, stage 2.
    MODPOST 1 modules
    CC      /home/ec2-user/lbr.mod.o
    LD [M]  /home/ec2-user/lbr.ko
  make: Leaving directory `/home/ec2-user/linux'
  $ sudo insmod lbr.ko
  $ Timeout, server not responding.

In dom0:
  [...]
  (XEN) Failed vm entry (exit reason 0x80000022) caused by MSR entry 1 loading.
  [...]
  (XEN) EXIT MSR load count = 0x0001
  (XEN) EXIT MSR store count = 0x0023
  (XEN) ENTRY MSR load count = 0x0023
  (XEN)   msr_count = 35
  (XEN)   msr_area[0].index=0x000001dd .data=0x1fffffff811911db
  ...

This change dynamically disables LBR save/load on systems in the
problematic configuration.

[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=71adae99ed18
[2] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=19fc9ddd61e0

Signed-off-by: Matt Wilson <msw@amazon.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

Comments

Jan Beulich Aug. 10, 2016, 12:34 p.m. UTC | #1
>>> On 10.08.16 at 08:59, <msw@amzn.com> wrote:
> Systems that support LBR formats that include TSX information but do
> not support TSX require special handling when saving and restoring MSR
> values. For example, see the Linux kernel quirks[1, 2] in the MSR
> context switching code. As a wrmsr with certain values under these
> conditions causes a #GP, VM entry will fail due to MSR loading (see
> last bullet of SDM 26.4 "Loading MSRS").

I had recently noticed this as an area needing work too, so I'm glad
you (hopefully) eliminate this item from my todo list. However, I'm
sad to see that you really just disable LBR use in the problem case.
I'd prefer to take such a change only as an immediate workaround,
with the understanding that a proper one will be made available not
too much later.

Jun, Kevin - workarounds for hardware quirks like this one are really
what I would think should come out of Intel, and preferably not
after the first problem report on Xen, but soon after the quirk has
got identified in general (which according to the Linux patches must
have been at least 2 months ago).

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2576,8 +2576,22 @@ static const struct lbr_info *last_branch_msr_get(void)
>          /* Haswell */
>          case 60: case 63: case 69: case 70:
>          /* Broadwell */
> -        case 61: case 71: case 79: case 86:
> +        case 61: case 71: case 79: case 86: {
> +            u64 caps;
> +            bool_t tsx_support = boot_cpu_has(X86_FEATURE_HLE) ||
> +                                 boot_cpu_has(X86_FEATURE_RTM);
> +
> +            rdmsrl(MSR_IA32_PERF_CAPABILITIES, caps);

This is guarded by a X86_FEATURE_PDCM check in Linux - why
would we not need the same here? Also I think this RDMSR should
be performed once at boot, not every time we come here.

> +            /*
> +             * Unimplemented fixups are required if the processor
> +             * supports an LBR format that includes TSX information,
> +             * but not TSX. Disable LBR save/load on such platforms.
> +             */
> +            if ( !tsx_support && (caps & 4) )

As I understand it the low 6 bits of the MSR contain an enumeration
like value, so just checking bit 2 can't be right (and would wrongly
trigger on already defined format types 5 and 6 - even if those were
known to be impossible on the models currently covered, this would
be a latent trap for someone to fall into when adding further model
values).

Jan
Matt Wilson Aug. 10, 2016, 3:47 p.m. UTC | #2
On Wed, Aug 10, 2016 at 06:34:10AM -0600, Jan Beulich wrote:
> >>> On 10.08.16 at 08:59, <msw@amzn.com> wrote:
> > Systems that support LBR formats that include TSX information but do
> > not support TSX require special handling when saving and restoring MSR
> > values. For example, see the Linux kernel quirks[1, 2] in the MSR
> > context switching code. As a wrmsr with certain values under these
> > conditions causes a #GP, VM entry will fail due to MSR loading (see
> > last bullet of SDM 26.4 "Loading MSRS").
> 
> I had recently noticed this as an area needing work too, so I'm glad
> you (hopefully) eliminate this item from my todo list. However, I'm
> sad to see that you really just disable LBR use in the problem case.
> I'd prefer to take such a change only as an immediate workaround,
> with the understanding that a proper one will be made available not
> too much later.

I agree that this is a short term workaround. I spent a while trying
to fix up the values but I wasn't successful.

> Jun, Kevin - workarounds for hardware quirks like this one are really
> what I would think should come out of Intel, and preferably not
> after the first problem report on Xen, but soon after the quirk has
> got identified in general (which according to the Linux patches must
> have been at least 2 months ago).

+1

> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -2576,8 +2576,22 @@ static const struct lbr_info *last_branch_msr_get(void)
> >          /* Haswell */
> >          case 60: case 63: case 69: case 70:
> >          /* Broadwell */
> > -        case 61: case 71: case 79: case 86:
> > +        case 61: case 71: case 79: case 86: {
> > +            u64 caps;
> > +            bool_t tsx_support = boot_cpu_has(X86_FEATURE_HLE) ||
> > +                                 boot_cpu_has(X86_FEATURE_RTM);
> > +
> > +            rdmsrl(MSR_IA32_PERF_CAPABILITIES, caps);
> 
> This is guarded by a X86_FEATURE_PDCM check in Linux - why
> would we not need the same here?

You're right, it should be. It also seems to be missing from
core2_vpmu_init().

> Also I think this RDMSR should be performed once at boot, not every
> time we come here.

I thought you might say that. It didn't seem obviously right to put
this in boot_cpu_data -- is that what you suggest?

> > +            /*
> > +             * Unimplemented fixups are required if the processor
> > +             * supports an LBR format that includes TSX information,
> > +             * but not TSX. Disable LBR save/load on such platforms.
> > +             */
> > +            if ( !tsx_support && (caps & 4) )
> 
> As I understand it the low 6 bits of the MSR contain an enumeration
> like value, so just checking bit 2 can't be right (and would wrongly
> trigger on already defined format types 5 and 6 - even if those were
> known to be impossible on the models currently covered, this would
> be a latent trap for someone to fall into when adding further model
> values).

Fair point.

--msw
Jan Beulich Aug. 11, 2016, 8:49 a.m. UTC | #3
>>> On 10.08.16 at 17:47, <msw@amzn.com> wrote:
> On Wed, Aug 10, 2016 at 06:34:10AM -0600, Jan Beulich wrote:
>> >>> On 10.08.16 at 08:59, <msw@amzn.com> wrote:
>> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> > @@ -2576,8 +2576,22 @@ static const struct lbr_info *last_branch_msr_get(void)
>> >          /* Haswell */
>> >          case 60: case 63: case 69: case 70:
>> >          /* Broadwell */
>> > -        case 61: case 71: case 79: case 86:
>> > +        case 61: case 71: case 79: case 86: {
>> > +            u64 caps;
>> > +            bool_t tsx_support = boot_cpu_has(X86_FEATURE_HLE) ||
>> > +                                 boot_cpu_has(X86_FEATURE_RTM);
>> > +
>> > +            rdmsrl(MSR_IA32_PERF_CAPABILITIES, caps);
>> 
>> This is guarded by a X86_FEATURE_PDCM check in Linux - why
>> would we not need the same here?
> 
> You're right, it should be. It also seems to be missing from
> core2_vpmu_init().

Feel free to take the liberty to fix it there at once (but please
briefly mention this as an independent change in the description).

>> Also I think this RDMSR should be performed once at boot, not every
>> time we come here.
> 
> I thought you might say that. It didn't seem obviously right to put
> this in boot_cpu_data -- is that what you suggest?

Why boot_cpu_data? Just an ordinary (possibly per-CPU) static
in vmx.c.

Jan
diff mbox

Patch

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 3d330b6..c51cefc 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2576,8 +2576,22 @@  static const struct lbr_info *last_branch_msr_get(void)
         /* Haswell */
         case 60: case 63: case 69: case 70:
         /* Broadwell */
-        case 61: case 71: case 79: case 86:
+        case 61: case 71: case 79: case 86: {
+            u64 caps;
+            bool_t tsx_support = boot_cpu_has(X86_FEATURE_HLE) ||
+                                 boot_cpu_has(X86_FEATURE_RTM);
+
+            rdmsrl(MSR_IA32_PERF_CAPABILITIES, caps);
+            /*
+             * Unimplemented fixups are required if the processor
+             * supports an LBR format that includes TSX information,
+             * but not TSX. Disable LBR save/load on such platforms.
+             */
+            if ( !tsx_support && (caps & 4) )
+                return NULL;
+
             return nh_lbr;
+        }
         /* Skylake */
         case 78: case 94:
         /* future */