Message ID | 1470812378-4513-2-git-send-email-msw@amzn.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
>>> On 10.08.16 at 08:59, <msw@amzn.com> wrote: > Systems that support LBR formats that include TSX information but do > not support TSX require special handling when saving and restoring MSR > values. For example, see the Linux kernel quirks[1, 2] in the MSR > context switching code. As a wrmsr with certain values under these > conditions causes a #GP, VM entry will fail due to MSR loading (see > last bullet of SDM 26.4 "Loading MSRS"). I had recently noticed this as an area needing work too, so I'm glad you (hopefully) eliminate this item from my todo list. However, I'm sad to see that you really just disable LBR use in the problem case. I'd prefer to take such a change only as an immediate workaround, with the understanding that a proper one will be made available not too much later. Jun, Kevin - workarounds for hardware quirks like this one are really what I would think should come out of Intel, and preferably not after the first problem report on Xen, but soon after the quirk has got identified in general (which according to the Linux patches must have been at least 2 months ago). > --- a/xen/arch/x86/hvm/vmx/vmx.c > +++ b/xen/arch/x86/hvm/vmx/vmx.c > @@ -2576,8 +2576,22 @@ static const struct lbr_info *last_branch_msr_get(void) > /* Haswell */ > case 60: case 63: case 69: case 70: > /* Broadwell */ > - case 61: case 71: case 79: case 86: > + case 61: case 71: case 79: case 86: { > + u64 caps; > + bool_t tsx_support = boot_cpu_has(X86_FEATURE_HLE) || > + boot_cpu_has(X86_FEATURE_RTM); > + > + rdmsrl(MSR_IA32_PERF_CAPABILITIES, caps); This is guarded by a X86_FEATURE_PDCM check in Linux - why would we not need the same here? Also I think this RDMSR should be performed once at boot, not every time we come here. > + /* > + * Unimplemented fixups are required if the processor > + * supports an LBR format that includes TSX information, > + * but not TSX. Disable LBR save/load on such platforms. > + */ > + if ( !tsx_support && (caps & 4) ) As I understand it the low 6 bits of the MSR contain an enumeration like value, so just checking bit 2 can't be right (and would wrongly trigger on already defined format types 5 and 6 - even if those were known to be impossible on the models currently covered, this would be a latent trap for someone to fall into when adding further model values). Jan
On Wed, Aug 10, 2016 at 06:34:10AM -0600, Jan Beulich wrote: > >>> On 10.08.16 at 08:59, <msw@amzn.com> wrote: > > Systems that support LBR formats that include TSX information but do > > not support TSX require special handling when saving and restoring MSR > > values. For example, see the Linux kernel quirks[1, 2] in the MSR > > context switching code. As a wrmsr with certain values under these > > conditions causes a #GP, VM entry will fail due to MSR loading (see > > last bullet of SDM 26.4 "Loading MSRS"). > > I had recently noticed this as an area needing work too, so I'm glad > you (hopefully) eliminate this item from my todo list. However, I'm > sad to see that you really just disable LBR use in the problem case. > I'd prefer to take such a change only as an immediate workaround, > with the understanding that a proper one will be made available not > too much later. I agree that this is a short term workaround. I spent a while trying to fix up the values but I wasn't successful. > Jun, Kevin - workarounds for hardware quirks like this one are really > what I would think should come out of Intel, and preferably not > after the first problem report on Xen, but soon after the quirk has > got identified in general (which according to the Linux patches must > have been at least 2 months ago). +1 > > --- a/xen/arch/x86/hvm/vmx/vmx.c > > +++ b/xen/arch/x86/hvm/vmx/vmx.c > > @@ -2576,8 +2576,22 @@ static const struct lbr_info *last_branch_msr_get(void) > > /* Haswell */ > > case 60: case 63: case 69: case 70: > > /* Broadwell */ > > - case 61: case 71: case 79: case 86: > > + case 61: case 71: case 79: case 86: { > > + u64 caps; > > + bool_t tsx_support = boot_cpu_has(X86_FEATURE_HLE) || > > + boot_cpu_has(X86_FEATURE_RTM); > > + > > + rdmsrl(MSR_IA32_PERF_CAPABILITIES, caps); > > This is guarded by a X86_FEATURE_PDCM check in Linux - why > would we not need the same here? You're right, it should be. It also seems to be missing from core2_vpmu_init(). > Also I think this RDMSR should be performed once at boot, not every > time we come here. I thought you might say that. It didn't seem obviously right to put this in boot_cpu_data -- is that what you suggest? > > + /* > > + * Unimplemented fixups are required if the processor > > + * supports an LBR format that includes TSX information, > > + * but not TSX. Disable LBR save/load on such platforms. > > + */ > > + if ( !tsx_support && (caps & 4) ) > > As I understand it the low 6 bits of the MSR contain an enumeration > like value, so just checking bit 2 can't be right (and would wrongly > trigger on already defined format types 5 and 6 - even if those were > known to be impossible on the models currently covered, this would > be a latent trap for someone to fall into when adding further model > values). Fair point. --msw
>>> On 10.08.16 at 17:47, <msw@amzn.com> wrote: > On Wed, Aug 10, 2016 at 06:34:10AM -0600, Jan Beulich wrote: >> >>> On 10.08.16 at 08:59, <msw@amzn.com> wrote: >> > --- a/xen/arch/x86/hvm/vmx/vmx.c >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c >> > @@ -2576,8 +2576,22 @@ static const struct lbr_info *last_branch_msr_get(void) >> > /* Haswell */ >> > case 60: case 63: case 69: case 70: >> > /* Broadwell */ >> > - case 61: case 71: case 79: case 86: >> > + case 61: case 71: case 79: case 86: { >> > + u64 caps; >> > + bool_t tsx_support = boot_cpu_has(X86_FEATURE_HLE) || >> > + boot_cpu_has(X86_FEATURE_RTM); >> > + >> > + rdmsrl(MSR_IA32_PERF_CAPABILITIES, caps); >> >> This is guarded by a X86_FEATURE_PDCM check in Linux - why >> would we not need the same here? > > You're right, it should be. It also seems to be missing from > core2_vpmu_init(). Feel free to take the liberty to fix it there at once (but please briefly mention this as an independent change in the description). >> Also I think this RDMSR should be performed once at boot, not every >> time we come here. > > I thought you might say that. It didn't seem obviously right to put > this in boot_cpu_data -- is that what you suggest? Why boot_cpu_data? Just an ordinary (possibly per-CPU) static in vmx.c. Jan
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index 3d330b6..c51cefc 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -2576,8 +2576,22 @@ static const struct lbr_info *last_branch_msr_get(void) /* Haswell */ case 60: case 63: case 69: case 70: /* Broadwell */ - case 61: case 71: case 79: case 86: + case 61: case 71: case 79: case 86: { + u64 caps; + bool_t tsx_support = boot_cpu_has(X86_FEATURE_HLE) || + boot_cpu_has(X86_FEATURE_RTM); + + rdmsrl(MSR_IA32_PERF_CAPABILITIES, caps); + /* + * Unimplemented fixups are required if the processor + * supports an LBR format that includes TSX information, + * but not TSX. Disable LBR save/load on such platforms. + */ + if ( !tsx_support && (caps & 4) ) + return NULL; + return nh_lbr; + } /* Skylake */ case 78: case 94: /* future */