Message ID | 20210426175421.30497-2-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86: Initial pieces for guest CET support | expand |
On 26.04.2021 19:54, Andrew Cooper wrote: > For now, let VMs opt into using CET by setting cet_ss/ibt in the CPUID > policy. Also extend cr4 handling to permit CR4.CET being set, along with > logic to interlock CR4.CET and CR0.WP. > > Everything else will malfunction for now, but this will help adding support > incrementally - there is a lot to do before CET will work properly. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Just one consideration: > --- a/xen/include/public/arch-x86/cpufeatureset.h > +++ b/xen/include/public/arch-x86/cpufeatureset.h > @@ -232,7 +232,7 @@ XEN_CPUFEATURE(UMIP, 6*32+ 2) /*S User Mode Instruction Prevention */ > XEN_CPUFEATURE(PKU, 6*32+ 3) /*H Protection Keys for Userspace */ > XEN_CPUFEATURE(OSPKE, 6*32+ 4) /*! OS Protection Keys Enable */ > XEN_CPUFEATURE(AVX512_VBMI2, 6*32+ 6) /*A Additional AVX-512 Vector Byte Manipulation Instrs */ > -XEN_CPUFEATURE(CET_SS, 6*32+ 7) /* CET - Shadow Stacks */ > +XEN_CPUFEATURE(CET_SS, 6*32+ 7) /*h CET - Shadow Stacks */ > XEN_CPUFEATURE(GFNI, 6*32+ 8) /*A Galois Field Instrs */ > XEN_CPUFEATURE(VAES, 6*32+ 9) /*A Vector AES Instrs */ > XEN_CPUFEATURE(VPCLMULQDQ, 6*32+10) /*A Vector Carry-less Multiplication Instrs */ > @@ -267,7 +267,7 @@ XEN_CPUFEATURE(SRBDS_CTRL, 9*32+ 9) /* MSR_MCU_OPT_CTRL and RNGDS_MITG_DIS. > XEN_CPUFEATURE(MD_CLEAR, 9*32+10) /*A VERW clears microarchitectural buffers */ > XEN_CPUFEATURE(TSX_FORCE_ABORT, 9*32+13) /* MSR_TSX_FORCE_ABORT.RTM_ABORT */ > XEN_CPUFEATURE(SERIALIZE, 9*32+14) /*a SERIALIZE insn */ > -XEN_CPUFEATURE(CET_IBT, 9*32+20) /* CET - Indirect Branch Tracking */ > +XEN_CPUFEATURE(CET_IBT, 9*32+20) /*h CET - Indirect Branch Tracking */ > XEN_CPUFEATURE(IBRSB, 9*32+26) /*A IBRS and IBPB support (used by Intel) */ > XEN_CPUFEATURE(STIBP, 9*32+27) /*A STIBP */ > XEN_CPUFEATURE(L1D_FLUSH, 9*32+28) /*S MSR_FLUSH_CMD and L1D flush. */ If by the time 4.16 finishes up the various todo items haven't been taken care of, should we take note to undo these markings? I would have suggested allowing them for debug builds only, but that's kind of ugly to achieve in a public header. Jan
On 27/04/2021 16:47, Jan Beulich wrote: > On 26.04.2021 19:54, Andrew Cooper wrote: >> For now, let VMs opt into using CET by setting cet_ss/ibt in the CPUID >> policy. Also extend cr4 handling to permit CR4.CET being set, along with >> logic to interlock CR4.CET and CR0.WP. >> >> Everything else will malfunction for now, but this will help adding support >> incrementally - there is a lot to do before CET will work properly. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > Acked-by: Jan Beulich <jbeulich@suse.com> > > Just one consideration: > >> --- a/xen/include/public/arch-x86/cpufeatureset.h >> +++ b/xen/include/public/arch-x86/cpufeatureset.h >> @@ -232,7 +232,7 @@ XEN_CPUFEATURE(UMIP, 6*32+ 2) /*S User Mode Instruction Prevention */ >> XEN_CPUFEATURE(PKU, 6*32+ 3) /*H Protection Keys for Userspace */ >> XEN_CPUFEATURE(OSPKE, 6*32+ 4) /*! OS Protection Keys Enable */ >> XEN_CPUFEATURE(AVX512_VBMI2, 6*32+ 6) /*A Additional AVX-512 Vector Byte Manipulation Instrs */ >> -XEN_CPUFEATURE(CET_SS, 6*32+ 7) /* CET - Shadow Stacks */ >> +XEN_CPUFEATURE(CET_SS, 6*32+ 7) /*h CET - Shadow Stacks */ >> XEN_CPUFEATURE(GFNI, 6*32+ 8) /*A Galois Field Instrs */ >> XEN_CPUFEATURE(VAES, 6*32+ 9) /*A Vector AES Instrs */ >> XEN_CPUFEATURE(VPCLMULQDQ, 6*32+10) /*A Vector Carry-less Multiplication Instrs */ >> @@ -267,7 +267,7 @@ XEN_CPUFEATURE(SRBDS_CTRL, 9*32+ 9) /* MSR_MCU_OPT_CTRL and RNGDS_MITG_DIS. >> XEN_CPUFEATURE(MD_CLEAR, 9*32+10) /*A VERW clears microarchitectural buffers */ >> XEN_CPUFEATURE(TSX_FORCE_ABORT, 9*32+13) /* MSR_TSX_FORCE_ABORT.RTM_ABORT */ >> XEN_CPUFEATURE(SERIALIZE, 9*32+14) /*a SERIALIZE insn */ >> -XEN_CPUFEATURE(CET_IBT, 9*32+20) /* CET - Indirect Branch Tracking */ >> +XEN_CPUFEATURE(CET_IBT, 9*32+20) /*h CET - Indirect Branch Tracking */ >> XEN_CPUFEATURE(IBRSB, 9*32+26) /*A IBRS and IBPB support (used by Intel) */ >> XEN_CPUFEATURE(STIBP, 9*32+27) /*A STIBP */ >> XEN_CPUFEATURE(L1D_FLUSH, 9*32+28) /*S MSR_FLUSH_CMD and L1D flush. */ > If by the time 4.16 finishes up the various todo items haven't been > taken care of, should we take note to undo these markings? I would > have suggested allowing them for debug builds only, but that's kind > of ugly to achieve in a public header. TBH, I still don't think this should be a public header. If I would have forseen how much of a PITA is it, I would have argued harder against it. It is, at best, tools-only (via SYSCTL_get_featureset), and I don't intend that interface to survive the tools ABI stabilisation work, as it has been fully superseded by the cpu_policy interfaces and libx86. As for the max markings themselves, I'm not sure they ought to be debug-only. Its important aspect of making guest support "tech preview" to ensure the logic works irrespective of CONFIG_DEBUG, and I think it is entirely fine for an experimental feature to be of status "your VM will explode if you enable this right now", even if that spills over into 4.17. In reality, once we've got {U,S}_CET context switching working at the Xen level, and {RD,WR}MSR plumbing done, it ought to be safe to people to turn on an experiment with. At this point, we're in the position of "expected to work correctly in a subset of usecases". I'd ideally like to get to this point before 4.16 releases, but that will be very subject to other work. All the emulator work is for cases that a VM won't encounter by default (Task Switch too, as Minix in the Intel Management Engine is the only known 32-bit CET-SS implementation). Obviously, we want to get the checklist complete before enabling by default, but give the complexities in the emulator, I suspect we'll have a long gap between "believed can be used safely in a subset of cases", and "believed safe to use in general", and a long list of people happy to use it in a "doesn't work under introspection yet" state. ~Andrew
On 27.04.2021 19:39, Andrew Cooper wrote: > On 27/04/2021 16:47, Jan Beulich wrote: >> On 26.04.2021 19:54, Andrew Cooper wrote: >>> --- a/xen/include/public/arch-x86/cpufeatureset.h >>> +++ b/xen/include/public/arch-x86/cpufeatureset.h >>> @@ -232,7 +232,7 @@ XEN_CPUFEATURE(UMIP, 6*32+ 2) /*S User Mode Instruction Prevention */ >>> XEN_CPUFEATURE(PKU, 6*32+ 3) /*H Protection Keys for Userspace */ >>> XEN_CPUFEATURE(OSPKE, 6*32+ 4) /*! OS Protection Keys Enable */ >>> XEN_CPUFEATURE(AVX512_VBMI2, 6*32+ 6) /*A Additional AVX-512 Vector Byte Manipulation Instrs */ >>> -XEN_CPUFEATURE(CET_SS, 6*32+ 7) /* CET - Shadow Stacks */ >>> +XEN_CPUFEATURE(CET_SS, 6*32+ 7) /*h CET - Shadow Stacks */ >>> XEN_CPUFEATURE(GFNI, 6*32+ 8) /*A Galois Field Instrs */ >>> XEN_CPUFEATURE(VAES, 6*32+ 9) /*A Vector AES Instrs */ >>> XEN_CPUFEATURE(VPCLMULQDQ, 6*32+10) /*A Vector Carry-less Multiplication Instrs */ >>> @@ -267,7 +267,7 @@ XEN_CPUFEATURE(SRBDS_CTRL, 9*32+ 9) /* MSR_MCU_OPT_CTRL and RNGDS_MITG_DIS. >>> XEN_CPUFEATURE(MD_CLEAR, 9*32+10) /*A VERW clears microarchitectural buffers */ >>> XEN_CPUFEATURE(TSX_FORCE_ABORT, 9*32+13) /* MSR_TSX_FORCE_ABORT.RTM_ABORT */ >>> XEN_CPUFEATURE(SERIALIZE, 9*32+14) /*a SERIALIZE insn */ >>> -XEN_CPUFEATURE(CET_IBT, 9*32+20) /* CET - Indirect Branch Tracking */ >>> +XEN_CPUFEATURE(CET_IBT, 9*32+20) /*h CET - Indirect Branch Tracking */ >>> XEN_CPUFEATURE(IBRSB, 9*32+26) /*A IBRS and IBPB support (used by Intel) */ >>> XEN_CPUFEATURE(STIBP, 9*32+27) /*A STIBP */ >>> XEN_CPUFEATURE(L1D_FLUSH, 9*32+28) /*S MSR_FLUSH_CMD and L1D flush. */ >> If by the time 4.16 finishes up the various todo items haven't been >> taken care of, should we take note to undo these markings? I would >> have suggested allowing them for debug builds only, but that's kind >> of ugly to achieve in a public header. > > TBH, I still don't think this should be a public header. If I would > have forseen how much of a PITA is it, I would have argued harder > against it. > > It is, at best, tools-only (via SYSCTL_get_featureset), and I don't > intend that interface to survive the tools ABI stabilisation work, as > it has been fully superseded by the cpu_policy interfaces and libx86. Well - what features we expose is part of the (or at least something like an) ABI. The actual numbering of course isn't (or shouldn't be). I'm in no way opposed to moving the header out of public/ (and until now I was of the impression that it was you who put it there), but if we do I think we will want an alternative way of expressing which extended features we support. I say this in particular because I think SUPPORT.md doesn't lend itself to documenting support status at this level of granularity. I'd much rather see that file refer to this header, even if this may mean some difficulty to non-programmers. > As for the max markings themselves, I'm not sure they ought to be > debug-only. Its important aspect of making guest support "tech preview" > to ensure the logic works irrespective of CONFIG_DEBUG, and I think it > is entirely fine for an experimental feature to be of status "your VM > will explode if you enable this right now", even if that spills over > into 4.17. For a release I consider this undesirable. If a feature is in such a state at the point of entering the RC phase, I think such markings ought to be undone. > In reality, once we've got {U,S}_CET context switching working at the > Xen level, and {RD,WR}MSR plumbing done, it ought to be safe to people > to turn on an experiment with. At this point, we're in the position of > "expected to work correctly in a subset of usecases". I'd ideally like > to get to this point before 4.16 releases, but that will be very subject > to other work. At _that_ point I could see the markings getting introduced outside of debug builds, but still lower-case. > All the emulator work is for cases that a VM won't encounter by default > (Task Switch too, as Minix in the Intel Management Engine is the only > known 32-bit CET-SS implementation). Right, this is what is a prereq for the markings to become upper-case ones (if - not specifically for the features here, but as a general remark - we want them to become so ever in the first place) and the features fully supported. Jan > Obviously, we want to get the checklist complete before enabling by > default, but give the complexities in the emulator, I suspect we'll have > a long gap between "believed can be used safely in a subset of cases", > and "believed safe to use in general", and a long list of people happy > to use it in a "doesn't work under introspection yet" state. > > ~Andrew >
On 28/04/2021 10:11, Jan Beulich wrote: > On 27.04.2021 19:39, Andrew Cooper wrote: >> On 27/04/2021 16:47, Jan Beulich wrote: >>> On 26.04.2021 19:54, Andrew Cooper wrote: >>>> --- a/xen/include/public/arch-x86/cpufeatureset.h >>>> +++ b/xen/include/public/arch-x86/cpufeatureset.h >>>> @@ -232,7 +232,7 @@ XEN_CPUFEATURE(UMIP, 6*32+ 2) /*S User Mode Instruction Prevention */ >>>> XEN_CPUFEATURE(PKU, 6*32+ 3) /*H Protection Keys for Userspace */ >>>> XEN_CPUFEATURE(OSPKE, 6*32+ 4) /*! OS Protection Keys Enable */ >>>> XEN_CPUFEATURE(AVX512_VBMI2, 6*32+ 6) /*A Additional AVX-512 Vector Byte Manipulation Instrs */ >>>> -XEN_CPUFEATURE(CET_SS, 6*32+ 7) /* CET - Shadow Stacks */ >>>> +XEN_CPUFEATURE(CET_SS, 6*32+ 7) /*h CET - Shadow Stacks */ >>>> XEN_CPUFEATURE(GFNI, 6*32+ 8) /*A Galois Field Instrs */ >>>> XEN_CPUFEATURE(VAES, 6*32+ 9) /*A Vector AES Instrs */ >>>> XEN_CPUFEATURE(VPCLMULQDQ, 6*32+10) /*A Vector Carry-less Multiplication Instrs */ >>>> @@ -267,7 +267,7 @@ XEN_CPUFEATURE(SRBDS_CTRL, 9*32+ 9) /* MSR_MCU_OPT_CTRL and RNGDS_MITG_DIS. >>>> XEN_CPUFEATURE(MD_CLEAR, 9*32+10) /*A VERW clears microarchitectural buffers */ >>>> XEN_CPUFEATURE(TSX_FORCE_ABORT, 9*32+13) /* MSR_TSX_FORCE_ABORT.RTM_ABORT */ >>>> XEN_CPUFEATURE(SERIALIZE, 9*32+14) /*a SERIALIZE insn */ >>>> -XEN_CPUFEATURE(CET_IBT, 9*32+20) /* CET - Indirect Branch Tracking */ >>>> +XEN_CPUFEATURE(CET_IBT, 9*32+20) /*h CET - Indirect Branch Tracking */ >>>> XEN_CPUFEATURE(IBRSB, 9*32+26) /*A IBRS and IBPB support (used by Intel) */ >>>> XEN_CPUFEATURE(STIBP, 9*32+27) /*A STIBP */ >>>> XEN_CPUFEATURE(L1D_FLUSH, 9*32+28) /*S MSR_FLUSH_CMD and L1D flush. */ >>> If by the time 4.16 finishes up the various todo items haven't been >>> taken care of, should we take note to undo these markings? I would >>> have suggested allowing them for debug builds only, but that's kind >>> of ugly to achieve in a public header. >> TBH, I still don't think this should be a public header. If I would >> have forseen how much of a PITA is it, I would have argued harder >> against it. >> >> It is, at best, tools-only (via SYSCTL_get_featureset), and I don't >> intend that interface to survive the tools ABI stabilisation work, as >> it has been fully superseded by the cpu_policy interfaces and libx86. > Well - what features we expose is part of the (or at least something > like an) ABI. Yes, but that ABI is called CPUID and MSRs, per Intel/AMD's spec. VM's see it via the real instructions. Control software sees it via a pair of arrays ({leaf, subleaf, a, b, c, d} and {idx, val}) expressed in terms of the vendors ABI. > The actual numbering of course isn't (or shouldn't be). > I'm in no way opposed to moving the header out of public/ (and until > now I was of the impression that it was you who put it there), but if > we do I think we will want an alternative way of expressing which > extended features we support. I say this in particular because I think > SUPPORT.md doesn't lend itself to documenting support status at this > level of granularity. I'd much rather see that file refer to this > header, even if this may mean some difficulty to non-programmers. We don't need to say or do anything for experimental features. Hitting Tech Preview and supported ought to be in release notes. I do agree that SUPPORT.md isn't great. >> As for the max markings themselves, I'm not sure they ought to be >> debug-only. Its important aspect of making guest support "tech preview" >> to ensure the logic works irrespective of CONFIG_DEBUG, and I think it >> is entirely fine for an experimental feature to be of status "your VM >> will explode if you enable this right now", even if that spills over >> into 4.17. > For a release I consider this undesirable. If a feature is in such a > state at the point of entering the RC phase, I think such markings > ought to be undone. Well no. They either don't go in in the first place, or they go in and stay. Putting them in with an expectation to take them out later is a recipe for forgetting to do so. I know we're making up our "how to do complicated experimental features" process as we go here, but nothing *in Xen* will malfunction if a user opts into CET_SS/IBT. Therefore I think they're fine to go in and stay. ~Andrew
On 28.04.2021 19:54, Andrew Cooper wrote: > I know we're making up our "how to do complicated experimental features" > process as we go here, but nothing *in Xen* will malfunction if a user > opts into CET_SS/IBT. Therefore I think they're fine to go in and stay. Well, okay then. I hope possibls future additions of mine then will get similar treatment. Jan
On 29/04/2021 10:07, Jan Beulich wrote: > On 28.04.2021 19:54, Andrew Cooper wrote: >> I know we're making up our "how to do complicated experimental features" >> process as we go here, but nothing *in Xen* will malfunction if a user >> opts into CET_SS/IBT. Therefore I think they're fine to go in and stay. > Well, okay then. I hope possibls future additions of mine then will > get similar treatment. So having thought on this for a while... The annotations in the header file don't help me. My dev workflow already starts with turning them to default, skips messing around with toolstack level things. At this point in feature development, there is no use to anyone who isn't also editing the hypervisor too. So unless it is going to help you (and I suspect it wont), its probably not helpful to the wider world. I'll drop the markings from this patch, and put them back in at the point that we believe "basic VMs" work. ~Andrew
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index ae37bc434a..28beacc45b 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -976,11 +976,12 @@ const char *hvm_efer_valid(const struct vcpu *v, uint64_t value, unsigned long hvm_cr4_guest_valid_bits(const struct domain *d) { const struct cpuid_policy *p = d->arch.cpuid; - bool mce, vmxe; + bool mce, vmxe, cet; /* Logic broken out simply to aid readability below. */ mce = p->basic.mce || p->basic.mca; vmxe = p->basic.vmx && nestedhvm_enabled(d); + cet = p->feat.cet_ss || p->feat.cet_ibt; return ((p->basic.vme ? X86_CR4_VME | X86_CR4_PVI : 0) | (p->basic.tsc ? X86_CR4_TSD : 0) | @@ -999,7 +1000,8 @@ unsigned long hvm_cr4_guest_valid_bits(const struct domain *d) (p->basic.xsave ? X86_CR4_OSXSAVE : 0) | (p->feat.smep ? X86_CR4_SMEP : 0) | (p->feat.smap ? X86_CR4_SMAP : 0) | - (p->feat.pku ? X86_CR4_PKE : 0)); + (p->feat.pku ? X86_CR4_PKE : 0) | + (cet ? X86_CR4_CET : 0)); } static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h) @@ -2289,6 +2291,12 @@ int hvm_set_cr0(unsigned long value, bool may_defer) } } + if ( !(value & X86_CR0_WP) && (v->arch.hvm.guest_cr[4] & X86_CR4_CET) ) + { + gprintk(XENLOG_DEBUG, "Trying to clear WP with CET set\n"); + return X86EMUL_EXCEPTION; + } + if ( (value & X86_CR0_PG) && !(old_value & X86_CR0_PG) ) { if ( v->arch.hvm.guest_efer & EFER_LME ) @@ -2444,6 +2452,12 @@ int hvm_set_cr4(unsigned long value, bool may_defer) } } + if ( (value & X86_CR4_CET) && !(v->arch.hvm.guest_cr[0] & X86_CR0_WP) ) + { + gprintk(XENLOG_DEBUG, "Trying to set CET without WP\n"); + return X86EMUL_EXCEPTION; + } + old_cr = v->arch.hvm.guest_cr[4]; if ( (value & X86_CR4_PCIDE) && !(old_cr & X86_CR4_PCIDE) && diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h index c42f56bdd4..6f94a73408 100644 --- a/xen/include/public/arch-x86/cpufeatureset.h +++ b/xen/include/public/arch-x86/cpufeatureset.h @@ -232,7 +232,7 @@ XEN_CPUFEATURE(UMIP, 6*32+ 2) /*S User Mode Instruction Prevention */ XEN_CPUFEATURE(PKU, 6*32+ 3) /*H Protection Keys for Userspace */ XEN_CPUFEATURE(OSPKE, 6*32+ 4) /*! OS Protection Keys Enable */ XEN_CPUFEATURE(AVX512_VBMI2, 6*32+ 6) /*A Additional AVX-512 Vector Byte Manipulation Instrs */ -XEN_CPUFEATURE(CET_SS, 6*32+ 7) /* CET - Shadow Stacks */ +XEN_CPUFEATURE(CET_SS, 6*32+ 7) /*h CET - Shadow Stacks */ XEN_CPUFEATURE(GFNI, 6*32+ 8) /*A Galois Field Instrs */ XEN_CPUFEATURE(VAES, 6*32+ 9) /*A Vector AES Instrs */ XEN_CPUFEATURE(VPCLMULQDQ, 6*32+10) /*A Vector Carry-less Multiplication Instrs */ @@ -267,7 +267,7 @@ XEN_CPUFEATURE(SRBDS_CTRL, 9*32+ 9) /* MSR_MCU_OPT_CTRL and RNGDS_MITG_DIS. XEN_CPUFEATURE(MD_CLEAR, 9*32+10) /*A VERW clears microarchitectural buffers */ XEN_CPUFEATURE(TSX_FORCE_ABORT, 9*32+13) /* MSR_TSX_FORCE_ABORT.RTM_ABORT */ XEN_CPUFEATURE(SERIALIZE, 9*32+14) /*a SERIALIZE insn */ -XEN_CPUFEATURE(CET_IBT, 9*32+20) /* CET - Indirect Branch Tracking */ +XEN_CPUFEATURE(CET_IBT, 9*32+20) /*h CET - Indirect Branch Tracking */ XEN_CPUFEATURE(IBRSB, 9*32+26) /*A IBRS and IBPB support (used by Intel) */ XEN_CPUFEATURE(STIBP, 9*32+27) /*A STIBP */ XEN_CPUFEATURE(L1D_FLUSH, 9*32+28) /*S MSR_FLUSH_CMD and L1D flush. */
For now, let VMs opt into using CET by setting cet_ss/ibt in the CPUID policy. Also extend cr4 handling to permit CR4.CET being set, along with logic to interlock CR4.CET and CR0.WP. Everything else will malfunction for now, but this will help adding support incrementally - there is a lot to do before CET will work properly. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Wei Liu <wl@xen.org> --- xen/arch/x86/hvm/hvm.c | 18 ++++++++++++++++-- xen/include/public/arch-x86/cpufeatureset.h | 4 ++-- 2 files changed, 18 insertions(+), 4 deletions(-)