Message ID | 20210304144755.35891-1-roger.pau@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2,for-4.15] x86/msr: introduce an option for HVM relaxed rdmsr behavior | expand |
Roger Pau Monne writes ("[PATCH v2 for-4.15] x86/msr: introduce an option for HVM relaxed rdmsr behavior"): > Introduce an option to allow selecting a less strict behaviour for > rdmsr accesses targeting a MSR not explicitly handled by Xen. Since > commit 84e848fd7a162f669 accesses to MSRs not explicitly handled by > Xen result in the injection of a #GP to the guest. This is a behavior > change since previously a #GP was only injected if accessing the MSR > on the real hardware will also trigger a #GP. ... > I wonder whether we need to to enable this option by default for > guests being migrated from previous Xen versions? Maybe that's not > required as the option is helpful mostly for early boot I would > assume, afterwards an OS should already have the #GP handler setup > when accessing MSRs. I think it's almost as bad to have guests which can be migrated in, but which then cannot reboot. Historically we have taken the view that new Xen must support old guests, even if that means being bug-compatible. So I am strongly in favour of avoiding such a usability regression. Which I think means enabling this option by default ? > >From a release PoV the biggest risk would be breaking some of the > existing MSR functionality. I think that's a necessary risk in order > to offer such fallback option, or else we might discover after the > release that guests that worked on Xen 4.14 don't work anymore in Xen > 4.15. Yes. Release-Acked-by: Ian Jackson <iwj@xenproject.org> Ian.
On Thu, Mar 04, 2021 at 02:59:38PM +0000, Ian Jackson wrote: > Roger Pau Monne writes ("[PATCH v2 for-4.15] x86/msr: introduce an option for HVM relaxed rdmsr behavior"): > > Introduce an option to allow selecting a less strict behaviour for > > rdmsr accesses targeting a MSR not explicitly handled by Xen. Since > > commit 84e848fd7a162f669 accesses to MSRs not explicitly handled by > > Xen result in the injection of a #GP to the guest. This is a behavior > > change since previously a #GP was only injected if accessing the MSR > > on the real hardware will also trigger a #GP. > ... > > I wonder whether we need to to enable this option by default for > > guests being migrated from previous Xen versions? Maybe that's not > > required as the option is helpful mostly for early boot I would > > assume, afterwards an OS should already have the #GP handler setup > > when accessing MSRs. > > I think it's almost as bad to have guests which can be migrated in, > but which then cannot reboot. Ups, yes, right. > Historically we have taken the view that new Xen must support old > guests, even if that means being bug-compatible. So I am strongly in > favour of avoiding such a usability regression. I'm not a xl/libxl expert, but couldn't we set the option in a persistent way for migrated-in guests? IIRC at domain creation libxl knows whether it's a restore or a fresh domain, and hence we could set the option there? The part I'm not sure is about how to make it persistent. Thanks, Roger.
Roger Pau Monné writes ("Re: [PATCH v2 for-4.15] x86/msr: introduce an option for HVM relaxed rdmsr behavior"): > On Thu, Mar 04, 2021 at 02:59:38PM +0000, Ian Jackson wrote: > > I think it's almost as bad to have guests which can be migrated in, > > but which then cannot reboot. > > Ups, yes, right. > > > Historically we have taken the view that new Xen must support old > > guests, even if that means being bug-compatible. So I am strongly in > > favour of avoiding such a usability regression. > > I'm not a xl/libxl expert, but couldn't we set the option in a > persistent way for migrated-in guests? > > IIRC at domain creation libxl knows whether it's a restore or a fresh > domain, and hence we could set the option there? > > The part I'm not sure is about how to make it persistent. The guest could be stopped with xl shutdown and then recrated with xl create, from the config file. I don't think we want to break that use case here either. Ian.
On Thu, Mar 04, 2021 at 03:20:34PM +0000, Ian Jackson wrote: > Roger Pau Monné writes ("Re: [PATCH v2 for-4.15] x86/msr: introduce an option for HVM relaxed rdmsr behavior"): > > On Thu, Mar 04, 2021 at 02:59:38PM +0000, Ian Jackson wrote: > > > I think it's almost as bad to have guests which can be migrated in, > > > but which then cannot reboot. > > > > Ups, yes, right. > > > > > Historically we have taken the view that new Xen must support old > > > guests, even if that means being bug-compatible. So I am strongly in > > > favour of avoiding such a usability regression. > > > > I'm not a xl/libxl expert, but couldn't we set the option in a > > persistent way for migrated-in guests? > > > > IIRC at domain creation libxl knows whether it's a restore or a fresh > > domain, and hence we could set the option there? > > > > The part I'm not sure is about how to make it persistent. > > The guest could be stopped with xl shutdown and then recrated with xl > create, from the config file. I don't think we want to break that use > case here either. So my original approach was to actually risk breaking creation from config file and require the user to set the rdmsr_relaxed option, and report the problem upstream. I think ideally we would like to get to a point where we could drop the rdmsr_relaxed option, but maybe that's too optimistic. We have done quite a lot of testing of this new policy, but obviously it's not possible to test all possible guest OSes. Forcing the new policy by default might be too risky, so indeed falling back to enabling this by default could be the only solution. The main downside of enabling by default is that then we have to resign to always having this kind of quirky behavior for MSR accesses as the default. Thanks, Roger.
Roger Pau Monné writes ("Re: [PATCH v2 for-4.15] x86/msr: introduce an option for HVM relaxed rdmsr behavior"): > On Thu, Mar 04, 2021 at 03:20:34PM +0000, Ian Jackson wrote: > > The guest could be stopped with xl shutdown and then recrated with xl > > create, from the config file. I don't think we want to break that use > > case here either. > > So my original approach was to actually risk breaking creation from > config file and require the user to set the rdmsr_relaxed option, and > report the problem upstream. I think ideally we would like to get to a > point where we could drop the rdmsr_relaxed option, but maybe that's > too optimistic. Isn't there some way we can move in this direction without the first thing that users experience being their guests not being able to be created ? Maybe we could print a warning on the console or something ? > We have done quite a lot of testing of this new policy, but obviously > it's not possible to test all possible guest OSes. Forcing the new > policy by default might be too risky, so indeed falling back to > enabling this by default could be the only solution. > > The main downside of enabling by default is that then we have to > resign to always having this kind of quirky behavior for MSR > accesses as the default. What would stop us changing the default later, when we had a better idea of the set of RDMSRs that need to be special-cased ? Ian.
On Thu, Mar 04, 2021 at 05:13:15PM +0000, Ian Jackson wrote: > Roger Pau Monné writes ("Re: [PATCH v2 for-4.15] x86/msr: introduce an option for HVM relaxed rdmsr behavior"): > > On Thu, Mar 04, 2021 at 03:20:34PM +0000, Ian Jackson wrote: > > > The guest could be stopped with xl shutdown and then recrated with xl > > > create, from the config file. I don't think we want to break that use > > > case here either. > > > > So my original approach was to actually risk breaking creation from > > config file and require the user to set the rdmsr_relaxed option, and > > report the problem upstream. I think ideally we would like to get to a > > point where we could drop the rdmsr_relaxed option, but maybe that's > > too optimistic. > > Isn't there some way we can move in this direction without the first > thing that users experience being their guests not being able to be > created ? > > Maybe we could print a warning on the console or something ? I (sadly) fear unless you get a guest crash no-one will ever look at the logs and notice those messages. > > We have done quite a lot of testing of this new policy, but obviously > > it's not possible to test all possible guest OSes. Forcing the new > > policy by default might be too risky, so indeed falling back to > > enabling this by default could be the only solution. > > > > The main downside of enabling by default is that then we have to > > resign to always having this kind of quirky behavior for MSR > > accesses as the default. > > What would stop us changing the default later, when we had a better > idea of the set of RDMSRs that need to be special-cased ? We could. From the Citrix side I'm afraid there's not much more testing that we can do however. We tested the new policy against all possible guests known by XenRT, but obviously that's not every possible OS. One option we could go for is making this behavior depend on Kconfig: enable strict MSR policy for debug builds and fallback to the 'relaxed' one for non-debug builds. That might get us some more data, but again I fear most people out there will run non-debug builds anyway. Thanks, Roger.
Roger Pau Monné writes ("Re: [PATCH v2 for-4.15] x86/msr: introduce an option for HVM relaxed rdmsr behavior"): > One option we could go for is making this behavior depend on Kconfig: > enable strict MSR policy for debug builds and fallback to the > 'relaxed' one for non-debug builds. That might get us some more data, > but again I fear most people out there will run non-debug builds > anyway. Hmmm. Well, anyway, my R-A for this patch stands. I wanted to explore our options, but I won't try to insist on a change to the default configuration if you experts are against it. Ian.
On 3/4/21 9:47 AM, Roger Pau Monne wrote: > Introduce an option to allow selecting a less strict behaviour for > rdmsr accesses targeting a MSR not explicitly handled by Xen. Since > commit 84e848fd7a162f669 accesses to MSRs not explicitly handled by > Xen result in the injection of a #GP to the guest. This is a behavior > change since previously a #GP was only injected if accessing the MSR > on the real hardware will also trigger a #GP. > > This commit attempts to offer a fallback option similar to the > previous behavior. Note however that the value of the underlying MSR > is never leaked to the guest, as the newly introduced option only > changes whether a #GP is injected or not. > > Long term the plan is to properly handle all the MSRs, so the option > introduced here should be considered a temporary resort for OSes that > don't work properly with the new MSR policy. Any OS that requires this > option to be enabled should be reported to > xen-devel@lists.xenproject.org. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > --- > Changes since v1: > - Only apply the option to HVM guests. > - Only apply the special handling to MSR reads. > - Sanitize the newly introduced flags field. > - Print a warning message when the option is used. > --- > Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> > --- > Boris, could you please test with Solaris to see if this fixes the > issue? Yes, still works. (It worked especially well after I noticed new option name ;-)) -boris
On 04/03/2021 23:09, Boris Ostrovsky wrote: > On 3/4/21 9:47 AM, Roger Pau Monne wrote: >> Introduce an option to allow selecting a less strict behaviour for >> rdmsr accesses targeting a MSR not explicitly handled by Xen. Since >> commit 84e848fd7a162f669 accesses to MSRs not explicitly handled by >> Xen result in the injection of a #GP to the guest. This is a behavior >> change since previously a #GP was only injected if accessing the MSR >> on the real hardware will also trigger a #GP. >> >> This commit attempts to offer a fallback option similar to the >> previous behavior. Note however that the value of the underlying MSR >> is never leaked to the guest, as the newly introduced option only >> changes whether a #GP is injected or not. >> >> Long term the plan is to properly handle all the MSRs, so the option >> introduced here should be considered a temporary resort for OSes that >> don't work properly with the new MSR policy. Any OS that requires this >> option to be enabled should be reported to >> xen-devel@lists.xenproject.org. >> >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> >> --- >> Changes since v1: >> - Only apply the option to HVM guests. >> - Only apply the special handling to MSR reads. >> - Sanitize the newly introduced flags field. >> - Print a warning message when the option is used. >> --- >> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> >> --- >> Boris, could you please test with Solaris to see if this fixes the >> issue? > > Yes, still works. (It worked especially well after I noticed new option name ;-)) I'm afraid I want to break and rework how this bugfix happens. Solaris is still broken on all older branches and this isn't a suitable fix to backport. ~Andrew
On 04/03/2021 14:47, Roger Pau Monne wrote: > Introduce an option to allow selecting a less strict behaviour for > rdmsr accesses targeting a MSR not explicitly handled by Xen. Since > commit 84e848fd7a162f669 accesses to MSRs not explicitly handled by > Xen result in the injection of a #GP to the guest. This is a behavior > change since previously a #GP was only injected if accessing the MSR > on the real hardware will also trigger a #GP. > > This commit attempts to offer a fallback option similar to the > previous behavior. Note however that the value of the underlying MSR > is never leaked to the guest, as the newly introduced option only > changes whether a #GP is injected or not. > > Long term the plan is to properly handle all the MSRs, so the option > introduced here should be considered a temporary resort for OSes that > don't work properly with the new MSR policy. Any OS that requires this > option to be enabled should be reported to > xen-devel@lists.xenproject.org. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > --- > Changes since v1: > - Only apply the option to HVM guests. > - Only apply the special handling to MSR reads. > - Sanitize the newly introduced flags field. > - Print a warning message when the option is used. > --- > Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> > --- > Boris, could you please test with Solaris to see if this fixes the > issue? > > I wonder whether we need to to enable this option by default for > guests being migrated from previous Xen versions? Maybe that's not > required as the option is helpful mostly for early boot I would > assume, afterwards an OS should already have the #GP handler setup > when accessing MSRs. We know when building a domain whether it is a migrate or not, but don't recall any version information existing at an appropriate point in the migration stream to do this easily. We can buffer the stream forward and peek at the libxc domain header, which does have the source hypervisor version, but that is going to be very invasive to implement. > > From a release PoV the biggest risk would be breaking some of the > existing MSR functionality. I think that's a necessary risk in order > to offer such fallback option, or else we might discover after the > release that guests that worked on Xen 4.14 don't work anymore in Xen > 4.15. Much as I'd prefer not to have this, I agree with the sentiment that we should have an "emergency undo" which people can use, and carry it for at least a short while. However, to be useful for the purpose of unbreaking VMs, it must change both the read and write behaviour, because both are potential compatibility concerns (without reintroducing the information leak). > --- > docs/man/xl.cfg.5.pod.in | 17 +++++++++++++++++ > tools/include/libxl.h | 8 ++++++++ > tools/libs/light/libxl_types.idl | 2 ++ > tools/libs/light/libxl_x86.c | 4 ++++ > tools/xl/xl_parse.c | 7 +++++++ > xen/arch/x86/domain.c | 10 ++++++++++ > xen/arch/x86/hvm/svm/svm.c | 6 ++++++ > xen/arch/x86/hvm/vmx/vmx.c | 7 +++++++ > xen/include/asm-x86/hvm/domain.h | 3 +++ > xen/include/public/arch-x86/xen.h | 8 ++++++++ This needs changes to the Ocaml bindings as well. I guess I'll add that to the todo list. ~Andrew
On Thu, Mar 04, 2021 at 11:28:16PM +0000, Andrew Cooper wrote: > On 04/03/2021 23:09, Boris Ostrovsky wrote: > > On 3/4/21 9:47 AM, Roger Pau Monne wrote: > >> Introduce an option to allow selecting a less strict behaviour for > >> rdmsr accesses targeting a MSR not explicitly handled by Xen. Since > >> commit 84e848fd7a162f669 accesses to MSRs not explicitly handled by > >> Xen result in the injection of a #GP to the guest. This is a behavior > >> change since previously a #GP was only injected if accessing the MSR > >> on the real hardware will also trigger a #GP. > >> > >> This commit attempts to offer a fallback option similar to the > >> previous behavior. Note however that the value of the underlying MSR > >> is never leaked to the guest, as the newly introduced option only > >> changes whether a #GP is injected or not. > >> > >> Long term the plan is to properly handle all the MSRs, so the option > >> introduced here should be considered a temporary resort for OSes that > >> don't work properly with the new MSR policy. Any OS that requires this > >> option to be enabled should be reported to > >> xen-devel@lists.xenproject.org. > >> > >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > >> --- > >> Changes since v1: > >> - Only apply the option to HVM guests. > >> - Only apply the special handling to MSR reads. > >> - Sanitize the newly introduced flags field. > >> - Print a warning message when the option is used. > >> --- > >> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> > >> --- > >> Boris, could you please test with Solaris to see if this fixes the > >> issue? > > > > Yes, still works. (It worked especially well after I noticed new option name ;-)) > > I'm afraid I want to break and rework how this bugfix happens. Solaris > is still broken on all older branches and this isn't a suitable fix to > backport. Right, I think that's the reactive part of the fixing that we spoke with Jan, but I think we would still need something similar to rdmsr_relaxed. There's at least one other MSR which Jan identified that we also want to handle MSR_K7_HWCR. Thanks, Roger.
On Fri, Mar 05, 2021 at 12:06:19AM +0000, Andrew Cooper wrote: > On 04/03/2021 14:47, Roger Pau Monne wrote: > > Introduce an option to allow selecting a less strict behaviour for > > rdmsr accesses targeting a MSR not explicitly handled by Xen. Since > > commit 84e848fd7a162f669 accesses to MSRs not explicitly handled by > > Xen result in the injection of a #GP to the guest. This is a behavior > > change since previously a #GP was only injected if accessing the MSR > > on the real hardware will also trigger a #GP. > > > > This commit attempts to offer a fallback option similar to the > > previous behavior. Note however that the value of the underlying MSR > > is never leaked to the guest, as the newly introduced option only > > changes whether a #GP is injected or not. > > > > Long term the plan is to properly handle all the MSRs, so the option > > introduced here should be considered a temporary resort for OSes that > > don't work properly with the new MSR policy. Any OS that requires this > > option to be enabled should be reported to > > xen-devel@lists.xenproject.org. > > > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > --- > > Changes since v1: > > - Only apply the option to HVM guests. > > - Only apply the special handling to MSR reads. > > - Sanitize the newly introduced flags field. > > - Print a warning message when the option is used. > > --- > > Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> > > --- > > Boris, could you please test with Solaris to see if this fixes the > > issue? > > > > I wonder whether we need to to enable this option by default for > > guests being migrated from previous Xen versions? Maybe that's not > > required as the option is helpful mostly for early boot I would > > assume, afterwards an OS should already have the #GP handler setup > > when accessing MSRs. > > We know when building a domain whether it is a migrate or not, but don't > recall any version information existing at an appropriate point in the > migration stream to do this easily. > > We can buffer the stream forward and peek at the libxc domain header, > which does have the source hypervisor version, but that is going to be > very invasive to implement. I need to look at this more closely to have an opinion. Instead of figuring out if the source version is older maybe we could add something to the 4.15 stream in a suitable position to detect whether the source is new enough? > > > > From a release PoV the biggest risk would be breaking some of the > > existing MSR functionality. I think that's a necessary risk in order > > to offer such fallback option, or else we might discover after the > > release that guests that worked on Xen 4.14 don't work anymore in Xen > > 4.15. > > Much as I'd prefer not to have this, I agree with the sentiment that we > should have an "emergency undo" which people can use, and carry it for > at least a short while. > > However, to be useful for the purpose of unbreaking VMs, it must change > both the read and write behaviour, because both are potential > compatibility concerns (without reintroducing the information leak). I think I was confused here and assumed the previous behavior would check the written value to match the current underlying value before injecting a #GP. That's not the case. I can expand this patch to include the write side, I just thought having the rad side only would be enough to cover for the unhandled MSRs accesses. > > > --- > > docs/man/xl.cfg.5.pod.in | 17 +++++++++++++++++ > > tools/include/libxl.h | 8 ++++++++ > > tools/libs/light/libxl_types.idl | 2 ++ > > tools/libs/light/libxl_x86.c | 4 ++++ > > tools/xl/xl_parse.c | 7 +++++++ > > xen/arch/x86/domain.c | 10 ++++++++++ > > xen/arch/x86/hvm/svm/svm.c | 6 ++++++ > > xen/arch/x86/hvm/vmx/vmx.c | 7 +++++++ > > xen/include/asm-x86/hvm/domain.h | 3 +++ > > xen/include/public/arch-x86/xen.h | 8 ++++++++ > > This needs changes to the Ocaml bindings as well. I guess I'll add that > to the todo list. Hm, would be better if someone else did those, as I don't know anything about Ocaml. I can try however in the next version, maybe it's trivial to add. Thanks, Roger.
On 04.03.2021 18:43, Roger Pau Monné wrote: > One option we could go for is making this behavior depend on Kconfig: > enable strict MSR policy for debug builds and fallback to the > 'relaxed' one for non-debug builds. That might get us some more data, > but again I fear most people out there will run non-debug builds > anyway. Plus of course we'd almost never test the "relaxed" code path(s). Jan
On 05.03.2021 10:15, Roger Pau Monné wrote: > On Fri, Mar 05, 2021 at 12:06:19AM +0000, Andrew Cooper wrote: >> On 04/03/2021 14:47, Roger Pau Monne wrote: >>> From a release PoV the biggest risk would be breaking some of the >>> existing MSR functionality. I think that's a necessary risk in order >>> to offer such fallback option, or else we might discover after the >>> release that guests that worked on Xen 4.14 don't work anymore in Xen >>> 4.15. >> >> Much as I'd prefer not to have this, I agree with the sentiment that we >> should have an "emergency undo" which people can use, and carry it for >> at least a short while. >> >> However, to be useful for the purpose of unbreaking VMs, it must change >> both the read and write behaviour, because both are potential >> compatibility concerns (without reintroducing the information leak). > > I think I was confused here and assumed the previous behavior would > check the written value to match the current underlying value before > injecting a #GP. That's not the case. > > I can expand this patch to include the write side, I just thought > having the rad side only would be enough to cover for the unhandled > MSRs accesses. Both when seeing this patch's title and when ripping the write part out of my patch I meant to indicate the same - dealing with just reads may not be enough. Arguably people could be told to first try with just relaxing rdmsr handling, but ones anxious to get their VMs back into production use may ignore such an advice and use the bigger hammer right away. Jan
On 04.03.2021 15:47, Roger Pau Monne wrote: > Introduce an option to allow selecting a less strict behaviour for > rdmsr accesses targeting a MSR not explicitly handled by Xen. Since > commit 84e848fd7a162f669 accesses to MSRs not explicitly handled by > Xen result in the injection of a #GP to the guest. This is a behavior > change since previously a #GP was only injected if accessing the MSR > on the real hardware will also trigger a #GP. > > This commit attempts to offer a fallback option similar to the > previous behavior. Note however that the value of the underlying MSR > is never leaked to the guest, as the newly introduced option only > changes whether a #GP is injected or not. > > Long term the plan is to properly handle all the MSRs, so the option > introduced here should be considered a temporary resort for OSes that > don't work properly with the new MSR policy. Any OS that requires this > option to be enabled should be reported to > xen-devel@lists.xenproject.org. While the title says this is limited to HVM guests, I have to admit that I fail to see why this is, and hence I would have hoped for some clarification in the description. In particular I don't think my "guest in early boot" workaround, of which I posted v2 earlier today, can be assumed to be enough in the longer run. Recall that it relaxes behavior only when the guest didn't install a handler for #GP yet - this means it wouldn't help with any unguarded RDMSR the guest might issue later, with a handler already installed. > --- a/xen/arch/x86/domain.c > +++ b/xen/arch/x86/domain.c > @@ -760,6 +760,13 @@ int arch_domain_create(struct domain *d, > d->domain_id); > } > > + if ( config->arch.domain_flags & ~XEN_X86_RDMSR_RELAXED ) > + { > + printk(XENLOG_G_ERR "d%d: Invalid arch domain flags: %#x\n", > + d->domain_id, config->arch.domain_flags); > + return -EINVAL; > + } This would look to better go into arch_sanitise_domain_config(). And if the flag remains HVM-only, that aspect should then also be checked (i.e. the flag being set would then also need rejecting for PV guests). > --- a/xen/arch/x86/hvm/svm/svm.c > +++ b/xen/arch/x86/hvm/svm/svm.c > @@ -1795,6 +1795,7 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) > const struct domain *d = v->domain; > struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb; > const struct nestedsvm *nsvm = &vcpu_nestedsvm(v); > + uint64_t tmp; > > switch ( msr ) > { > @@ -1965,6 +1966,11 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) > break; > > default: > + if ( d->arch.hvm.rdmsr_relaxed && !rdmsr_safe(msr, tmp) ) > + { > + *msr_content = 0; > + break; > + } You don't really need "tmp" here, do you? You could as well read into *msr_content, as you're zapping the value afterwards anyway. > --- a/xen/include/asm-x86/hvm/domain.h > +++ b/xen/include/asm-x86/hvm/domain.h > @@ -122,6 +122,9 @@ struct hvm_domain { > > bool_t is_s3_suspended; > > + /* Don't unconditionally inject #GP for unhandled MSRs reads. */ > + bool rdmsr_relaxed; If, again, this is to remain HVM-only, then you insertion wants to honor the blank padding other field decls use. I'd also like to ask for your insertion to be moved up a few lines, to after "is_in_uc_mode". I have a patch queued already to also move "is_s3_suspended" into that hole; it's from November last year, so it looks like I simply forgot to post it. Jan
On 05.03.2021 11:56, Jan Beulich wrote: > On 04.03.2021 15:47, Roger Pau Monne wrote: >> --- a/xen/arch/x86/hvm/svm/svm.c >> +++ b/xen/arch/x86/hvm/svm/svm.c >> @@ -1795,6 +1795,7 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) >> const struct domain *d = v->domain; >> struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb; >> const struct nestedsvm *nsvm = &vcpu_nestedsvm(v); >> + uint64_t tmp; >> >> switch ( msr ) >> { >> @@ -1965,6 +1966,11 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) >> break; >> >> default: >> + if ( d->arch.hvm.rdmsr_relaxed && !rdmsr_safe(msr, tmp) ) >> + { >> + *msr_content = 0; >> + break; >> + } > > You don't really need "tmp" here, do you? You could as well read > into *msr_content, as you're zapping the value afterwards anyway. Actually, while perhaps indeed not strictly needed, it allows the compiler to produce better code, as it'll be able to recognize the value doesn't need writing to memory on any path. I guess I'll change the logic in my related patch along these lines then. Jan
On Fri, Mar 05, 2021 at 11:56:33AM +0100, Jan Beulich wrote: > On 04.03.2021 15:47, Roger Pau Monne wrote: > > --- a/xen/arch/x86/hvm/svm/svm.c > > +++ b/xen/arch/x86/hvm/svm/svm.c > > @@ -1795,6 +1795,7 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) > > const struct domain *d = v->domain; > > struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb; > > const struct nestedsvm *nsvm = &vcpu_nestedsvm(v); > > + uint64_t tmp; > > > > switch ( msr ) > > { > > @@ -1965,6 +1966,11 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) > > break; > > > > default: > > + if ( d->arch.hvm.rdmsr_relaxed && !rdmsr_safe(msr, tmp) ) > > + { > > + *msr_content = 0; > > + break; > > + } > > You don't really need "tmp" here, do you? You could as well read > into *msr_content, as you're zapping the value afterwards anyway. I also thought about doing this, but felt unease. I fear the code might be changed in the future and maybe msr_content is not zapped anymore, thus leaking the content. I feel it's safer to use a temporary variable that will never be returned to the guest. Maybe I'm just too paranoid. Thanks, Roger.
diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in index 040374dcd6..62178b9829 100644 --- a/docs/man/xl.cfg.5.pod.in +++ b/docs/man/xl.cfg.5.pod.in @@ -2861,6 +2861,23 @@ No MCA capabilities in above list are enabled. =back +=item B<rdmsr_relaxed=BOOLEAN> + +Select whether to use a relaxed behavior for read accesses to MSRs not +explicitly handled by Xen instead of injecting a #GP to the guest. Such access +mode will force Xen to replicate the behaviour from the hardware it's currently +running on in order to decide whether a #GP is injected to the guest. Note +that the guest is never allowed to read the value of unhandled MSRs, this +option only changes whether a #GP might be injected or not. + +This option is only relevant for HVM guests, and will be removed in future +releases once we are certain the default MSR access policy has been properly +tested by a wide variety of guests. If you need to use this option please send +a bug report to xen-devel@lists.xenproject.org with the details of the guests +you are running that require it. + +=back + =back =head1 SEE ALSO diff --git a/tools/include/libxl.h b/tools/include/libxl.h index a7b673e89d..1cc40a2d67 100644 --- a/tools/include/libxl.h +++ b/tools/include/libxl.h @@ -495,6 +495,14 @@ */ #define LIBXL_HAVE_VMTRACE_BUF_KB 1 +/* + * LIBXL_HAVE_RDMSR_RELAXED indicates the toolstack has support for switching + * the rdmsr handling in the hypervisor to relaxed mode, where #GP is only + * injected to guests for unhandled MSRs if accessing the MSR on the physical + * hardware also triggers a #GP. + */ +#define LIBXL_HAVE_RDMSR_RELAXED 1 + /* * libxl ABI compatibility * diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl index 5b85a7419f..03b0c80146 100644 --- a/tools/libs/light/libxl_types.idl +++ b/tools/libs/light/libxl_types.idl @@ -644,6 +644,8 @@ libxl_domain_build_info = Struct("domain_build_info",[ ("arch_arm", Struct(None, [("gic_version", libxl_gic_version), ("vuart", libxl_vuart_type), ])), + ("arch_x86", Struct(None, [("rdmsr_relaxed", libxl_defbool), + ])), # Alternate p2m is not bound to any architecture or guest type, as it is # supported by x86 HVM and ARM support is planned. ("altp2m", libxl_altp2m_mode), diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c index 58187ed760..c9cff44088 100644 --- a/tools/libs/light/libxl_x86.c +++ b/tools/libs/light/libxl_x86.c @@ -5,9 +5,12 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, libxl_domain_config *d_config, struct xen_domctl_createdomain *config) { + config->arch.domain_flags = 0; switch(d_config->c_info.type) { case LIBXL_DOMAIN_TYPE_HVM: config->arch.emulation_flags = (XEN_X86_EMU_ALL & ~XEN_X86_EMU_VPCI); + if (libxl_defbool_val(d_config->b_info.arch_x86.rdmsr_relaxed)) + config->arch.domain_flags |= XEN_X86_RDMSR_RELAXED; break; case LIBXL_DOMAIN_TYPE_PVH: config->arch.emulation_flags = XEN_X86_EMU_LAPIC; @@ -809,6 +812,7 @@ void libxl__arch_domain_build_info_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info) { libxl_defbool_setdefault(&b_info->acpi, true); + libxl_defbool_setdefault(&b_info->arch_x86.rdmsr_relaxed, false); } int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc, diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c index 1893cfc086..9f52c7e914 100644 --- a/tools/xl/xl_parse.c +++ b/tools/xl/xl_parse.c @@ -2636,6 +2636,13 @@ skip_usbdev: xlu_cfg_replace_string (config, "spice_streaming_video", &b_info->u.hvm.spice.streaming_video, 0); xlu_cfg_get_defbool(config, "nographic", &b_info->u.hvm.nographic, 0); + if (!xlu_cfg_get_defbool(config, "rdmsr_relaxed", + &b_info->arch_x86.rdmsr_relaxed, 0)) + fprintf(stderr, + "WARNING: rdmsr_relaxed will be removed in future versions.\n" + "If it fixes an issue you are having please report to " + "xen-devel@lists.xenproject.org.\n"); + if (!xlu_cfg_get_long(config, "gfx_passthru", &l, 1)) { libxl_defbool_set(&b_info->u.hvm.gfx_passthru, l); } else if (!xlu_cfg_get_string(config, "gfx_passthru", &buf, 0)) { diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 5e3c94d3fa..c06b17d338 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -760,6 +760,13 @@ int arch_domain_create(struct domain *d, d->domain_id); } + if ( config->arch.domain_flags & ~XEN_X86_RDMSR_RELAXED ) + { + printk(XENLOG_G_ERR "d%d: Invalid arch domain flags: %#x\n", + d->domain_id, config->arch.domain_flags); + return -EINVAL; + } + emflags = config->arch.emulation_flags; if ( is_hardware_domain(d) && is_pv_domain(d) ) @@ -824,6 +831,9 @@ int arch_domain_create(struct domain *d, { if ( (rc = hvm_domain_initialise(d)) != 0 ) goto fail; + + d->arch.hvm.rdmsr_relaxed = config->arch.domain_flags & + XEN_X86_RDMSR_RELAXED; } else if ( is_pv_domain(d) ) { diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c index b819897a4a..d036809bd3 100644 --- a/xen/arch/x86/hvm/svm/svm.c +++ b/xen/arch/x86/hvm/svm/svm.c @@ -1795,6 +1795,7 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) const struct domain *d = v->domain; struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb; const struct nestedsvm *nsvm = &vcpu_nestedsvm(v); + uint64_t tmp; switch ( msr ) { @@ -1965,6 +1966,11 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) break; default: + if ( d->arch.hvm.rdmsr_relaxed && !rdmsr_safe(msr, tmp) ) + { + *msr_content = 0; + break; + } gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", msr); goto gpf; } diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index bfea1b0f8a..883e43a0bb 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -3123,6 +3123,7 @@ static int is_last_branch_msr(u32 ecx) static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content) { struct vcpu *curr = current; + uint64_t tmp; HVM_DBG_LOG(DBG_LEVEL_MSR, "ecx=%#x", msr); @@ -3204,6 +3205,12 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content) break; } + if ( curr->domain->arch.hvm.rdmsr_relaxed && !rdmsr_safe(msr, tmp) ) + { + *msr_content = 0; + break; + } + gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", msr); goto gp_fault; } diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h index 7b60e9125f..fdc1b36513 100644 --- a/xen/include/asm-x86/hvm/domain.h +++ b/xen/include/asm-x86/hvm/domain.h @@ -122,6 +122,9 @@ struct hvm_domain { bool_t is_s3_suspended; + /* Don't unconditionally inject #GP for unhandled MSRs reads. */ + bool rdmsr_relaxed; + /* * TSC value that VCPUs use to calculate their tsc_offset value. * Used during initialization and save/restore. diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h index 629cb2ba40..fbf91bf3b9 100644 --- a/xen/include/public/arch-x86/xen.h +++ b/xen/include/public/arch-x86/xen.h @@ -304,6 +304,14 @@ struct xen_arch_domainconfig { XEN_X86_EMU_PIT | XEN_X86_EMU_USE_PIRQ |\ XEN_X86_EMU_VPCI) uint32_t emulation_flags; + +/* + * HVM only: select whether to use a relaxed behavior for read accesses to MSRs + * not explicitly handled by Xen instead of injecting a #GP to the guest. Note + * this option doesn't allow the guest to read the hardware value. + */ +#define XEN_X86_RDMSR_RELAXED (1u << 0) + uint32_t domain_flags; }; /* Location of online VCPU bitmap. */
Introduce an option to allow selecting a less strict behaviour for rdmsr accesses targeting a MSR not explicitly handled by Xen. Since commit 84e848fd7a162f669 accesses to MSRs not explicitly handled by Xen result in the injection of a #GP to the guest. This is a behavior change since previously a #GP was only injected if accessing the MSR on the real hardware will also trigger a #GP. This commit attempts to offer a fallback option similar to the previous behavior. Note however that the value of the underlying MSR is never leaked to the guest, as the newly introduced option only changes whether a #GP is injected or not. Long term the plan is to properly handle all the MSRs, so the option introduced here should be considered a temporary resort for OSes that don't work properly with the new MSR policy. Any OS that requires this option to be enabled should be reported to xen-devel@lists.xenproject.org. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- Changes since v1: - Only apply the option to HVM guests. - Only apply the special handling to MSR reads. - Sanitize the newly introduced flags field. - Print a warning message when the option is used. --- Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> --- Boris, could you please test with Solaris to see if this fixes the issue? I wonder whether we need to to enable this option by default for guests being migrated from previous Xen versions? Maybe that's not required as the option is helpful mostly for early boot I would assume, afterwards an OS should already have the #GP handler setup when accessing MSRs. From a release PoV the biggest risk would be breaking some of the existing MSR functionality. I think that's a necessary risk in order to offer such fallback option, or else we might discover after the release that guests that worked on Xen 4.14 don't work anymore in Xen 4.15. --- docs/man/xl.cfg.5.pod.in | 17 +++++++++++++++++ tools/include/libxl.h | 8 ++++++++ tools/libs/light/libxl_types.idl | 2 ++ tools/libs/light/libxl_x86.c | 4 ++++ tools/xl/xl_parse.c | 7 +++++++ xen/arch/x86/domain.c | 10 ++++++++++ xen/arch/x86/hvm/svm/svm.c | 6 ++++++ xen/arch/x86/hvm/vmx/vmx.c | 7 +++++++ xen/include/asm-x86/hvm/domain.h | 3 +++ xen/include/public/arch-x86/xen.h | 8 ++++++++ 10 files changed, 72 insertions(+)