Message ID | 0ac778dbcc7ab383447abe672225ff77b0d4802e.1736793323.git.teddy.astie@vates.tech (mailing list archive) |
---|---|
State | New |
Series | [XEN] intel/msr: Fix handling of MSR_RAPL_POWER_UNIT |
On Mon, Jan 13, 2025 at 06:42:44PM +0000, Teddy Astie wrote:
> Solaris 11.4 tries to access this MSR on some Intel platforms without properly
> setting up a proper #GP handler, which leads to a immediate crash.
>
> Emulate the access of this MSR by giving it a legal value (all values set to
> default, as defined by Intel SDM "RAPL Interfaces").
>
> Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')

Hm,

> Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
> ---
> Does it have a risk of negatively affecting other operating systems expecting
> this MSR read to fail ?
> ---
>  xen/arch/x86/include/asm/msr-index.h |  2 ++
>  xen/arch/x86/msr.c                   | 16 ++++++++++++++++
>  2 files changed, 18 insertions(+)
>
> diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
> index 9cdb5b2625..2adcdf344f 100644
> --- a/xen/arch/x86/include/asm/msr-index.h
> +++ b/xen/arch/x86/include/asm/msr-index.h
> @@ -144,6 +144,8 @@
>  #define MSR_RTIT_ADDR_A(n) (0x00000580 + (n) * 2)
>  #define MSR_RTIT_ADDR_B(n) (0x00000581 + (n) * 2)
>
> +#define MSR_RAPL_POWER_UNIT 0x00000606
> +
>  #define MSR_U_CET 0x000006a0
>  #define MSR_S_CET 0x000006a2
>  #define CET_SHSTK_EN (_AC(1, ULL) << 0)
> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> index 289cf10b78..b14d42dacf 100644
> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -169,6 +169,22 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>          if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 )
>              break;
>          goto gp_fault;
> +

Trailing spaces in the added newline.

> +    /*
> +     * Solaris 11.4 DomU tries to use read this MSR without setting up a
> +     * proper #GP handler leading to a crash. Emulate this MSR by giving a
> +     * legal value.
> +     */

The comment should be after (inside) the case statement IMO (but not a strong opinion).

Could you also raise a bug with Solaris and put a link to the bug report here, so that we have a reference to it?

> +    case MSR_RAPL_POWER_UNIT:
> +        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )

Has Centaur ever released a CPU with RAPL?

> +            goto gp_fault;
> +
> +        /*
> +         * Return a legal register content with all default values defined in
> +         * Intel Architecture Software Developer Manual 16.10.1 RAPL Interfaces
> +         */
> +        *val = 0x0000A1003;

The SPR Specification defines the default as 000A0E03h:

 * SDM: Energy Status Units (bits 12:8): Energy related information (in Joules) is based on the multiplier, 1/2^ESU; where ESU is an unsigned integer represented by bits 12:8. Default value is 10000b, indicating energy status unit is in 15.3 micro-Joules increment.

 * SPR: Energy Units (ENERGY_UNIT): Energy Units used for power control registers. The actual unit value is calculated by 1 J / Power(2,ENERGY_UNIT). The default value of 14 corresponds to Ux.14 number.

Note that KVM just returns all 0s [0], so we might consider doing the same, as otherwise that could lead OSes to poke at further RAPL related MSRs if the returned value from MSR_RAPL_POWER_UNIT looks plausible.

[0] https://elixir.bootlin.com/linux/v6.12.6/source/arch/x86/kvm/x86.c#L4236

Thanks.
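To make the difference between the two candidate defaults easier to see, here is a minimal sketch that splits a MSR_RAPL_POWER_UNIT value into its three unit fields. The field positions (power units in bits 3:0, energy status units in bits 12:8, time units in bits 19:16) follow the SDM text quoted above; the names `rapl_units`, `rapl_decode` and `rapl_unit_micro` are made up for this illustration and are not Xen or Linux APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: unit exponents held in MSR_RAPL_POWER_UNIT (0x606). */
struct rapl_units {
    unsigned int power;   /* bits 3:0:   unit is 1 / 2^power  watts   */
    unsigned int energy;  /* bits 12:8:  unit is 1 / 2^energy joules  */
    unsigned int time;    /* bits 19:16: unit is 1 / 2^time   seconds */
};

/* Extract the three exponent fields from a raw register value. */
static struct rapl_units rapl_decode(uint64_t val)
{
    struct rapl_units u = {
        .power  = (unsigned int)(val & 0xf),
        .energy = (unsigned int)((val >> 8) & 0x1f),
        .time   = (unsigned int)((val >> 16) & 0xf),
    };

    return u;
}

/* Size of one unit, in millionths (e.g. microjoules), for a given exponent. */
static double rapl_unit_micro(unsigned int exp)
{
    assert(exp < 32);
    return 1e6 / (double)(1u << exp);
}
```

Decoding 0xA1003 (the value in the patch) gives an energy exponent of 16, i.e. roughly 15.3 microjoules per unit, while 0xA0E03 gives an exponent of 14, i.e. roughly 61 microjoules per unit. The power and time exponents (3 and 10) are the same in both values.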
On 13/01/2025 6:42 pm, Teddy Astie wrote:
> Solaris 11.4 tries

Is it only Solaris 11.4, or is that simply the one repro you had?

Have you reported a bug?

> to access this MSR on some Intel platforms without properly
> setting up a proper #GP handler, which leads to a immediate crash.

Minor grammar note. Either "without a proper #GP handler" or "without properly setting up a #GP handler", but having two proper(ly)'s in there is less than ideal.

> Emulate the access of this MSR by giving it a legal value (all values set to
> default, as defined by Intel SDM "RAPL Interfaces").
>
> Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
> Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
> ---
> Does it have a risk of negatively affecting other operating systems expecting
> this MSR read to fail?

It's Complicated.

RAPL is a non-architectural feature (on Intel; AMD did it properly). It does not have a CPUID bit to announce the presence of the MSRs.

Therefore OSes use a mixture of model numbers and {wr,rd}msr_safe() to probe.

I expect this will change the behaviour of Linux.

~Andrew
On Tue, Jan 14, 2025 at 10:32:25AM +0100, Roger Pau Monné wrote:
> On Mon, Jan 13, 2025 at 06:42:44PM +0000, Teddy Astie wrote:
> > Solaris 11.4 tries to access this MSR on some Intel platforms without properly
> > setting up a proper #GP handler, which leads to a immediate crash.
> >
> > Emulate the access of this MSR by giving it a legal value (all values set to
> > default, as defined by Intel SDM "RAPL Interfaces").
> >
> > Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')

Nit: I think we usually use 12 hex character hashes, the above one is 11 characters long.

>
> Hm,

Seems like I've sent this too early. I wanted to say I wasn't convinced this is a fix for the above, but I can see how the change can be seen as a regression if Solaris booted before that change in behavior, so I'm fine with leaving the "Fixes:" tag.

Thanks, Roger.
diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
index 9cdb5b2625..2adcdf344f 100644
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -144,6 +144,8 @@
 #define MSR_RTIT_ADDR_A(n) (0x00000580 + (n) * 2)
 #define MSR_RTIT_ADDR_B(n) (0x00000581 + (n) * 2)
 
+#define MSR_RAPL_POWER_UNIT 0x00000606
+
 #define MSR_U_CET 0x000006a0
 #define MSR_S_CET 0x000006a2
 #define CET_SHSTK_EN (_AC(1, ULL) << 0)
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 289cf10b78..b14d42dacf 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -169,6 +169,22 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
         if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 )
             break;
         goto gp_fault;
+
+    /*
+     * Solaris 11.4 DomU tries to use read this MSR without setting up a
+     * proper #GP handler leading to a crash. Emulate this MSR by giving a
+     * legal value.
+     */
+    case MSR_RAPL_POWER_UNIT:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        /*
+         * Return a legal register content with all default values defined in
+         * Intel Architecture Software Developer Manual 16.10.1 RAPL Interfaces
+         */
+        *val = 0x0000A1003;
+        break;
 
     case MSR_IA32_THERM_STATUS:
         if ( cp->x86_vendor != X86_VENDOR_INTEL )
Solaris 11.4 tries to access this MSR on some Intel platforms without properly
setting up a proper #GP handler, which leads to a immediate crash.

Emulate the access of this MSR by giving it a legal value (all values set to
default, as defined by Intel SDM "RAPL Interfaces").

Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
Does it have a risk of negatively affecting other operating systems expecting
this MSR read to fail ?
---
 xen/arch/x86/include/asm/msr-index.h |  2 ++
 xen/arch/x86/msr.c                   | 16 ++++++++++++++++
 2 files changed, 18 insertions(+)