diff mbox series

[RFC,02/22] x86/msr: implement MSR_SMI_COUNT for Dom0 on Intel

Message ID 9d950b3c5502b5fb5fad62845b56b15d1bacc2d6.1698261255.git.edwin.torok@cloud.com (mailing list archive)
State New, archived
Headers show
Series vPMU bugfixes and support for PMUv5 | expand

Commit Message

Edwin Török Oct. 25, 2023, 7:29 p.m. UTC
From: Edwin Török <edvin.torok@citrix.com>

Dom0 should always be able to read this MSR: it is useful when
investigating performance issues in production.
Although the count is Thread scoped, in practice all cores were observed
to return the same count (perhaps due to implementation details of SMM),
so do not require the cpu to be pinned in order to read it.

This MSR exists on Intel since Nehalem.

Backport: 4.15+

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
 xen/arch/x86/include/asm/msr-index.h | 3 +++
 xen/arch/x86/msr.c                   | 7 +++++++
 2 files changed, 10 insertions(+)

Comments

Jan Beulich Oct. 30, 2023, 4:20 p.m. UTC | #1
On 25.10.2023 21:29, Edwin Török wrote:
> Dom0 should always be able to read this MSR: it is useful when
> investigating performance issues in production.

While I'm not outright opposed, I'm also not convinced. At the very least
...

> Although the count is Thread scoped, in practice all cores were observed
> to return the same count (perhaps due to implementation details of SMM),
> so do not require the cpu to be pinned in order to read it.

... this, even if matching your observations, is going to be properly
misleading in case counts end up diverging.

> This MSR exists on Intel since Nehalem.
> 
> Backport: 4.15+

If this was a backporting candidate, I think a Fixes: tag would need
to indicate what's being fixed here. Otherwise this is merely a new
feature.

> --- a/xen/arch/x86/include/asm/msr-index.h
> +++ b/xen/arch/x86/include/asm/msr-index.h
> @@ -641,6 +641,9 @@
>  #define MSR_NHL_LBR_SELECT		0x000001c8
>  #define MSR_NHL_LASTBRANCH_TOS		0x000001c9
>  
> +/* Nehalem and newer other MSRs */
> +#define MSR_SMI_COUNT                       0x00000034

See my comment on the other patch regarding additions here.

> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -139,6 +139,13 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>          *val = msrs->misc_features_enables.raw;
>          break;
>  
> +    case MSR_SMI_COUNT:
> +        if ( cp->x86_vendor != X86_VENDOR_INTEL )
> +            goto gp_fault;
> +        if ( is_hardware_domain(d) && !rdmsr_safe(msr, *val) )
> +            break;
> +        return X86EMUL_UNHANDLEABLE;

Why #GP for non-Intel but UNHANDLEABLE for DomU?

Jan
Edwin Török Oct. 31, 2023, 9:42 a.m. UTC | #2
> On 30 Oct 2023, at 16:20, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 25.10.2023 21:29, Edwin Török wrote:
>> Dom0 should always be able to read this MSR: it is useful when
>> investigating performance issues in production.
> 
> While I'm not outright opposed, I'm also not convinced. At the very least
> ...
> 
>> Although the count is Thread scoped, in practice all cores were observed
>> to return the same count (perhaps due to implementation details of SMM),
>> so do not require the cpu to be pinned in order to read it.
> 
> ... this, even if matching your observations, is going to be properly
> misleading in case counts end up diverging.
> 
>> This MSR exists on Intel since Nehalem.
>> 
>> Backport: 4.15+
> 
> If this was a backporting candidate, I think a Fixes: tag would need
> to indicate what's being fixed here.


I used the Backport tag to indicate what is the oldest release that it is backportable to.
IIRC the choices were:
* 4.0+ for issues that were present for a long time (didn't look further back than that in history), so there isn't any particular commit that introduces the problem, it was like that from the very beginning, i.e. not a regression.

* 4.13+ for issues affecting only new CPU support (I think that is the release that added Icelake support). I can attempt to find the commit that added Icelake support and mark them as Fixes: that commit (although there might be several of them)

* 4.15+ for bugs introduced by the default read-write msr changes


> Otherwise this is merely a new
> feature.
> 

Prior to the default rdwrmsr changes it was possible to read any MSR, so I consider it a bug that after the default rdwrmsr changes you can no longer do that, it takes away a valuable debugging tool.
I can attempt to find a more specific commit to indicate with Fixes:

>> --- a/xen/arch/x86/include/asm/msr-index.h
>> +++ b/xen/arch/x86/include/asm/msr-index.h
>> @@ -641,6 +641,9 @@
>> #define MSR_NHL_LBR_SELECT 0x000001c8
>> #define MSR_NHL_LASTBRANCH_TOS 0x000001c9
>> 
>> +/* Nehalem and newer other MSRs */
>> +#define MSR_SMI_COUNT                       0x00000034
> 
> See my comment on the other patch regarding additions here.
> 
>> --- a/xen/arch/x86/msr.c
>> +++ b/xen/arch/x86/msr.c
>> @@ -139,6 +139,13 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>>         *val = msrs->misc_features_enables.raw;
>>         break;
>> 
>> +    case MSR_SMI_COUNT:
>> +        if ( cp->x86_vendor != X86_VENDOR_INTEL )
>> +            goto gp_fault;
>> +        if ( is_hardware_domain(d) && !rdmsr_safe(msr, *val) )
>> +            break;
>> +        return X86EMUL_UNHANDLEABLE;
> 
> Why #GP for non-Intel but UNHANDLEABLE for DomU?

I wanted to match the behaviour of the 'default:' case statement, although looking at the other MSR handling code in this file they just usually gp_fault,
with the exception of the MSR_K8* code that returns UNHANDLEABLE, so if condition here could be simplified (e.g. follow how MSR_AMD_PATCHLEVEL does it).

Best regards,
--Edwin 

> 
> Jan
Jan Beulich Oct. 31, 2023, 9:57 a.m. UTC | #3
On 31.10.2023 10:42, Edwin Torok wrote:
>> On 30 Oct 2023, at 16:20, Jan Beulich <jbeulich@suse.com> wrote:
>> On 25.10.2023 21:29, Edwin Török wrote:
>>> Dom0 should always be able to read this MSR: it is useful when
>>> investigating performance issues in production.
>>
>> While I'm not outright opposed, I'm also not convinced. At the very least
>> ...
>>
>>> Although the count is Thread scoped, in practice all cores were observed
>>> to return the same count (perhaps due to implementation details of SMM),
>>> so do not require the cpu to be pinned in order to read it.
>>
>> ... this, even if matching your observations, is going to be properly
>> misleading in case counts end up diverging.
>>
>>> This MSR exists on Intel since Nehalem.
>>>
>>> Backport: 4.15+
>>
>> If this was a backporting candidate, I think a Fixes: tag would need
>> to indicate what's being fixed here.
> 
> 
> I used the Backport tag to indicate what is the oldest release that it is backportable to.
> IIRC the choices were:
> * 4.0+ for issues that were present for a long time (didn't look further back than that in history), so there isn't any particular commit that introduces the problem, it was like that from the very beginning, i.e. not a regression.
> 
> * 4.13+ for issues affecting only new CPU support (I think that is the release that added Icelake support). I can attempt to find the commit that added Icelake support and mark them as Fixes: that commit (although there might be several of them)
> 
> * 4.15+ for bugs introduced by the default read-write msr changes
> 
> 
>> Otherwise this is merely a new
>> feature.
>>
> 
> Prior to the default rdwrmsr changes it was possible to read any MSR, so I consider it a bug that after the default rdwrmsr changes you can no longer do that, it takes away a valuable debugging tool.

As said elsewhere, making MSRs generally inaccessible was a deliberate change.
I don't think any of us x86 maintainers has so far considered that as introducing
a bug. MSRs being accessible as a debugging tool may be worthwhile to have as an
optional feature (see my suggestion elsewhere as to a possible way to approach
this), but I don't think this can be taken as an indication that we should revert
back to "blind" exposure.

Jan
Edwin Török Nov. 1, 2023, 9:16 a.m. UTC | #4
> On 31 Oct 2023, at 09:57, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 31.10.2023 10:42, Edwin Torok wrote:
>>> On 30 Oct 2023, at 16:20, Jan Beulich <jbeulich@suse.com> wrote:
>>> On 25.10.2023 21:29, Edwin Török wrote:
>>>> Dom0 should always be able to read this MSR: it is useful when
>>>> investigating performance issues in production.
>>> 
>>> While I'm not outright opposed, I'm also not convinced. At the very least
>>> ...
>>> 
>>>> Although the count is Thread scoped, in practice all cores were observed
>>>> to return the same count (perhaps due to implementation details of SMM),
>>>> so do not require the cpu to be pinned in order to read it.
>>> 
>>> ... this, even if matching your observations, is going to be properly
>>> misleading in case counts end up diverging.
>>> 
>>>> This MSR exists on Intel since Nehalem.
>>>> 
>>>> Backport: 4.15+
>>> 
>>> If this was a backporting candidate, I think a Fixes: tag would need
>>> to indicate what's being fixed here.
>> 
>> 
>> I used the Backport tag to indicate what is the oldest release that it is backportable to.
>> IIRC the choices were:
>> * 4.0+ for issues that were present for a long time (didn't look further back than that in history), so there isn't any particular commit that introduces the problem, it was like that from the very beginning, i.e. not a regression.
>> 
>> * 4.13+ for issues affecting only new CPU support (I think that is the release that added Icelake support). I can attempt to find the commit that added Icelake support and mark them as Fixes: that commit (although there might be several of them)
>> 
>> * 4.15+ for bugs introduced by the default read-write msr changes
>> 
>> 
>>> Otherwise this is merely a new
>>> feature.
>>> 
>> 
>> Prior to the default rdwrmsr changes it was possible to read any MSR, so I consider it a bug that after the default rdwrmsr changes you can no longer do that, it takes away a valuable debugging tool.
> 
> As said elsewhere, making MSRs generally inaccessible was a deliberate change.
> I don't think any of us x86 maintainers has so far considered that as introducing
> a bug. MSRs being accessible as a debugging tool may be worthwhile to have as an
> optional feature (see my suggestion elsewhere as to a possible way to approach
> this), but I don't think this can be taken as an indication that we should revert
> back to "blind" exposure.
> 


This particular patch (unlike the other one that is more general for all MSRs) is about one specific MSR, and makes it accessible to Dom0 only in a read-only way.
That is very different from blindly exposing it to all guests.
Anyway let me get something ready for master first, and then distributions that package Xen can decide whether to backport this patch or not, I'll replace Backport: 4.X with Backportable: 4.X (and add a Fixes: line) to indicate that this patch is recommended to be backported (for anyone who uses that version of Xen in production).

[I'm not saying that this patch is complete, as you've indicated there is more work required to make it work correctly wrt to pinning vs non-pinning, and I'll try to require a pinned Dom0 core in my next version]

Best regards,
--Edwin
diff mbox series

Patch

diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
index 82a81bd0a2..2853a276ca 100644
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -641,6 +641,9 @@ 
 #define MSR_NHL_LBR_SELECT		0x000001c8
 #define MSR_NHL_LASTBRANCH_TOS		0x000001c9
 
+/* Nehalem and newer other MSRs */
+#define MSR_SMI_COUNT                       0x00000034
+
 /* Skylake (and newer) last-branch recording */
 #define MSR_SKL_LASTBRANCH_0_FROM_IP	0x00000680
 #define MSR_SKL_LASTBRANCH_0_TO_IP	0x000006c0
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index c33dc78cd8..0bf6d263e7 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -139,6 +139,13 @@  int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
         *val = msrs->misc_features_enables.raw;
         break;
 
+    case MSR_SMI_COUNT:
+        if ( cp->x86_vendor != X86_VENDOR_INTEL )
+            goto gp_fault;
+        if ( is_hardware_domain(d) && !rdmsr_safe(msr, *val) )
+            break;
+        return X86EMUL_UNHANDLEABLE;
+
     case MSR_P5_MC_ADDR:
     case MSR_P5_MC_TYPE:
     case MSR_IA32_MCG_CAP     ... MSR_IA32_MCG_CTL:      /* 0x179 -> 0x17b */