diff mbox series

[v2] KVM: x86/pt: Ignore all unknown Intel PT capabilities

Message ID 20220110034747.30498-1-likexu@tencent.com (mailing list archive)
State New, archived
Headers show
Series [v2] KVM: x86/pt: Ignore all unknown Intel PT capabilities | expand

Commit Message

Like Xu Jan. 10, 2022, 3:47 a.m. UTC
From: Like Xu <likexu@tencent.com>

Some of the new Intel PT capabilities (e.g. SDM Vol3, 32.2.4 Event
Tracing, it exposes details about the asynchronous events, when they are
generated, and when their corresponding software event handler completes
execution) cannot be safely and fully emulated by the KVM, especially
emulating the simultaneous writing of guest PT packets generated by
the KVM to the guest PT buffer.

For KVM, it's better to advertise currently supported features based on
the "static struct pt_cap_desc" implemented in the host PT driver and
ignore _all_ unknown features before they have been investigated one by
one and supported in a safe manner, leaving the rest as system-wide-only
tracing capabilities.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
v1 -> v2 Changelog:
- Be safe and ignore _all_ unknown capabilities. (Paolo)

Previous:
https://lore.kernel.org/kvm/20220106085533.84356-1-likexu@tencent.com/

 arch/x86/kvm/cpuid.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Sean Christopherson Jan. 11, 2022, 12:57 a.m. UTC | #1
On Mon, Jan 10, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> Some of the new Intel PT capabilities (e.g. SDM Vol3, 32.2.4 Event
> Tracing, it exposes details about the asynchronous events, when they are
> generated, and when their corresponding software event handler completes
> execution) cannot be safely and fully emulated by the KVM, especially
> emulating the simultaneous writing of guest PT packets generated by
> the KVM to the guest PT buffer.
> 
> For KVM, it's better to advertise currently supported features based on
> the "static struct pt_cap_desc" implemented in the host PT driver and
> ignore _all_ unknown features before they have been investigated one by
> one and supported in a safe manner, leaving the rest as system-wide-only
> tracing capabilities.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
> v1 -> v2 Changelog:
> - Be safe and ignore _all_ unknown capabilities. (Paolo)
> 
> Previous:
> https://lore.kernel.org/kvm/20220106085533.84356-1-likexu@tencent.com/
> 
>  arch/x86/kvm/cpuid.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0b920e12bb6d..439b93359848 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -901,6 +901,8 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>  			break;
>  		}
>  
> +		/* It's better to be safe and ignore _all_ unknown capabilities. */

No need to justify why unknown capabilities are hidden as that's very much (supposed
to be) standard KVM behavior.

> +		entry->ebx &= GENMASK(5, 0);

Please add a #define somewhere so that this is self-documenting, e.g. see
KVM_SUPPORTED_XCR0.

And why just EBX?  ECX appears to enumerate features too, and EDX is presumably
reserved to enumerate yet more features when EBX/ECX run out of bits.

And is there any possibility of a malicious user/guest using features to cause
problems in the host?  I.e. does KVM need to enforce that the guest can't enable
any unsupported features?

>  		for (i = 1, max_idx = entry->eax; i <= max_idx; ++i) {
>  			if (!do_host_cpuid(array, function, i))
>  				goto out;
> -- 
> 2.33.1
>
Like Xu Jan. 11, 2022, 4:20 a.m. UTC | #2
On 11/1/2022 8:57 am, Sean Christopherson wrote:
> On Mon, Jan 10, 2022, Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> Some of the new Intel PT capabilities (e.g. SDM Vol3, 32.2.4 Event
>> Tracing, it exposes details about the asynchronous events, when they are
>> generated, and when their corresponding software event handler completes
>> execution) cannot be safely and fully emulated by the KVM, especially
>> emulating the simultaneous writing of guest PT packets generated by
>> the KVM to the guest PT buffer.
>>
>> For KVM, it's better to advertise currently supported features based on
>> the "static struct pt_cap_desc" implemented in the host PT driver and
>> ignore _all_ unknown features before they have been investigated one by
>> one and supported in a safe manner, leaving the rest as system-wide-only
>> tracing capabilities.
>>
>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
>> v1 -> v2 Changelog:
>> - Be safe and ignore _all_ unknown capabilities. (Paolo)
>>
>> Previous:
>> https://lore.kernel.org/kvm/20220106085533.84356-1-likexu@tencent.com/
>>
>>   arch/x86/kvm/cpuid.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 0b920e12bb6d..439b93359848 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -901,6 +901,8 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>>   			break;
>>   		}
>>   
>> +		/* It's better to be safe and ignore _all_ unknown capabilities. */
> 
> No need to justify why unknown capabilities are hidden as that's very much (supposed
> to be) standard KVM behavior.
> 
>> +		entry->ebx &= GENMASK(5, 0);
> 
> Please add a #define somewhere so that this is self-documenting, e.g. see
> KVM_SUPPORTED_XCR0.

How about we define this macro in the <asm/intel_pt.h> so that the next PT 
capability
enabler can update the mask with minimal effort, considering that many pure kernel
developers don't care about KVM code ?

> 
> And why just EBX?  ECX appears to enumerate features too, and EDX is presumably
> reserved to enumerate yet more features when EBX/ECX run out of bits.

Yes, how about this version:

diff --git a/arch/x86/include/asm/intel_pt.h b/arch/x86/include/asm/intel_pt.h
index ebe8d2ea44fe..da94d0eeb9df 100644
--- a/arch/x86/include/asm/intel_pt.h
+++ b/arch/x86/include/asm/intel_pt.h
@@ -24,6 +24,12 @@ enum pt_capabilities {
  	PT_CAP_psb_periods,
  };

+#define GUEST_SUPPORTED_CPUID_14_EBX	\
+	(BIT(0) | BIT(1) | BIT(2) | BIT(3) | BIT(4) | BIT(5))
+
+#define GUEST_SUPPORTED_CPUID_14_ECX	\
+	(BIT(0) | BIT(1) | BIT(2) | BIT(3) | BIT(31))
+
  #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
  void cpu_emergency_stop_pt(void);
  extern u32 intel_pt_validate_hw_cap(enum pt_capabilities cap);
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0b920e12bb6d..be8c9170f98e 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -19,6 +19,7 @@
  #include <asm/user.h>
  #include <asm/fpu/xstate.h>
  #include <asm/sgx.h>
+#include <asm/intel_pt.h>
  #include "cpuid.h"
  #include "lapic.h"
  #include "mmu.h"
@@ -900,7 +901,10 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array 
*array, u32 function)
  			entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
  			break;
  		}
-
+		entry->eax = min(entry->eax, 1u);
+		entry->ebx &= GUEST_SUPPORTED_CPUID_14_EBX;
+		entry->ecx &= GUEST_SUPPORTED_CPUID_14_ECX;
+		entry->edx = 0;
  		for (i = 1, max_idx = entry->eax; i <= max_idx; ++i) {
  			if (!do_host_cpuid(array, function, i))
  				goto out;

> 
> And is there any possibility of a malicious user/guest using features to cause
> problems in the host?  I.e. does KVM need to enforce that the guest can't enable
> any unsupported features?

If a user space is set up with features not supported by KVM, it owns the risk 
itself.

AFAI, the guest Intel PT introduces a great attack interface for the host and
we only use the guest supported PT features in a highly trusted environment.

I agree that more uncertainty and fixes can be triggered in the security motive,
not expecting too much from this patch. :D

> 
>>   		for (i = 1, max_idx = entry->eax; i <= max_idx; ++i) {
>>   			if (!do_host_cpuid(array, function, i))
>>   				goto out;
>> -- 
>> 2.33.1
>>
Like Xu Jan. 11, 2022, 6:24 a.m. UTC | #3
On 11/1/2022 12:20 pm, Like Xu wrote:
>> And is there any possibility of a malicious user/guest using features to cause
>> problems in the host?  I.e. does KVM need to enforce that the guest can't enable
>> any unsupported features?
> 
> If a user space is set up with features not supported by KVM, it owns the risk 
> itself.

I seem to have misunderstood it. KVM should prevent and stop any malicious guest
from destroying other parts on the host, is this the right direction ?

> 
> AFAI, the guest Intel PT introduces a great attack interface for the host and
> we only use the guest supported PT features in a highly trusted environment.
> 
> I agree that more uncertainty and fixes can be triggered in the security motive,
> not expecting too much from this patch. :D
Xiaoyao Li Jan. 11, 2022, 7:59 a.m. UTC | #4
On 1/11/2022 12:20 PM, Like Xu wrote:
> On 11/1/2022 8:57 am, Sean Christopherson wrote:
>> On Mon, Jan 10, 2022, Like Xu wrote:
>>> From: Like Xu <likexu@tencent.com>
>>>
>>> Some of the new Intel PT capabilities (e.g. SDM Vol3, 32.2.4 Event
>>> Tracing, it exposes details about the asynchronous events, when they are
>>> generated, and when their corresponding software event handler completes
>>> execution) cannot be safely and fully emulated by the KVM, especially
>>> emulating the simultaneous writing of guest PT packets generated by
>>> the KVM to the guest PT buffer.
>>>
>>> For KVM, it's better to advertise currently supported features based on
>>> the "static struct pt_cap_desc" implemented in the host PT driver and
>>> ignore _all_ unknown features before they have been investigated one by
>>> one and supported in a safe manner, leaving the rest as system-wide-only
>>> tracing capabilities.
>>>
>>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
>>> Signed-off-by: Like Xu <likexu@tencent.com>
>>> ---
>>> v1 -> v2 Changelog:
>>> - Be safe and ignore _all_ unknown capabilities. (Paolo)
>>>
>>> Previous:
>>> https://lore.kernel.org/kvm/20220106085533.84356-1-likexu@tencent.com/
>>>
>>>   arch/x86/kvm/cpuid.c | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>> index 0b920e12bb6d..439b93359848 100644
>>> --- a/arch/x86/kvm/cpuid.c
>>> +++ b/arch/x86/kvm/cpuid.c
>>> @@ -901,6 +901,8 @@ static inline int __do_cpuid_func(struct 
>>> kvm_cpuid_array *array, u32 function)
>>>               break;
>>>           }
>>> +        /* It's better to be safe and ignore _all_ unknown 
>>> capabilities. */
>>
>> No need to justify why unknown capabilities are hidden as that's very 
>> much (supposed
>> to be) standard KVM behavior.
>>
>>> +        entry->ebx &= GENMASK(5, 0);
>>
>> Please add a #define somewhere so that this is self-documenting, e.g. see
>> KVM_SUPPORTED_XCR0.
> 
> How about we define this macro in the <asm/intel_pt.h> so that the next 
> PT capability
> enabler can update the mask with minimal effort, considering that many 
> pure kernel
> developers don't care about KVM code ?
> 
>>
>> And why just EBX?  ECX appears to enumerate features too, and EDX is 
>> presumably
>> reserved to enumerate yet more features when EBX/ECX run out of bits.
> 
> Yes, how about this version:
> 
> diff --git a/arch/x86/include/asm/intel_pt.h 
> b/arch/x86/include/asm/intel_pt.h
> index ebe8d2ea44fe..da94d0eeb9df 100644
> --- a/arch/x86/include/asm/intel_pt.h
> +++ b/arch/x86/include/asm/intel_pt.h
> @@ -24,6 +24,12 @@ enum pt_capabilities {
>       PT_CAP_psb_periods,
>   };
> 
> +#define GUEST_SUPPORTED_CPUID_14_EBX    \
> +    (BIT(0) | BIT(1) | BIT(2) | BIT(3) | BIT(4) | BIT(5))
> +
> +#define GUEST_SUPPORTED_CPUID_14_ECX    \
> +    (BIT(0) | BIT(1) | BIT(2) | BIT(3) | BIT(31))
> +

I doubt BIT(3) of CPUID_14_ECX can be exposed to guest directly.

It means "output to Trace Transport Subsystem Supported". If I 
understand correctly, it at least needs passthrough of the said 
Transport Subsystem or emulation of it.

>   #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
>   void cpu_emergency_stop_pt(void);
>   extern u32 intel_pt_validate_hw_cap(enum pt_capabilities cap);
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0b920e12bb6d..be8c9170f98e 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -19,6 +19,7 @@
>   #include <asm/user.h>
>   #include <asm/fpu/xstate.h>
>   #include <asm/sgx.h>
> +#include <asm/intel_pt.h>
>   #include "cpuid.h"
>   #include "lapic.h"
>   #include "mmu.h"
> @@ -900,7 +901,10 @@ static inline int __do_cpuid_func(struct 
> kvm_cpuid_array *array, u32 function)
>               entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
>               break;
>           }
> -
> +        entry->eax = min(entry->eax, 1u);
> +        entry->ebx &= GUEST_SUPPORTED_CPUID_14_EBX;
> +        entry->ecx &= GUEST_SUPPORTED_CPUID_14_ECX;
> +        entry->edx = 0;
>           for (i = 1, max_idx = entry->eax; i <= max_idx; ++i) {
>               if (!do_host_cpuid(array, function, i))
>                   goto out;
Like Xu Jan. 11, 2022, 8:15 a.m. UTC | #5
On 11/1/2022 3:59 pm, Xiaoyao Li wrote:
>>
>> +#define GUEST_SUPPORTED_CPUID_14_EBX    \
>> +    (BIT(0) | BIT(1) | BIT(2) | BIT(3) | BIT(4) | BIT(5))
>> +
>> +#define GUEST_SUPPORTED_CPUID_14_ECX    \
>> +    (BIT(0) | BIT(1) | BIT(2) | BIT(3) | BIT(31))
>> +
> 
> I doubt BIT(3) of CPUID_14_ECX can be exposed to guest directly.
> 
> It means "output to Trace Transport Subsystem Supported". If I understand 
> correctly, it at least needs passthrough of the said Transport Subsystem or 
> emulation of it.

I'm not surprised that we can route Intel Guest PT output to a platform-specific
trace endpoint (e.g., physical or emulated JTAG) as an MMIO debug port.
diff mbox series

Patch

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0b920e12bb6d..439b93359848 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -901,6 +901,8 @@  static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 			break;
 		}
 
+		/* It's better to be safe and ignore _all_ unknown capabilities. */
+		entry->ebx &= GENMASK(5, 0);
 		for (i = 1, max_idx = entry->eax; i <= max_idx; ++i) {
 			if (!do_host_cpuid(array, function, i))
 				goto out;