diff mbox series

KVM: nVMX: Always enable TSC scaling for L2 when it was enabled for L1

Message ID 20220712135009.952805-1-vkuznets@redhat.com (mailing list archive)
State New, archived
Headers show
Series KVM: nVMX: Always enable TSC scaling for L2 when it was enabled for L1 | expand

Commit Message

Vitaly Kuznetsov July 12, 2022, 1:50 p.m. UTC
Windows 10/11 guests with Hyper-V role (WSL2) enabled are observed to
hang upon boot or shortly after when a non-default TSC frequency was
set for L1. The issue is observed on a host where TSC scaling is
supported. The problem appears to be that Windows doesn't use TSC
frequency for its guests even when the feature is advertised and KVM
filters SECONDARY_EXEC_TSC_SCALING out when creating L2 controls from
L1's. This leads to L2 running with the default frequency (matching
host's) while L1 is running with an altered one.

Keep SECONDARY_EXEC_TSC_SCALING in secondary exec controls for L2 when
it was set for L1. TSC_MULTIPLIER is already correctly computed and
written by prepare_vmcs02().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 1 -
 1 file changed, 1 deletion(-)

Comments

Maxim Levitsky July 12, 2022, 2:27 p.m. UTC | #1
On Tue, 2022-07-12 at 15:50 +0200, Vitaly Kuznetsov wrote:
> Windows 10/11 guests with Hyper-V role (WSL2) enabled are observed to
> hang upon boot or shortly after when a non-default TSC frequency was
> set for L1. The issue is observed on a host where TSC scaling is
> supported. The problem appears to be that Windows doesn't use TSC
> frequency for its guests even when the feature is advertised and KVM
> filters SECONDARY_EXEC_TSC_SCALING out when creating L2 controls from
> L1's. This leads to L2 running with the default frequency (matching
> host's) while L1 is running with an altered one.

Ouch.

I guess that needs a Fixes tag?

Fixes: d041b5ea93352b ("KVM: nVMX: Enable nested TSC scaling")

Also this is thankfully Intel specific, because in AMD you can't enable
TSC scaling - there is just an MSR with default value of 1.0,
which one can change if TSC scaling is supported in CPUID.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Best regards,
	Maxim Levitsky


> 
> Keep SECONDARY_EXEC_TSC_SCALING in secondary exec controls for L2 when
> it was set for L1. TSC_MULTIPLIER is already correctly computed and
> written by prepare_vmcs02().
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 778f82015f03..bfa366938c49 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2284,7 +2284,6 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
>                                   SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
>                                   SECONDARY_EXEC_APIC_REGISTER_VIRT |
>                                   SECONDARY_EXEC_ENABLE_VMFUNC |
> -                                 SECONDARY_EXEC_TSC_SCALING |
>                                   SECONDARY_EXEC_DESC);
>  
>                 if (nested_cpu_has(vmcs12,
Vitaly Kuznetsov July 12, 2022, 3:33 p.m. UTC | #2
Maxim Levitsky <mlevitsk@redhat.com> writes:

> On Tue, 2022-07-12 at 15:50 +0200, Vitaly Kuznetsov wrote:
>> Windows 10/11 guests with Hyper-V role (WSL2) enabled are observed to
>> hang upon boot or shortly after when a non-default TSC frequency was
>> set for L1. The issue is observed on a host where TSC scaling is
>> supported. The problem appears to be that Windows doesn't use TSC
>> frequency

^^^ scaling ^^^

>> for its guests even when the feature is advertised and KVM
>> filters SECONDARY_EXEC_TSC_SCALING out when creating L2 controls from
>> L1's. This leads to L2 running with the default frequency (matching
>> host's) while L1 is running with an altered one.
>
> Ouch.
>
> I guess that needs a Fixes tag?
>
> Fixes: d041b5ea93352b ("KVM: nVMX: Enable nested TSC scaling")
>

I dismissed that because prior to d041b5ea93352b SECONDARY_EXEC_TSC_SCALING
was filtered out in nested_vmx_setup_ctls_msrs() but now I think I was
wrong, SECONDARY_EXEC_TSC_SCALING was likely kept in VMCS02 regardless
of that. Will add in v2.

> Also this is thankfully Intel specific, because in AMD you can't enable
> TSC scaling - there is just an MSR with default value of 1.0,
> which one can change if TSC scaling is supported in CPUID.
>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Thanks!

> Best regards,
> 	Maxim Levitsky
>
>
>> 
>> Keep SECONDARY_EXEC_TSC_SCALING in secondary exec controls for L2 when
>> it was set for L1. TSC_MULTIPLIER is already correctly computed and
>> written by prepare_vmcs02().
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/kvm/vmx/nested.c | 1 -
>>  1 file changed, 1 deletion(-)
>> 
>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
>> index 778f82015f03..bfa366938c49 100644
>> --- a/arch/x86/kvm/vmx/nested.c
>> +++ b/arch/x86/kvm/vmx/nested.c
>> @@ -2284,7 +2284,6 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
>>                                   SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
>>                                   SECONDARY_EXEC_APIC_REGISTER_VIRT |
>>                                   SECONDARY_EXEC_ENABLE_VMFUNC |
>> -                                 SECONDARY_EXEC_TSC_SCALING |
>>                                   SECONDARY_EXEC_DESC);
>>  
>>                 if (nested_cpu_has(vmcs12,
>
>
Sean Christopherson July 12, 2022, 7:10 p.m. UTC | #3
On Tue, Jul 12, 2022, Vitaly Kuznetsov wrote:
> Maxim Levitsky <mlevitsk@redhat.com> writes:
> 
> > On Tue, 2022-07-12 at 15:50 +0200, Vitaly Kuznetsov wrote:
> >> Windows 10/11 guests with Hyper-V role (WSL2) enabled are observed to
> >> hang upon boot or shortly after when a non-default TSC frequency was
> >> set for L1. The issue is observed on a host where TSC scaling is
> >> supported. The problem appears to be that Windows doesn't use TSC
> >> frequency
> 
> ^^^ scaling ^^^
> 
> >> for its guests even when the feature is advertised and KVM
> >> filters SECONDARY_EXEC_TSC_SCALING out when creating L2 controls from
> >> L1's. This leads to L2 running with the default frequency (matching
> >> host's) while L1 is running with an altered one.
> >
> > Ouch.
> >
> > I guess that needs a Fixes tag?
> >
> > Fixes: d041b5ea93352b ("KVM: nVMX: Enable nested TSC scaling")
> >
> 
> I dismissed that because prior to d041b5ea93352b SECONDARY_EXEC_TSC_SCALING
> was filtered out in nested_vmx_setup_ctls_msrs() but now I think I was
> wrong, SECONDARY_EXEC_TSC_SCALING was likely kept in VMCS02 regardless
> of that. Will add in v2.

Yep, it would have been kept in vmcs02 even though the feature wasn't advertised
to the L1 VMM.  A Cc for stable is warranted as well.

I added this (with the tags and s/frequency/scaling) to the queue of patches for
5.20 I have lined up for Paolo to consume on his return.  Paolo and I haven't
hashed out how we'll actually manage anything, i.e. my list is speculative, but
unless you hear otherwise, no need to send a v2.
Dongli Zhang July 12, 2022, 8:13 p.m. UTC | #4
Hi Vitaly,

On 7/12/22 6:50 AM, Vitaly Kuznetsov wrote:
> Windows 10/11 guests with Hyper-V role (WSL2) enabled are observed to
> hang upon boot or shortly after when a non-default TSC frequency was
> set for L1. The issue is observed on a host where TSC scaling is

Would you mind helping clarify if it is L1 or L2 that hangs?

The commit message "Windows 10/11 guests with Hyper-V role (WSL2)" confuses me
if it is L1 or L2 (perhaps due to my lack of knowledge on hyper-v) that hangs.

Thank you very much!

Dongli Zhang

> supported. The problem appears to be that Windows doesn't use TSC
> frequency for its guests even when the feature is advertised and KVM
> filters SECONDARY_EXEC_TSC_SCALING out when creating L2 controls from
> L1's. This leads to L2 running with the default frequency (matching
> host's) while L1 is running with an altered one.
> 
> Keep SECONDARY_EXEC_TSC_SCALING in secondary exec controls for L2 when
> it was set for L1. TSC_MULTIPLIER is already correctly computed and
> written by prepare_vmcs02().
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 778f82015f03..bfa366938c49 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2284,7 +2284,6 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
>  				  SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
>  				  SECONDARY_EXEC_APIC_REGISTER_VIRT |
>  				  SECONDARY_EXEC_ENABLE_VMFUNC |
> -				  SECONDARY_EXEC_TSC_SCALING |
>  				  SECONDARY_EXEC_DESC);
>  
>  		if (nested_cpu_has(vmcs12,
>
Vitaly Kuznetsov July 13, 2022, 7:44 a.m. UTC | #5
Dongli Zhang <dongli.zhang@oracle.com> writes:

> Hi Vitaly,
>
> On 7/12/22 6:50 AM, Vitaly Kuznetsov wrote:
>> Windows 10/11 guests with Hyper-V role (WSL2) enabled are observed to
>> hang upon boot or shortly after when a non-default TSC frequency was
>> set for L1. The issue is observed on a host where TSC scaling is
>
> Would you mind helping clarify if it is L1 or L2 that hangs?
>
> The commit message "Windows 10/11 guests with Hyper-V role (WSL2)" confuses me
> if it is L1 or L2 (perhaps due to my lack of knowledge on hyper-v) that hangs.
>

I think it's L2 but I'm not sure: there's no easy way to interract with
L1 (Hyper-V) directly, all the interfaces (UI, network,..) are handled
by L2 (Windows). Prior to the observed 'hang' Hyper-V (L1) programms
synthetic timer in KVM too far in the future but my guess is that it's
doing that on Windows' (L2) behalf, basically just relaying the
request. The issue only shows up with 'hv-reenlightenment' +
'hv-frequencies' (in QEMU's terms) features as in this case both Hyper-V
(L1) and Windows (L2) trust the information about TSC frequency from KVM
but the information turns out to be incorrect for Windows (L2).
diff mbox series

Patch

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 778f82015f03..bfa366938c49 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2284,7 +2284,6 @@  static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
 				  SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
 				  SECONDARY_EXEC_APIC_REGISTER_VIRT |
 				  SECONDARY_EXEC_ENABLE_VMFUNC |
-				  SECONDARY_EXEC_TSC_SCALING |
 				  SECONDARY_EXEC_DESC);
 
 		if (nested_cpu_has(vmcs12,