diff mbox

PATCH: setup_vmcs_config: disable TSC scaling on unlike processors

Message ID CADJev7-PpTvF4Wj+TusLD=4Q6fVMphsY8bovoFBbJ72xqTZq8Q@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Don Bowman Dec. 2, 2016, 2:32 a.m. UTC
Add an enable_tsc_scaling parameter to kvm

Signed-off-by: Don Bowman <db@donbowman.ca>
---

My system has what i thought were two identical processors (same
stepping ID etc).
However, bafflingly, one of them has the ability to do TSC scaling,
and one does not (as reported in the vmcs).

This in turn causes kvm to give up entirely.

I feel a better solution is to mask off this capability on the one processor.
If enable_tsc_scaling is set false, the feature is ignored, and all
processors end up matching each other, enabling acceleration.

 static bool __read_mostly enable_pml = 1;
@@ -3449,6 +3452,8 @@ static __init int setup_vmcs_config(struct
vmcs_config *vmcs_conf)
        vmcs_conf->pin_based_exec_ctrl = _pin_based_exec_control;
        vmcs_conf->cpu_based_exec_ctrl = _cpu_based_exec_control;
        vmcs_conf->cpu_based_2nd_exec_ctrl = _cpu_based_2nd_exec_control;
+        if (!enable_tsc_scaling)
+            vmcs_conf->cpu_based_2nd_exec_ctrl &= ~SECONDARY_EXEC_TSC_SCALING;
        vmcs_conf->vmexit_ctrl         = _vmexit_control;
        vmcs_conf->vmentry_ctrl        = _vmentry_control;
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Radim Krčmář Dec. 2, 2016, 3:07 p.m. UTC | #1
Subjects that touch only arch/x86/kvm/vmx.c are usually prefixed with
"[PATCH] KVM: VMX:".

2016-12-01 21:32-0500, Don Bowman:
> Add an enable_tsc_scaling parameter to kvm
> 
> Signed-off-by: Don Bowman <db@donbowman.ca>
> ---
> 
> My system has what i thought were two identical processors (same
> stepping ID etc).
> However, bafflingly, one of them has the ability to do TSC scaling,
> and one does not (as reported in the vmcs).

Wow, the chip doesn't sound trustworthy.

> This in turn causes kvm to give up entirely.
> 
> I feel a better solution is to mask off this capability on the one processor.

I like the global toggle better -- it is less code with more uses.

> If enable_tsc_scaling is set false, the feature is ignored, and all
> processors end up matching each other, enabling acceleration.
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> @@ -103,6 +103,9 @@ module_param_named(enable_shadow_vmcs,
> enable_shadow_vmcs, bool, S_IRUGO);
>  static bool __read_mostly nested = 0;
>  module_param(nested, bool, S_IRUGO);
> 
> +static bool __read_mostly enable_tsc_scaling = true;
> +module_param(enable_tsc_scaling, bool, S_IRUGO);

The "enable_" prefix is not needed, please do

  module_param_named(tsc_scaling, ...)

> +
>  static u64 __read_mostly host_xss;
> 
>  static bool __read_mostly enable_pml = 1;
> @@ -3449,6 +3452,8 @@ static __init int setup_vmcs_config(struct
> vmcs_config *vmcs_conf)
>         vmcs_conf->pin_based_exec_ctrl = _pin_based_exec_control;
>         vmcs_conf->cpu_based_exec_ctrl = _cpu_based_exec_control;
>         vmcs_conf->cpu_based_2nd_exec_ctrl = _cpu_based_2nd_exec_control;
> +        if (!enable_tsc_scaling)
> +            vmcs_conf->cpu_based_2nd_exec_ctrl &= ~SECONDARY_EXEC_TSC_SCALING;

Move this to vmx_secondary_exec_control(), where we disable all other
features.

You'll notice kvm_has_tsc_control in hardware_setup() when clearing the
flag if it isn't supported by hardware, so I think your change would
make sense as a x86 kvm module option based on kvm_has_tsc_control.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Don Bowman Dec. 2, 2016, 7:10 p.m. UTC | #2
On 2 December 2016 at 10:07, Radim Krčmář <rkrcmar@redhat.com> wrote:
> Subjects that touch only arch/x86/kvm/vmx.c are usually prefixed with
> "[PATCH] KVM: VMX:".
>
> 2016-12-01 21:32-0500, Don Bowman:
>> Add an enable_tsc_scaling parameter to kvm
>>
>> Signed-off-by: Don Bowman <db@donbowman.ca>
>> ---
>>
>> My system has what i thought were two identical processors (same
>> stepping ID etc).
>> However, bafflingly, one of them has the ability to do TSC scaling,
>> and one does not (as reported in the vmcs).
>
> Wow, the chip doesn't sound trustworthy.

Its TSC scaling. The tray-version doesn't have it, the retail one has it
(2696 v3 vs 2699 v3).
The TSC is reliable on both as are the other flags.
As to why, I don't know. the stepping is identical.
This happens if you have an OEM server w/ 1 socket, and then populate
the other. I'm not sure Its a problem(?)

 ...

>
> You'll notice kvm_has_tsc_control in hardware_setup() when clearing the
> flag if it isn't supported by hardware, so I think your change would
> make sense as a x86 kvm module option based on kvm_has_tsc_control.
>
> Thanks.

I will look @ this and supply new potential patch.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Hildenbrand Dec. 6, 2016, 8:49 a.m. UTC | #3
Am 02.12.2016 um 16:07 schrieb Radim Krčmář:
> Subjects that touch only arch/x86/kvm/vmx.c are usually prefixed with
> "[PATCH] KVM: VMX:".
>
> 2016-12-01 21:32-0500, Don Bowman:
>> Add an enable_tsc_scaling parameter to kvm
>>
>> Signed-off-by: Don Bowman <db@donbowman.ca>
>> ---
>>
>> My system has what i thought were two identical processors (same
>> stepping ID etc).
>> However, bafflingly, one of them has the ability to do TSC scaling,
>> and one does not (as reported in the vmcs).
>
> Wow, the chip doesn't sound trustworthy.
>
>> This in turn causes kvm to give up entirely.
>>
>> I feel a better solution is to mask off this capability on the one processor.
>
> I like the global toggle better -- it is less code with more uses.

(not having much insight) what speaks against allowing such features 
only if available on all CPUs symmetrically? I actually prefer having an 
automatism to some magic toggle.
Paolo Bonzini Dec. 6, 2016, 9:09 a.m. UTC | #4
On 06/12/2016 09:49, David Hildenbrand wrote:
>>>
>>> I feel a better solution is to mask off this capability on the one
>>> processor.
>>
>> I like the global toggle better -- it is less code with more uses.
> 
> (not having much insight) what speaks against allowing such features
> only if available on all CPUs symmetrically? I actually prefer having an
> automatism to some magic toggle.

I agree with David.  Just warn and proceed with the minimum common set
of features.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Radim Krčmář Dec. 6, 2016, 11:08 a.m. UTC | #5
2016-12-06 10:09+0100, Paolo Bonzini:
> On 06/12/2016 09:49, David Hildenbrand wrote:
>>>>
>>>> I feel a better solution is to mask off this capability on the one
>>>> processor.
>>>
>>> I like the global toggle better -- it is less code with more uses.
>> 
>> (not having much insight) what speaks against allowing such features
>> only if available on all CPUs symmetrically? I actually prefer having an
>> automatism to some magic toggle.
> 
> I agree with David.  Just warn and proceed with the minimum common set
> of features.

Yes, that is reasonable -- I though David wanted to use the feature when
running on CPUs that support it (only mask off on the one).

I'd start with changing the check in vmx_check_processor_compat() to
ignore disabled features and then extend KVM to enable only the common
set.

Finding the minimal set of common features is complicated by hotplug ...
I think that the check we have is not working perfectly, so if you
currently

 1) offline all cores on the worse CPU
 2) load kvm module
 3) online all CPUs

then KVM is going enable tsc-scaling and not complain, but guests
running on onlined CPUs are not going to work.
Can you confirm that?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Hildenbrand Dec. 7, 2016, 11:37 a.m. UTC | #6
> Yes, that is reasonable -- I though David wanted to use the feature when
> running on CPUs that support it (only mask off on the one).
>
> I'd start with changing the check in vmx_check_processor_compat() to
> ignore disabled features and then extend KVM to enable only the common
> set.
>
> Finding the minimal set of common features is complicated by hotplug ...
> I think that the check we have is not working perfectly, so if you
> currently
>
>  1) offline all cores on the worse CPU
>  2) load kvm module
>  3) online all CPUs
>
> then KVM is going enable tsc-scaling and not complain, but guests
> running on onlined CPUs are not going to work.
> Can you confirm that?

My intuition tells me that whenever we hotplug CPUs that have less 
features than used, we are in trouble. So tsc scaling might just be the 
tip of the iceberg. I think this is a general problem.

What should happen if we hotplug such CPUs? We can't run existing VCPUs 
on them. And isn't this even a general problem, also for other tasks in 
the system (how is that problem solved with cpuid features?)?

(I am currently thinking about "virsh capabilities", could it happen 
that our "host" cpu model is no longer valid after we hotplugged cpus 
(as the common feature set got smaller)? that would be very ugly)
Radim Krčmář Dec. 7, 2016, 3:25 p.m. UTC | #7
2016-12-07 12:37+0100, David Hildenbrand:
>> Yes, that is reasonable -- I though David wanted to use the feature when
>> running on CPUs that support it (only mask off on the one).
>> 
>> I'd start with changing the check in vmx_check_processor_compat() to
>> ignore disabled features and then extend KVM to enable only the common
>> set.
>> 
>> Finding the minimal set of common features is complicated by hotplug ...
>> I think that the check we have is not working perfectly, so if you
>> currently
>> 
>>  1) offline all cores on the worse CPU
>>  2) load kvm module
>>  3) online all CPUs
>> 
>> then KVM is going enable tsc-scaling and not complain, but guests
>> running on onlined CPUs are not going to work.
>> Can you confirm that?
> 
> My intuition tells me that whenever we hotplug CPUs that have less features
> than used, we are in trouble. So tsc scaling might just be the tip of the
> iceberg. I think this is a general problem.

Definitely.  It was not handled in KVM probably because it doesn't have
a simple solution.

Finding the minimal common subset on hotplug will take extra work, which
is why relaxing the check and having a toggle for features that
shouldn't be enabled nor checked is easier.

> What should happen if we hotplug such CPUs? We can't run existing VCPUs on
> them. And isn't this even a general problem, also for other tasks in the
> system (how is that problem solved with cpuid features?)?

1) Prevent the hotplug -- admin can notice the error, kill guests or
   decide to let them finish, and then online hotplugged CPU.

2) Just warn and trust that admin knows what hotplugging to non-SMP
   means.

(2) is less work and give a bit more freedom to an undesirable case, so
I slightly prefer it.  I wouldn't for example limit existing VCPUs to
compatible CPUs or cleanly kill all guests from the kernel.

> (I am currently thinking about "virsh capabilities", could it happen that
> our "host" cpu model is no longer valid after we hotplugged cpus (as the
> common feature set got smaller)? that would be very ugly)

I think it could, but I'd continue in thinking only about SMP.  VMX
features are not even noticed by `virsh capabilities`, so finding the
minimal common subset in KVM would not affect the output.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Hildenbrand Dec. 8, 2016, 11:46 a.m. UTC | #8
>> My intuition tells me that whenever we hotplug CPUs that have less features
>> than used, we are in trouble. So tsc scaling might just be the tip of the
>> iceberg. I think this is a general problem.
>
> Definitely.  It was not handled in KVM probably because it doesn't have
> a simple solution.
>
> Finding the minimal common subset on hotplug will take extra work, which
> is why relaxing the check and having a toggle for features that
> shouldn't be enabled nor checked is easier.

But even with a toggle, the problem of not knowing what will be 
hotplugged persists. Before hotplugging, the admin would have to restart 
all guests after flipping the toggle.

>
>> What should happen if we hotplug such CPUs? We can't run existing VCPUs on
>> them. And isn't this even a general problem, also for other tasks in the
>> system (how is that problem solved with cpuid features?)?
>
> 1) Prevent the hotplug -- admin can notice the error, kill guests or
>    decide to let them finish, and then online hotplugged CPU.
>
> 2) Just warn and trust that admin knows what hotplugging to non-SMP
>    means.
>
> (2) is less work and give a bit more freedom to an undesirable case, so
> I slightly prefer it.  I wouldn't for example limit existing VCPUs to
> compatible CPUs or cleanly kill all guests from the kernel.

And I assume that such scenarios are also quite unlikely. Or is it a 
common practice in production to hotplug random CPUs from the shelf? I 
doubt it.

>
>> (I am currently thinking about "virsh capabilities", could it happen that
>> our "host" cpu model is no longer valid after we hotplugged cpus (as the
>> common feature set got smaller)? that would be very ugly)
>
> I think it could, but I'd continue in thinking only about SMP.  VMX
> features are not even noticed by `virsh capabilities`, so finding the
> minimal common subset in KVM would not affect the output.

Do you know if we can hotplug CPUs with differing CPUID features on x86?
Radim Krčmář Dec. 8, 2016, 2:32 p.m. UTC | #9
2016-12-08 12:46+0100, David Hildenbrand:
>> > My intuition tells me that whenever we hotplug CPUs that have less features
>> > than used, we are in trouble. So tsc scaling might just be the tip of the
>> > iceberg. I think this is a general problem.
>> 
>> Definitely.  It was not handled in KVM probably because it doesn't have
>> a simple solution.
>> 
>> Finding the minimal common subset on hotplug will take extra work, which
>> is why relaxing the check and having a toggle for features that
>> shouldn't be enabled nor checked is easier.
> 
> But even with a toggle, the problem of not knowing what will be hotplugged
> persists. Before hotplugging, the admin would have to restart all guests
> after flipping the toggle.

Yes.  The admin would need that situation and disable incompatible
features in advance (KVM would just provide the option to do so).
The toggle would be a readonly kvm_module parameter, like we have for
ept and other features.

>> > What should happen if we hotplug such CPUs? We can't run existing VCPUs on
>> > them. And isn't this even a general problem, also for other tasks in the
>> > system (how is that problem solved with cpuid features?)?
>> 
>> 1) Prevent the hotplug -- admin can notice the error, kill guests or
>>    decide to let them finish, and then online hotplugged CPU.
>> 
>> 2) Just warn and trust that admin knows what hotplugging to non-SMP
>>    means.
>> 
>> (2) is less work and give a bit more freedom to an undesirable case, so
>> I slightly prefer it.  I wouldn't for example limit existing VCPUs to
>> compatible CPUs or cleanly kill all guests from the kernel.
> 
> And I assume that such scenarios are also quite unlikely. Or is it a common
> practice in production to hotplug random CPUs from the shelf? I doubt it.

Absurdly unlikely.  I hope it never happens with serious intentions.
(It is amusing to think about, but considering this situation is a waste
 of time for practical purposes.)

>> > (I am currently thinking about "virsh capabilities", could it happen that
>> > our "host" cpu model is no longer valid after we hotplugged cpus (as the
>> > common feature set got smaller)? that would be very ugly)
>> 
>> I think it could, but I'd continue in thinking only about SMP.  VMX
>> features are not even noticed by `virsh capabilities`, so finding the
>> minimal common subset in KVM would not affect the output.
> 
> Do you know if we can hotplug CPUs with differing CPUID features on x86?

Linux doesn't seem to do any sanity checks and it uses globals for CPUID
features and assumes that they are all the same, so we'd be getting what
we deserve.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5382b82..5ace575 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -103,6 +103,9 @@  module_param_named(enable_shadow_vmcs,
enable_shadow_vmcs, bool, S_IRUGO);
 static bool __read_mostly nested = 0;
 module_param(nested, bool, S_IRUGO);

+static bool __read_mostly enable_tsc_scaling = true;
+module_param(enable_tsc_scaling, bool, S_IRUGO);
+
 static u64 __read_mostly host_xss;