diff mbox

[v2] x86/time: Don't use virtual TSC if host and guest frequencies are equal

Message ID 1489692926-10719-1-git-send-email-boris.ostrovsky@oracle.com (mailing list archive)
State New, archived
Headers show

Commit Message

Boris Ostrovsky March 16, 2017, 7:35 p.m. UTC
Commit 82713ec8d2 ("x86: use native RDTSC(P) execution when guest and
host frequencies are the same") left out optimization for PV guests
when host and guest run at the same frequency.

For such a case we should be able not to use virtual TSC regardless
of whether we are runing before or after a migration (i.e. regardless
of incarnation value).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Jan Beulich <JBeulich@suse.com>
---
Changes in v2:
 * Replaced unnecessary incarnation test with an ASSERT

 xen/arch/x86/time.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

Comments

Jan Beulich March 17, 2017, 7:48 a.m. UTC | #1
>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote:
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d,
>          d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
>          d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
>          set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
> -        /*
> -         * In default mode use native TSC if the host has safe TSC and:
> -         *  HVM/PVH: host and guest frequencies are the same (either
> -         *           "naturally" or via TSC scaling)
> -         *  PV: guest has not migrated yet (and thus arch.tsc_khz == cpu_khz)
> -         */
> +
> +        ASSERT(incarnation || d->arch.tsc_khz == cpu_khz);

Hmm, is this valid for other than TSC_MODE_DEFAULT?

Jan

>          if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() &&
> -             (has_hvm_container_domain(d) ?
> -              (d->arch.tsc_khz == cpu_khz ||
> -               hvm_get_tsc_scaling_ratio(d->arch.tsc_khz)) :
> -              incarnation == 0) )
> +             (d->arch.tsc_khz == cpu_khz ||
> +              (has_hvm_container_domain(d) &&
> +               hvm_get_tsc_scaling_ratio(d->arch.tsc_khz))) )
>          {
>      case TSC_MODE_NEVER_EMULATE:
>              d->arch.vtsc = 0;
Boris Ostrovsky March 17, 2017, 1:36 p.m. UTC | #2
On 03/17/2017 03:48 AM, Jan Beulich wrote:
>>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote:
>> --- a/xen/arch/x86/time.c
>> +++ b/xen/arch/x86/time.c
>> @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d,
>>          d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
>>          d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
>>          set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
>> -        /*
>> -         * In default mode use native TSC if the host has safe TSC and:
>> -         *  HVM/PVH: host and guest frequencies are the same (either
>> -         *           "naturally" or via TSC scaling)
>> -         *  PV: guest has not migrated yet (and thus arch.tsc_khz == cpu_khz)
>> -         */
>> +
>> +        ASSERT(incarnation || d->arch.tsc_khz == cpu_khz);
> Hmm, is this valid for other than TSC_MODE_DEFAULT?

It is valid for all modes but I thought that the ASSERT is really only
"interesting" for DEFAULT and ALWAYS_EMULATE since this is when we
decide whether or not to set vtsc.

Since I need to rebase this anyway (due to PVH1 removal) I can move this
down right after the switch if you feel it would be useful.

-boris


>
> Jan
>
>>          if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() &&
>> -             (has_hvm_container_domain(d) ?
>> -              (d->arch.tsc_khz == cpu_khz ||
>> -               hvm_get_tsc_scaling_ratio(d->arch.tsc_khz)) :
>> -              incarnation == 0) )
>> +             (d->arch.tsc_khz == cpu_khz ||
>> +              (has_hvm_container_domain(d) &&
>> +               hvm_get_tsc_scaling_ratio(d->arch.tsc_khz))) )
>>          {
>>      case TSC_MODE_NEVER_EMULATE:
>>              d->arch.vtsc = 0;
>
Jan Beulich March 17, 2017, 2:24 p.m. UTC | #3
>>> On 17.03.17 at 14:36, <boris.ostrovsky@oracle.com> wrote:
> On 03/17/2017 03:48 AM, Jan Beulich wrote:
>>>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote:
>>> --- a/xen/arch/x86/time.c
>>> +++ b/xen/arch/x86/time.c
>>> @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d,
>>>          d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
>>>          d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
>>>          set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
>>> -        /*
>>> -         * In default mode use native TSC if the host has safe TSC and:
>>> -         *  HVM/PVH: host and guest frequencies are the same (either
>>> -         *           "naturally" or via TSC scaling)
>>> -         *  PV: guest has not migrated yet (and thus arch.tsc_khz == cpu_khz)
>>> -         */
>>> +
>>> +        ASSERT(incarnation || d->arch.tsc_khz == cpu_khz);
>> Hmm, is this valid for other than TSC_MODE_DEFAULT?
> 
> It is valid for all modes but I thought that the ASSERT is really only
> "interesting" for DEFAULT and ALWAYS_EMULATE since this is when we
> decide whether or not to set vtsc.
> 
> Since I need to rebase this anyway (due to PVH1 removal) I can move this
> down right after the switch if you feel it would be useful.

Actually I think the other way around: For ALWAYS_EMULATE as
well as for PVRDTSCP I don't think the assertion is valid, the more
that d->arch.tsc_khz gets set from input to the function. That last
fact actually makes the ASSERT() dubious in all cases, I'm afraid.

Jan
Boris Ostrovsky March 17, 2017, 2:50 p.m. UTC | #4
On 03/17/2017 10:24 AM, Jan Beulich wrote:
>>>> On 17.03.17 at 14:36, <boris.ostrovsky@oracle.com> wrote:
>> On 03/17/2017 03:48 AM, Jan Beulich wrote:
>>>>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote:
>>>> --- a/xen/arch/x86/time.c
>>>> +++ b/xen/arch/x86/time.c
>>>> @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d,
>>>>          d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
>>>>          d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
>>>>          set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
>>>> -        /*
>>>> -         * In default mode use native TSC if the host has safe TSC and:
>>>> -         *  HVM/PVH: host and guest frequencies are the same (either
>>>> -         *           "naturally" or via TSC scaling)
>>>> -         *  PV: guest has not migrated yet (and thus arch.tsc_khz == cpu_khz)
>>>> -         */
>>>> +
>>>> +        ASSERT(incarnation || d->arch.tsc_khz == cpu_khz);
>>> Hmm, is this valid for other than TSC_MODE_DEFAULT?
>> It is valid for all modes but I thought that the ASSERT is really only
>> "interesting" for DEFAULT and ALWAYS_EMULATE since this is when we
>> decide whether or not to set vtsc.
>>
>> Since I need to rebase this anyway (due to PVH1 removal) I can move this
>> down right after the switch if you feel it would be useful.
> Actually I think the other way around: For ALWAYS_EMULATE as
> well as for PVRDTSCP I don't think the assertion is valid, the more
> that d->arch.tsc_khz gets set from input to the function. That last
> fact actually makes the ASSERT() dubious in all cases, I'm afraid.

It is valid (in the sense that it will evaluate to true) because we
always first call tsc_set_info with DEFAULT mode and with gtsc_khz=0
from arch_domain_create(). So d->arch.tsc_khz will be primed to cpu_khz.

-boris
Jan Beulich March 17, 2017, 2:56 p.m. UTC | #5
>>> On 17.03.17 at 15:50, <boris.ostrovsky@oracle.com> wrote:
> On 03/17/2017 10:24 AM, Jan Beulich wrote:
>>>>> On 17.03.17 at 14:36, <boris.ostrovsky@oracle.com> wrote:
>>> On 03/17/2017 03:48 AM, Jan Beulich wrote:
>>>>>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote:
>>>>> --- a/xen/arch/x86/time.c
>>>>> +++ b/xen/arch/x86/time.c
>>>>> @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d,
>>>>>          d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
>>>>>          d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
>>>>>          set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
>>>>> -        /*
>>>>> -         * In default mode use native TSC if the host has safe TSC and:
>>>>> -         *  HVM/PVH: host and guest frequencies are the same (either
>>>>> -         *           "naturally" or via TSC scaling)
>>>>> -         *  PV: guest has not migrated yet (and thus arch.tsc_khz == 
> cpu_khz)
>>>>> -         */
>>>>> +
>>>>> +        ASSERT(incarnation || d->arch.tsc_khz == cpu_khz);
>>>> Hmm, is this valid for other than TSC_MODE_DEFAULT?
>>> It is valid for all modes but I thought that the ASSERT is really only
>>> "interesting" for DEFAULT and ALWAYS_EMULATE since this is when we
>>> decide whether or not to set vtsc.
>>>
>>> Since I need to rebase this anyway (due to PVH1 removal) I can move this
>>> down right after the switch if you feel it would be useful.
>> Actually I think the other way around: For ALWAYS_EMULATE as
>> well as for PVRDTSCP I don't think the assertion is valid, the more
>> that d->arch.tsc_khz gets set from input to the function. That last
>> fact actually makes the ASSERT() dubious in all cases, I'm afraid.
> 
> It is valid (in the sense that it will evaluate to true) because we
> always first call tsc_set_info with DEFAULT mode and with gtsc_khz=0
> from arch_domain_create(). So d->arch.tsc_khz will be primed to cpu_khz.

It is valid for this specific call. A malicious tool stack could easily
pass incarnation zero to the domctl together with a random
gtsc_khz.

Jan
Boris Ostrovsky March 17, 2017, 3:13 p.m. UTC | #6
On 03/17/2017 10:56 AM, Jan Beulich wrote:
>>>> On 17.03.17 at 15:50, <boris.ostrovsky@oracle.com> wrote:
>> On 03/17/2017 10:24 AM, Jan Beulich wrote:
>>>>>> On 17.03.17 at 14:36, <boris.ostrovsky@oracle.com> wrote:
>>>> On 03/17/2017 03:48 AM, Jan Beulich wrote:
>>>>>>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote:
>>>>>> --- a/xen/arch/x86/time.c
>>>>>> +++ b/xen/arch/x86/time.c
>>>>>> @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d,
>>>>>>          d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
>>>>>>          d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
>>>>>>          set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
>>>>>> -        /*
>>>>>> -         * In default mode use native TSC if the host has safe TSC and:
>>>>>> -         *  HVM/PVH: host and guest frequencies are the same (either
>>>>>> -         *           "naturally" or via TSC scaling)
>>>>>> -         *  PV: guest has not migrated yet (and thus arch.tsc_khz == 
>> cpu_khz)
>>>>>> -         */
>>>>>> +
>>>>>> +        ASSERT(incarnation || d->arch.tsc_khz == cpu_khz);
>>>>> Hmm, is this valid for other than TSC_MODE_DEFAULT?
>>>> It is valid for all modes but I thought that the ASSERT is really only
>>>> "interesting" for DEFAULT and ALWAYS_EMULATE since this is when we
>>>> decide whether or not to set vtsc.
>>>>
>>>> Since I need to rebase this anyway (due to PVH1 removal) I can move this
>>>> down right after the switch if you feel it would be useful.
>>> Actually I think the other way around: For ALWAYS_EMULATE as
>>> well as for PVRDTSCP I don't think the assertion is valid, the more
>>> that d->arch.tsc_khz gets set from input to the function. That last
>>> fact actually makes the ASSERT() dubious in all cases, I'm afraid.
>> It is valid (in the sense that it will evaluate to true) because we
>> always first call tsc_set_info with DEFAULT mode and with gtsc_khz=0
>> from arch_domain_create(). So d->arch.tsc_khz will be primed to cpu_khz.
> It is valid for this specific call. A malicious tool stack could easily
> pass incarnation zero to the domctl together with a random
> gtsc_khz.


So how do you want to go about this then? Original (but rebased) patch,
remove incarnation check and add a comment stating that there is no need
to check it?

-boris
Jan Beulich March 17, 2017, 3:20 p.m. UTC | #7
>>> On 17.03.17 at 16:13, <boris.ostrovsky@oracle.com> wrote:
> On 03/17/2017 10:56 AM, Jan Beulich wrote:
>>>>> On 17.03.17 at 15:50, <boris.ostrovsky@oracle.com> wrote:
>>> On 03/17/2017 10:24 AM, Jan Beulich wrote:
>>>>>>> On 17.03.17 at 14:36, <boris.ostrovsky@oracle.com> wrote:
>>>>> On 03/17/2017 03:48 AM, Jan Beulich wrote:
>>>>>>>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote:
>>>>>>> --- a/xen/arch/x86/time.c
>>>>>>> +++ b/xen/arch/x86/time.c
>>>>>>> @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d,
>>>>>>>          d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
>>>>>>>          d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
>>>>>>>          set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
>>>>>>> -        /*
>>>>>>> -         * In default mode use native TSC if the host has safe TSC and:
>>>>>>> -         *  HVM/PVH: host and guest frequencies are the same (either
>>>>>>> -         *           "naturally" or via TSC scaling)
>>>>>>> -         *  PV: guest has not migrated yet (and thus arch.tsc_khz == 
>>> cpu_khz)
>>>>>>> -         */
>>>>>>> +
>>>>>>> +        ASSERT(incarnation || d->arch.tsc_khz == cpu_khz);
>>>>>> Hmm, is this valid for other than TSC_MODE_DEFAULT?
>>>>> It is valid for all modes but I thought that the ASSERT is really only
>>>>> "interesting" for DEFAULT and ALWAYS_EMULATE since this is when we
>>>>> decide whether or not to set vtsc.
>>>>>
>>>>> Since I need to rebase this anyway (due to PVH1 removal) I can move this
>>>>> down right after the switch if you feel it would be useful.
>>>> Actually I think the other way around: For ALWAYS_EMULATE as
>>>> well as for PVRDTSCP I don't think the assertion is valid, the more
>>>> that d->arch.tsc_khz gets set from input to the function. That last
>>>> fact actually makes the ASSERT() dubious in all cases, I'm afraid.
>>> It is valid (in the sense that it will evaluate to true) because we
>>> always first call tsc_set_info with DEFAULT mode and with gtsc_khz=0
>>> from arch_domain_create(). So d->arch.tsc_khz will be primed to cpu_khz.
>> It is valid for this specific call. A malicious tool stack could easily
>> pass incarnation zero to the domctl together with a random
>> gtsc_khz.
> 
> 
> So how do you want to go about this then? Original (but rebased) patch,
> remove incarnation check and add a comment stating that there is no need
> to check it?

v2 patch with ASSERT() changed to comment, I would say.

Jan
diff mbox

Patch

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index faa638b..46a00f6 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -2051,17 +2051,12 @@  void tsc_set_info(struct domain *d,
         d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
         d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
         set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
-        /*
-         * In default mode use native TSC if the host has safe TSC and:
-         *  HVM/PVH: host and guest frequencies are the same (either
-         *           "naturally" or via TSC scaling)
-         *  PV: guest has not migrated yet (and thus arch.tsc_khz == cpu_khz)
-         */
+
+        ASSERT(incarnation || d->arch.tsc_khz == cpu_khz);
         if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() &&
-             (has_hvm_container_domain(d) ?
-              (d->arch.tsc_khz == cpu_khz ||
-               hvm_get_tsc_scaling_ratio(d->arch.tsc_khz)) :
-              incarnation == 0) )
+             (d->arch.tsc_khz == cpu_khz ||
+              (has_hvm_container_domain(d) &&
+               hvm_get_tsc_scaling_ratio(d->arch.tsc_khz))) )
         {
     case TSC_MODE_NEVER_EMULATE:
             d->arch.vtsc = 0;