Message ID | 1489692926-10719-1-git-send-email-boris.ostrovsky@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote: > --- a/xen/arch/x86/time.c > +++ b/xen/arch/x86/time.c > @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d, > d->arch.vtsc_offset = get_s_time() - elapsed_nsec; > d->arch.tsc_khz = gtsc_khz ?: cpu_khz; > set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000); > - /* > - * In default mode use native TSC if the host has safe TSC and: > - * HVM/PVH: host and guest frequencies are the same (either > - * "naturally" or via TSC scaling) > - * PV: guest has not migrated yet (and thus arch.tsc_khz == cpu_khz) > - */ > + > + ASSERT(incarnation || d->arch.tsc_khz == cpu_khz); Hmm, is this valid for other than TSC_MODE_DEFAULT? Jan > if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() && > - (has_hvm_container_domain(d) ? > - (d->arch.tsc_khz == cpu_khz || > - hvm_get_tsc_scaling_ratio(d->arch.tsc_khz)) : > - incarnation == 0) ) > + (d->arch.tsc_khz == cpu_khz || > + (has_hvm_container_domain(d) && > + hvm_get_tsc_scaling_ratio(d->arch.tsc_khz))) ) > { > case TSC_MODE_NEVER_EMULATE: > d->arch.vtsc = 0;
On 03/17/2017 03:48 AM, Jan Beulich wrote: >>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote: >> --- a/xen/arch/x86/time.c >> +++ b/xen/arch/x86/time.c >> @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d, >> d->arch.vtsc_offset = get_s_time() - elapsed_nsec; >> d->arch.tsc_khz = gtsc_khz ?: cpu_khz; >> set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000); >> - /* >> - * In default mode use native TSC if the host has safe TSC and: >> - * HVM/PVH: host and guest frequencies are the same (either >> - * "naturally" or via TSC scaling) >> - * PV: guest has not migrated yet (and thus arch.tsc_khz == cpu_khz) >> - */ >> + >> + ASSERT(incarnation || d->arch.tsc_khz == cpu_khz); > Hmm, is this valid for other than TSC_MODE_DEFAULT? It is valid for all modes but I thought that the ASSERT is really only "interesting" for DEFAULT and ALWAYS_EMULATE since this is when we decide whether or not to set vtsc. Since I need to rebase this anyway (due to PVH1 removal) I can move this down right after the switch if you feel it would be useful. -boris > > Jan > >> if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() && >> - (has_hvm_container_domain(d) ? >> - (d->arch.tsc_khz == cpu_khz || >> - hvm_get_tsc_scaling_ratio(d->arch.tsc_khz)) : >> - incarnation == 0) ) >> + (d->arch.tsc_khz == cpu_khz || >> + (has_hvm_container_domain(d) && >> + hvm_get_tsc_scaling_ratio(d->arch.tsc_khz))) ) >> { >> case TSC_MODE_NEVER_EMULATE: >> d->arch.vtsc = 0; >
>>> On 17.03.17 at 14:36, <boris.ostrovsky@oracle.com> wrote: > On 03/17/2017 03:48 AM, Jan Beulich wrote: >>>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote: >>> --- a/xen/arch/x86/time.c >>> +++ b/xen/arch/x86/time.c >>> @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d, >>> d->arch.vtsc_offset = get_s_time() - elapsed_nsec; >>> d->arch.tsc_khz = gtsc_khz ?: cpu_khz; >>> set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000); >>> - /* >>> - * In default mode use native TSC if the host has safe TSC and: >>> - * HVM/PVH: host and guest frequencies are the same (either >>> - * "naturally" or via TSC scaling) >>> - * PV: guest has not migrated yet (and thus arch.tsc_khz == cpu_khz) >>> - */ >>> + >>> + ASSERT(incarnation || d->arch.tsc_khz == cpu_khz); >> Hmm, is this valid for other than TSC_MODE_DEFAULT? > > It is valid for all modes but I thought that the ASSERT is really only > "interesting" for DEFAULT and ALWAYS_EMULATE since this is when we > decide whether or not to set vtsc. > > Since I need to rebase this anyway (due to PVH1 removal) I can move this > down right after the switch if you feel it would be useful. Actually I think the other way around: For ALWAYS_EMULATE as well as for PVRDTSCP I don't think the assertion is valid, the more that d->arch.tsc_khz gets set from input to the function. That last fact actually makes the ASSERT() dubious in all cases, I'm afraid. Jan
On 03/17/2017 10:24 AM, Jan Beulich wrote: >>>> On 17.03.17 at 14:36, <boris.ostrovsky@oracle.com> wrote: >> On 03/17/2017 03:48 AM, Jan Beulich wrote: >>>>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote: >>>> --- a/xen/arch/x86/time.c >>>> +++ b/xen/arch/x86/time.c >>>> @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d, >>>> d->arch.vtsc_offset = get_s_time() - elapsed_nsec; >>>> d->arch.tsc_khz = gtsc_khz ?: cpu_khz; >>>> set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000); >>>> - /* >>>> - * In default mode use native TSC if the host has safe TSC and: >>>> - * HVM/PVH: host and guest frequencies are the same (either >>>> - * "naturally" or via TSC scaling) >>>> - * PV: guest has not migrated yet (and thus arch.tsc_khz == cpu_khz) >>>> - */ >>>> + >>>> + ASSERT(incarnation || d->arch.tsc_khz == cpu_khz); >>> Hmm, is this valid for other than TSC_MODE_DEFAULT? >> It is valid for all modes but I thought that the ASSERT is really only >> "interesting" for DEFAULT and ALWAYS_EMULATE since this is when we >> decide whether or not to set vtsc. >> >> Since I need to rebase this anyway (due to PVH1 removal) I can move this >> down right after the switch if you feel it would be useful. > Actually I think the other way around: For ALWAYS_EMULATE as > well as for PVRDTSCP I don't think the assertion is valid, the more > that d->arch.tsc_khz gets set from input to the function. That last > fact actually makes the ASSERT() dubious in all cases, I'm afraid. It is valid (in the sense that it will evaluate to true) because we always first call tsc_set_info with DEFAULT mode and with gtsc_khz=0 from arch_domain_create(). So d->arch.tsc_khz will be primed to cpu_khz. -boris
>>> On 17.03.17 at 15:50, <boris.ostrovsky@oracle.com> wrote: > On 03/17/2017 10:24 AM, Jan Beulich wrote: >>>>> On 17.03.17 at 14:36, <boris.ostrovsky@oracle.com> wrote: >>> On 03/17/2017 03:48 AM, Jan Beulich wrote: >>>>>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote: >>>>> --- a/xen/arch/x86/time.c >>>>> +++ b/xen/arch/x86/time.c >>>>> @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d, >>>>> d->arch.vtsc_offset = get_s_time() - elapsed_nsec; >>>>> d->arch.tsc_khz = gtsc_khz ?: cpu_khz; >>>>> set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000); >>>>> - /* >>>>> - * In default mode use native TSC if the host has safe TSC and: >>>>> - * HVM/PVH: host and guest frequencies are the same (either >>>>> - * "naturally" or via TSC scaling) >>>>> - * PV: guest has not migrated yet (and thus arch.tsc_khz == > cpu_khz) >>>>> - */ >>>>> + >>>>> + ASSERT(incarnation || d->arch.tsc_khz == cpu_khz); >>>> Hmm, is this valid for other than TSC_MODE_DEFAULT? >>> It is valid for all modes but I thought that the ASSERT is really only >>> "interesting" for DEFAULT and ALWAYS_EMULATE since this is when we >>> decide whether or not to set vtsc. >>> >>> Since I need to rebase this anyway (due to PVH1 removal) I can move this >>> down right after the switch if you feel it would be useful. >> Actually I think the other way around: For ALWAYS_EMULATE as >> well as for PVRDTSCP I don't think the assertion is valid, the more >> that d->arch.tsc_khz gets set from input to the function. That last >> fact actually makes the ASSERT() dubious in all cases, I'm afraid. > > It is valid (in the sense that it will evaluate to true) because we > always first call tsc_set_info with DEFAULT mode and with gtsc_khz=0 > from arch_domain_create(). So d->arch.tsc_khz will be primed to cpu_khz. It is valid for this specific call. A malicious tool stack could easily pass incarnation zero to the domctl together with a random gtsc_khz. Jan
On 03/17/2017 10:56 AM, Jan Beulich wrote: >>>> On 17.03.17 at 15:50, <boris.ostrovsky@oracle.com> wrote: >> On 03/17/2017 10:24 AM, Jan Beulich wrote: >>>>>> On 17.03.17 at 14:36, <boris.ostrovsky@oracle.com> wrote: >>>> On 03/17/2017 03:48 AM, Jan Beulich wrote: >>>>>>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote: >>>>>> --- a/xen/arch/x86/time.c >>>>>> +++ b/xen/arch/x86/time.c >>>>>> @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d, >>>>>> d->arch.vtsc_offset = get_s_time() - elapsed_nsec; >>>>>> d->arch.tsc_khz = gtsc_khz ?: cpu_khz; >>>>>> set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000); >>>>>> - /* >>>>>> - * In default mode use native TSC if the host has safe TSC and: >>>>>> - * HVM/PVH: host and guest frequencies are the same (either >>>>>> - * "naturally" or via TSC scaling) >>>>>> - * PV: guest has not migrated yet (and thus arch.tsc_khz == >> cpu_khz) >>>>>> - */ >>>>>> + >>>>>> + ASSERT(incarnation || d->arch.tsc_khz == cpu_khz); >>>>> Hmm, is this valid for other than TSC_MODE_DEFAULT? >>>> It is valid for all modes but I thought that the ASSERT is really only >>>> "interesting" for DEFAULT and ALWAYS_EMULATE since this is when we >>>> decide whether or not to set vtsc. >>>> >>>> Since I need to rebase this anyway (due to PVH1 removal) I can move this >>>> down right after the switch if you feel it would be useful. >>> Actually I think the other way around: For ALWAYS_EMULATE as >>> well as for PVRDTSCP I don't think the assertion is valid, the more >>> that d->arch.tsc_khz gets set from input to the function. That last >>> fact actually makes the ASSERT() dubious in all cases, I'm afraid. >> It is valid (in the sense that it will evaluate to true) because we >> always first call tsc_set_info with DEFAULT mode and with gtsc_khz=0 >> from arch_domain_create(). So d->arch.tsc_khz will be primed to cpu_khz. > It is valid for this specific call. A malicious tool stack could easily > pass incarnation zero to the domctl together with a random > gtsc_khz. So how do you want to go about this then? Original (but rebased) patch, remove incarnation check and add a comment stating that there is no need to check it? -boris
>>> On 17.03.17 at 16:13, <boris.ostrovsky@oracle.com> wrote: > On 03/17/2017 10:56 AM, Jan Beulich wrote: >>>>> On 17.03.17 at 15:50, <boris.ostrovsky@oracle.com> wrote: >>> On 03/17/2017 10:24 AM, Jan Beulich wrote: >>>>>>> On 17.03.17 at 14:36, <boris.ostrovsky@oracle.com> wrote: >>>>> On 03/17/2017 03:48 AM, Jan Beulich wrote: >>>>>>>>> On 16.03.17 at 20:35, <boris.ostrovsky@oracle.com> wrote: >>>>>>> --- a/xen/arch/x86/time.c >>>>>>> +++ b/xen/arch/x86/time.c >>>>>>> @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d, >>>>>>> d->arch.vtsc_offset = get_s_time() - elapsed_nsec; >>>>>>> d->arch.tsc_khz = gtsc_khz ?: cpu_khz; >>>>>>> set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000); >>>>>>> - /* >>>>>>> - * In default mode use native TSC if the host has safe TSC and: >>>>>>> - * HVM/PVH: host and guest frequencies are the same (either >>>>>>> - * "naturally" or via TSC scaling) >>>>>>> - * PV: guest has not migrated yet (and thus arch.tsc_khz == >>> cpu_khz) >>>>>>> - */ >>>>>>> + >>>>>>> + ASSERT(incarnation || d->arch.tsc_khz == cpu_khz); >>>>>> Hmm, is this valid for other than TSC_MODE_DEFAULT? >>>>> It is valid for all modes but I thought that the ASSERT is really only >>>>> "interesting" for DEFAULT and ALWAYS_EMULATE since this is when we >>>>> decide whether or not to set vtsc. >>>>> >>>>> Since I need to rebase this anyway (due to PVH1 removal) I can move this >>>>> down right after the switch if you feel it would be useful. >>>> Actually I think the other way around: For ALWAYS_EMULATE as >>>> well as for PVRDTSCP I don't think the assertion is valid, the more >>>> that d->arch.tsc_khz gets set from input to the function. That last >>>> fact actually makes the ASSERT() dubious in all cases, I'm afraid. >>> It is valid (in the sense that it will evaluate to true) because we >>> always first call tsc_set_info with DEFAULT mode and with gtsc_khz=0 >>> from arch_domain_create(). So d->arch.tsc_khz will be primed to cpu_khz. >> It is valid for this specific call. A malicious tool stack could easily >> pass incarnation zero to the domctl together with a random >> gtsc_khz. > > > So how do you want to go about this then? Original (but rebased) patch, > remove incarnation check and add a comment stating that there is no need > to check it? v2 patch with ASSERT() changed to comment, I would say. Jan
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c index faa638b..46a00f6 100644 --- a/xen/arch/x86/time.c +++ b/xen/arch/x86/time.c @@ -2051,17 +2051,12 @@ void tsc_set_info(struct domain *d, d->arch.vtsc_offset = get_s_time() - elapsed_nsec; d->arch.tsc_khz = gtsc_khz ?: cpu_khz; set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000); - /* - * In default mode use native TSC if the host has safe TSC and: - * HVM/PVH: host and guest frequencies are the same (either - * "naturally" or via TSC scaling) - * PV: guest has not migrated yet (and thus arch.tsc_khz == cpu_khz) - */ + + ASSERT(incarnation || d->arch.tsc_khz == cpu_khz); if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() && - (has_hvm_container_domain(d) ? - (d->arch.tsc_khz == cpu_khz || - hvm_get_tsc_scaling_ratio(d->arch.tsc_khz)) : - incarnation == 0) ) + (d->arch.tsc_khz == cpu_khz || + (has_hvm_container_domain(d) && + hvm_get_tsc_scaling_ratio(d->arch.tsc_khz))) ) { case TSC_MODE_NEVER_EMULATE: d->arch.vtsc = 0;
Commit 82713ec8d2 ("x86: use native RDTSC(P) execution when guest and host frequencies are the same") left out optimization for PV guests when host and guest run at the same frequency. For such a case we should be able not to use virtual TSC regardless of whether we are runing before or after a migration (i.e. regardless of incarnation value). Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Suggested-by: Jan Beulich <JBeulich@suse.com> --- Changes in v2: * Replaced unnecessary incarnation test with an ASSERT xen/arch/x86/time.c | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-)