xen/time: fix gtime_to_gtsc for vtsc=1 PV guests

Message ID alpine.DEB.2.10.1604251308370.24872@sstabellini-ThinkPad-X260 (mailing list archive)
State New, archived

Commit Message

Stefano Stabellini April 25, 2016, 12:19 p.m. UTC
On Mon, 25 Apr 2016, Jan Beulich wrote:
> >>> On 25.04.16 at 13:18, <sstabellini@kernel.org> wrote:
> > From: Jan Beulich <JBeulich@suse.com>
> > 
> > For vtsc=1 PV guests, rdtsc is trapped and calculated from get_s_time()
> > using gtime_to_gtsc. Similarly the tsc_timestamp, part of struct
> > vcpu_time_info, is calculated from stime_local_stamp using
> > gtime_to_gtsc.
> > 
> > However gtime_to_gtsc can return 0, if time < vtsc_offset, which can
> > actually happen when gtime_to_gtsc is called passing stime_local_stamp
> > (the caller function is __update_vcpu_system_time).
> > 
> > In that case the pvclock protocol doesn't work properly and the guest is
> > unable to calculate the system time correctly. As a consequence when the
> > guest tries to set a timer event (for example calling the
> > VCPUOP_set_singleshot_timer hypercall), the event will be in the past
> > causing Linux to hang.
> > 
> > The purpose of the pvclock protocol is to allow the guest to calculate
> > the system_time in nanoseconds correctly. The guest calculates it as follows:
> > 
> >   from_vtsc_scale(rdtsc - vcpu_time_info.tsc_timestamp) +
> >       vcpu_time_info.system_time
> > 
> > Given that with vtsc=1:
> >   rdtsc = to_vtsc_scale(NOW() - vtsc_offset)
> >   vcpu_time_info.tsc_timestamp =
> >       to_vtsc_scale(vcpu_time_info.system_time - vtsc_offset)
> > 
> > The expression evaluates to NOW(), which is what we want.  However when
> > stime_local_stamp < vtsc_offset, vcpu_time_info.tsc_timestamp is
> > actually 0. As a consequence the calculated overall system_time is not
> > correct.
> > 
> > This patch fixes the issue by letting gtime_to_gtsc return a negative
> > integer in the form of a wrapped-around unsigned integer, so that when
> > the guest subtracts vcpu_time_info.tsc_timestamp from rdtsc, it
> > calculates the right value.
> > 
> > Signed-off-by: Jan Beulich <JBeulich@suse.com>
> > Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> 
> Assuming you mean for this to go into 4.7, I've added Wei to Cc
> (and you should do so in case of re-submission).
> 
> > --- a/xen/arch/x86/time.c
> > +++ b/xen/arch/x86/time.c
> > @@ -1663,7 +1663,13 @@ custom_param("tsc", tsc_parse);
> >  u64 gtime_to_gtsc(struct domain *d, u64 time)
> >  {
> >      if ( !is_hvm_domain(d) )
> > +    {
> >          time = max_t(s64, time - d->arch.vtsc_offset, 0);
> 
> This line should have been deleted. While I'd be happy to do this
> while committing, its presence raises the question of whether
> things actually work as expected.

That was a mistake I made while forward-porting the patch from 4.6. Sorry.
I tested the code again and it works correctly.

The patch should be:

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 687e39b..6438b47 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1663,7 +1663,12 @@  custom_param("tsc", tsc_parse);
 u64 gtime_to_gtsc(struct domain *d, u64 time)
 {
     if ( !is_hvm_domain(d) )
-        time = max_t(s64, time - d->arch.vtsc_offset, 0);
+    {
+        if ( time < d->arch.vtsc_offset )
+            return -scale_delta(d->arch.vtsc_offset - time,
+                                &d->arch.ns_to_vtsc);
+        time -= d->arch.vtsc_offset;
+    }
     return scale_delta(time, &d->arch.ns_to_vtsc);
 }
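
For illustration only, here is a minimal standalone sketch of why the wrapped-around return value makes the guest-side pvclock arithmetic come out right. It is not Xen code: to_vtsc_scale() and from_vtsc_scale() are hypothetical stand-ins for scale_delta() that assume a fixed 2 ticks per nanosecond, the two gtime_to_gtsc_* helpers take vtsc_offset directly instead of a struct domain, and the numbers in print_comparison() are made up.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical stand-ins for scale_delta(): assume 2 vTSC ticks per ns. */
uint64_t to_vtsc_scale(uint64_t ns)      { return ns * 2; }
uint64_t from_vtsc_scale(uint64_t ticks) { return ticks / 2; }

/* Behaviour before the patch: clamp to 0 when time < vtsc_offset. */
uint64_t gtime_to_gtsc_old(uint64_t vtsc_offset, uint64_t time)
{
    int64_t delta = (int64_t)(time - vtsc_offset);

    if (delta < 0)
        delta = 0;
    return to_vtsc_scale((uint64_t)delta);
}

/* Behaviour after the patch: return the negative value, wrapped around. */
uint64_t gtime_to_gtsc_new(uint64_t vtsc_offset, uint64_t time)
{
    if (time < vtsc_offset)
        return -to_vtsc_scale(vtsc_offset - time); /* unsigned negation wraps */
    return to_vtsc_scale(time - vtsc_offset);
}

/* Guest-side calculation from the commit message. */
uint64_t guest_system_time(uint64_t rdtsc, uint64_t tsc_timestamp,
                           uint64_t system_time)
{
    /* The unsigned subtraction undoes the wrap-around in tsc_timestamp. */
    return from_vtsc_scale(rdtsc - tsc_timestamp) + system_time;
}

/* Made-up numbers with stime_local_stamp < vtsc_offset. */
void print_comparison(void)
{
    uint64_t vtsc_offset = 1000, stamp = 400, now = 1500;
    uint64_t rdtsc = to_vtsc_scale(now - vtsc_offset);

    uint64_t t_old = guest_system_time(
        rdtsc, gtime_to_gtsc_old(vtsc_offset, stamp), stamp);
    uint64_t t_new = guest_system_time(
        rdtsc, gtime_to_gtsc_new(vtsc_offset, stamp), stamp);

    printf("old: %" PRIu64 "\n", t_old); /* 900, not NOW() */
    printf("new: %" PRIu64 "\n", t_new); /* 1500, equal to NOW() */
    assert(t_new == now);
}
```

Calling print_comparison() from a main() shows the old clamped timestamp giving the guest a system time of 900 instead of 1500, while the wrapped-around timestamp gives back NOW() exactly. This relies only on unsigned 64-bit arithmetic being modular, which is also what the guest's rdtsc - tsc_timestamp subtraction depends on.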