From patchwork Wed Sep 14 17:37:49 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joao Martins
X-Patchwork-Id: 9332189
Received: from paddy.lan (/89.114.92.174) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed,
 14 Sep 2016 10:36:54 -0700
From: Joao Martins
To: xen-devel@lists.xenproject.org
Date: Wed, 14 Sep 2016 18:37:49 +0100
Message-Id: <1473874670-4986-5-git-send-email-joao.m.martins@oracle.com>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1473874670-4986-1-git-send-email-joao.m.martins@oracle.com>
References: <1473874670-4986-1-git-send-email-joao.m.martins@oracle.com>
Cc: Andrew Cooper, Joao Martins, Jan Beulich
Subject: [Xen-devel] [PATCH v4 4/5] x86/time: implement PVCLOCK_TSC_STABLE_BIT
List-Id: Xen developer discussion

This patch proposes relying on host TSC synchronization and passing it
through to the guest when running on a TSC-safe platform. On
time_calibration we retrieve the platform time in ns and the counter
read by the clocksource that was used to compute system time. We
introduce a new rendezvous function which doesn't require
synchronization between master and slave CPUs: it just reads the
calibration_rendezvous struct and writes the stime and stamp into the
cpu_calibration struct, to be used later on. We can guarantee that, on a
platform with a constant and reliable TSC, the time read on vCPU B right
after vCPU A is bigger, independently of the vCPU calibration error.
Since pvclock time infos are monotonic as seen by any vCPU, set
PVCLOCK_TSC_STABLE_BIT, which then enables usage of the VDSO on Linux.
IIUC, this is similar to how it's implemented on KVM. Also add a comment
regarding this bit changing, noting that guests are expected to check
this bit on every read.

Should note that I've yet to see time going backwards in a long running
test for 2 weeks (on a dual socket machine), plus a few other tests I
did on older platforms, including migration.

Signed-off-by: Joao Martins
Reviewed-by: Jan Beulich
---
Cc: Jan Beulich
Cc: Andrew Cooper

Changes since v3:
 - Do not adjust time_calibration_rendezvous_tail for nop_rendezvous and
   instead set cpu_time_stamp directly in the rendezvous function.
 - Move CPU hotplug checks into patch 2.
 - Add a commit message note and a code comment on guests coping with
   this bit changing on hosts.
 - s/host_tsc_is_clocksource/clocksource_is_tsc

Changes since v2:
 - Add XEN_ prefix to pvclock flags.
 - Adapt time_calibration_rendezvous_tail to handle the case of setting
   master tsc/stime and use it for the nop_rendezvous.
 - Remove the hotplug CPU option that was added in v1.
 - Prevent onlining of CPUs when clocksource is tsc.
 - Remove use_tsc_stable_bit, since clocksource is only used to seed
   values. So instead we test if hotplug is possible, and prevent
   clocksource=tsc from being used.
 - Remove 1st paragraph of commit message, since the behaviour described
   no longer applies since b64438c.

Changes since v1:
 - Change approach to skip std_rendezvous by introducing a
   nop_rendezvous.
 - Change commit message to reflect the change above.
 - Use TSC_STABLE_BIT only if cpu hotplug isn't possible.
 - Add command line option to override it if no cpu hotplug is intended.
---
 xen/arch/x86/time.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index af9e31f..0c1badc 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -951,6 +951,14 @@ static void __update_vcpu_system_time(struct vcpu *v, int force)
     _u.tsc_timestamp = tsc_stamp;
     _u.system_time = t->stamp.local_stime;
 
+    /*
+     * It's expected that domains cope with this bit changing on every
+     * pvclock read to check whether they can resort solely on this tuple
+     * or if it further requires monotonicity checks with other vcpus.
+     */
+    if ( clocksource_is_tsc() )
+        _u.flags |= XEN_PVCLOCK_TSC_STABLE_BIT;
+
     if ( is_hvm_domain(d) )
         _u.tsc_timestamp += v->arch.hvm_vcpu.cache_tsc_offset;
 
@@ -1409,6 +1417,22 @@ static void time_calibration_std_rendezvous(void *_r)
     time_calibration_rendezvous_tail(r);
 }
 
+/*
+ * Rendezvous function used when clocksource is TSC and
+ * no CPU hotplug will be performed.
+ */
+static void time_calibration_nop_rendezvous(void *rv)
+{
+    const struct calibration_rendezvous *r = rv;
+    struct cpu_time_stamp *c = &this_cpu(cpu_calibration);
+
+    c->local_tsc = r->master_tsc_stamp;
+    c->local_stime = r->master_stime;
+    c->master_stime = r->master_stime;
+
+    raise_softirq(TIME_CALIBRATE_SOFTIRQ);
+}
+
 static void (*time_calibration_rendezvous_fn)(void *) =
     time_calibration_std_rendezvous;
 
@@ -1418,6 +1442,13 @@ static void time_calibration(void *unused)
         .semaphore = ATOMIC_INIT(0)
     };
 
+    if ( clocksource_is_tsc() )
+    {
+        local_irq_disable();
+        r.master_stime = read_platform_stime(&r.master_tsc_stamp);
+        local_irq_enable();
+    }
+
     cpumask_copy(&r.cpu_calibration_map, &cpu_online_map);
 
     /* @wait=1 because we must wait for all cpus before freeing @r. */
@@ -1587,6 +1618,13 @@ static int __init verify_tsc_reliability(void)
              */
             on_selected_cpus(&cpu_online_map, reset_percpu_time, NULL, 1);
 
+            /*
+             * We won't do CPU Hotplug and TSC clocksource is being used which
+             * means we have a reliable TSC, plus we don't sync with any other
+             * clocksource so no need for rendezvous.
+             */
+            time_calibration_rendezvous_fn = time_calibration_nop_rendezvous;
+
             /* Finish platform timer switch. */
             try_platform_timer_tail();
 
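
For illustration only (not part of the patch): the code comment above says
guests are expected to check the stable bit on every pvclock read. A minimal
guest-side sketch of what that could look like follows. The struct layout,
the names pvclock_info, scale_delta and pvclock_read, and passing the TSC
value and the "last value" as parameters are all assumptions made to keep
the example self-contained. A real guest would read the TSC itself, use
read barriers around the version check, and update the shared last value
with an atomic compare-and-exchange.

```c
#include <stdint.h>

/* Hypothetical stand-in for the flag a guest sees in the time info. */
#define PVCLOCK_TSC_STABLE_BIT (1u << 0)

/* Hypothetical, simplified mirror of a per-vCPU pvclock time info. */
struct pvclock_info {
    uint32_t version;            /* odd while the hypervisor is updating */
    uint64_t tsc_timestamp;      /* TSC value at the last update */
    uint64_t system_time;        /* ns at tsc_timestamp */
    uint32_t tsc_to_system_mul;  /* 32.32 fixed-point multiplier */
    int8_t   tsc_shift;          /* pre-shift applied to the TSC delta */
    uint8_t  flags;
};

/* Convert a TSC delta to ns: shift, then multiply by a 32.32 fraction. */
static uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int8_t shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    /* Split multiply so no 128-bit type is needed. */
    return ((delta >> 32) * mul_frac) +
           (((delta & 0xffffffffu) * mul_frac) >> 32);
}

/*
 * Read the clock from one vCPU's time info. @tsc is the current TSC and
 * @last is the guest-wide last returned value (both passed in here only
 * to make the sketch testable).
 */
static uint64_t pvclock_read(const volatile struct pvclock_info *src,
                             uint64_t tsc, uint64_t *last)
{
    uint32_t version;
    uint64_t ns;
    uint8_t flags;

    /* Retry while an update is in progress or happened mid-read. */
    do {
        version = src->version;
        flags = src->flags;
        ns = src->system_time +
             scale_delta(tsc - src->tsc_timestamp,
                         src->tsc_to_system_mul, src->tsc_shift);
    } while ((version & 1) || version != src->version);

    /* Stable bit set on this read: the tuple alone can be trusted. */
    if (flags & PVCLOCK_TSC_STABLE_BIT)
        return ns;

    /* Otherwise enforce monotonicity against the other vCPUs. */
    if (ns < *last)
        return *last;
    *last = ns;
    return ns;
}
```

The flag is re-read inside the retry loop on purpose: per the comment in
__update_vcpu_system_time(), the host may set or clear it between updates,
so a guest cannot cache the result of an earlier check.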