From patchwork Tue Jan 7 23:43:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anchal Agarwal X-Patchwork-Id: 11322387 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F1E95138C for ; Tue, 7 Jan 2020 23:44:26 +0000 (UTC) Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CE6522075A for ; Tue, 7 Jan 2020 23:44:25 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="vFyaKuFv" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CE6522075A Authentication-Results: mail.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=amazon.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=xen-devel-bounces@lists.xenproject.org Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1ioyVb-0002Ci-Pd; Tue, 07 Jan 2020 23:43:35 +0000 Received: from us1-rack-iad1.inumbo.com ([172.99.69.81]) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1ioyVa-0002CU-Hs for xen-devel@lists.xenproject.org; Tue, 07 Jan 2020 23:43:34 +0000 X-Inumbo-ID: 843d1a0a-31a7-11ea-b56d-bc764e2007e4 Received: from smtp-fw-9102.amazon.com (unknown [207.171.184.29]) by us1-rack-iad1.inumbo.com (Halon) with ESMTPS id 843d1a0a-31a7-11ea-b56d-bc764e2007e4; Tue, 07 Jan 2020 23:43:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1578440615; x=1609976615; h=date:from:to:cc:subject:message-id:mime-version; bh=rdcg1lWycoz2N6RBe/TY1zMNmoMIsPWFhPaqjiy0cqo=; b=vFyaKuFvdk6gTKUientBsLj0fvn4rbn/bmbVNCCEJzBpM+jDCusRRwJX KuXQzsoDkfPtZVQ5H6Z/ytAbz9hMlBTKmtXgHKN+4/1Iw4GJVk5LpBxvT OyBFCtb4ne8FSLrSyT0OSIiXWGMURTnPnAFl2/ESy8TgahkHFsIhIJuBy 0=; IronPort-SDR: axEFrB4PTHlFKe/HXSD29MdTq89JZXh7B1fKqv3wb6QDgYRWQBHq3hPDRCvX3S368jrGZCfCgc QU7WOYAKbDQA== X-IronPort-AV: E=Sophos;i="5.69,407,1571702400"; d="scan'208";a="17326029" Received: from sea32-co-svc-lb4-vlan3.sea.corp.amazon.com (HELO email-inbound-relay-1e-a70de69e.us-east-1.amazon.com) ([10.47.23.38]) by smtp-border-fw-out-9102.sea19.amazon.com with ESMTP; 07 Jan 2020 23:43:32 +0000 Received: from EX13MTAUEE002.ant.amazon.com (iad55-ws-svc-p15-lb9-vlan2.iad.amazon.com [10.40.159.162]) by email-inbound-relay-1e-a70de69e.us-east-1.amazon.com (Postfix) with ESMTPS id E2104A01E2; Tue, 7 Jan 2020 23:43:24 +0000 (UTC) Received: from EX13D08UEE002.ant.amazon.com (10.43.62.92) by EX13MTAUEE002.ant.amazon.com (10.43.62.24) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 7 Jan 2020 23:43:06 +0000 Received: from EX13MTAUEA002.ant.amazon.com (10.43.61.77) by EX13D08UEE002.ant.amazon.com (10.43.62.92) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 7 Jan 2020 23:43:06 +0000 Received: from dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com (172.22.96.68) by mail-relay.amazon.com (10.43.61.169) with Microsoft SMTP Server id 15.0.1236.3 via Frontend Transport; Tue, 7 Jan 2020 23:43:06 +0000 Received: by dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com (Postfix, from userid 4335130) id 0A80440E65; Tue, 7 Jan 2020 23:43:06 +0000 (UTC) Date: Tue, 7 Jan 2020 23:43:06 +0000 From: Anchal Agarwal To: , , , , , , , , , , , , , , , , , , , , , , , , , , , , Message-ID: <20200107234306.GA18610@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com> MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Precedence: Bulk Subject: [Xen-devel] [RFC PATCH V2 07/11] x86/xen: save and restore steal clock during hibernation X-BeenThere: xen-devel@lists.xenproject.org X-Mailman-Version: 2.1.23 List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Cc: anchalag@amazon.com Errors-To: xen-devel-bounces@lists.xenproject.org Sender: "Xen-devel" From: Munehisa Kamata Currently, steal time accounting code in scheduler expects steal clock callback to provide monotonically increasing value. If the accounting code receives a smaller value than previous one, it uses a negative value to calculate steal time and results in incorrectly updated idle and steal time accounting. This breaks userspace tools which read /proc/stat. top - 08:05:35 up 2:12, 3 users, load average: 0.00, 0.07, 0.23 Tasks: 80 total, 1 running, 79 sleeping, 0 stopped, 0 zombie Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,30100.0%id, 0.0%wa, 0.0%hi, 0.0%si, -1253874204672.0%st This can actually happen when a Xen PVHVM guest gets restored from hibernation, because such a restored guest is just a fresh domain from Xen perspective and the time information in runstate info starts over from scratch. Introduce xen_save_steal_clock() which saves current steal clock values of all present CPUs in runstate info into per-cpu variables during system core ops suspend callbacks. Its couterpart, xen_restore_steal_clock(), restores a boot CPU's steal clock in the system core resume callback. It sets offset if it found the current values in runstate info are smaller than previous ones. xen_steal_clock() is also modified to use the offset to ensure that scheduler only sees monotonically increasing number. For non-boot CPUs, restore after they're brought up, because runstate info for non-boot CPUs are not active until then. [Anchal Changelog: Merged patch xen/time: introduce xen_{save,restore}_steal_clock with this one for better code readability] Signed-off-by: Anchal Agarwal Signed-off-by: Munehisa Kamata --- arch/x86/xen/suspend.c | 13 ++++++++++++- arch/x86/xen/time.c | 3 +++ drivers/xen/time.c | 28 +++++++++++++++++++++++++++- include/xen/xen-ops.h | 2 ++ 4 files changed, 44 insertions(+), 2 deletions(-) diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c index 784c4484100b..dae0f74f5390 100644 --- a/arch/x86/xen/suspend.c +++ b/arch/x86/xen/suspend.c @@ -91,12 +91,20 @@ void xen_arch_suspend(void) static int xen_syscore_suspend(void) { struct xen_remove_from_physmap xrfp; - int ret; + int cpu, ret; /* Xen suspend does similar stuffs in its own logic */ if (xen_suspend_mode_is_xen_suspend()) return 0; + for_each_present_cpu(cpu) { + /* + * Nonboot CPUs are already offline, but the last copy of + * runstate info is still accessible. + */ + xen_save_steal_clock(cpu); + } + xrfp.domid = DOMID_SELF; xrfp.gpfn = __pa(HYPERVISOR_shared_info) >> PAGE_SHIFT; @@ -118,6 +126,9 @@ static void xen_syscore_resume(void) pvclock_resume(); + /* Nonboot CPUs will be resumed when they're brought up */ + xen_restore_steal_clock(smp_processor_id()); + gnttab_resume(); } diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index befbdd8b17f0..8cf632dda605 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -537,6 +537,9 @@ static void xen_hvm_setup_cpu_clockevents(void) { int cpu = smp_processor_id(); xen_setup_runstate_info(cpu); + if (cpu) + xen_restore_steal_clock(cpu); + /* * xen_setup_timer(cpu) - snprintf is bad in atomic context. Hence * doing it xen_hvm_cpu_notify (which gets called by smp_init during diff --git a/drivers/xen/time.c b/drivers/xen/time.c index 0968859c29d0..3713d716070c 100644 --- a/drivers/xen/time.c +++ b/drivers/xen/time.c @@ -20,6 +20,8 @@ /* runstate info updated by Xen */ static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate); +static DEFINE_PER_CPU(u64, xen_prev_steal_clock); +static DEFINE_PER_CPU(u64, xen_steal_clock_offset); static DEFINE_PER_CPU(u64[4], old_runstate_time); @@ -149,7 +151,7 @@ bool xen_vcpu_stolen(int vcpu) return per_cpu(xen_runstate, vcpu).state == RUNSTATE_runnable; } -u64 xen_steal_clock(int cpu) +static u64 __xen_steal_clock(int cpu) { struct vcpu_runstate_info state; @@ -157,6 +159,30 @@ u64 xen_steal_clock(int cpu) return state.time[RUNSTATE_runnable] + state.time[RUNSTATE_offline]; } +u64 xen_steal_clock(int cpu) +{ + return __xen_steal_clock(cpu) + per_cpu(xen_steal_clock_offset, cpu); +} + +void xen_save_steal_clock(int cpu) +{ + per_cpu(xen_prev_steal_clock, cpu) = xen_steal_clock(cpu); +} + +void xen_restore_steal_clock(int cpu) +{ + u64 steal_clock = __xen_steal_clock(cpu); + + if (per_cpu(xen_prev_steal_clock, cpu) > steal_clock) { + /* Need to update the offset */ + per_cpu(xen_steal_clock_offset, cpu) = + per_cpu(xen_prev_steal_clock, cpu) - steal_clock; + } else { + /* Avoid unnecessary steal clock warp */ + per_cpu(xen_steal_clock_offset, cpu) = 0; + } +} + void xen_setup_runstate_info(int cpu) { struct vcpu_register_runstate_memory_area area; diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h index 3b3992b5b0c2..12b3f4474a05 100644 --- a/include/xen/xen-ops.h +++ b/include/xen/xen-ops.h @@ -37,6 +37,8 @@ void xen_time_setup_guest(void); void xen_manage_runstate_time(int action); void xen_get_runstate_snapshot(struct vcpu_runstate_info *res); u64 xen_steal_clock(int cpu); +void xen_save_steal_clock(int cpu); +void xen_restore_steal_clock(int cpu); int xen_setup_shutdown_event(void);