From patchwork Sun Oct 8 05:29:11 2017
X-Patchwork-Submitter: Dongli Zhang
X-Patchwork-Id: 9991729
Message-ID: <0fb7b738-5637-4e9a-ad2e-6b61a894348a@default>
Date: Sat, 7 Oct 2017 22:29:11 -0700 (PDT)
From: Dongli Zhang <dongli.zhang@oracle.com>
Cc: xen.list@daevel.fr, xen-users@lists.xensource.com, xen-devel@lists.xen.org
Subject: Re: [Xen-devel] high CPU stolen time after live migrate

Hi Dario and Olivier,

I have encountered this issue in the past as well. While the fix mentioned in
the link is effective, I assume it was derived from upstream Linux, and it
will introduce a new error, as described below.

While there is a kernel bug on the guest side, I think the root cause is on
the hypervisor side. In my own tests, the issue is reproducible even when
migrating a VM locally within the same dom0.

In those tests, once the guest VM is migrated, the RUNSTATE_offline time looks
normal, while RUNSTATE_runnable moves backward and decreases. Therefore, the
value returned by paravirt_steal_clock() (actually xen_steal_clock()), which
is the sum of RUNSTATE_offline and RUNSTATE_runnable, decreases as well.
However, a kernel such as 4.8 cannot handle this special situation correctly,
as the code in kernel/sched/cputime.c is not written specifically for the Xen
hypervisor. For a kernel like v4.8-rc8, would something like the patch at the
end of this mail be better?

This issue does not seem to be completely fixed even by the most up-to-date
upstream Linux (I have tested with 4.12.0-rc7), although the symptom in
4.12.0-rc7 is different. After live migration, the steal clock counter no
longer overflows (i.e., it does not become a very large unsigned number), but
the steal counter in /proc/stat still moves backward and decreases (e.g., from
329 to 311). Note that the 8th numeric field of the "cpu" lines below is steal
time, in USER_HZ ticks.

Before live migration (steal counter at 329):

test@vm:~$ cat /proc/stat
cpu 248 0 240 31197 893 0 1 329 0 0
cpu0 248 0 240 31197 893 0 1 329 0 0
intr 39051 16307 0 0 0 0 0 990 127 592 1004 1360 40 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 59400
btime 1506731352
processes 1877
procs_running 1
procs_blocked 0
softirq 38903 0 15524 1227 6904 0 0 6 0 0 15242

After live migration, the steal counter in an Ubuntu guest running 4.12.0-rc7
decreased to 311:

test@vm:~$ cat /proc/stat
cpu 251 0 242 31245 893 0 1 311 0 0
cpu0 251 0 242 31245 893 0 1 311 0 0
intr 39734 16404 0 0 0 0 0 1440 128 0 8 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 60880
btime 1506731352
processes 1882
procs_running 3
procs_blocked 0
softirq 39195 0 15618 1286 6958 0 0 7 0 0 15326

I assume this is not the expected behavior. A different patch (similar to the
one mentioned above) against upstream Linux would be needed to fix it.

---------------------------------------------------------

Whatever fix is applied on the guest kernel side, I think the root cause is
that the Xen hypervisor returns a RUNSTATE_runnable time smaller than the one
reported before live migration.
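For reference, the Xen steal clock is computed directly from the vCPU
runstate times. A minimal sketch of what xen_steal_clock() in
drivers/xen/time.c does (quoted from memory, so the exact helper name and
locking may differ between kernel versions):

u64 xen_steal_clock(int cpu)
{
        struct vcpu_runstate_info state;

        /* Take a consistent snapshot of the vCPU runstate counters. */
        xen_get_runstate_snapshot(&state);

        /* Steal time = time runnable-but-not-running + time offline. */
        return state.time[RUNSTATE_runnable] + state.time[RUNSTATE_offline];
}

So if the hypervisor hands back a smaller RUNSTATE_runnable after migration,
the guest's steal clock goes backward with it.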
As I am not familiar enough with Xen scheduling, I do not understand why the
RUNSTATE_runnable cputime decreases after live migration.

Dongli Zhang

----- Original Message -----
From: dario.faggioli@citrix.com
To: xen.list@daevel.fr, xen-users@lists.xensource.com
Cc: xen-devel@lists.xen.org
Sent: Tuesday, October 3, 2017 5:24:49 PM GMT +08:00 Beijing / Chongqing / Hong Kong / Urumqi
Subject: Re: [Xen-devel] high CPU stolen time after live migrate

On Mon, 2017-10-02 at 18:37 +0200, Olivier Bonvalet wrote:
> root!laussor:/proc# cat /proc/uptime
> 652005.23 2631328.82
>
> Values for "stolen time" in /proc/stat seems impossible with only 7
> days of uptime.
>
I think it can be this:
https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest/

What's the version of your guest kernel?

Dario

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index a846cf8..3546e21 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -274,11 +274,17 @@ static __always_inline cputime_t steal_account_process_time(cputime_t maxtime)
 	if (static_key_false(&paravirt_steal_enabled)) {
 		cputime_t steal_cputime;
 		u64 steal;
+		s64 steal_diff;
 
 		steal = paravirt_steal_clock(smp_processor_id());
-		steal -= this_rq()->prev_steal_time;
+		steal_diff = steal - this_rq()->prev_steal_time;
 
-		steal_cputime = min(nsecs_to_cputime(steal), maxtime);
+		if (steal_diff < 0) {
+			this_rq()->prev_steal_time = steal;
+			return 0;
+		}
+
+		steal_cputime = min(nsecs_to_cputime(steal_diff), maxtime);
 		account_steal_time(steal_cputime);
 		this_rq()->prev_steal_time += cputime_to_nsecs(steal_cputime);
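To make the failure mode concrete, here is a small standalone userspace
sketch (with made-up numbers, for illustration only) of the u64 wrap-around
that the signed steal_diff in the patch above guards against:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
        /* Hypothetical values: after migration the hypervisor reports
         * a steal clock smaller than the value cached by the guest. */
        uint64_t prev_steal_time = 329000000000ULL;  /* ~329 s, cached   */
        uint64_t steal           = 311000000000ULL;  /* ~311 s, reported */

        uint64_t udiff = steal - prev_steal_time;            /* wraps to huge value */
        int64_t  sdiff = (int64_t)(steal - prev_steal_time); /* correctly negative  */

        printf("unsigned diff: %llu ns\n", (unsigned long long)udiff);
        printf("signed diff:   %lld ns\n", (long long)sdiff);
        return 0;
}

Without the signed check, the wrapped value is only capped by
min(..., maxtime), so every tick is accounted as almost entirely stolen,
which is how the impossibly large steal time in /proc/stat arises.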