From patchwork Wed Aug 17 17:20:21 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dario Faggioli
X-Patchwork-Id: 9286253
From: Dario Faggioli
To: xen-devel@lists.xenproject.org
Date: Wed, 17 Aug 2016 19:20:21 +0200
Message-ID: <147145442187.25877.14699749197377382888.stgit@Solace.fritz.box>
In-Reply-To: <147145358844.25877.7490417583264534196.stgit@Solace.fritz.box>
References: <147145358844.25877.7490417583264534196.stgit@Solace.fritz.box>
User-Agent: StGit/0.17.1-dirty
Cc: Anshul Makkar, George Dunlap
Subject: [Xen-devel] [PATCH 24/24] xen: credit2: try to avoid tickling cpus
 subject to ratelimiting
List-Id: Xen developer discussion

With context switching ratelimiting enabled, the following pattern
is quite common in a scheduling trace:

     0.000845622 |||||||||||.x||| d32768v12 csched2:runq_insert d0v13, position 0
     0.000845831 |||||||||||.x||| d32768v12 csched2:runq_tickle_new d0v13, processor = 12, credit = 10135529
     0.000846546 |||||||||||.x||| d32768v12 csched2:burn_credits d2v7, credit = 2619231, delta = 255937
 [1] 0.000846739 |||||||||||.x||| d32768v12 csched2:runq_tickle cpu 12
     [...]
 [2] 0.000850597 ||||||||||||x||| d32768v12 csched2:schedule cpu 12, rq# 1, busy, SMT busy, tickled
     0.000850760 ||||||||||||x||| d32768v12 csched2:burn_credits d2v7, credit = 2614028, delta = 5203
 [3] 0.000851022 ||||||||||||x||| d32768v12 csched2:ratelimit triggered
 [4] 0.000851614 ||||||||||||x||| d32768v12 runstate_continue d2v7 running->running

Basically, what happens is that runq_tickle() realizes
d0v13 should preempt d2v7, running on cpu 12, as it
has higher credits (10135529 vs. 2619231). It therefore
tickles cpu 12 [1], which, in turn, schedules [2].

But --surprise surprise-- d2v7 has run for less than the
ratelimit interval [3], and hence it is _not_ preempted,
and continues to run. This indeed looks fine. Actually,
this is what ratelimiting is there for. Note, however,
that:
 1) we interrupted cpu 12 for nothing;
 2) what if, say on cpu 8, there is a vcpu that has:
    + less credit than d0v13 (so d0v13 can well
      preempt it),
    + more credit than d2v7 (that's why it was not
      selected to be preempted),
    + run for more than the ratelimiting interval
      (so it can really be scheduled out)?

This patch tries to figure out whether the situation
is the one described at 2) and, if it is, tickles 8 (in
the example above) instead of 12.
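
To make case 2) concrete, here is a minimal standalone sketch of the
selection rule, not the actual runq_tickle() code. The struct, the
function names, the runtimes and the credit of the vcpu on cpu 8 are all
hypothetical; only the credits of d0v13 and d2v7 come from the trace
above. It picks the lowest-credit running vcpu that the newcomer can beat
and that has already run past the ratelimit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a busy pcpu; not Xen's real structures. */
struct pcpu_state {
    unsigned int cpu;    /* pcpu id */
    int64_t credit;      /* credit of the vcpu currently running there */
    int64_t runtime_ns;  /* how long that vcpu has been running so far */
};

/*
 * Return the pcpu to tickle for a waking vcpu holding new_credit, or -1 if
 * no running vcpu can be preempted right now. A candidate must have less
 * credit than the best found so far (initially the newcomer's credit) and
 * must have run for longer than ratelimit_ns.
 */
static int pick_cpu_to_tickle(const struct pcpu_state *cpus, size_t n,
                              int64_t new_credit, int64_t ratelimit_ns)
{
    int best = -1;
    int64_t lowest = new_credit;

    for (size_t i = 0; i < n; i++) {
        if (cpus[i].credit >= lowest)
            continue;                 /* not a better victim */
        if (cpus[i].runtime_ns <= ratelimit_ns)
            continue;                 /* still protected by ratelimiting */
        lowest = cpus[i].credit;
        best = (int)cpus[i].cpu;
    }
    return best;
}

/* The scenario from the commit message, with made-up runtimes for both cpus. */
static int example_scenario(int64_t ratelimit_ns)
{
    const struct pcpu_state cpus[] = {
        { 12, 2619231,  300000 },  /* d2v7: lowest credit, but just started */
        {  8, 5000000, 2000000 },  /* hypothetical vcpu: ran long enough */
    };

    return pick_cpu_to_tickle(cpus, sizeof(cpus) / sizeof(cpus[0]),
                              10135529 /* d0v13 */, ratelimit_ns);
}
```

With a 1ms ratelimit (1000000 ns), example_scenario() selects cpu 8; with
the ratelimit check effectively off (0), it falls back to cpu 12, which is
what the credit comparison alone would choose.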
Signed-off-by: Dario Faggioli
---
Cc: George Dunlap
Cc: Anshul Makkar
---
 xen/common/sched_credit2.c | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index f03ecce..3bb764d 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -146,6 +146,8 @@
 #define CSCHED2_MIGRATE_RESIST ((opt_migrate_resist)*MICROSECS(1))
 /* How much to "compensate" a vcpu for L2 migration */
 #define CSCHED2_MIGRATE_COMPENSATION MICROSECS(50)
+/* How tolerant we should be when peeking at runtime of vcpus on other cpus */
+#define CSCHED2_RATELIMIT_TICKLE_TOLERANCE MICROSECS(50)
 /* How big of a bias we should have against a yielding vcpu */
 #define CSCHED2_YIELD_BIAS ((opt_yield_bias)*MICROSECS(1))
 #define CSCHED2_YIELD_BIAS_MIN CSCHED2_MIN_TIMER
@@ -972,6 +974,27 @@ static inline bool_t soft_aff_check_preempt(unsigned int bs, unsigned int cpu)
     return !cpumask_test_cpu(cpu, cpumask_scratch);
 }
 
+/*
+ * What we want to know is whether svc, which we assume to be running on some
+ * pcpu, can be interrupted and preempted. So far, the only reason because of
+ * which a preemption would be deferred is context switch ratelimiting, so
+ * check for that.
+ *
+ * Use a caller provided value of ratelimit, instead of the scheduler's own
+ * prv->ratelimit_us so the caller can play some tricks, if he wants (which,
+ * as a matter of fact, he does, by applying the tolerance).
+ */
+static inline bool_t is_preemptable(const struct csched2_vcpu *svc,
+                                    s_time_t now, s_time_t ratelimit)
+{
+    s_time_t runtime;
+
+    ASSERT(svc->vcpu->is_running);
+    runtime = now - svc->vcpu->runstate.state_entry_time;
+
+    return runtime > ratelimit;
+}
+
 void burn_credits(struct csched2_runqueue_data *rqd, struct csched2_vcpu *, s_time_t);
 
 /*
@@ -997,6 +1020,8 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
     s_time_t lowest = (1<<30);
     unsigned int bs, cpu = new->vcpu->processor;
     struct csched2_runqueue_data *rqd = RQD(ops, cpu);
+    s_time_t ratelimit = MICROSECS(CSCHED2_PRIV(ops)->ratelimit_us) -
+                         CSCHED2_RATELIMIT_TICKLE_TOLERANCE;
     cpumask_t mask, skip_mask;
     struct csched2_vcpu * cur;
 
@@ -1104,7 +1129,8 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
                     (unsigned char *)&d);
         }
 
-        if ( cur->credit < new->credit )
+        if ( cur->credit < new->credit &&
+             is_preemptable(cur, now, ratelimit) )
         {
             SCHED_STAT_CRANK(tickled_busy_cpu);
             ipid = cpu;
@@ -1155,7 +1181,8 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
                         (unsigned char *)&d);
             }
 
-            if ( cur->credit < lowest )
+            if ( cur->credit < lowest &&
+                 is_preemptable(cur, now, ratelimit) )
             {
                 ipid = i;
                 lowest = cur->credit;
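
As a rough illustration of how the tolerance changes the threshold, the
sketch below reproduces the arithmetic of is_preemptable() outside of Xen.
The typedef, the macro and the function names are stand-ins written for
this example (only the 50us tolerance is taken from the patch), and the
1000us ratelimit used in the note below is just an assumed example value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t s_time_t;                        /* nanoseconds (stand-in) */
#define MICROSECS(us) ((s_time_t)(us) * 1000)    /* stand-in for Xen's macro */
#define RATELIMIT_TICKLE_TOLERANCE MICROSECS(50) /* value used by the patch */

/* Threshold runq_tickle() would pass down: the ratelimit minus the tolerance. */
static s_time_t tickle_ratelimit(unsigned int ratelimit_us)
{
    return MICROSECS(ratelimit_us) - RATELIMIT_TICKLE_TOLERANCE;
}

/*
 * Same comparison as is_preemptable(), but taking the time at which the
 * vcpu started running directly instead of reading it from a vcpu struct.
 */
static bool is_preemptable_sketch(s_time_t state_entry_time, s_time_t now,
                                  s_time_t ratelimit)
{
    assert(now >= state_entry_time);
    return (now - state_entry_time) > ratelimit;
}
```

With an assumed ratelimit of 1000us, tickle_ratelimit(1000) is 950000 ns,
so a vcpu that has run for 960us already counts as preemptable, while one
that has run for 900us does not. With ratelimiting disabled (0), the
threshold becomes negative and the check is always true, so the behaviour
reduces to the plain credit comparison.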