From patchwork Wed Oct 19 15:13:54 2016
X-Patchwork-Submitter: Meng Xu
X-Patchwork-Id: 9384389
From: Meng Xu
To: xen-devel@lists.xenproject.org
Cc: Wei Liu, George Dunlap, Haoran Li, Dario Faggioli,
 Linh Thi Xuan Phan, Meng Xu, Meng Xu, Dagaen Golomb
Date: Wed, 19 Oct 2016 11:13:54 -0400
Message-Id: <1476890041-4248-1-git-send-email-mengxu@cis.upenn.edu>
X-Mailer: git-send-email 1.9.1
Subject: [Xen-devel] [PATCH] xen:rtds:fix bug in accounting budget

The bug was introduced in Xen 4.7, when we converted the RTDS scheduler
from a quantum-driven model to an event-driven model. We assumed that
rt_schedule() is always called for a VCPU before the VCPU's budget
replenishment handler. This assumption does not hold when the system is
overloaded or when the VCPU's budget is almost equal to its period.

Buggy behavior:
1) A VCPU may get less budget than it is assigned in a period.
2) A full capacity VCPU, i.e., a VCPU whose period is equal to its
   budget, may not get any budget in some periods.

Bug analysis:
1) A VCPU's deadline can be fast-forwarded by more than one period.
   However, the VCPU's last_start time was not updated at the same time.
   If rt_schedule() is called after rt_update_deadline(), which happens
   when the VCPU's budget is equal to its period or when the VCPU has
   missed a deadline, burn_budget() will burn the budget that was just
   replenished, although the replenished budget should be used in the
   most recent period only. We should update the VCPU's last_start time
   to the start of the current period when rt_update_deadline() moves
   the VCPU to a new period.
2) When a full capacity VCPU depletes its budget and is context
   switching out, but the core's current running VCPU has not been
   updated yet, the budget replenishment timer may be triggered. The
   replenishment handler failed to re-schedule the full capacity VCPU
   because it thought the VCPU was still running.
   When a VCPU's budget is replenished, we try to tickle a CPU.
   When we look for a core to tickle for a VCPU that is context
   switching out, we always tickle the core where the VCPU was running
   if no other core can be tickled.

This bug was reported by Dagaen Golomb

Signed-off-by: Meng Xu
---
Cc: Dagaen Golomb
Cc: Dario Faggioli
Cc: George Dunlap
Cc: Wei Liu
Cc: Linh Thi Xuan Phan
Cc: Haoran Li
Cc: Meng Xu
---
 xen/common/sched_rt.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index d95f798..cdc5c06 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -407,6 +407,13 @@ rt_update_deadline(s_time_t now, struct rt_vcpu *svc)
         svc->cur_deadline += count * svc->period;
     }
 
+    /*
+     * rt_schedule may be scheduled after update deadline
+     * we should only deduct the budget consumed in current period
+     */
+    if ( svc->last_start < (svc->cur_deadline - svc->period) )
+        svc->last_start = svc->cur_deadline - svc->period;
+
     svc->cur_budget = svc->budget;
 
     /* TRACE */
@@ -1195,6 +1202,19 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new)
         goto out;
     }
 
+    /*
+     * new may be preempted due to out of budget
+     * new may replenish its budget before it is contexted switched out
+     * then new may preempt the to-be-scheduled task on its prev cpu
+     */
+    if ( curr_on_cpu(new->vcpu->processor) == new->vcpu &&
+         test_bit(__RTDS_delayed_runq_add, &new->flags) )
+    {
+        SCHED_STAT_CRANK(tickled_busy_cpu);
+        cpu_to_tickle = new->vcpu->processor;
+        goto out;
+    }
+
     /* didn't tickle any cpu */
     SCHED_STAT_CRANK(tickled_no_cpu);
     return;
@@ -1472,6 +1492,7 @@ static void repl_timer_handler(void *data){
     {
         svc = replq_elem(iter);
 
+        /* Another ready VCPU may preempt svc who updates its deadline */
         if ( curr_on_cpu(svc->vcpu->processor) == svc->vcpu &&
              !list_empty(runq) )
         {
@@ -1480,8 +1501,9 @@ static void repl_timer_handler(void *data){
             if ( svc->cur_deadline > next_on_runq->cur_deadline )
                 runq_tickle(ops, next_on_runq);
         }
-        else if ( vcpu_on_q(svc) &&
-                  __test_and_clear_bit(__RTDS_depleted, &svc->flags) )
+        /* svc may preempt another VCPU because it has budget again */
+        if ( __test_and_clear_bit(__RTDS_depleted, &svc->flags) &&
+             vcpu_runnable(svc->vcpu) )
             runq_tickle(ops, svc);
 
         list_del(&svc->replq_elem);
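
Illustration (not part of the patch): the effect of the last_start clamp
in the rt_update_deadline() hunk can be reproduced with a small
standalone model. This is only a sketch under stated assumptions:
struct toy_vcpu and the toy_* functions are hypothetical stand-ins, not
the real sched_rt.c code, the deadline update is reduced to its
arithmetic, and time is in arbitrary units.

```c
#include <stdint.h>

typedef int64_t s_time_t;

/* Hypothetical, simplified stand-in for struct rt_vcpu. */
struct toy_vcpu {
    s_time_t period;       /* length of one period */
    s_time_t budget;       /* budget granted per period */
    s_time_t cur_deadline; /* end of the current period */
    s_time_t cur_budget;   /* budget left in the current period */
    s_time_t last_start;   /* time from which consumption is charged */
};

/*
 * Move the deadline past "now" in whole periods and refill the budget.
 * With clamp != 0, last_start is pulled up to the start of the new
 * period, mirroring the check added by the patch.
 */
static void toy_update_deadline(struct toy_vcpu *v, s_time_t now, int clamp)
{
    if (v->cur_deadline <= now) {
        s_time_t count = (now - v->cur_deadline) / v->period + 1;
        v->cur_deadline += count * v->period;
    }
    if (clamp && v->last_start < v->cur_deadline - v->period)
        v->last_start = v->cur_deadline - v->period;
    v->cur_budget = v->budget;
}

/* Charge the time elapsed since last_start against the current budget. */
static void toy_burn_budget(struct toy_vcpu *v, s_time_t now)
{
    s_time_t delta = now - v->last_start;

    if (delta <= 0)
        return;
    v->cur_budget -= delta;
    v->last_start = now;
    if (v->cur_budget < 0)
        v->cur_budget = 0;
}

/*
 * Full capacity VCPU (budget == period == 10) running since t = 0.
 * The replenishment runs at t = 10, before any budget was burned,
 * and the budget is burned afterwards at t = 12.
 * Returns the budget left after that burn.
 */
static s_time_t toy_run(int clamp)
{
    struct toy_vcpu v = {
        .period = 10, .budget = 10,
        .cur_deadline = 10, .cur_budget = 10, .last_start = 0,
    };

    toy_update_deadline(&v, 10, clamp);
    toy_burn_budget(&v, 12);
    return v.cur_budget;
}
```

In this model toy_run(0) returns 0: the 12 units charged include the 10
units that belong to the previous period, so the freshly replenished
budget is wiped out. toy_run(1) returns 8, because only the 2 units
consumed in the current period are deducted.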