From patchwork Fri Mar  4 16:34:39 2016
X-Patchwork-Submitter: Dario Faggioli
X-Patchwork-Id: 8506071
Message-ID: <1457109279.2959.553.camel@citrix.com>
From: Dario Faggioli
To: "Chen, Tianyang"
Cc: xen-devel@lists.xenproject.org, george.dunlap@citrix.com,
 Dagaen Golomb, Meng Xu
Date: Fri, 4 Mar 2016 17:34:39 +0100
Subject: Re: [Xen-devel] [PATCH v6][RFC]xen: sched: convert RTDS from time to event driven model

On Fri, 2016-02-26 at 13:33 -0500, Chen, Tianyang wrote:
> Attached.
> 
Hey, here I am... sorry it took a bit.

I've had a look, and I've been able to come up with some code that I
at least do not dislike too much! ;-P
Have a look at the attached patch (you should apply it on top of the
sched_rt.c file that you sent me in attachment).

About what was happening...

> On 2016-02-26 13:09, Dario Faggioli wrote:
> > On Fri, 2016-02-26 at 12:28 -0500, Tianyang Chen wrote:
> > > 
> > > I added debug prints in all these functions and noticed that it
> > > could be caused by racing between spurious wakeups and context
> > > switching.
> > > 
It's not really a spurious wakeup, it's a regular wakeup happening
immediately after the vcpu blocked, when we still haven't been able to
execute rt_context_saved().

It's not really a race, and in fact we have the _RTDS_scheduled and
_RTDS_delayed_runq_add flags that are meant to deal with exactly these
situations. Here is what I see (I added some more debug printk):

> > > (XEN) vcpu1 wakes up nowhere
> > > (XEN) cpu1 picked vcpu1     *** vcpu1 is on a cpu
> > > (XEN) cpu1 picked idle      *** vcpu1 is waiting to be context switched
> > > (XEN) vcpu2 wakes up nowhere
> > > (XEN) cpu0 picked vcpu0
> > > (XEN) cpu2 picked vcpu2
> > > (XEN) cpu0 picked vcpu0
> > > (XEN) cpu0 picked vcpu0
> > > (XEN) d0 attempted to change d0v2's CR4 flags 00000620 -> 00040660
> > > (XEN) cpu0 picked vcpu0
> > > (XEN) vcpu2 sleeps on cpu
> > > (XEN) vcpu1 wakes up nowhere      *** vcpu1 wakes up without sleep?
> > > 
> > > (XEN) Assertion '!__vcpu_on_replq(svc)' failed at sched_rt.c:526
> > > (XEN) ----[ Xen-4.7-unstable  x86_64  debug=y  Tainted:    C ]----
> 
(XEN) [   56.113897] vcpu13 wakes up nowhere
(XEN) [   56.113906] cpu4 picked vcpu13
(XEN) [   56.113962] vcpu13 blocks
(XEN) [   56.113965] cpu4 picked idle
(XEN) [   56.113969] vcpu13 unblocks
(XEN) [   56.113972] vcpu13 wakes up nowhere
(XEN) [   56.113980] vcpu13 woken while still in scheduling tail
(XEN) [   56.113985] vcpu13 context saved and added back to queue
(XEN) [   56.113992] cpu4 picked vcpu13

So, as you can see, at 56.113962 vcpu13 blocks (note, _blocks_ !=
_sleeps_), and cpu4 goes idle. Ideally, rt_context_saved() would run
and remove the replenishment event of vcpu13 from the replenishment
queue. But, at 56.113969, vcpu13 wakes up already, before
rt_context_saved() had a chance to run (which happens at 56.113985).

It is not a spurious wakeup, as the vcpu was actually blocked and is
being unblocked. Since rt_context_saved() hasn't run yet, the
replenishment event is still in the queue, and hence any ASSERT asking
for !__vcpu_on_replq(vcpu13) is doomed to make Xen explode! :-/

That does not happen in the execution trace above, because it was
collected with my patch applied, where I either leave the
replenishment event alone, if it is still valid (i.e., no deadline
miss happened during the blocked period) or, if a new one needs to be
enqueued, I dequeue the old one first.

The end result is not particularly pretty, but I am up for doing even
worse, for the sake of keeping things like this:

 ASSERT( !__vcpu_on_replq(svc) );

at the beginning of replq_insert() (and its appropriate counterpart at
the beginning of replq_remove()). In fact, I consider them really
really helpful when reasoning about the code and trying to figure out
how it works... There is nothing that I hate more than an 'enqueue'
function for which you don't know whether it is ok to call it when the
entity being enqueued is in the queue already (and what actually
happens if it is).
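(To illustrate the discipline being argued for, here is a toy,
self-contained sketch, with the standard assert() standing in for
Xen's ASSERT(); the names are only borrowed from sched_rt.c for
flavour, this is not the actual Xen code:)

#include <assert.h>
#include <stdio.h>

struct toy_vcpu {
    int id;
    struct toy_vcpu *replq_next;
    int on_replq;                /* non-zero <=> on the replenishment queue */
};

static struct toy_vcpu *replq_head;

static int vcpu_on_replq(const struct toy_vcpu *v)
{
    return v->on_replq;
}

static void replq_insert(struct toy_vcpu *v)
{
    /* It is a caller bug to enqueue an event that is queued already. */
    assert( !vcpu_on_replq(v) );

    v->replq_next = replq_head;
    replq_head = v;
    v->on_replq = 1;
}

static void replq_remove(struct toy_vcpu *v)
{
    struct toy_vcpu **pp = &replq_head;

    /* Symmetrically, only ever remove events that are actually queued. */
    assert( vcpu_on_replq(v) );

    while ( *pp != v )
        pp = &(*pp)->replq_next;
    *pp = v->replq_next;
    v->replq_next = NULL;
    v->on_replq = 0;
}

int main(void)
{
    struct toy_vcpu v = { .id = 13 };

    replq_insert(&v);
    /* replq_insert(&v);  <-- would trip the assert, as it should */
    replq_remove(&v);
    printf("vcpu%d enqueued and dequeued exactly once\n", v.id);
    return 0;
}

Double-inserting v in the toy trips the assert immediately, which is
exactly the kind of loud, early failure being defended here.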
These ASSERTs greatly help, from that point of view, by clearly
stating that, _no_, in this case it's absolutely not right to call the
enqueue function if the event is in the queue already. :-)

Hope I made myself clear. I gave the attached patch some testing and
it seems to work for me, but I'd appreciate it if you could do more of
that.

Regards,
Dario

--- /home/dario/Desktop/sched_rt.c	2016-03-04 15:47:14.901787122 +0100
+++ xen/common/sched_rt.c	2016-03-04 16:55:06.746641424 +0100
@@ -439,6 +439,8 @@
     struct list_head *replq = rt_replq(ops);
     struct timer* repl_timer = prv->repl_timer;
 
+    ASSERT( __vcpu_on_replq(svc) );
+
     /*
      * Disarm the timer before re-programming it.
      * It doesn't matter if the vcpu to be removed
@@ -580,7 +582,7 @@
 }
 
 static void
-rt_deinit(const struct scheduler *ops)
+rt_deinit(struct scheduler *ops)
 {
     struct rt_private *prv = rt_priv(ops);
 
@@ -1153,6 +1155,7 @@
 {
     struct rt_vcpu * const svc = rt_vcpu(vc);
     s_time_t now = NOW();
+    bool_t missed;
 
     BUG_ON( is_idle_vcpu(vc) );
 
@@ -1181,28 +1184,42 @@
 
     /*
      * If a deadline passed while svc was asleep/blocked, we need new
-     * scheduling parameters ( a new deadline and full budget), and
-     * also a new replenishment event
+     * scheduling parameters (a new deadline and full budget).
      */
-    if ( now >= svc->cur_deadline)
-    {
+    if ( (missed = now >= svc->cur_deadline) )
         rt_update_deadline(now, svc);
-        __replq_remove(ops, svc);
-    }
-
-    // if( !__vcpu_on_replq(svc) )
-        __replq_insert(ops, svc);
 
-    /* If context hasn't been saved for this vcpu yet, we can't put it on
-     * the Runqueue/DepletedQ. Instead, we set a flag so that it will be
-     * put on the Runqueue/DepletedQ after the context has been saved.
+    /*
+     * If context hasn't been saved for this vcpu yet, we can't put it on
+     * the run-queue/depleted-queue. Instead, we set the appropriate flag,
+     * and the vcpu will be put back on the queue after the context has
+     * been saved (in rt_context_saved()).
      */
     if ( unlikely(svc->flags & RTDS_scheduled) )
     {
+        printk("vcpu%d woken while still in scheduling tail\n", vc->vcpu_id);
         set_bit(__RTDS_delayed_runq_add, &svc->flags);
+        /*
+         * The vcpu is waking up already, and we didn't even have the time to
+         * remove its next replenishment event from the replenishment queue
+         * when it blocked! No big deal. If we did not miss the deadline in
+         * the meantime, let's just leave it there. If we did, let's remove it
+         * and queue a new one (to occur at our new deadline).
+         */
+        if ( missed )
+        {
+            // XXX You may want to implement something like replq_reinsert(),
+            //     for dealing with this case, and calling that one from here
+            //     (of course!). I'd do that by merging _remove and _insert
+            //     and killing the duplicated bits.
+            __replq_remove(ops, svc);
+            __replq_insert(ops, svc);
+        }
         return;
     }
 
+    /* Replenishment event got cancelled when we blocked. Add it back. */
+    __replq_insert(ops, svc);
+
    /* insert svc to runq/depletedq because svc is not in queue now */
    __runq_insert(ops, svc);
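As for the XXX note in the last hunk: the replq_reinsert() helper it
suggests could start out as simple as the sketch below (hypothetical
only, such a function is not part of the patch; the names and the
signature are just taken from the helpers visible above, and the real
gain Dario hints at would come from merging the bodies of
__replq_remove() and __replq_insert() so the replenishment timer is
only re-programmed once):

/*
 * Hypothetical sketch, not in the patch: "move" svc's still-queued
 * replenishment event, e.g. because rt_update_deadline() has just
 * given svc a new deadline.
 */
static void
replq_reinsert(const struct scheduler *ops, struct rt_vcpu *svc)
{
    ASSERT( __vcpu_on_replq(svc) );

    __replq_remove(ops, svc);
    __replq_insert(ops, svc);
}

With something along those lines, the "if ( missed )" branch in
rt_vcpu_wake() above would collapse to a single
replq_reinsert(ops, svc) call.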