From patchwork Fri Mar  4 16:34:39 2016
X-Patchwork-Submitter: Dario Faggioli
X-Patchwork-Id: 8506071
Message-ID: <1457109279.2959.553.camel@citrix.com>
From: Dario Faggioli
To: "Chen, Tianyang"
Cc: xen-devel@lists.xenproject.org, george.dunlap@citrix.com,
 Dagaen Golomb, Meng Xu
Date: Fri, 4 Mar 2016 17:34:39 +0100
Subject: Re: [Xen-devel] [PATCH v6][RFC]xen: sched: convert RTDS from time to event driven model

On Fri, 2016-02-26 at 13:33 -0500, Chen, Tianyang wrote:
> Attached.
> 
Hey, here I am... sorry it took a bit.

I've had a look, and I've been able to come up with some code that I
at least do not dislike too much! ;-P
Have a look at the attached patch (you should apply it on top of the
sched_rt.c file that you sent me in attachment).

About what was happening...

> On 2016-02-26 13:09, Dario Faggioli wrote:
> > On Fri, 2016-02-26 at 12:28 -0500, Tianyang Chen wrote:
> > > 
> > > I added debug prints in all these functions and noticed that it
> > > could be caused by racing between spurious wakeups and context
> > > switching.
> > > 
It's not really a spurious wakeup, it's a regular wakeup happening
immediately after the vcpu blocked, when we still haven't been able to
execute rt_context_saved().

It's not really a race, and in fact we have the _RTDS_scheduled and
_RTDS_delayed_runq_add flags that are meant to deal with exactly these
situations. Here is what I see (I added some more debug printk):

> > > (XEN) vcpu1 wakes up nowhere
> > > (XEN) cpu1 picked vcpu1     *** vcpu1 is on a cpu
> > > (XEN) cpu1 picked idle      *** vcpu1 is waiting to be context switched
> > > (XEN) vcpu2 wakes up nowhere
> > > (XEN) cpu0 picked vcpu0
> > > (XEN) cpu2 picked vcpu2
> > > (XEN) cpu0 picked vcpu0
> > > (XEN) cpu0 picked vcpu0
> > > (XEN) d0 attempted to change d0v2's CR4 flags 00000620 -> 00040660
> > > (XEN) cpu0 picked vcpu0
> > > (XEN) vcpu2 sleeps on cpu
> > > (XEN) vcpu1 wakes up nowhere      *** vcpu1 wakes up without sleep?
> > > 
> > > (XEN) Assertion '!__vcpu_on_replq(svc)' failed at sched_rt.c:526
> > > (XEN) ----[ Xen-4.7-unstable  x86_64  debug=y  Tainted:    C ]----
> 
(XEN) [   56.113897] vcpu13 wakes up nowhere
(XEN) [   56.113906] cpu4 picked vcpu13
(XEN) [   56.113962] vcpu13 blocks
(XEN) [   56.113965] cpu4 picked idle
(XEN) [   56.113969] vcpu13 unblocks
(XEN) [   56.113972] vcpu13 wakes up nowhere
(XEN) [   56.113980] vcpu13 woken while still in scheduling tail
(XEN) [   56.113985] vcpu13 context saved and added back to queue
(XEN) [   56.113992] cpu4 picked vcpu13

So, as you can see, at 56.113962 vcpu13 blocks (note, _blocks_ !=
_sleeps_), and cpu4 goes idle. Ideally, rt_context_saved() would run
and remove the replenishment event of vcpu13 from the replenishment
queue. But, at 56.113969, vcpu13 wakes up already, before
rt_context_saved() had a chance to run (which happens at 56.113985).

It is not a spurious wakeup, as the vcpu was actually blocked and is
being unblocked. Since rt_context_saved() hasn't run yet, the
replenishment event is still in the queue, and hence any ASSERT asking
for !__vcpu_on_replq(vcpu13) is doomed to make Xen explode! :-/

That does not happen in the execution trace above, because it was
collected with my patch applied, where I either leave the
replenishment event alone, if it is still valid (i.e., no deadline
miss happened during the blocked period) or, if a new one needs to be
enqueued, I dequeue the old one first.

The end result is not particularly pretty, but I am up for doing even
worse, for the sake of keeping things like this:

 ASSERT( !__vcpu_on_replq(svc) );

at the beginning of replq_insert() (and its appropriate counterpart at
the beginning of replq_remove()). In fact, I consider them really
really helpful when reasoning about the code and trying to figure out
how it works... There is nothing that I hate more than an 'enqueue'
function for which you don't know whether it is ok to call it when the
entity being enqueued is in the queue already (and what actually
happens if it is).
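(To illustrate the discipline being argued for, here is a toy,
self-contained sketch, with the standard assert() standing in for
Xen's ASSERT(); the names are only borrowed from sched_rt.c for
flavour, this is not the actual Xen code:)

#include <assert.h>
#include <stdio.h>

struct toy_vcpu {
    int id;
    struct toy_vcpu *replq_next;
    int on_replq;                /* non-zero <=> on the replenishment queue */
};

static struct toy_vcpu *replq_head;

static int vcpu_on_replq(const struct toy_vcpu *v)
{
    return v->on_replq;
}

static void replq_insert(struct toy_vcpu *v)
{
    /* It is a caller bug to enqueue an event that is queued already. */
    assert( !vcpu_on_replq(v) );

    v->replq_next = replq_head;
    replq_head = v;
    v->on_replq = 1;
}

static void replq_remove(struct toy_vcpu *v)
{
    struct toy_vcpu **pp = &replq_head;

    /* Symmetrically, only ever remove events that are actually queued. */
    assert( vcpu_on_replq(v) );

    while ( *pp != v )
        pp = &(*pp)->replq_next;
    *pp = v->replq_next;
    v->replq_next = NULL;
    v->on_replq = 0;
}

int main(void)
{
    struct toy_vcpu v = { .id = 13 };

    replq_insert(&v);
    /* replq_insert(&v);  <-- would trip the assert, as it should */
    replq_remove(&v);
    printf("vcpu%d enqueued and dequeued exactly once\n", v.id);
    return 0;
}

Double-inserting v in the toy trips the assert immediately, which is
exactly the kind of loud, early failure being defended here.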
These ASSERTs greatly help, from that point of view, by clearly
stating that, _no_, in this case it's absolutely not right to call the
enqueue function if the event is in the queue already. :-)

Hope I made myself clear. I gave the attached patch some testing and
it seems to work for me, but I'd appreciate it if you could do more of
that.

Regards,
Dario

--- /home/dario/Desktop/sched_rt.c	2016-03-04 15:47:14.901787122 +0100
+++ xen/common/sched_rt.c	2016-03-04 16:55:06.746641424 +0100
@@ -439,6 +439,8 @@
     struct list_head *replq = rt_replq(ops);
     struct timer* repl_timer = prv->repl_timer;
 
+    ASSERT( __vcpu_on_replq(svc) );
+
     /*
      * Disarm the timer before re-programming it.
      * It doesn't matter if the vcpu to be removed
@@ -580,7 +582,7 @@
 }
 
 static void
-rt_deinit(const struct scheduler *ops)
+rt_deinit(struct scheduler *ops)
 {
     struct rt_private *prv = rt_priv(ops);
 
@@ -1153,6 +1155,7 @@
 {
     struct rt_vcpu * const svc = rt_vcpu(vc);
     s_time_t now = NOW();
+    bool_t missed;
 
     BUG_ON( is_idle_vcpu(vc) );
 
@@ -1181,28 +1184,42 @@
 
     /*
      * If a deadline passed while svc was asleep/blocked, we need new
-     * scheduling parameters ( a new deadline and full budget), and
-     * also a new replenishment event
+     * scheduling parameters (a new deadline and full budget).
      */
-    if ( now >= svc->cur_deadline)
-    {
+    if ( (missed = now >= svc->cur_deadline) )
         rt_update_deadline(now, svc);
-        __replq_remove(ops, svc);
-    }
-
-    // if( !__vcpu_on_replq(svc) )
-        __replq_insert(ops, svc);
 
-    /* If context hasn't been saved for this vcpu yet, we can't put it on
-     * the Runqueue/DepletedQ. Instead, we set a flag so that it will be
-     * put on the Runqueue/DepletedQ after the context has been saved.
+    /*
+     * If context hasn't been saved for this vcpu yet, we can't put it on
+     * the run-queue/depleted-queue. Instead, we set the appropriate flag,
+     * and the vcpu will be put back on the queue after the context has
+     * been saved (in rt_context_saved()).
      */
     if ( unlikely(svc->flags & RTDS_scheduled) )
     {
+        printk("vcpu%d woken while still in scheduling tail\n", vc->vcpu_id);
         set_bit(__RTDS_delayed_runq_add, &svc->flags);
+        /*
+         * The vcpu is waking up already, and we didn't even have the time to
+         * remove its next replenishment event from the replenishment queue
+         * when it blocked! No big deal. If we did not miss the deadline in
+         * the meantime, let's just leave it there. If we did, let's remove it
+         * and queue a new one (to occur at our new deadline).
+         */
+        if ( missed )
+        {
+            // XXX You may want to implement something like replq_reinsert(),
+            //     for dealing with this case, and calling that one from here
+            //     (of course!). I'd do that by merging _remove and _insert
+            //     and killing the duplicated bits.
+            __replq_remove(ops, svc);
+            __replq_insert(ops, svc);
+        }
         return;
     }
 
+    /* Replenishment event got cancelled when we blocked. Add it back. */
+    __replq_insert(ops, svc);
+
    /* insert svc to runq/depletedq because svc is not in queue now */
    __runq_insert(ops, svc);
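As for the XXX note in the last hunk: the replq_reinsert() helper it
suggests could start out as simple as the sketch below (hypothetical
only, such a function is not part of the patch; the names and the
signature are just taken from the helpers visible above, and the real
gain Dario hints at would come from merging the bodies of
__replq_remove() and __replq_insert() so the replenishment timer is
only re-programmed once):

/*
 * Hypothetical sketch, not in the patch: "move" svc's still-queued
 * replenishment event, e.g. because rt_update_deadline() has just
 * given svc a new deadline.
 */
static void
replq_reinsert(const struct scheduler *ops, struct rt_vcpu *svc)
{
    ASSERT( __vcpu_on_replq(svc) );

    __replq_remove(ops, svc);
    __replq_insert(ops, svc);
}

With something along those lines, the "if ( missed )" branch in
rt_vcpu_wake() above would collapse to a single
replq_reinsert(ops, svc) call.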