diff mbox

xen:rtds:Clear __RTDS_depleted bit once replenish vcpu

Message ID 1477102322-4162-1-git-send-email-mengxu@cis.upenn.edu (mailing list archive)
State New, archived
Headers show

Commit Message

Meng Xu Oct. 22, 2016, 2:12 a.m. UTC
We should clear the __RTDS_depleted bit once a VCPU budget is replenished.
Because repl_timer_handler may be called after rt_schedule
but before rt_context_saved, the VCPU may be not on CPU or on queue
when the VCPU is the middle of context switch

Signed-off-by: Meng Xu <mengxu@cis.upenn.edu>
---
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Linh Thi Xuan Phan <linhphan@cis.upenn.edu>
Cc: Haoran Li <lihaoran@wustl.edu>
Cc: Meng Xu <xumengpanda@gmail.com>
Cc: Dagaen Golomb <dgolomb@cis.upenn.edu>
Cc: Tianyang Chen <tiche@cis.upenn.edu>
---
 xen/common/sched_rt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Dario Faggioli Oct. 25, 2016, 9:20 a.m. UTC | #1
On Fri, 2016-10-21 at 22:12 -0400, Meng Xu wrote:
> We should clear the __RTDS_depleted bit once a VCPU budget is
> replenished.
> Because repl_timer_handler may be called after rt_schedule
> but before rt_context_saved, the VCPU may be not on CPU or on queue
> when the VCPU is the middle of context switch
>
Yes, this makes sense, and is a good fix.

I actually would prefer the subject to become (something like):

"xen: rtds: always clear the flag when replenishing a depleted vcpu"

whith that changed:

> Signed-off-by: Meng Xu <mengxu@cis.upenn.edu>
>
Acked-by: Dario Faggioli <dario.faggioli@citrix.com>

This being said, I hope you see the value of having split the patches
--especially patches like this-- in such a way that each one deals with
one specific issue.

It doesn't matter if they end up being very small, in terms of changed
lines. In fact, the behavioral issue being dealt with in here is rather
subtle, and most important "self-contained", i.e., independent from any
other issue that we may or may not have already found.

Also, consider that, if it turns out that this patch is wrong, i.e., it
does not really fix the problem and/or introduce other ones, we will be
able to revert it. If the hunk below was part of a more complex patch,
that would have been both more difficult and less neat.

Another example:

> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -1488,8 +1488,8 @@ static void repl_timer_handler(void *data){
>              if ( svc->cur_deadline > next_on_runq->cur_deadline )
>                  runq_tickle(ops, next_on_runq);
>          }
> -        else if ( vcpu_on_q(svc) &&
> -                  __test_and_clear_bit(__RTDS_depleted, &svc->flags) 
> )
> +        else if ( __test_and_clear_bit(__RTDS_depleted, &svc->flags) 
> &&
> +                  vcpu_on_q(svc) )
>              runq_tickle(ops, svc);
>  
The previous patch was dropping the 'else', the reason for which was
not really easy to see to me. I went looking at the commit message, but
it was not really clear itself, and actually contained the description
of two problems and two fixes! :-O

So, again, for things like these, 1 issue ==> 1 patch.

About the other patch, I think I understand the situation, but I don't
like the fix much. I'm looking into it and thinking what could be an
alternative solution. I'll let you know.

And finally, independently from these patches, I was thinking whether
it would not be worth dealing with the case budget==period as a special
case. I mean, if that is the case, there's no need to program the
timer, etc, we could just identify the situation and act accordingly
(e.g., in burn_budget()).

This is just an idea, though, based on the fact that people may want to
set budget==period for some domain/vcpu (like the "special" ones, dom0
or driver domains), and if there are a few of them, we avoid the
overhead of timers firing very very close to re-scheduling points,
etc. 
On the other hand, special cases make the code uglier, and often harder
to follow. So I'm saying it may (in my opinion) be worth trying to
write a patch, but it's not guaranteed that we actually want to take
it... It's just hard to tell, before actually seeing the code.

This is, of course, just my idea, with which you can well disagree. :-)

Thanks and regards,
Dario
Meng Xu Oct. 25, 2016, 1:27 p.m. UTC | #2
On Tue, Oct 25, 2016 at 5:20 AM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> On Fri, 2016-10-21 at 22:12 -0400, Meng Xu wrote:
>> We should clear the __RTDS_depleted bit once a VCPU budget is
>> replenished.
>> Because repl_timer_handler may be called after rt_schedule
>> but before rt_context_saved, the VCPU may be not on CPU or on queue
>> when the VCPU is the middle of context switch
>>
> Yes, this makes sense, and is a good fix.
>
> I actually would prefer the subject to become (something like):
>
> "xen: rtds: always clear the flag when replenishing a depleted vcpu"
>
> whith that changed:
>
>> Signed-off-by: Meng Xu <mengxu@cis.upenn.edu>
>>
> Acked-by: Dario Faggioli <dario.faggioli@citrix.com>
>
> This being said, I hope you see the value of having split the patches
> --especially patches like this-- in such a way that each one deals with
> one specific issue.

Yep. :-)

>
> It doesn't matter if they end up being very small, in terms of changed
> lines. In fact, the behavioral issue being dealt with in here is rather
> subtle, and most important "self-contained", i.e., independent from any
> other issue that we may or may not have already found.
>
> Also, consider that, if it turns out that this patch is wrong, i.e., it
> does not really fix the problem and/or introduce other ones, we will be
> able to revert it. If the hunk below was part of a more complex patch,
> that would have been both more difficult and less neat.
>
> Another example:
>
>> --- a/xen/common/sched_rt.c
>> +++ b/xen/common/sched_rt.c
>> @@ -1488,8 +1488,8 @@ static void repl_timer_handler(void *data){
>>              if ( svc->cur_deadline > next_on_runq->cur_deadline )
>>                  runq_tickle(ops, next_on_runq);
>>          }
>> -        else if ( vcpu_on_q(svc) &&
>> -                  __test_and_clear_bit(__RTDS_depleted, &svc->flags)
>> )
>> +        else if ( __test_and_clear_bit(__RTDS_depleted, &svc->flags)
>> &&
>> +                  vcpu_on_q(svc) )
>>              runq_tickle(ops, svc);
>>
> The previous patch was dropping the 'else', the reason for which was
> not really easy to see to me. I went looking at the commit message, but
> it was not really clear itself, and actually contained the description
> of two problems and two fixes! :-O
>
> So, again, for things like these, 1 issue ==> 1 patch.

Yep... Splitting the patch is better and easier to explain as well. ;-)

>
> About the other patch, I think I understand the situation, but I don't
> like the fix much. I'm looking into it and thinking what could be an
> alternative solution. I'll let you know.

OK. I guess you don't want to update the budget related variable in
the update_deadline function.
In addition, we may not want to fire a budget_enforcement timer when
the timer is after the budget_replenish timer.  (This can be solved by
using the min(remaining_budget, cur_deadline - now) at the end of
rt_schedule().)

Let's discuss this issue in the other patch, instead of here.

>
> And finally, independently from these patches, I was thinking whether
> it would not be worth dealing with the case budget==period as a special
> case. I mean, if that is the case, there's no need to program the
> timer, etc, we could just identify the situation and act accordingly
> (e.g., in burn_budget()).

Ah-ha. I thought about this in 2015 March. At that time, I want to
disable the scheduler for this special case so that we can get rid of
the scheduler overhead for the low-latency applications (and the HPC
applications). The reference is at
https://lists.xenproject.org/archives/html/xen-devel/2015-03/msg02854.html

The conclusion was that we may make the code a little "dirty" by
having this. Once the RTDS become event-driven, we can set the VCPU's
period and budget to a very very very very large value. The scheduler
won't kick in until a very large period, say 10mins. The overhead will
be negligible small.

>
> This is just an idea, though, based on the fact that people may want to
> set budget==period for some domain/vcpu (like the "special" ones, dom0
> or driver domains), and if there are a few of them, we avoid the
> overhead of timers firing very very close to re-scheduling points,
> etc.
> On the other hand, special cases make the code uglier, and often harder
> to follow. So I'm saying it may (in my opinion) be worth trying to
> write a patch, but it's not guaranteed that we actually want to take
> it... It's just hard to tell, before actually seeing the code.

Yep. The other hand is the issue. How about setting a large period for
the special case when period = budget? Maybe we should discuss this in
a different thread? ;-)

>
> This is, of course, just my idea, with which you can well disagree. :-)

My concern is still the maintenance of the code. Is it better to have
a separate scheduler for the full-capacity VCPU case?  People can
choose to compile the separate scheduler and use it in a separate
cpupool.

Actually, if Xen can have the similar functionality like Jailhouse,
that would be awesome for low-latency applications (and maybe for HPC
as well?).  This will also provide the solution for the full-capacity
VCPU case as well.

Thanks and Best Regards,

Meng
Wei Liu Oct. 26, 2016, 12:41 p.m. UTC | #3
On Tue, Oct 25, 2016 at 11:20:51AM +0200, Dario Faggioli wrote:
> On Fri, 2016-10-21 at 22:12 -0400, Meng Xu wrote:
> > We should clear the __RTDS_depleted bit once a VCPU budget is
> > replenished.
> > Because repl_timer_handler may be called after rt_schedule
> > but before rt_context_saved, the VCPU may be not on CPU or on queue
> > when the VCPU is the middle of context switch
> >
> Yes, this makes sense, and is a good fix.
> 
> I actually would prefer the subject to become (something like):
> 
> "xen: rtds: always clear the flag when replenishing a depleted vcpu"
> 
> whith that changed:
> 
> > Signed-off-by: Meng Xu <mengxu@cis.upenn.edu>
> >
> Acked-by: Dario Faggioli <dario.faggioli@citrix.com>
> 

Fixed the subject line and applied for 4.8.
diff mbox

Patch

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index ff17bd2..16b5691 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -1488,8 +1488,8 @@  static void repl_timer_handler(void *data){
             if ( svc->cur_deadline > next_on_runq->cur_deadline )
                 runq_tickle(ops, next_on_runq);
         }
-        else if ( vcpu_on_q(svc) &&
-                  __test_and_clear_bit(__RTDS_depleted, &svc->flags) )
+        else if ( __test_and_clear_bit(__RTDS_depleted, &svc->flags) &&
+                  vcpu_on_q(svc) )
             runq_tickle(ops, svc);
 
         list_del(&svc->replq_elem);