Message ID | 1477102322-4162-1-git-send-email-mengxu@cis.upenn.edu (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, 2016-10-21 at 22:12 -0400, Meng Xu wrote: > We should clear the __RTDS_depleted bit once a VCPU budget is > replenished. > Because repl_timer_handler may be called after rt_schedule > but before rt_context_saved, the VCPU may be not on CPU or on queue > when the VCPU is the middle of context switch > Yes, this makes sense, and is a good fix. I actually would prefer the subject to become (something like): "xen: rtds: always clear the flag when replenishing a depleted vcpu" whith that changed: > Signed-off-by: Meng Xu <mengxu@cis.upenn.edu> > Acked-by: Dario Faggioli <dario.faggioli@citrix.com> This being said, I hope you see the value of having split the patches --especially patches like this-- in such a way that each one deals with one specific issue. It doesn't matter if they end up being very small, in terms of changed lines. In fact, the behavioral issue being dealt with in here is rather subtle, and most important "self-contained", i.e., independent from any other issue that we may or may not have already found. Also, consider that, if it turns out that this patch is wrong, i.e., it does not really fix the problem and/or introduce other ones, we will be able to revert it. If the hunk below was part of a more complex patch, that would have been both more difficult and less neat. Another example: > --- a/xen/common/sched_rt.c > +++ b/xen/common/sched_rt.c > @@ -1488,8 +1488,8 @@ static void repl_timer_handler(void *data){ > if ( svc->cur_deadline > next_on_runq->cur_deadline ) > runq_tickle(ops, next_on_runq); > } > - else if ( vcpu_on_q(svc) && > - __test_and_clear_bit(__RTDS_depleted, &svc->flags) > ) > + else if ( __test_and_clear_bit(__RTDS_depleted, &svc->flags) > && > + vcpu_on_q(svc) ) > runq_tickle(ops, svc); > The previous patch was dropping the 'else', the reason for which was not really easy to see to me. I went looking at the commit message, but it was not really clear itself, and actually contained the description of two problems and two fixes! :-O So, again, for things like these, 1 issue ==> 1 patch. About the other patch, I think I understand the situation, but I don't like the fix much. I'm looking into it and thinking what could be an alternative solution. I'll let you know. And finally, independently from these patches, I was thinking whether it would not be worth dealing with the case budget==period as a special case. I mean, if that is the case, there's no need to program the timer, etc, we could just identify the situation and act accordingly (e.g., in burn_budget()). This is just an idea, though, based on the fact that people may want to set budget==period for some domain/vcpu (like the "special" ones, dom0 or driver domains), and if there are a few of them, we avoid the overhead of timers firing very very close to re-scheduling points, etc. On the other hand, special cases make the code uglier, and often harder to follow. So I'm saying it may (in my opinion) be worth trying to write a patch, but it's not guaranteed that we actually want to take it... It's just hard to tell, before actually seeing the code. This is, of course, just my idea, with which you can well disagree. :-) Thanks and regards, Dario
On Tue, Oct 25, 2016 at 5:20 AM, Dario Faggioli <dario.faggioli@citrix.com> wrote: > On Fri, 2016-10-21 at 22:12 -0400, Meng Xu wrote: >> We should clear the __RTDS_depleted bit once a VCPU budget is >> replenished. >> Because repl_timer_handler may be called after rt_schedule >> but before rt_context_saved, the VCPU may be not on CPU or on queue >> when the VCPU is the middle of context switch >> > Yes, this makes sense, and is a good fix. > > I actually would prefer the subject to become (something like): > > "xen: rtds: always clear the flag when replenishing a depleted vcpu" > > whith that changed: > >> Signed-off-by: Meng Xu <mengxu@cis.upenn.edu> >> > Acked-by: Dario Faggioli <dario.faggioli@citrix.com> > > This being said, I hope you see the value of having split the patches > --especially patches like this-- in such a way that each one deals with > one specific issue. Yep. :-) > > It doesn't matter if they end up being very small, in terms of changed > lines. In fact, the behavioral issue being dealt with in here is rather > subtle, and most important "self-contained", i.e., independent from any > other issue that we may or may not have already found. > > Also, consider that, if it turns out that this patch is wrong, i.e., it > does not really fix the problem and/or introduce other ones, we will be > able to revert it. If the hunk below was part of a more complex patch, > that would have been both more difficult and less neat. > > Another example: > >> --- a/xen/common/sched_rt.c >> +++ b/xen/common/sched_rt.c >> @@ -1488,8 +1488,8 @@ static void repl_timer_handler(void *data){ >> if ( svc->cur_deadline > next_on_runq->cur_deadline ) >> runq_tickle(ops, next_on_runq); >> } >> - else if ( vcpu_on_q(svc) && >> - __test_and_clear_bit(__RTDS_depleted, &svc->flags) >> ) >> + else if ( __test_and_clear_bit(__RTDS_depleted, &svc->flags) >> && >> + vcpu_on_q(svc) ) >> runq_tickle(ops, svc); >> > The previous patch was dropping the 'else', the reason for which was > not really easy to see to me. I went looking at the commit message, but > it was not really clear itself, and actually contained the description > of two problems and two fixes! :-O > > So, again, for things like these, 1 issue ==> 1 patch. Yep... Splitting the patch is better and easier to explain as well. ;-) > > About the other patch, I think I understand the situation, but I don't > like the fix much. I'm looking into it and thinking what could be an > alternative solution. I'll let you know. OK. I guess you don't want to update the budget related variable in the update_deadline function. In addition, we may not want to fire a budget_enforcement timer when the timer is after the budget_replenish timer. (This can be solved by using the min(remaining_budget, cur_deadline - now) at the end of rt_schedule().) Let's discuss this issue in the other patch, instead of here. > > And finally, independently from these patches, I was thinking whether > it would not be worth dealing with the case budget==period as a special > case. I mean, if that is the case, there's no need to program the > timer, etc, we could just identify the situation and act accordingly > (e.g., in burn_budget()). Ah-ha. I thought about this in 2015 March. At that time, I want to disable the scheduler for this special case so that we can get rid of the scheduler overhead for the low-latency applications (and the HPC applications). The reference is at https://lists.xenproject.org/archives/html/xen-devel/2015-03/msg02854.html The conclusion was that we may make the code a little "dirty" by having this. Once the RTDS become event-driven, we can set the VCPU's period and budget to a very very very very large value. The scheduler won't kick in until a very large period, say 10mins. The overhead will be negligible small. > > This is just an idea, though, based on the fact that people may want to > set budget==period for some domain/vcpu (like the "special" ones, dom0 > or driver domains), and if there are a few of them, we avoid the > overhead of timers firing very very close to re-scheduling points, > etc. > On the other hand, special cases make the code uglier, and often harder > to follow. So I'm saying it may (in my opinion) be worth trying to > write a patch, but it's not guaranteed that we actually want to take > it... It's just hard to tell, before actually seeing the code. Yep. The other hand is the issue. How about setting a large period for the special case when period = budget? Maybe we should discuss this in a different thread? ;-) > > This is, of course, just my idea, with which you can well disagree. :-) My concern is still the maintenance of the code. Is it better to have a separate scheduler for the full-capacity VCPU case? People can choose to compile the separate scheduler and use it in a separate cpupool. Actually, if Xen can have the similar functionality like Jailhouse, that would be awesome for low-latency applications (and maybe for HPC as well?). This will also provide the solution for the full-capacity VCPU case as well. Thanks and Best Regards, Meng
On Tue, Oct 25, 2016 at 11:20:51AM +0200, Dario Faggioli wrote: > On Fri, 2016-10-21 at 22:12 -0400, Meng Xu wrote: > > We should clear the __RTDS_depleted bit once a VCPU budget is > > replenished. > > Because repl_timer_handler may be called after rt_schedule > > but before rt_context_saved, the VCPU may be not on CPU or on queue > > when the VCPU is the middle of context switch > > > Yes, this makes sense, and is a good fix. > > I actually would prefer the subject to become (something like): > > "xen: rtds: always clear the flag when replenishing a depleted vcpu" > > whith that changed: > > > Signed-off-by: Meng Xu <mengxu@cis.upenn.edu> > > > Acked-by: Dario Faggioli <dario.faggioli@citrix.com> > Fixed the subject line and applied for 4.8.
diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index ff17bd2..16b5691 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -1488,8 +1488,8 @@ static void repl_timer_handler(void *data){ if ( svc->cur_deadline > next_on_runq->cur_deadline ) runq_tickle(ops, next_on_runq); } - else if ( vcpu_on_q(svc) && - __test_and_clear_bit(__RTDS_depleted, &svc->flags) ) + else if ( __test_and_clear_bit(__RTDS_depleted, &svc->flags) && + vcpu_on_q(svc) ) runq_tickle(ops, svc); list_del(&svc->replq_elem);
We should clear the __RTDS_depleted bit once a VCPU budget is replenished. Because repl_timer_handler may be called after rt_schedule but before rt_context_saved, the VCPU may be not on CPU or on queue when the VCPU is the middle of context switch Signed-off-by: Meng Xu <mengxu@cis.upenn.edu> --- Cc: Dario Faggioli <dario.faggioli@citrix.com> Cc: George Dunlap <George.Dunlap@eu.citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Linh Thi Xuan Phan <linhphan@cis.upenn.edu> Cc: Haoran Li <lihaoran@wustl.edu> Cc: Meng Xu <xumengpanda@gmail.com> Cc: Dagaen Golomb <dgolomb@cis.upenn.edu> Cc: Tianyang Chen <tiche@cis.upenn.edu> --- xen/common/sched_rt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)