Message ID | 20200113170843.21332-11-ian.jackson@eu.citrix.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | libxl: event: Fix hang for some applications | expand |
On 1/13/20 5:08 PM, Ian Jackson wrote: > If a timer event callback causes this poller to be woken (not very > unlikely) we would go round the poll loop twice rather than once. > > Do the poller pipe emptying at the end; this is slightly more > efficient because it can't cause any callbacks, so it happens after > all the callbacks have been run. > > (This pipe-emptying has to happen in afterpoll rather than the > apparently more logical beforepoll, because the application calling > beforepoll doesn't constitute a promise to actually do anything.) > > Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> I can't quite figure out: why would you end up going around the loop twice, and how does this fix it? -George
George Dunlap writes ("Re: [PATCH v2 10/10] libxl: event: Move poller pipe emptying to the end of afterpoll"): > On 1/13/20 5:08 PM, Ian Jackson wrote: > > If a timer event callback causes this poller to be woken (not very > > unlikely) we would go round the poll loop twice rather than once. > > > > Do the poller pipe emptying at the end; this is slightly more > > efficient because it can't cause any callbacks, so it happens after > > all the callbacks have been run. > > > > (This pipe-emptying has to happen in afterpoll rather than the > > apparently more logical beforepoll, because the application calling > > beforepoll doesn't constitute a promise to actually do anything.) > > > > Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> > > I can't quite figure out: why would you end up going around the loop > twice, and how does this fix it? I now think this is not true and the situation I describe cannot happen. What I was thinking was that pollers_note_osevent_added might be called by something from time_occurs, and that would write a byte into the poller pipe. But pollers_note_osevent_added doesn't wake up pollers; it just tags them osevents_added. I now think the spurious wakeup cannot happen because: For this patch to make any difference, the poller pipe would have to be woken up by something in the time scan loop in afterpoll_internal. But poller pipes are only woken up by ao completion or by cleanup_1_baton. cleanup_1_baton is not called anywhere there (as an argument against: any such call would violate the rule that cleanup_1_baton may not be called with a poller in hand). And as for ao completion, we would indeed wake up the poller. But we also mark the ao as complete, so ao_inprogress would spot !ao_work_outstanding, and not reenter eventloop_iteration at all. The woken-up poller would be put by ao_inprogress. This leads me to this observation: poller_get might give you a woken-up poller. This is not incorrect, but it is pointless. So maybe I should write a patch that puts a call to libxl__self_pipe_eatall in libxl__poller_get. TBH I still think this patch tidies the code up a bit. Ian.
Ian Jackson writes ("Re: [PATCH v2 10/10] libxl: event: Move poller pipe emptying to the end of afterpoll"): > TBH I still think this patch tidies the code up a bit. Given you tested it with this change, and I think it makes it a bit tidier and no less correct, I would like to keep it. I rewrote the commit message - see below. Ian. libxl: event: Move poller pipe emptying to the end of afterpoll This seems neater. It doesn't have any significant effect because: The poller fd wouldn't be emptied by time_occurs. It would only be woken by time_occurs as a result of an ao completing, or by libxl__egc_ao_cleanup_1_baton. But ...1_baton won't be called in between (for one thing, this would violate the rule of not still having the active caller when ...1_baton is called). While discussing this patch, I noticed that there is a possibility (in libxl in general) that poller_put might be called on a woken poller. It would probably be sensible at some point to make poller_get empty the pipe, at least if the pipe_nonempty flag is set. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Tested-by: George Dunlap <george.dunlap@citrix.com> --- v2: Completely revised commit message; now we think this is just cleanup.
On 1/17/20 2:24 PM, Ian Jackson wrote: > George Dunlap writes ("Re: [PATCH v2 10/10] libxl: event: Move poller pipe emptying to the end of afterpoll"): >> On 1/13/20 5:08 PM, Ian Jackson wrote: >>> If a timer event callback causes this poller to be woken (not very >>> unlikely) we would go round the poll loop twice rather than once. >>> >>> Do the poller pipe emptying at the end; this is slightly more >>> efficient because it can't cause any callbacks, so it happens after >>> all the callbacks have been run. >>> >>> (This pipe-emptying has to happen in afterpoll rather than the >>> apparently more logical beforepoll, because the application calling >>> beforepoll doesn't constitute a promise to actually do anything.) >>> >>> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> >> >> I can't quite figure out: why would you end up going around the loop >> twice, and how does this fix it? > > I now think this is not true and the situation I describe cannot > happen. > > What I was thinking was that pollers_note_osevent_added might be > called by something from time_occurs, and that would write a byte into > the poller pipe. But pollers_note_osevent_added doesn't wake up > pollers; it just tags them osevents_added. > > I now think the spurious wakeup cannot happen because: > > For this patch to make any difference, the poller pipe would have to > be woken up by something in the time scan loop in afterpoll_internal. > > But poller pipes are only woken up by ao completion or by > cleanup_1_baton. > > cleanup_1_baton is not called anywhere there (as an argument against: > any such call would violate the rule that cleanup_1_baton may not be > called with a poller in hand). > > And as for ao completion, we would indeed wake up the poller. But we > also mark the ao as complete, so ao_inprogress would spot > !ao_work_outstanding, and not reenter eventloop_iteration at all. > The woken-up poller would be put by ao_inprogress. > > This leads me to this observation: poller_get might give you a > woken-up poller. This is not incorrect, but it is pointless. So > maybe I should write a patch that puts a call to > libxl__self_pipe_eatall in libxl__poller_get. > > TBH I still think this patch tidies the code up a bit. No objection to it on those grounds. :-) Thanks for the explanation, -George
On 1/17/20 2:33 PM, Ian Jackson wrote: > Ian Jackson writes ("Re: [PATCH v2 10/10] libxl: event: Move poller pipe emptying to the end of afterpoll"): >> TBH I still think this patch tidies the code up a bit. > > Given you tested it with this change, and I think it makes it a bit > tidier and no less correct, I would like to keep it. > > I rewrote the commit message - see below. > > Ian. > > libxl: event: Move poller pipe emptying to the end of afterpoll > > This seems neater. It doesn't have any significant effect because: > > The poller fd wouldn't be emptied by time_occurs. It would only be > woken by time_occurs as a result of an ao completing, or by > libxl__egc_ao_cleanup_1_baton. But ...1_baton won't be called in > between (for one thing, this would violate the rule of not still > having the active caller when ...1_baton is called). > > While discussing this patch, I noticed that there is a possibility (in > libxl in general) that poller_put might be called on a woken poller. > It would probably be sensible at some point to make poller_get empty > the pipe, at least if the pipe_nonempty flag is set. > > Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> > Tested-by: George Dunlap <george.dunlap@citrix.com> With the new commit message: Reviewed-by: George Dunlap <george.dunlap@citrix.com>
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index 5f6a607d80..7c5387e94f 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -1453,12 +1453,6 @@ static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller, fd_occurs(egc, efd, revents); } - if (afterpoll_check_fd(poller,fds,nfds, poller->wakeup_pipe[0],POLLIN)) { - poller->pipe_nonempty = 0; - int e = libxl__self_pipe_eatall(poller->wakeup_pipe[0]); - if (e) LIBXL__EVENT_DISASTER(gc, "read wakeup", e, 0); - } - for (;;) { libxl__ev_time *etime = LIBXL_TAILQ_FIRST(&CTX->etimes); if (!etime) @@ -1473,6 +1467,12 @@ static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller, time_occurs(egc, etime, ERROR_TIMEDOUT); } + + if (afterpoll_check_fd(poller,fds,nfds, poller->wakeup_pipe[0],POLLIN)) { + poller->pipe_nonempty = 0; + int e = libxl__self_pipe_eatall(poller->wakeup_pipe[0]); + if (e) LIBXL__EVENT_DISASTER(gc, "read wakeup", e, 0); + } } void libxl_osevent_afterpoll(libxl_ctx *ctx, int nfds, const struct pollfd *fds,
If a timer event callback causes this poller to be woken (not very unlikely) we would go round the poll loop twice rather than once. Do the poller pipe emptying at the end; this is slightly more efficient because it can't cause any callbacks, so it happens after all the callbacks have been run. (This pipe-emptying has to happen in afterpoll rather than the apparently more logical beforepoll, because the application calling beforepoll doesn't constitute a promise to actually do anything.) Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> --- tools/libxl/libxl_event.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)