diff mbox series

[v2,10/10] libxl: event: Move poller pipe emptying to the end of afterpoll

Message ID 20200113170843.21332-11-ian.jackson@eu.citrix.com (mailing list archive)
State Superseded
Headers show
Series libxl: event: Fix hang for some applications | expand

Commit Message

Ian Jackson Jan. 13, 2020, 5:08 p.m. UTC
If a timer event callback causes this poller to be woken (not very
unlikely) we would go round the poll loop twice rather than once.

Do the poller pipe emptying at the end; this is slightly more
efficient because it can't cause any callbacks, so it happens after
all the callbacks have been run.

(This pipe-emptying has to happen in afterpoll rather than the
apparently more logical beforepoll, because the application calling
beforepoll doesn't constitute a promise to actually do anything.)

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxl_event.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Comments

George Dunlap Jan. 17, 2020, 12:32 p.m. UTC | #1
On 1/13/20 5:08 PM, Ian Jackson wrote:
> If a timer event callback causes this poller to be woken (not very
> unlikely) we would go round the poll loop twice rather than once.
> 
> Do the poller pipe emptying at the end; this is slightly more
> efficient because it can't cause any callbacks, so it happens after
> all the callbacks have been run.
> 
> (This pipe-emptying has to happen in afterpoll rather than the
> apparently more logical beforepoll, because the application calling
> beforepoll doesn't constitute a promise to actually do anything.)
> 
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>

I can't quite figure out: why would you end up going around the loop
twice, and how does this fix it?

 -George
Ian Jackson Jan. 17, 2020, 2:24 p.m. UTC | #2
George Dunlap writes ("Re: [PATCH v2 10/10] libxl: event: Move poller pipe emptying to the end of afterpoll"):
> On 1/13/20 5:08 PM, Ian Jackson wrote:
> > If a timer event callback causes this poller to be woken (not very
> > unlikely) we would go round the poll loop twice rather than once.
> > 
> > Do the poller pipe emptying at the end; this is slightly more
> > efficient because it can't cause any callbacks, so it happens after
> > all the callbacks have been run.
> > 
> > (This pipe-emptying has to happen in afterpoll rather than the
> > apparently more logical beforepoll, because the application calling
> > beforepoll doesn't constitute a promise to actually do anything.)
> > 
> > Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
> 
> I can't quite figure out: why would you end up going around the loop
> twice, and how does this fix it?

I now think this is not true and the situation I describe cannot
happen.

What I was thinking was that pollers_note_osevent_added might be
called by something from time_occurs, and that would write a byte into
the poller pipe.  But pollers_note_osevent_added doesn't wake up
pollers; it just tags them osevents_added.

I now think the spurious wakeup cannot happen because:

For this patch to make any difference, the poller pipe would have to
be woken up by something in the time scan loop in afterpoll_internal.

But poller pipes are only woken up by ao completion or by
cleanup_1_baton.

cleanup_1_baton is not called anywhere there (as an argument against:
any such call would violate the rule that cleanup_1_baton may not be
called with a poller in hand).

And as for ao completion, we would indeed wake up the poller.  But we
also mark the ao as complete, so ao_inprogress would spot
!ao_work_outstanding, and not reenter eventloop_iteration at all.
The woken-up poller would be put by ao_inprogress.

This leads me to this observation: poller_get might give you a
woken-up poller.  This is not incorrect, but it is pointless.  So
maybe I should write a patch that puts a call to
libxl__self_pipe_eatall in libxl__poller_get.

TBH I still think this patch tidies the code up a bit.

Ian.
Ian Jackson Jan. 17, 2020, 2:33 p.m. UTC | #3
Ian Jackson writes ("Re: [PATCH v2 10/10] libxl: event: Move poller pipe emptying to the end of afterpoll"):
> TBH I still think this patch tidies the code up a bit.

Given you tested it with this change, and I think it makes it a bit
tidier and no less correct, I would like to keep it.

I rewrote the commit message - see below.

Ian.

libxl: event: Move poller pipe emptying to the end of afterpoll

This seems neater.  It doesn't have any significant effect because:

The poller fd wouldn't be emptied by time_occurs.  It would only be
woken by time_occurs as a result of an ao completing, or by
libxl__egc_ao_cleanup_1_baton.  But ...1_baton won't be called in
between (for one thing, this would violate the rule of not still
having the active caller when ...1_baton is called).

While discussing this patch, I noticed that there is a possibility (in
libxl in general) that poller_put might be called on a woken poller.
It would probably be sensible at some point to make poller_get empty
the pipe, at least if the pipe_nonempty flag is set.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Tested-by: George Dunlap <george.dunlap@citrix.com>
---
v2: Completely revised commit message; now we think this is just
    cleanup.
George Dunlap Jan. 17, 2020, 2:34 p.m. UTC | #4
On 1/17/20 2:24 PM, Ian Jackson wrote:
> George Dunlap writes ("Re: [PATCH v2 10/10] libxl: event: Move poller pipe emptying to the end of afterpoll"):
>> On 1/13/20 5:08 PM, Ian Jackson wrote:
>>> If a timer event callback causes this poller to be woken (not very
>>> unlikely) we would go round the poll loop twice rather than once.
>>>
>>> Do the poller pipe emptying at the end; this is slightly more
>>> efficient because it can't cause any callbacks, so it happens after
>>> all the callbacks have been run.
>>>
>>> (This pipe-emptying has to happen in afterpoll rather than the
>>> apparently more logical beforepoll, because the application calling
>>> beforepoll doesn't constitute a promise to actually do anything.)
>>>
>>> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
>>
>> I can't quite figure out: why would you end up going around the loop
>> twice, and how does this fix it?
> 
> I now think this is not true and the situation I describe cannot
> happen.
> 
> What I was thinking was that pollers_note_osevent_added might be
> called by something from time_occurs, and that would write a byte into
> the poller pipe.  But pollers_note_osevent_added doesn't wake up
> pollers; it just tags them osevents_added.
> 
> I now think the spurious wakeup cannot happen because:
> 
> For this patch to make any difference, the poller pipe would have to
> be woken up by something in the time scan loop in afterpoll_internal.
> 
> But poller pipes are only woken up by ao completion or by
> cleanup_1_baton.
> 
> cleanup_1_baton is not called anywhere there (as an argument against:
> any such call would violate the rule that cleanup_1_baton may not be
> called with a poller in hand).
> 
> And as for ao completion, we would indeed wake up the poller.  But we
> also mark the ao as complete, so ao_inprogress would spot
> !ao_work_outstanding, and not reenter eventloop_iteration at all.
> The woken-up poller would be put by ao_inprogress.
> 
> This leads me to this observation: poller_get might give you a
> woken-up poller.  This is not incorrect, but it is pointless.  So
> maybe I should write a patch that puts a call to
> libxl__self_pipe_eatall in libxl__poller_get.
> 
> TBH I still think this patch tidies the code up a bit.

No objection to it on those grounds. :-)

Thanks for the explanation,
 -George
George Dunlap Jan. 17, 2020, 2:39 p.m. UTC | #5
On 1/17/20 2:33 PM, Ian Jackson wrote:
> Ian Jackson writes ("Re: [PATCH v2 10/10] libxl: event: Move poller pipe emptying to the end of afterpoll"):
>> TBH I still think this patch tidies the code up a bit.
> 
> Given you tested it with this change, and I think it makes it a bit
> tidier and no less correct, I would like to keep it.
> 
> I rewrote the commit message - see below.
> 
> Ian.
> 
> libxl: event: Move poller pipe emptying to the end of afterpoll
> 
> This seems neater.  It doesn't have any significant effect because:
> 
> The poller fd wouldn't be emptied by time_occurs.  It would only be
> woken by time_occurs as a result of an ao completing, or by
> libxl__egc_ao_cleanup_1_baton.  But ...1_baton won't be called in
> between (for one thing, this would violate the rule of not still
> having the active caller when ...1_baton is called).
> 
> While discussing this patch, I noticed that there is a possibility (in
> libxl in general) that poller_put might be called on a woken poller.
> It would probably be sensible at some point to make poller_get empty
> the pipe, at least if the pipe_nonempty flag is set.
> 
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
> Tested-by: George Dunlap <george.dunlap@citrix.com>

With the new commit message:

Reviewed-by: George Dunlap <george.dunlap@citrix.com>
diff mbox series

Patch

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 5f6a607d80..7c5387e94f 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1453,12 +1453,6 @@  static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller,
         fd_occurs(egc, efd, revents);
     }
 
-    if (afterpoll_check_fd(poller,fds,nfds, poller->wakeup_pipe[0],POLLIN)) {
-        poller->pipe_nonempty = 0;
-        int e = libxl__self_pipe_eatall(poller->wakeup_pipe[0]);
-        if (e) LIBXL__EVENT_DISASTER(gc, "read wakeup", e, 0);
-    }
-
     for (;;) {
         libxl__ev_time *etime = LIBXL_TAILQ_FIRST(&CTX->etimes);
         if (!etime)
@@ -1473,6 +1467,12 @@  static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller,
 
         time_occurs(egc, etime, ERROR_TIMEDOUT);
     }
+
+    if (afterpoll_check_fd(poller,fds,nfds, poller->wakeup_pipe[0],POLLIN)) {
+        poller->pipe_nonempty = 0;
+        int e = libxl__self_pipe_eatall(poller->wakeup_pipe[0]);
+        if (e) LIBXL__EVENT_DISASTER(gc, "read wakeup", e, 0);
+    }
 }
 
 void libxl_osevent_afterpoll(libxl_ctx *ctx, int nfds, const struct pollfd *fds,