mbox series

[0/2] alternate approach to fixing fsmonitor hangs

Message ID 20241008083121.GA676391@coredump.intra.peff.net (mailing list archive)
Headers show
Series alternate approach to fixing fsmonitor hangs | expand

Message

Jeff King Oct. 8, 2024, 8:31 a.m. UTC
On Mon, Oct 07, 2024 at 06:45:22PM +0900, Koji Nakamaru wrote:

> > -     pthread_mutex_lock(&server_data->work_available_mutex);
> > +     /* If we haven't started yet, we are already holding lock. */
> > +     if (!server_data->started)
> > +             pthread_mutex_lock(&server_data->work_available_mutex);
> >
> >       server_data->shutdown_requested = 1;
> 
> Is this condition inverted?

Yes, good catch.

> > I had also previously checked my suggested solution. So I do think
> > either is a valid solution to the problem.
> 
> I also tested your approach on Windows with a few additions to
> ipc-win32.c, and it worked correctly.

Yeah, I never even tried building mine on Windows, and was just testing
via tmate on our macOS CI environment. :-/

I've added in the necessary Windows bits, along with smoothing a few
rough edges. Especially with the extra Windows changes (which I mostly
had to guess-and-check by pushing to CI), I'm beginning to wonder if my
solution isn't getting a bit too complicated, and maybe yours was the
right way after all.

But I've cleaned it up for presentation here, so at least we can look at
the final form of both and see which we prefer.

  [1/2]: simple-ipc: split async server initialization and running
  [2/2]: fsmonitor: initialize fs event listener before accepting clients

 builtin/fsmonitor--daemon.c          |  6 ++--
 compat/fsmonitor/fsm-listen-darwin.c |  6 ++++
 compat/fsmonitor/fsm-listen-win32.c  |  6 ++++
 compat/simple-ipc/ipc-shared.c       |  5 +--
 compat/simple-ipc/ipc-unix-socket.c  | 28 +++++++++++++---
 compat/simple-ipc/ipc-win32.c        | 48 +++++++++++++++++++++++++---
 simple-ipc.h                         | 17 +++++++---
 7 files changed, 98 insertions(+), 18 deletions(-)

-Peff

Comments

Koji Nakamaru Oct. 8, 2024, 4:03 p.m. UTC | #1
On Tue, Oct 8, 2024 at 5:31 PM Jeff King <peff@peff.net> wrote:
> I've added in the necessary Windows bits, along with smoothing a few
> rough edges. Especially with the extra Windows changes (which I mostly
> had to guess-and-check by pushing to CI), I'm beginning to wonder if my
> solution isn't getting a bit too complicated, and maybe yours was the
> right way after all.
>
> But I've cleaned it up for presentation here, so at least we can look at
> the final form of both and see which we prefer.

Thank you for the new patch. It prevents to start accepting requests
until starting fs event listening and simplifies the code flow. It also
has sufficient comments, so later everyone can easily understand how it
works. I also tested it both on mac and windows and it works correctly.

I think this one should be adopted :)

Koji Nakamaru
Jeff King Oct. 11, 2024, 9 a.m. UTC | #2
On Wed, Oct 09, 2024 at 01:03:14AM +0900, Koji Nakamaru wrote:

> > But I've cleaned it up for presentation here, so at least we can look at
> > the final form of both and see which we prefer.
> 
> Thank you for the new patch. It prevents to start accepting requests
> until starting fs event listening and simplifies the code flow. It also
> has sufficient comments, so later everyone can easily understand how it
> works. I also tested it both on mac and windows and it works correctly.
> 
> I think this one should be adopted :)

Thanks for reviewing, and for all your work identifying the problem in
the first place! Looks like Junio has picked up my patch and it's
already in 'next', so hopefully these 6-hour CI timeouts will soon be a
thing of the past. :)

-Peff
Junio C Hamano Oct. 11, 2024, 6:01 p.m. UTC | #3
Jeff King <peff@peff.net> writes:

> On Wed, Oct 09, 2024 at 01:03:14AM +0900, Koji Nakamaru wrote:
>
>> > But I've cleaned it up for presentation here, so at least we can look at
>> > the final form of both and see which we prefer.
>> 
>> Thank you for the new patch. It prevents to start accepting requests
>> until starting fs event listening and simplifies the code flow. It also
>> has sufficient comments, so later everyone can easily understand how it
>> works. I also tested it both on mac and windows and it works correctly.
>> 
>> I think this one should be adopted :)
>
> Thanks for reviewing, and for all your work identifying the problem in
> the first place! Looks like Junio has picked up my patch and it's
> already in 'next', so hopefully these 6-hour CI timeouts will soon be a
> thing of the past. :)

Yup.  Thanks, both of you.