[mdadm,0/2] Bug fixes for --write-zeros option

Message ID 20240604163837.798219-1-logang@deltatee.com (mailing list archive)

Message

Logan Gunthorpe June 4, 2024, 4:38 p.m. UTC
Hi,

Xiao noticed that the write-zeros tests failed randomly, especially
with small disks. We tracked this down to an issue with signalfd which
coalesced SIGCHLD signals into one. This is fixed by checking the
status of all children after every SIGCHLD.

While we were at it, we noticed a potential race with SIGCHLD coming
in before the signal was blocked in wait_for_zero_forks() and fix this
by moving the blocking before the child creation.

Thanks,

Logan

--

Logan Gunthorpe (2):
  mdadm: Fix hang race condition in wait_for_zero_forks()
  mdadm: Block SIGCHLD processes before starting children

 Create.c | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)


base-commit: 46f19270265fe54cda1c728cb156b755273b4ab6
--
2.39.2

Comments

Mariusz Tkaczyk June 12, 2024, 10:15 a.m. UTC | #1
On Tue,  4 Jun 2024 10:38:35 -0600
Logan Gunthorpe <logang@deltatee.com> wrote:

> Hi,
> 
> Xiao noticed that the write-zeros tests failed randomly, especially
> with small disks. We tracked this down to an issue with signalfd which
> coalesced SIGCHLD signals into one. This is fixed by checking the
> status of all children after every SIGCHLD.
> 
> While we were at it, we noticed a potential race with SIGCHLD coming
> in before the signal was blocked in wait_for_zero_forks() and fix this
> by moving the blocking before the child creation.
> 
> Thanks,
> 
> Logan
> 
> --
Hello Logan,
Thanks for the fixes. LGTM.

I will fix the typos when merging, so there is no need to send a v2.
I have a proxy issue that I have to solve first.

Thanks,
Mariusz
Mariusz Tkaczyk June 13, 2024, 1:24 p.m. UTC | #2
On Tue,  4 Jun 2024 10:38:35 -0600
Logan Gunthorpe <logang@deltatee.com> wrote:

> Hi,
> 
> Xiao noticed that the write-zeros tests failed randomly, especially
> with small disks. We tracked this down to an issue with signalfd which
> coalesced SIGCHLD signals into one. This is fixed by checking the
> status of all children after every SIGCHLD.
> 
> While we were at it, we noticed a potential race with SIGCHLD coming
> in before the signal was blocked in wait_for_zero_forks() and fix this
> by moving the blocking before the child creation.
> 
> Thanks,
> 
> Logan
> 
> --

Applied! 

Thanks,
Mariusz