md/raid5: release batch_last before waiting for another stripe_head

Message ID 20231002183422.13047-1-djeffery@redhat.com (mailing list archive)
State Accepted, archived

Commit Message

David Jeffery Oct. 2, 2023, 6:32 p.m. UTC
When raid5_get_active_stripe is called with a ctx whose batch_last pointer holds a
stripe_head, the task can deadlock if it sleeps waiting for another stripe_head to
become available. The stripe_head held in batch_last can itself block other
stripe_heads from advancing, so no stripe_heads are ever released and
raid5_get_active_stripe waits forever.

As with the quiesce-state handling earlier in the function, raid5_get_active_stripe
must release batch_last before it waits for another stripe_head.


Fixes: 3312e6c887fe ("md/raid5: Keep a reference to last stripe_head for batch")
Signed-off-by: David Jeffery <djeffery@redhat.com>

---
 drivers/md/raid5.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

John Pittman Oct. 2, 2023, 7:21 p.m. UTC | #1
Thanks a lot David.  Song, as a note, David's patch was tested by a
Red Hat customer and it resolved the deadlock they were hitting.
cc. Laurence Oberman, who assisted on that case.


On Mon, Oct 2, 2023 at 2:39 PM David Jeffery <djeffery@redhat.com> wrote:
> [...]
Song Liu Oct. 3, 2023, 6:48 a.m. UTC | #2
CC Logan.

On Mon, Oct 2, 2023 at 12:22 PM John Pittman <jpittman@redhat.com> wrote:
>
> Thanks a lot David.  Song, as a note, David's patch was tested by a
> Red Hat customer and it resolved the deadlock they were hitting.
> cc. Laurence Oberman, who assisted on that case.
>
>
> On Mon, Oct 2, 2023 at 2:39 PM David Jeffery <djeffery@redhat.com> wrote:
> > [...]

Thanks for the fix! I applied it to md-fixes.

Question: How easy/difficult is it to reproduce this issue?

Song

Logan Gunthorpe Oct. 3, 2023, 3:47 p.m. UTC | #3
On 2023-10-03 00:48, Song Liu wrote:
> CC Logan.
> [...]

This makes sense to me. Nice catch on the difficult bug.

Reviewed-by: Logan Gunthorpe <logang@deltatee.com>

Logan
David Jeffery Oct. 3, 2023, 6:50 p.m. UTC | #4
On Tue, Oct 3, 2023 at 2:48 AM Song Liu <song@kernel.org> wrote:
>
> Thanks for the fix! I applied it to md-fixes.
>
> Question: How easy/difficult is it to reproduce this issue?
>

Hello,

One customer system could trigger it reliably and confirmed the fix,
but I haven't had any success recreating it with synthetic workloads
on a test system.

David Jeffery
Song Liu Oct. 3, 2023, 8:14 p.m. UTC | #5
Hi David,

On Tue, Oct 3, 2023 at 11:50 AM David Jeffery <djeffery@redhat.com> wrote:
>
> On Tue, Oct 3, 2023 at 2:48 AM Song Liu <song@kernel.org> wrote:
> >
> > Thanks for the fix! I applied it to md-fixes.
> >
> > Question: How easy/difficult is it to reproduce this issue?
> >
>
> Hello,
>
> One customer system could trigger it reliably and confirmed the fix,
> but I haven't had any success recreating it with synthetic workloads
> on a test system.

Thanks for the information!

Song

Patch

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 6383723468e5..0644b83fd3f4 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -854,6 +854,13 @@ struct stripe_head *raid5_get_active_stripe(struct r5conf *conf,
 
 		set_bit(R5_INACTIVE_BLOCKED, &conf->cache_state);
 		r5l_wake_reclaim(conf->log, 0);
+
+		/* release batch_last before wait to avoid risk of deadlock */
+		if (ctx && ctx->batch_last) {
+			raid5_release_stripe(ctx->batch_last);
+			ctx->batch_last = NULL;
+		}
+
 		wait_event_lock_irq(conf->wait_for_stripe,
 				    is_inactive_blocked(conf, hash),
 				    *(conf->hash_locks + hash));