fs/netfs/read_collect: add to next->prev_donated

Message ID	20250210191118.3444416-1-max.kellermann@ionos.com (mailing list archive)
State	Handled Elsewhere, archived
Headers	show Received: from mail-ej1-f42.google.com (mail-ej1-f42.google.com [209.85.218.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3A80025E463 for <linux-nfs@vger.kernel.org>; Mon, 10 Feb 2025 19:11:23 +0000 (UTC) From: Max Kellermann <max.kellermann@ionos.com> To: dhowells@redhat.com, netfs@lists.linux.dev, linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org Cc: Max Kellermann <max.kellermann@ionos.com> Subject: [PATCH] fs/netfs/read_collect: add to next->prev_donated Date: Mon, 10 Feb 2025 20:11:18 +0100 Message-ID: <20250210191118.3444416-1-max.kellermann@ionos.com> In-Reply-To: <CAKPOu+_4mUwYgQtRTbXCmi+-k3PGvLysnPadkmHOyB7Gz0iSMA@mail.gmail.com> References: <CAKPOu+_4mUwYgQtRTbXCmi+-k3PGvLysnPadkmHOyB7Gz0iSMA@mail.gmail.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	fs/netfs/read_collect: add to next->prev_donated \| expand fs/netfs/read_collect: add to next->prev_donated

Max Kellermann Feb. 10, 2025, 7:11 p.m. UTC

If multiple subrequests donate data to the same "next" request
(depending on the subrequest completion order), each of them would
overwrite the `prev_donated` field, causing data corruption and a
BUG() crash ("Can't donate prior to front").

Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Closes: https://lore.kernel.org/netfs/CAKPOu+_4mUwYgQtRTbXCmi+-k3PGvLysnPadkmHOyB7Gz0iSMA@mail.gmail.com/
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
David, this seems to fix the bug for me.  Please also check if we need
a "donation_changed" check.
---
 fs/netfs/read_collect.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

David Howells Feb. 14, 2025, 12:47 p.m. UTC | #1

Max Kellermann <max.kellermann@ionos.com> wrote:

> David, this seems to fix the bug for me.  Please also check if we need
> a "donation_changed" check.

Shouldn't do since at that point we've been holding rreq->lock since the last
check.

> If multiple subrequests donate data to the same "next" request
> (depending on the subrequest completion order), each of them would
> overwrite the `prev_donated` field, causing data corruption and a
> BUG() crash ("Can't donate prior to front").
> 
> Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
> Closes: https://lore.kernel.org/netfs/CAKPOu+_4mUwYgQtRTbXCmi+-k3PGvLysnPadkmHOyB7Gz0iSMA@mail.gmail.com/
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>

Signed-off-by: David Howells <dhowells@redhat.com>

Max Kellermann Feb. 20, 2025, 1:09 p.m. UTC | #2

On Fri, Feb 14, 2025 at 1:47 PM David Howells <dhowells@redhat.com> wrote:
> Signed-off-by: David Howells <dhowells@redhat.com>

Greg, you merged my other two netfs fixes into 6.13.3, but omitted
this one. Did you miss it, or was there another reason?

Greg KH Feb. 20, 2025, 2:17 p.m. UTC | #3

On Thu, Feb 20, 2025 at 02:09:17PM +0100, Max Kellermann wrote:
> On Fri, Feb 14, 2025 at 1:47 PM David Howells <dhowells@redhat.com> wrote:
> > Signed-off-by: David Howells <dhowells@redhat.com>
> 
> Greg, you merged my other two netfs fixes into 6.13.3, but omitted
> this one. Did you miss it, or was there another reason?

It wasn't sent to me or to the stable list, so how could have I seen it?

confused,

greg k-h

Max Kellermann Feb. 20, 2025, 3 p.m. UTC | #4

On Thu, Feb 20, 2025 at 3:17 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> It wasn't sent to me or to the stable list, so how could have I seen it?

Oh, of course, I forgot to add stable. How shall we proceed? Do you
want me to resend to you with David's Signed-off-by?

Greg KH Feb. 20, 2025, 3:10 p.m. UTC | #5

On Thu, Feb 20, 2025 at 04:00:49PM +0100, Max Kellermann wrote:
> On Thu, Feb 20, 2025 at 3:17 PM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> > It wasn't sent to me or to the stable list, so how could have I seen it?
> 
> Oh, of course, I forgot to add stable. How shall we proceed? Do you
> want me to resend to you with David's Signed-off-by?

Please do, and cc: stable.

thanks,

greg k-h

Salvatore Bonaccorso March 1, 2025, 2:17 p.m. UTC | #6

Hi

(btw side question from a bystander looking a the bug)

On Thu, Feb 20, 2025 at 04:00:49PM +0100, Max Kellermann wrote:
> On Thu, Feb 20, 2025 at 3:17 PM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> > It wasn't sent to me or to the stable list, so how could have I seen it?
> 
> Oh, of course, I forgot to add stable. How shall we proceed? Do you
> want me to resend to you with David's Signed-off-by?

Do I see it correctly that this will be 6.12.y and 6.13.y specific
backports since the code in mainline changed substantially with
e2d46f2ec332 ("netfs: Change the read result collector to only use one
work item") and so your change does not apply there anymore?

I'm asking since we got a bug report in Debian which seems to idicate
it might have the same root cause as you report, and it is at
https://bugs.debian.org/1098698 .

Regards,
Salvatore

Max Kellermann March 1, 2025, 4:51 p.m. UTC | #7

On Sat, Mar 1, 2025 at 3:17 PM Salvatore Bonaccorso <carnil@debian.org> wrote:
> Do I see it correctly that this will be 6.12.y and 6.13.y specific
> backports since the code in mainline changed substantially with
> e2d46f2ec332 ("netfs: Change the read result collector to only use one
> work item") and so your change does not apply there anymore?

Correct.

> I'm asking since we got a bug report in Debian which seems to idicate
> it might have the same root cause as you report, and it is at
> https://bugs.debian.org/1098698 .

This indeed looks similar.

Note that this is one of four netfs crash fixes I posted recently.
Two other fixes have been released with 6.13.4 already:
https://lore.kernel.org/netfs/20250211093432.3524035-1-max.kellermann@ionos.com/
https://lore.kernel.org/netfs/20250210223144.3481766-1-max.kellermann@ionos.com/
The fourth bug (that got no reviews so far) is here (this one affects
master as well):
https://lore.kernel.org/netfs/20250228123602.2140459-1-max.kellermann@ionos.com/

Max

fs/netfs/read_collect: add to next->prev_donated

Commit Message

Comments

Patch