netfs: Fix ceph copy to cache on write-begin

Message ID 2117977.1733750054@warthog.procyon.org.uk (mailing list archive)
State New
Series netfs: Fix ceph copy to cache on write-begin

Commit Message

David Howells Dec. 9, 2024, 1:14 p.m. UTC
Hi Max,

Could you try this?

David
---
netfs: Fix ceph copy to cache on write-begin

At the end of netfs_unlock_read_folio(), in which folios are marked
appropriately for copying to the cache (either by being marked dirty
and having their private data set or by having PG_private_2 set) and then
unlocked, the folio_queue struct has the entry pointing to the folio
cleared.  This presents a problem for netfs_pgpriv2_write_to_the_cache(),
which is used to write folios marked with PG_private_2 to the cache, as it
expects to be able to trawl the folio_queue list thereafter to find the
relevant folios, leading to a hang.

Fix this by not clearing the folio_queue entry if we're going to do the
deprecated copy-to-cache.  The clearance will be done instead as the folios
are written to the cache.

This can be reproduced by starting cachefiles, mounting a ceph filesystem
with "-o fsc" and writing to it.

Reported-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: netfs@lists.linux.dev
cc: ceph-devel@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/read_collect.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
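
The ordering problem described in the commit message can be illustrated with
a small userspace model.  This is only a sketch under stated assumptions:
toy_folio, toy_queue and every toy_*() function are hypothetical stand-ins
invented for illustration, not the kernel's folio_queue API, and the model
ignores locking, refcounting and the request state machine.  It shows that a
copy pass which walks the queue finds nothing if the read side has already
cleared every slot, and that leaving the slot in place for marked folios lets
the copy pass find them and clear the slot itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 4

/* Hypothetical stand-ins; not struct folio or struct folio_queue. */
struct toy_folio {
	bool copy_pending;	/* plays the role of PG_private_2 */
};

struct toy_queue {
	struct toy_folio *slot[TOY_SLOTS];
};

/*
 * Read-completion side.  With clear_always set, the slot is always
 * cleared (the behaviour before the patch).  Otherwise the slot is kept
 * when the folio has been marked for the deprecated copy-to-cache.
 */
void toy_unlock_read_folio(struct toy_queue *q, size_t i,
			   bool copy_to_cache, bool clear_always)
{
	assert(i < TOY_SLOTS && q->slot[i]);
	if (copy_to_cache)
		q->slot[i]->copy_pending = true;
	if (clear_always || !copy_to_cache)
		q->slot[i] = NULL;
}

/*
 * Copy side: walk the queue, "write" each folio that is still marked,
 * drop the mark and only then clear the slot.  Returns the number of
 * folios copied.
 */
size_t toy_write_to_cache(struct toy_queue *q)
{
	size_t copied = 0;

	for (size_t i = 0; i < TOY_SLOTS; i++) {
		struct toy_folio *f = q->slot[i];

		if (!f || !f->copy_pending)
			continue;
		f->copy_pending = false;
		q->slot[i] = NULL;
		copied++;
	}
	return copied;
}

/* Count folios that are still marked after the copy pass has run. */
size_t toy_count_stuck(const struct toy_folio *folios, size_t n)
{
	size_t stuck = 0;

	for (size_t i = 0; i < n; i++)
		if (folios[i].copy_pending)
			stuck++;
	return stuck;
}
```

In this model, unlocking every folio with clear_always set and then calling
toy_write_to_cache() copies nothing and leaves every folio marked, whereas
with clear_always unset the marked folios are found, copied and unmarked.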

Comments

Max Kellermann Dec. 9, 2024, 1:31 p.m. UTC | #1
On Mon, Dec 9, 2024 at 2:14 PM David Howells <dhowells@redhat.com> wrote:
> Could you try this?

No change, still hangs immediately on the first try.

David Howells Dec. 12, 2024, 4 p.m. UTC | #2
How about if you add the attached?

For convenience, I've put the outstanding fix patches I have here:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-fixes

David
---
commit d0bc2ecca996105f55da22e8867905ca1dad7c8f
Author: David Howells <dhowells@redhat.com>
Date:   Thu Dec 12 15:26:24 2024 +0000

    netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled
    
    When the caching for a cookie is temporarily disabled (e.g. due to a DIO
    write on that file), future copying to the cache for that file is disabled
    until all fds open on that file are closed.  However, if netfslib is using
    the deprecated PG_private_2 method (such as is currently used by ceph), and
    decides it wants to copy to the cache, netfs_advance_write() will just bail
    at the first check seeing that the cache stream is unavailable, and
    indicate that it dealt with all the content.
    
    This means that we have no subrequests to provide notifications to drive
    the state machine or even to pin the request and the request just gets
    discarded, leaving the folios with PG_private_2 set.
    
    Fix this by jumping directly to cancel the request if the cache is not
    available.  That way, we don't remove mark3 from the folio_queue list and
    netfs_pgpriv2_cancel() will clean up the folios.
    
    This was found by running the generic/013 xfstest against ceph with an active
    cache and the "-o fsc" option passed to ceph.  That would usually hang.
    
    Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
    Reported-by: Max Kellermann <max.kellermann@ionos.com>
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Jeff Layton <jlayton@kernel.org>
    cc: Ilya Dryomov <idryomov@gmail.com>
    cc: Xiubo Li <xiubli@redhat.com>
    cc: netfs@lists.linux.dev
    cc: ceph-devel@vger.kernel.org
    cc: linux-fsdevel@vger.kernel.org

diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c
index ba5af89d37fa..54d5004fec18 100644
--- a/fs/netfs/read_pgpriv2.c
+++ b/fs/netfs/read_pgpriv2.c
@@ -170,6 +170,10 @@ void netfs_pgpriv2_write_to_the_cache(struct netfs_io_request *rreq)
 
 	trace_netfs_write(wreq, netfs_write_trace_copy_to_cache);
 	netfs_stat(&netfs_n_wh_copy_to_cache);
+	if (!wreq->io_streams[1].avail) {
+		netfs_put_request(wreq, false, netfs_rreq_trace_put_return);
+		goto couldnt_start;
+	}
 
 	for (;;) {
 		error = netfs_pgpriv2_copy_folio(wreq, folio);
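
The failure mode in the commit message above (no subrequests issued, so
nothing ever drops the marks) can be sketched as a userspace toy.  Everything
here is an assumption made for illustration: toy_wreq and the toy_*()
functions are hypothetical and do not correspond to kernel signatures, and
"completion" is reduced to a single check on whether any subrequest was
issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in; not struct netfs_io_request. */
struct toy_wreq {
	bool cache_avail;	/* plays the role of io_streams[1].avail */
	size_t nr_subreqs;	/* subrequests issued to the cache */
	bool cancelled;
};

/* Early bail: with no cache stream, issue nothing but report len as handled. */
size_t toy_advance_write(struct toy_wreq *wreq, size_t len)
{
	if (wreq->cache_avail)
		wreq->nr_subreqs++;
	return len;
}

/* Cancel path: drop the marks directly instead of waiting for completions. */
void toy_cancel(struct toy_wreq *wreq, bool *marks, size_t n)
{
	assert(marks || n == 0);
	for (size_t i = 0; i < n; i++)
		marks[i] = false;
	wreq->cancelled = true;
}

/*
 * Copy marked folios to the cache.  With check_avail_first set, an
 * unavailable cache leads straight to the cancel path, as in the patch.
 * Returns the number of folios still marked afterwards.
 */
size_t toy_copy_to_cache(struct toy_wreq *wreq, bool *marks, size_t n,
			 bool check_avail_first)
{
	size_t left = 0;

	if (check_avail_first && !wreq->cache_avail) {
		toy_cancel(wreq, marks, n);
		return 0;
	}

	for (size_t i = 0; i < n; i++)
		if (marks[i])
			(void)toy_advance_write(wreq, 4096);

	/* In this model only a completing subrequest drops the marks. */
	for (size_t i = 0; i < n; i++) {
		if (wreq->nr_subreqs > 0)
			marks[i] = false;
		if (marks[i])
			left++;
	}
	return left;
}
```

With cache_avail false and check_avail_first unset, no subrequest is issued
and every mark is left in place; with check_avail_first set, the request is
cancelled up front and the marks are dropped.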

Patch

diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
index 849f40f64443..72a16222b63b 100644
--- a/fs/netfs/read_collect.c
+++ b/fs/netfs/read_collect.c
@@ -62,10 +62,14 @@  static void netfs_unlock_read_folio(struct netfs_io_subrequest *subreq,
 		} else {
 			trace_netfs_folio(folio, netfs_folio_trace_read_done);
 		}
+
+		folioq_clear(folioq, slot);
 	} else {
 		// TODO: Use of PG_private_2 is deprecated.
 		if (test_bit(NETFS_SREQ_COPY_TO_CACHE, &subreq->flags))
 			netfs_pgpriv2_mark_copy_to_cache(subreq, rreq, folioq, slot);
+		else
+			folioq_clear(folioq, slot);
 	}
 
 	if (!test_bit(NETFS_RREQ_DONT_UNLOCK_FOLIOS, &rreq->flags)) {
@@ -77,8 +81,6 @@  static void netfs_unlock_read_folio(struct netfs_io_subrequest *subreq,
 			folio_unlock(folio);
 		}
 	}
-
-	folioq_clear(folioq, slot);
 }
 
 /*