netfs, ceph: Revert "netfs: Remove deprecated use of PG_private_2 as a second writeback flag"

Message ID 3575457.1722355300@warthog.procyon.org.uk (mailing list archive)
State New, archived

Commit Message

David Howells July 30, 2024, 4:01 p.m. UTC
Hi Max,

Can you try this patch instead of either of yours?

David
---

This reverts commit ae678317b95e760607c7b20b97c9cd4ca9ed6e1a.

Revert the patch that removes the deprecated use of PG_private_2 in
netfslib for the moment as Ceph is actually still using this to track
data copied to the cache.

Fixes: ae678317b95e ("netfs: Remove deprecated use of PG_private_2 as a second writeback flag")
Reported-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: ceph-devel@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/ceph/addr.c               |   19 +++++
 fs/netfs/buffered_read.c     |    8 ++
 fs/netfs/io.c                |  144 +++++++++++++++++++++++++++++++++++++++++++
 include/trace/events/netfs.h |    1 
 4 files changed, 170 insertions(+), 2 deletions(-)

Comments

Max Kellermann July 30, 2024, 4:28 p.m. UTC | #1
On Tue, Jul 30, 2024 at 6:01 PM David Howells <dhowells@redhat.com> wrote:
> Can you try this patch instead of either of yours?

I booted it on one of the servers, and no problem so far. All tests
complete successfully, even the one with copy_file_range that crashed
with my patch. I'll let you know when problems occur later, but until
then, I agree with merging your revert instead of my patches.

If I understand this correctly, my other problem (the
folio_attach_private conflict between netfs and ceph) I posted in
https://lore.kernel.org/ceph-devel/CAKPOu+8q_1rCnQndOj3KAitNY2scPQFuSS-AxeGru02nP9ZO0w@mail.gmail.com/
was caused by my (bad) patch after all, wasn't it?

> For the moment, ceph has to continue using PG_private_2.  It doesn't use
> netfs_writepages().  I have mostly complete patches to fix that, but they got
> popped onto the back burner for a bit.

When you're done with those patches, Cc me on those if you want me to
help test them.

Max

Max Kellermann July 30, 2024, 8 p.m. UTC | #2
On Tue, Jul 30, 2024 at 6:28 PM Max Kellermann <max.kellermann@ionos.com> wrote:
> I'll let you know when problems occur later, but until
> then, I agree with merging your revert instead of my patches.

Not sure if that's the same bug/cause (looks different), but 6.10.2
with your patch is still unstable:

 rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 9-.... 15-.... } 521399 jiffies s: 2085 root: 0x1/.
 rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x8200/.
 Sending NMI from CPU 3 to CPUs 9:
 NMI backtrace for cpu 9
 CPU: 9 PID: 2756 Comm: kworker/9:2 Tainted: G      D 6.10.2-cm4all2-vm+ #171
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
 Workqueue: ceph-msgr ceph_con_workfn
 RIP: 0010:native_queued_spin_lock_slowpath+0x80/0x260
 Code: 57 85 c0 74 10 0f b6 03 84 c0 74 09 f3 90 0f b6 03 84 c0 75 f7 b8 01 00 00 00 66 89 03 5b 5d 41 5c 41 5d c3 cc cc cc cc f3 90 <eb> 93 8b 37 b8 00 02 00 00 81 fe 00 01 00 00 74 07 eb a1 83 e8 01
 RSP: 0018:ffffaf5880c03bb8 EFLAGS: 00000202
 RAX: 0000000000000001 RBX: ffffa02bc37c9e98 RCX: ffffaf5880c03c90
 RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffffa02bc37c9e98
 RBP: ffffa02bc2f94000 R08: ffffaf5880c03c90 R09: 0000000000000010
 R10: 0000000000000514 R11: 0000000000000000 R12: ffffaf5880c03c90
 R13: ffffffffb4bcb2f0 R14: ffffa036c9e7e8e8 R15: ffffa02bc37c9e98
 FS:  0000000000000000(0000) GS:ffffa036cf040000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055fecac48568 CR3: 000000030d82c002 CR4: 00000000001706b0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <NMI>
  ? nmi_cpu_backtrace+0x83/0xf0
  ? nmi_cpu_backtrace_handler+0xd/0x20
  ? nmi_handle+0x56/0x120
  ? default_do_nmi+0x40/0x100
  ? exc_nmi+0xdc/0x100
  ? end_repeat_nmi+0xf/0x53
  ? __pfx_ceph_ino_compare+0x10/0x10
  ? native_queued_spin_lock_slowpath+0x80/0x260
  ? native_queued_spin_lock_slowpath+0x80/0x260
  ? native_queued_spin_lock_slowpath+0x80/0x260
  </NMI>
  <TASK>
  ? __pfx_ceph_ino_compare+0x10/0x10
  _raw_spin_lock+0x1e/0x30
  find_inode+0x6e/0xc0
  ? __pfx_ceph_ino_compare+0x10/0x10
  ? __pfx_ceph_set_ino_cb+0x10/0x10
  ilookup5_nowait+0x6d/0xa0
  ? __pfx_ceph_ino_compare+0x10/0x10
  iget5_locked+0x33/0xe0
  ceph_get_inode+0xb8/0xf0
  mds_dispatch+0xfe8/0x1ff0
  ? inet_recvmsg+0x4d/0xf0
  ceph_con_process_message+0x66/0x80
  ceph_con_v1_try_read+0xcfc/0x17c0
  ? __switch_to_asm+0x39/0x70
  ? finish_task_switch.isra.0+0x78/0x240
  ? __schedule+0x32a/0x1440
  ceph_con_workfn+0x339/0x4f0
  process_one_work+0x138/0x2e0
  worker_thread+0x2b9/0x3d0
  ? __pfx_worker_thread+0x10/0x10
  kthread+0xba/0xe0
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x30/0x50
  ? __pfx_kthread+0x10/0x10
  ret_from_fork_asm+0x1a/0x30
 </TASK>

Max Kellermann July 31, 2024, 8:16 a.m. UTC | #3
On Tue, Jul 30, 2024 at 6:28 PM Max Kellermann <max.kellermann@ionos.com> wrote:
> If I understand this correctly, my other problem (the
> folio_attach_private conflict between netfs and ceph) I posted in
> https://lore.kernel.org/ceph-devel/CAKPOu+8q_1rCnQndOj3KAitNY2scPQFuSS-AxeGru02nP9ZO0w@mail.gmail.com/
> was caused by my (bad) patch after all, wasn't it?

It was not caused by my bad patch. Without my patch, but with your
revert instead I just got a crash (this time, I enabled lots of
debugging options in the kernel, including KASAN) - it's the same
crash as in the post I linked in my previous email:

 ------------[ cut here ]------------
 WARNING: CPU: 13 PID: 3621 at fs/ceph/caps.c:3386 ceph_put_wrbuffer_cap_refs+0x416/0x500
 Modules linked in:
 CPU: 13 PID: 3621 Comm: rsync Not tainted 6.10.2-cm4all2-vm+ #176
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
 RIP: 0010:ceph_put_wrbuffer_cap_refs+0x416/0x500
 Code: e8 af 7f 50 01 45 84 ed 75 27 45 8d 74 24 ff e9 cf fd ff ff e8 ab ea 64 ff e9 4c fc ff ff 31 f6 48 89 df e8 3c 86 ff ff eb b5 <0f> 0b e9 7a ff ff ff 31 f6 48 89 df e8 29 86 ff ff eb cd 0f 0b 48
 RSP: 0018:ffff88813c57f868 EFLAGS: 00010286
 RAX: dffffc0000000000 RBX: ffff88823dc66588 RCX: 0000000000000000
 RDX: 1ffff11047b8cda7 RSI: ffff88823dc66df0 RDI: ffff88823dc66d38
 RBP: 0000000000000001 R08: 0000000000000000 R09: fffffbfff5f9a8cd
 R10: ffffffffafcd466f R11: 0000000000000001 R12: 0000000000000000
 R13: ffffea000947af00 R14: 00000000ffffffff R15: 0000000000000356
 FS:  00007f1e82957b80(0000) GS:ffff888a73400000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000559037dacea8 CR3: 000000013f1b2002 CR4: 00000000001706b0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <TASK>
  ? __warn+0xc8/0x2c0
  ? ceph_put_wrbuffer_cap_refs+0x416/0x500
  ? report_bug+0x257/0x2b0
  ? handle_bug+0x3c/0x70
  ? exc_invalid_op+0x13/0x40
  ? asm_exc_invalid_op+0x16/0x20
  ? ceph_put_wrbuffer_cap_refs+0x416/0x500
  ? ceph_put_wrbuffer_cap_refs+0x2e/0x500
  ceph_invalidate_folio+0x241/0x310
  truncate_cleanup_folio+0x277/0x330
  truncate_inode_pages_range+0x1b4/0x940
  ? __pfx_truncate_inode_pages_range+0x10/0x10
  ? __lock_acquire+0x19f3/0x5c10
  ? __lock_acquire+0x19f3/0x5c10
  ? __pfx___lock_acquire+0x10/0x10
  ? __pfx___lock_acquire+0x10/0x10
  ? srso_alias_untrain_ret+0x1/0x10
  ? lock_acquire+0x186/0x490
  ? find_held_lock+0x2d/0x110
  ? kvm_sched_clock_read+0xd/0x20
  ? local_clock_noinstr+0x9/0xb0
  ? __pfx_lock_release+0x10/0x10
  ? lockdep_hardirqs_on_prepare+0x275/0x3e0
  ceph_evict_inode+0xd5/0x530
  evict+0x251/0x560
  __dentry_kill+0x17b/0x500
  dput+0x393/0x690
  __fput+0x40e/0xa60
  __x64_sys_close+0x78/0xd0
  do_syscall_64+0x82/0x130
  ? lockdep_hardirqs_on_prepare+0x275/0x3e0
  ? syscall_exit_to_user_mode+0x9f/0x190
  ? do_syscall_64+0x8e/0x130
  ? lockdep_hardirqs_on_prepare+0x275/0x3e0
  ? lockdep_hardirqs_on_prepare+0x275/0x3e0
  ? syscall_exit_to_user_mode+0x9f/0x190
  ? do_syscall_64+0x8e/0x130
  ? do_syscall_64+0x8e/0x130
  ? lockdep_hardirqs_on_prepare+0x275/0x3e0
  entry_SYSCALL_64_after_hwframe+0x76/0x7e
 RIP: 0033:0x7f1e823178e0
 Code: 0d 00 00 00 eb b2 e8 ff f7 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 80 3d 01 1d 0e 00 00 74 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c
 RSP: 002b:00007ffe16c2e108 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
 RAX: ffffffffffffffda RBX: 000000000000001e RCX: 00007f1e823178e0
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
 RBP: 00007f1e8219bc08 R08: 0000000000000000 R09: 0000559037df64b0
 R10: fe04b91e88691591 R11: 0000000000000202 R12: 0000000000000001
 R13: 0000000000000000 R14: 00007ffe16c2e220 R15: 0000000000000001
  </TASK>
 irq event stamp: 26945
 hardirqs last  enabled at (26951): [<ffffffffaaac5a99>] console_unlock+0x189/0x1b0
 hardirqs last disabled at (26956): [<ffffffffaaac5a7e>] console_unlock+0x16e/0x1b0
 softirqs last  enabled at (26518): [<ffffffffaa962375>] irq_exit_rcu+0x95/0xc0
 softirqs last disabled at (26513): [<ffffffffaa962375>] irq_exit_rcu+0x95/0xc0
 ---[ end trace 0000000000000000 ]---
 ==================================================================
 BUG: KASAN: null-ptr-deref in ceph_put_snap_context+0x18/0x50
 Write of size 4 at addr 0000000000000356 by task rsync/3621

 CPU: 13 PID: 3621 Comm: rsync Tainted: G        W 6.10.2-cm4all2-vm+ #176
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
 Call Trace:
  <TASK>
  dump_stack_lvl+0x74/0xd0
  kasan_report+0xb9/0xf0
  ? ceph_put_snap_context+0x18/0x50
  kasan_check_range+0xeb/0x1a0
  ceph_put_snap_context+0x18/0x50
  ceph_invalidate_folio+0x249/0x310
  truncate_cleanup_folio+0x277/0x330
  truncate_inode_pages_range+0x1b4/0x940
  ? __pfx_truncate_inode_pages_range+0x10/0x10
  ? __lock_acquire+0x19f3/0x5c10
  ? __lock_acquire+0x19f3/0x5c10
  ? __pfx___lock_acquire+0x10/0x10
  ? __pfx___lock_acquire+0x10/0x10
  ? srso_alias_untrain_ret+0x1/0x10
  ? lock_acquire+0x186/0x490
  ? find_held_lock+0x2d/0x110
  ? kvm_sched_clock_read+0xd/0x20
  ? local_clock_noinstr+0x9/0xb0
  ? __pfx_lock_release+0x10/0x10
  ? lockdep_hardirqs_on_prepare+0x275/0x3e0
  ceph_evict_inode+0xd5/0x530
  evict+0x251/0x560
  __dentry_kill+0x17b/0x500
  dput+0x393/0x690
  __fput+0x40e/0xa60
  __x64_sys_close+0x78/0xd0
  do_syscall_64+0x82/0x130
  ? lockdep_hardirqs_on_prepare+0x275/0x3e0
  ? syscall_exit_to_user_mode+0x9f/0x190
  ? do_syscall_64+0x8e/0x130
  ? lockdep_hardirqs_on_prepare+0x275/0x3e0
  ? lockdep_hardirqs_on_prepare+0x275/0x3e0
  ? syscall_exit_to_user_mode+0x9f/0x190
  ? do_syscall_64+0x8e/0x130
  ? do_syscall_64+0x8e/0x130
  ? lockdep_hardirqs_on_prepare+0x275/0x3e0
  entry_SYSCALL_64_after_hwframe+0x76/0x7e
 RIP: 0033:0x7f1e823178e0
 Code: 0d 00 00 00 eb b2 e8 ff f7 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 80 3d 01 1d 0e 00 00 74 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c
 RSP: 002b:00007ffe16c2e108 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
 RAX: ffffffffffffffda RBX: 000000000000001e RCX: 00007f1e823178e0
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
 RBP: 00007f1e8219bc08 R08: 0000000000000000 R09: 0000559037df64b0
 R10: fe04b91e88691591 R11: 0000000000000202 R12: 0000000000000001
 R13: 0000000000000000 R14: 00007ffe16c2e220 R15: 0000000000000001
  </TASK>

David Howells July 31, 2024, 10:41 a.m. UTC | #4
Max Kellermann <max.kellermann@ionos.com> wrote:

> It was not caused by my bad patch. Without my patch, but with your
> revert instead I just got a crash (this time, I enabled lots of
> debugging options in the kernel, including KASAN) - it's the same
> crash as in the post I linked in my previous email:
> 
>  ------------[ cut here ]------------
>  WARNING: CPU: 13 PID: 3621 at fs/ceph/caps.c:3386 ceph_put_wrbuffer_cap_refs+0x416/0x500

Is that "WARN_ON_ONCE(ci->i_auth_cap);" for you?

David

Max Kellermann July 31, 2024, 11:37 a.m. UTC | #5
On Wed, Jul 31, 2024 at 12:41 PM David Howells <dhowells@redhat.com> wrote:

> >  ------------[ cut here ]------------
> >  WARNING: CPU: 13 PID: 3621 at fs/ceph/caps.c:3386 ceph_put_wrbuffer_cap_refs+0x416/0x500
>
> Is that "WARN_ON_ONCE(ci->i_auth_cap);" for you?

Yes, and that happens because no "capsnap" was found, because the
"snapc" parameter is 0x356 (NETFS_FOLIO_COPY_TO_CACHE); no
snap_context with address 0x356 could be found, of course.

Max
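
The failure described in the reply above can be modelled in userspace as a tag value stored in a shared pointer slot and later read back as if it were a real pointer. The sketch below is only an illustration under stated assumptions: `struct folio_model`, `struct snap_context` as defined here, `cache_mark_folio()` and `find_known_snapc()` are hypothetical stand-ins, not the kernel's definitions. Only the value 0x356 is taken from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are NOT the kernel's real structures. */
struct snap_context { int refcount; };
struct folio_model { void *private; };

/* Value quoted in the thread; it is a tag, not a dereferenceable address. */
#define COPY_TO_CACHE_SENTINEL ((void *)(uintptr_t)0x356)

/* One subsystem tags the folio by storing the sentinel in ->private. */
static void cache_mark_folio(struct folio_model *folio)
{
	folio->private = COPY_TO_CACHE_SENTINEL;
}

/* Another subsystem expects ->private to be a snap_context it knows about. */
static const struct snap_context *
find_known_snapc(const struct folio_model *folio,
		 const struct snap_context *const *known, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if ((const void *)known[i] == folio->private)
			return known[i];
	return NULL; /* models "no capsnap was found" */
}

/* Returns 1 if the lookup works before tagging and fails after it. */
int demo_sentinel_mismatch(void)
{
	struct snap_context a = { 1 }, b = { 1 };
	const struct snap_context *known[] = { &a, &b };
	struct folio_model folio = { .private = &a };

	/* With a genuine pointer in the slot, the lookup succeeds. */
	assert(find_known_snapc(&folio, known, 2) == &a);

	/* After tagging, the same slot no longer names any snap_context. */
	cache_mark_folio(&folio);
	int found_after = find_known_snapc(&folio, known, 2) != NULL;

	printf("private=%p found_after=%d\n", folio.private, found_after);
	return !found_after;
}
```

Calling `demo_sentinel_mismatch()` from a `main()` shows the lookup failing once the slot holds the tag. Dereferencing such a value, rather than only comparing it, would be the analogue of the null-ptr-deref KASAN reports at address 0x356.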
Patch

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 8c16bc5250ef..73b5a07bf94d 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -498,6 +498,11 @@  const struct netfs_request_ops ceph_netfs_ops = {
 };
 
 #ifdef CONFIG_CEPH_FSCACHE
+static void ceph_set_page_fscache(struct page *page)
+{
+	folio_start_private_2(page_folio(page)); /* [DEPRECATED] */
+}
+
 static void ceph_fscache_write_terminated(void *priv, ssize_t error, bool was_async)
 {
 	struct inode *inode = priv;
@@ -515,6 +520,10 @@  static void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len, b
 			       ceph_fscache_write_terminated, inode, true, caching);
 }
 #else
+static inline void ceph_set_page_fscache(struct page *page)
+{
+}
+
 static inline void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len, bool caching)
 {
 }
@@ -706,6 +715,8 @@  static int writepage_nounlock(struct page *page, struct writeback_control *wbc)
 		len = wlen;
 
 	set_page_writeback(page);
+	if (caching)
+		ceph_set_page_fscache(page);
 	ceph_fscache_write_to_cache(inode, page_off, len, caching);
 
 	if (IS_ENCRYPTED(inode)) {
@@ -789,6 +800,8 @@  static int ceph_writepage(struct page *page, struct writeback_control *wbc)
 		return AOP_WRITEPAGE_ACTIVATE;
 	}
 
+	folio_wait_private_2(page_folio(page)); /* [DEPRECATED] */
+
 	err = writepage_nounlock(page, wbc);
 	if (err == -ERESTARTSYS) {
 		/* direct memory reclaimer was killed by SIGKILL. return 0
@@ -1062,7 +1075,8 @@  static int ceph_writepages_start(struct address_space *mapping,
 				unlock_page(page);
 				break;
 			}
-			if (PageWriteback(page)) {
+			if (PageWriteback(page) ||
+			    PagePrivate2(page) /* [DEPRECATED] */) {
 				if (wbc->sync_mode == WB_SYNC_NONE) {
 					doutc(cl, "%p under writeback\n", page);
 					unlock_page(page);
@@ -1070,6 +1084,7 @@  static int ceph_writepages_start(struct address_space *mapping,
 				}
 				doutc(cl, "waiting on writeback %p\n", page);
 				wait_on_page_writeback(page);
+				folio_wait_private_2(page_folio(page)); /* [DEPRECATED] */
 			}
 
 			if (!clear_page_dirty_for_io(page)) {
@@ -1254,6 +1269,8 @@  static int ceph_writepages_start(struct address_space *mapping,
 			}
 
 			set_page_writeback(page);
+			if (caching)
+				ceph_set_page_fscache(page);
 			len += thp_size(page);
 		}
 		ceph_fscache_write_to_cache(inode, offset, len, caching);
diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index a688d4c75d99..424048f9ed1f 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -466,7 +466,7 @@  int netfs_write_begin(struct netfs_inode *ctx,
 	if (!netfs_is_cache_enabled(ctx) &&
 	    netfs_skip_folio_read(folio, pos, len, false)) {
 		netfs_stat(&netfs_n_rh_write_zskip);
-		goto have_folio;
+		goto have_folio_no_wait;
 	}
 
 	rreq = netfs_alloc_request(mapping, file,
@@ -507,6 +507,12 @@  int netfs_write_begin(struct netfs_inode *ctx,
 	netfs_put_request(rreq, false, netfs_rreq_trace_put_return);
 
 have_folio:
+	if (test_bit(NETFS_ICTX_USE_PGPRIV2, &ctx->flags)) {
+		ret = folio_wait_private_2_killable(folio);
+		if (ret < 0)
+			goto error;
+	}
+have_folio_no_wait:
 	*_folio = folio;
 	_leave(" = 0");
 	return 0;
diff --git a/fs/netfs/io.c b/fs/netfs/io.c
index c93851b98368..c179a1c73fa7 100644
--- a/fs/netfs/io.c
+++ b/fs/netfs/io.c
@@ -98,6 +98,146 @@  static void netfs_rreq_completed(struct netfs_io_request *rreq, bool was_async)
 	netfs_put_request(rreq, was_async, netfs_rreq_trace_put_complete);
 }
 
+/*
+ * [DEPRECATED] Deal with the completion of writing the data to the cache.  We
+ * have to clear the PG_fscache bits on the folios involved and release the
+ * caller's ref.
+ *
+ * May be called in softirq mode and we inherit a ref from the caller.
+ */
+static void netfs_rreq_unmark_after_write(struct netfs_io_request *rreq,
+					  bool was_async)
+{
+	struct netfs_io_subrequest *subreq;
+	struct folio *folio;
+	pgoff_t unlocked = 0;
+	bool have_unlocked = false;
+
+	rcu_read_lock();
+
+	list_for_each_entry(subreq, &rreq->subrequests, rreq_link) {
+		XA_STATE(xas, &rreq->mapping->i_pages, subreq->start / PAGE_SIZE);
+
+		xas_for_each(&xas, folio, (subreq->start + subreq->len - 1) / PAGE_SIZE) {
+			if (xas_retry(&xas, folio))
+				continue;
+
+			/* We might have multiple writes from the same huge
+			 * folio, but we mustn't unlock a folio more than once.
+			 */
+			if (have_unlocked && folio->index <= unlocked)
+				continue;
+			unlocked = folio_next_index(folio) - 1;
+			trace_netfs_folio(folio, netfs_folio_trace_end_copy);
+			folio_end_private_2(folio);
+			have_unlocked = true;
+		}
+	}
+
+	rcu_read_unlock();
+	netfs_rreq_completed(rreq, was_async);
+}
+
+static void netfs_rreq_copy_terminated(void *priv, ssize_t transferred_or_error,
+				       bool was_async) /* [DEPRECATED] */
+{
+	struct netfs_io_subrequest *subreq = priv;
+	struct netfs_io_request *rreq = subreq->rreq;
+
+	if (IS_ERR_VALUE(transferred_or_error)) {
+		netfs_stat(&netfs_n_rh_write_failed);
+		trace_netfs_failure(rreq, subreq, transferred_or_error,
+				    netfs_fail_copy_to_cache);
+	} else {
+		netfs_stat(&netfs_n_rh_write_done);
+	}
+
+	trace_netfs_sreq(subreq, netfs_sreq_trace_write_term);
+
+	/* If we decrement nr_copy_ops to 0, the ref belongs to us. */
+	if (atomic_dec_and_test(&rreq->nr_copy_ops))
+		netfs_rreq_unmark_after_write(rreq, was_async);
+
+	netfs_put_subrequest(subreq, was_async, netfs_sreq_trace_put_terminated);
+}
+
+/*
+ * [DEPRECATED] Perform any outstanding writes to the cache.  We inherit a ref
+ * from the caller.
+ */
+static void netfs_rreq_do_write_to_cache(struct netfs_io_request *rreq)
+{
+	struct netfs_cache_resources *cres = &rreq->cache_resources;
+	struct netfs_io_subrequest *subreq, *next, *p;
+	struct iov_iter iter;
+	int ret;
+
+	trace_netfs_rreq(rreq, netfs_rreq_trace_copy);
+
+	/* We don't want terminating writes trying to wake us up whilst we're
+	 * still going through the list.
+	 */
+	atomic_inc(&rreq->nr_copy_ops);
+
+	list_for_each_entry_safe(subreq, p, &rreq->subrequests, rreq_link) {
+		if (!test_bit(NETFS_SREQ_COPY_TO_CACHE, &subreq->flags)) {
+			list_del_init(&subreq->rreq_link);
+			netfs_put_subrequest(subreq, false,
+					     netfs_sreq_trace_put_no_copy);
+		}
+	}
+
+	list_for_each_entry(subreq, &rreq->subrequests, rreq_link) {
+		/* Amalgamate adjacent writes */
+		while (!list_is_last(&subreq->rreq_link, &rreq->subrequests)) {
+			next = list_next_entry(subreq, rreq_link);
+			if (next->start != subreq->start + subreq->len)
+				break;
+			subreq->len += next->len;
+			list_del_init(&next->rreq_link);
+			netfs_put_subrequest(next, false,
+					     netfs_sreq_trace_put_merged);
+		}
+
+		ret = cres->ops->prepare_write(cres, &subreq->start, &subreq->len,
+					       subreq->len, rreq->i_size, true);
+		if (ret < 0) {
+			trace_netfs_failure(rreq, subreq, ret, netfs_fail_prepare_write);
+			trace_netfs_sreq(subreq, netfs_sreq_trace_write_skip);
+			continue;
+		}
+
+		iov_iter_xarray(&iter, ITER_SOURCE, &rreq->mapping->i_pages,
+				subreq->start, subreq->len);
+
+		atomic_inc(&rreq->nr_copy_ops);
+		netfs_stat(&netfs_n_rh_write);
+		netfs_get_subrequest(subreq, netfs_sreq_trace_get_copy_to_cache);
+		trace_netfs_sreq(subreq, netfs_sreq_trace_write);
+		cres->ops->write(cres, subreq->start, &iter,
+				 netfs_rreq_copy_terminated, subreq);
+	}
+
+	/* If we decrement nr_copy_ops to 0, the usage ref belongs to us. */
+	if (atomic_dec_and_test(&rreq->nr_copy_ops))
+		netfs_rreq_unmark_after_write(rreq, false);
+}
+
+static void netfs_rreq_write_to_cache_work(struct work_struct *work) /* [DEPRECATED] */
+{
+	struct netfs_io_request *rreq =
+		container_of(work, struct netfs_io_request, work);
+
+	netfs_rreq_do_write_to_cache(rreq);
+}
+
+static void netfs_rreq_write_to_cache(struct netfs_io_request *rreq) /* [DEPRECATED] */
+{
+	rreq->work.func = netfs_rreq_write_to_cache_work;
+	if (!queue_work(system_unbound_wq, &rreq->work))
+		BUG();
+}
+
 /*
  * Handle a short read.
  */
@@ -275,6 +415,10 @@  static void netfs_rreq_assess(struct netfs_io_request *rreq, bool was_async)
 	clear_bit_unlock(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
 	wake_up_bit(&rreq->flags, NETFS_RREQ_IN_PROGRESS);
 
+	if (test_bit(NETFS_RREQ_COPY_TO_CACHE, &rreq->flags) &&
+	    test_bit(NETFS_RREQ_USE_PGPRIV2, &rreq->flags))
+		return netfs_rreq_write_to_cache(rreq);
+
 	netfs_rreq_completed(rreq, was_async);
 }
 
diff --git a/include/trace/events/netfs.h b/include/trace/events/netfs.h
index da23484268df..24ec3434d32e 100644
--- a/include/trace/events/netfs.h
+++ b/include/trace/events/netfs.h
@@ -145,6 +145,7 @@ 
 	EM(netfs_folio_trace_clear_g,		"clear-g")	\
 	EM(netfs_folio_trace_clear_s,		"clear-s")	\
 	EM(netfs_folio_trace_copy_to_cache,	"mark-copy")	\
+	EM(netfs_folio_trace_end_copy,		"end-copy")	\
 	EM(netfs_folio_trace_filled_gaps,	"filled-gaps")	\
 	EM(netfs_folio_trace_kill,		"kill")		\
 	EM(netfs_folio_trace_kill_cc,		"kill-cc")	\
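
The nr_copy_ops handling in netfs_rreq_do_write_to_cache() in the fs/netfs/io.c hunk takes an extra count before walking the list and drops it afterwards, so that a write terminating early cannot complete the request while the list is still being walked. A minimal userspace sketch of that biased-counter pattern follows, using C11 atomics. `struct request_model`, `op_terminated()` and `issue_ops()` are hypothetical names for illustration and do not reproduce the kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a request with a biased operation counter. */
struct request_model {
	atomic_int nr_ops;	/* outstanding operations plus the issuer's bias */
	int completions;	/* how many times the final step ran */
};

static void finish_request(struct request_model *req)
{
	req->completions++;
}

/* Drop one count; whoever takes the counter to zero runs the final step. */
static void op_terminated(struct request_model *req)
{
	if (atomic_fetch_sub(&req->nr_ops, 1) == 1)
		finish_request(req);
}

/*
 * Issue nr operations.  If complete_inline is true, each operation
 * terminates immediately, before the next one is issued.
 * Returns how many times the request was finished.
 */
int issue_ops(struct request_model *req, int nr, bool complete_inline)
{
	atomic_init(&req->nr_ops, 0);
	req->completions = 0;

	/* Bias: keeps the counter above zero while we are still issuing. */
	atomic_fetch_add(&req->nr_ops, 1);

	for (int i = 0; i < nr; i++) {
		atomic_fetch_add(&req->nr_ops, 1);
		if (complete_inline)
			op_terminated(req); /* cannot reach zero: bias held */
	}

	/* Drop the bias; this finishes the request if nothing is pending. */
	op_terminated(req);

	/* Deferred case: the operations terminate after issuing is done. */
	if (!complete_inline)
		for (int i = 0; i < nr; i++)
			op_terminated(req);

	return req->completions;
}
```

In this model the final step runs exactly once whether the operations terminate during the loop, after it, or when there are none at all. Without the initial bias, the first inline termination would take the counter to zero and finish the request too early.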