Message ID | 286638.1736163444@warthog.procyon.org.uk (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | netfs: Fix kernel async DIO | expand |
Hi David Thanks for the job ! I will buid Linux 6.10 and mainline with the provided change and I'm comming here as soon as I get results from tests (CET working time). Thanks again for help in this issue Nicolas Le 2025-01-06 12:37, David Howells a écrit : > Hi Nicolas, > > Does the attached fix your problem? > > David > --- > netfs: Fix kernel async DIO > > Netfslib needs to be able to handle kernel-initiated asynchronous DIO > that > is supplied with a bio_vec[] array. Currently, because of the async > flag, > this gets passed to netfs_extract_user_iter() which throws a warning > and > fails because it only handles IOVEC and UBUF iterators. This can be > triggered through a combination of cifs and a loopback blockdev with > something like: > > mount //my/cifs/share /foo > dd if=/dev/zero of=/foo/m0 bs=4K count=1K > losetup --sector-size 4096 --direct-io=on /dev/loop2046 /foo/m0 > echo hello >/dev/loop2046 > > This causes the following to appear in syslog: > > WARNING: CPU: 2 PID: 109 at fs/netfs/iterator.c:50 > netfs_extract_user_iter+0x170/0x250 [netfs] > > and the write to fail. > > Fix this by removing the check in netfs_unbuffered_write_iter_locked() > that > causes async kernel DIO writes to be handled as userspace writes. Note > that this change relies on the kernel caller maintaining the existence > of > the bio_vec array (or kvec[] or folio_queue) until the op is complete. > > Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support") > Reported by: Nicolas Baranger <nicolas.baranger@3xo.fr> > Closes: > https://lore.kernel.org/r/fedd8a40d54b2969097ffa4507979858@3xo.fr/ > Signed-off-by: David Howells <dhowells@redhat.com> > cc: Steve French <smfrench@gmail.com> > cc: Jeff Layton <jlayton@kernel.org> > cc: netfs@lists.linux.dev > cc: linux-cifs@vger.kernel.org > cc: linux-fsdevel@vger.kernel.org > --- > fs/netfs/direct_write.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c > index eded8afaa60b..42ce53cc216e 100644 > --- a/fs/netfs/direct_write.c > +++ b/fs/netfs/direct_write.c > @@ -67,7 +67,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct > kiocb *iocb, struct iov_iter * > * allocate a sufficiently large bvec array and may shorten the > * request. > */ > - if (async || user_backed_iter(iter)) { > + if (user_backed_iter(iter)) { > n = netfs_extract_user_iter(iter, len, &wreq->buffer.iter, 0); > if (n < 0) { > ret = n; > @@ -77,6 +77,11 @@ ssize_t netfs_unbuffered_write_iter_locked(struct > kiocb *iocb, struct iov_iter * > wreq->direct_bv_count = n; > wreq->direct_bv_unpin = iov_iter_extract_will_pin(iter); > } else { > + /* If this is a kernel-generated async DIO request, > + * assume that any resources the iterator points to > + * (eg. a bio_vec array) will persist till the end of > + * the op. > + */ > wreq->buffer.iter = *iter; > } > }
On Mon, Jan 06, 2025 at 11:37:24AM +0000, David Howells wrote: > mount //my/cifs/share /foo > dd if=/dev/zero of=/foo/m0 bs=4K count=1K > losetup --sector-size 4096 --direct-io=on /dev/loop2046 /foo/m0 > echo hello >/dev/loop2046 Can you add a testcase using losetup --direct-io with a file on $TEST_DIR so that we get coverage for ITER_BVEC directio to xfstests?
Hi David As your patch was written on top on linux-next I was required to make some small modifications to make it work on mainline (6.13-rc6). The following patch is working fine for me on mainline, but i think it would be better to wait for your confirmation / validation (or new patch) before applying it on production. #-------- PATCH --------# diff --git a/linux-6.13-rc6/nba/_orig_fs.netfs.direct_write.c b/linux-6.13-rc6/fs/netfs/direct_write.c index 88f2adf..94a1ee8 100644 --- a/linux-6.13-rc6/nba/_orig_fs.netfs.direct_write.c +++ b/linux-6.13-rc6/fs/netfs/direct_write.c @@ -67,7 +67,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter * * allocate a sufficiently large bvec array and may shorten the * request. */ - if (async || user_backed_iter(iter)) { + if (user_backed_iter(iter)) { n = netfs_extract_user_iter(iter, len, &wreq->iter, 0); if (n < 0) { ret = n; @@ -77,6 +77,11 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter * wreq->direct_bv_count = n; wreq->direct_bv_unpin = iov_iter_extract_will_pin(iter); } else { + /* If this is a kernel-generated async DIO request, + * assume that any resources the iterator points to + * (eg. a bio_vec array) will persist till the end of + * the op. + */ wreq->iter = *iter; } #-------- TESTS --------# Using this patch Linux 6.13-rc6 build with no error and '--direct-io=on' is working : 18:38:47 root@deb12-lab-10d:~# uname -a Linux deb12-lab-10d.lab.lan 6.13.0-rc6-amd64 #0 SMP PREEMPT_DYNAMIC Mon Jan 6 18:14:07 CET 2025 x86_64 GNU/Linux 18:39:29 root@deb12-lab-10d:~# losetup NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC /dev/loop2046 0 0 0 0 /mnt/FBX24T/FS-LAN/bckcrypt2046 1 4096 18:39:32 root@deb12-lab-10d:~# dmsetup ls | grep bckcrypt bckcrypt (254:7) 18:39:55 root@deb12-lab-10d:~# cryptsetup status bckcrypt /dev/mapper/bckcrypt is active and is in use. type: LUKS2 cipher: aes-xts-plain64 keysize: 512 bits key location: keyring device: /dev/loop2046 loop: /mnt/FBX24T/FS-LAN/bckcrypt2046 sector size: 512 offset: 32768 sectors size: 8589901824 sectors mode: read/write 18:40:36 root@deb12-lab-10d:~# df -h | egrep 'cifs|bckcrypt' //10.0.10.100/FBX24T cifs 22T 13T 9,0T 60% /mnt/FBX24T /dev/mapper/bckcrypt btrfs 4,0T 3,3T 779G 82% /mnt/bckcrypt 09:08:44 root@deb12-lab-10d:~# LANG=en_US.UTF-8 09:08:46 root@deb12-lab-10d:~# dd if=/dev/zero of=/mnt/bckcrypt/test/test.dd bs=256M count=16 oflag=direct status=progress 4294967296 bytes (4.3 GB, 4.0 GiB) copied, 14 s, 302 MB/s 16+0 records in 16+0 records out 4294967296 bytes (4.3 GB, 4.0 GiB) copied, 14.2061 s, 302 MB/s No write errors using '--direct-io=on' option of losetup with this patch => writing to the back-file is more than 20x faster ... It seems to be ok ! Let me know if something's wrong in this patch or if it can safely be used in production. Again thanks everyone for help. Nicolas Le 2025-01-06 13:07, nicolas.baranger@3xo.fr a écrit : > Hi David > > Thanks for the job ! > I will buid Linux 6.10 and mainline with the provided change and I'm > comming here as soon as I get results from tests (CET working time). > > Thanks again for help in this issue > Nicolas > > Le 2025-01-06 12:37, David Howells a écrit : > >> Hi Nicolas, >> >> Does the attached fix your problem? >> >> David >> --- >> netfs: Fix kernel async DIO >> >> Netfslib needs to be able to handle kernel-initiated asynchronous DIO >> that >> is supplied with a bio_vec[] array. Currently, because of the async >> flag, >> this gets passed to netfs_extract_user_iter() which throws a warning >> and >> fails because it only handles IOVEC and UBUF iterators. This can be >> triggered through a combination of cifs and a loopback blockdev with >> something like: >> >> mount //my/cifs/share /foo >> dd if=/dev/zero of=/foo/m0 bs=4K count=1K >> losetup --sector-size 4096 --direct-io=on /dev/loop2046 /foo/m0 >> echo hello >/dev/loop2046 >> >> This causes the following to appear in syslog: >> >> WARNING: CPU: 2 PID: 109 at fs/netfs/iterator.c:50 >> netfs_extract_user_iter+0x170/0x250 [netfs] >> >> and the write to fail. >> >> Fix this by removing the check in netfs_unbuffered_write_iter_locked() >> that >> causes async kernel DIO writes to be handled as userspace writes. >> Note >> that this change relies on the kernel caller maintaining the existence >> of >> the bio_vec array (or kvec[] or folio_queue) until the op is complete. >> >> Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support") >> Reported by: Nicolas Baranger <nicolas.baranger@3xo.fr> >> Closes: >> https://lore.kernel.org/r/fedd8a40d54b2969097ffa4507979858@3xo.fr/ >> Signed-off-by: David Howells <dhowells@redhat.com> >> cc: Steve French <smfrench@gmail.com> >> cc: Jeff Layton <jlayton@kernel.org> >> cc: netfs@lists.linux.dev >> cc: linux-cifs@vger.kernel.org >> cc: linux-fsdevel@vger.kernel.org >> --- >> fs/netfs/direct_write.c | 7 ++++++- >> 1 file changed, 6 insertions(+), 1 deletion(-) >> >> diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c >> index eded8afaa60b..42ce53cc216e 100644 >> --- a/fs/netfs/direct_write.c >> +++ b/fs/netfs/direct_write.c >> @@ -67,7 +67,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct >> kiocb *iocb, struct iov_iter * >> * allocate a sufficiently large bvec array and may shorten the >> * request. >> */ >> - if (async || user_backed_iter(iter)) { >> + if (user_backed_iter(iter)) { >> n = netfs_extract_user_iter(iter, len, &wreq->buffer.iter, 0); >> if (n < 0) { >> ret = n; >> @@ -77,6 +77,11 @@ ssize_t netfs_unbuffered_write_iter_locked(struct >> kiocb *iocb, struct iov_iter * >> wreq->direct_bv_count = n; >> wreq->direct_bv_unpin = iov_iter_extract_will_pin(iter); >> } else { >> + /* If this is a kernel-generated async DIO request, >> + * assume that any resources the iterator points to >> + * (eg. a bio_vec array) will persist till the end of >> + * the op. >> + */ >> wreq->buffer.iter = *iter; >> } >> }
David Howells <dhowells@redhat.com> writes: > netfs: Fix kernel async DIO > > Netfslib needs to be able to handle kernel-initiated asynchronous DIO that > is supplied with a bio_vec[] array. Currently, because of the async flag, > this gets passed to netfs_extract_user_iter() which throws a warning and > fails because it only handles IOVEC and UBUF iterators. This can be > triggered through a combination of cifs and a loopback blockdev with > something like: > > mount //my/cifs/share /foo > dd if=/dev/zero of=/foo/m0 bs=4K count=1K > losetup --sector-size 4096 --direct-io=on /dev/loop2046 /foo/m0 > echo hello >/dev/loop2046 > > This causes the following to appear in syslog: > > WARNING: CPU: 2 PID: 109 at fs/netfs/iterator.c:50 netfs_extract_user_iter+0x170/0x250 [netfs] > > and the write to fail. > > Fix this by removing the check in netfs_unbuffered_write_iter_locked() that > causes async kernel DIO writes to be handled as userspace writes. Note > that this change relies on the kernel caller maintaining the existence of > the bio_vec array (or kvec[] or folio_queue) until the op is complete. > > Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support") > Reported by: Nicolas Baranger <nicolas.baranger@3xo.fr> > Closes: https://lore.kernel.org/r/fedd8a40d54b2969097ffa4507979858@3xo.fr/ > Signed-off-by: David Howells <dhowells@redhat.com> > cc: Steve French <smfrench@gmail.com> > cc: Jeff Layton <jlayton@kernel.org> > cc: netfs@lists.linux.dev > cc: linux-cifs@vger.kernel.org > cc: linux-fsdevel@vger.kernel.org > --- > fs/netfs/direct_write.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) LGTM. Feel free to add: Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Thanks Christoph and Dave!
Thanks!
I ported the patch to linus/master (see below) and it looks pretty much the
same as yours, give or take tabs getting converted to spaces.
Could I put you down as a Tested-by?
David
---
netfs: Fix kernel async DIO
Netfslib needs to be able to handle kernel-initiated asynchronous DIO that
is supplied with a bio_vec[] array. Currently, because of the async flag,
this gets passed to netfs_extract_user_iter() which throws a warning and
fails because it only handles IOVEC and UBUF iterators. This can be
triggered through a combination of cifs and a loopback blockdev with
something like:
mount //my/cifs/share /foo
dd if=/dev/zero of=/foo/m0 bs=4K count=1K
losetup --sector-size 4096 --direct-io=on /dev/loop2046 /foo/m0
echo hello >/dev/loop2046
This causes the following to appear in syslog:
WARNING: CPU: 2 PID: 109 at fs/netfs/iterator.c:50 netfs_extract_user_iter+0x170/0x250 [netfs]
and the write to fail.
Fix this by removing the check in netfs_unbuffered_write_iter_locked() that
causes async kernel DIO writes to be handled as userspace writes. Note
that this change relies on the kernel caller maintaining the existence of
the bio_vec array (or kvec[] or folio_queue) until the op is complete.
Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support")
Reported by: Nicolas Baranger <nicolas.baranger@3xo.fr>
Closes: https://lore.kernel.org/r/fedd8a40d54b2969097ffa4507979858@3xo.fr/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <smfrench@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
fs/netfs/direct_write.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
index 173e8b5e6a93..f9421f3e6d37 100644
--- a/fs/netfs/direct_write.c
+++ b/fs/netfs/direct_write.c
@@ -67,7 +67,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
* allocate a sufficiently large bvec array and may shorten the
* request.
*/
- if (async || user_backed_iter(iter)) {
+ if (user_backed_iter(iter)) {
n = netfs_extract_user_iter(iter, len, &wreq->iter, 0);
if (n < 0) {
ret = n;
@@ -77,6 +77,11 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
wreq->direct_bv_count = n;
wreq->direct_bv_unpin = iov_iter_extract_will_pin(iter);
} else {
+ /* If this is a kernel-generated async DIO request,
+ * assume that any resources the iterator points to
+ * (eg. a bio_vec array) will persist till the end of
+ * the op.
+ */
wreq->iter = *iter;
}
Hi David Sure you can ! Please also note that after building 'linux-next' and applying the first patch you provide I sucessfully test DIO write (same test process as before). It works fine too ! I stay availiable for further testing Thanks again for help (special thanks to Christoph and David) Nicolas Le 2025-01-07 15:49, David Howells a écrit : > Thanks! > > I ported the patch to linus/master (see below) and it looks pretty much > the > same as yours, give or take tabs getting converted to spaces. > > Could I put you down as a Tested-by? > > David > > --- > netfs: Fix kernel async DIO > > Netfslib needs to be able to handle kernel-initiated asynchronous DIO > that > is supplied with a bio_vec[] array. Currently, because of the async > flag, > this gets passed to netfs_extract_user_iter() which throws a warning > and > fails because it only handles IOVEC and UBUF iterators. This can be > triggered through a combination of cifs and a loopback blockdev with > something like: > > mount //my/cifs/share /foo > dd if=/dev/zero of=/foo/m0 bs=4K count=1K > losetup --sector-size 4096 --direct-io=on /dev/loop2046 /foo/m0 > echo hello >/dev/loop2046 > > This causes the following to appear in syslog: > > WARNING: CPU: 2 PID: 109 at fs/netfs/iterator.c:50 > netfs_extract_user_iter+0x170/0x250 [netfs] > > and the write to fail. > > Fix this by removing the check in netfs_unbuffered_write_iter_locked() > that > causes async kernel DIO writes to be handled as userspace writes. Note > that this change relies on the kernel caller maintaining the existence > of > the bio_vec array (or kvec[] or folio_queue) until the op is complete. > > Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support") > Reported by: Nicolas Baranger <nicolas.baranger@3xo.fr> > Closes: > https://lore.kernel.org/r/fedd8a40d54b2969097ffa4507979858@3xo.fr/ > Signed-off-by: David Howells <dhowells@redhat.com> > cc: Steve French <smfrench@gmail.com> > cc: Jeff Layton <jlayton@kernel.org> > cc: netfs@lists.linux.dev > cc: linux-cifs@vger.kernel.org > cc: linux-fsdevel@vger.kernel.org > --- > fs/netfs/direct_write.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c > index 173e8b5e6a93..f9421f3e6d37 100644 > --- a/fs/netfs/direct_write.c > +++ b/fs/netfs/direct_write.c > @@ -67,7 +67,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct > kiocb *iocb, struct iov_iter * > * allocate a sufficiently large bvec array and may shorten the > * request. > */ > - if (async || user_backed_iter(iter)) { > + if (user_backed_iter(iter)) { > n = netfs_extract_user_iter(iter, len, &wreq->iter, 0); > if (n < 0) { > ret = n; > @@ -77,6 +77,11 @@ ssize_t netfs_unbuffered_write_iter_locked(struct > kiocb *iocb, struct iov_iter * > wreq->direct_bv_count = n; > wreq->direct_bv_unpin = iov_iter_extract_will_pin(iter); > } else { > + /* If this is a kernel-generated async DIO request, > + * assume that any resources the iterator points to > + * (eg. a bio_vec array) will persist till the end of > + * the op. > + */ > wreq->iter = *iter; > }
diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c index eded8afaa60b..42ce53cc216e 100644 --- a/fs/netfs/direct_write.c +++ b/fs/netfs/direct_write.c @@ -67,7 +67,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter * * allocate a sufficiently large bvec array and may shorten the * request. */ - if (async || user_backed_iter(iter)) { + if (user_backed_iter(iter)) { n = netfs_extract_user_iter(iter, len, &wreq->buffer.iter, 0); if (n < 0) { ret = n; @@ -77,6 +77,11 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter * wreq->direct_bv_count = n; wreq->direct_bv_unpin = iov_iter_extract_will_pin(iter); } else { + /* If this is a kernel-generated async DIO request, + * assume that any resources the iterator points to + * (eg. a bio_vec array) will persist till the end of + * the op. + */ wreq->buffer.iter = *iter; } }
Hi Nicolas, Does the attached fix your problem? David --- netfs: Fix kernel async DIO Netfslib needs to be able to handle kernel-initiated asynchronous DIO that is supplied with a bio_vec[] array. Currently, because of the async flag, this gets passed to netfs_extract_user_iter() which throws a warning and fails because it only handles IOVEC and UBUF iterators. This can be triggered through a combination of cifs and a loopback blockdev with something like: mount //my/cifs/share /foo dd if=/dev/zero of=/foo/m0 bs=4K count=1K losetup --sector-size 4096 --direct-io=on /dev/loop2046 /foo/m0 echo hello >/dev/loop2046 This causes the following to appear in syslog: WARNING: CPU: 2 PID: 109 at fs/netfs/iterator.c:50 netfs_extract_user_iter+0x170/0x250 [netfs] and the write to fail. Fix this by removing the check in netfs_unbuffered_write_iter_locked() that causes async kernel DIO writes to be handled as userspace writes. Note that this change relies on the kernel caller maintaining the existence of the bio_vec array (or kvec[] or folio_queue) until the op is complete. Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support") Reported by: Nicolas Baranger <nicolas.baranger@3xo.fr> Closes: https://lore.kernel.org/r/fedd8a40d54b2969097ffa4507979858@3xo.fr/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <smfrench@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org --- fs/netfs/direct_write.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)