diff mbox series

[2/2] xfs_db: use directio for device access

Message ID 169567915609.2320255.8945830759168479067.stgit@frogsfrogsfrogs (mailing list archive)
State New, archived
Headers show
Series xfs_db: use directio for filesystem access | expand

Commit Message

Darrick J. Wong Sept. 25, 2023, 9:59 p.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

XFS and tools (mkfs, copy, repair) don't generally rely on the block
device page cache, preferring instead to use directio.  For whatever
reason, the debugger was never made to do this, but let's do that now.

This should eliminate the weird fstests failures resulting from
udev/blkid pinning a cache page while the unmounting filesystem writes
to the superblock such that xfs_db finds the stale pagecache instead of
the post-unmount superblock.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 db/init.c |    1 +
 1 file changed, 1 insertion(+)

Comments

Christoph Hellwig Jan. 17, 2024, 6:30 p.m. UTC | #1
On Mon, Sep 25, 2023 at 02:59:16PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> XFS and tools (mkfs, copy, repair) don't generally rely on the block
> device page cache, preferring instead to use directio.  For whatever
> reason, the debugger was never made to do this, but let's do that now.
> 
> This should eliminate the weird fstests failures resulting from
> udev/blkid pinning a cache page while the unmounting filesystem writes
> to the superblock such that xfs_db finds the stale pagecache instead of
> the post-unmount superblock.

After some debugging I found out that this breaks a bunch of tests
(at least xfs/002 xfs/070 xfs/424 in the quick group) on 4k device
because xfs_db tries some unaligned reads.

For xfs/002 that is the libxfs_buf_read in __set_cur, when setting the
type to data, but I haven't looked at the other test in detail.

Should I look into finding all these assumptions in xfs_db, or
just make the direct I/O enablement conditional n a 612 byte sector
size?
Darrick J. Wong Jan. 18, 2024, 1:32 a.m. UTC | #2
On Wed, Jan 17, 2024 at 10:30:23AM -0800, Christoph Hellwig wrote:
> On Mon, Sep 25, 2023 at 02:59:16PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > XFS and tools (mkfs, copy, repair) don't generally rely on the block
> > device page cache, preferring instead to use directio.  For whatever
> > reason, the debugger was never made to do this, but let's do that now.
> > 
> > This should eliminate the weird fstests failures resulting from
> > udev/blkid pinning a cache page while the unmounting filesystem writes
> > to the superblock such that xfs_db finds the stale pagecache instead of
> > the post-unmount superblock.
> 
> After some debugging I found out that this breaks a bunch of tests
> (at least xfs/002 xfs/070 xfs/424 in the quick group) on 4k device
> because xfs_db tries some unaligned reads.
> 
> For xfs/002 that is the libxfs_buf_read in __set_cur, when setting the
> type to data, but I haven't looked at the other test in detail.

Hmm.  Perhaps the userspace buftarg setup should go find the physical
sector size of the device?  That "bb_count = 1" in set_iocur_type looks
a bit smelly.

> Should I look into finding all these assumptions in xfs_db, or
> just make the direct I/O enablement conditional n a 612 byte sector
> size?

Let me go run a lbasize=4k fstests run overnight and see what happens.
IIRC zorro told me last year that it wasn't pretty.

--D
Christoph Hellwig Jan. 18, 2024, 4:21 a.m. UTC | #3
On Wed, Jan 17, 2024 at 05:32:50PM -0800, Darrick J. Wong wrote:
> > 
> > For xfs/002 that is the libxfs_buf_read in __set_cur, when setting the
> > type to data, but I haven't looked at the other test in detail.
> 
> Hmm.  Perhaps the userspace buftarg setup should go find the physical
> sector size of the device?  That "bb_count = 1" in set_iocur_type looks
> a bit smelly.

Yes, that should fix this particular issue.

> > Should I look into finding all these assumptions in xfs_db, or
> > just make the direct I/O enablement conditional n a 612 byte sector
> > size?
> 
> Let me go run a lbasize=4k fstests run overnight and see what happens.
> IIRC zorro told me last year that it wasn't pretty.

There's a few failures, but I've been slowly trying to fix this.  The
libxfs/mkfs log sector size detection series in one part of that,
and this:

https://lore.kernel.org/linux-block/20240117175901.871796-1-hch@lst.de/T/#u

is another
Darrick J. Wong Jan. 19, 2024, 12:52 a.m. UTC | #4
On Wed, Jan 17, 2024 at 08:21:25PM -0800, Christoph Hellwig wrote:
> On Wed, Jan 17, 2024 at 05:32:50PM -0800, Darrick J. Wong wrote:
> > > 
> > > For xfs/002 that is the libxfs_buf_read in __set_cur, when setting the
> > > type to data, but I haven't looked at the other test in detail.
> > 
> > Hmm.  Perhaps the userspace buftarg setup should go find the physical
> > sector size of the device?  That "bb_count = 1" in set_iocur_type looks
> > a bit smelly.
> 
> Yes, that should fix this particular issue.
> 
> > > Should I look into finding all these assumptions in xfs_db, or
> > > just make the direct I/O enablement conditional n a 612 byte sector
> > > size?
> > 
> > Let me go run a lbasize=4k fstests run overnight and see what happens.
> > IIRC zorro told me last year that it wasn't pretty.
> 
> There's a few failures, but I've been slowly trying to fix this.  The
> libxfs/mkfs log sector size detection series in one part of that,
> and this:
> 
> https://lore.kernel.org/linux-block/20240117175901.871796-1-hch@lst.de/T/#u

Hmm well I didn't manage to add your loop device patch before I sent
this out last night, but here's the fstest results:

https://djwong.org/fstests/output/.67c2f90f0a1bb329a1b895c50285b0d23c1bd2bb44b7839f3543f82281665db1/.4a10533d4dd2085d3f996649e0886284f557617c94e604189448672e6009b9e8/

Looks like there were a lot of weird problems.  OFC now the second ice
storm has started and the lights are flickering so that might be all
from me for now.

--D

> is another
>
diff mbox series

Patch

diff --git a/db/init.c b/db/init.c
index eec65d0884d..4599cc00d71 100644
--- a/db/init.c
+++ b/db/init.c
@@ -96,6 +96,7 @@  init(
 		x.volname = fsdevice;
 	else
 		x.dname = fsdevice;
+	x.isdirect = LIBXFS_DIRECT;
 
 	x.bcache_flags = CACHE_MISCOMPARE_PURGE;
 	if (!libxfs_init(&x)) {