Message ID | 20250305015301.1610092-1-mcgrof@kernel.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | bdev: add back PAGE_SIZE block size validation for sb_set_blocksize() | expand |
On Tue, Mar 04, 2025 at 05:53:01PM -0800, Luis Chamberlain wrote: > The commit titled "block/bdev: lift block size restrictions to 64k" > lifted the block layer's max supported block size to 64k inside the > helper blk_validate_block_size() now that we support large folios. > However in lifting the block size we also removed the silly use > cases many filesystems have to use sb_set_blocksize() to *verify* > that the block size < PAGE_SIZE. The call to sb_set_blocksize() can > happen in-kernel given mkfs utilities *can* create for example an > ext4 32k block size filesystem on x86_64, the issue we want to prevent > is mounting it on x86_64 unless the filesystem supports LBS. > > While, we could argue that such checks should be filesystem specific, > there are much more users of sb_set_blocksize() than LBS enabled > filesystem on linux-next, so just do the easier thing and bring back > the PAGE_SIZE check for sb_set_blocksize() users. > > This will ensure that tests such as generic/466 when run in a loop > against say, ext4, won't try to try to actually mount a filesystem with > a block size larger than your filesystem supports given your PAGE_SIZE > and in the worst case crash. So this is expedient because XFS happens to not call sb_set_blocksize()? What is the path forward for filesystems which call sb_set_blocksize() today and want to support LBS in future?
On Wed, Mar 05, 2025 at 06:04:21AM +0000, Matthew Wilcox wrote: > On Tue, Mar 04, 2025 at 05:53:01PM -0800, Luis Chamberlain wrote: > > The commit titled "block/bdev: lift block size restrictions to 64k" > > lifted the block layer's max supported block size to 64k inside the > > helper blk_validate_block_size() now that we support large folios. > > However in lifting the block size we also removed the silly use > > cases many filesystems have to use sb_set_blocksize() to *verify* > > that the block size < PAGE_SIZE. The call to sb_set_blocksize() can > > happen in-kernel given mkfs utilities *can* create for example an > > ext4 32k block size filesystem on x86_64, the issue we want to prevent > > is mounting it on x86_64 unless the filesystem supports LBS. > > > > While, we could argue that such checks should be filesystem specific, > > there are much more users of sb_set_blocksize() than LBS enabled > > filesystem on linux-next, so just do the easier thing and bring back > > the PAGE_SIZE check for sb_set_blocksize() users. > > > > This will ensure that tests such as generic/466 when run in a loop > > against say, ext4, won't try to try to actually mount a filesystem with > > a block size larger than your filesystem supports given your PAGE_SIZE > > and in the worst case crash. > > So this is expedient because XFS happens to not call sb_set_blocksize()? > What is the path forward for filesystems which call sb_set_blocksize() > today and want to support LBS in future? Well they /could/ set sb_blocksize/sb_blocksize_bits themselves, like XFS does. Luis: Is the bsize > PAGE_SIZE constraint in set_blocksize go away? IOWs, will xfs support sector sizes > 4k in the near future? --D
On 3/5/25 02:53, Luis Chamberlain wrote: > The commit titled "block/bdev: lift block size restrictions to 64k" > lifted the block layer's max supported block size to 64k inside the > helper blk_validate_block_size() now that we support large folios. > However in lifting the block size we also removed the silly use > cases many filesystems have to use sb_set_blocksize() to *verify* > that the block size < PAGE_SIZE. The call to sb_set_blocksize() can > happen in-kernel given mkfs utilities *can* create for example an > ext4 32k block size filesystem on x86_64, the issue we want to prevent > is mounting it on x86_64 unless the filesystem supports LBS. > > While, we could argue that such checks should be filesystem specific, > there are much more users of sb_set_blocksize() than LBS enabled > filesystem on linux-next, so just do the easier thing and bring back > the PAGE_SIZE check for sb_set_blocksize() users. > > This will ensure that tests such as generic/466 when run in a loop > against say, ext4, won't try to try to actually mount a filesystem with > a block size larger than your filesystem supports given your PAGE_SIZE > and in the worst case crash. > > Cc: Kent Overstreet <kent.overstreet@linux.dev> > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> > --- > > Christian, a small fixup for a crash when running generic/466 on ext4 > in a loop. The issue is obvious, and we just need to ensure we don't > break old filesystem expectations of sb_set_blocksize(). > > This still allows XFS with 32k block size and I even tested with XFS > with 32k block size and a 32k sector size set. > > block/bdev.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/bdev.c b/block/bdev.c > index 3bd948e6438d..de9ebc3e5d15 100644 > --- a/block/bdev.c > +++ b/block/bdev.c > @@ -181,7 +181,7 @@ EXPORT_SYMBOL(set_blocksize); > > int sb_set_blocksize(struct super_block *sb, int size) > { > - if (set_blocksize(sb->s_bdev_file, size)) > + if (size > PAGE_SIZE || set_blocksize(sb->s_bdev_file, size)) > return 0; > /* If we get here, we know size is validated */ > sb->s_blocksize = size; Can you add a comment stating why it's needed, even with LBS? It's kinda non-obious, and we don't want to repeat the mistake in the future. Cheers, Hannes
On Tue, Mar 04, 2025 at 10:33:30PM -0800, Darrick J. Wong wrote: > > So this is expedient because XFS happens to not call sb_set_blocksize()? > > What is the path forward for filesystems which call sb_set_blocksize() > > today and want to support LBS in future? > > Well they /could/ set sb_blocksize/sb_blocksize_bits themselves, like > XFS does. I'm kind of hoping that isn't the answer. > Luis: Is the bsize > PAGE_SIZE constraint in set_blocksize go away? > IOWs, will xfs support sector sizes > 4k in the near future? Already there in linux-next. 47dd67532303 in next-20250304.
On Wed, Mar 05, 2025 at 02:15:47PM +0000, Matthew Wilcox wrote: > On Tue, Mar 04, 2025 at 10:33:30PM -0800, Darrick J. Wong wrote: > > > So this is expedient because XFS happens to not call sb_set_blocksize()? > > > What is the path forward for filesystems which call sb_set_blocksize() > > > today and want to support LBS in future? > > > > Well they /could/ set sb_blocksize/sb_blocksize_bits themselves, like > > XFS does. > > I'm kind of hoping that isn't the answer. set_blocksize() can be used. The only extra steps the filesystem needs to in addition is: sb->s_blocksize = size; sb->s_blocksize_bits = blksize_bits(size); Which is what both XFS and bcachefs do. We could modify sb to add an LBS flag but that alone would not suffice either as the upper limit is still a filesystem specific limit. Additionally it also does not suffice for filesystems that support a different device for metadata writes, for instance XFS supports this and uses the sector size for set_blocksize(). So I think that if ext4 for example wants to use LBS then simply it would open code the above two lines and use set_blocksize(). Let me know if you have any other recommendations. Luis
On Wed, Mar 05, 2025 at 08:18:29AM +0100, Hannes Reinecke wrote: > On 3/5/25 02:53, Luis Chamberlain wrote: > > The commit titled "block/bdev: lift block size restrictions to 64k" > > lifted the block layer's max supported block size to 64k inside the > > helper blk_validate_block_size() now that we support large folios. > > However in lifting the block size we also removed the silly use > > cases many filesystems have to use sb_set_blocksize() to *verify* > > that the block size < PAGE_SIZE. The call to sb_set_blocksize() can > > happen in-kernel given mkfs utilities *can* create for example an > > ext4 32k block size filesystem on x86_64, the issue we want to prevent > > is mounting it on x86_64 unless the filesystem supports LBS. > > > > While, we could argue that such checks should be filesystem specific, > > there are much more users of sb_set_blocksize() than LBS enabled > > filesystem on linux-next, so just do the easier thing and bring back > > the PAGE_SIZE check for sb_set_blocksize() users. > > > > This will ensure that tests such as generic/466 when run in a loop > > against say, ext4, won't try to try to actually mount a filesystem with > > a block size larger than your filesystem supports given your PAGE_SIZE > > and in the worst case crash. > > > > Cc: Kent Overstreet <kent.overstreet@linux.dev> > > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> > > --- > > > > Christian, a small fixup for a crash when running generic/466 on ext4 > > in a loop. The issue is obvious, and we just need to ensure we don't > > break old filesystem expectations of sb_set_blocksize(). > > > > This still allows XFS with 32k block size and I even tested with XFS > > with 32k block size and a 32k sector size set. > > > > block/bdev.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/block/bdev.c b/block/bdev.c > > index 3bd948e6438d..de9ebc3e5d15 100644 > > --- a/block/bdev.c > > +++ b/block/bdev.c > > @@ -181,7 +181,7 @@ EXPORT_SYMBOL(set_blocksize); > > int sb_set_blocksize(struct super_block *sb, int size) > > { > > - if (set_blocksize(sb->s_bdev_file, size)) > > + if (size > PAGE_SIZE || set_blocksize(sb->s_bdev_file, size)) > > return 0; > > /* If we get here, we know size is validated */ > > sb->s_blocksize = size; > > Can you add a comment stating why it's needed, even with LBS? > It's kinda non-obious, and we don't want to repeat the mistake > in the future. Sure. Luis
On Wed, Mar 05, 2025 at 09:04:50AM -0800, Luis Chamberlain wrote: > On Wed, Mar 05, 2025 at 02:15:47PM +0000, Matthew Wilcox wrote: > > On Tue, Mar 04, 2025 at 10:33:30PM -0800, Darrick J. Wong wrote: > > > > So this is expedient because XFS happens to not call sb_set_blocksize()? > > > > What is the path forward for filesystems which call sb_set_blocksize() > > > > today and want to support LBS in future? > > > > > > Well they /could/ set sb_blocksize/sb_blocksize_bits themselves, like > > > XFS does. > > > > I'm kind of hoping that isn't the answer. > > set_blocksize() can be used. The only extra steps the filesystem needs > to in addition is: > > sb->s_blocksize = size; > sb->s_blocksize_bits = blksize_bits(size); > > Which is what both XFS and bcachefs do. > > We could modify sb to add an LBS flag but that alone would not suffice > either as the upper limit is still a filesystem specific limit. Additionally > it also does not suffice for filesystems that support a different device > for metadata writes, for instance XFS supports this and uses the sector > size for set_blocksize(). > > So I think that if ext4 for example wants to use LBS then simply it > would open code the above two lines and use set_blocksize(). Let me know > if you have any other recommendations. int sb_set_large_blocksize(struct super_block *sb, int size) { if (set_blocksize(sb->s_bdev_file, size)) return 0; sb->s_blocksize = size; sb->s_blocksize_bits = blksize_bits(size); return sb->s_blocksize; } EXPORT_SYMBOL_GPL(sb_set_large_blocksize); int sb_set_blocksize(struct super_block *sb, int size) { if (size > PAGE_SIZE) return 0; return sb_set_large_blocksize(sb, size); } EXPORT_SYMBOL(sb_set_blocksize); Though you'll note that this doesn't help XFS, or any other filesystem where the bdev block size isn't set to the fs block size. But xfs can just be weird on its own like always. ;) --D > > Luis
On Wed, Mar 05, 2025 at 10:48:34AM -0800, Darrick J. Wong wrote: > On Wed, Mar 05, 2025 at 09:04:50AM -0800, Luis Chamberlain wrote: > > On Wed, Mar 05, 2025 at 02:15:47PM +0000, Matthew Wilcox wrote: > > > On Tue, Mar 04, 2025 at 10:33:30PM -0800, Darrick J. Wong wrote: > > > > > So this is expedient because XFS happens to not call sb_set_blocksize()? > > > > > What is the path forward for filesystems which call sb_set_blocksize() > > > > > today and want to support LBS in future? > > > > > > > > Well they /could/ set sb_blocksize/sb_blocksize_bits themselves, like > > > > XFS does. > > > > > > I'm kind of hoping that isn't the answer. > > > > set_blocksize() can be used. The only extra steps the filesystem needs > > to in addition is: > > > > sb->s_blocksize = size; > > sb->s_blocksize_bits = blksize_bits(size); > > > > Which is what both XFS and bcachefs do. > > > > We could modify sb to add an LBS flag but that alone would not suffice > > either as the upper limit is still a filesystem specific limit. Additionally > > it also does not suffice for filesystems that support a different device > > for metadata writes, for instance XFS supports this and uses the sector > > size for set_blocksize(). > > > > So I think that if ext4 for example wants to use LBS then simply it > > would open code the above two lines and use set_blocksize(). Let me know > > if you have any other recommendations. > > int sb_set_large_blocksize(struct super_block *sb, int size) > { > if (set_blocksize(sb->s_bdev_file, size)) > return 0; > sb->s_blocksize = size; > sb->s_blocksize_bits = blksize_bits(size); > return sb->s_blocksize; > } > EXPORT_SYMBOL_GPL(sb_set_large_blocksize); > > int sb_set_blocksize(struct super_block *sb, int size) > { > if (size > PAGE_SIZE) > return 0; > return sb_set_large_blocksize(sb, size); > } > EXPORT_SYMBOL(sb_set_blocksize); > > Though you'll note that this doesn't help XFS, or any other filesystem > where the bdev block size isn't set to the fs block size. But xfs can > just be weird on its own like always. ;) Actually, I failed to also notice XFS implicitly calls sb_set_large_blocksize() through: xfs_fs_get_tree() --> get_tree_bdev() --> get_tree_bdev_flags() --> get_tree_bdev_flags() --> sb_set_blocksize() We just don't care if sb_set_blocksize() if fails inside get_tree_bdev_flags(). To be clear this has been the case since we added and tested LBS on XFS. So if we wanted something more automatic, how about: diff --git a/block/bdev.c b/block/bdev.c index 3bd948e6438d..4844d1e27b6f 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -181,6 +181,8 @@ EXPORT_SYMBOL(set_blocksize); int sb_set_blocksize(struct super_block *sb, int size) { + if (!(sb->s_type->fs_flags & FS_LBS) && size > PAGE_SIZE) + return 0; if (set_blocksize(sb->s_bdev_file, size)) return 0; /* If we get here, we know size is validated */ diff --git a/fs/bcachefs/fs.c b/fs/bcachefs/fs.c index 90ade8f648d9..e99e378d68ea 100644 --- a/fs/bcachefs/fs.c +++ b/fs/bcachefs/fs.c @@ -2396,7 +2396,7 @@ static struct file_system_type bcache_fs_type = { .name = "bcachefs", .init_fs_context = bch2_init_fs_context, .kill_sb = bch2_kill_sb, - .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP, + .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_LBS, }; MODULE_ALIAS_FS("bcachefs"); diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index d92d7a07ea89..3d8b80165d48 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -2118,7 +2118,8 @@ static struct file_system_type xfs_fs_type = { .init_fs_context = xfs_init_fs_context, .parameters = xfs_fs_parameters, .kill_sb = xfs_kill_sb, - .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_MGTIME, + .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_MGTIME | + FS_LBS, }; MODULE_ALIAS_FS("xfs"); diff --git a/include/linux/fs.h b/include/linux/fs.h index 2c3b2f8a621f..16d17bd82be0 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2616,6 +2616,7 @@ struct file_system_type { #define FS_DISALLOW_NOTIFY_PERM 16 /* Disable fanotify permission events */ #define FS_ALLOW_IDMAP 32 /* FS has been updated to handle vfs idmappings. */ #define FS_MGTIME 64 /* FS uses multigrain timestamps */ +#define FS_LBS 128 /* FS supports LBS */ #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */ int (*init_fs_context)(struct fs_context *); const struct fs_parameter_spec *parameters;
diff --git a/block/bdev.c b/block/bdev.c index 3bd948e6438d..de9ebc3e5d15 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -181,7 +181,7 @@ EXPORT_SYMBOL(set_blocksize); int sb_set_blocksize(struct super_block *sb, int size) { - if (set_blocksize(sb->s_bdev_file, size)) + if (size > PAGE_SIZE || set_blocksize(sb->s_bdev_file, size)) return 0; /* If we get here, we know size is validated */ sb->s_blocksize = size;
The commit titled "block/bdev: lift block size restrictions to 64k" lifted the block layer's max supported block size to 64k inside the helper blk_validate_block_size() now that we support large folios. However in lifting the block size we also removed the silly use cases many filesystems have to use sb_set_blocksize() to *verify* that the block size < PAGE_SIZE. The call to sb_set_blocksize() can happen in-kernel given mkfs utilities *can* create for example an ext4 32k block size filesystem on x86_64, the issue we want to prevent is mounting it on x86_64 unless the filesystem supports LBS. While, we could argue that such checks should be filesystem specific, there are much more users of sb_set_blocksize() than LBS enabled filesystem on linux-next, so just do the easier thing and bring back the PAGE_SIZE check for sb_set_blocksize() users. This will ensure that tests such as generic/466 when run in a loop against say, ext4, won't try to try to actually mount a filesystem with a block size larger than your filesystem supports given your PAGE_SIZE and in the worst case crash. Cc: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> --- Christian, a small fixup for a crash when running generic/466 on ext4 in a loop. The issue is obvious, and we just need to ensure we don't break old filesystem expectations of sb_set_blocksize(). This still allows XFS with 32k block size and I even tested with XFS with 32k block size and a 32k sector size set. block/bdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)