diff mbox series

xfs: update XFS_IOC_DIOINFO memory alignment value

Message ID 20240711003637.2979807-1-david@fromorbit.com (mailing list archive)
State Accepted, archived

Commit Message

Dave Chinner July 11, 2024, 12:36 a.m. UTC
From: Dave Chinner <dchinner@redhat.com>

As of v6.0, the DIO memory buffer alignment is no longer aligned to
the logical sector size of the underlying block device. There is now
a specific DMA alignment parameter that memory buffers should be
aligned to. statx(STATX_DIOALIGN) gets this right, but
XFS_IOC_DIOINFO does not - it still uses the older fixed alignment
defined by the block device logical sector size.

This was found because the s390 DASD driver increased DMA alignment
to PAGE_SIZE in commit bc792884b76f ("s390/dasd: Establish DMA
alignment") and DIO aligned to logical sector sizes have started
failing on kernels with that commit. Fixing the "userspace fails
because device alignment constraints increased" issue is not XFS's
problem, but we really should be reporting the correct device memory
alignment in XFS_IOC_DIOINFO.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_ioctl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
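
For readers following along from userspace, here is a minimal sketch of why the memory alignment and the I/O size granularity need to be reported as separate values. It is not code from this patch or from XFS: struct dio_limits is a hypothetical stand-in for the three values DIOINFO reports, and max_io_for(), dio_request_ok() and dio_alloc() are illustrative names. Only the max_io_for() expression mirrors the d_maxiosz computation in the diff.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the values DIOINFO reports (not the kernel struct). */
struct dio_limits {
	uint32_t mem_align;	/* required buffer address alignment, in bytes */
	uint32_t min_io;	/* offset and length granularity, in bytes */
	uint32_t max_io;	/* largest single I/O, in bytes */
};

/* Same expression as the patch: INT_MAX rounded down to a multiple of min_io. */
static uint32_t max_io_for(uint32_t min_io)
{
	return (uint32_t)INT_MAX & ~(min_io - 1);
}

/* Return 1 if the request meets all three constraints, 0 otherwise. */
static int dio_request_ok(const struct dio_limits *l, const void *buf,
			  uint64_t offset, uint64_t len)
{
	if ((uintptr_t)buf % l->mem_align)
		return 0;	/* buffer address violates memory alignment */
	if (offset % l->min_io || len % l->min_io)
		return 0;	/* offset or length is not sector granular */
	return len >= l->min_io && len <= l->max_io;
}

/* Allocate a buffer that satisfies mem_align; returns NULL on failure. */
static void *dio_alloc(const struct dio_limits *l, size_t len)
{
	size_t align = l->mem_align;
	void *buf = NULL;

	/* Assumption of this sketch: alignments are powers of two. */
	assert(align != 0 && (align & (align - 1)) == 0);

	/* posix_memalign() rejects alignments smaller than sizeof(void *). */
	if (align < sizeof(void *))
		align = sizeof(void *);
	if (posix_memalign(&buf, align, len) != 0)
		return NULL;
	return buf;
}
```

With mem_align = 4096 and min_io = 512 (roughly the s390 DASD situation described above), a buffer that is only 512-byte aligned fails the first check even though the offset and length are valid. That is the failure mode userspace hits when it takes the memory alignment from the logical sector size.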

Comments

Darrick J. Wong July 11, 2024, 2:52 a.m. UTC | #1
On Thu, Jul 11, 2024 at 10:36:37AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> As of v6.0, the DIO memory buffer alignment is no longer aligned to
> the logical sector size of the underlying block device. There is now
> a specific DMA alignment parameter that memory buffers should be
> aligned to. statx(STATX_DIOALIGN) gets this right, but
> XFS_IOC_DIOINFO does not - it still uses the older fixed alignment
> defined by the block device logical sector size.
> 
> This was found because the s390 DASD driver increased DMA alignment
> to PAGE_SIZE in commit bc792884b76f ("s390/dasd: Establish DMA
> alignment") and DIO aligned to logical sector sizes have started
> failing on kernels with that commit. Fixing the "userspace fails
> because device alignment constraints increased" issue is not XFS's
> problem, but we really should be reporting the correct device memory
> alignment in XFS_IOC_DIOINFO.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_ioctl.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index f0117188f302..71eba4849e03 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1368,7 +1368,8 @@ xfs_file_ioctl(
>  		struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
>  		struct dioattr		da;
>  
> -		da.d_mem =  da.d_miniosz = target->bt_logical_sectorsize;
> +		da.d_mem = bdev_dma_alignment(target->bt_bdev);

bdev_dma_alignment returns a mask, so I think you want to add one here?

Though at this point, perhaps DIOINFO should query the STATX_DIOALIGN
information so xfs doesn't have to maintain this anymore?

(Or just make a helper that statx and DIOINFO can both call?)

--D

> +		da.d_miniosz = target->bt_logical_sectorsize;
>  		da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
>  
>  		if (copy_to_user(arg, &da, sizeof(da)))
> -- 
> 2.45.1
> 
>

Christoph Hellwig July 11, 2024, 4:03 a.m. UTC | #2
On Wed, Jul 10, 2024 at 07:52:06PM -0700, Darrick J. Wong wrote:
> > -		da.d_mem =  da.d_miniosz = target->bt_logical_sectorsize;
> > +		da.d_mem = bdev_dma_alignment(target->bt_bdev);
> 
> bdev_dma_alignment returns a mask, so I think you want to add one here?

Yes.

> Though at this point, perhaps DIOINFO should query the STATX_DIOALIGN
> information so xfs doesn't have to maintain this anymore?
> 
> (Or just make a helper that statx and DIOINFO can both call?)

Lift DIOINFO to the VFS and back it using the statx data?

Darrick J. Wong July 11, 2024, 4:36 a.m. UTC | #3
On Wed, Jul 10, 2024 at 09:03:15PM -0700, Christoph Hellwig wrote:
> On Wed, Jul 10, 2024 at 07:52:06PM -0700, Darrick J. Wong wrote:
> > > -		da.d_mem =  da.d_miniosz = target->bt_logical_sectorsize;
> > > +		da.d_mem = bdev_dma_alignment(target->bt_bdev);
> > 
> > bdev_dma_alignment returns a mask, so I think you want to add one here?
> 
> Yes.
> 
> > Though at this point, perhaps DIOINFO should query the STATX_DIOALIGN
> > information so xfs doesn't have to maintain this anymore?
> > 
> > (Or just make a helper that statx and DIOINFO can both call?)
> 
> Lift DIOINFO to the VFS and back it using the statx data?

<shrug> Is there anything that DIOINFO provides that statx doesn't?
AFAICT XFS is the only fs that implements DIOINFO, so why expand that?

--D

Christoph Hellwig July 11, 2024, 4:44 a.m. UTC | #4
On Wed, Jul 10, 2024 at 09:36:14PM -0700, Darrick J. Wong wrote:
> <shrug> Is there anything that DIOINFO provides that statx doesn't?
> AFAICT XFS is the only fs that implements DIOINFO, so why expand that?

Because it'll just make all the existing software using it do the right
thing everywhere?

Dave Chinner July 11, 2024, 5:51 a.m. UTC | #5
On Wed, Jul 10, 2024 at 07:52:06PM -0700, Darrick J. Wong wrote:
> On Thu, Jul 11, 2024 at 10:36:37AM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > As of v6.0, the DIO memory buffer alignment is no longer aligned to
> > the logical sector size of the underlying block device. There is now
> > a specific DMA alignment parameter that memory buffers should be
> > aligned to. statx(STATX_DIOALIGN) gets this right, but
> > XFS_IOC_DIOINFO does not - it still uses the older fixed alignment
> > defined by the block device logical sector size.
> > 
> > This was found because the s390 DASD driver increased DMA alignment
> > to PAGE_SIZE in commit bc792884b76f ("s390/dasd: Establish DMA
> > alignment") and DIO aligned to logical sector sizes have started
> > failing on kernels with that commit. Fixing the "userspace fails
> > because device alignment constraints increased" issue is not XFS's
> > problem, but we really should be reporting the correct device memory
> > alignment in XFS_IOC_DIOINFO.
> > 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> >  fs/xfs/xfs_ioctl.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> > index f0117188f302..71eba4849e03 100644
> > --- a/fs/xfs/xfs_ioctl.c
> > +++ b/fs/xfs/xfs_ioctl.c
> > @@ -1368,7 +1368,8 @@ xfs_file_ioctl(
> >  		struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
> >  		struct dioattr		da;
> >  
> > -		da.d_mem =  da.d_miniosz = target->bt_logical_sectorsize;
> > +		da.d_mem = bdev_dma_alignment(target->bt_bdev);
> 
> bdev_dma_alignment returns a mask, so I think you want to add one here?

Ah, yes, good eyes, I forgot to refresh the patch. I'll send an
updated version to the list.

> Though at this point, perhaps DIOINFO should query the STATX_DIOALIGN
> information so xfs doesn't have to maintain this anymore?

We open code the STATX_DIOALIGN stuff in xfs_vn_getattr() ourselves
- there's no point using statx to query information we supply statx
with in the first place.

> (Or just make a helper that statx and DIOINFO can both call?)

If we grow more internal users, then maybe?

-Dave.

Dave Chinner July 11, 2024, 5:52 a.m. UTC | #6
On Wed, Jul 10, 2024 at 09:44:57PM -0700, Christoph Hellwig wrote:
> On Wed, Jul 10, 2024 at 09:36:14PM -0700, Darrick J. Wong wrote:
> > <shrug> Is there anything that DIOINFO provides that statx doesn't?
> > AFAICT XFS is the only fs that implements DIOINFO, so why expand that?
> 
> Because it'll just make all the existing software using it do the right
> thing everywhere?

I'm just fixing a bug. If you want to make DIOINFO a VFS ioctl, I'll
review the patches but it's way outside the scope of fixing a minor
oversight in a recent feature addition...

-Dave.

Patch

diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index f0117188f302..71eba4849e03 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1368,7 +1368,8 @@  xfs_file_ioctl(
 		struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
 		struct dioattr		da;
 
-		da.d_mem =  da.d_miniosz = target->bt_logical_sectorsize;
+		da.d_mem = bdev_dma_alignment(target->bt_bdev);
+		da.d_miniosz = target->bt_logical_sectorsize;
 		da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
 
 		if (copy_to_user(arg, &da, sizeof(da)))
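
Note that the review comments in this thread point out that bdev_dma_alignment() returns a mask rather than a byte count, so the d_mem assignment in the patch as posted is off by one. A minimal userspace sketch of the mask-to-alignment relationship follows. The helper names are hypothetical and this is not the updated patch, only an illustration of the arithmetic under the assumption that a mask is always "alignment minus one" for a power-of-two alignment.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert an alignment mask (alignment - 1) into
 * the alignment in bytes.
 */
static uint32_t dma_mask_to_alignment(uint32_t mask)
{
	/* A valid mask has only its low bits set, so mask + 1 is a power of two. */
	assert(mask != UINT32_MAX && ((mask + 1) & mask) == 0);
	return mask + 1;
}

/* Hypothetical helper: test an address directly against the mask. */
static int is_mem_aligned(const void *buf, uint32_t mask)
{
	return ((uintptr_t)buf & mask) == 0;
}
```

For example, a mask of 511 corresponds to 512-byte alignment and a mask of 4095 to 4096-byte alignment. Reporting the raw mask to userspace would advertise an odd value such as 511 or 4095, which is not a usable alignment.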