[v6,13/13] xfs: update atomic write max size

Message ID 20250313171310.1886394-14-john.g.garry@oracle.com (mailing list archive)
State New
Series large atomic writes for xfs with CoW

Commit Message

John Garry March 13, 2025, 5:13 p.m. UTC
Now that CoW-based atomic writes are supported, update the max size of an
atomic write.

For simplicity, limit at the max of what the mounted bdev can support in
terms of atomic write limits. Maybe in future we will have a better way
to advertise this optimised limit.

In addition, the max atomic write size needs to be aligned to the agsize.
Limit the size of atomic writes to the greatest power-of-two factor of the
agsize so that allocations for an atomic write will always be aligned
compatibly with the alignment requirements of the storage.

rtvol is not commonly used, so it is not very important to support large
atomic writes there initially.

Furthermore, adding large atomic writes for rtvol would be complicated due
to the alignment already offered by rtextsize, and also because reflink
support is only possible when rtextsize is a power-of-2.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 fs/xfs/xfs_iops.c  | 14 +++++++++++++-
 fs/xfs/xfs_mount.c | 29 +++++++++++++++++++++++++++++
 fs/xfs/xfs_mount.h |  1 +
 3 files changed, 43 insertions(+), 1 deletion(-)
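
The power-of-two limit described in the commit message can be sketched as a
standalone userspace program. This is an illustration only, not the kernel
code: largest_pow2_factor(), lowest_set_bit() and the blocklog parameter
(log2 of the block size in bytes, standing in for what XFS_FSB_TO_B() does
with the superblock) are hypothetical names introduced here. The loop uses
the single-condition form suggested in review; the bit trick is an
alternative that is not part of the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Largest power-of-2 number of blocks that evenly divides agsize and
 * whose size in bytes still fits in an unsigned int.
 * Hypothetical helper: blocklog is log2(block size in bytes).
 */
static uint32_t largest_pow2_factor(uint32_t agsize, unsigned int blocklog)
{
	uint64_t max = 1;	/* 64-bit so max * 2 cannot wrap */

	assert(agsize > 0);
	assert(blocklog < 32);

	while (agsize % (max * 2) == 0 &&
	       ((max * 2) << blocklog) <= UINT_MAX)
		max *= 2;

	return (uint32_t)max;
}

/*
 * Alternative, not used by the patch: the greatest power-of-2 factor of
 * a non-zero integer is its lowest set bit. This ignores the byte-size
 * cap applied above.
 */
static uint32_t lowest_set_bit(uint32_t n)
{
	return n & (~n + 1u);
}
```

For example, an agsize of 24576 blocks (3 * 8192) with 4096-byte blocks
gives 8192 blocks, while an odd agsize gives 1 block, i.e. no gain over
the existing single-block limit.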

Comments

Christoph Hellwig March 17, 2025, 7:25 a.m. UTC | #1
On Thu, Mar 13, 2025 at 05:13:10PM +0000, John Garry wrote:
> For simplicity, limit at the max of what the mounted bdev can support in
> terms of atomic write limits. Maybe in future we will have a better way
> to advertise this optimised limit.

You'll still need to limit this by the amount that can
be committed in a single transaction.  And handle the case where there
is no hardware support at all.

>  xfs_get_atomic_write_max_attr(

I missed it in the previous version, but can we drop the
pointless _attr for these two helpers?

> +static inline void
> +xfs_compute_awu_max(

And use a more descriptive name than AWU, which really is just an
NVMe field name.

> +	awu_max = 1;
> +	while (1) {
> +		if (agsize % (awu_max * 2))
> +			break;

	while ((agsize % (awu_max * 2) == 0)) {

?

> +	xfs_extlen_t		m_awu_max;	/* data device max atomic write */

overly long line.
John Garry March 17, 2025, 9:57 a.m. UTC | #2
On 17/03/2025 07:25, Christoph Hellwig wrote:
> On Thu, Mar 13, 2025 at 05:13:10PM +0000, John Garry wrote:
>> For simplicity, limit at the max of what the mounted bdev can support in
>> terms of atomic write limits. Maybe in future we will have a better way
>> to advertise this optimised limit.
> 
> You'll still need to limit this by the amount that can
> be committed in a single transaction.

yeah ... I'll revisit that

> And handle the case where there
> is no hardware support at all.

So xfs_get_atomic_write_max_attr() -> xfs_inode_can_atomicwrite() covers 
no HW support.

The point of this function is just to calc atomic write limits according 
to mount point geometry and features.

Do you think that it is necessary to call xfs_inode_can_atomicwrite() 
here also? [And remove the xfs_get_atomic_write_max_attr() -> 
xfs_inode_can_atomicwrite()?]

> 
>>   xfs_get_atomic_write_max_attr(
> 
> I missed it in the previous version, but can we drop the
> pointless _attr for these two helpers?

ok, fine

> 
>> +static inline void
>> +xfs_compute_awu_max(
> 
> And use a more descriptive name than AWU, which really is just an
> NVMe field name.

I am just trying to be concise to limit spilling lines.

Maybe atomicwrite_unit_max is preferred

> 
>> +	awu_max = 1;
>> +	while (1) {
>> +		if (agsize % (awu_max * 2))
>> +			break;
> 
> 	while ((agsize % (awu_max * 2) == 0)) {
> 
> ?
> 
>> +	xfs_extlen_t		m_awu_max;	/* data device max atomic write */
> 
> overly long line.

there are a few overly long lines here already (so I was following that
example), but since there is a request to change the name, I'll definitely
be putting the comment on its own line

Thanks,
John

>
hch March 18, 2025, 5:47 a.m. UTC | #3
On Mon, Mar 17, 2025 at 09:57:45AM +0000, John Garry wrote:
>> And handle the case where there
>> is no hardware support at all.
>
> So xfs_get_atomic_write_max_attr() -> xfs_inode_can_atomicwrite() covers no 
> HW support.
>
> The point of this function is just to calc atomic write limits according to 
> mount point geometry and features.
>
> Do you think that it is necessary to call xfs_inode_can_atomicwrite() here 
> also? [And remove the xfs_get_atomic_write_max_attr() -> 
> xfs_inode_can_atomicwrite()?]

At least document what it does..

>>> +static inline void
>>> +xfs_compute_awu_max(
>>
>> And use a more descriptive name than AWU, which really is just an
>> NVMe field name.
>
> I am just trying to be concise to limit spilling lines.
>
> Maybe atomicwrite_unit_max is preferred

I guess if we want to stick to the unit encoded in awu and used by the
block layer, yes.
Patch

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 64b1f8c73824..7c22eefd6b89 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -615,10 +615,22 @@  unsigned int
 xfs_get_atomic_write_max_attr(
 	struct xfs_inode	*ip)
 {
+	struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
+	struct xfs_mount	*mp = ip->i_mount;
+
 	if (!xfs_inode_can_atomicwrite(ip))
 		return 0;
 
-	return ip->i_mount->m_sb.sb_blocksize;
+	/*
+	 * rtvol is not commonly used and supporting large atomic writes
+	 * would also be complicated to support there, so limit to a single
+	 * block for now.
+	 */
+	if (XFS_IS_REALTIME_INODE(ip))
+		return mp->m_sb.sb_blocksize;
+
+	return min_t(unsigned int, XFS_FSB_TO_B(mp, mp->m_awu_max),
+				target->bt_bdev_awu_max);
 }
 
 static void
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index e65a659901d5..fd89cb7a83fd 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -666,6 +666,33 @@  xfs_agbtree_compute_maxlevels(
 	mp->m_agbtree_maxlevels = max(levels, mp->m_refc_maxlevels);
 }
 
+static inline void
+xfs_compute_awu_max(
+	struct xfs_mount	*mp)
+{
+	xfs_agblock_t		agsize = mp->m_sb.sb_agblocks;
+	xfs_agblock_t		awu_max;
+
+	if (!xfs_has_reflink(mp)) {
+		mp->m_awu_max = 1;
+		return;
+	}
+
+	/*
+	 * Find highest power-of-2 evenly divisible into agsize and which
+	 * also fits into an unsigned int field.
+	 */
+	awu_max = 1;
+	while (1) {
+		if (agsize % (awu_max * 2))
+			break;
+		if (XFS_FSB_TO_B(mp, awu_max * 2) > UINT_MAX)
+			break;
+		awu_max *= 2;
+	}
+	mp->m_awu_max = awu_max;
+}
+
 /* Compute maximum possible height for realtime btree types for this fs. */
 static inline void
 xfs_rtbtree_compute_maxlevels(
@@ -751,6 +778,8 @@  xfs_mountfs(
 	xfs_agbtree_compute_maxlevels(mp);
 	xfs_rtbtree_compute_maxlevels(mp);
 
+	xfs_compute_awu_max(mp);
+
 	/*
 	 * Check if sb_agblocks is aligned at stripe boundary.  If sb_agblocks
 	 * is NOT aligned turn off m_dalign since allocator alignment is within
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 799b84220ebb..1b0136da2aec 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -229,6 +229,7 @@  typedef struct xfs_mount {
 	bool			m_finobt_nores; /* no per-AG finobt resv. */
 	bool			m_update_sb;	/* sb needs update in mount */
 	unsigned int		m_max_open_zones;
+	xfs_extlen_t		m_awu_max;	/* data device max atomic write */
 
 	/*
 	 * Bitsets of per-fs metadata that have been checked and/or are sick.
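
How the xfs_iops.c hunk above picks the reported limit can be sketched in
userspace as follows. This is a minimal illustration under stated
assumptions, not the kernel implementation: struct fs_limits and
atomic_write_max_bytes() are hypothetical, and the two bool arguments stand
in for the xfs_inode_can_atomicwrite() and XFS_IS_REALTIME_INODE() checks.
Unlike the min_t(unsigned int, ...) in the patch, the sketch does the
comparison in 64 bits before narrowing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the state the patch reads; not a kernel type. */
struct fs_limits {
	uint32_t block_size;		/* bytes per filesystem block */
	uint32_t fs_max_blocks;		/* power-of-2 factor of agsize, in blocks */
	uint32_t bdev_max_bytes;	/* device atomic write unit max, in bytes */
};

/* Maximum atomic write size in bytes for one file, 0 if unsupported. */
static uint32_t atomic_write_max_bytes(const struct fs_limits *l,
				       bool can_atomicwrite, bool realtime)
{
	uint64_t fs_bytes;

	/* No atomic write support for this inode at all. */
	if (!can_atomicwrite)
		return 0;

	/* Realtime files stay limited to a single block. */
	if (realtime)
		return l->block_size;

	/* Otherwise take the smaller of the fs and device limits. */
	fs_bytes = (uint64_t)l->fs_max_blocks * l->block_size;
	if (fs_bytes < l->bdev_max_bytes)
		return (uint32_t)fs_bytes;
	return l->bdev_max_bytes;
}
```

With 4096-byte blocks, a filesystem limit of 8192 blocks and a device limit
of 65536 bytes, the device limit wins; with a filesystem limit of 4 blocks,
the filesystem limit of 16384 bytes wins.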