Message ID | 20250313171310.1886394-14-john.g.garry@oracle.com (mailing list archive) |
---|---|
State | New |
Series | large atomic writes for xfs with CoW |
On Thu, Mar 13, 2025 at 05:13:10PM +0000, John Garry wrote:
> For simplicity, limit at the max of what the mounted bdev can support in
> terms of atomic write limits. Maybe in future we will have a better way
> to advertise this optimised limit.

You'll still need to limit this by the amount that can be committed
in a single transaction. And handle the case where there is no
hardware support at all.

> xfs_get_atomic_write_max_attr(

I missed it in the previous version, but can we drop the pointless
_attr for these two helpers?

> +static inline void
> +xfs_compute_awu_max(

And use a more descriptive name than AWU, which really is just an
NVMe field name.

> +	awu_max = 1;
> +	while (1) {
> +		if (agsize % (awu_max * 2))
> +			break;

	while ((agsize % (awu_max * 2) == 0)) {

?

> +	xfs_extlen_t		m_awu_max;	/* data device max atomic write */

overly long line.
On 17/03/2025 07:25, Christoph Hellwig wrote:
> On Thu, Mar 13, 2025 at 05:13:10PM +0000, John Garry wrote:
>> For simplicity, limit at the max of what the mounted bdev can support in
>> terms of atomic write limits. Maybe in future we will have a better way
>> to advertise this optimised limit.
>
> You'll still need to limit this by the amount that can
> be committed in a single transaction.

yeah ... I'll revisit that

> And handle the case where there
> is no hardware support at all.

So xfs_get_atomic_write_max_attr() -> xfs_inode_can_atomicwrite() covers
no HW support.

The point of this function is just to calc atomic write limits according
to mount point geometry and features.

Do you think that it is necessary to call xfs_inode_can_atomicwrite()
here also? [And remove the xfs_get_atomic_write_max_attr() ->
xfs_inode_can_atomicwrite()?]

>
>> xfs_get_atomic_write_max_attr(
>
> I missed it in the previous version, but can we drop the
> pointless _attr for these two helpers?

ok, fine

>
>> +static inline void
>> +xfs_compute_awu_max(
>
> And use a more descriptive name than AWU, which really is just an
> NVMe field name.

I am just trying to be concise to limit spilling lines.

Maybe atomicwrite_unit_max is preferred

>
>> +	awu_max = 1;
>> +	while (1) {
>> +		if (agsize % (awu_max * 2))
>> +			break;
>
> 	while ((agsize % (awu_max * 2) == 0)) {
>
> ?
>
>> +	xfs_extlen_t		m_awu_max;	/* data device max atomic write */
>
> overly long line.

there are a few overly long lines here (so following that example), but
since there is a request to change the name, I'll definitely be using a
newline for the comment

Thanks,
John
On Mon, Mar 17, 2025 at 09:57:45AM +0000, John Garry wrote:
>> And handle the case where there
>> is no hardware support at all.
>
> So xfs_get_atomic_write_max_attr() -> xfs_inode_can_atomicwrite() covers no
> HW support.
>
> The point of this function is just to calc atomic write limits according to
> mount point geometry and features.
>
> Do you think that it is necessary to call xfs_inode_can_atomicwrite() here
> also? [And remove the xfs_get_atomic_write_max_attr() ->
> xfs_inode_can_atomicwrite()?]

At least document what it does.

>>> +static inline void
>>> +xfs_compute_awu_max(
>>
>> And use a more descriptive name than AWU, which really is just an
>> NVMe field name.
>
> I am just trying to be concise to limit spilling lines.
>
> Maybe atomicwrite_unit_max is preferred

I guess if we want to stick to the unit encoded in awu and used by the
block layer, yes.
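The loop shape suggested in the review, with the exit tests folded into the while condition, can be tried out in userspace. The following is only a sketch under stated assumptions: `compute_atomic_write_unit_max()` is a hypothetical name loosely based on the rename discussed above, and the `blocklog` shift stands in for the kernel's XFS_FSB_TO_B() conversion. It is not the code that was eventually merged.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the helper under review.
 *
 * agsize:   allocation group size in filesystem blocks (sb_agblocks)
 * blocklog: log2 of the filesystem block size, used here in place of
 *           XFS_FSB_TO_B()
 *
 * Returns the largest power-of-2 block count that divides agsize evenly
 * and whose size in bytes still fits in an unsigned int.
 */
static uint32_t
compute_atomic_write_unit_max(uint32_t agsize, unsigned int blocklog)
{
	/* 64-bit so that doubling and shifting cannot wrap. */
	uint64_t unit_max = 1;

	assert(blocklog < 32);

	while (agsize % (unit_max * 2) == 0 &&
	       ((unit_max * 2) << blocklog) <= UINT_MAX)
		unit_max *= 2;

	return (uint32_t)unit_max;
}
```

Both exit conditions of the original `while (1)` loop are kept; they are simply inverted and joined with `&&`, so the result should match the posted version for the same inputs.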
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 64b1f8c73824..7c22eefd6b89 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -615,10 +615,22 @@ unsigned int
 xfs_get_atomic_write_max_attr(
 	struct xfs_inode	*ip)
 {
+	struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
+	struct xfs_mount	*mp = ip->i_mount;
+
 	if (!xfs_inode_can_atomicwrite(ip))
 		return 0;
 
-	return ip->i_mount->m_sb.sb_blocksize;
+	/*
+	 * rtvol is not commonly used and supporting large atomic writes
+	 * would also be complicated to support there, so limit to a single
+	 * block for now.
+	 */
+	if (XFS_IS_REALTIME_INODE(ip))
+		return mp->m_sb.sb_blocksize;
+
+	return min_t(unsigned int, XFS_FSB_TO_B(mp, mp->m_awu_max),
+			target->bt_bdev_awu_max);
 }
 
 static void
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index e65a659901d5..fd89cb7a83fd 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -666,6 +666,33 @@ xfs_agbtree_compute_maxlevels(
 	mp->m_agbtree_maxlevels = max(levels, mp->m_refc_maxlevels);
 }
 
+static inline void
+xfs_compute_awu_max(
+	struct xfs_mount	*mp)
+{
+	xfs_agblock_t		agsize = mp->m_sb.sb_agblocks;
+	xfs_agblock_t		awu_max;
+
+	if (!xfs_has_reflink(mp)) {
+		mp->m_awu_max = 1;
+		return;
+	}
+
+	/*
+	 * Find highest power-of-2 evenly divisible into agsize and which
+	 * also fits into an unsigned int field.
+	 */
+	awu_max = 1;
+	while (1) {
+		if (agsize % (awu_max * 2))
+			break;
+		if (XFS_FSB_TO_B(mp, awu_max * 2) > UINT_MAX)
+			break;
+		awu_max *= 2;
+	}
+	mp->m_awu_max = awu_max;
+}
+
 /* Compute maximum possible height for realtime btree types for this fs. */
 static inline void
 xfs_rtbtree_compute_maxlevels(
@@ -751,6 +778,8 @@ xfs_mountfs(
 	xfs_agbtree_compute_maxlevels(mp);
 	xfs_rtbtree_compute_maxlevels(mp);
 
+	xfs_compute_awu_max(mp);
+
 	/*
 	 * Check if sb_agblocks is aligned at stripe boundary.  If sb_agblocks
 	 * is NOT aligned turn off m_dalign since allocator alignment is within
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 799b84220ebb..1b0136da2aec 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -229,6 +229,7 @@ typedef struct xfs_mount {
 	bool			m_finobt_nores; /* no per-AG finobt resv. */
 	bool			m_update_sb;	/* sb needs update in mount */
 	unsigned int		m_max_open_zones;
+	xfs_extlen_t		m_awu_max;	/* data device max atomic write */
 
 	/*
 	 * Bitsets of per-fs metadata that have been checked and/or are sick.
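For readers following the xfs_iops.c hunk, the way the reported limit is chosen can be modelled outside the kernel. This is a rough sketch only: `atomic_write_max_bytes()` and all of its parameters are hypothetical stand-ins for the checks and fields named in the patch (xfs_inode_can_atomicwrite(), XFS_IS_REALTIME_INODE(), m_awu_max, bt_bdev_awu_max), not kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the limit selection in the patch.
 * All parameters are illustrative stand-ins:
 *   can_atomicwrite - outcome of the xfs_inode_can_atomicwrite() check
 *   is_realtime     - outcome of XFS_IS_REALTIME_INODE()
 *   blocksize       - filesystem block size in bytes
 *   fs_unit_blocks  - mount-wide limit in blocks (m_awu_max in the patch)
 *   bdev_unit_bytes - device limit in bytes (bt_bdev_awu_max in the patch)
 */
static unsigned int
atomic_write_max_bytes(bool can_atomicwrite, bool is_realtime,
		       unsigned int blocksize, uint32_t fs_unit_blocks,
		       unsigned int bdev_unit_bytes)
{
	uint64_t fs_unit_bytes;

	if (!can_atomicwrite)
		return 0;

	/* Realtime inodes stay at a single block. */
	if (is_realtime)
		return blocksize;

	/* Widen before multiplying so the product cannot wrap. */
	fs_unit_bytes = (uint64_t)fs_unit_blocks * blocksize;
	if (fs_unit_bytes < bdev_unit_bytes)
		return (unsigned int)fs_unit_bytes;
	return bdev_unit_bytes;
}
```

For example, with 4096-byte blocks, a mount-wide limit of 16 blocks and a device limit of 1 MiB, the model reports 65536 bytes, because the filesystem-side limit is the smaller of the two.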