Message ID | 161610681767.1887542.5197301352012661570.stgit@magnolia |
---|---|
State | Superseded |
Series | xfs: make xfs_can_free_eofblocks a predicate |
On Thu, Mar 18, 2021 at 03:33:37PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Fix the weird split of responsibilities between xfs_can_free_eofblocks
> and xfs_free_eofblocks by moving the chunk of code that looks for any
> actual post-EOF space mappings from the second function into the first.
>
> This clears the way for deferred inode inactivation to be able to decide
> if an inode needs inactivation work before committing the released inode
> to the inactivation code paths (vs. marking it for reclaim).
>
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/xfs_bmap_util.c | 148 +++++++++++++++++++++++++-----------------------
>  1 file changed, 78 insertions(+), 70 deletions(-)
>
>
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index e7d68318e6a5..d4ceba5370c7 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -597,8 +597,17 @@ xfs_bmap_punch_delalloc_range(
>   * regular files that are marked preallocated or append-only.
>   */
>  bool
> -xfs_can_free_eofblocks(struct xfs_inode *ip, bool force)
> +xfs_can_free_eofblocks(
> +	struct xfs_inode	*ip,
> +	bool			force)
>  {
> +	struct xfs_bmbt_irec	imap;
> +	struct xfs_mount	*mp = ip->i_mount;
> +	xfs_fileoff_t		end_fsb;
> +	xfs_fileoff_t		last_fsb;
> +	int			nimaps = 1;
> +	int			error;

Should we have an assert here that this is called under the iolock?
Or can't the reclaim be expressed nicely?

> +/*
> + * This is called to free any blocks beyond eof. The caller must hold
> + * IOLOCK_EXCL unless we are in the inode reclaim path and have the only
> + * reference to the inode.
> + */

Same thing here, usually asserts are better than comments..  That being
said can_free_eofblocks would benefit from at least a comment if the
assert doesn't work.

Otherwise this looks good.
On Fri, Mar 19, 2021 at 05:59:07AM +0000, Christoph Hellwig wrote:
> On Thu, Mar 18, 2021 at 03:33:37PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > Fix the weird split of responsibilities between xfs_can_free_eofblocks
> > and xfs_free_eofblocks by moving the chunk of code that looks for any
> > actual post-EOF space mappings from the second function into the first.
> >
> > This clears the way for deferred inode inactivation to be able to decide
> > if an inode needs inactivation work before committing the released inode
> > to the inactivation code paths (vs. marking it for reclaim).
> >
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  fs/xfs/xfs_bmap_util.c | 148 +++++++++++++++++++++++++-----------------------
> >  1 file changed, 78 insertions(+), 70 deletions(-)
> >
> >
> > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> > index e7d68318e6a5..d4ceba5370c7 100644
> > --- a/fs/xfs/xfs_bmap_util.c
> > +++ b/fs/xfs/xfs_bmap_util.c
> > @@ -597,8 +597,17 @@ xfs_bmap_punch_delalloc_range(
> >   * regular files that are marked preallocated or append-only.
> >   */
> >  bool
> > -xfs_can_free_eofblocks(struct xfs_inode *ip, bool force)
> > +xfs_can_free_eofblocks(
> > +	struct xfs_inode	*ip,
> > +	bool			force)
> >  {
> > +	struct xfs_bmbt_irec	imap;
> > +	struct xfs_mount	*mp = ip->i_mount;
> > +	xfs_fileoff_t		end_fsb;
> > +	xfs_fileoff_t		last_fsb;
> > +	int			nimaps = 1;
> > +	int			error;
>
> Should we have an assert here that this is called under the iolock?
> Or can't the reclaim be expressed nicely?

xfs_inactive doesn't take the iolock because (evidently) at some point
there were lockdep complaints about taking it in reclaim context.  By
the time the inode reaches inactivation context, there can't be any
other users of it anyway -- the last caller dropped its reference, we
tore down the VFS inode, and anyone who wants to resuscitate the inode
will wait in xfs_iget for us to finish.

--D

> > +/*
> > + * This is called to free any blocks beyond eof. The caller must hold
> > + * IOLOCK_EXCL unless we are in the inode reclaim path and have the only
> > + * reference to the inode.
> > + */
>
> Same thing here, usually asserts are better than comments..  That being
> said can_free_eofblocks would benefit from at least a comment if the
> assert doesn't work.
>
> Otherwise this looks good.
On Thu, Mar 18, 2021 at 11:05:34PM -0700, Darrick J. Wong wrote:
> xfs_inactive doesn't take the iolock because (evidently) at some point
> there were lockdep complaints about taking it in reclaim context.  By
> the time the inode reaches inactivation context, there can't be any
> other users of it anyway -- the last caller dropped its reference, we
> tore down the VFS inode, and anyone who wants to resuscitate the inode
> will wait in xfs_iget for us to finish.

Yes.  What I meant is that if we can deduce that we are in inactive
somehow (probably using the VFS inode state) we can ASSERT that we
are either in inactive or hold the iolock.
On Fri, Mar 19, 2021 at 06:35:37AM +0000, Christoph Hellwig wrote:
> On Thu, Mar 18, 2021 at 11:05:34PM -0700, Darrick J. Wong wrote:
> > xfs_inactive doesn't take the iolock because (evidently) at some point
> > there were lockdep complaints about taking it in reclaim context.  By
> > the time the inode reaches inactivation context, there can't be any
> > other users of it anyway -- the last caller dropped its reference, we
> > tore down the VFS inode, and anyone who wants to resuscitate the inode
> > will wait in xfs_iget for us to finish.
>
> Yes.  What I meant is that if we can deduce that we are in inactive
> somehow (probably using the VFS inode state) we can ASSERT that we
> are either in inactive or hold the iolock.

Yeah, I think we can do:

	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL) ||
	       (VFS_I(ip)->i_state & I_FREEING));

--D
On Fri, Mar 19, 2021 at 09:59:24AM -0700, Darrick J. Wong wrote:
> > Yes.  What I meant is that if we can deduce that we are in inactive
> > somehow (probably using the VFS inode state) we can ASSERT that we
> > are either in inactive or hold the iolock.
>
> Yeah, I think we can do:
>
> 	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL) ||
> 	       (VFS_I(ip)->i_state & I_FREEING));

Yes, that looks sensible.
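
For readers following the locking discussion, here is a minimal userspace sketch of the condition the thread settles on: the caller either holds IOLOCK_EXCL or the inode is already being torn down. It is not kernel code. struct mock_inode, MOCK_I_FREEING, eofblocks_ctx_ok() and check_eofblocks_ctx() are hypothetical stand-ins for xfs_isilocked(ip, XFS_IOLOCK_EXCL), I_FREEING and the proposed ASSERT, and the flag value is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the VFS I_FREEING state bit; value is arbitrary. */
#define MOCK_I_FREEING	(1u << 0)

/* Hypothetical userspace model of the two facts the assertion inspects. */
struct mock_inode {
	bool		iolock_excl_held;	/* models xfs_isilocked(ip, XFS_IOLOCK_EXCL) */
	unsigned int	i_state;		/* models VFS_I(ip)->i_state */
};

/*
 * True if the caller is in one of the two contexts discussed in the thread:
 * it holds the exclusive iolock, or the inode is being freed and nobody else
 * can reach it.
 */
static bool
eofblocks_ctx_ok(
	const struct mock_inode	*ip)
{
	return ip->iolock_excl_held || (ip->i_state & MOCK_I_FREEING);
}

/* Models where the proposed ASSERT would sit at the top of the function. */
static void
check_eofblocks_ctx(
	const struct mock_inode	*ip)
{
	assert(eofblocks_ctx_ok(ip));
}
```

In this model, an inode that is neither locked nor marked as freeing would trip the assertion, which is the case the review comments want caught at runtime rather than documented in a comment.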
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index e7d68318e6a5..d4ceba5370c7 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -597,8 +597,17 @@ xfs_bmap_punch_delalloc_range(
  * regular files that are marked preallocated or append-only.
  */
 bool
-xfs_can_free_eofblocks(struct xfs_inode *ip, bool force)
+xfs_can_free_eofblocks(
+	struct xfs_inode	*ip,
+	bool			force)
 {
+	struct xfs_bmbt_irec	imap;
+	struct xfs_mount	*mp = ip->i_mount;
+	xfs_fileoff_t		end_fsb;
+	xfs_fileoff_t		last_fsb;
+	int			nimaps = 1;
+	int			error;
+
 	/* prealloc/delalloc exists only on regular files */
 	if (!S_ISREG(VFS_I(ip)->i_mode))
 		return false;
@@ -624,91 +633,90 @@ xfs_can_free_eofblocks(struct xfs_inode *ip, bool force)
 		if (!force || ip->i_delayed_blks == 0)
 			return false;
 
-	return true;
-}
-
-/*
- * This is called to free any blocks beyond eof. The caller must hold
- * IOLOCK_EXCL unless we are in the inode reclaim path and have the only
- * reference to the inode.
- */
-int
-xfs_free_eofblocks(
-	struct xfs_inode	*ip)
-{
-	struct xfs_trans	*tp;
-	int			error;
-	xfs_fileoff_t		end_fsb;
-	xfs_fileoff_t		last_fsb;
-	xfs_filblks_t		map_len;
-	int			nimaps;
-	struct xfs_bmbt_irec	imap;
-	struct xfs_mount	*mp = ip->i_mount;
-
 	/*
-	 * Figure out if there are any blocks beyond the end
-	 * of the file.  If not, then there is nothing to do.
+	 * Do not try to free post-EOF blocks if EOF is beyond the end of the
+	 * range supported by the page cache, because the truncation will loop
+	 * forever.
 	 */
 	end_fsb = XFS_B_TO_FSB(mp, (xfs_ufsize_t)XFS_ISIZE(ip));
 	last_fsb = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
 	if (last_fsb <= end_fsb)
-		return 0;
-	map_len = last_fsb - end_fsb;
+		return false;
 
-	nimaps = 1;
+	/*
+	 * Look up the mapping for the first block past EOF.  If we can't find
+	 * it, there's nothing to free.
+	 */
 	xfs_ilock(ip, XFS_ILOCK_SHARED);
-	error = xfs_bmapi_read(ip, end_fsb, map_len, &imap, &nimaps, 0);
+	error = xfs_bmapi_read(ip, end_fsb, last_fsb - end_fsb, &imap, &nimaps,
+			0);
 	xfs_iunlock(ip, XFS_ILOCK_SHARED);
+	if (error || nimaps == 0)
+		return false;
 
 	/*
-	 * If there are blocks after the end of file, truncate the file to its
-	 * current size to free them up.
+	 * If there's a real mapping there or there are delayed allocation
+	 * reservations, then we have post-EOF blocks to try to free.
 	 */
-	if (!error && (nimaps != 0) &&
-	    (imap.br_startblock != HOLESTARTBLOCK ||
-	     ip->i_delayed_blks)) {
-		/*
-		 * Attach the dquots to the inode up front.
-		 */
-		error = xfs_qm_dqattach(ip);
-		if (error)
-			return error;
+	return imap.br_startblock != HOLESTARTBLOCK || ip->i_delayed_blks;
+}
 
-		/* wait on dio to ensure i_size has settled */
-		inode_dio_wait(VFS_I(ip));
+/*
+ * This is called to free any blocks beyond eof. The caller must hold
+ * IOLOCK_EXCL unless we are in the inode reclaim path and have the only
+ * reference to the inode.
+ */
+int
+xfs_free_eofblocks(
+	struct xfs_inode	*ip)
+{
+	struct xfs_trans	*tp;
+	struct xfs_mount	*mp = ip->i_mount;
+	int			error;
 
-		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0,
-				&tp);
-		if (error) {
-			ASSERT(XFS_FORCED_SHUTDOWN(mp));
-			return error;
-		}
+	/* Attach the dquots to the inode up front. */
+	error = xfs_qm_dqattach(ip);
+	if (error)
+		return error;
 
-		xfs_ilock(ip, XFS_ILOCK_EXCL);
-		xfs_trans_ijoin(tp, ip, 0);
+	/* Wait on dio to ensure i_size has settled. */
+	inode_dio_wait(VFS_I(ip));
 
-		/*
-		 * Do not update the on-disk file size.  If we update the
-		 * on-disk file size and then the system crashes before the
-		 * contents of the file are flushed to disk then the files
-		 * may be full of holes (ie NULL files bug).
-		 */
-		error = xfs_itruncate_extents_flags(&tp, ip, XFS_DATA_FORK,
-					XFS_ISIZE(ip), XFS_BMAPI_NODISCARD);
-		if (error) {
-			/*
-			 * If we get an error at this point we simply don't
-			 * bother truncating the file.
-			 */
-			xfs_trans_cancel(tp);
-		} else {
-			error = xfs_trans_commit(tp);
-			if (!error)
-				xfs_inode_clear_eofblocks_tag(ip);
-		}
-
-		xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp);
+	if (error) {
+		ASSERT(XFS_FORCED_SHUTDOWN(mp));
+		return error;
 	}
+
+	xfs_ilock(ip, XFS_ILOCK_EXCL);
+	xfs_trans_ijoin(tp, ip, 0);
+
+	/*
+	 * Do not update the on-disk file size.  If we update the on-disk file
+	 * size and then the system crashes before the contents of the file are
+	 * flushed to disk then the files may be full of holes (ie NULL files
+	 * bug).
+	 */
+	error = xfs_itruncate_extents_flags(&tp, ip, XFS_DATA_FORK,
+				XFS_ISIZE(ip), XFS_BMAPI_NODISCARD);
+	if (error)
+		goto err_cancel;
+
+	error = xfs_trans_commit(tp);
+	if (error)
+		goto out_unlock;
+
+	xfs_inode_clear_eofblocks_tag(ip);
+	goto out_unlock;
+
+err_cancel:
+	/*
+	 * If we get an error at this point we simply don't
+	 * bother truncating the file.
+	 */
+	xfs_trans_cancel(tp);
+out_unlock:
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return error;
 }
 
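
As an illustration of the checks this patch moves into the predicate, the following is a rough userspace model of just that tail end of xfs_can_free_eofblocks: the EOF-versus-maximum-size comparison, the mapping lookup outcome, and the final "real mapping or delalloc reservation" test. It is a sketch under stated assumptions, not the kernel implementation. struct mock_file, bytes_to_fsb(), mock_has_post_eof_blocks() and MOCK_HOLESTARTBLOCK are hypothetical names; the round-up conversion is assumed to mirror what XFS_B_TO_FSB does; and the xfs_bmapi_read() result is reduced to a boolean plus one start block. The earlier checks in the real function (regular file, preallocated/append-only flags) are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no block mapped here" (models HOLESTARTBLOCK). */
#define MOCK_HOLESTARTBLOCK	((uint64_t)-2)

/* Hypothetical userspace model of the inputs the moved code looks at. */
struct mock_file {
	uint64_t	isize;		/* in-core file size, bytes */
	uint64_t	maxbytes;	/* models mp->m_super->s_maxbytes */
	unsigned int	blocklog;	/* log2 of the filesystem block size */
	bool		lookup_failed;	/* models (error || nimaps == 0) */
	uint64_t	startblock;	/* models imap.br_startblock */
	uint64_t	delayed_blks;	/* models ip->i_delayed_blks */
};

/* Convert bytes to filesystem blocks, rounding up (assumed conversion). */
static uint64_t
bytes_to_fsb(
	uint64_t	bytes,
	unsigned int	blocklog)
{
	return (bytes + ((1ULL << blocklog) - 1)) >> blocklog;
}

/* Model of the post-EOF mapping checks moved into the predicate. */
static bool
mock_has_post_eof_blocks(
	const struct mock_file	*f)
{
	uint64_t		end_fsb = bytes_to_fsb(f->isize, f->blocklog);
	uint64_t		last_fsb = bytes_to_fsb(f->maxbytes, f->blocklog);

	/* EOF at or beyond the largest supported offset: nothing to free. */
	if (last_fsb <= end_fsb)
		return false;

	/* No mapping information for the first block past EOF. */
	if (f->lookup_failed)
		return false;

	/* A real mapping or a delalloc reservation means there is work to do. */
	return f->startblock != MOCK_HOLESTARTBLOCK || f->delayed_blks != 0;
}
```

The point of the split is visible in the model: the predicate only reads state and returns a bool, so a caller can ask whether work is needed without attaching dquots or allocating a transaction.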