[2/2] xfs: move the check for post-EOF mappings into xfs_can_free_eofblocks

Message ID 161610681767.1887542.5197301352012661570.stgit@magnolia
State Superseded
Series xfs: make xfs_can_free_eofblocks a predicate

Commit Message

Darrick J. Wong March 18, 2021, 10:33 p.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

Fix the weird split of responsibilities between xfs_can_free_eofblocks
and xfs_free_eofblocks by moving the chunk of code that looks for any
actual post-EOF space mappings from the second function into the first.

This clears the way for deferred inode inactivation to be able to decide
if an inode needs inactivation work before committing the released inode
to the inactivation code paths (vs. marking it for reclaim).

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_bmap_util.c |  148 +++++++++++++++++++++++++-----------------------
 1 file changed, 78 insertions(+), 70 deletions(-)

Comments

Christoph Hellwig March 19, 2021, 5:59 a.m. UTC | #1
On Thu, Mar 18, 2021 at 03:33:37PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Fix the weird split of responsibilities between xfs_can_free_eofblocks
> and xfs_free_eofblocks by moving the chunk of code that looks for any
> actual post-EOF space mappings from the second function into the first.
> 
> This clears the way for deferred inode inactivation to be able to decide
> if an inode needs inactivation work before committing the released inode
> to the inactivation code paths (vs. marking it for reclaim).
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/xfs_bmap_util.c |  148 +++++++++++++++++++++++++-----------------------
>  1 file changed, 78 insertions(+), 70 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index e7d68318e6a5..d4ceba5370c7 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -597,8 +597,17 @@ xfs_bmap_punch_delalloc_range(
>   * regular files that are marked preallocated or append-only.
>   */
>  bool
> -xfs_can_free_eofblocks(struct xfs_inode *ip, bool force)
> +xfs_can_free_eofblocks(
> +	struct xfs_inode	*ip,
> +	bool			force)
>  {
> +	struct xfs_bmbt_irec	imap;
> +	struct xfs_mount	*mp = ip->i_mount;
> +	xfs_fileoff_t		end_fsb;
> +	xfs_fileoff_t		last_fsb;
> +	int			nimaps = 1;
> +	int			error;

Should we have an assert here that this is called under the iolock?
Or can't the reclaim be expressed nicely?

> +/*
> + * This is called to free any blocks beyond eof. The caller must hold
> + * IOLOCK_EXCL unless we are in the inode reclaim path and have the only
> + * reference to the inode.
> + */

Same thing here; asserts are usually better than comments.  That being
said, xfs_can_free_eofblocks would benefit from at least a comment if
the assert doesn't work.

Otherwise this looks good.

Darrick J. Wong March 19, 2021, 6:05 a.m. UTC | #2
On Fri, Mar 19, 2021 at 05:59:07AM +0000, Christoph Hellwig wrote:
> On Thu, Mar 18, 2021 at 03:33:37PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Fix the weird split of responsibilities between xfs_can_free_eofblocks
> > and xfs_free_eofblocks by moving the chunk of code that looks for any
> > actual post-EOF space mappings from the second function into the first.
> > 
> > This clears the way for deferred inode inactivation to be able to decide
> > if an inode needs inactivation work before committing the released inode
> > to the inactivation code paths (vs. marking it for reclaim).
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  fs/xfs/xfs_bmap_util.c |  148 +++++++++++++++++++++++++-----------------------
> >  1 file changed, 78 insertions(+), 70 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> > index e7d68318e6a5..d4ceba5370c7 100644
> > --- a/fs/xfs/xfs_bmap_util.c
> > +++ b/fs/xfs/xfs_bmap_util.c
> > @@ -597,8 +597,17 @@ xfs_bmap_punch_delalloc_range(
> >   * regular files that are marked preallocated or append-only.
> >   */
> >  bool
> > -xfs_can_free_eofblocks(struct xfs_inode *ip, bool force)
> > +xfs_can_free_eofblocks(
> > +	struct xfs_inode	*ip,
> > +	bool			force)
> >  {
> > +	struct xfs_bmbt_irec	imap;
> > +	struct xfs_mount	*mp = ip->i_mount;
> > +	xfs_fileoff_t		end_fsb;
> > +	xfs_fileoff_t		last_fsb;
> > +	int			nimaps = 1;
> > +	int			error;
> 
> Should we have an assert here that this is called under the iolock?
> Or can't the reclaim be expressed nicely?

xfs_inactive doesn't take the iolock because (evidently) at some point
there were lockdep complaints about taking it in reclaim context.  By
the time the inode reaches inactivation context, there can't be any
other users of it anyway -- the last caller dropped its reference, we
tore down the VFS inode, and anyone who wants to resuscitate the inode
will wait in xfs_iget for us to finish.

--D

> > +/*
> > + * This is called to free any blocks beyond eof. The caller must hold
> > + * IOLOCK_EXCL unless we are in the inode reclaim path and have the only
> > + * reference to the inode.
> > + */
> 
> Same thing here; asserts are usually better than comments.  That being
> said, xfs_can_free_eofblocks would benefit from at least a comment if
> the assert doesn't work.
> 
> Otherwise this looks good.

Christoph Hellwig March 19, 2021, 6:35 a.m. UTC | #3
On Thu, Mar 18, 2021 at 11:05:34PM -0700, Darrick J. Wong wrote:
> xfs_inactive doesn't take the iolock because (evidently) at some point
> there were lockdep complaints about taking it in reclaim context.  By
> the time the inode reaches inactivation context, there can't be any
> other users of it anyway -- the last caller dropped its reference, we
> tore down the VFS inode, and anyone who wants to resuscitate the inode
> will wait in xfs_iget for us to finish.

Yes.  What I meant is that if we can deduce that we are in inactive
somehow (probably using the VFS inode state) we can ASSERT that we
are either in inactive or hold the iolock.

Darrick J. Wong March 19, 2021, 4:59 p.m. UTC | #4
On Fri, Mar 19, 2021 at 06:35:37AM +0000, Christoph Hellwig wrote:
> On Thu, Mar 18, 2021 at 11:05:34PM -0700, Darrick J. Wong wrote:
> > xfs_inactive doesn't take the iolock because (evidently) at some point
> > there were lockdep complaints about taking it in reclaim context.  By
> > the time the inode reaches inactivation context, there can't be any
> > other users of it anyway -- the last caller dropped its reference, we
> > tore down the VFS inode, and anyone who wants to resuscitate the inode
> > will wait in xfs_iget for us to finish.
> 
> Yes.  What I meant is that if we can deduce that we are in inactive
> somehow (probably using the VFS inode state) we can ASSERT that we
> are either in inactive or hold the iolock.

Yeah, I think we can do:

	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL) ||
	       (VFS_I(ip)->i_state & I_FREEING));

--D

Christoph Hellwig March 23, 2021, 6:32 p.m. UTC | #5
On Fri, Mar 19, 2021 at 09:59:24AM -0700, Darrick J. Wong wrote:
> > Yes.  What I meant is that if we can deduce that we are in inactive
> > somehow (probably using the VFS inode state) we can ASSERT that we
> > are either in inactive or hold the iolock.
> 
> Yeah, I think we can do:
> 
> 	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL) ||
> 	       (VFS_I(ip)->i_state & I_FREEING));

Yes, that looks sensible.
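For readers following along, the condition agreed on in this thread can be modelled in plain userspace C. This is only a sketch: struct model_inode, MODEL_I_FREEING and the model_* functions are hypothetical stand-ins for xfs_isilocked(), VFS_I(ip)->i_state and I_FREEING, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the VFS I_FREEING state bit. */
#define MODEL_I_FREEING		(1u << 0)

/* Hypothetical, heavily reduced model of an inode. */
struct model_inode {
	bool		iolock_excl_held; /* models xfs_isilocked(ip, XFS_IOLOCK_EXCL) */
	unsigned int	i_state;	  /* models VFS_I(ip)->i_state */
};

/*
 * The caller context is acceptable if the exclusive iolock is held, or
 * if the inode is already being torn down and so has no other users.
 */
static bool model_eofblocks_ctx_ok(const struct model_inode *ip)
{
	return ip->iolock_excl_held || (ip->i_state & MODEL_I_FREEING);
}

/* Models the proposed ASSERT(); aborts when neither condition holds. */
static void model_check_eofblocks_ctx(const struct model_inode *ip)
{
	assert(model_eofblocks_ctx_ok(ip));
}
```

An inode with the iolock held passes the check, an inode with only MODEL_I_FREEING set passes, and an inode with neither trips the assertion.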

Patch

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index e7d68318e6a5..d4ceba5370c7 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -597,8 +597,17 @@  xfs_bmap_punch_delalloc_range(
  * regular files that are marked preallocated or append-only.
  */
 bool
-xfs_can_free_eofblocks(struct xfs_inode *ip, bool force)
+xfs_can_free_eofblocks(
+	struct xfs_inode	*ip,
+	bool			force)
 {
+	struct xfs_bmbt_irec	imap;
+	struct xfs_mount	*mp = ip->i_mount;
+	xfs_fileoff_t		end_fsb;
+	xfs_fileoff_t		last_fsb;
+	int			nimaps = 1;
+	int			error;
+
 	/* prealloc/delalloc exists only on regular files */
 	if (!S_ISREG(VFS_I(ip)->i_mode))
 		return false;
@@ -624,91 +633,90 @@  xfs_can_free_eofblocks(struct xfs_inode *ip, bool force)
 		if (!force || ip->i_delayed_blks == 0)
 			return false;
 
-	return true;
-}
-
-/*
- * This is called to free any blocks beyond eof. The caller must hold
- * IOLOCK_EXCL unless we are in the inode reclaim path and have the only
- * reference to the inode.
- */
-int
-xfs_free_eofblocks(
-	struct xfs_inode	*ip)
-{
-	struct xfs_trans	*tp;
-	int			error;
-	xfs_fileoff_t		end_fsb;
-	xfs_fileoff_t		last_fsb;
-	xfs_filblks_t		map_len;
-	int			nimaps;
-	struct xfs_bmbt_irec	imap;
-	struct xfs_mount	*mp = ip->i_mount;
-
 	/*
-	 * Figure out if there are any blocks beyond the end
-	 * of the file.  If not, then there is nothing to do.
+	 * Do not try to free post-EOF blocks if EOF is beyond the end of the
+	 * range supported by the page cache, because the truncation will loop
+	 * forever.
 	 */
 	end_fsb = XFS_B_TO_FSB(mp, (xfs_ufsize_t)XFS_ISIZE(ip));
 	last_fsb = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
 	if (last_fsb <= end_fsb)
-		return 0;
-	map_len = last_fsb - end_fsb;
+		return false;
 
-	nimaps = 1;
+	/*
+	 * Look up the mapping for the first block past EOF.  If we can't find
+	 * it, there's nothing to free.
+	 */
 	xfs_ilock(ip, XFS_ILOCK_SHARED);
-	error = xfs_bmapi_read(ip, end_fsb, map_len, &imap, &nimaps, 0);
+	error = xfs_bmapi_read(ip, end_fsb, last_fsb - end_fsb, &imap, &nimaps,
+			0);
 	xfs_iunlock(ip, XFS_ILOCK_SHARED);
+	if (error || nimaps == 0)
+		return false;
 
 	/*
-	 * If there are blocks after the end of file, truncate the file to its
-	 * current size to free them up.
+	 * If there's a real mapping there or there are delayed allocation
+	 * reservations, then we have post-EOF blocks to try to free.
 	 */
-	if (!error && (nimaps != 0) &&
-	    (imap.br_startblock != HOLESTARTBLOCK ||
-	     ip->i_delayed_blks)) {
-		/*
-		 * Attach the dquots to the inode up front.
-		 */
-		error = xfs_qm_dqattach(ip);
-		if (error)
-			return error;
+	return imap.br_startblock != HOLESTARTBLOCK || ip->i_delayed_blks;
+}
 
-		/* wait on dio to ensure i_size has settled */
-		inode_dio_wait(VFS_I(ip));
+/*
+ * This is called to free any blocks beyond eof. The caller must hold
+ * IOLOCK_EXCL unless we are in the inode reclaim path and have the only
+ * reference to the inode.
+ */
+int
+xfs_free_eofblocks(
+	struct xfs_inode	*ip)
+{
+	struct xfs_trans	*tp;
+	struct xfs_mount	*mp = ip->i_mount;
+	int			error;
 
-		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0,
-				&tp);
-		if (error) {
-			ASSERT(XFS_FORCED_SHUTDOWN(mp));
-			return error;
-		}
+	/* Attach the dquots to the inode up front. */
+	error = xfs_qm_dqattach(ip);
+	if (error)
+		return error;
 
-		xfs_ilock(ip, XFS_ILOCK_EXCL);
-		xfs_trans_ijoin(tp, ip, 0);
+	/* Wait on dio to ensure i_size has settled. */
+	inode_dio_wait(VFS_I(ip));
 
-		/*
-		 * Do not update the on-disk file size.  If we update the
-		 * on-disk file size and then the system crashes before the
-		 * contents of the file are flushed to disk then the files
-		 * may be full of holes (ie NULL files bug).
-		 */
-		error = xfs_itruncate_extents_flags(&tp, ip, XFS_DATA_FORK,
-					XFS_ISIZE(ip), XFS_BMAPI_NODISCARD);
-		if (error) {
-			/*
-			 * If we get an error at this point we simply don't
-			 * bother truncating the file.
-			 */
-			xfs_trans_cancel(tp);
-		} else {
-			error = xfs_trans_commit(tp);
-			if (!error)
-				xfs_inode_clear_eofblocks_tag(ip);
-		}
-
-		xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp);
+	if (error) {
+		ASSERT(XFS_FORCED_SHUTDOWN(mp));
+		return error;
 	}
+
+	xfs_ilock(ip, XFS_ILOCK_EXCL);
+	xfs_trans_ijoin(tp, ip, 0);
+
+	/*
+	 * Do not update the on-disk file size.  If we update the on-disk file
+	 * size and then the system crashes before the contents of the file are
+	 * flushed to disk then the files may be full of holes (ie NULL files
+	 * bug).
+	 */
+	error = xfs_itruncate_extents_flags(&tp, ip, XFS_DATA_FORK,
+				XFS_ISIZE(ip), XFS_BMAPI_NODISCARD);
+	if (error)
+		goto err_cancel;
+
+	error = xfs_trans_commit(tp);
+	if (error)
+		goto out_unlock;
+
+	xfs_inode_clear_eofblocks_tag(ip);
+	goto out_unlock;
+
+err_cancel:
+	/*
+	 * If we get an error at this point we simply don't
+	 * bother truncating the file.
+	 */
+	xfs_trans_cancel(tp);
+out_unlock:
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return error;
 }
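The decision that the patch above moves into xfs_can_free_eofblocks can be modelled roughly in userspace C as follows. This is a sketch under stated assumptions: the model_* names and MODEL_HOLESTARTBLOCK are hypothetical, the lookup result is passed in rather than obtained from xfs_bmapi_read, and the earlier checks in the function (regular file, prealloc/append-only flags, and so on) are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel standing in for HOLESTARTBLOCK. */
#define MODEL_HOLESTARTBLOCK	(-2LL)

/* Hypothetical summary of what the mapping lookup returned. */
struct model_lookup {
	int	error;		/* nonzero if the lookup failed */
	int	nimaps;		/* number of mappings returned */
	int64_t	br_startblock;	/* start block of the first mapping */
};

/*
 * Models the tail of the reworked predicate: true only if there is
 * something past EOF worth trying to free.
 */
static bool model_has_post_eof_blocks(uint64_t end_fsb, uint64_t last_fsb,
				      const struct model_lookup *res,
				      uint64_t delayed_blks)
{
	/* EOF is at or beyond the supported range: do not truncate. */
	if (last_fsb <= end_fsb)
		return false;

	/* Lookup failed or found no mapping: nothing to free. */
	if (res->error || res->nimaps == 0)
		return false;

	/* A real mapping or delalloc reservations mean post-EOF blocks. */
	return res->br_startblock != MODEL_HOLESTARTBLOCK || delayed_blks != 0;
}
```

With this shape the caller gets a plain yes/no answer, so xfs_free_eofblocks in the patch only has to do the truncation work.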