xfs: truncate should remove all blocks, not just to the end of the page cache

Message ID 20191222163630.GS7489@magnolia (mailing list archive)
State Superseded

Commit Message

Darrick J. Wong Dec. 22, 2019, 4:36 p.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

xfs_itruncate_extents_flags() is supposed to unmap every block in a file
from EOF onwards.  Oddly, it uses s_maxbytes as the upper limit to the
bunmapi range, even though s_maxbytes reflects the highest offset the
pagecache can support, not the highest offset that XFS supports.

The result of this confusion is that if you create a 20T file on a
64-bit machine, mount the filesystem on a 32-bit machine, and remove the
file, we leak everything above 16T.  Fix this by capping the bunmapi
request at the maximum possible block offset, not s_maxbytes.

Fixes: 32972383ca462 ("xfs: make largest supported offset less shouty")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_inode.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Christoph Hellwig Dec. 24, 2019, 8:21 a.m. UTC | #1
On Sun, Dec 22, 2019 at 08:36:30AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> xfs_itruncate_extents_flags() is supposed to unmap every block in a file
> from EOF onwards.  Oddly, it uses s_maxbytes as the upper limit to the
> bunmapi range, even though s_maxbytes reflects the highest offset the
> pagecache can support, not the highest offset that XFS supports.
> 
> The result of this confusion is that if you create a 20T file on a
> 64-bit machine, mount the filesystem on a 32-bit machine, and remove the
> file, we leak everything above 16T.  Fix this by capping the bunmapi
> request at the maximum possible block offset, not s_maxbytes.
> 
> Fixes: 32972383ca462 ("xfs: make largest supported offset less shouty")

Why would that fix that commit?  The commit just changed how to derive
the value, but not the value itself.

> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 401da197f012..eaa85d5933cb 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -1544,9 +1544,12 @@ xfs_itruncate_extents_flags(
>  	 * possible file size.  If the first block to be removed is
>  	 * beyond the maximum file size (ie it is the same as last_block),
>  	 * then there is nothing to do.
> +	 *
> +	 * We have to free all the blocks to the bmbt maximum offset, even if
> +	 * the page cache can't scale that far.
>  	 */
>  	first_unmap_block = XFS_B_TO_FSB(mp, (xfs_ufsize_t)new_size);
> -	last_block = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
> +	last_block = (1ULL << BMBT_STARTOFF_BITLEN) - 1;
>  	if (first_unmap_block == last_block)
>  		return 0;

That check is now never true.  I think that whole function wants some
attention instead.  Kill that whole last_block calculation, switch to
__xfs_bunmapi and pass ULLONG_MAX for the rlen input and just exit the
loop once rlen is 0.

Darrick J. Wong Dec. 24, 2019, 4:30 p.m. UTC | #2
On Tue, Dec 24, 2019 at 12:21:27AM -0800, Christoph Hellwig wrote:
> On Sun, Dec 22, 2019 at 08:36:30AM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > xfs_itruncate_extents_flags() is supposed to unmap every block in a file
> > from EOF onwards.  Oddly, it uses s_maxbytes as the upper limit to the
> > bunmapi range, even though s_maxbytes reflects the highest offset the
> > pagecache can support, not the highest offset that XFS supports.
> > 
> > The result of this confusion is that if you create a 20T file on a
> > 64-bit machine, mount the filesystem on a 32-bit machine, and remove the
> > file, we leak everything above 16T.  Fix this by capping the bunmapi
> > request at the maximum possible block offset, not s_maxbytes.
> > 
> > Fixes: 32972383ca462 ("xfs: make largest supported offset less shouty")
> 
> Why would that fix that commit?  The commit just changed how to derive
> the value, but not the value itself.

I'm not sure what to put for a fixes tag when the code in question is
from the bitkeeper era.

> > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > index 401da197f012..eaa85d5933cb 100644
> > --- a/fs/xfs/xfs_inode.c
> > +++ b/fs/xfs/xfs_inode.c
> > @@ -1544,9 +1544,12 @@ xfs_itruncate_extents_flags(
> >  	 * possible file size.  If the first block to be removed is
> >  	 * beyond the maximum file size (ie it is the same as last_block),
> >  	 * then there is nothing to do.
> > +	 *
> > +	 * We have to free all the blocks to the bmbt maximum offset, even if
> > +	 * the page cache can't scale that far.
> >  	 */
> >  	first_unmap_block = XFS_B_TO_FSB(mp, (xfs_ufsize_t)new_size);
> > -	last_block = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
> > +	last_block = (1ULL << BMBT_STARTOFF_BITLEN) - 1;
> >  	if (first_unmap_block == last_block)
> >  		return 0;
> 
> That check is now never true.  I think that whole function wants some
> attention instead.  Kill that whole last_block calculation, switch to
> __xfs_bunmapi and pass ULLONG_MAX for the rlen input and just exit the
> loop once rlen is 0.

I'll give that a try.

--D
Patch

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 401da197f012..eaa85d5933cb 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1544,9 +1544,12 @@  xfs_itruncate_extents_flags(
 	 * possible file size.  If the first block to be removed is
 	 * beyond the maximum file size (ie it is the same as last_block),
 	 * then there is nothing to do.
+	 *
+	 * We have to free all the blocks to the bmbt maximum offset, even if
+	 * the page cache can't scale that far.
 	 */
 	first_unmap_block = XFS_B_TO_FSB(mp, (xfs_ufsize_t)new_size);
-	last_block = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
+	last_block = (1ULL << BMBT_STARTOFF_BITLEN) - 1;
 	if (first_unmap_block == last_block)
 		return 0;