[2/2,V3] xfs: verify size-vs-format for symlinks & dirs
diff mbox series

Message ID 1e55b111-eae4-b0cf-0221-c96eb3f17b77@redhat.com
State New
Headers show
Series
  • xfs: validate size vs format
Related show

Commit Message

Eric Sandeen Sept. 25, 2018, 3 a.m. UTC
Today, xfs_ifork_verify_data() will simply skip verification if the inode
claims to be in non-local format.  However, nothing catches the case where
the size for the format is too small to be non-local.  xfs_repair tests
for this mismatch in process_check_inode_sizes(), so do the same in this
verifier.

Reported-by: Xu, Wen <wen.xu@gatech.edu>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200925
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
---

V2: restructure code & tests per Dave's suggestion on the V1 patch.
V3: rewrite dave's comments per brian's suggestions

Comments

Dave Chinner Sept. 30, 2018, 3:25 a.m. UTC | #1
On Mon, Sep 24, 2018 at 10:00:39PM -0500, Eric Sandeen wrote:
> Today, xfs_ifork_verify_data() will simply skip verification if the inode
> claims to be in non-local format.  However, nothing catches the case where
> the size for the format is too small to be non-local.  xfs_repair tests
> for this mismatch in process_check_inode_sizes(), so do the same in this
> verifier.
> 
> Reported-by: Xu, Wen <wen.xu@gatech.edu>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200925
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> Reviewed-by: Brian Foster <bfoster@redhat.com>
> ---
> 
> V2: restructure code & tests per Dave's suggestion on the V1 patch.
> V3: rewrite dave's comments per brian's suggestions
> 
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
> index f9acf1d436f6..d1a58e7a872f 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.c
> +++ b/fs/xfs/libxfs/xfs_inode_fork.c
> @@ -704,12 +704,33 @@ xfs_ifork_verify_data(
>  	struct xfs_inode	*ip,
>  	struct xfs_ifork_ops	*ops)
>  {
> -	/* Non-local data fork, we're done. */
> -	if (ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
> +	struct xfs_mount	*mp = ip->i_mount;
> +	int			mode = VFS_I(ip)->i_mode;
> +
> +	/*
> +	 * Verify non-local format forks have a valid size. Symlinks must have
> +	 * outgrown the data fork size. The same goes for non-local dirs, but
> +	 * dirs grow at dirblock granularity. Perform a slightly stronger check
> +	 * and require the dir is at least one dirblock in size.
> +	 */
> +	if (ip->i_d.di_format != XFS_DINODE_FMT_LOCAL) {
> +		switch (mode & S_IFMT) {
> +		case S_IFDIR:
> +			if (ip->i_d.di_size < mp->m_dir_geo->blksize)
> +				return __this_address;
> +			break;
> +		case S_IFLNK:
> +			if (ip->i_d.di_size <= XFS_IFORK_DSIZE(ip))
> +				return __this_address;

Just had this fire in inode writeback from generic/390. I'm going to
drop it for the moment, because I'm not sure what the correct fix is
yet.  Consider this:

	create symlink XFS_LITINO bytes in length
	  fits in inode, so put inline. size <= IFORK_DSIZE
	[....]
	add attr to symlink
	  creates attr fork
	    inline data fork too large, size > new IFORK_DSIZE
	      xfs_symlink_local_to_remote()
		data fork goes to extent format, size remains unchanged
	[....]
	remove last attrs from inode
	  remove attr fork
	    IFORK_DSIZE grows again, now size = IFORK_DSIZE again
	    data fork remains in extent format
	[....]
	inode writeback
	  size = IFORK_DSIZE, extent format
	    xfs_ifork_verify_data verifier fails.


With this process, I think a symlink can be out of line even if it
is less than the size of the data fork. I think this can happen even
for symlinks much smaller than XFS_LITINO, because the attribute
fork can grow into free space in the literal area and push local
data larger than XFS_BMDR_SPACE_CALC(MINDBTPTRS) bytes to extent
format.

#define MINDBTPTRS 3

#define XFS_BMDR_SPACE_CALC(nrecs) \
        (int)(sizeof(xfs_bmdr_block_t) + \
	       ((nrecs) * (sizeof(xfs_bmbt_key_t) + sizeof(xfs_bmbt_ptr_t)))) 

= 4 + 3 * (8 + 8)
= 52 bytes
= 56 bytes when rounded up to 8 byte offset

So, yeah, I think that this check needs to be different because I
think we could have symlinks as short at 56 bytes in extent format,
even when the inode has no attribute fork...

Cheers,

Dave.
Eric Sandeen Sept. 30, 2018, 5:06 a.m. UTC | #2
On 9/29/18 10:25 PM, Dave Chinner wrote:
> On Mon, Sep 24, 2018 at 10:00:39PM -0500, Eric Sandeen wrote:
>> Today, xfs_ifork_verify_data() will simply skip verification if the inode
>> claims to be in non-local format.  However, nothing catches the case where
>> the size for the format is too small to be non-local.  xfs_repair tests
>> for this mismatch in process_check_inode_sizes(), so do the same in this
>> verifier.
>>
>> Reported-by: Xu, Wen <wen.xu@gatech.edu>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200925
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>> Reviewed-by: Brian Foster <bfoster@redhat.com>
>> ---
>>
>> V2: restructure code & tests per Dave's suggestion on the V1 patch.
>> V3: rewrite dave's comments per brian's suggestions
>>
>> diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
>> index f9acf1d436f6..d1a58e7a872f 100644
>> --- a/fs/xfs/libxfs/xfs_inode_fork.c
>> +++ b/fs/xfs/libxfs/xfs_inode_fork.c
>> @@ -704,12 +704,33 @@ xfs_ifork_verify_data(
>>  	struct xfs_inode	*ip,
>>  	struct xfs_ifork_ops	*ops)
>>  {
>> -	/* Non-local data fork, we're done. */
>> -	if (ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
>> +	struct xfs_mount	*mp = ip->i_mount;
>> +	int			mode = VFS_I(ip)->i_mode;
>> +
>> +	/*
>> +	 * Verify non-local format forks have a valid size. Symlinks must have
>> +	 * outgrown the data fork size. The same goes for non-local dirs, but
>> +	 * dirs grow at dirblock granularity. Perform a slightly stronger check
>> +	 * and require the dir is at least one dirblock in size.
>> +	 */
>> +	if (ip->i_d.di_format != XFS_DINODE_FMT_LOCAL) {
>> +		switch (mode & S_IFMT) {
>> +		case S_IFDIR:
>> +			if (ip->i_d.di_size < mp->m_dir_geo->blksize)
>> +				return __this_address;
>> +			break;
>> +		case S_IFLNK:
>> +			if (ip->i_d.di_size <= XFS_IFORK_DSIZE(ip))
>> +				return __this_address;
> 
> Just had this fire in inode writeback from generic/390. I'm going to
> drop it for the moment, because I'm not sure what the correct fix is
> yet.  Consider this:
> 
> 	create symlink XFS_LITINO bytes in length
> 	  fits in inode, so put inline. size <= IFORK_DSIZE
> 	[....]
> 	add attr to symlink
> 	  creates attr fork
> 	    inline data fork too large, size > new IFORK_DSIZE
> 	      xfs_symlink_local_to_remote()
> 		data fork goes to extent format, size remains unchanged
> 	[....]
> 	remove last attrs from inode
> 	  remove attr fork
> 	    IFORK_DSIZE grows again, now size = IFORK_DSIZE again
> 	    data fork remains in extent format
> 	[....]
> 	inode writeback
> 	  size = IFORK_DSIZE, extent format
> 	    xfs_ifork_verify_data verifier fails.
> 
> 
> With this process, I think a symlink can be out of line even if it
> is less than the size of the data fork. I think this can happen even
> for symlinks much smaller than XFS_LITINO, because the attribute
> fork can grow into free space in the literal area and push local
> data larger than XFS_BMDR_SPACE_CALC(MINDBTPTRS) bytes to extent
> format.
> 
> #define MINDBTPTRS 3
> 
> #define XFS_BMDR_SPACE_CALC(nrecs) \
>         (int)(sizeof(xfs_bmdr_block_t) + \
> 	       ((nrecs) * (sizeof(xfs_bmbt_key_t) + sizeof(xfs_bmbt_ptr_t)))) 
> 
> = 4 + 3 * (8 + 8)
> = 52 bytes
> = 56 bytes when rounded up to 8 byte offset
> 
> So, yeah, I think that this check needs to be different because I
> think we could have symlinks as short at 56 bytes in extent format,
> even when the inode has no attribute fork...

Hrmph.  And yet, xfs_repair:

static int
process_symlink_extlist(xfs_mount_t *mp, xfs_ino_t lino, xfs_dinode_t *dino)
{
        xfs_fileoff_t           expected_offset;
        xfs_bmbt_rec_t          *rp;
        xfs_bmbt_irec_t         irec;
        int                     numrecs;
        int                     i;
        int                     max_blocks;

        if (be64_to_cpu(dino->di_size) <= XFS_DFORK_DSIZE(dino, mp)) {
                if (dino->di_format == XFS_DINODE_FMT_LOCAL)
                        return 0;
                do_warn(
_("mismatch between format (%d) and size (%" PRId64 ") in symlink ino %" PRIu64 "\n"),
                        dino->di_format,
                        (int64_t)be64_to_cpu(dino->di_size), lino);
                return 1;
        }

seems to clearly call "non-local symlink with size < XFS_DFORK_DSIZE" corruption.
What gives?

-Eric
Dave Chinner Sept. 30, 2018, 6:05 a.m. UTC | #3
On Sun, Sep 30, 2018 at 12:06:44AM -0500, Eric Sandeen wrote:
> On 9/29/18 10:25 PM, Dave Chinner wrote:
> > On Mon, Sep 24, 2018 at 10:00:39PM -0500, Eric Sandeen wrote:
> >> Today, xfs_ifork_verify_data() will simply skip verification if the inode
> >> claims to be in non-local format.  However, nothing catches the case where
> >> the size for the format is too small to be non-local.  xfs_repair tests
> >> for this mismatch in process_check_inode_sizes(), so do the same in this
> >> verifier.
> >>
> >> Reported-by: Xu, Wen <wen.xu@gatech.edu>
> >> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200925
> >> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> >> Reviewed-by: Brian Foster <bfoster@redhat.com>
> >> ---
> >>
> >> V2: restructure code & tests per Dave's suggestion on the V1 patch.
> >> V3: rewrite dave's comments per brian's suggestions
> >>
> >> diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
> >> index f9acf1d436f6..d1a58e7a872f 100644
> >> --- a/fs/xfs/libxfs/xfs_inode_fork.c
> >> +++ b/fs/xfs/libxfs/xfs_inode_fork.c
> >> @@ -704,12 +704,33 @@ xfs_ifork_verify_data(
> >>  	struct xfs_inode	*ip,
> >>  	struct xfs_ifork_ops	*ops)
> >>  {
> >> -	/* Non-local data fork, we're done. */
> >> -	if (ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
> >> +	struct xfs_mount	*mp = ip->i_mount;
> >> +	int			mode = VFS_I(ip)->i_mode;
> >> +
> >> +	/*
> >> +	 * Verify non-local format forks have a valid size. Symlinks must have
> >> +	 * outgrown the data fork size. The same goes for non-local dirs, but
> >> +	 * dirs grow at dirblock granularity. Perform a slightly stronger check
> >> +	 * and require the dir is at least one dirblock in size.
> >> +	 */
> >> +	if (ip->i_d.di_format != XFS_DINODE_FMT_LOCAL) {
> >> +		switch (mode & S_IFMT) {
> >> +		case S_IFDIR:
> >> +			if (ip->i_d.di_size < mp->m_dir_geo->blksize)
> >> +				return __this_address;
> >> +			break;
> >> +		case S_IFLNK:
> >> +			if (ip->i_d.di_size <= XFS_IFORK_DSIZE(ip))
> >> +				return __this_address;
> > 
> > Just had this fire in inode writeback from generic/390. I'm going to
> > drop it for the moment, because I'm not sure what the correct fix is
> > yet.  Consider this:
> > 
> > 	create symlink XFS_LITINO bytes in length
> > 	  fits in inode, so put inline. size <= IFORK_DSIZE
> > 	[....]
> > 	add attr to symlink
> > 	  creates attr fork
> > 	    inline data fork too large, size > new IFORK_DSIZE
> > 	      xfs_symlink_local_to_remote()
> > 		data fork goes to extent format, size remains unchanged
> > 	[....]
> > 	remove last attrs from inode
> > 	  remove attr fork
> > 	    IFORK_DSIZE grows again, now size = IFORK_DSIZE again
> > 	    data fork remains in extent format
> > 	[....]
> > 	inode writeback
> > 	  size = IFORK_DSIZE, extent format
> > 	    xfs_ifork_verify_data verifier fails.
> > 
> > 
> > With this process, I think a symlink can be out of line even if it
> > is less than the size of the data fork. I think this can happen even
> > for symlinks much smaller than XFS_LITINO, because the attribute
> > fork can grow into free space in the literal area and push local
> > data larger than XFS_BMDR_SPACE_CALC(MINDBTPTRS) bytes to extent
> > format.
> > 
> > #define MINDBTPTRS 3
> > 
> > #define XFS_BMDR_SPACE_CALC(nrecs) \
> >         (int)(sizeof(xfs_bmdr_block_t) + \
> > 	       ((nrecs) * (sizeof(xfs_bmbt_key_t) + sizeof(xfs_bmbt_ptr_t)))) 
> > 
> > = 4 + 3 * (8 + 8)
> > = 52 bytes
> > = 56 bytes when rounded up to 8 byte offset
> > 
> > So, yeah, I think that this check needs to be different because I
> > think we could have symlinks as short at 56 bytes in extent format,
> > even when the inode has no attribute fork...
> 
> Hrmph.  And yet, xfs_repair:
> 
> static int
> process_symlink_extlist(xfs_mount_t *mp, xfs_ino_t lino, xfs_dinode_t *dino)
> {
>         xfs_fileoff_t           expected_offset;
>         xfs_bmbt_rec_t          *rp;
>         xfs_bmbt_irec_t         irec;
>         int                     numrecs;
>         int                     i;
>         int                     max_blocks;
> 
>         if (be64_to_cpu(dino->di_size) <= XFS_DFORK_DSIZE(dino, mp)) {
>                 if (dino->di_format == XFS_DINODE_FMT_LOCAL)
>                         return 0;
>                 do_warn(
> _("mismatch between format (%d) and size (%" PRId64 ") in symlink ino %" PRIu64 "\n"),
>                         dino->di_format,
>                         (int64_t)be64_to_cpu(dino->di_size), lino);
>                 return 1;
>         }
> 
> seems to clearly call "non-local symlink with size < XFS_DFORK_DSIZE" corruption.
> What gives?

Seems to me like another cases that the verifiers have uncovered
another situation that even repair doesn't handle correctly. (Which
is why I like to look at these things from first principles, rather
than just from the "what does reapir do" POV?). It's like to be rare
because who removes all the attributes on a file apart from when
unlinking the inode?

xfs_attr_fork_remove() has set the di_forkoff back to zero
when the last attribute is removed from the inode for a long time
(I stopped looking when I got to ~2008...), so this isn't a new
situation. i.e. trying to define what are valid values has
demonstrated that there are more cases we need to take into account
than anyone has realised.

Cheers,

Dave.
Eric Sandeen Sept. 30, 2018, 5:54 p.m. UTC | #4
On 9/30/18 1:05 AM, Dave Chinner wrote:
> On Sun, Sep 30, 2018 at 12:06:44AM -0500, Eric Sandeen wrote:
>> On 9/29/18 10:25 PM, Dave Chinner wrote:
>>> On Mon, Sep 24, 2018 at 10:00:39PM -0500, Eric Sandeen wrote:
...

>>> So, yeah, I think that this check needs to be different because I
>>> think we could have symlinks as short at 56 bytes in extent format,
>>> even when the inode has no attribute fork...
>>
>> Hrmph.  And yet, xfs_repair:
>>
>> static int
>> process_symlink_extlist(xfs_mount_t *mp, xfs_ino_t lino, xfs_dinode_t *dino)
>> {
>>         xfs_fileoff_t           expected_offset;
>>         xfs_bmbt_rec_t          *rp;
>>         xfs_bmbt_irec_t         irec;
>>         int                     numrecs;
>>         int                     i;
>>         int                     max_blocks;
>>
>>         if (be64_to_cpu(dino->di_size) <= XFS_DFORK_DSIZE(dino, mp)) {
>>                 if (dino->di_format == XFS_DINODE_FMT_LOCAL)
>>                         return 0;
>>                 do_warn(
>> _("mismatch between format (%d) and size (%" PRId64 ") in symlink ino %" PRIu64 "\n"),
>>                         dino->di_format,
>>                         (int64_t)be64_to_cpu(dino->di_size), lino);
>>                 return 1;
>>         }
>>
>> seems to clearly call "non-local symlink with size < XFS_DFORK_DSIZE" corruption.
>> What gives?
> 
> Seems to me like another cases that the verifiers have uncovered
> another situation that even repair doesn't handle correctly. (Which
> is why I like to look at these things from first principles, rather
> than just from the "what does reapir do" POV?). It's like to be rare
> because who removes all the attributes on a file apart from when
> unlinking the inode?

Sure, I get it that repair is not the canonical truth for format, I was
just surprised that you hit it fairly quickly with the kernel verifier,
but apparently didn't see it prior to this patch in a post-test repair run.
In fact I don't think I've ever seen this check fail in repair.  So it
just seemed odd.  It might be interesting to see if we can provoke it in
xfs_repair? 

> xfs_attr_fork_remove() has set the di_forkoff back to zero
> when the last attribute is removed from the inode for a long time
> (I stopped looking when I got to ~2008...), so this isn't a new
> situation. i.e. trying to define what are valid values has
> demonstrated that there are more cases we need to take into account
> than anyone has realised.

Yeah, I have a vague recollection of other bugs related to the fork
offset that led to the sense of nearly nondeterministic behavior for
when things move in and out of local.

-Eric

Patch
diff mbox series

diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index f9acf1d436f6..d1a58e7a872f 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -704,12 +704,33 @@  xfs_ifork_verify_data(
 	struct xfs_inode	*ip,
 	struct xfs_ifork_ops	*ops)
 {
-	/* Non-local data fork, we're done. */
-	if (ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
+	struct xfs_mount	*mp = ip->i_mount;
+	int			mode = VFS_I(ip)->i_mode;
+
+	/*
+	 * Verify non-local format forks have a valid size. Symlinks must have
+	 * outgrown the data fork size. The same goes for non-local dirs, but
+	 * dirs grow at dirblock granularity. Perform a slightly stronger check
+	 * and require the dir is at least one dirblock in size.
+	 */
+	if (ip->i_d.di_format != XFS_DINODE_FMT_LOCAL) {
+		switch (mode & S_IFMT) {
+		case S_IFDIR:
+			if (ip->i_d.di_size < mp->m_dir_geo->blksize)
+				return __this_address;
+			break;
+		case S_IFLNK:
+			if (ip->i_d.di_size <= XFS_IFORK_DSIZE(ip))
+				return __this_address;
+			break;
+		default:
+			break;
+		}
 		return NULL;
+	}
 
 	/* Check the inline data fork if there is one. */
-	switch (VFS_I(ip)->i_mode & S_IFMT) {
+	switch (mode & S_IFMT) {
 	case S_IFDIR:
 		return ops->verify_dir(ip);
 	case S_IFLNK: