diff mbox series

xfs: don't fail verifier on empty attr3 leaf block

Message ID 20200513145343.45855-1-bfoster@redhat.com (mailing list archive)
State Superseded
Headers show
Series xfs: don't fail verifier on empty attr3 leaf block | expand

Commit Message

Brian Foster May 13, 2020, 2:53 p.m. UTC
The attr fork can transition from shortform to leaf format while
empty if the first xattr doesn't fit in shortform. While this empty
leaf block state is intended to be transient, it is technically not
due to the transactional implementation of the xattr set operation.

We historically have a couple of bandaids to work around this
problem. The first is to hold the buffer after the format conversion
to prevent premature writeback of the empty leaf buffer and the
second is to bypass the xattr count check in the verifier during
recovery. The latter assumes that the xattr set is also in the log
and will be recovered into the buffer soon after the empty leaf
buffer is reconstructed. This is not guaranteed, however.

If the filesystem crashes after the format conversion but before the
xattr set that induced it, only the format conversion may exist in
the log. When recovered, this creates a latent corrupted state on
the inode as any subsequent attempts to read the buffer fail due to
verifier failure. This includes further attempts to set xattrs on
the inode or attempts to destroy the attr fork, which prevents the
inode from ever being removed from the unlinked list.

To avoid this condition, accept that an empty attr leaf block is a
valid state and remove the count check from the verifier. This means
that on rare occasions an attr fork might exist in an unexpected
state, but is otherwise consistent and functional. Note that we
retain the logic to avoid racing with metadata writeback to reduce
the window where this can occur.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---

v1:
- Remove the verifier check instead of warn.
rfc: https://lore.kernel.org/linux-xfs/20200511185016.33684-1-bfoster@redhat.com/

 fs/xfs/libxfs/xfs_attr_leaf.c | 8 --------
 1 file changed, 8 deletions(-)

Comments

Darrick J. Wong May 14, 2020, 8:52 p.m. UTC | #1
On Wed, May 13, 2020 at 10:53:43AM -0400, Brian Foster wrote:
> The attr fork can transition from shortform to leaf format while
> empty if the first xattr doesn't fit in shortform. While this empty
> leaf block state is intended to be transient, it is technically not
> due to the transactional implementation of the xattr set operation.
> 
> We historically have a couple of bandaids to work around this
> problem. The first is to hold the buffer after the format conversion
> to prevent premature writeback of the empty leaf buffer and the
> second is to bypass the xattr count check in the verifier during
> recovery. The latter assumes that the xattr set is also in the log
> and will be recovered into the buffer soon after the empty leaf
> buffer is reconstructed. This is not guaranteed, however.
> 
> If the filesystem crashes after the format conversion but before the
> xattr set that induced it, only the format conversion may exist in
> the log. When recovered, this creates a latent corrupted state on
> the inode as any subsequent attempts to read the buffer fail due to
> verifier failure. This includes further attempts to set xattrs on
> the inode or attempts to destroy the attr fork, which prevents the
> inode from ever being removed from the unlinked list.
> 
> To avoid this condition, accept that an empty attr leaf block is a
> valid state and remove the count check from the verifier. This means
> that on rare occasions an attr fork might exist in an unexpected
> state, but is otherwise consistent and functional. Note that we
> retain the logic to avoid racing with metadata writeback to reduce
> the window where this can occur.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> 
> v1:
> - Remove the verifier check instead of warn.
> rfc: https://lore.kernel.org/linux-xfs/20200511185016.33684-1-bfoster@redhat.com/
> 
>  fs/xfs/libxfs/xfs_attr_leaf.c | 8 --------
>  1 file changed, 8 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
> index 863444e2dda7..6b94bb9de378 100644
> --- a/fs/xfs/libxfs/xfs_attr_leaf.c
> +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
> @@ -308,14 +308,6 @@ xfs_attr3_leaf_verify(
>  	if (fa)
>  		return fa;
>  
> -	/*
> -	 * In recovery there is a transient state where count == 0 is valid
> -	 * because we may have transitioned an empty shortform attr to a leaf
> -	 * if the attr didn't fit in shortform.

/me wonders if it would be useful for future spelunkers to retain some
sort of comment here that we once thought count==0 was bad but screwed
it up enough that we now allow it?

Moreso that future me/fuzzrobot won't come along having forgotten
everything and think "Oh, we need to validate hdr.count!" :P

--D

> -	 */
> -	if (!xfs_log_in_recovery(mp) && ichdr.count == 0)
> -		return __this_address;
> -
>  	/*
>  	 * firstused is the block offset of the first name info structure.
>  	 * Make sure it doesn't go off the block or crash into the header.
> -- 
> 2.21.1
>
Brian Foster May 15, 2020, noon UTC | #2
On Thu, May 14, 2020 at 01:52:10PM -0700, Darrick J. Wong wrote:
> On Wed, May 13, 2020 at 10:53:43AM -0400, Brian Foster wrote:
> > The attr fork can transition from shortform to leaf format while
> > empty if the first xattr doesn't fit in shortform. While this empty
> > leaf block state is intended to be transient, it is technically not
> > due to the transactional implementation of the xattr set operation.
> > 
> > We historically have a couple of bandaids to work around this
> > problem. The first is to hold the buffer after the format conversion
> > to prevent premature writeback of the empty leaf buffer and the
> > second is to bypass the xattr count check in the verifier during
> > recovery. The latter assumes that the xattr set is also in the log
> > and will be recovered into the buffer soon after the empty leaf
> > buffer is reconstructed. This is not guaranteed, however.
> > 
> > If the filesystem crashes after the format conversion but before the
> > xattr set that induced it, only the format conversion may exist in
> > the log. When recovered, this creates a latent corrupted state on
> > the inode as any subsequent attempts to read the buffer fail due to
> > verifier failure. This includes further attempts to set xattrs on
> > the inode or attempts to destroy the attr fork, which prevents the
> > inode from ever being removed from the unlinked list.
> > 
> > To avoid this condition, accept that an empty attr leaf block is a
> > valid state and remove the count check from the verifier. This means
> > that on rare occasions an attr fork might exist in an unexpected
> > state, but is otherwise consistent and functional. Note that we
> > retain the logic to avoid racing with metadata writeback to reduce
> > the window where this can occur.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> > 
> > v1:
> > - Remove the verifier check instead of warn.
> > rfc: https://lore.kernel.org/linux-xfs/20200511185016.33684-1-bfoster@redhat.com/
> > 
> >  fs/xfs/libxfs/xfs_attr_leaf.c | 8 --------
> >  1 file changed, 8 deletions(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
> > index 863444e2dda7..6b94bb9de378 100644
> > --- a/fs/xfs/libxfs/xfs_attr_leaf.c
> > +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
> > @@ -308,14 +308,6 @@ xfs_attr3_leaf_verify(
> >  	if (fa)
> >  		return fa;
> >  
> > -	/*
> > -	 * In recovery there is a transient state where count == 0 is valid
> > -	 * because we may have transitioned an empty shortform attr to a leaf
> > -	 * if the attr didn't fit in shortform.
> 
> /me wonders if it would be useful for future spelunkers to retain some
> sort of comment here that we once thought count==0 was bad but screwed
> it up enough that we now allow it?
> 
> Moreso that future me/fuzzrobot won't come along having forgotten
> everything and think "Oh, we need to validate hdr.count!" :P
> 

Fine by me. Something like the following perhaps?

"This verifier historically failed empty leaf buffers because we expect
the fork to be in another format. Empty attr fork format conversions are
possible during xattr set, however, and format conversion is not atomic
with the xattr set that triggers it. We cannot assume leaf blocks are
non-empty until that is addressed."

Brian

> --D
> 
> > -	 */
> > -	if (!xfs_log_in_recovery(mp) && ichdr.count == 0)
> > -		return __this_address;
> > -
> >  	/*
> >  	 * firstused is the block offset of the first name info structure.
> >  	 * Make sure it doesn't go off the block or crash into the header.
> > -- 
> > 2.21.1
> > 
>
Darrick J. Wong May 15, 2020, 3:56 p.m. UTC | #3
On Fri, May 15, 2020 at 08:00:28AM -0400, Brian Foster wrote:
> On Thu, May 14, 2020 at 01:52:10PM -0700, Darrick J. Wong wrote:
> > On Wed, May 13, 2020 at 10:53:43AM -0400, Brian Foster wrote:
> > > The attr fork can transition from shortform to leaf format while
> > > empty if the first xattr doesn't fit in shortform. While this empty
> > > leaf block state is intended to be transient, it is technically not
> > > due to the transactional implementation of the xattr set operation.
> > > 
> > > We historically have a couple of bandaids to work around this
> > > problem. The first is to hold the buffer after the format conversion
> > > to prevent premature writeback of the empty leaf buffer and the
> > > second is to bypass the xattr count check in the verifier during
> > > recovery. The latter assumes that the xattr set is also in the log
> > > and will be recovered into the buffer soon after the empty leaf
> > > buffer is reconstructed. This is not guaranteed, however.
> > > 
> > > If the filesystem crashes after the format conversion but before the
> > > xattr set that induced it, only the format conversion may exist in
> > > the log. When recovered, this creates a latent corrupted state on
> > > the inode as any subsequent attempts to read the buffer fail due to
> > > verifier failure. This includes further attempts to set xattrs on
> > > the inode or attempts to destroy the attr fork, which prevents the
> > > inode from ever being removed from the unlinked list.
> > > 
> > > To avoid this condition, accept that an empty attr leaf block is a
> > > valid state and remove the count check from the verifier. This means
> > > that on rare occasions an attr fork might exist in an unexpected
> > > state, but is otherwise consistent and functional. Note that we
> > > retain the logic to avoid racing with metadata writeback to reduce
> > > the window where this can occur.
> > > 
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > ---
> > > 
> > > v1:
> > > - Remove the verifier check instead of warn.
> > > rfc: https://lore.kernel.org/linux-xfs/20200511185016.33684-1-bfoster@redhat.com/
> > > 
> > >  fs/xfs/libxfs/xfs_attr_leaf.c | 8 --------
> > >  1 file changed, 8 deletions(-)
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
> > > index 863444e2dda7..6b94bb9de378 100644
> > > --- a/fs/xfs/libxfs/xfs_attr_leaf.c
> > > +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
> > > @@ -308,14 +308,6 @@ xfs_attr3_leaf_verify(
> > >  	if (fa)
> > >  		return fa;
> > >  
> > > -	/*
> > > -	 * In recovery there is a transient state where count == 0 is valid
> > > -	 * because we may have transitioned an empty shortform attr to a leaf
> > > -	 * if the attr didn't fit in shortform.
> > 
> > /me wonders if it would be useful for future spelunkers to retain some
> > sort of comment here that we once thought count==0 was bad but screwed
> > it up enough that we now allow it?
> > 
> > Moreso that future me/fuzzrobot won't come along having forgotten
> > everything and think "Oh, we need to validate hdr.count!" :P
> > 
> 
> Fine by me. Something like the following perhaps?
> 
> "This verifier historically failed empty leaf buffers because we expect
> the fork to be in another format. Empty attr fork format conversions are
> possible during xattr set, however, and format conversion is not atomic
> with the xattr set that triggers it. We cannot assume leaf blocks are
> non-empty until that is addressed."

Sounds good to me!

--D

> Brian
> 
> > --D
> > 
> > > -	 */
> > > -	if (!xfs_log_in_recovery(mp) && ichdr.count == 0)
> > > -		return __this_address;
> > > -
> > >  	/*
> > >  	 * firstused is the block offset of the first name info structure.
> > >  	 * Make sure it doesn't go off the block or crash into the header.
> > > -- 
> > > 2.21.1
> > > 
> > 
>
diff mbox series

Patch

diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index 863444e2dda7..6b94bb9de378 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -308,14 +308,6 @@  xfs_attr3_leaf_verify(
 	if (fa)
 		return fa;
 
-	/*
-	 * In recovery there is a transient state where count == 0 is valid
-	 * because we may have transitioned an empty shortform attr to a leaf
-	 * if the attr didn't fit in shortform.
-	 */
-	if (!xfs_log_in_recovery(mp) && ichdr.count == 0)
-		return __this_address;
-
 	/*
 	 * firstused is the block offset of the first name info structure.
 	 * Make sure it doesn't go off the block or crash into the header.