Message ID | 20170814081806.GA3924@debian.home (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Aug 14, 2017 at 05:18:10AM -0300, Ernesto A. Fernández wrote: > When changing a file's ACL mask, xfs_set_acl() will first set the group > bits of i_mode to the value of the mask, and only then set the actual > extended attribute representing the new ACL. > > If the second part fails (due to lack of space, for example) and the > file had no ACL attribute to begin with, the system will from now on > assume that the mask permission bits are actual group permission bits, > potentially granting access to the wrong users. > > To prevent this, perform the mode update and the ACL update as part of > the same transaction, and restore the original mode to the vfs inode in > case of failure. This requires a change in how extended attributes are > deleted: commit the transaction even if there was no attribute to > remove, to log the standard inode fields. > > Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> > --- > This issue is covered by generic/449 in xfstests. Several other filesystems > are affected by this, some of them have already applied patches: > - 397e434 ext4: preserve i_mode if __ext4_set_acl() fails > - f070e5a jfs: preserve i_mode if __jfs_set_acl() fails > - fcea8ae reiserfs: preserve i_mode if __reiserfs_set_acl() fails > - fe26569 ext2: preserve i_mode if ext2_set_acl() fails > The test won't exactly pass after this patch is applied. It will fail > differently, this time claiming that the filesystem is inconsistent. This I disagree that corrupting the filesystem is an acceptable strategy for dealing with insufficient space to handle setfacl. > probably has something to do with setting too many extended attributes, as > it goes away if you change the attribute size in the test from 1k to 64k. Hadn't you better look into why it does that and handle it? --D > > fs/xfs/libxfs/xfs_attr.c | 13 +++++++++++-- > fs/xfs/xfs_acl.c | 39 ++++++++++++++------------------------- > 2 files changed, 25 insertions(+), 27 deletions(-) > > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c > index de7b9bd..c3193e1 100644 > --- a/fs/xfs/libxfs/xfs_attr.c > +++ b/fs/xfs/libxfs/xfs_attr.c > @@ -395,6 +395,7 @@ xfs_attr_remove( > struct xfs_defer_ops dfops; > xfs_fsblock_t firstblock; > int error; > + int noattr = 0; > > XFS_STATS_INC(mp, xs_attr_remove); > > @@ -438,7 +439,12 @@ xfs_attr_remove( > xfs_trans_ijoin(args.trans, dp, 0); > > if (!xfs_inode_hasattr(dp)) { > - error = -ENOATTR; > + /* > + * Commit even if there is no attr to remove, because when > + * setting ACLs the caller may be changing the mode in the > + * same transaction. > + */ > + noattr = 1; > } else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) { > ASSERT(dp->i_afp->if_flags & XFS_IFINLINE); > error = xfs_attr_shortform_remove(&args); > @@ -458,7 +464,7 @@ xfs_attr_remove( > if (mp->m_flags & XFS_MOUNT_WSYNC) > xfs_trans_set_sync(args.trans); > > - if ((flags & ATTR_KERNOTIME) == 0) > + if ((flags & ATTR_KERNOTIME) == 0 && !noattr) > xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG); > > /* > @@ -468,6 +474,9 @@ xfs_attr_remove( > error = xfs_trans_commit(args.trans); > xfs_iunlock(dp, XFS_ILOCK_EXCL); > > + if (!error && noattr) > + error = -ENOATTR; > + > return error; > > out: > diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c > index 7034e17..1fe4363 100644 > --- a/fs/xfs/xfs_acl.c > +++ b/fs/xfs/xfs_acl.c > @@ -226,28 +226,13 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > return error; > } > > -static int > -xfs_set_mode(struct inode *inode, umode_t mode) > -{ > - int error = 0; > - > - if (mode != inode->i_mode) { > - struct iattr iattr; > - > - iattr.ia_valid = ATTR_MODE | ATTR_CTIME; > - iattr.ia_mode = mode; > - iattr.ia_ctime = current_time(inode); > - > - error = xfs_setattr_nonsize(XFS_I(inode), &iattr, XFS_ATTR_NOACL); > - } > - > - return error; > -} > - > int > xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > { > int error = 0; > + int mode_changed = 0; > + umode_t old_mode = inode->i_mode; > + struct timespec old_ctime = inode->i_ctime; > > if (!acl) > goto set_acl; > @@ -257,16 +242,20 @@ xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > return error; > > if (type == ACL_TYPE_ACCESS) { > - umode_t mode; > - > - error = posix_acl_update_mode(inode, &mode, &acl); > - if (error) > - return error; > - error = xfs_set_mode(inode, mode); > + error = posix_acl_update_mode(inode, &inode->i_mode, &acl); > if (error) > return error; > + inode->i_ctime = current_time(inode); > + XFS_STATS_INC(XFS_I(inode)->i_mount, xs_ig_attrchg); > + mode_changed = 1; > } > > set_acl: > - return __xfs_set_acl(inode, acl, type); > + error = __xfs_set_acl(inode, acl, type); > + if (error && mode_changed) { > + inode->i_mode = old_mode; > + inode->i_ctime = old_ctime; > + XFS_STATS_DEC(XFS_I(inode)->i_mount, xs_ig_attrchg); > + } > + return error; > } > -- > 2.1.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 14, 2017 at 03:28:27PM -0700, Darrick J. Wong wrote: > On Mon, Aug 14, 2017 at 05:18:10AM -0300, Ernesto A. Fernández wrote: > > The test won't exactly pass after this patch is applied. It will fail > > differently, this time claiming that the filesystem is inconsistent. This > > I disagree that corrupting the filesystem is an acceptable strategy for > dealing with insufficient space to handle setfacl. > > > probably has something to do with setting too many extended attributes, as > > it goes away if you change the attribute size in the test from 1k to 64k. > > Hadn't you better look into why it does that and handle it? Hi Darrick, thank for your quick reply. I think I failed to explain myself properly. The inconsistency issue is NOT being caused by my patch. It was already there, and my patch just makes it visible. I'll try to be more clear. The test generic/449 essentially goes like this: 1) It sets as many extended attributes as possible so the filesystem will run out of space for the ACL. 2) It tries to set the ACL, which will fail because of (1). 3) It checks that the mode was not altered by the failed setfacl. 4) It checks that the filesystem was not corrupted. The filesystem becomes corrupted after step (1). Before my patch, the test was failing in step (3), so step (4) was never run. Once the ACL issue is fixed, step (3) will pass and the test will get to (4). So now the test announces the inconsistency, while before it was silent about it. If the test is run without applying my patch, but changing it slightly so it passes step (3), the end result will be the same: inconsistent filesystem. A simple way to try this is to comment out the whole if block that deals with the call to setfacl. So whatever this issue is, it seems to be a general problem with extended attributes in xfs. It has nothing to do with the bug my patch intends to fix, other than the fact that generic/449 accidentally checks for both problems. My suggestion to adjust the attribute size in the test was so that we could focus on the issue at hand, not so we could ignore the other one. Of course it needs to be fixed as well. I will try to figure out what is going on and hopefully send another patch. But both issues are independent, so please take a look at my patch on its own merits. It does fix the ACL bug even if the test does not pass. Thank you for your attention, Ernest -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 14, 2017 at 05:18:10AM -0300, Ernesto A. Fernández wrote: > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c > index de7b9bd..c3193e1 100644 > --- a/fs/xfs/libxfs/xfs_attr.c > +++ b/fs/xfs/libxfs/xfs_attr.c > @@ -395,6 +395,7 @@ xfs_attr_remove( > struct xfs_defer_ops dfops; > xfs_fsblock_t firstblock; > int error; > + int noattr = 0; bool. And make it "bool has_attr = true" to avoid the nasty double negative of "!noattr" later on. > > XFS_STATS_INC(mp, xs_attr_remove); > > @@ -438,7 +439,12 @@ xfs_attr_remove( > xfs_trans_ijoin(args.trans, dp, 0); > > if (!xfs_inode_hasattr(dp)) { > - error = -ENOATTR; > + /* > + * Commit even if there is no attr to remove, because when > + * setting ACLs the caller may be changing the mode in the > + * same transaction. > + */ > + noattr = 1; > } else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) { > ASSERT(dp->i_afp->if_flags & XFS_IFINLINE); > error = xfs_attr_shortform_remove(&args); > @@ -458,7 +464,7 @@ xfs_attr_remove( > if (mp->m_flags & XFS_MOUNT_WSYNC) > xfs_trans_set_sync(args.trans); > > - if ((flags & ATTR_KERNOTIME) == 0) > + if ((flags & ATTR_KERNOTIME) == 0 && !noattr) > xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG); Why? > > /* > @@ -468,6 +474,9 @@ xfs_attr_remove( > error = xfs_trans_commit(args.trans); > xfs_iunlock(dp, XFS_ILOCK_EXCL); > > + if (!error && noattr) > + error = -ENOATTR; > + > return error; > > out: > diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c > index 7034e17..1fe4363 100644 > --- a/fs/xfs/xfs_acl.c > +++ b/fs/xfs/xfs_acl.c > @@ -226,28 +226,13 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > return error; > } > > -static int > -xfs_set_mode(struct inode *inode, umode_t mode) > -{ > - int error = 0; > - > - if (mode != inode->i_mode) { > - struct iattr iattr; > - > - iattr.ia_valid = ATTR_MODE | ATTR_CTIME; > - iattr.ia_mode = mode; > - iattr.ia_ctime = current_time(inode); > - > - error = xfs_setattr_nonsize(XFS_I(inode), &iattr, XFS_ATTR_NOACL); > - } > - > - return error; > -} Ok, so that does an atomic, transactional change to the inode mode.... > int > xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > { > int error = 0; > + int mode_changed = 0; > + umode_t old_mode = inode->i_mode; > + struct timespec old_ctime = inode->i_ctime; > > if (!acl) > goto set_acl; > @@ -257,16 +242,20 @@ xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > return error; > > if (type == ACL_TYPE_ACCESS) { > - umode_t mode; > - > - error = posix_acl_update_mode(inode, &mode, &acl); > - if (error) > - return error; > - error = xfs_set_mode(inode, mode); > + error = posix_acl_update_mode(inode, &inode->i_mode, &acl); > if (error) > return error; > + inode->i_ctime = current_time(inode); > + XFS_STATS_INC(XFS_I(inode)->i_mount, xs_ig_attrchg); > + mode_changed = 1; > } And this doesn't. So this breaks all our rules for inode metadata updates, both for runtime access serialisation and for crash consistency. > set_acl: > - return __xfs_set_acl(inode, acl, type); > + error = __xfs_set_acl(inode, acl, type); > + if (error && mode_changed) { > + inode->i_mode = old_mode; > + inode->i_ctime = old_ctime; > + XFS_STATS_DEC(XFS_I(inode)->i_mount, xs_ig_attrchg); > + } Same here. IOWs, you can't just remove a transactional modification of inode state and replace it with in-memory VFS inode level changes. You need to do the changes in transaction context and with the correct locks held. IOWs, if you need to undo the mode changes xfs_set_mode() did, you need to call xfs_set_mode() again. And because modifications have been made, the new ctime should remain in place, too. Of course, this still isn't atomic from a runtime access POV, but that's substantially harder to provide than just undoing the mode change on ACL attribute save error. Cheers, Dave.
Hi Dave, thanks for the review! On Tue, Aug 15, 2017 at 12:00:18PM +1000, Dave Chinner wrote: > On Mon, Aug 14, 2017 at 05:18:10AM -0300, Ernesto A. Fernández wrote: > > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c > > index de7b9bd..c3193e1 100644 > > --- a/fs/xfs/libxfs/xfs_attr.c > > +++ b/fs/xfs/libxfs/xfs_attr.c > > @@ -395,6 +395,7 @@ xfs_attr_remove( > > struct xfs_defer_ops dfops; > > xfs_fsblock_t firstblock; > > int error; > > + int noattr = 0; > > bool. > > And make it "bool has_attr = true" to avoid the nasty double > negative of "!noattr" later on. I called it noattr because the error was -ENOATTR, but you are right, has_attr is better. I assume you would prefer mode_changed to be a bool as well? > > > > > XFS_STATS_INC(mp, xs_attr_remove); > > > > @@ -438,7 +439,12 @@ xfs_attr_remove( > > xfs_trans_ijoin(args.trans, dp, 0); > > > > if (!xfs_inode_hasattr(dp)) { > > - error = -ENOATTR; > > + /* > > + * Commit even if there is no attr to remove, because when > > + * setting ACLs the caller may be changing the mode in the > > + * same transaction. > > + */ > > + noattr = 1; > > } else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) { > > ASSERT(dp->i_afp->if_flags & XFS_IFINLINE); > > error = xfs_attr_shortform_remove(&args); > > @@ -458,7 +464,7 @@ xfs_attr_remove( > > if (mp->m_flags & XFS_MOUNT_WSYNC) > > xfs_trans_set_sync(args.trans); > > > > - if ((flags & ATTR_KERNOTIME) == 0) > > + if ((flags & ATTR_KERNOTIME) == 0 && !noattr) > > xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG); > > Why? Because we don't want the ctime to change when you try to delete an attribute that didn't exist in the first place. At least that's how it went before, so I tried to keep it the same. > > > > > /* > > @@ -468,6 +474,9 @@ xfs_attr_remove( > > error = xfs_trans_commit(args.trans); > > xfs_iunlock(dp, XFS_ILOCK_EXCL); > > > > + if (!error && noattr) > > + error = -ENOATTR; > > + > > return error; > > > > out: > > diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c > > index 7034e17..1fe4363 100644 > > --- a/fs/xfs/xfs_acl.c > > +++ b/fs/xfs/xfs_acl.c > > @@ -226,28 +226,13 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > return error; > > } > > > > -static int > > -xfs_set_mode(struct inode *inode, umode_t mode) > > -{ > > - int error = 0; > > - > > - if (mode != inode->i_mode) { > > - struct iattr iattr; > > - > > - iattr.ia_valid = ATTR_MODE | ATTR_CTIME; > > - iattr.ia_mode = mode; > > - iattr.ia_ctime = current_time(inode); > > - > > - error = xfs_setattr_nonsize(XFS_I(inode), &iattr, XFS_ATTR_NOACL); > > - } > > - > > - return error; > > -} > > Ok, so that does an atomic, transactional change to the inode > mode.... > > > int > > xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > { > > int error = 0; > > + int mode_changed = 0; > > + umode_t old_mode = inode->i_mode; > > + struct timespec old_ctime = inode->i_ctime; > > > > if (!acl) > > goto set_acl; > > @@ -257,16 +242,20 @@ xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > return error; > > > > if (type == ACL_TYPE_ACCESS) { > > - umode_t mode; > > - > > - error = posix_acl_update_mode(inode, &mode, &acl); > > - if (error) > > - return error; > > - error = xfs_set_mode(inode, mode); > > + error = posix_acl_update_mode(inode, &inode->i_mode, &acl); > > if (error) > > return error; > > + inode->i_ctime = current_time(inode); > > + XFS_STATS_INC(XFS_I(inode)->i_mount, xs_ig_attrchg); > > + mode_changed = 1; > > } > > And this doesn't. > > So this breaks all our rules for inode metadata updates, both for > runtime access serialisation and for crash consistency. > I'm only making changes to the vfs inode, and xfs_set_acl() should always be called with i_rwsem held for writing. I think that should be enough to serialize access to the vfs inode fields. I'm sorry, I'm not sure I follow the crash consistency part. This is not writing anything to disk. My expectation was that the changes I made to the vfs inode would be attached to the transaction later, when calling xfs_trans_log_inode() in xfs_attr_set() or xfs_attr_remove(). Would that not work as intended? I was mostly imitating how btrfs sets an ACL, so I could be misunderstanding how xfs does journaling. > > set_acl: > > - return __xfs_set_acl(inode, acl, type); > > + error = __xfs_set_acl(inode, acl, type); > > + if (error && mode_changed) { > > + inode->i_mode = old_mode; > > + inode->i_ctime = old_ctime; > > + XFS_STATS_DEC(XFS_I(inode)->i_mount, xs_ig_attrchg); > > + } > > Same here. i_rwsem would still be held here. My reasoning is that, since there was an error, no changes were actually committed; so there would be no need to commit the restoration either, just restore the old in-memory vfs inode fields. Once again, this is inspired by btrfs, so I suppose it may fail here in some way I can't see. (I should probably clarify I did run tests before submitting) > IOWs, you can't just remove a transactional modification of inode > state and replace it with in-memory VFS inode level changes. You > need to do the changes in transaction context and with the correct > locks held. > > IOWs, if you need to undo the mode changes xfs_set_mode() did, you > need to call xfs_set_mode() again. And because modifications have But that could fail as well, and how would we handle that? If this bug is to be fixed the whole thing needs to be a single transaction. That's why I removed the xfs_set_mode() function in the first place. > been made, the new ctime should remain in place, too. > > Of course, this still isn't atomic from a runtime access POV, but > that's substantially harder to provide than just undoing the mode > change on ACL attribute save error. > > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com Thanks, Ernest -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Aug 15, 2017 at 03:18:58AM -0300, Ernesto A. Fernández wrote: > Hi Dave, thanks for the review! > > On Tue, Aug 15, 2017 at 12:00:18PM +1000, Dave Chinner wrote: > > On Mon, Aug 14, 2017 at 05:18:10AM -0300, Ernesto A. Fernández wrote: > > > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c > > > index de7b9bd..c3193e1 100644 > > > --- a/fs/xfs/libxfs/xfs_attr.c > > > +++ b/fs/xfs/libxfs/xfs_attr.c > > > @@ -395,6 +395,7 @@ xfs_attr_remove( > > > struct xfs_defer_ops dfops; > > > xfs_fsblock_t firstblock; > > > int error; > > > + int noattr = 0; > > > > bool. > > > > And make it "bool has_attr = true" to avoid the nasty double > > negative of "!noattr" later on. > > I called it noattr because the error was -ENOATTR, but you are right, > has_attr is better. I assume you would prefer mode_changed to be a bool > as well? > > > > > > > > > XFS_STATS_INC(mp, xs_attr_remove); > > > > > > @@ -438,7 +439,12 @@ xfs_attr_remove( > > > xfs_trans_ijoin(args.trans, dp, 0); > > > > > > if (!xfs_inode_hasattr(dp)) { > > > - error = -ENOATTR; > > > + /* > > > + * Commit even if there is no attr to remove, because when > > > + * setting ACLs the caller may be changing the mode in the > > > + * same transaction. > > > + */ > > > + noattr = 1; > > > } else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) { > > > ASSERT(dp->i_afp->if_flags & XFS_IFINLINE); > > > error = xfs_attr_shortform_remove(&args); > > > @@ -458,7 +464,7 @@ xfs_attr_remove( > > > if (mp->m_flags & XFS_MOUNT_WSYNC) > > > xfs_trans_set_sync(args.trans); > > > > > > - if ((flags & ATTR_KERNOTIME) == 0) > > > + if ((flags & ATTR_KERNOTIME) == 0 && !noattr) > > > xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG); > > > > Why? > > Because we don't want the ctime to change when you try to delete an > attribute that didn't exist in the first place. At least that's how > it went before, so I tried to keep it the same. Which means the logic is convoluted and no longer obvious as to what it does. i.e. the noattr case is running a transaction that dirties an inode that no modifications were made to. Alarm bells right there. > > > @@ -257,16 +242,20 @@ xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > > return error; > > > > > > if (type == ACL_TYPE_ACCESS) { > > > - umode_t mode; > > > - > > > - error = posix_acl_update_mode(inode, &mode, &acl); > > > - if (error) > > > - return error; > > > - error = xfs_set_mode(inode, mode); > > > + error = posix_acl_update_mode(inode, &inode->i_mode, &acl); > > > if (error) > > > return error; > > > + inode->i_ctime = current_time(inode); > > > + XFS_STATS_INC(XFS_I(inode)->i_mount, xs_ig_attrchg); > > > + mode_changed = 1; > > > } > > > > And this doesn't. > > > > So this breaks all our rules for inode metadata updates, both for > > runtime access serialisation and for crash consistency. > > I'm only making changes to the vfs inode, and xfs_set_acl() should always > be called with i_rwsem held for writing. I think that should be enough to > serialize access to the vfs inode fields. It's not. The i_rwsem does not protect XFS inode metadata, which is what needs changing here. i.e. we change both the VFS inode and the corresponding XFS inode metadata at the same time, under the same XFS_ILOCK_EXCL lock, only inside a transaction context. That's the rules, no exceptions. > I'm sorry, I'm not sure I follow the crash consistency part. This is not > writing anything to disk. xfs_set_mode() runs a transaction that is journalled to change the inode mode. You break in-memory consistency by modifying the VFS inode state directly without holding the correct locks (see xfs_setattr_mode()), and you break the transactional change model (i.e. on-disk journal consistency and on-disk metadata change ordering) by making metadata changes this outside transaction contexts and without holding the correct locks. > My expectation was that the changes I made to > the vfs inode would be attached to the transaction later, when calling > xfs_trans_log_inode() in xfs_attr_set() or xfs_attr_remove(). Would that > not work as intended? No, it wouldn't. You have to follow the filesystem specific update rules to change inode metadata, not make up your own.... > I was mostly imitating how btrfs sets an ACL, so I > could be misunderstanding how xfs does journaling. ... and btrfs has a completely different transaction model and implementation to XFS. IOWs, trying to do what works for btrfs in XFS code will break XFS. And vice versa. > > > set_acl: > > > - return __xfs_set_acl(inode, acl, type); > > > + error = __xfs_set_acl(inode, acl, type); > > > + if (error && mode_changed) { > > > + inode->i_mode = old_mode; > > > + inode->i_ctime = old_ctime; > > > + XFS_STATS_DEC(XFS_I(inode)->i_mount, xs_ig_attrchg); > > > + } > > > > Same here. > > i_rwsem would still be held here. My reasoning is that, since there was > an error, no changes were actually committed; so there would be no need > to commit the restoration either, just restore the old in-memory vfs inode > fields. Once again, this is inspired by btrfs, so I suppose it may fail > here in some way I can't see. > > (I should probably clarify I did run tests before submitting) When you treat the code as a black box for testing, then end result might appear to be what you want. That doesn't mean the implementation is correct or that hidden transient states within the block box are incorrect or invalid. > > IOWs, you can't just remove a transactional modification of inode > > state and replace it with in-memory VFS inode level changes. You > > need to do the changes in transaction context and with the correct > > locks held. > > > > IOWs, if you need to undo the mode changes xfs_set_mode() did, you > > need to call xfs_set_mode() again. And because modifications have > > But that could fail as well, and how would we handle that? If the xattr change was not a fatal error (which will shut down the filesystem), then calling xfs_set_mode() should work just fine as it doesn't require any new space to be allocated (XFS != BTRFS - we can run overwriting transactions even at ENOSPC). And you need to run the mode undo as a separate transaction because.... > If this bug > is to be fixed the whole thing needs to be a single transaction. .... this is simply not possible. Why? Because __xfs_set_acl() can run up three separate transactions internally in adding the new xattr. That means your unlogged vfs indoe change has already been logged to the journal and so we *require* a transaction to overwrite the new mode that has already been committed to the journal. But I have to ask - why do we even need to modify the mode first? Why not change the ACL first, then modify the mode+timestamp? If setting the ACL fails, then we don't have anything to undo and all is good.... Cheers, Dave.
On Tue, Aug 15, 2017 at 06:44:30PM +1000, Dave Chinner wrote: > On Tue, Aug 15, 2017 at 03:18:58AM -0300, Ernesto A. Fernández wrote: > > Hi Dave, thanks for the review! > > > > On Tue, Aug 15, 2017 at 12:00:18PM +1000, Dave Chinner wrote: > > > On Mon, Aug 14, 2017 at 05:18:10AM -0300, Ernesto A. Fernández wrote: > > > > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c > > > > index de7b9bd..c3193e1 100644 > > > > --- a/fs/xfs/libxfs/xfs_attr.c > > > > +++ b/fs/xfs/libxfs/xfs_attr.c > > > > @@ -395,6 +395,7 @@ xfs_attr_remove( > > > > struct xfs_defer_ops dfops; > > > > xfs_fsblock_t firstblock; > > > > int error; > > > > + int noattr = 0; > > > > > > bool. > > > > > > And make it "bool has_attr = true" to avoid the nasty double > > > negative of "!noattr" later on. > > > > I called it noattr because the error was -ENOATTR, but you are right, > > has_attr is better. I assume you would prefer mode_changed to be a bool > > as well? > > > > > > > > > > > > > XFS_STATS_INC(mp, xs_attr_remove); > > > > > > > > @@ -438,7 +439,12 @@ xfs_attr_remove( > > > > xfs_trans_ijoin(args.trans, dp, 0); > > > > > > > > if (!xfs_inode_hasattr(dp)) { > > > > - error = -ENOATTR; > > > > + /* > > > > + * Commit even if there is no attr to remove, because when > > > > + * setting ACLs the caller may be changing the mode in the > > > > + * same transaction. > > > > + */ > > > > + noattr = 1; > > > > } else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) { > > > > ASSERT(dp->i_afp->if_flags & XFS_IFINLINE); > > > > error = xfs_attr_shortform_remove(&args); > > > > @@ -458,7 +464,7 @@ xfs_attr_remove( > > > > if (mp->m_flags & XFS_MOUNT_WSYNC) > > > > xfs_trans_set_sync(args.trans); > > > > > > > > - if ((flags & ATTR_KERNOTIME) == 0) > > > > + if ((flags & ATTR_KERNOTIME) == 0 && !noattr) > > > > xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG); > > > > > > Why? > > > > Because we don't want the ctime to change when you try to delete an > > attribute that didn't exist in the first place. At least that's how > > it went before, so I tried to keep it the same. > > Which means the logic is convoluted and no longer obvious as to what > it does. i.e. the noattr case is running a transaction that dirties > an inode that no modifications were made to. Alarm bells right > there. > > > > > @@ -257,16 +242,20 @@ xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > > > return error; > > > > > > > > if (type == ACL_TYPE_ACCESS) { > > > > - umode_t mode; > > > > - > > > > - error = posix_acl_update_mode(inode, &mode, &acl); > > > > - if (error) > > > > - return error; > > > > - error = xfs_set_mode(inode, mode); > > > > + error = posix_acl_update_mode(inode, &inode->i_mode, &acl); > > > > if (error) > > > > return error; > > > > + inode->i_ctime = current_time(inode); > > > > + XFS_STATS_INC(XFS_I(inode)->i_mount, xs_ig_attrchg); > > > > + mode_changed = 1; > > > > } > > > > > > And this doesn't. > > > > > > So this breaks all our rules for inode metadata updates, both for > > > runtime access serialisation and for crash consistency. > > > > I'm only making changes to the vfs inode, and xfs_set_acl() should always > > be called with i_rwsem held for writing. I think that should be enough to > > serialize access to the vfs inode fields. > > It's not. The i_rwsem does not protect XFS inode metadata, which is > what needs changing here. i.e. we change both the VFS inode and > the corresponding XFS inode metadata at the same time, under the > same XFS_ILOCK_EXCL lock, only inside a transaction context. > > That's the rules, no exceptions. > > > I'm sorry, I'm not sure I follow the crash consistency part. This is not > > writing anything to disk. > > xfs_set_mode() runs a transaction that is journalled to change the > inode mode. You break in-memory consistency by modifying the VFS > inode state directly without holding the correct locks (see > xfs_setattr_mode()), and you break the transactional change model > (i.e. on-disk journal consistency and on-disk metadata change > ordering) by making metadata changes this outside transaction > contexts and without holding the correct locks. > > > My expectation was that the changes I made to > > the vfs inode would be attached to the transaction later, when calling > > xfs_trans_log_inode() in xfs_attr_set() or xfs_attr_remove(). Would that > > not work as intended? > > No, it wouldn't. You have to follow the filesystem specific update > rules to change inode metadata, not make up your own.... > > > I was mostly imitating how btrfs sets an ACL, so I > > could be misunderstanding how xfs does journaling. > > ... and btrfs has a completely different transaction model and > implementation to XFS. IOWs, trying to do what works for btrfs in > XFS code will break XFS. And vice versa. > > > > > set_acl: > > > > - return __xfs_set_acl(inode, acl, type); > > > > + error = __xfs_set_acl(inode, acl, type); > > > > + if (error && mode_changed) { > > > > + inode->i_mode = old_mode; > > > > + inode->i_ctime = old_ctime; > > > > + XFS_STATS_DEC(XFS_I(inode)->i_mount, xs_ig_attrchg); > > > > + } > > > > > > Same here. > > > > i_rwsem would still be held here. My reasoning is that, since there was > > an error, no changes were actually committed; so there would be no need > > to commit the restoration either, just restore the old in-memory vfs inode > > fields. Once again, this is inspired by btrfs, so I suppose it may fail > > here in some way I can't see. > > > > (I should probably clarify I did run tests before submitting) > > When you treat the code as a black box for testing, then end result > might appear to be what you want. That doesn't mean the > implementation is correct or that hidden transient states within the > block box are incorrect or invalid. > > > > IOWs, you can't just remove a transactional modification of inode > > > state and replace it with in-memory VFS inode level changes. You > > > need to do the changes in transaction context and with the correct > > > locks held. > > > > > > IOWs, if you need to undo the mode changes xfs_set_mode() did, you > > > need to call xfs_set_mode() again. And because modifications have > > > > But that could fail as well, and how would we handle that? > > If the xattr change was not a fatal error (which will shut down the > filesystem), then calling xfs_set_mode() should work just fine as it > doesn't require any new space to be allocated (XFS != BTRFS - we can > run overwriting transactions even at ENOSPC). > > And you need to run the mode undo as a separate transaction > because.... > > > If this bug > > is to be fixed the whole thing needs to be a single transaction. > > .... this is simply not possible. Why? Because __xfs_set_acl() can > run up three separate transactions internally in adding the new > xattr. That means your unlogged vfs indoe change has already been > logged to the journal and so we *require* a transaction to overwrite > the new mode that has already been committed to the journal. It's clear that I don't understand xfs, and I'm causing more trouble than I'm helping. I should leave this issue to you guys. > But I have to ask - why do we even need to modify the mode first? > Why not change the ACL first, then modify the mode+timestamp? If > setting the ACL fails, then we don't have anything to undo and all > is good.... I intended for the mode to be committed as part of the same transaction that sets or removes the ACL. In my mind making the changes later, as part of a separate transaction, would have meant that a crash between the two left the filesystem in an inconsistent state, with a new ACL but without the corresponding mode bits. Of course that was based on a poor understanding of xfs journaling, so now I don't know. > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com Thanks, Ernest -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 14, 2017 at 05:18:10AM -0300, Ernesto A. Fernández wrote: > When changing a file's ACL mask, xfs_set_acl() will first set the group > bits of i_mode to the value of the mask, and only then set the actual > extended attribute representing the new ACL. > > If the second part fails (due to lack of space, for example) and the > file had no ACL attribute to begin with, the system will from now on > assume that the mask permission bits are actual group permission bits, > potentially granting access to the wrong users. > > To prevent this, perform the mode update and the ACL update as part of > the same transaction, and restore the original mode to the vfs inode in > case of failure. This requires a change in how extended attributes are > deleted: commit the transaction even if there was no attribute to > remove, to log the standard inode fields. > > Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> > --- > This issue is covered by generic/449 in xfstests. Several other filesystems Sorry, I'm coming into this late, and might be missing something... is there a particular xfs configuration that you've set up on which generic/449 fails? On 4.15-rc1, I see no failures of g/449. Thanks- Bill > are affected by this, some of them have already applied patches: > - 397e434 ext4: preserve i_mode if __ext4_set_acl() fails > - f070e5a jfs: preserve i_mode if __jfs_set_acl() fails > - fcea8ae reiserfs: preserve i_mode if __reiserfs_set_acl() fails > - fe26569 ext2: preserve i_mode if ext2_set_acl() fails > The test won't exactly pass after this patch is applied. It will fail > differently, this time claiming that the filesystem is inconsistent. This > probably has something to do with setting too many extended attributes, as > it goes away if you change the attribute size in the test from 1k to 64k. > > fs/xfs/libxfs/xfs_attr.c | 13 +++++++++++-- > fs/xfs/xfs_acl.c | 39 ++++++++++++++------------------------- > 2 files changed, 25 insertions(+), 27 deletions(-) > > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c > index de7b9bd..c3193e1 100644 > --- a/fs/xfs/libxfs/xfs_attr.c > +++ b/fs/xfs/libxfs/xfs_attr.c > @@ -395,6 +395,7 @@ xfs_attr_remove( > struct xfs_defer_ops dfops; > xfs_fsblock_t firstblock; > int error; > + int noattr = 0; > > XFS_STATS_INC(mp, xs_attr_remove); > > @@ -438,7 +439,12 @@ xfs_attr_remove( > xfs_trans_ijoin(args.trans, dp, 0); > > if (!xfs_inode_hasattr(dp)) { > - error = -ENOATTR; > + /* > + * Commit even if there is no attr to remove, because when > + * setting ACLs the caller may be changing the mode in the > + * same transaction. > + */ > + noattr = 1; > } else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) { > ASSERT(dp->i_afp->if_flags & XFS_IFINLINE); > error = xfs_attr_shortform_remove(&args); > @@ -458,7 +464,7 @@ xfs_attr_remove( > if (mp->m_flags & XFS_MOUNT_WSYNC) > xfs_trans_set_sync(args.trans); > > - if ((flags & ATTR_KERNOTIME) == 0) > + if ((flags & ATTR_KERNOTIME) == 0 && !noattr) > xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG); > > /* > @@ -468,6 +474,9 @@ xfs_attr_remove( > error = xfs_trans_commit(args.trans); > xfs_iunlock(dp, XFS_ILOCK_EXCL); > > + if (!error && noattr) > + error = -ENOATTR; > + > return error; > > out: > diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c > index 7034e17..1fe4363 100644 > --- a/fs/xfs/xfs_acl.c > +++ b/fs/xfs/xfs_acl.c > @@ -226,28 +226,13 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > return error; > } > > -static int > -xfs_set_mode(struct inode *inode, umode_t mode) > -{ > - int error = 0; > - > - if (mode != inode->i_mode) { > - struct iattr iattr; > - > - iattr.ia_valid = ATTR_MODE | ATTR_CTIME; > - iattr.ia_mode = mode; > - iattr.ia_ctime = current_time(inode); > - > - error = xfs_setattr_nonsize(XFS_I(inode), &iattr, XFS_ATTR_NOACL); > - } > - > - return error; > -} > - > int > xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > { > int error = 0; > + int mode_changed = 0; > + umode_t old_mode = inode->i_mode; > + struct timespec old_ctime = inode->i_ctime; > > if (!acl) > goto set_acl; > @@ -257,16 +242,20 @@ xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > return error; > > if (type == ACL_TYPE_ACCESS) { > - umode_t mode; > - > - error = posix_acl_update_mode(inode, &mode, &acl); > - if (error) > - return error; > - error = xfs_set_mode(inode, mode); > + error = posix_acl_update_mode(inode, &inode->i_mode, &acl); > if (error) > return error; > + inode->i_ctime = current_time(inode); > + XFS_STATS_INC(XFS_I(inode)->i_mount, xs_ig_attrchg); > + mode_changed = 1; > } > > set_acl: > - return __xfs_set_acl(inode, acl, type); > + error = __xfs_set_acl(inode, acl, type); > + if (error && mode_changed) { > + inode->i_mode = old_mode; > + inode->i_ctime = old_ctime; > + XFS_STATS_DEC(XFS_I(inode)->i_mount, xs_ig_attrchg); > + } > + return error; > } > -- > 2.1.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Dec 07, 2017 at 11:31:51AM -0600, Bill O'Donnell wrote: > On Mon, Aug 14, 2017 at 05:18:10AM -0300, Ernesto A. Fernández wrote: > > When changing a file's ACL mask, xfs_set_acl() will first set the group > > bits of i_mode to the value of the mask, and only then set the actual > > extended attribute representing the new ACL. > > > > If the second part fails (due to lack of space, for example) and the > > file had no ACL attribute to begin with, the system will from now on > > assume that the mask permission bits are actual group permission bits, > > potentially granting access to the wrong users. > > > > To prevent this, perform the mode update and the ACL update as part of > > the same transaction, and restore the original mode to the vfs inode in > > case of failure. This requires a change in how extended attributes are > > deleted: commit the transaction even if there was no attribute to > > remove, to log the standard inode fields. > > > > Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> > > --- > > This issue is covered by generic/449 in xfstests. Several other filesystems > > Sorry, I'm coming into this late, and might be missing something... > is there a particular xfs configuration that you've set up on which > generic/449 fails? On 4.15-rc1, I see no failures of g/449. Oh, I see this was merged and fixes g/449: commit 67f2ffe31d1a683170c2ba0ecc643e42a5fdd397 Author: Dave Chinner <dchinner@redhat.com> Date: Mon Oct 9 11:37:23 2017 -0700 xfs: don't change inode mode if ACL update fails > > Thanks- > Bill > > > > > are affected by this, some of them have already applied patches: > > - 397e434 ext4: preserve i_mode if __ext4_set_acl() fails > > - f070e5a jfs: preserve i_mode if __jfs_set_acl() fails > > - fcea8ae reiserfs: preserve i_mode if __reiserfs_set_acl() fails > > - fe26569 ext2: preserve i_mode if ext2_set_acl() fails > > The test won't exactly pass after this patch is applied. It will fail > > differently, this time claiming that the filesystem is inconsistent. This > > probably has something to do with setting too many extended attributes, as > > it goes away if you change the attribute size in the test from 1k to 64k. > > > > fs/xfs/libxfs/xfs_attr.c | 13 +++++++++++-- > > fs/xfs/xfs_acl.c | 39 ++++++++++++++------------------------- > > 2 files changed, 25 insertions(+), 27 deletions(-) > > > > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c > > index de7b9bd..c3193e1 100644 > > --- a/fs/xfs/libxfs/xfs_attr.c > > +++ b/fs/xfs/libxfs/xfs_attr.c > > @@ -395,6 +395,7 @@ xfs_attr_remove( > > struct xfs_defer_ops dfops; > > xfs_fsblock_t firstblock; > > int error; > > + int noattr = 0; > > > > XFS_STATS_INC(mp, xs_attr_remove); > > > > @@ -438,7 +439,12 @@ xfs_attr_remove( > > xfs_trans_ijoin(args.trans, dp, 0); > > > > if (!xfs_inode_hasattr(dp)) { > > - error = -ENOATTR; > > + /* > > + * Commit even if there is no attr to remove, because when > > + * setting ACLs the caller may be changing the mode in the > > + * same transaction. > > + */ > > + noattr = 1; > > } else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) { > > ASSERT(dp->i_afp->if_flags & XFS_IFINLINE); > > error = xfs_attr_shortform_remove(&args); > > @@ -458,7 +464,7 @@ xfs_attr_remove( > > if (mp->m_flags & XFS_MOUNT_WSYNC) > > xfs_trans_set_sync(args.trans); > > > > - if ((flags & ATTR_KERNOTIME) == 0) > > + if ((flags & ATTR_KERNOTIME) == 0 && !noattr) > > xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG); > > > > /* > > @@ -468,6 +474,9 @@ xfs_attr_remove( > > error = xfs_trans_commit(args.trans); > > xfs_iunlock(dp, XFS_ILOCK_EXCL); > > > > + if (!error && noattr) > > + error = -ENOATTR; > > + > > return error; > > > > out: > > diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c > > index 7034e17..1fe4363 100644 > > --- a/fs/xfs/xfs_acl.c > > +++ b/fs/xfs/xfs_acl.c > > @@ -226,28 +226,13 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > return error; > > } > > > > -static int > > -xfs_set_mode(struct inode *inode, umode_t mode) > > -{ > > - int error = 0; > > - > > - if (mode != inode->i_mode) { > > - struct iattr iattr; > > - > > - iattr.ia_valid = ATTR_MODE | ATTR_CTIME; > > - iattr.ia_mode = mode; > > - iattr.ia_ctime = current_time(inode); > > - > > - error = xfs_setattr_nonsize(XFS_I(inode), &iattr, XFS_ATTR_NOACL); > > - } > > - > > - return error; > > -} > > - > > int > > xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > { > > int error = 0; > > + int mode_changed = 0; > > + umode_t old_mode = inode->i_mode; > > + struct timespec old_ctime = inode->i_ctime; > > > > if (!acl) > > goto set_acl; > > @@ -257,16 +242,20 @@ xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > return error; > > > > if (type == ACL_TYPE_ACCESS) { > > - umode_t mode; > > - > > - error = posix_acl_update_mode(inode, &mode, &acl); > > - if (error) > > - return error; > > - error = xfs_set_mode(inode, mode); > > + error = posix_acl_update_mode(inode, &inode->i_mode, &acl); > > if (error) > > return error; > > + inode->i_ctime = current_time(inode); > > + XFS_STATS_INC(XFS_I(inode)->i_mount, xs_ig_attrchg); > > + mode_changed = 1; > > } > > > > set_acl: > > - return __xfs_set_acl(inode, acl, type); > > + error = __xfs_set_acl(inode, acl, type); > > + if (error && mode_changed) { > > + inode->i_mode = old_mode; > > + inode->i_ctime = old_ctime; > > + XFS_STATS_DEC(XFS_I(inode)->i_mount, xs_ig_attrchg); > > + } > > + return error; > > } > > -- > > 2.1.4 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Dec 07, 2017 at 11:31:51AM -0600, Bill O'Donnell wrote: > On Mon, Aug 14, 2017 at 05:18:10AM -0300, Ernesto A. Fernández wrote: > > When changing a file's ACL mask, xfs_set_acl() will first set the group > > bits of i_mode to the value of the mask, and only then set the actual > > extended attribute representing the new ACL. > > > > If the second part fails (due to lack of space, for example) and the > > file had no ACL attribute to begin with, the system will from now on > > assume that the mask permission bits are actual group permission bits, > > potentially granting access to the wrong users. > > > > To prevent this, perform the mode update and the ACL update as part of > > the same transaction, and restore the original mode to the vfs inode in > > case of failure. This requires a change in how extended attributes are > > deleted: commit the transaction even if there was no attribute to > > remove, to log the standard inode fields. > > > > Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> > > --- > > This issue is covered by generic/449 in xfstests. Several other filesystems > > Sorry, I'm coming into this late, and might be missing something... > is there a particular xfs configuration that you've set up on which > generic/449 fails? On 4.15-rc1, I see no failures of g/449. Fixed in: 67f2ffe31d1 ("xfs: don't change inode mode if ACL update fails") (I think...) --D > Thanks- > Bill > > > > > are affected by this, some of them have already applied patches: > > - 397e434 ext4: preserve i_mode if __ext4_set_acl() fails > > - f070e5a jfs: preserve i_mode if __jfs_set_acl() fails > > - fcea8ae reiserfs: preserve i_mode if __reiserfs_set_acl() fails > > - fe26569 ext2: preserve i_mode if ext2_set_acl() fails > > The test won't exactly pass after this patch is applied. It will fail > > differently, this time claiming that the filesystem is inconsistent. This > > probably has something to do with setting too many extended attributes, as > > it goes away if you change the attribute size in the test from 1k to 64k. > > > > fs/xfs/libxfs/xfs_attr.c | 13 +++++++++++-- > > fs/xfs/xfs_acl.c | 39 ++++++++++++++------------------------- > > 2 files changed, 25 insertions(+), 27 deletions(-) > > > > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c > > index de7b9bd..c3193e1 100644 > > --- a/fs/xfs/libxfs/xfs_attr.c > > +++ b/fs/xfs/libxfs/xfs_attr.c > > @@ -395,6 +395,7 @@ xfs_attr_remove( > > struct xfs_defer_ops dfops; > > xfs_fsblock_t firstblock; > > int error; > > + int noattr = 0; > > > > XFS_STATS_INC(mp, xs_attr_remove); > > > > @@ -438,7 +439,12 @@ xfs_attr_remove( > > xfs_trans_ijoin(args.trans, dp, 0); > > > > if (!xfs_inode_hasattr(dp)) { > > - error = -ENOATTR; > > + /* > > + * Commit even if there is no attr to remove, because when > > + * setting ACLs the caller may be changing the mode in the > > + * same transaction. > > + */ > > + noattr = 1; > > } else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) { > > ASSERT(dp->i_afp->if_flags & XFS_IFINLINE); > > error = xfs_attr_shortform_remove(&args); > > @@ -458,7 +464,7 @@ xfs_attr_remove( > > if (mp->m_flags & XFS_MOUNT_WSYNC) > > xfs_trans_set_sync(args.trans); > > > > - if ((flags & ATTR_KERNOTIME) == 0) > > + if ((flags & ATTR_KERNOTIME) == 0 && !noattr) > > xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG); > > > > /* > > @@ -468,6 +474,9 @@ xfs_attr_remove( > > error = xfs_trans_commit(args.trans); > > xfs_iunlock(dp, XFS_ILOCK_EXCL); > > > > + if (!error && noattr) > > + error = -ENOATTR; > > + > > return error; > > > > out: > > diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c > > index 7034e17..1fe4363 100644 > > --- a/fs/xfs/xfs_acl.c > > +++ b/fs/xfs/xfs_acl.c > > @@ -226,28 +226,13 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > return error; > > } > > > > -static int > > -xfs_set_mode(struct inode *inode, umode_t mode) > > -{ > > - int error = 0; > > - > > - if (mode != inode->i_mode) { > > - struct iattr iattr; > > - > > - iattr.ia_valid = ATTR_MODE | ATTR_CTIME; > > - iattr.ia_mode = mode; > > - iattr.ia_ctime = current_time(inode); > > - > > - error = xfs_setattr_nonsize(XFS_I(inode), &iattr, XFS_ATTR_NOACL); > > - } > > - > > - return error; > > -} > > - > > int > > xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > { > > int error = 0; > > + int mode_changed = 0; > > + umode_t old_mode = inode->i_mode; > > + struct timespec old_ctime = inode->i_ctime; > > > > if (!acl) > > goto set_acl; > > @@ -257,16 +242,20 @@ xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > return error; > > > > if (type == ACL_TYPE_ACCESS) { > > - umode_t mode; > > - > > - error = posix_acl_update_mode(inode, &mode, &acl); > > - if (error) > > - return error; > > - error = xfs_set_mode(inode, mode); > > + error = posix_acl_update_mode(inode, &inode->i_mode, &acl); > > if (error) > > return error; > > + inode->i_ctime = current_time(inode); > > + XFS_STATS_INC(XFS_I(inode)->i_mount, xs_ig_attrchg); > > + mode_changed = 1; > > } > > > > set_acl: > > - return __xfs_set_acl(inode, acl, type); > > + error = __xfs_set_acl(inode, acl, type); > > + if (error && mode_changed) { > > + inode->i_mode = old_mode; > > + inode->i_ctime = old_ctime; > > + XFS_STATS_DEC(XFS_I(inode)->i_mount, xs_ig_attrchg); > > + } > > + return error; > > } > > -- > > 2.1.4 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Dec 07, 2017 at 09:43:09AM -0800, Darrick J. Wong wrote: > On Thu, Dec 07, 2017 at 11:31:51AM -0600, Bill O'Donnell wrote: > > On Mon, Aug 14, 2017 at 05:18:10AM -0300, Ernesto A. Fernández wrote: > > > When changing a file's ACL mask, xfs_set_acl() will first set the group > > > bits of i_mode to the value of the mask, and only then set the actual > > > extended attribute representing the new ACL. > > > > > > If the second part fails (due to lack of space, for example) and the > > > file had no ACL attribute to begin with, the system will from now on > > > assume that the mask permission bits are actual group permission bits, > > > potentially granting access to the wrong users. > > > > > > To prevent this, perform the mode update and the ACL update as part of > > > the same transaction, and restore the original mode to the vfs inode in > > > case of failure. This requires a change in how extended attributes are > > > deleted: commit the transaction even if there was no attribute to > > > remove, to log the standard inode fields. > > > > > > Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> > > > --- > > > This issue is covered by generic/449 in xfstests. Several other filesystems > > > > Sorry, I'm coming into this late, and might be missing something... > > is there a particular xfs configuration that you've set up on which > > generic/449 fails? On 4.15-rc1, I see no failures of g/449. > > Fixed in: > 67f2ffe31d1 ("xfs: don't change inode mode if ACL update fails") > > (I think...) Correct. Thanks- Bill > > --D > > > Thanks- > > Bill > > > > > > > > > are affected by this, some of them have already applied patches: > > > - 397e434 ext4: preserve i_mode if __ext4_set_acl() fails > > > - f070e5a jfs: preserve i_mode if __jfs_set_acl() fails > > > - fcea8ae reiserfs: preserve i_mode if __reiserfs_set_acl() fails > > > - fe26569 ext2: preserve i_mode if ext2_set_acl() fails > > > The test won't exactly pass after this patch is applied. It will fail > > > differently, this time claiming that the filesystem is inconsistent. This > > > probably has something to do with setting too many extended attributes, as > > > it goes away if you change the attribute size in the test from 1k to 64k. > > > > > > fs/xfs/libxfs/xfs_attr.c | 13 +++++++++++-- > > > fs/xfs/xfs_acl.c | 39 ++++++++++++++------------------------- > > > 2 files changed, 25 insertions(+), 27 deletions(-) > > > > > > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c > > > index de7b9bd..c3193e1 100644 > > > --- a/fs/xfs/libxfs/xfs_attr.c > > > +++ b/fs/xfs/libxfs/xfs_attr.c > > > @@ -395,6 +395,7 @@ xfs_attr_remove( > > > struct xfs_defer_ops dfops; > > > xfs_fsblock_t firstblock; > > > int error; > > > + int noattr = 0; > > > > > > XFS_STATS_INC(mp, xs_attr_remove); > > > > > > @@ -438,7 +439,12 @@ xfs_attr_remove( > > > xfs_trans_ijoin(args.trans, dp, 0); > > > > > > if (!xfs_inode_hasattr(dp)) { > > > - error = -ENOATTR; > > > + /* > > > + * Commit even if there is no attr to remove, because when > > > + * setting ACLs the caller may be changing the mode in the > > > + * same transaction. > > > + */ > > > + noattr = 1; > > > } else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) { > > > ASSERT(dp->i_afp->if_flags & XFS_IFINLINE); > > > error = xfs_attr_shortform_remove(&args); > > > @@ -458,7 +464,7 @@ xfs_attr_remove( > > > if (mp->m_flags & XFS_MOUNT_WSYNC) > > > xfs_trans_set_sync(args.trans); > > > > > > - if ((flags & ATTR_KERNOTIME) == 0) > > > + if ((flags & ATTR_KERNOTIME) == 0 && !noattr) > > > xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG); > > > > > > /* > > > @@ -468,6 +474,9 @@ xfs_attr_remove( > > > error = xfs_trans_commit(args.trans); > > > xfs_iunlock(dp, XFS_ILOCK_EXCL); > > > > > > + if (!error && noattr) > > > + error = -ENOATTR; > > > + > > > return error; > > > > > > out: > > > diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c > > > index 7034e17..1fe4363 100644 > > > --- a/fs/xfs/xfs_acl.c > > > +++ b/fs/xfs/xfs_acl.c > > > @@ -226,28 +226,13 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > > return error; > > > } > > > > > > -static int > > > -xfs_set_mode(struct inode *inode, umode_t mode) > > > -{ > > > - int error = 0; > > > - > > > - if (mode != inode->i_mode) { > > > - struct iattr iattr; > > > - > > > - iattr.ia_valid = ATTR_MODE | ATTR_CTIME; > > > - iattr.ia_mode = mode; > > > - iattr.ia_ctime = current_time(inode); > > > - > > > - error = xfs_setattr_nonsize(XFS_I(inode), &iattr, XFS_ATTR_NOACL); > > > - } > > > - > > > - return error; > > > -} > > > - > > > int > > > xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > > { > > > int error = 0; > > > + int mode_changed = 0; > > > + umode_t old_mode = inode->i_mode; > > > + struct timespec old_ctime = inode->i_ctime; > > > > > > if (!acl) > > > goto set_acl; > > > @@ -257,16 +242,20 @@ xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) > > > return error; > > > > > > if (type == ACL_TYPE_ACCESS) { > > > - umode_t mode; > > > - > > > - error = posix_acl_update_mode(inode, &mode, &acl); > > > - if (error) > > > - return error; > > > - error = xfs_set_mode(inode, mode); > > > + error = posix_acl_update_mode(inode, &inode->i_mode, &acl); > > > if (error) > > > return error; > > > + inode->i_ctime = current_time(inode); > > > + XFS_STATS_INC(XFS_I(inode)->i_mount, xs_ig_attrchg); > > > + mode_changed = 1; > > > } > > > > > > set_acl: > > > - return __xfs_set_acl(inode, acl, type); > > > + error = __xfs_set_acl(inode, acl, type); > > > + if (error && mode_changed) { > > > + inode->i_mode = old_mode; > > > + inode->i_ctime = old_ctime; > > > + XFS_STATS_DEC(XFS_I(inode)->i_mount, xs_ig_attrchg); > > > + } > > > + return error; > > > } > > > -- > > > 2.1.4 > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c index de7b9bd..c3193e1 100644 --- a/fs/xfs/libxfs/xfs_attr.c +++ b/fs/xfs/libxfs/xfs_attr.c @@ -395,6 +395,7 @@ xfs_attr_remove( struct xfs_defer_ops dfops; xfs_fsblock_t firstblock; int error; + int noattr = 0; XFS_STATS_INC(mp, xs_attr_remove); @@ -438,7 +439,12 @@ xfs_attr_remove( xfs_trans_ijoin(args.trans, dp, 0); if (!xfs_inode_hasattr(dp)) { - error = -ENOATTR; + /* + * Commit even if there is no attr to remove, because when + * setting ACLs the caller may be changing the mode in the + * same transaction. + */ + noattr = 1; } else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) { ASSERT(dp->i_afp->if_flags & XFS_IFINLINE); error = xfs_attr_shortform_remove(&args); @@ -458,7 +464,7 @@ xfs_attr_remove( if (mp->m_flags & XFS_MOUNT_WSYNC) xfs_trans_set_sync(args.trans); - if ((flags & ATTR_KERNOTIME) == 0) + if ((flags & ATTR_KERNOTIME) == 0 && !noattr) xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG); /* @@ -468,6 +474,9 @@ xfs_attr_remove( error = xfs_trans_commit(args.trans); xfs_iunlock(dp, XFS_ILOCK_EXCL); + if (!error && noattr) + error = -ENOATTR; + return error; out: diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c index 7034e17..1fe4363 100644 --- a/fs/xfs/xfs_acl.c +++ b/fs/xfs/xfs_acl.c @@ -226,28 +226,13 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) return error; } -static int -xfs_set_mode(struct inode *inode, umode_t mode) -{ - int error = 0; - - if (mode != inode->i_mode) { - struct iattr iattr; - - iattr.ia_valid = ATTR_MODE | ATTR_CTIME; - iattr.ia_mode = mode; - iattr.ia_ctime = current_time(inode); - - error = xfs_setattr_nonsize(XFS_I(inode), &iattr, XFS_ATTR_NOACL); - } - - return error; -} - int xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) { int error = 0; + int mode_changed = 0; + umode_t old_mode = inode->i_mode; + struct timespec old_ctime = inode->i_ctime; if (!acl) goto set_acl; @@ -257,16 +242,20 @@ xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) return error; if (type == ACL_TYPE_ACCESS) { - umode_t mode; - - error = posix_acl_update_mode(inode, &mode, &acl); - if (error) - return error; - error = xfs_set_mode(inode, mode); + error = posix_acl_update_mode(inode, &inode->i_mode, &acl); if (error) return error; + inode->i_ctime = current_time(inode); + XFS_STATS_INC(XFS_I(inode)->i_mount, xs_ig_attrchg); + mode_changed = 1; } set_acl: - return __xfs_set_acl(inode, acl, type); + error = __xfs_set_acl(inode, acl, type); + if (error && mode_changed) { + inode->i_mode = old_mode; + inode->i_ctime = old_ctime; + XFS_STATS_DEC(XFS_I(inode)->i_mount, xs_ig_attrchg); + } + return error; }
When changing a file's ACL mask, xfs_set_acl() will first set the group bits of i_mode to the value of the mask, and only then set the actual extended attribute representing the new ACL. If the second part fails (due to lack of space, for example) and the file had no ACL attribute to begin with, the system will from now on assume that the mask permission bits are actual group permission bits, potentially granting access to the wrong users. To prevent this, perform the mode update and the ACL update as part of the same transaction, and restore the original mode to the vfs inode in case of failure. This requires a change in how extended attributes are deleted: commit the transaction even if there was no attribute to remove, to log the standard inode fields. Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> --- This issue is covered by generic/449 in xfstests. Several other filesystems are affected by this, some of them have already applied patches: - 397e434 ext4: preserve i_mode if __ext4_set_acl() fails - f070e5a jfs: preserve i_mode if __jfs_set_acl() fails - fcea8ae reiserfs: preserve i_mode if __reiserfs_set_acl() fails - fe26569 ext2: preserve i_mode if ext2_set_acl() fails The test won't exactly pass after this patch is applied. It will fail differently, this time claiming that the filesystem is inconsistent. This probably has something to do with setting too many extended attributes, as it goes away if you change the attribute size in the test from 1k to 64k. fs/xfs/libxfs/xfs_attr.c | 13 +++++++++++-- fs/xfs/xfs_acl.c | 39 ++++++++++++++------------------------- 2 files changed, 25 insertions(+), 27 deletions(-)