Message ID | 155009105350.32028.13101526675073908023.stgit@magnolia (mailing list archive) |
---|---|
State | Accepted, archived |
Headers | show |
Series | [1/3] xfs: don't overflow xattr listent buffer | expand |
On Wed, Feb 13, 2019 at 12:50:53PM -0800, Darrick J. Wong wrote: > + if (tmpfile) { > + /* > + * The VFS requires that any inode fed to d_tmpfile must have > + * nlink == 1 so that it can decrement the nlink in d_tmpfile. > + * However, we created the temp file with nlink == 0 because > + * we're not allowed to put an inode with nlink > 0 on the > + * unlinked list. Therefore we have to set nlink to 1 so that > + * d_tmpfile can immediately set it back to zero. > + */ > + set_nlink(inode, 1); > d_tmpfile(dentry, inode); > + } else At least btrtfs has to work around these d_tmpfile assumptions as well. Instead of piling hacks over hacks I'd rather move the call to inode_dec_link_count from d_tmpfile, which should lead to a saner interface.
On Thu, Feb 14, 2019 at 12:15:56AM -0800, Christoph Hellwig wrote: > On Wed, Feb 13, 2019 at 12:50:53PM -0800, Darrick J. Wong wrote: > > + if (tmpfile) { > > + /* > > + * The VFS requires that any inode fed to d_tmpfile must have > > + * nlink == 1 so that it can decrement the nlink in d_tmpfile. > > + * However, we created the temp file with nlink == 0 because > > + * we're not allowed to put an inode with nlink > 0 on the > > + * unlinked list. Therefore we have to set nlink to 1 so that > > + * d_tmpfile can immediately set it back to zero. > > + */ > > + set_nlink(inode, 1); > > d_tmpfile(dentry, inode); > > + } else > > At least btrtfs has to work around these d_tmpfile assumptions as well. > Instead of piling hacks over hacks I'd rather move the call to > inode_dec_link_count from d_tmpfile, which should lead to a saner > interface. I'm working on a bigger change to fix the d_tmpfile behavior, but that's a complex multi-fs change that may or may not make it for 5.1. :( In the meantime this prevents leaking inodes during unlink recovery by ensuring that we never put linked inodes on the unlink list so I would still like to get this one reviewed. :) --D
On Wed, Feb 13, 2019 at 12:50:53PM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@oracle.com> > > When XFS creates an O_TMPFILE file, the inode is created with nlink = 1, > put on the unlinked list, and then the VFS sets nlink = 0 in d_tmpfile. > If we crash before anything logs the inode (it's dirty incore but the > vfs doesn't tell us it's dirty so we never log that change), the iunlink > processing part of recovery will then explode with a pile of: > > XFS: Assertion failed: VFS_I(ip)->i_nlink == 0, file: > fs/xfs/xfs_log_recover.c, line: 5072 > > Worse yet, since nlink is nonzero, the inodes also don't get cleaned up > and they just leak until the next xfs_repair run. > > Therefore, change xfs_iunlink to require that inodes being put on the > unlinked list have nlink == 0, change the tmpfile callers to instantiate > nodes that way, and set the nlink to 1 just prior to calling d_tmpfile. > Fix the comment for xfs_iunlink while we're at it. Looks good for a quick fix, even if I think we should fix it different in the long term: Reviewed-by: Christoph Hellwig <hch@lst.de>
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 9aaa3143a277..9d683b455e01 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -1332,7 +1332,7 @@ xfs_create_tmpfile( if (error) goto out_trans_cancel; - error = xfs_dir_ialloc(&tp, dp, mode, 1, 0, prid, &ip); + error = xfs_dir_ialloc(&tp, dp, mode, 0, 0, prid, &ip); if (error) goto out_trans_cancel; @@ -2231,11 +2231,8 @@ xfs_iunlink_update_inode( } /* - * This is called when the inode's link count goes to 0 or we are creating a - * tmpfile via O_TMPFILE. In the case of a tmpfile, @ignore_linkcount will be - * set to true as the link count is dropped to zero by the VFS after we've - * created the file successfully, so we have to add it to the unlinked list - * while the link count is non-zero. + * This is called when the inode's link count has gone to 0 or we are creating + * a tmpfile via O_TMPFILE. The inode @ip must have nlink == 0. * * We place the on-disk inode on a list in the AGI. It will be pulled from this * list when the inode is freed. @@ -2254,6 +2251,7 @@ xfs_iunlink( short bucket_index = agino % XFS_AGI_UNLINKED_BUCKETS; int error; + ASSERT(VFS_I(ip)->i_nlink == 0); ASSERT(VFS_I(ip)->i_mode != 0); trace_xfs_iunlink(ip); @@ -3184,11 +3182,9 @@ xfs_rename_alloc_whiteout( /* * Prepare the tmpfile inode as if it were created through the VFS. - * Otherwise, the link increment paths will complain about nlink 0->1. - * Drop the link count as done by d_tmpfile(), complete the inode setup - * and flag it as linkable. + * Complete the inode setup and flag it as linkable. nlink is already + * zero, so we can skip the drop_nlink. */ - drop_nlink(VFS_I(tmpfile)); xfs_setup_iops(tmpfile); xfs_finish_inode_setup(tmpfile); VFS_I(tmpfile)->i_state |= I_LINKABLE; diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index f48ffd7a8d3e..1efef69a7f1c 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -191,9 +191,18 @@ xfs_generic_create( xfs_setup_iops(ip); - if (tmpfile) + if (tmpfile) { + /* + * The VFS requires that any inode fed to d_tmpfile must have + * nlink == 1 so that it can decrement the nlink in d_tmpfile. + * However, we created the temp file with nlink == 0 because + * we're not allowed to put an inode with nlink > 0 on the + * unlinked list. Therefore we have to set nlink to 1 so that + * d_tmpfile can immediately set it back to zero. + */ + set_nlink(inode, 1); d_tmpfile(dentry, inode); - else + } else d_instantiate(dentry, inode); xfs_finish_inode_setup(ip);