Message ID | 1535300717-26686-4-git-send-email-amir73il@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Overlayfs stacked f_op fixes | expand |
On Sun, Aug 26, 2018 at 07:25:14PM +0300, Amir Goldstein wrote: > Since overlayfs implements stacked file operations, the underlying > filesystems are not supposed to be exposed to the overlayfs file, > whose f_inode is an overlayfs inode. > > Assigning an overlayfs file to swap_file results in an attempt of xfs > code to dereference an xfs_inode struct from an ovl_inode pointer: > > CPU: 0 PID: 2462 Comm: swapon Not tainted > 4.18.0-xfstests-12721-g33e17876ea4e #3402 > RIP: 0010:xfs_find_bdev_for_inode+0x23/0x2f > Call Trace: > xfs_iomap_swapfile_activate+0x1f/0x43 > __se_sys_swapon+0xb1a/0xee9 > > Fix this by not assigning the real inode mapping to f_mapping, which > will cause swapon() to return an error (-EINVAL). Although it makes > sense not to allow setting swpafile on an overlayfs file, some users > may depend on it, so we may need to fix this up in the future. > > Keeping f_mapping pointing to overlay inode mapping will cause O_DIRECT > open to fail. Fix this by installing ovl_aops with noop_direct_IO in > overlay inode mapping. Ummm - shouldn't ovl be checking the real inode's .direct_IO method and returning status based on that? i.e. if the underlying fs doesn't support O_DIRECT, neither should ovl... > +const struct address_space_operations ovl_aops = { > + /* For O_DIRECT dentry_open() checks f_mapping->a_ops->direct_IO */ > + .direct_IO = noop_direct_IO, > +}; > + > /* > * It is possible to stack overlayfs instance on top of another > * overlayfs instance as lower layer. We need to annonate the > @@ -575,6 +580,7 @@ static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev, > case S_IFREG: > inode->i_op = &ovl_file_inode_operations; > inode->i_fop = &ovl_file_operations; > + inode->i_mapping->a_ops = &ovl_aops; > break; So you put an ovl interposer in the way here - it needs to pass through *everything* to the the real inode's aops, right? Cheers, Dave.
On Mon, Aug 27, 2018 at 6:43 AM Dave Chinner <david@fromorbit.com> wrote: > > On Sun, Aug 26, 2018 at 07:25:14PM +0300, Amir Goldstein wrote: > > Since overlayfs implements stacked file operations, the underlying > > filesystems are not supposed to be exposed to the overlayfs file, > > whose f_inode is an overlayfs inode. > > > > Assigning an overlayfs file to swap_file results in an attempt of xfs > > code to dereference an xfs_inode struct from an ovl_inode pointer: > > > > CPU: 0 PID: 2462 Comm: swapon Not tainted > > 4.18.0-xfstests-12721-g33e17876ea4e #3402 > > RIP: 0010:xfs_find_bdev_for_inode+0x23/0x2f > > Call Trace: > > xfs_iomap_swapfile_activate+0x1f/0x43 > > __se_sys_swapon+0xb1a/0xee9 > > > > Fix this by not assigning the real inode mapping to f_mapping, which > > will cause swapon() to return an error (-EINVAL). Although it makes > > sense not to allow setting swpafile on an overlayfs file, some users > > may depend on it, so we may need to fix this up in the future. > > > > Keeping f_mapping pointing to overlay inode mapping will cause O_DIRECT > > open to fail. Fix this by installing ovl_aops with noop_direct_IO in > > overlay inode mapping. > > Ummm - shouldn't ovl be checking the real inode's .direct_IO method > and returning status based on that? i.e. if the underlying fs > doesn't support O_DIRECT, neither should ovl... > ovl_open_realfile() will take care of that later when overlay actually tried to open the underlying file. > > +const struct address_space_operations ovl_aops = { > > + /* For O_DIRECT dentry_open() checks f_mapping->a_ops->direct_IO */ > > + .direct_IO = noop_direct_IO, > > +}; > > + > > /* > > * It is possible to stack overlayfs instance on top of another > > * overlayfs instance as lower layer. We need to annonate the > > @@ -575,6 +580,7 @@ static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev, > > case S_IFREG: > > inode->i_op = &ovl_file_inode_operations; > > inode->i_fop = &ovl_file_operations; > > + inode->i_mapping->a_ops = &ovl_aops; > > break; > > So you put an ovl interposer in the way here - it needs to pass > through *everything* to the the real inode's aops, right? > No. it's just a decoy a_ops, so if we miss another spot like swapon() user will get an error (i.e. for no a_ops->readpages) rather then calling into underlying filesystem aops with an overlay file and oopsing. So the trick of this patch is to change the game from whack-an-oops to whack-an-einval, which is better for everyone. Cheers, Amir.
On Mon, Aug 27, 2018 at 8:34 AM, Amir Goldstein <amir73il@gmail.com> wrote: > So the trick of this patch is to change the game from whack-an-oops > to whack-an-einval, which is better for everyone. In other words: this overlayfs update might cause regressions in dark and musty corners of overlayfs usage, but those will not be Oops inducing regressions. And whenever someone reports a regression, that will be solved by making that corner of the kernel less musty and dark (i.e. properly stackable), so in the end this is a win-win game. There's one corner case remaining, which is shared mmap consistency across copy-up. I think the solution to this will be: - private mappings of lower files will directly go to the lower file's mapping (just like now) - anything else (shared mmap of lower files and all mmap of upper files) will go to the overlay mapping (just like a normal fs) The flip side is cache duplication between shared mmap and private mmap of the same lower file. I hope this would be pretty rare. I don't see big complexities with the above, it should be solvable purely in overlayfs. Then there's getting rid of d_real() for good, for which Al has some ideas, and we are done. So yes, I think the end of the tunnel is pretty well in sight. Thanks, Miklos
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index 24c9e0d70c3b..fd64daf6daa5 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -131,9 +131,6 @@ static int ovl_open(struct inode *inode, struct file *file) if (IS_ERR(realfile)) return PTR_ERR(realfile); - /* For O_DIRECT dentry_open() checks f_mapping->a_ops->direct_IO */ - file->f_mapping = realfile->f_mapping; - file->private_data = realfile; file->f_mode |= FMODE_STACKED; diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c index 5014749fd4b4..b6ac545b5a32 100644 --- a/fs/overlayfs/inode.c +++ b/fs/overlayfs/inode.c @@ -504,6 +504,11 @@ static const struct inode_operations ovl_special_inode_operations = { .update_time = ovl_update_time, }; +const struct address_space_operations ovl_aops = { + /* For O_DIRECT dentry_open() checks f_mapping->a_ops->direct_IO */ + .direct_IO = noop_direct_IO, +}; + /* * It is possible to stack overlayfs instance on top of another * overlayfs instance as lower layer. We need to annonate the @@ -575,6 +580,7 @@ static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev, case S_IFREG: inode->i_op = &ovl_file_inode_operations; inode->i_fop = &ovl_file_operations; + inode->i_mapping->a_ops = &ovl_aops; break; case S_IFDIR:
Since overlayfs implements stacked file operations, the underlying filesystems are not supposed to be exposed to the overlayfs file, whose f_inode is an overlayfs inode. Assigning an overlayfs file to swap_file results in an attempt of xfs code to dereference an xfs_inode struct from an ovl_inode pointer: CPU: 0 PID: 2462 Comm: swapon Not tainted 4.18.0-xfstests-12721-g33e17876ea4e #3402 RIP: 0010:xfs_find_bdev_for_inode+0x23/0x2f Call Trace: xfs_iomap_swapfile_activate+0x1f/0x43 __se_sys_swapon+0xb1a/0xee9 Fix this by not assigning the real inode mapping to f_mapping, which will cause swapon() to return an error (-EINVAL). Although it makes sense not to allow setting swpafile on an overlayfs file, some users may depend on it, so we may need to fix this up in the future. Keeping f_mapping pointing to overlay inode mapping will cause O_DIRECT open to fail. Fix this by installing ovl_aops with noop_direct_IO in overlay inode mapping. Keeping f_mapping pointing to overlay inode mapping will cause other a_ops related operations to fail (e.g. readahead()). Those will be fixed by follow up patches. Suggested-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: f7c72396d0de ("ovl: add O_DIRECT support") Signed-off-by: Amir Goldstein <amir73il@gmail.com> --- fs/overlayfs/file.c | 3 --- fs/overlayfs/inode.c | 6 ++++++ 2 files changed, 6 insertions(+), 3 deletions(-)