Message ID | 20210923023751.1441091-1-ming.lei@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [V2] block: hold ->invalidate_lock in blkdev_fallocate | expand |
On Thu 23-09-21 10:37:51, Ming Lei wrote: > When running ->fallocate(), blkdev_fallocate() should hold > mapping->invalidate_lock to prevent page cache from being accessed, > otherwise stale data may be read in page cache. > > Without this patch, blktests block/009 fails sometimes. With this patch, > block/009 can pass always. > > Also as Jan pointed out, no pages can be created in the discarded area > while you are holding the invalidate_lock, so remove the 2nd > truncate_bdev_range(). > > Cc: Jan Kara <jack@suse.cz> > Signed-off-by: Ming Lei <ming.lei@redhat.com> Looks good. Feel free to add: Reviewed-by: Jan Kara <jack@suse.cz> Honza > --- > V2: > - include <linux/fs.h> for avoiding implicit declaration of function > filemap_invalidate_lock > - remove 2nd truncate_bdev_range() as suggested by Jan > > block/fops.c | 21 ++++++++++----------- > 1 file changed, 10 insertions(+), 11 deletions(-) > > diff --git a/block/fops.c b/block/fops.c > index ffce6f6c68dd..1e970c247e0e 100644 > --- a/block/fops.c > +++ b/block/fops.c > @@ -14,6 +14,7 @@ > #include <linux/task_io_accounting_ops.h> > #include <linux/falloc.h> > #include <linux/suspend.h> > +#include <linux/fs.h> > #include "blk.h" > > static struct inode *bdev_file_inode(struct file *file) > @@ -553,7 +554,8 @@ static ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to) > static long blkdev_fallocate(struct file *file, int mode, loff_t start, > loff_t len) > { > - struct block_device *bdev = I_BDEV(bdev_file_inode(file)); > + struct inode *inode = bdev_file_inode(file); > + struct block_device *bdev = I_BDEV(inode); > loff_t end = start + len - 1; > loff_t isize; > int error; > @@ -580,10 +582,12 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start, > if ((start | len) & (bdev_logical_block_size(bdev) - 1)) > return -EINVAL; > > + filemap_invalidate_lock(inode->i_mapping); > + > /* Invalidate the page cache, including dirty pages. */ > error = truncate_bdev_range(bdev, file->f_mode, start, end); > if (error) > - return error; > + goto fail; > > switch (mode) { > case FALLOC_FL_ZERO_RANGE: > @@ -600,17 +604,12 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start, > GFP_KERNEL, 0); > break; > default: > - return -EOPNOTSUPP; > + error = -EOPNOTSUPP; > } > - if (error) > - return error; > > - /* > - * Invalidate the page cache again; if someone wandered in and dirtied > - * a page, we just discard it - userspace has no way of knowing whether > - * the write happened before or after discard completing... > - */ > - return truncate_bdev_range(bdev, file->f_mode, start, end); > + fail: > + filemap_invalidate_unlock(inode->i_mapping); > + return error; > } > > const struct file_operations def_blk_fops = { > -- > 2.31.1 >
On 9/22/21 8:37 PM, Ming Lei wrote: > When running ->fallocate(), blkdev_fallocate() should hold > mapping->invalidate_lock to prevent page cache from being accessed, > otherwise stale data may be read in page cache. > > Without this patch, blktests block/009 fails sometimes. With this patch, > block/009 can pass always. > > Also as Jan pointed out, no pages can be created in the discarded area > while you are holding the invalidate_lock, so remove the 2nd > truncate_bdev_range(). Applied, thanks.
On Sep 23, 2021 / 10:37, Ming Lei wrote: > When running ->fallocate(), blkdev_fallocate() should hold > mapping->invalidate_lock to prevent page cache from being accessed, > otherwise stale data may be read in page cache. > > Without this patch, blktests block/009 fails sometimes. With this patch, > block/009 can pass always. > > Also as Jan pointed out, no pages can be created in the discarded area > while you are holding the invalidate_lock, so remove the 2nd > truncate_bdev_range(). Hello Ming, Jan, thanks for the fix. Unfortunately, I still observe block/009 failure on the kernel version 5.15.0, which includes this fix patch. I found that BLKDISCARD ioctl has the same problem. I modified blk_ioctl_discard() in same manner, and the block/009 failure goes away. I also found that BLKZEROOUT has the same issue. I will post two patches for these ioctl. Your reviews will be appreciated.
diff --git a/block/fops.c b/block/fops.c index ffce6f6c68dd..1e970c247e0e 100644 --- a/block/fops.c +++ b/block/fops.c @@ -14,6 +14,7 @@ #include <linux/task_io_accounting_ops.h> #include <linux/falloc.h> #include <linux/suspend.h> +#include <linux/fs.h> #include "blk.h" static struct inode *bdev_file_inode(struct file *file) @@ -553,7 +554,8 @@ static ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to) static long blkdev_fallocate(struct file *file, int mode, loff_t start, loff_t len) { - struct block_device *bdev = I_BDEV(bdev_file_inode(file)); + struct inode *inode = bdev_file_inode(file); + struct block_device *bdev = I_BDEV(inode); loff_t end = start + len - 1; loff_t isize; int error; @@ -580,10 +582,12 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start, if ((start | len) & (bdev_logical_block_size(bdev) - 1)) return -EINVAL; + filemap_invalidate_lock(inode->i_mapping); + /* Invalidate the page cache, including dirty pages. */ error = truncate_bdev_range(bdev, file->f_mode, start, end); if (error) - return error; + goto fail; switch (mode) { case FALLOC_FL_ZERO_RANGE: @@ -600,17 +604,12 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start, GFP_KERNEL, 0); break; default: - return -EOPNOTSUPP; + error = -EOPNOTSUPP; } - if (error) - return error; - /* - * Invalidate the page cache again; if someone wandered in and dirtied - * a page, we just discard it - userspace has no way of knowing whether - * the write happened before or after discard completing... - */ - return truncate_bdev_range(bdev, file->f_mode, start, end); + fail: + filemap_invalidate_unlock(inode->i_mapping); + return error; } const struct file_operations def_blk_fops = {
When running ->fallocate(), blkdev_fallocate() should hold mapping->invalidate_lock to prevent page cache from being accessed, otherwise stale data may be read in page cache. Without this patch, blktests block/009 fails sometimes. With this patch, block/009 can pass always. Also as Jan pointed out, no pages can be created in the discarded area while you are holding the invalidate_lock, so remove the 2nd truncate_bdev_range(). Cc: Jan Kara <jack@suse.cz> Signed-off-by: Ming Lei <ming.lei@redhat.com> --- V2: - include <linux/fs.h> for avoiding implicit declaration of function filemap_invalidate_lock - remove 2nd truncate_bdev_range() as suggested by Jan block/fops.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-)