loop: drop caches if offset is changed

Message ID	20181214203223.7063-1-jaegeuk@kernel.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-block-owner@kernel.org> From: Jaegeuk Kim <jaegeuk@kernel.org> To: linux-kernel@vger.kernel.org Cc: Jaegeuk Kim <jaegeuk@kernel.org>, Jens Axboe <axboe@kernel.dk>, linux-block@vger.kernel.org Subject: [PATCH] loop: drop caches if offset is changed Date: Fri, 14 Dec 2018 12:32:23 -0800 Message-Id: <20181214203223.7063-1-jaegeuk@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-block-owner@vger.kernel.org Precedence: bulk
Series	loop: drop caches if offset is changed \| expand loop: drop caches if offset is changed

Jaegeuk Kim Dec. 14, 2018, 8:32 p.m. UTC

If we don't drop caches used in old offset, we can get old data from new
offset, which gives unexpected data to user.

Martijn found a loopback bug in the below scenario.
1) LOOP_SET_FD loads first two pages on loop file
2) LOOP_SET_STATUS64 changes the offset on the loop file
3) mount is failed due to the cached pages having wrong superblock

This patch drops caches when we change lo->offset.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Reported-by: Martijn Coenen <maco@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 drivers/block/loop.c | 6 ++++++
 1 file changed, 6 insertions(+)

Jaegeuk Kim Dec. 17, 2018, 7:42 p.m. UTC | #1

If we don't drop caches used in old offset or block_size, we can get old data
from new offset/block_size, which gives unexpected data to user.

For example, Martijn found a loopback bug in the below scenario.
1) LOOP_SET_FD loads first two pages on loop file
2) LOOP_SET_STATUS64 changes the offset on the loop file
3) mount is failed due to the cached pages having wrong superblock

Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Reported-by: Martijn Coenen <maco@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---

v1 to v2:
 - cover block_size change

 drivers/block/loop.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index cb0cc8685076..382557c81674 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1154,6 +1154,12 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 
 	if (lo->lo_offset != info->lo_offset ||
 	    lo->lo_sizelimit != info->lo_sizelimit) {
+		struct block_device *bdev = lo->lo_device;
+
+		/* drop stale caches used in old offset */
+		sync_blockdev(bdev);
+		kill_bdev(bdev);
+
 		if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit)) {
 			err = -EFBIG;
 			goto exit;
@@ -1388,6 +1394,15 @@ static int loop_set_block_size(struct loop_device *lo, unsigned long arg)
 	blk_queue_io_min(lo->lo_queue, arg);
 	loop_update_dio(lo);
 
+	/* Don't change the size if it is same as current */
+	if (lo->lo_queue->limits.logical_block_size != arg) {
+		struct block_device *bdev = lo->lo_device;
+
+		/* drop stale caches likewise set_blocksize */
+		sync_blockdev(bdev);
+		kill_bdev(bdev);
+	}
+
 	blk_mq_unfreeze_queue(lo->lo_queue);
 
 	return 0;

Jens Axboe Dec. 18, 2018, 5:48 p.m. UTC | #2

On 12/17/18 12:42 PM, Jaegeuk Kim wrote:
> If we don't drop caches used in old offset or block_size, we can get old data
> from new offset/block_size, which gives unexpected data to user.
> 
> For example, Martijn found a loopback bug in the below scenario.
> 1) LOOP_SET_FD loads first two pages on loop file
> 2) LOOP_SET_STATUS64 changes the offset on the loop file
> 3) mount is failed due to the cached pages having wrong superblock
> 
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: linux-block@vger.kernel.org
> Reported-by: Martijn Coenen <maco@google.com>
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
> 
> v1 to v2:
>  - cover block_size change
> 
>  drivers/block/loop.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index cb0cc8685076..382557c81674 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1154,6 +1154,12 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
>  
>  	if (lo->lo_offset != info->lo_offset ||
>  	    lo->lo_sizelimit != info->lo_sizelimit) {
> +		struct block_device *bdev = lo->lo_device;
> +
> +		/* drop stale caches used in old offset */
> +		sync_blockdev(bdev);
> +		kill_bdev(bdev);
> +
>  		if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit)) {
>  			err = -EFBIG;
>  			goto exit;
> @@ -1388,6 +1394,15 @@ static int loop_set_block_size(struct loop_device *lo, unsigned long arg)
>  	blk_queue_io_min(lo->lo_queue, arg);
>  	loop_update_dio(lo);
>  
> +	/* Don't change the size if it is same as current */
> +	if (lo->lo_queue->limits.logical_block_size != arg) {
> +		struct block_device *bdev = lo->lo_device;
> +
> +		/* drop stale caches likewise set_blocksize */
> +		sync_blockdev(bdev);
> +		kill_bdev(bdev);
> +	}
> +
>  	blk_mq_unfreeze_queue(lo->lo_queue);
>  
>  	return 0;

Looks fine to me, my only worry would be verifying that we're fine calling
sync/kill from those contexts. The queue is frozen at this point, what
happens if we do need to flush out dirty data?

Jaegeuk Kim Dec. 18, 2018, 10:41 p.m. UTC | #3

If we don't drop caches used in old offset or block_size, we can get old data
from new offset/block_size, which gives unexpected data to user.

For example, Martijn found a loopback bug in the below scenario.
1) LOOP_SET_FD loads first two pages on loop file
2) LOOP_SET_STATUS64 changes the offset on the loop file
3) mount is failed due to the cached pages having wrong superblock

Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Reported-by: Martijn Coenen <maco@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
v2 -> v3:
 - avoid to submit IOs on frozen mq

Jens, how about this?

Thanks,

 drivers/block/loop.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index cb0cc8685076..6b03121d62aa 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1126,6 +1126,12 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 	if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE)
 		return -EINVAL;
 
+	if (lo->lo_offset != info->lo_offset ||
+	    lo->lo_sizelimit != info->lo_sizelimit) {
+		sync_blockdev(lo->lo_device);
+		kill_bdev(lo->lo_device);
+	}
+
 	/* I/O need to be drained during transfer transition */
 	blk_mq_freeze_queue(lo->lo_queue);
 
@@ -1154,6 +1160,11 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 
 	if (lo->lo_offset != info->lo_offset ||
 	    lo->lo_sizelimit != info->lo_sizelimit) {
+		/* kill_bdev should have truncated all the pages */
+		if (lo->lo_device->bd_inode->i_mapping->nrpages) {
+			err = -EAGAIN;
+			goto exit;
+		}
 		if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit)) {
 			err = -EFBIG;
 			goto exit;
@@ -1375,22 +1386,36 @@ static int loop_set_dio(struct loop_device *lo, unsigned long arg)
 
 static int loop_set_block_size(struct loop_device *lo, unsigned long arg)
 {
+	int err = 0;
+
 	if (lo->lo_state != Lo_bound)
 		return -ENXIO;
 
 	if (arg < 512 || arg > PAGE_SIZE || !is_power_of_2(arg))
 		return -EINVAL;
 
+	if (lo->lo_queue->limits.logical_block_size != arg) {
+		sync_blockdev(lo->lo_device);
+		kill_bdev(lo->lo_device);
+	}
+
 	blk_mq_freeze_queue(lo->lo_queue);
 
+	/* kill_bdev should have truncated all the pages */
+	if (lo->lo_queue->limits.logical_block_size != arg &&
+			lo->lo_device->bd_inode->i_mapping->nrpages) {
+		err = -EAGAIN;
+		goto out;
+	}
+
 	blk_queue_logical_block_size(lo->lo_queue, arg);
 	blk_queue_physical_block_size(lo->lo_queue, arg);
 	blk_queue_io_min(lo->lo_queue, arg);
 	loop_update_dio(lo);
-
+out:
 	blk_mq_unfreeze_queue(lo->lo_queue);
 
-	return 0;
+	return err;
 }
 
 static int lo_ioctl(struct block_device *bdev, fmode_t mode,

Jaegeuk Kim Jan. 9, 2019, 5:22 a.m. UTC | #4

Hi Jens,

May I ask whether this patch is acceptable?

Thanks,

On 12/18, Jaegeuk Kim wrote:
> If we don't drop caches used in old offset or block_size, we can get old data
> from new offset/block_size, which gives unexpected data to user.
> 
> For example, Martijn found a loopback bug in the below scenario.
> 1) LOOP_SET_FD loads first two pages on loop file
> 2) LOOP_SET_STATUS64 changes the offset on the loop file
> 3) mount is failed due to the cached pages having wrong superblock
> 
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: linux-block@vger.kernel.org
> Reported-by: Martijn Coenen <maco@google.com>
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
> v2 -> v3:
>  - avoid to submit IOs on frozen mq
> 
> Jens, how about this?
> 
> Thanks,
> 
>  drivers/block/loop.c | 29 +++++++++++++++++++++++++++--
>  1 file changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index cb0cc8685076..6b03121d62aa 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1126,6 +1126,12 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
>  	if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE)
>  		return -EINVAL;
>  
> +	if (lo->lo_offset != info->lo_offset ||
> +	    lo->lo_sizelimit != info->lo_sizelimit) {
> +		sync_blockdev(lo->lo_device);
> +		kill_bdev(lo->lo_device);
> +	}
> +
>  	/* I/O need to be drained during transfer transition */
>  	blk_mq_freeze_queue(lo->lo_queue);
>  
> @@ -1154,6 +1160,11 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
>  
>  	if (lo->lo_offset != info->lo_offset ||
>  	    lo->lo_sizelimit != info->lo_sizelimit) {
> +		/* kill_bdev should have truncated all the pages */
> +		if (lo->lo_device->bd_inode->i_mapping->nrpages) {
> +			err = -EAGAIN;
> +			goto exit;
> +		}
>  		if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit)) {
>  			err = -EFBIG;
>  			goto exit;
> @@ -1375,22 +1386,36 @@ static int loop_set_dio(struct loop_device *lo, unsigned long arg)
>  
>  static int loop_set_block_size(struct loop_device *lo, unsigned long arg)
>  {
> +	int err = 0;
> +
>  	if (lo->lo_state != Lo_bound)
>  		return -ENXIO;
>  
>  	if (arg < 512 || arg > PAGE_SIZE || !is_power_of_2(arg))
>  		return -EINVAL;
>  
> +	if (lo->lo_queue->limits.logical_block_size != arg) {
> +		sync_blockdev(lo->lo_device);
> +		kill_bdev(lo->lo_device);
> +	}
> +
>  	blk_mq_freeze_queue(lo->lo_queue);
>  
> +	/* kill_bdev should have truncated all the pages */
> +	if (lo->lo_queue->limits.logical_block_size != arg &&
> +			lo->lo_device->bd_inode->i_mapping->nrpages) {
> +		err = -EAGAIN;
> +		goto out;
> +	}
> +
>  	blk_queue_logical_block_size(lo->lo_queue, arg);
>  	blk_queue_physical_block_size(lo->lo_queue, arg);
>  	blk_queue_io_min(lo->lo_queue, arg);
>  	loop_update_dio(lo);
> -
> +out:
>  	blk_mq_unfreeze_queue(lo->lo_queue);
>  
> -	return 0;
> +	return err;
>  }
>  
>  static int lo_ioctl(struct block_device *bdev, fmode_t mode,
> -- 
> 2.19.0.605.g01d371f741-goog

Bart Van Assche Jan. 9, 2019, 8:51 p.m. UTC | #5

On Tue, 2018-12-18 at 14:41 -0800, Jaegeuk Kim wrote:
> [ ... ]

Please post new versions of a patch as a new e-mail thread instead of
as a reply to a previous e-mail.

> [ ... ]
>
>  	if (lo->lo_offset != info->lo_offset ||
>  	    lo->lo_sizelimit != info->lo_sizelimit) {
> +		/* kill_bdev should have truncated all the pages */
> +		if (lo->lo_device->bd_inode->i_mapping->nrpages) {
> +			err = -EAGAIN;
> +			goto exit;
> +		}

Please add a pr_info() or pr_warn() statement here such that it becomes
easy for the user to figure out why EAGAIN has been returned.

>  	blk_mq_freeze_queue(lo->lo_queue);
>  
> +	/* kill_bdev should have truncated all the pages */
> +	if (lo->lo_queue->limits.logical_block_size != arg &&
> +			lo->lo_device->bd_inode->i_mapping->nrpages) {
> +		err = -EAGAIN;
> +		goto out;
> +	}

Same comment here. Additionally, please consider renaming the "out" label
into "unfreeze" or so. I think that will make the use of label names more
consistent with the rest of the block layer. Once these two comments are
addressed, feel free to add:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

loop: drop caches if offset is changed

Commit Message

Comments

Patch