Message ID | 20181030230624.61834-3-evgreen@chromium.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | loop: Better discard for block devices | expand |
On Tue, Oct 30, 2018 at 4:06 PM Evan Green <evgreen@chromium.org> wrote: > > If the backing device for a loop device is a block device, > then mirror the discard properties of the underlying block > device into the loop device. While in there, differentiate > between REQ_OP_DISCARD and REQ_OP_WRITE_ZEROES, which are > different for block devices, but which the loop device had > just been lumping together. > > Signed-off-by: Evan Green <evgreen@chromium.org> Any thoughts on this patch? This fixes issues for us when using a loop device backed by a block device, where we see many logs like: [ 372.767286] print_req_error: I/O error, dev loop5, sector 88125696 -Evan
On Tue, Nov 27, 2018 at 2:55 AM Evan Green <evgreen@chromium.org> wrote: > > On Tue, Oct 30, 2018 at 4:06 PM Evan Green <evgreen@chromium.org> wrote: > > > > If the backing device for a loop device is a block device, This shouldn't be a very common use case wrt. loop. > > then mirror the discard properties of the underlying block > > device into the loop device. While in there, differentiate > > between REQ_OP_DISCARD and REQ_OP_WRITE_ZEROES, which are > > different for block devices, but which the loop device had > > just been lumping together. > > > > Signed-off-by: Evan Green <evgreen@chromium.org> > > Any thoughts on this patch? This fixes issues for us when using a loop > device backed by a block device, where we see many logs like: > > [ 372.767286] print_req_error: I/O error, dev loop5, sector 88125696 Seems not see any explanation about this IO error and the fix in your patch. Could you describe it a bit more? thanks, Ming Lei
Hi Ming, On Mon, Nov 26, 2018 at 6:55 PM Ming Lei <tom.leiming@gmail.com> wrote: > > On Tue, Nov 27, 2018 at 2:55 AM Evan Green <evgreen@chromium.org> wrote: > > > > On Tue, Oct 30, 2018 at 4:06 PM Evan Green <evgreen@chromium.org> wrote: > > > > > > If the backing device for a loop device is a block device, > > This shouldn't be a very common use case wrt. loop. Yeah, I'm starting to gather that. Or maybe I'm just the first one to mention it on the kernel lists ;) We've used this in our Chrome OS installer, I believe for many years. Gwendal piped in with a few reasons we do it this way on the cover letter, but in general I think it allows us to have a unified set of functions to install to a file, disk, or prepare an image that may have a different block size than those on the running system. > > > > then mirror the discard properties of the underlying block > > > device into the loop device. While in there, differentiate > > > between REQ_OP_DISCARD and REQ_OP_WRITE_ZEROES, which are > > > different for block devices, but which the loop device had > > > just been lumping together. > > > > > > Signed-off-by: Evan Green <evgreen@chromium.org> > > > > Any thoughts on this patch? This fixes issues for us when using a loop > > device backed by a block device, where we see many logs like: > > > > [ 372.767286] print_req_error: I/O error, dev loop5, sector 88125696 > > Seems not see any explanation about this IO error and the fix in your patch. > Could you describe it a bit more? Sure, I probably should have included more context with the series. The loop device always reports that it supports discard, by setting up the max_discard_sectors and max_write_zeroes_sectors in the blk queue. When the loop device gets a discard or write-zeroes request, it turns around and calls fallocate on the underlying device with the PUNCH_HOLE flag. This makes sense when you're backed by a file and hoping to just deallocate the space, but may fail when you're backed by a block device that doesn't support discard, or doesn't write zeroes to discarded sectors. Weirdly, lo_discard already had some code for preserving EOPNOTSUPP, but then later the error is smashed into EIO. Patch 1 pipes out EOPNOTSUPP properly, so it doesn't get squashed into EIO. Patch 2 reflects the discard characteristics of the underlying device into the loop device. That way, if you're backed by a file or a block device that does support discard, everything works great, and user mode can even see and use the correct discard and write zero granularities. If you're backed by a block device that does not support discard, this is exposed to user mode, which then usually avoids calling fallocate, and doesn't feel betrayed that their requests are unexpectedly failing. -Evan
On Tue, Oct 30, 2018 at 04:06:24PM -0700, Evan Green wrote: > If the backing device for a loop device is a block device, > then mirror the discard properties of the underlying block > device into the loop device. While in there, differentiate > between REQ_OP_DISCARD and REQ_OP_WRITE_ZEROES, which are > different for block devices, but which the loop device had > just been lumping together. > > Signed-off-by: Evan Green <evgreen@chromium.org> > --- > > drivers/block/loop.c | 61 +++++++++++++++++++++++++++++++++++----------------- > 1 file changed, 41 insertions(+), 20 deletions(-) > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > index 28990fc94841a..176e65101c4ef 100644 > --- a/drivers/block/loop.c > +++ b/drivers/block/loop.c > @@ -417,19 +417,14 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq, > return ret; > } > > -static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos) > +static int lo_discard(struct loop_device *lo, struct request *rq, > + int mode, loff_t pos) > { > - /* > - * We use punch hole to reclaim the free space used by the > - * image a.k.a. discard. However we do not support discard if > - * encryption is enabled, because it may give an attacker > - * useful information. > - */ > struct file *file = lo->lo_backing_file; > - int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE; > + struct request_queue *q = lo->lo_queue; > int ret; > > - if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { > + if (!blk_queue_discard(q)) { > ret = -EOPNOTSUPP; > goto out; > } > @@ -603,8 +598,13 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) > case REQ_OP_FLUSH: > return lo_req_flush(lo, rq); > case REQ_OP_DISCARD: > + return lo_discard(lo, rq, > + FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, pos); > + > case REQ_OP_WRITE_ZEROES: > - return lo_discard(lo, rq, pos); > + return lo_discard(lo, rq, > + FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE, pos); > + > case REQ_OP_WRITE: > if (lo->transfer) > return lo_write_transfer(lo, rq, pos); > @@ -859,6 +859,25 @@ static void loop_config_discard(struct loop_device *lo) > struct file *file = lo->lo_backing_file; > struct inode *inode = file->f_mapping->host; > struct request_queue *q = lo->lo_queue; > + struct request_queue *backingq; > + > + /* > + * If the backing device is a block device, mirror its discard > + * capabilities. > + */ > + if (S_ISBLK(inode->i_mode)) { > + backingq = bdev_get_queue(inode->i_bdev); > + blk_queue_max_discard_sectors(q, > + backingq->limits.max_discard_sectors); > + > + blk_queue_max_write_zeroes_sectors(q, > + backingq->limits.max_write_zeroes_sectors); > + > + q->limits.discard_granularity = > + backingq->limits.discard_granularity; > + > + q->limits.discard_alignment = > + backingq->limits.discard_alignment; I think it isn't necessary to mirror backing queue's discard/write_zeros capabilities, given either fs of the underlying queue can deal with well. > > /* > * We use punch hole to reclaim the free space used by the > @@ -866,22 +885,24 @@ static void loop_config_discard(struct loop_device *lo) > * encryption is enabled, because it may give an attacker > * useful information. > */ > - if ((!file->f_op->fallocate) || > - lo->lo_encrypt_key_size) { > + } else if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { > q->limits.discard_granularity = 0; > q->limits.discard_alignment = 0; > blk_queue_max_discard_sectors(q, 0); > blk_queue_max_write_zeroes_sectors(q, 0); > - blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); > - return; > - } > > - q->limits.discard_granularity = inode->i_sb->s_blocksize; > - q->limits.discard_alignment = 0; > + } else { > + q->limits.discard_granularity = inode->i_sb->s_blocksize; > + q->limits.discard_alignment = 0; > + > + blk_queue_max_discard_sectors(q, UINT_MAX >> 9); > + blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); > + } > > - blk_queue_max_discard_sectors(q, UINT_MAX >> 9); > - blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); > - blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); > + if (q->limits.max_discard_sectors || q->limits.max_write_zeroes_sectors) > + blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); > + else > + blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); > } Looks it should work just by mirroring backing queue's discard capability to loop queue in case that the loop is backed by block device, doesn't it? Meantime the unified discard limits & write_zeros limits can be kept. thanks, Ming
On Tue, Nov 27, 2018 at 03:34:04PM -0800, Evan Green wrote: > Hi Ming, > > On Mon, Nov 26, 2018 at 6:55 PM Ming Lei <tom.leiming@gmail.com> wrote: > > > > On Tue, Nov 27, 2018 at 2:55 AM Evan Green <evgreen@chromium.org> wrote: > > > > > > On Tue, Oct 30, 2018 at 4:06 PM Evan Green <evgreen@chromium.org> wrote: > > > > > > > > If the backing device for a loop device is a block device, > > > > This shouldn't be a very common use case wrt. loop. > > Yeah, I'm starting to gather that. Or maybe I'm just the first one to > mention it on the kernel lists ;) We've used this in our Chrome OS > installer, I believe for many years. Gwendal piped in with a few > reasons we do it this way on the cover letter, but in general I think > it allows us to have a unified set of functions to install to a file, > disk, or prepare an image that may have a different block size than > those on the running system. OK, got it, it makes sense. > > > > > > > then mirror the discard properties of the underlying block > > > > device into the loop device. While in there, differentiate > > > > between REQ_OP_DISCARD and REQ_OP_WRITE_ZEROES, which are > > > > different for block devices, but which the loop device had > > > > just been lumping together. > > > > > > > > Signed-off-by: Evan Green <evgreen@chromium.org> > > > > > > Any thoughts on this patch? This fixes issues for us when using a loop > > > device backed by a block device, where we see many logs like: > > > > > > [ 372.767286] print_req_error: I/O error, dev loop5, sector 88125696 > > > > Seems not see any explanation about this IO error and the fix in your patch. > > Could you describe it a bit more? > > Sure, I probably should have included more context with the series. > > The loop device always reports that it supports discard, by setting up > the max_discard_sectors and max_write_zeroes_sectors in the blk queue. > When the loop device gets a discard or write-zeroes request, it turns > around and calls fallocate on the underlying device with the > PUNCH_HOLE flag. This makes sense when you're backed by a file and > hoping to just deallocate the space, but may fail when you're backed > by a block device that doesn't support discard, or doesn't write > zeroes to discarded sectors. Weirdly, lo_discard already had some code > for preserving EOPNOTSUPP, but then later the error is smashed into > EIO. Patch 1 pipes out EOPNOTSUPP properly, so it doesn't get squashed > into EIO. > > Patch 2 reflects the discard characteristics of the underlying device > into the loop device. That way, if you're backed by a file or a block > device that does support discard, everything works great, and user > mode can even see and use the correct discard and write zero > granularities. If you're backed by a block device that does not > support discard, this is exposed to user mode, which then usually > avoids calling fallocate, and doesn't feel betrayed that their > requests are unexpectedly failing. Thanks for your detailed explanation, and I think we need to fix it. Thanks, Ming
Hi Ming, On Tue, Nov 27, 2018 at 5:26 PM Ming Lei <ming.lei@redhat.com> wrote: > > On Tue, Oct 30, 2018 at 04:06:24PM -0700, Evan Green wrote: > > If the backing device for a loop device is a block device, > > then mirror the discard properties of the underlying block > > device into the loop device. While in there, differentiate > > between REQ_OP_DISCARD and REQ_OP_WRITE_ZEROES, which are > > different for block devices, but which the loop device had > > just been lumping together. > > > > Signed-off-by: Evan Green <evgreen@chromium.org> > > --- > > > > drivers/block/loop.c | 61 +++++++++++++++++++++++++++++++++++----------------- > > 1 file changed, 41 insertions(+), 20 deletions(-) > > > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > > index 28990fc94841a..176e65101c4ef 100644 > > --- a/drivers/block/loop.c > > +++ b/drivers/block/loop.c > > @@ -417,19 +417,14 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq, > > return ret; > > } > > > > -static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos) > > +static int lo_discard(struct loop_device *lo, struct request *rq, > > + int mode, loff_t pos) > > { > > - /* > > - * We use punch hole to reclaim the free space used by the > > - * image a.k.a. discard. However we do not support discard if > > - * encryption is enabled, because it may give an attacker > > - * useful information. > > - */ > > struct file *file = lo->lo_backing_file; > > - int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE; > > + struct request_queue *q = lo->lo_queue; > > int ret; > > > > - if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { > > + if (!blk_queue_discard(q)) { > > ret = -EOPNOTSUPP; > > goto out; > > } > > @@ -603,8 +598,13 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) > > case REQ_OP_FLUSH: > > return lo_req_flush(lo, rq); > > case REQ_OP_DISCARD: > > + return lo_discard(lo, rq, > > + FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, pos); > > + > > case REQ_OP_WRITE_ZEROES: > > - return lo_discard(lo, rq, pos); > > + return lo_discard(lo, rq, > > + FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE, pos); > > + > > case REQ_OP_WRITE: > > if (lo->transfer) > > return lo_write_transfer(lo, rq, pos); > > @@ -859,6 +859,25 @@ static void loop_config_discard(struct loop_device *lo) > > struct file *file = lo->lo_backing_file; > > struct inode *inode = file->f_mapping->host; > > struct request_queue *q = lo->lo_queue; > > + struct request_queue *backingq; > > + > > + /* > > + * If the backing device is a block device, mirror its discard > > + * capabilities. > > + */ > > + if (S_ISBLK(inode->i_mode)) { > > + backingq = bdev_get_queue(inode->i_bdev); > > + blk_queue_max_discard_sectors(q, > > + backingq->limits.max_discard_sectors); > > + > > + blk_queue_max_write_zeroes_sectors(q, > > + backingq->limits.max_write_zeroes_sectors); > > + > > + q->limits.discard_granularity = > > + backingq->limits.discard_granularity; > > + > > + q->limits.discard_alignment = > > + backingq->limits.discard_alignment; > > I think it isn't necessary to mirror backing queue's discard/write_zeros > capabilities, given either fs of the underlying queue can deal with well. > > > > > /* > > * We use punch hole to reclaim the free space used by the > > @@ -866,22 +885,24 @@ static void loop_config_discard(struct loop_device *lo) > > * encryption is enabled, because it may give an attacker > > * useful information. > > */ > > - if ((!file->f_op->fallocate) || > > - lo->lo_encrypt_key_size) { > > + } else if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { > > q->limits.discard_granularity = 0; > > q->limits.discard_alignment = 0; > > blk_queue_max_discard_sectors(q, 0); > > blk_queue_max_write_zeroes_sectors(q, 0); > > - blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); > > - return; > > - } > > > > - q->limits.discard_granularity = inode->i_sb->s_blocksize; > > - q->limits.discard_alignment = 0; > > + } else { > > + q->limits.discard_granularity = inode->i_sb->s_blocksize; > > + q->limits.discard_alignment = 0; > > + > > + blk_queue_max_discard_sectors(q, UINT_MAX >> 9); > > + blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); > > + } > > > > - blk_queue_max_discard_sectors(q, UINT_MAX >> 9); > > - blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); > > - blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); > > + if (q->limits.max_discard_sectors || q->limits.max_write_zeroes_sectors) > > + blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); > > + else > > + blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); > > } > > Looks it should work just by mirroring backing queue's discard > capability to loop queue in case that the loop is backed by > block device, doesn't it? Meantime the unified discard limits & > write_zeros limits can be kept. I tested this out, and you're right that I could just flip the QUEUE_FLAG_DISCARD based on whether its a block device, and leave everything else alone, to completely disable discard support for loop devices backed by block devices. This seems to work for programs like mkfs.ext4, but still leaves problems for coreutils cp. But is that really the right call? With this change, we're not only able to use loop devices in this way, but we're able to use the discard and zero functionality of the underlying block device by simply passing it through. To me that seemed better than disabling all discard support for block devices, which would severely slow us down on some devices. -Evan
On Tue, Dec 04, 2018 at 02:19:46PM -0800, Evan Green wrote: > Hi Ming, > > On Tue, Nov 27, 2018 at 5:26 PM Ming Lei <ming.lei@redhat.com> wrote: > > > > On Tue, Oct 30, 2018 at 04:06:24PM -0700, Evan Green wrote: > > > If the backing device for a loop device is a block device, > > > then mirror the discard properties of the underlying block > > > device into the loop device. While in there, differentiate > > > between REQ_OP_DISCARD and REQ_OP_WRITE_ZEROES, which are > > > different for block devices, but which the loop device had > > > just been lumping together. > > > > > > Signed-off-by: Evan Green <evgreen@chromium.org> > > > --- > > > > > > drivers/block/loop.c | 61 +++++++++++++++++++++++++++++++++++----------------- > > > 1 file changed, 41 insertions(+), 20 deletions(-) > > > > > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > > > index 28990fc94841a..176e65101c4ef 100644 > > > --- a/drivers/block/loop.c > > > +++ b/drivers/block/loop.c > > > @@ -417,19 +417,14 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq, > > > return ret; > > > } > > > > > > -static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos) > > > +static int lo_discard(struct loop_device *lo, struct request *rq, > > > + int mode, loff_t pos) > > > { > > > - /* > > > - * We use punch hole to reclaim the free space used by the > > > - * image a.k.a. discard. However we do not support discard if > > > - * encryption is enabled, because it may give an attacker > > > - * useful information. > > > - */ > > > struct file *file = lo->lo_backing_file; > > > - int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE; > > > + struct request_queue *q = lo->lo_queue; > > > int ret; > > > > > > - if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { > > > + if (!blk_queue_discard(q)) { > > > ret = -EOPNOTSUPP; > > > goto out; > > > } > > > @@ -603,8 +598,13 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) > > > case REQ_OP_FLUSH: > > > return lo_req_flush(lo, rq); > > > case REQ_OP_DISCARD: > > > + return lo_discard(lo, rq, > > > + FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, pos); > > > + > > > case REQ_OP_WRITE_ZEROES: > > > - return lo_discard(lo, rq, pos); > > > + return lo_discard(lo, rq, > > > + FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE, pos); > > > + > > > case REQ_OP_WRITE: > > > if (lo->transfer) > > > return lo_write_transfer(lo, rq, pos); > > > @@ -859,6 +859,25 @@ static void loop_config_discard(struct loop_device *lo) > > > struct file *file = lo->lo_backing_file; > > > struct inode *inode = file->f_mapping->host; > > > struct request_queue *q = lo->lo_queue; > > > + struct request_queue *backingq; > > > + > > > + /* > > > + * If the backing device is a block device, mirror its discard > > > + * capabilities. > > > + */ > > > + if (S_ISBLK(inode->i_mode)) { > > > + backingq = bdev_get_queue(inode->i_bdev); > > > + blk_queue_max_discard_sectors(q, > > > + backingq->limits.max_discard_sectors); > > > + > > > + blk_queue_max_write_zeroes_sectors(q, > > > + backingq->limits.max_write_zeroes_sectors); > > > + > > > + q->limits.discard_granularity = > > > + backingq->limits.discard_granularity; > > > + > > > + q->limits.discard_alignment = > > > + backingq->limits.discard_alignment; > > > > I think it isn't necessary to mirror backing queue's discard/write_zeros > > capabilities, given either fs of the underlying queue can deal with well. > > > > > > > > /* > > > * We use punch hole to reclaim the free space used by the > > > @@ -866,22 +885,24 @@ static void loop_config_discard(struct loop_device *lo) > > > * encryption is enabled, because it may give an attacker > > > * useful information. > > > */ > > > - if ((!file->f_op->fallocate) || > > > - lo->lo_encrypt_key_size) { > > > + } else if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { > > > q->limits.discard_granularity = 0; > > > q->limits.discard_alignment = 0; > > > blk_queue_max_discard_sectors(q, 0); > > > blk_queue_max_write_zeroes_sectors(q, 0); > > > - blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); > > > - return; > > > - } > > > > > > - q->limits.discard_granularity = inode->i_sb->s_blocksize; > > > - q->limits.discard_alignment = 0; > > > + } else { > > > + q->limits.discard_granularity = inode->i_sb->s_blocksize; > > > + q->limits.discard_alignment = 0; > > > + > > > + blk_queue_max_discard_sectors(q, UINT_MAX >> 9); > > > + blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); > > > + } > > > > > > - blk_queue_max_discard_sectors(q, UINT_MAX >> 9); > > > - blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); > > > - blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); > > > + if (q->limits.max_discard_sectors || q->limits.max_write_zeroes_sectors) > > > + blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); > > > + else > > > + blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); > > > } > > > > Looks it should work just by mirroring backing queue's discard > > capability to loop queue in case that the loop is backed by > > block device, doesn't it? Meantime the unified discard limits & > > write_zeros limits can be kept. > > I tested this out, and you're right that I could just flip the > QUEUE_FLAG_DISCARD based on whether its a block device, and leave What I meant actually is to do the following discard config: bool discard; if (S_ISBLK(inode->i_mode)) { struct request_queue *backingq = bdev_get_queue(inode->i_bdev); discard = blk_queue_discard(backingq); } else if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) discard = false; else discard = true; if (discard) { blk_queue_max_discard_sectors(q, UINT_MAX >> 9); blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); } else { blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); } > everything else alone, to completely disable discard support for loop > devices backed by block devices. This seems to work for programs like > mkfs.ext4, but still leaves problems for coreutils cp. > > But is that really the right call? With this change, we're not only > able to use loop devices in this way, but we're able to use the > discard and zero functionality of the underlying block device by > simply passing it through. To me that seemed better than disabling all > discard support for block devices, which would severely slow us down > on some devices. I guess the above approach can do the same job with yours, but simpler. thanks, Ming
Hi Ming, On Tue, Dec 4, 2018 at 5:11 PM Ming Lei <ming.lei@redhat.com> wrote: > > On Tue, Dec 04, 2018 at 02:19:46PM -0800, Evan Green wrote: > > Hi Ming, > > > > On Tue, Nov 27, 2018 at 5:26 PM Ming Lei <ming.lei@redhat.com> wrote: > > > > > > On Tue, Oct 30, 2018 at 04:06:24PM -0700, Evan Green wrote: > > > > If the backing device for a loop device is a block device, > > > > then mirror the discard properties of the underlying block > > > > device into the loop device. While in there, differentiate > > > > between REQ_OP_DISCARD and REQ_OP_WRITE_ZEROES, which are > > > > different for block devices, but which the loop device had > > > > just been lumping together. > > > > > > > > Signed-off-by: Evan Green <evgreen@chromium.org> > > > > --- > > > > > > > > drivers/block/loop.c | 61 +++++++++++++++++++++++++++++++++++----------------- > > > > 1 file changed, 41 insertions(+), 20 deletions(-) > > > > > > > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > > > > index 28990fc94841a..176e65101c4ef 100644 > > > > --- a/drivers/block/loop.c > > > > +++ b/drivers/block/loop.c > > > > @@ -417,19 +417,14 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq, > > > > return ret; > > > > } > > > > > > > > -static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos) > > > > +static int lo_discard(struct loop_device *lo, struct request *rq, > > > > + int mode, loff_t pos) > > > > { > > > > - /* > > > > - * We use punch hole to reclaim the free space used by the > > > > - * image a.k.a. discard. However we do not support discard if > > > > - * encryption is enabled, because it may give an attacker > > > > - * useful information. > > > > - */ > > > > struct file *file = lo->lo_backing_file; > > > > - int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE; > > > > + struct request_queue *q = lo->lo_queue; > > > > int ret; > > > > > > > > - if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { > > > > + if (!blk_queue_discard(q)) { > > > > ret = -EOPNOTSUPP; > > > > goto out; > > > > } > > > > @@ -603,8 +598,13 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) > > > > case REQ_OP_FLUSH: > > > > return lo_req_flush(lo, rq); > > > > case REQ_OP_DISCARD: > > > > + return lo_discard(lo, rq, > > > > + FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, pos); > > > > + > > > > case REQ_OP_WRITE_ZEROES: > > > > - return lo_discard(lo, rq, pos); > > > > + return lo_discard(lo, rq, > > > > + FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE, pos); > > > > + > > > > case REQ_OP_WRITE: > > > > if (lo->transfer) > > > > return lo_write_transfer(lo, rq, pos); > > > > @@ -859,6 +859,25 @@ static void loop_config_discard(struct loop_device *lo) > > > > struct file *file = lo->lo_backing_file; > > > > struct inode *inode = file->f_mapping->host; > > > > struct request_queue *q = lo->lo_queue; > > > > + struct request_queue *backingq; > > > > + > > > > + /* > > > > + * If the backing device is a block device, mirror its discard > > > > + * capabilities. > > > > + */ > > > > + if (S_ISBLK(inode->i_mode)) { > > > > + backingq = bdev_get_queue(inode->i_bdev); > > > > + blk_queue_max_discard_sectors(q, > > > > + backingq->limits.max_discard_sectors); > > > > + > > > > + blk_queue_max_write_zeroes_sectors(q, > > > > + backingq->limits.max_write_zeroes_sectors); > > > > + > > > > + q->limits.discard_granularity = > > > > + backingq->limits.discard_granularity; > > > > + > > > > + q->limits.discard_alignment = > > > > + backingq->limits.discard_alignment; > > > > > > I think it isn't necessary to mirror backing queue's discard/write_zeros > > > capabilities, given either fs of the underlying queue can deal with well. > > > > > > > > > > > /* > > > > * We use punch hole to reclaim the free space used by the > > > > @@ -866,22 +885,24 @@ static void loop_config_discard(struct loop_device *lo) > > > > * encryption is enabled, because it may give an attacker > > > > * useful information. > > > > */ > > > > - if ((!file->f_op->fallocate) || > > > > - lo->lo_encrypt_key_size) { > > > > + } else if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { > > > > q->limits.discard_granularity = 0; > > > > q->limits.discard_alignment = 0; > > > > blk_queue_max_discard_sectors(q, 0); > > > > blk_queue_max_write_zeroes_sectors(q, 0); > > > > - blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); > > > > - return; > > > > - } > > > > > > > > - q->limits.discard_granularity = inode->i_sb->s_blocksize; > > > > - q->limits.discard_alignment = 0; > > > > + } else { > > > > + q->limits.discard_granularity = inode->i_sb->s_blocksize; > > > > + q->limits.discard_alignment = 0; > > > > + > > > > + blk_queue_max_discard_sectors(q, UINT_MAX >> 9); > > > > + blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); > > > > + } > > > > > > > > - blk_queue_max_discard_sectors(q, UINT_MAX >> 9); > > > > - blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); > > > > - blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); > > > > + if (q->limits.max_discard_sectors || q->limits.max_write_zeroes_sectors) > > > > + blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); > > > > + else > > > > + blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); > > > > } > > > > > > Looks it should work just by mirroring backing queue's discard > > > capability to loop queue in case that the loop is backed by > > > block device, doesn't it? Meantime the unified discard limits & > > > write_zeros limits can be kept. > > > > I tested this out, and you're right that I could just flip the > > QUEUE_FLAG_DISCARD based on whether its a block device, and leave > > What I meant actually is to do the following discard config: > > bool discard; > if (S_ISBLK(inode->i_mode)) { > struct request_queue *backingq = bdev_get_queue(inode->i_bdev); > discard = blk_queue_discard(backingq); > } else if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) > discard = false; > else > discard = true; > > if (discard) { > blk_queue_max_discard_sectors(q, UINT_MAX >> 9); > blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); > blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); > } else { > blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); > } Ah, I see. But I think it's useful to reflect max_discard_sectors, max_write_zeroes_sectors, discard_granularity, and discard_alignment from the block device to the loop device. With the exception of discard_alignment, these parameters are visible via sysfs, so usermode can actually use these to make more intelligent use of fallocate. Without this part of it, I still see issues with GNU cp. I'm not totally sure about discard_alignment, that seems to be useful in cases of merging blk requests. So I can stop mirroring that one if it's harmful or not helpful. But unless it's a nak, I'd really love to keep most of the mirroring. In which case the bool doesn't do a whole lot of simplifying. > > > everything else alone, to completely disable discard support for loop > > devices backed by block devices. This seems to work for programs like > > mkfs.ext4, but still leaves problems for coreutils cp. > > > > But is that really the right call? With this change, we're not only > > able to use loop devices in this way, but we're able to use the > > discard and zero functionality of the underlying block device by > > simply passing it through. To me that seemed better than disabling all > > discard support for block devices, which would severely slow us down > > on some devices. > > I guess the above approach can do the same job with yours, but simpler. > > thanks, > Ming
On Wed, Dec 05, 2018 at 11:35:57AM -0800, Evan Green wrote: > Hi Ming, > > On Tue, Dec 4, 2018 at 5:11 PM Ming Lei <ming.lei@redhat.com> wrote: > > > > On Tue, Dec 04, 2018 at 02:19:46PM -0800, Evan Green wrote: > > > Hi Ming, > > > > > > On Tue, Nov 27, 2018 at 5:26 PM Ming Lei <ming.lei@redhat.com> wrote: > > > > > > > > On Tue, Oct 30, 2018 at 04:06:24PM -0700, Evan Green wrote: > > > > > If the backing device for a loop device is a block device, > > > > > then mirror the discard properties of the underlying block > > > > > device into the loop device. While in there, differentiate > > > > > between REQ_OP_DISCARD and REQ_OP_WRITE_ZEROES, which are > > > > > different for block devices, but which the loop device had > > > > > just been lumping together. > > > > > > > > > > Signed-off-by: Evan Green <evgreen@chromium.org> > > > > > --- > > > > > > > > > > drivers/block/loop.c | 61 +++++++++++++++++++++++++++++++++++----------------- > > > > > 1 file changed, 41 insertions(+), 20 deletions(-) > > > > > > > > > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > > > > > index 28990fc94841a..176e65101c4ef 100644 > > > > > --- a/drivers/block/loop.c > > > > > +++ b/drivers/block/loop.c > > > > > @@ -417,19 +417,14 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq, > > > > > return ret; > > > > > } > > > > > > > > > > -static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos) > > > > > +static int lo_discard(struct loop_device *lo, struct request *rq, > > > > > + int mode, loff_t pos) > > > > > { > > > > > - /* > > > > > - * We use punch hole to reclaim the free space used by the > > > > > - * image a.k.a. discard. However we do not support discard if > > > > > - * encryption is enabled, because it may give an attacker > > > > > - * useful information. > > > > > - */ > > > > > struct file *file = lo->lo_backing_file; > > > > > - int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE; > > > > > + struct request_queue *q = lo->lo_queue; > > > > > int ret; > > > > > > > > > > - if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { > > > > > + if (!blk_queue_discard(q)) { > > > > > ret = -EOPNOTSUPP; > > > > > goto out; > > > > > } > > > > > @@ -603,8 +598,13 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) > > > > > case REQ_OP_FLUSH: > > > > > return lo_req_flush(lo, rq); > > > > > case REQ_OP_DISCARD: > > > > > + return lo_discard(lo, rq, > > > > > + FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, pos); > > > > > + > > > > > case REQ_OP_WRITE_ZEROES: > > > > > - return lo_discard(lo, rq, pos); > > > > > + return lo_discard(lo, rq, > > > > > + FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE, pos); > > > > > + > > > > > case REQ_OP_WRITE: > > > > > if (lo->transfer) > > > > > return lo_write_transfer(lo, rq, pos); > > > > > @@ -859,6 +859,25 @@ static void loop_config_discard(struct loop_device *lo) > > > > > struct file *file = lo->lo_backing_file; > > > > > struct inode *inode = file->f_mapping->host; > > > > > struct request_queue *q = lo->lo_queue; > > > > > + struct request_queue *backingq; > > > > > + > > > > > + /* > > > > > + * If the backing device is a block device, mirror its discard > > > > > + * capabilities. > > > > > + */ > > > > > + if (S_ISBLK(inode->i_mode)) { > > > > > + backingq = bdev_get_queue(inode->i_bdev); > > > > > + blk_queue_max_discard_sectors(q, > > > > > + backingq->limits.max_discard_sectors); > > > > > + > > > > > + blk_queue_max_write_zeroes_sectors(q, > > > > > + backingq->limits.max_write_zeroes_sectors); > > > > > + > > > > > + q->limits.discard_granularity = > > > > > + backingq->limits.discard_granularity; > > > > > + > > > > > + q->limits.discard_alignment = > > > > > + backingq->limits.discard_alignment; > > > > > > > > I think it isn't necessary to mirror backing queue's discard/write_zeros > > > > capabilities, given either fs of the underlying queue can deal with well. > > > > > > > > > > > > > > /* > > > > > * We use punch hole to reclaim the free space used by the > > > > > @@ -866,22 +885,24 @@ static void loop_config_discard(struct loop_device *lo) > > > > > * encryption is enabled, because it may give an attacker > > > > > * useful information. > > > > > */ > > > > > - if ((!file->f_op->fallocate) || > > > > > - lo->lo_encrypt_key_size) { > > > > > + } else if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { > > > > > q->limits.discard_granularity = 0; > > > > > q->limits.discard_alignment = 0; > > > > > blk_queue_max_discard_sectors(q, 0); > > > > > blk_queue_max_write_zeroes_sectors(q, 0); > > > > > - blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); > > > > > - return; > > > > > - } > > > > > > > > > > - q->limits.discard_granularity = inode->i_sb->s_blocksize; > > > > > - q->limits.discard_alignment = 0; > > > > > + } else { > > > > > + q->limits.discard_granularity = inode->i_sb->s_blocksize; > > > > > + q->limits.discard_alignment = 0; > > > > > + > > > > > + blk_queue_max_discard_sectors(q, UINT_MAX >> 9); > > > > > + blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); > > > > > + } > > > > > > > > > > - blk_queue_max_discard_sectors(q, UINT_MAX >> 9); > > > > > - blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); > > > > > - blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); > > > > > + if (q->limits.max_discard_sectors || q->limits.max_write_zeroes_sectors) > > > > > + blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); > > > > > + else > > > > > + blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); > > > > > } > > > > > > > > Looks it should work just by mirroring backing queue's discard > > > > capability to loop queue in case that the loop is backed by > > > > block device, doesn't it? Meantime the unified discard limits & > > > > write_zeros limits can be kept. > > > > > > I tested this out, and you're right that I could just flip the > > > QUEUE_FLAG_DISCARD based on whether its a block device, and leave > > > > What I meant actually is to do the following discard config: > > > > bool discard; > > if (S_ISBLK(inode->i_mode)) { > > struct request_queue *backingq = bdev_get_queue(inode->i_bdev); > > discard = blk_queue_discard(backingq); > > } else if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) > > discard = false; > > else > > discard = true; > > > > if (discard) { > > blk_queue_max_discard_sectors(q, UINT_MAX >> 9); > > blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); > > blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); > > } else { > > blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); > > } > > Ah, I see. But I think it's useful to reflect max_discard_sectors, > max_write_zeroes_sectors, discard_granularity, and discard_alignment > from the block device to the loop device. With the exception of > discard_alignment, these parameters are visible via sysfs, so usermode > can actually use these to make more intelligent use of fallocate. Could you share us what the intelligent use of fallocate is? The block layer code of blk_bio_discard_split() can deal with all the magic limits. The unified discard limits is simpler from implement view of loop, or from userspace view. > Without this part of it, I still see issues with GNU cp. Could you investigate the cause of the GNU cp issue? Thanks, Ming
Evan, > Ah, I see. But I think it's useful to reflect max_discard_sectors, > max_write_zeroes_sectors, discard_granularity, and discard_alignment > from the block device to the loop device. With the exception of > discard_alignment, these parameters are visible via sysfs, discard_alignment is visible in sysfs, just not in the queue directory since alignment can be different on a per-partition basis. So there is one discard_alignment at the root of each /sys/block/foo directory and one for each partition subdirectory. This mirrors the alignment_offset variable which indicates a partitions alignment wrt. the underlying physical block size. That said, there are not many devices that actually report a non-zero discard alignment so it's not as useful as the device manufacturers (that were looking for an implementation shortcut) envisioned. > I'm not totally sure about discard_alignment, that seems to be useful > in cases of merging blk requests. So I can stop mirroring that one if > it's harmful or not helpful. But unless it's a nak, I'd really love to > keep most of the mirroring. In which case the bool doesn't do a whole > lot of simplifying. I think it's fine to export these. The block device topology was explicitly designed to be stackable like this.
On Wed, Dec 5, 2018 at 7:15 PM Martin K. Petersen <martin.petersen@oracle.com> wrote: > > > Evan, > > > Ah, I see. But I think it's useful to reflect max_discard_sectors, > > max_write_zeroes_sectors, discard_granularity, and discard_alignment > > from the block device to the loop device. With the exception of > > discard_alignment, these parameters are visible via sysfs, > > discard_alignment is visible in sysfs, just not in the queue directory > since alignment can be different on a per-partition basis. So there is > one discard_alignment at the root of each /sys/block/foo directory and > one for each partition subdirectory. This mirrors the alignment_offset > variable which indicates a partitions alignment wrt. the underlying > physical block size. > > That said, there are not many devices that actually report a non-zero > discard alignment so it's not as useful as the device manufacturers > (that were looking for an implementation shortcut) envisioned. Ah ok, thanks. > > > I'm not totally sure about discard_alignment, that seems to be useful > > in cases of merging blk requests. So I can stop mirroring that one if > > it's harmful or not helpful. But unless it's a nak, I'd really love to > > keep most of the mirroring. In which case the bool doesn't do a whole > > lot of simplifying. > > I think it's fine to export these. The block device topology was > explicitly designed to be stackable like this. Yeah, it seemed to fall in pretty naturally, which is why I was hoping it might not be so controversial. Thanks Martin. -Evan
On Mon, Dec 10, 2018 at 9:31 AM Evan Green <evgreen@chromium.org> wrote: > > On Wed, Dec 5, 2018 at 7:15 PM Martin K. Petersen > <martin.petersen@oracle.com> wrote: > > > > > > Evan, > > > > > Ah, I see. But I think it's useful to reflect max_discard_sectors, > > > max_write_zeroes_sectors, discard_granularity, and discard_alignment > > > from the block device to the loop device. With the exception of > > > discard_alignment, these parameters are visible via sysfs, > > > > discard_alignment is visible in sysfs, just not in the queue directory > > since alignment can be different on a per-partition basis. So there is > > one discard_alignment at the root of each /sys/block/foo directory and > > one for each partition subdirectory. This mirrors the alignment_offset > > variable which indicates a partitions alignment wrt. the underlying > > physical block size. > > > > That said, there are not many devices that actually report a non-zero > > discard alignment so it's not as useful as the device manufacturers > > (that were looking for an implementation shortcut) envisioned. > > Ah ok, thanks. > > > > > > I'm not totally sure about discard_alignment, that seems to be useful > > > in cases of merging blk requests. So I can stop mirroring that one if > > > it's harmful or not helpful. But unless it's a nak, I'd really love to > > > keep most of the mirroring. In which case the bool doesn't do a whole > > > lot of simplifying. > > > > I think it's fine to export these. The block device topology was > > explicitly designed to be stackable like this. > > Yeah, it seemed to fall in pretty naturally, which is why I was hoping > it might not be so controversial. Thanks Martin. Any other thoughts about this series? Ming, I did go back and experiment a little. The pseudocode you had proposed earlier would work, and solve the error prints I was seeing. But I still would like to keep the reflection of the block device properties, since that allows discard to be used on block devices that do support it. -Evan
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 28990fc94841a..176e65101c4ef 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -417,19 +417,14 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq, return ret; } -static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos) +static int lo_discard(struct loop_device *lo, struct request *rq, + int mode, loff_t pos) { - /* - * We use punch hole to reclaim the free space used by the - * image a.k.a. discard. However we do not support discard if - * encryption is enabled, because it may give an attacker - * useful information. - */ struct file *file = lo->lo_backing_file; - int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE; + struct request_queue *q = lo->lo_queue; int ret; - if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { + if (!blk_queue_discard(q)) { ret = -EOPNOTSUPP; goto out; } @@ -603,8 +598,13 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) case REQ_OP_FLUSH: return lo_req_flush(lo, rq); case REQ_OP_DISCARD: + return lo_discard(lo, rq, + FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, pos); + case REQ_OP_WRITE_ZEROES: - return lo_discard(lo, rq, pos); + return lo_discard(lo, rq, + FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE, pos); + case REQ_OP_WRITE: if (lo->transfer) return lo_write_transfer(lo, rq, pos); @@ -859,6 +859,25 @@ static void loop_config_discard(struct loop_device *lo) struct file *file = lo->lo_backing_file; struct inode *inode = file->f_mapping->host; struct request_queue *q = lo->lo_queue; + struct request_queue *backingq; + + /* + * If the backing device is a block device, mirror its discard + * capabilities. + */ + if (S_ISBLK(inode->i_mode)) { + backingq = bdev_get_queue(inode->i_bdev); + blk_queue_max_discard_sectors(q, + backingq->limits.max_discard_sectors); + + blk_queue_max_write_zeroes_sectors(q, + backingq->limits.max_write_zeroes_sectors); + + q->limits.discard_granularity = + backingq->limits.discard_granularity; + + q->limits.discard_alignment = + backingq->limits.discard_alignment; /* * We use punch hole to reclaim the free space used by the @@ -866,22 +885,24 @@ static void loop_config_discard(struct loop_device *lo) * encryption is enabled, because it may give an attacker * useful information. */ - if ((!file->f_op->fallocate) || - lo->lo_encrypt_key_size) { + } else if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { q->limits.discard_granularity = 0; q->limits.discard_alignment = 0; blk_queue_max_discard_sectors(q, 0); blk_queue_max_write_zeroes_sectors(q, 0); - blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); - return; - } - q->limits.discard_granularity = inode->i_sb->s_blocksize; - q->limits.discard_alignment = 0; + } else { + q->limits.discard_granularity = inode->i_sb->s_blocksize; + q->limits.discard_alignment = 0; + + blk_queue_max_discard_sectors(q, UINT_MAX >> 9); + blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); + } - blk_queue_max_discard_sectors(q, UINT_MAX >> 9); - blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9); - blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); + if (q->limits.max_discard_sectors || q->limits.max_write_zeroes_sectors) + blk_queue_flag_set(QUEUE_FLAG_DISCARD, q); + else + blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q); } static void loop_unprepare_queue(struct loop_device *lo)
If the backing device for a loop device is a block device, then mirror the discard properties of the underlying block device into the loop device. While in there, differentiate between REQ_OP_DISCARD and REQ_OP_WRITE_ZEROES, which are different for block devices, but which the loop device had just been lumping together. Signed-off-by: Evan Green <evgreen@chromium.org> --- drivers/block/loop.c | 61 +++++++++++++++++++++++++++++++++++----------------- 1 file changed, 41 insertions(+), 20 deletions(-)