diff mbox series

[2/2] block: virtio_blk: fix handling single range discard request

Message ID 20200811092134.2256095-3-ming.lei@redhat.com (mailing list archive)
State New, archived
Headers show
Series block: fix discard merge for single max discard segment | expand

Commit Message

Ming Lei Aug. 11, 2020, 9:21 a.m. UTC
1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts
to support multi-range discard for virtio-blk. However, the virtio-blk
disk may report max discard segment as 1, at least that is exactly what
qemu is doing.

So far, block layer switches to normal request merge if max discard segment
limit is 1, and multiple bios can be merged to single segment. This way may
cause memory corruption in virtblk_setup_discard_write_zeroes().

Fix the issue by handling single max discard segment in straightforward
way.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Changpeng Liu <changpeng.liu@intel.com>
Cc: Daniel Verkamp <dverkamp@chromium.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
---
 drivers/block/virtio_blk.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

Comments

Stefano Garzarella Aug. 11, 2020, 12:30 p.m. UTC | #1
Hi Ming,

On Tue, Aug 11, 2020 at 05:21:34PM +0800, Ming Lei wrote:
> 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts
> to support multi-range discard for virtio-blk. However, the virtio-blk
> disk may report max discard segment as 1, at least that is exactly what
> qemu is doing.
> 
> So far, block layer switches to normal request merge if max discard segment
> limit is 1, and multiple bios can be merged to single segment. This way may
> cause memory corruption in virtblk_setup_discard_write_zeroes().
> 
> Fix the issue by handling single max discard segment in straightforward
> way.
> 
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Changpeng Liu <changpeng.liu@intel.com>
> Cc: Daniel Verkamp <dverkamp@chromium.org>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  drivers/block/virtio_blk.c | 23 +++++++++++++++--------
>  1 file changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 63b213e00b37..05b01903122b 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c'
> @@ -126,14 +126,21 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap)
>  	if (!range)
>  		return -ENOMEM;

We are allocating the 'range' array to contain 'segments' elements.
When queue_max_discard_segments() returns 1, should we limit 'segments'
to 1?

>  
> -	__rq_for_each_bio(bio, req) {
> -		u64 sector = bio->bi_iter.bi_sector;
> -		u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
> -
> -		range[n].flags = cpu_to_le32(flags);
> -		range[n].num_sectors = cpu_to_le32(num_sectors);
> -		range[n].sector = cpu_to_le64(sector);
> -		n++;
> +	if (queue_max_discard_segments(req->q) == 1) {
> +		range[0].flags = cpu_to_le32(flags);
> +		range[0].num_sectors = cpu_to_le32(blk_rq_sectors(req));
> +		range[0].sector = cpu_to_le64(blk_rq_pos(req));
> +		n = 1;
                ^
                this doesn't seem necessary since we don't use 'n' afterwards.

Thanks,
Stefano

> +	} else {
> +		__rq_for_each_bio(bio, req) {
> +			u64 sector = bio->bi_iter.bi_sector;
> +			u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
> +
> +			range[n].flags = cpu_to_le32(flags);
> +			range[n].num_sectors = cpu_to_le32(num_sectors);
> +			range[n].sector = cpu_to_le64(sector);
> +			n++;
> +		}
>  	}
>  
>  	req->special_vec.bv_page = virt_to_page(range);
> -- 
> 2.25.2
>
Ming Lei Aug. 11, 2020, 1:09 p.m. UTC | #2
On Tue, Aug 11, 2020 at 02:30:44PM +0200, Stefano Garzarella wrote:
> Hi Ming,
> 
> On Tue, Aug 11, 2020 at 05:21:34PM +0800, Ming Lei wrote:
> > 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts
> > to support multi-range discard for virtio-blk. However, the virtio-blk
> > disk may report max discard segment as 1, at least that is exactly what
> > qemu is doing.
> > 
> > So far, block layer switches to normal request merge if max discard segment
> > limit is 1, and multiple bios can be merged to single segment. This way may
> > cause memory corruption in virtblk_setup_discard_write_zeroes().
> > 
> > Fix the issue by handling single max discard segment in straightforward
> > way.
> > 
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: Changpeng Liu <changpeng.liu@intel.com>
> > Cc: Daniel Verkamp <dverkamp@chromium.org>
> > Cc: Michael S. Tsirkin <mst@redhat.com>
> > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  drivers/block/virtio_blk.c | 23 +++++++++++++++--------
> >  1 file changed, 15 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> > index 63b213e00b37..05b01903122b 100644
> > --- a/drivers/block/virtio_blk.c
> > +++ b/drivers/block/virtio_blk.c'
> > @@ -126,14 +126,21 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap)
> >  	if (!range)
> >  		return -ENOMEM;
> 
> We are allocating the 'range' array to contain 'segments' elements.
> When queue_max_discard_segments() returns 1, should we limit 'segments'
> to 1?

That is block layer's responsibility to make sure that 'segments' is <=
1, and we can double check & warn here.

> 
> >  
> > -	__rq_for_each_bio(bio, req) {
> > -		u64 sector = bio->bi_iter.bi_sector;
> > -		u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
> > -
> > -		range[n].flags = cpu_to_le32(flags);
> > -		range[n].num_sectors = cpu_to_le32(num_sectors);
> > -		range[n].sector = cpu_to_le64(sector);
> > -		n++;
> > +	if (queue_max_discard_segments(req->q) == 1) {
> > +		range[0].flags = cpu_to_le32(flags);
> > +		range[0].num_sectors = cpu_to_le32(blk_rq_sectors(req));
> > +		range[0].sector = cpu_to_le64(blk_rq_pos(req));
> > +		n = 1;
>                 ^
>                 this doesn't seem necessary since we don't use 'n' afterwards.

OK.

Thanks,
Ming
Stefano Garzarella Aug. 11, 2020, 1:39 p.m. UTC | #3
On Tue, Aug 11, 2020 at 09:09:53PM +0800, Ming Lei wrote:
> On Tue, Aug 11, 2020 at 02:30:44PM +0200, Stefano Garzarella wrote:
> > Hi Ming,
> > 
> > On Tue, Aug 11, 2020 at 05:21:34PM +0800, Ming Lei wrote:
> > > 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts
> > > to support multi-range discard for virtio-blk. However, the virtio-blk
> > > disk may report max discard segment as 1, at least that is exactly what
> > > qemu is doing.
> > > 
> > > So far, block layer switches to normal request merge if max discard segment
> > > limit is 1, and multiple bios can be merged to single segment. This way may
> > > cause memory corruption in virtblk_setup_discard_write_zeroes().
> > > 
> > > Fix the issue by handling single max discard segment in straightforward
> > > way.
> > > 
> > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
> > > Cc: Christoph Hellwig <hch@lst.de>
> > > Cc: Changpeng Liu <changpeng.liu@intel.com>
> > > Cc: Daniel Verkamp <dverkamp@chromium.org>
> > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > > ---
> > >  drivers/block/virtio_blk.c | 23 +++++++++++++++--------
> > >  1 file changed, 15 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> > > index 63b213e00b37..05b01903122b 100644
> > > --- a/drivers/block/virtio_blk.c
> > > +++ b/drivers/block/virtio_blk.c'
> > > @@ -126,14 +126,21 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap)
> > >  	if (!range)
> > >  		return -ENOMEM;
> > 
> > We are allocating the 'range' array to contain 'segments' elements.
> > When queue_max_discard_segments() returns 1, should we limit 'segments'
> > to 1?
> 
> That is block layer's responsibility to make sure that 'segments' is <=
> 1, and we can double check & warn here.

So, IIUC, the number of bio in a request may not be the same as
the return value of blk_rq_nr_discard_segments(). Is it right?

If it is true, is not clear to me why we are allocating the 'range'
array using 'segments' as the size, and then we are filling it cycling
on the bios.

Maybe I need to take a closer look at the block layer to understand it better.

Thanks,
Stefano

> 
> > 
> > >  
> > > -	__rq_for_each_bio(bio, req) {
> > > -		u64 sector = bio->bi_iter.bi_sector;
> > > -		u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
> > > -
> > > -		range[n].flags = cpu_to_le32(flags);
> > > -		range[n].num_sectors = cpu_to_le32(num_sectors);
> > > -		range[n].sector = cpu_to_le64(sector);
> > > -		n++;
> > > +	if (queue_max_discard_segments(req->q) == 1) {
> > > +		range[0].flags = cpu_to_le32(flags);
> > > +		range[0].num_sectors = cpu_to_le32(blk_rq_sectors(req));
> > > +		range[0].sector = cpu_to_le64(blk_rq_pos(req));
> > > +		n = 1;
> >                 ^
> >                 this doesn't seem necessary since we don't use 'n' afterwards.
> 
> OK.
> 
> Thanks,
> Ming
>
Ming Lei Aug. 11, 2020, 1:43 p.m. UTC | #4
On Tue, Aug 11, 2020 at 03:39:25PM +0200, Stefano Garzarella wrote:
> On Tue, Aug 11, 2020 at 09:09:53PM +0800, Ming Lei wrote:
> > On Tue, Aug 11, 2020 at 02:30:44PM +0200, Stefano Garzarella wrote:
> > > Hi Ming,
> > > 
> > > On Tue, Aug 11, 2020 at 05:21:34PM +0800, Ming Lei wrote:
> > > > 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts
> > > > to support multi-range discard for virtio-blk. However, the virtio-blk
> > > > disk may report max discard segment as 1, at least that is exactly what
> > > > qemu is doing.
> > > > 
> > > > So far, block layer switches to normal request merge if max discard segment
> > > > limit is 1, and multiple bios can be merged to single segment. This way may
> > > > cause memory corruption in virtblk_setup_discard_write_zeroes().
> > > > 
> > > > Fix the issue by handling single max discard segment in straightforward
> > > > way.
> > > > 
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
> > > > Cc: Christoph Hellwig <hch@lst.de>
> > > > Cc: Changpeng Liu <changpeng.liu@intel.com>
> > > > Cc: Daniel Verkamp <dverkamp@chromium.org>
> > > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > > > ---
> > > >  drivers/block/virtio_blk.c | 23 +++++++++++++++--------
> > > >  1 file changed, 15 insertions(+), 8 deletions(-)
> > > > 
> > > > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> > > > index 63b213e00b37..05b01903122b 100644
> > > > --- a/drivers/block/virtio_blk.c
> > > > +++ b/drivers/block/virtio_blk.c'
> > > > @@ -126,14 +126,21 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap)
> > > >  	if (!range)
> > > >  		return -ENOMEM;
> > > 
> > > We are allocating the 'range' array to contain 'segments' elements.
> > > When queue_max_discard_segments() returns 1, should we limit 'segments'
> > > to 1?
> > 
> > That is block layer's responsibility to make sure that 'segments' is <=
> > 1, and we can double check & warn here.
> 
> So, IIUC, the number of bio in a request may not be the same as
> the return value of blk_rq_nr_discard_segments(). Is it right?

In case that queue_max_discard_segments() is 1, it is right. If
queue_max_discard_segments() is > 1, nr_range is supposed to be
same with number of bios in a request.


Thanks, 
Ming
Stefano Garzarella Aug. 11, 2020, 2:16 p.m. UTC | #5
On Tue, Aug 11, 2020 at 09:43:26PM +0800, Ming Lei wrote:
> On Tue, Aug 11, 2020 at 03:39:25PM +0200, Stefano Garzarella wrote:
> > On Tue, Aug 11, 2020 at 09:09:53PM +0800, Ming Lei wrote:
> > > On Tue, Aug 11, 2020 at 02:30:44PM +0200, Stefano Garzarella wrote:
> > > > Hi Ming,
> > > > 
> > > > On Tue, Aug 11, 2020 at 05:21:34PM +0800, Ming Lei wrote:
> > > > > 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts
> > > > > to support multi-range discard for virtio-blk. However, the virtio-blk
> > > > > disk may report max discard segment as 1, at least that is exactly what
> > > > > qemu is doing.
> > > > > 
> > > > > So far, block layer switches to normal request merge if max discard segment
> > > > > limit is 1, and multiple bios can be merged to single segment. This way may
> > > > > cause memory corruption in virtblk_setup_discard_write_zeroes().
> > > > > 
> > > > > Fix the issue by handling single max discard segment in straightforward
> > > > > way.
> > > > > 
> > > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > > Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
> > > > > Cc: Christoph Hellwig <hch@lst.de>
> > > > > Cc: Changpeng Liu <changpeng.liu@intel.com>
> > > > > Cc: Daniel Verkamp <dverkamp@chromium.org>
> > > > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > > > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > > > > ---
> > > > >  drivers/block/virtio_blk.c | 23 +++++++++++++++--------
> > > > >  1 file changed, 15 insertions(+), 8 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> > > > > index 63b213e00b37..05b01903122b 100644
> > > > > --- a/drivers/block/virtio_blk.c
> > > > > +++ b/drivers/block/virtio_blk.c'
> > > > > @@ -126,14 +126,21 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap)
> > > > >  	if (!range)
> > > > >  		return -ENOMEM;
> > > > 
> > > > We are allocating the 'range' array to contain 'segments' elements.
> > > > When queue_max_discard_segments() returns 1, should we limit 'segments'
> > > > to 1?
> > > 
> > > That is block layer's responsibility to make sure that 'segments' is <=
> > > 1, and we can double check & warn here.
> > 
> > So, IIUC, the number of bio in a request may not be the same as
> > the return value of blk_rq_nr_discard_segments(). Is it right?
> 
> In case that queue_max_discard_segments() is 1, it is right. If
> queue_max_discard_segments() is > 1, nr_range is supposed to be
> same with number of bios in a request.

Got it. Thanks for clarify.

In the meantime I took a look at nvme_setup_discard() and there is
WARN_ON_ONCE(n != segments), maybe we should do the same.

Thanks,
Stefano
diff mbox series

Patch

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 63b213e00b37..05b01903122b 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -126,14 +126,21 @@  static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap)
 	if (!range)
 		return -ENOMEM;
 
-	__rq_for_each_bio(bio, req) {
-		u64 sector = bio->bi_iter.bi_sector;
-		u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
-
-		range[n].flags = cpu_to_le32(flags);
-		range[n].num_sectors = cpu_to_le32(num_sectors);
-		range[n].sector = cpu_to_le64(sector);
-		n++;
+	if (queue_max_discard_segments(req->q) == 1) {
+		range[0].flags = cpu_to_le32(flags);
+		range[0].num_sectors = cpu_to_le32(blk_rq_sectors(req));
+		range[0].sector = cpu_to_le64(blk_rq_pos(req));
+		n = 1;
+	} else {
+		__rq_for_each_bio(bio, req) {
+			u64 sector = bio->bi_iter.bi_sector;
+			u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
+
+			range[n].flags = cpu_to_le32(flags);
+			range[n].num_sectors = cpu_to_le32(num_sectors);
+			range[n].sector = cpu_to_le64(sector);
+			n++;
+		}
 	}
 
 	req->special_vec.bv_page = virt_to_page(range);