Message ID | 20200811092134.2256095-3-ming.lei@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | block: fix discard merge for single max discard segment | expand |
Hi Ming, On Tue, Aug 11, 2020 at 05:21:34PM +0800, Ming Lei wrote: > 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts > to support multi-range discard for virtio-blk. However, the virtio-blk > disk may report max discard segment as 1, at least that is exactly what > qemu is doing. > > So far, block layer switches to normal request merge if max discard segment > limit is 1, and multiple bios can be merged to single segment. This way may > cause memory corruption in virtblk_setup_discard_write_zeroes(). > > Fix the issue by handling single max discard segment in straightforward > way. > > Signed-off-by: Ming Lei <ming.lei@redhat.com> > Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") > Cc: Christoph Hellwig <hch@lst.de> > Cc: Changpeng Liu <changpeng.liu@intel.com> > Cc: Daniel Verkamp <dverkamp@chromium.org> > Cc: Michael S. Tsirkin <mst@redhat.com> > Cc: Stefan Hajnoczi <stefanha@redhat.com> > --- > drivers/block/virtio_blk.c | 23 +++++++++++++++-------- > 1 file changed, 15 insertions(+), 8 deletions(-) > > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c > index 63b213e00b37..05b01903122b 100644 > --- a/drivers/block/virtio_blk.c > +++ b/drivers/block/virtio_blk.c' > @@ -126,14 +126,21 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap) > if (!range) > return -ENOMEM; We are allocating the 'range' array to contain 'segments' elements. When queue_max_discard_segments() returns 1, should we limit 'segments' to 1? > > - __rq_for_each_bio(bio, req) { > - u64 sector = bio->bi_iter.bi_sector; > - u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT; > - > - range[n].flags = cpu_to_le32(flags); > - range[n].num_sectors = cpu_to_le32(num_sectors); > - range[n].sector = cpu_to_le64(sector); > - n++; > + if (queue_max_discard_segments(req->q) == 1) { > + range[0].flags = cpu_to_le32(flags); > + range[0].num_sectors = cpu_to_le32(blk_rq_sectors(req)); > + range[0].sector = cpu_to_le64(blk_rq_pos(req)); > + n = 1; ^ this doesn't seem necessary since we don't use 'n' afterwards. Thanks, Stefano > + } else { > + __rq_for_each_bio(bio, req) { > + u64 sector = bio->bi_iter.bi_sector; > + u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT; > + > + range[n].flags = cpu_to_le32(flags); > + range[n].num_sectors = cpu_to_le32(num_sectors); > + range[n].sector = cpu_to_le64(sector); > + n++; > + } > } > > req->special_vec.bv_page = virt_to_page(range); > -- > 2.25.2 >
On Tue, Aug 11, 2020 at 02:30:44PM +0200, Stefano Garzarella wrote: > Hi Ming, > > On Tue, Aug 11, 2020 at 05:21:34PM +0800, Ming Lei wrote: > > 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts > > to support multi-range discard for virtio-blk. However, the virtio-blk > > disk may report max discard segment as 1, at least that is exactly what > > qemu is doing. > > > > So far, block layer switches to normal request merge if max discard segment > > limit is 1, and multiple bios can be merged to single segment. This way may > > cause memory corruption in virtblk_setup_discard_write_zeroes(). > > > > Fix the issue by handling single max discard segment in straightforward > > way. > > > > Signed-off-by: Ming Lei <ming.lei@redhat.com> > > Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") > > Cc: Christoph Hellwig <hch@lst.de> > > Cc: Changpeng Liu <changpeng.liu@intel.com> > > Cc: Daniel Verkamp <dverkamp@chromium.org> > > Cc: Michael S. Tsirkin <mst@redhat.com> > > Cc: Stefan Hajnoczi <stefanha@redhat.com> > > --- > > drivers/block/virtio_blk.c | 23 +++++++++++++++-------- > > 1 file changed, 15 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c > > index 63b213e00b37..05b01903122b 100644 > > --- a/drivers/block/virtio_blk.c > > +++ b/drivers/block/virtio_blk.c' > > @@ -126,14 +126,21 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap) > > if (!range) > > return -ENOMEM; > > We are allocating the 'range' array to contain 'segments' elements. > When queue_max_discard_segments() returns 1, should we limit 'segments' > to 1? That is block layer's responsibility to make sure that 'segments' is <= 1, and we can double check & warn here. > > > > > - __rq_for_each_bio(bio, req) { > > - u64 sector = bio->bi_iter.bi_sector; > > - u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT; > > - > > - range[n].flags = cpu_to_le32(flags); > > - range[n].num_sectors = cpu_to_le32(num_sectors); > > - range[n].sector = cpu_to_le64(sector); > > - n++; > > + if (queue_max_discard_segments(req->q) == 1) { > > + range[0].flags = cpu_to_le32(flags); > > + range[0].num_sectors = cpu_to_le32(blk_rq_sectors(req)); > > + range[0].sector = cpu_to_le64(blk_rq_pos(req)); > > + n = 1; > ^ > this doesn't seem necessary since we don't use 'n' afterwards. OK. Thanks, Ming
On Tue, Aug 11, 2020 at 09:09:53PM +0800, Ming Lei wrote: > On Tue, Aug 11, 2020 at 02:30:44PM +0200, Stefano Garzarella wrote: > > Hi Ming, > > > > On Tue, Aug 11, 2020 at 05:21:34PM +0800, Ming Lei wrote: > > > 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts > > > to support multi-range discard for virtio-blk. However, the virtio-blk > > > disk may report max discard segment as 1, at least that is exactly what > > > qemu is doing. > > > > > > So far, block layer switches to normal request merge if max discard segment > > > limit is 1, and multiple bios can be merged to single segment. This way may > > > cause memory corruption in virtblk_setup_discard_write_zeroes(). > > > > > > Fix the issue by handling single max discard segment in straightforward > > > way. > > > > > > Signed-off-by: Ming Lei <ming.lei@redhat.com> > > > Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") > > > Cc: Christoph Hellwig <hch@lst.de> > > > Cc: Changpeng Liu <changpeng.liu@intel.com> > > > Cc: Daniel Verkamp <dverkamp@chromium.org> > > > Cc: Michael S. Tsirkin <mst@redhat.com> > > > Cc: Stefan Hajnoczi <stefanha@redhat.com> > > > --- > > > drivers/block/virtio_blk.c | 23 +++++++++++++++-------- > > > 1 file changed, 15 insertions(+), 8 deletions(-) > > > > > > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c > > > index 63b213e00b37..05b01903122b 100644 > > > --- a/drivers/block/virtio_blk.c > > > +++ b/drivers/block/virtio_blk.c' > > > @@ -126,14 +126,21 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap) > > > if (!range) > > > return -ENOMEM; > > > > We are allocating the 'range' array to contain 'segments' elements. > > When queue_max_discard_segments() returns 1, should we limit 'segments' > > to 1? > > That is block layer's responsibility to make sure that 'segments' is <= > 1, and we can double check & warn here. So, IIUC, the number of bio in a request may not be the same as the return value of blk_rq_nr_discard_segments(). Is it right? If it is true, is not clear to me why we are allocating the 'range' array using 'segments' as the size, and then we are filling it cycling on the bios. Maybe I need to take a closer look at the block layer to understand it better. Thanks, Stefano > > > > > > > > > - __rq_for_each_bio(bio, req) { > > > - u64 sector = bio->bi_iter.bi_sector; > > > - u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT; > > > - > > > - range[n].flags = cpu_to_le32(flags); > > > - range[n].num_sectors = cpu_to_le32(num_sectors); > > > - range[n].sector = cpu_to_le64(sector); > > > - n++; > > > + if (queue_max_discard_segments(req->q) == 1) { > > > + range[0].flags = cpu_to_le32(flags); > > > + range[0].num_sectors = cpu_to_le32(blk_rq_sectors(req)); > > > + range[0].sector = cpu_to_le64(blk_rq_pos(req)); > > > + n = 1; > > ^ > > this doesn't seem necessary since we don't use 'n' afterwards. > > OK. > > Thanks, > Ming >
On Tue, Aug 11, 2020 at 03:39:25PM +0200, Stefano Garzarella wrote: > On Tue, Aug 11, 2020 at 09:09:53PM +0800, Ming Lei wrote: > > On Tue, Aug 11, 2020 at 02:30:44PM +0200, Stefano Garzarella wrote: > > > Hi Ming, > > > > > > On Tue, Aug 11, 2020 at 05:21:34PM +0800, Ming Lei wrote: > > > > 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts > > > > to support multi-range discard for virtio-blk. However, the virtio-blk > > > > disk may report max discard segment as 1, at least that is exactly what > > > > qemu is doing. > > > > > > > > So far, block layer switches to normal request merge if max discard segment > > > > limit is 1, and multiple bios can be merged to single segment. This way may > > > > cause memory corruption in virtblk_setup_discard_write_zeroes(). > > > > > > > > Fix the issue by handling single max discard segment in straightforward > > > > way. > > > > > > > > Signed-off-by: Ming Lei <ming.lei@redhat.com> > > > > Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") > > > > Cc: Christoph Hellwig <hch@lst.de> > > > > Cc: Changpeng Liu <changpeng.liu@intel.com> > > > > Cc: Daniel Verkamp <dverkamp@chromium.org> > > > > Cc: Michael S. Tsirkin <mst@redhat.com> > > > > Cc: Stefan Hajnoczi <stefanha@redhat.com> > > > > --- > > > > drivers/block/virtio_blk.c | 23 +++++++++++++++-------- > > > > 1 file changed, 15 insertions(+), 8 deletions(-) > > > > > > > > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c > > > > index 63b213e00b37..05b01903122b 100644 > > > > --- a/drivers/block/virtio_blk.c > > > > +++ b/drivers/block/virtio_blk.c' > > > > @@ -126,14 +126,21 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap) > > > > if (!range) > > > > return -ENOMEM; > > > > > > We are allocating the 'range' array to contain 'segments' elements. > > > When queue_max_discard_segments() returns 1, should we limit 'segments' > > > to 1? > > > > That is block layer's responsibility to make sure that 'segments' is <= > > 1, and we can double check & warn here. > > So, IIUC, the number of bio in a request may not be the same as > the return value of blk_rq_nr_discard_segments(). Is it right? In case that queue_max_discard_segments() is 1, it is right. If queue_max_discard_segments() is > 1, nr_range is supposed to be same with number of bios in a request. Thanks, Ming
On Tue, Aug 11, 2020 at 09:43:26PM +0800, Ming Lei wrote: > On Tue, Aug 11, 2020 at 03:39:25PM +0200, Stefano Garzarella wrote: > > On Tue, Aug 11, 2020 at 09:09:53PM +0800, Ming Lei wrote: > > > On Tue, Aug 11, 2020 at 02:30:44PM +0200, Stefano Garzarella wrote: > > > > Hi Ming, > > > > > > > > On Tue, Aug 11, 2020 at 05:21:34PM +0800, Ming Lei wrote: > > > > > 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts > > > > > to support multi-range discard for virtio-blk. However, the virtio-blk > > > > > disk may report max discard segment as 1, at least that is exactly what > > > > > qemu is doing. > > > > > > > > > > So far, block layer switches to normal request merge if max discard segment > > > > > limit is 1, and multiple bios can be merged to single segment. This way may > > > > > cause memory corruption in virtblk_setup_discard_write_zeroes(). > > > > > > > > > > Fix the issue by handling single max discard segment in straightforward > > > > > way. > > > > > > > > > > Signed-off-by: Ming Lei <ming.lei@redhat.com> > > > > > Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") > > > > > Cc: Christoph Hellwig <hch@lst.de> > > > > > Cc: Changpeng Liu <changpeng.liu@intel.com> > > > > > Cc: Daniel Verkamp <dverkamp@chromium.org> > > > > > Cc: Michael S. Tsirkin <mst@redhat.com> > > > > > Cc: Stefan Hajnoczi <stefanha@redhat.com> > > > > > --- > > > > > drivers/block/virtio_blk.c | 23 +++++++++++++++-------- > > > > > 1 file changed, 15 insertions(+), 8 deletions(-) > > > > > > > > > > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c > > > > > index 63b213e00b37..05b01903122b 100644 > > > > > --- a/drivers/block/virtio_blk.c > > > > > +++ b/drivers/block/virtio_blk.c' > > > > > @@ -126,14 +126,21 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap) > > > > > if (!range) > > > > > return -ENOMEM; > > > > > > > > We are allocating the 'range' array to contain 'segments' elements. > > > > When queue_max_discard_segments() returns 1, should we limit 'segments' > > > > to 1? > > > > > > That is block layer's responsibility to make sure that 'segments' is <= > > > 1, and we can double check & warn here. > > > > So, IIUC, the number of bio in a request may not be the same as > > the return value of blk_rq_nr_discard_segments(). Is it right? > > In case that queue_max_discard_segments() is 1, it is right. If > queue_max_discard_segments() is > 1, nr_range is supposed to be > same with number of bios in a request. Got it. Thanks for clarify. In the meantime I took a look at nvme_setup_discard() and there is WARN_ON_ONCE(n != segments), maybe we should do the same. Thanks, Stefano
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index 63b213e00b37..05b01903122b 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -126,14 +126,21 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap) if (!range) return -ENOMEM; - __rq_for_each_bio(bio, req) { - u64 sector = bio->bi_iter.bi_sector; - u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT; - - range[n].flags = cpu_to_le32(flags); - range[n].num_sectors = cpu_to_le32(num_sectors); - range[n].sector = cpu_to_le64(sector); - n++; + if (queue_max_discard_segments(req->q) == 1) { + range[0].flags = cpu_to_le32(flags); + range[0].num_sectors = cpu_to_le32(blk_rq_sectors(req)); + range[0].sector = cpu_to_le64(blk_rq_pos(req)); + n = 1; + } else { + __rq_for_each_bio(bio, req) { + u64 sector = bio->bi_iter.bi_sector; + u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT; + + range[n].flags = cpu_to_le32(flags); + range[n].num_sectors = cpu_to_le32(num_sectors); + range[n].sector = cpu_to_le64(sector); + n++; + } } req->special_vec.bv_page = virt_to_page(range);
1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts to support multi-range discard for virtio-blk. However, the virtio-blk disk may report max discard segment as 1, at least that is exactly what qemu is doing. So far, block layer switches to normal request merge if max discard segment limit is 1, and multiple bios can be merged to single segment. This way may cause memory corruption in virtblk_setup_discard_write_zeroes(). Fix the issue by handling single max discard segment in straightforward way. Signed-off-by: Ming Lei <ming.lei@redhat.com> Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") Cc: Christoph Hellwig <hch@lst.de> Cc: Changpeng Liu <changpeng.liu@intel.com> Cc: Daniel Verkamp <dverkamp@chromium.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> --- drivers/block/virtio_blk.c | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-)