Message ID | 1479413642-22463-5-git-send-email-eblake@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 17.11.2016 21:13, Eric Blake wrote: > Right now, the block layer rounds discard requests, so that > individual drivers are able to assert that discard requests > will never be unaligned. But there are some ISCSI devices > that track and coalesce multiple unaligned requests, turning it > into an actual discard if the requests eventually cover an > entire page, which implies that it is better to always pass > discard requests as low down the stack as possible. > > In isolation, this patch has no semantic effect, since the > block layer currently never passes an unaligned request through. > But the block layer already has code that silently ignores > drivers that return -ENOTSUP for a discard request that cannot > be honored (as well as drivers that return 0 even when nothing > was done). But the next patch will update the block layer to > fragment discard requests, so that clients are guaranteed that > they are either dealing with an unaligned head or tail, or an > aligned core, making it similar to the block layer semantics of > write zero fragmentation. > > CC: qemu-stable@nongnu.org > Signed-off-by: Eric Blake <eblake@redhat.com> > --- > block/iscsi.c | 4 +++- > block/qcow2.c | 5 +++++ > block/sheepdog.c | 5 +++-- > 3 files changed, 11 insertions(+), 3 deletions(-) OK, so much about my remark that the comment describing pdiscard_alignment only says it's an "optimal alignment"... > > diff --git a/block/iscsi.c b/block/iscsi.c > index 71bd523..0960929 100644 > --- a/block/iscsi.c > +++ b/block/iscsi.c > @@ -1083,7 +1083,9 @@ coroutine_fn iscsi_co_pdiscard(BlockDriverState *bs, int64_t offset, int count) > struct IscsiTask iTask; > struct unmap_list list; > > - assert(is_byte_request_lun_aligned(offset, count, iscsilun)); > + if (!is_byte_request_lun_aligned(offset, count, iscsilun)) { > + return -ENOTSUP; > + } > > if (!iscsilun->lbp.lbpu) { > /* UNMAP is not supported by the target */ Next line is: > return 0; Hmm... -ENOTSUP would be the obvious return value here, too. That might interfere with your next patch, though. > diff --git a/block/qcow2.c b/block/qcow2.c > index e22f6dc..7cfcd84 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -2491,6 +2491,11 @@ static coroutine_fn int qcow2_co_pdiscard(BlockDriverState *bs, > int ret; > BDRVQcow2State *s = bs->opaque; > > + if (!QEMU_IS_ALIGNED(offset | count, s->cluster_size)) { Ha! I like "offset | count". > + assert(count < s->cluster_size); Maybe add a comment for this assertion? E.g. "The block layer will only generate unaligned discard requests that are smaller than the alignment". > + return -ENOTSUP; > + } > + > qemu_co_mutex_lock(&s->lock); > ret = qcow2_discard_clusters(bs, offset, count >> BDRV_SECTOR_BITS, > QCOW2_DISCARD_REQUEST, false); > diff --git a/block/sheepdog.c b/block/sheepdog.c > index 1fb9173..4c9af89 100644 > --- a/block/sheepdog.c > +++ b/block/sheepdog.c > @@ -2829,8 +2829,9 @@ static coroutine_fn int sd_co_pdiscard(BlockDriverState *bs, int64_t offset, > iov.iov_len = sizeof(zero); > discard_iov.iov = &iov; > discard_iov.niov = 1; > - assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0); > - assert((count & (BDRV_SECTOR_SIZE - 1)) == 0); > + if (!QEMU_IS_ALIGNED(offset | count, BDRV_SECTOR_SIZE)) { > + return -ENOTSUP; > + } Out of interest: Where does sheepdog tell the block layer that requests have to be aligned to this value? With this patch, it doesn't matter though, it only did before, so: Reviewed-by: Max Reitz <mreitz@redhat.com> > acb = sd_aio_setup(bs, &discard_iov, offset >> BDRV_SECTOR_BITS, > count >> BDRV_SECTOR_BITS); > acb->aiocb_type = AIOCB_DISCARD_OBJ; >
On 11/17/2016 04:01 PM, Max Reitz wrote: > On 17.11.2016 21:13, Eric Blake wrote: >> Right now, the block layer rounds discard requests, so that >> individual drivers are able to assert that discard requests >> will never be unaligned. But there are some ISCSI devices >> that track and coalesce multiple unaligned requests, turning it >> into an actual discard if the requests eventually cover an >> entire page, which implies that it is better to always pass >> discard requests as low down the stack as possible. >> >> In isolation, this patch has no semantic effect, since the >> block layer currently never passes an unaligned request through. >> But the block layer already has code that silently ignores >> drivers that return -ENOTSUP for a discard request that cannot >> be honored (as well as drivers that return 0 even when nothing >> was done). But the next patch will update the block layer to >> fragment discard requests, so that clients are guaranteed that >> they are either dealing with an unaligned head or tail, or an >> aligned core, making it similar to the block layer semantics of >> write zero fragmentation. >> >> +++ b/block/iscsi.c >> @@ -1083,7 +1083,9 @@ coroutine_fn iscsi_co_pdiscard(BlockDriverState *bs, int64_t offset, int count) >> struct IscsiTask iTask; >> struct unmap_list list; >> >> - assert(is_byte_request_lun_aligned(offset, count, iscsilun)); >> + if (!is_byte_request_lun_aligned(offset, count, iscsilun)) { >> + return -ENOTSUP; >> + } >> >> if (!iscsilun->lbp.lbpu) { >> /* UNMAP is not supported by the target */ > > Next line is: > >> return 0; > > Hmm... -ENOTSUP would be the obvious return value here, too. That might > interfere with your next patch, though. Shouldn't interfere. I guess no one value is better than the other; I can respin to pick a consistent value (I'd lean towards -ENOTSUP) if you think it is worth it; but I'd rather get this into 2.8 without worrying about it. > >> diff --git a/block/qcow2.c b/block/qcow2.c >> index e22f6dc..7cfcd84 100644 >> --- a/block/qcow2.c >> +++ b/block/qcow2.c >> @@ -2491,6 +2491,11 @@ static coroutine_fn int qcow2_co_pdiscard(BlockDriverState *bs, >> int ret; >> BDRVQcow2State *s = bs->opaque; >> >> + if (!QEMU_IS_ALIGNED(offset | count, s->cluster_size)) { > > Ha! I like "offset | count". It only works because we know that qcow2 guarantees that s->cluster_size is a power of 2 (it does not work at the block layer, where the bs->bl.pdiscard_align need not be a power of 2). > >> + assert(count < s->cluster_size); > > Maybe add a comment for this assertion? E.g. "The block layer will only > generate unaligned discard requests that are smaller than the alignment". Sure, if the maintainer wants a respin. > >> + return -ENOTSUP; >> + } >> + >> qemu_co_mutex_lock(&s->lock); >> ret = qcow2_discard_clusters(bs, offset, count >> BDRV_SECTOR_BITS, >> QCOW2_DISCARD_REQUEST, false); >> diff --git a/block/sheepdog.c b/block/sheepdog.c >> index 1fb9173..4c9af89 100644 >> --- a/block/sheepdog.c >> +++ b/block/sheepdog.c >> @@ -2829,8 +2829,9 @@ static coroutine_fn int sd_co_pdiscard(BlockDriverState *bs, int64_t offset, >> iov.iov_len = sizeof(zero); >> discard_iov.iov = &iov; >> discard_iov.niov = 1; >> - assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0); >> - assert((count & (BDRV_SECTOR_SIZE - 1)) == 0); >> + if (!QEMU_IS_ALIGNED(offset | count, BDRV_SECTOR_SIZE)) { Again, works because this is a power of 2. >> + return -ENOTSUP; >> + } > > Out of interest: Where does sheepdog tell the block layer that requests > have to be aligned to this value? It's Magic ! Don't tell anyone that I told you :) Actually, it's because sheepdog still uses .bdrv_co_readv (sector-based), and not .bdrv_co_preadv (byte-based), so the block layer automatically sets bs->bl.request_alignment to BDRV_SECTOR_SIZE for sheepdog and all other old-school drivers. The block layer then treats bs->bl.request_alignment as the very minimum that can be changed in the image at a time, so it only makes sense that sheepdog can't react to a discard request aligned below those limits. This code is weakening from an assertion to an early error return, and then 5/9 is what starts calling the code even with something aligned smaller than a sector. Someday the sheepdog driver may be relaxed to implement byte-based callbacks, and then we may want to delete the early error return of -ENOTSUP for requests smaller than 512; but that depends on whether sheepdog uses bytes or sectors over the wire. > > With this patch, it doesn't matter though, it only did before, so: > > Reviewed-by: Max Reitz <mreitz@redhat.com> > >> acb = sd_aio_setup(bs, &discard_iov, offset >> BDRV_SECTOR_BITS, >> count >> BDRV_SECTOR_BITS); >> acb->aiocb_type = AIOCB_DISCARD_OBJ; >> > >
diff --git a/block/iscsi.c b/block/iscsi.c index 71bd523..0960929 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -1083,7 +1083,9 @@ coroutine_fn iscsi_co_pdiscard(BlockDriverState *bs, int64_t offset, int count) struct IscsiTask iTask; struct unmap_list list; - assert(is_byte_request_lun_aligned(offset, count, iscsilun)); + if (!is_byte_request_lun_aligned(offset, count, iscsilun)) { + return -ENOTSUP; + } if (!iscsilun->lbp.lbpu) { /* UNMAP is not supported by the target */ diff --git a/block/qcow2.c b/block/qcow2.c index e22f6dc..7cfcd84 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2491,6 +2491,11 @@ static coroutine_fn int qcow2_co_pdiscard(BlockDriverState *bs, int ret; BDRVQcow2State *s = bs->opaque; + if (!QEMU_IS_ALIGNED(offset | count, s->cluster_size)) { + assert(count < s->cluster_size); + return -ENOTSUP; + } + qemu_co_mutex_lock(&s->lock); ret = qcow2_discard_clusters(bs, offset, count >> BDRV_SECTOR_BITS, QCOW2_DISCARD_REQUEST, false); diff --git a/block/sheepdog.c b/block/sheepdog.c index 1fb9173..4c9af89 100644 --- a/block/sheepdog.c +++ b/block/sheepdog.c @@ -2829,8 +2829,9 @@ static coroutine_fn int sd_co_pdiscard(BlockDriverState *bs, int64_t offset, iov.iov_len = sizeof(zero); discard_iov.iov = &iov; discard_iov.niov = 1; - assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0); - assert((count & (BDRV_SECTOR_SIZE - 1)) == 0); + if (!QEMU_IS_ALIGNED(offset | count, BDRV_SECTOR_SIZE)) { + return -ENOTSUP; + } acb = sd_aio_setup(bs, &discard_iov, offset >> BDRV_SECTOR_BITS, count >> BDRV_SECTOR_BITS); acb->aiocb_type = AIOCB_DISCARD_OBJ;
Right now, the block layer rounds discard requests, so that individual drivers are able to assert that discard requests will never be unaligned. But there are some ISCSI devices that track and coalesce multiple unaligned requests, turning it into an actual discard if the requests eventually cover an entire page, which implies that it is better to always pass discard requests as low down the stack as possible. In isolation, this patch has no semantic effect, since the block layer currently never passes an unaligned request through. But the block layer already has code that silently ignores drivers that return -ENOTSUP for a discard request that cannot be honored (as well as drivers that return 0 even when nothing was done). But the next patch will update the block layer to fragment discard requests, so that clients are guaranteed that they are either dealing with an unaligned head or tail, or an aligned core, making it similar to the block layer semantics of write zero fragmentation. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> --- block/iscsi.c | 4 +++- block/qcow2.c | 5 +++++ block/sheepdog.c | 5 +++-- 3 files changed, 11 insertions(+), 3 deletions(-)