[5/5] block: Move request_alignment into BlockLimit

Message ID 5751F9E8.6090900@redhat.com (mailing list archive)
State New, archived

Commit Message

Eric Blake June 3, 2016, 9:43 p.m. UTC
On 06/03/2016 11:49 AM, Eric Blake wrote:
> On 06/03/2016 11:03 AM, Eric Blake wrote:
>> It makes more sense to have ALL block size limit constraints
>> in the same struct.  Improve the documentation while at it.
>>
>> Note that bdrv_refresh_limits() has to keep things alive across
>> a memset() of BlockLimits.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---
>>  include/block/block_int.h | 12 ++++++++----
>>  block.c                   |  4 ++--
>>  block/blkdebug.c          |  4 ++--
>>  block/bochs.c             |  2 +-
>>  block/cloop.c             |  2 +-
>>  block/dmg.c               |  2 +-
>>  block/io.c                | 12 +++++++-----
>>  block/iscsi.c             |  2 +-
>>  block/raw-posix.c         | 16 ++++++++--------
>>  block/raw-win32.c         |  6 +++---
>>  block/vvfat.c             |  2 +-
>>  11 files changed, 35 insertions(+), 29 deletions(-)
> 
> Something in this patch is causing qemu-iotests 77 to infloop; we may
> decide it is just easier to drop this patch rather than find all the
> places where the request_alignment must be preserved across what
> otherwise zeroes out limits.

Found it; squash this in (or use it as an argument why we don't want
request_alignment in bs->bl after all):

 static int raw_truncate(BlockDriverState *bs, int64_t offset)

Comments

Kevin Wolf June 7, 2016, 10:08 a.m. UTC | #1
Am 03.06.2016 um 23:43 hat Eric Blake geschrieben:
> On 06/03/2016 11:49 AM, Eric Blake wrote:
> > On 06/03/2016 11:03 AM, Eric Blake wrote:
> >> It makes more sense to have ALL block size limit constraints
> >> in the same struct.  Improve the documentation while at it.
> >>
> >> Note that bdrv_refresh_limits() has to keep things alive across
> >> a memset() of BlockLimits.
> >>
> >> Signed-off-by: Eric Blake <eblake@redhat.com>
> >> ---
> >>  include/block/block_int.h | 12 ++++++++----
> >>  block.c                   |  4 ++--
> >>  block/blkdebug.c          |  4 ++--
> >>  block/bochs.c             |  2 +-
> >>  block/cloop.c             |  2 +-
> >>  block/dmg.c               |  2 +-
> >>  block/io.c                | 12 +++++++-----
> >>  block/iscsi.c             |  2 +-
> >>  block/raw-posix.c         | 16 ++++++++--------
> >>  block/raw-win32.c         |  6 +++---
> >>  block/vvfat.c             |  2 +-
> >>  11 files changed, 35 insertions(+), 29 deletions(-)
> > 
> > Something in this patch is causing qemu-iotests 77 to infloop; we may
> > decide it is just easier to drop this patch rather than find all the
> > places where the request_alignment must be preserved across what
> > otherwise zeroes out limits.
> 
> Found it; squash this in (or use it as an argument why we don't want
> request_alignment in bs->bl after all):

This hunk doesn't make sense to me. For the correctness of the code it
shouldn't make a difference whether the alignment happens before passing
the request to file/raw-posix or already in the raw format layer.

The cause for the hang you're seeing is probably that the request is
already aligned before the blkdebug layer and therefore the blkdebug
events aren't generated any more. That's a problem with the test (I'm
considering the blkdebug events part of the test infrastructure),
however, and not with the code.

Kevin

> diff --git i/block/raw_bsd.c w/block/raw_bsd.c
> index b1d5237..c3c2246 100644
> --- i/block/raw_bsd.c
> +++ w/block/raw_bsd.c
> @@ -152,7 +152,11 @@ static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
> 
>  static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>  {
> +    /* Inherit all limits except for request_alignment */
> +    int request_alignment = bs->bl.request_alignment;
> +
>      bs->bl = bs->file->bs->bl;
> +    bs->bl.request_alignment = request_alignment;
>  }
> 
>  static int raw_truncate(BlockDriverState *bs, int64_t offset)

Paolo Bonzini June 7, 2016, 11:04 a.m. UTC | #2
On 07/06/2016 12:08, Kevin Wolf wrote:
>>> > > Something in this patch is causing qemu-iotests 77 to infloop; we may
>>> > > decide it is just easier to drop this patch rather than find all the
>>> > > places where the request_alignment must be preserved across what
>>> > > otherwise zeroes out limits.
>> > 
>> > Found it; squash this in (or use it as an argument why we don't want
>> > request_alignment in bs->bl after all):
> This hunk doesn't make sense to me. For the correctness of the code it
> shouldn't make a difference whether the alignment happens before passing
> the request to file/raw-posix or already in the raw format layer.
> 
> The cause for the hang you're seeing is probably that the request is
> already aligned before the blkdebug layer and therefore the blkdebug
> events aren't generated any more. That's a problem with the test (I'm
> considering the blkdebug events part of the test infrastructure),
> however, and not with the code.

Perhaps you could add an alignment option to blkdebug (or in general
options to force block limits on blkdebug), which would help testing in
general?

Thanks,

Paolo

Kevin Wolf June 7, 2016, 11:24 a.m. UTC | #3
Am 07.06.2016 um 13:04 hat Paolo Bonzini geschrieben:
> On 07/06/2016 12:08, Kevin Wolf wrote:
> >>> > > Something in this patch is causing qemu-iotests 77 to infloop; we may
> >>> > > decide it is just easier to drop this patch rather than find all the
> >>> > > places where the request_alignment must be preserved across what
> >>> > > otherwise zeroes out limits.
> >> > 
> >> > Found it; squash this in (or use it as an argument why we don't want
> >> > request_alignment in bs->bl after all):
> > This hunk doesn't make sense to me. For the correctness of the code it
> > shouldn't make a difference whether the alignment happens before passing
> > the request to file/raw-posix or already in the raw format layer.
> > 
> > The cause for the hang you're seeing is probably that the request is
> > already aligned before the blkdebug layer and therefore the blkdebug
> > events aren't generated any more. That's a problem with the test (I'm
> > considering the blkdebug events part of the test infrastructure),
> > however, and not with the code.
> 
> Perhaps you could add an alignment option to blkdebug (or in general
> options to force block limits on blkdebug), which would help testing in
> general?

It exists and is exactly what this test uses.

The problem is just that the raw format driver inherits the alignment,
so the RMW cycle that we want to test with blkdebug breakpoints happens
too early and the blkdebug layer (more precisely, the block/io.c
functions for the blkdebug BDS before they call into the driver) already
sees aligned requests and we don't get the events any more that the test
is looking for.

Kevin

Eric Blake June 14, 2016, 4:39 a.m. UTC | #4
On 06/07/2016 04:08 AM, Kevin Wolf wrote:

>> Found it; squash this in (or use it as an argument why we don't want
>> request_alignment in bs->bl after all):
> 
> This hunk doesn't make sense to me. For the correctness of the code it
> shouldn't make a difference whether the alignment happens before passing
> the request to file/raw-posix or already in the raw format layer.
> 
> The cause for the hang you're seeing is probably that the request is
> already aligned before the blkdebug layer and therefore the blkdebug
> events aren't generated any more. That's a problem with the test (I'm
> considering the blkdebug events part of the test infrastructure),
> however, and not with the code.
> 

Yes, it's definitely a hang caused by the test expecting an unalignment
event, but the inheritance chain now causes things to be aligned to
begin with and nothing unaligned happens after all.

> Kevin
> 
>> diff --git i/block/raw_bsd.c w/block/raw_bsd.c
>> index b1d5237..c3c2246 100644
>> --- i/block/raw_bsd.c
>> +++ w/block/raw_bsd.c
>> @@ -152,7 +152,11 @@ static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
>>
>>  static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>>  {
>> +    /* Inherit all limits except for request_alignment */
>> +    int request_alignment = bs->bl.request_alignment;
>> +
>>      bs->bl = bs->file->bs->bl;
>> +    bs->bl.request_alignment = request_alignment;

Any ideas on how to fix the test, then?  Have two blkdebug devices
nested atop one another, since those are the devices where we can
explicitly override alignment? (normally, you'd _expect_ the chain to
inherit the worst-case alignment of all BDS in the chain, so blkdebug is
the way around it).

That's the only thing left before I repost the series, so I may just
post the last patch as RFC and play with it a bit more...

Kevin Wolf June 14, 2016, 8:05 a.m. UTC | #5
Am 14.06.2016 um 06:39 hat Eric Blake geschrieben:
> On 06/07/2016 04:08 AM, Kevin Wolf wrote:
> 
> >> Found it; squash this in (or use it as an argument why we don't want
> >> request_alignment in bs->bl after all):
> > 
> > This hunk doesn't make sense to me. For the correctness of the code it
> > shouldn't make a difference whether the alignment happens before passing
> > the request to file/raw-posix or already in the raw format layer.
> > 
> > The cause for the hang you're seeing is probably that the request is
> > already aligned before the blkdebug layer and therefore the blkdebug
> > events aren't generated any more. That's a problem with the test (I'm
> > considering the blkdebug events part of the test infrastructure),
> > however, and not with the code.
> > 
> 
> Yes, it's definitely a hang caused by the test expecting an unalignment
> event, but the inheritance chain now causes things to be aligned to
> begin with and nothing unaligned happens after all.
> 
> > Kevin
> > 
> >> diff --git i/block/raw_bsd.c w/block/raw_bsd.c
> >> index b1d5237..c3c2246 100644
> >> --- i/block/raw_bsd.c
> >> +++ w/block/raw_bsd.c
> >> @@ -152,7 +152,11 @@ static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
> >>
> >>  static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
> >>  {
> >> +    /* Inherit all limits except for request_alignment */
> >> +    int request_alignment = bs->bl.request_alignment;
> >> +
> >>      bs->bl = bs->file->bs->bl;
> >> +    bs->bl.request_alignment = request_alignment;
> 
> Any ideas on how to fix the test, then?  Have two blkdebug devices
> nested atop one another, since those are the devices where we can
> explicitly override alignment?

Interesting idea. Maybe that's a good option if it works.

> (normally, you'd _expect_ the chain to
> inherit the worst-case alignment of all BDS in the chain, so blkdebug is
> the way around it).

Actually, I think there are two cases.

The first one is that you get a request and want to know what to do with
it. Here you don't want to inherit the worst-case alignment. You'd
rather want to enforce alignment only when it is actually needed. For
example, if you have a cache=none backing file with 4k alignment, but a
cache=writeback overlay, and you don't actually access the backing file
with this request, it would be wasteful to align the request. You also
don't really know that a driver will issue a misaligned request (or any
request at all) to the lower layer just because it got called with one.

The second case is when you want to issue a request. For example, let's
take the qcow2 cache here, which has 64k cached and needs to update a
few bytes. Currently it always writes the full 64k (and with 1 MB
clusters a full megabyte), but what it really should do is consider the
worst-case alignment.

This is comparable to the difference between bdrv_opt_mem_align(), which
is used in qemu_blockalign() where we want to create a buffer that works
even in the worst case, and bdrv_min_mem_align(), which is used in
bdrv_qiov_is_aligned() in order to determine whether we must align an
existing request.

> That's the only thing left before I repost the series, so I may just
> post the last patch as RFC and play with it a bit more...

And in the light of the above, maybe the solution is as easy as changing
raw_refresh_limits() to set bs->bl.request_alignment = 1.

Kevin

Eric Blake June 14, 2016, 2:47 p.m. UTC | #6
On 06/14/2016 02:05 AM, Kevin Wolf wrote:

>>>>  static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>>>>  {
>>>> +    /* Inherit all limits except for request_alignment */
>>>> +    int request_alignment = bs->bl.request_alignment;
>>>> +
>>>>      bs->bl = bs->file->bs->bl;
>>>> +    bs->bl.request_alignment = request_alignment;
>>
>> Any ideas on how to fix the test, then?  Have two blkdebug devices
>> nested atop one another, since those are the devices where we can
>> explicitly override alignment?
> 
> Interesting idea. Maybe that's a good option if it works.
> 
>> (normally, you'd _expect_ the chain to
>> inherit the worst-case alignment of all BDS in the chain, so blkdebug is
>> the way around it).
> 
> Actually, I think there are two cases.
> 
> The first one is that you get a request and want to know what to do with
> it. Here you don't want to inherit the worst-case alignment. You'd
> rather want to enforce alignment only when it is actually needed. For
> example, if you have a cache=none backing file with 4k alignment, but a
> cache=writeback overlay, and you don't actually access the backing file
> with this request, it would be wasteful to align the request. You also
> don't really know that a driver will issue a misaligned request (or any
> request at all) to the lower layer just because it got called with one.
> 
> The second case is when you want to issue a request. For example, let's
> take the qcow2 cache here, which has 64k cached and needs to update a
> few bytes. Currently it always writes the full 64k (and with 1 MB
> clusters a full megabyte), but what it really should do is consider the
> worst-case alignment.
> 
> This is comparable to the difference between bdrv_opt_mem_align(), which
> is used in qemu_blockalign() where we want to create a buffer that works
> even in the worst case, and bdrv_min_mem_align(), which is used in
> bdrv_qiov_is_aligned() in order to determine whether we must align an
> existing request.
> 
>> That's the only thing left before I repost the series, so I may just
>> post the last patch as RFC and play with it a bit more...
> 
> And in the light of the above, maybe the solution is as easy as changing
> raw_refresh_limits() to set bs->bl.request_alignment = 1.

Or maybe split the main bdrv_refresh_limits() into two pieces: one that
merges limits from one BDS into another (the limits that are worth
merging, and in the correct direction: opt merges to MAX, max merges to
MIN_NON_ZERO, request_alignment is not merged), the other that calls
merge(bs, bs->file->bs); then have raw_refresh_limits() also use the
common merge functionality rather than straight copy.  Testing that
approach now...

Kevin Wolf June 14, 2016, 3:30 p.m. UTC | #7
Am 14.06.2016 um 16:47 hat Eric Blake geschrieben:
> On 06/14/2016 02:05 AM, Kevin Wolf wrote:
> 
> >>>>  static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
> >>>>  {
> >>>> +    /* Inherit all limits except for request_alignment */
> >>>> +    int request_alignment = bs->bl.request_alignment;
> >>>> +
> >>>>      bs->bl = bs->file->bs->bl;
> >>>> +    bs->bl.request_alignment = request_alignment;
> >>
> >> Any ideas on how to fix the test, then?  Have two blkdebug devices
> >> nested atop one another, since those are the devices where we can
> >> explicitly override alignment?
> > 
> > Interesting idea. Maybe that's a good option if it works.
> > 
> >> (normally, you'd _expect_ the chain to
> >> inherit the worst-case alignment of all BDS in the chain, so blkdebug is
> >> the way around it).
> > 
> > Actually, I think there are two cases.
> > 
> > The first one is that you get a request and want to know what to do with
> > it. Here you don't want to inherit the worst-case alignment. You'd
> > rather want to enforce alignment only when it is actually needed. For
> > example, if you have a cache=none backing file with 4k alignment, but a
> > cache=writeback overlay, and you don't actually access the backing file
> > with this request, it would be wasteful to align the request. You also
> > don't really know that a driver will issue a misaligned request (or any
> > request at all) to the lower layer just because it got called with one.
> > 
> > The second case is when you want to issue a request. For example, let's
> > take the qcow2 cache here, which has 64k cached and needs to update a
> > few bytes. Currently it always writes the full 64k (and with 1 MB
> > clusters a full megabyte), but what it really should do is consider the
> > worst-case alignment.
> > 
> > This is comparable to the difference between bdrv_opt_mem_align(), which
> > is used in qemu_blockalign() where we want to create a buffer that works
> > even in the worst case, and bdrv_min_mem_align(), which is used in
> > bdrv_qiov_is_aligned() in order to determine whether we must align an
> > existing request.
> > 
> >> That's the only thing left before I repost the series, so I may just
> >> post the last patch as RFC and play with it a bit more...
> > 
> > And in the light of the above, maybe the solution is as easy as changing
> > raw_refresh_limits() to set bs->bl.request_alignment = 1.
> 
> Or maybe split the main bdrv_refresh_limits() into two pieces: one that
> merges limits from one BDS into another (the limits that are worth
> merging, and in the correct direction: opt merges to MAX, max merges to
> MIN_NON_ZERO, request_alignment is not merged), the other that calls
> merge(bs, bs->file->bs); then have raw_refresh_limits() also use the
> common merge functionality rather than straight copy.  Testing that
> approach now...

So you don't agree with what I said above?

I think that block drivers should only set limits that they require for
themselves. The block layer bdrv_refresh_limits() function can then
merge things where needed; drivers shouldn't be involved there.

Kevin

Patch

diff --git i/block/raw_bsd.c w/block/raw_bsd.c
index b1d5237..c3c2246 100644
--- i/block/raw_bsd.c
+++ w/block/raw_bsd.c
@@ -152,7 +152,11 @@ static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)

 static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
 {
+    /* Inherit all limits except for request_alignment */
+    int request_alignment = bs->bl.request_alignment;
+
     bs->bl = bs->file->bs->bl;
+    bs->bl.request_alignment = request_alignment;
 }

 static int raw_truncate(BlockDriverState *bs, int64_t offset)