diff mbox

[3/5] raw_bsd: Don't advertise flags not supported by protocol layer

Message ID 1466465969-25315-4-git-send-email-eblake@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Eric Blake June 20, 2016, 11:39 p.m. UTC
The raw format layer supports all flags via passthrough - but
it only makes sense to pass through flags that the lower layer
actually supports.

Thanks to the previous patch, the raw format layer now attempts
to fragment writes at the max_transfer limit it inherits from
the NBD protocol layer, recently set to 32m.  An attempt to do
'w -f 0 40m' to an NBD server that lacks FUA thus changed from
flushing once (after NBD fragmented a single 40m write itself)
to instead flushing twice (the format layer sees BDRV_REQ_FUA
in supported_write_flags, so it sends the flag on to both
fragments, and then the block layer emulates FUA by flushing
for both the 32m and 8m fragments at the protocol layer).
This patch fixes the performance regression (now that the
format layer no longer advertises a flag not present at the
protocol layer, the flush to emulate FUA is deferred to the
last fragment).

Note that 'w -f -z 0 40m' does not currently exhibit the same
problem, because there, the fragmentation does not occur until
at the NBD layer (the raw layer has .bdrv_co_pwrite_zeroes, and
the NBD layer doesn't advertise max_pwrite_zeroes to constrain
things at the raw layer) - but that problem is latent and would
have the same problem with too many flushes without this patch
once the NBD layer implements support for using the new
NBD_CMD_WRITE_ZEROES and sets max_pwrite_zeroes to the same 32m
limit as recommended by the NBD protocol.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/raw_bsd.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Kevin Wolf July 8, 2016, 11:05 a.m. UTC | #1
Am 21.06.2016 um 01:39 hat Eric Blake geschrieben:
> The raw format layer supports all flags via passthrough - but
> it only makes sense to pass through flags that the lower layer
> actually supports.
> 
> Thanks to the previous patch, the raw format layer now attempts
> to fragment writes at the max_transfer limit it inherits from
> the NBD protocol layer, recently set to 32m.  An attempt to do
> 'w -f 0 40m' to an NBD server that lacks FUA thus changed from
> flushing once (after NBD fragmented a single 40m write itself)
> to instead flushing twice (the format layer sees BDRV_REQ_FUA
> in supported_write_flags, so it sends the flag on to both
> fragments, and then the block layer emulates FUA by flushing
> for both the 32m and 8m fragments at the protocol layer).
> This patch fixes the performance regression (now that the
> format layer no longer advertises a flag not present at the
> protocol layer, the flush to emulate FUA is deferred to the
> last fragment).
> 
> Note that 'w -f -z 0 40m' does not currently exhibit the same
> problem, because there, the fragmentation does not occur until
> at the NBD layer (the raw layer has .bdrv_co_pwrite_zeroes, and
> the NBD layer doesn't advertise max_pwrite_zeroes to constrain
> things at the raw layer) - but that problem is latent and would
> have the same problem with too many flushes without this patch
> once the NBD layer implements support for using the new
> NBD_CMD_WRITE_ZEROES and sets max_pwrite_zeroes to the same 32m
> limit as recommended by the NBD protocol.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

Should this be moved before patch 2 so that we never get a regression in
the first place?

Kevin
Eric Blake July 8, 2016, 2:32 p.m. UTC | #2
On 07/08/2016 05:05 AM, Kevin Wolf wrote:
> Am 21.06.2016 um 01:39 hat Eric Blake geschrieben:
>> The raw format layer supports all flags via passthrough - but
>> it only makes sense to pass through flags that the lower layer
>> actually supports.
>>
>> Thanks to the previous patch, the raw format layer now attempts
>> to fragment writes at the max_transfer limit it inherits from
>> the NBD protocol layer, recently set to 32m.  An attempt to do
>> 'w -f 0 40m' to an NBD server that lacks FUA thus changed from
>> flushing once (after NBD fragmented a single 40m write itself)
>> to instead flushing twice (the format layer sees BDRV_REQ_FUA
>> in supported_write_flags, so it sends the flag on to both
>> fragments, and then the block layer emulates FUA by flushing
>> for both the 32m and 8m fragments at the protocol layer).
>> This patch fixes the performance regression (now that the
>> format layer no longer advertises a flag not present at the
>> protocol layer, the flush to emulate FUA is deferred to the
>> last fragment).
>>
>> Note that 'w -f -z 0 40m' does not currently exhibit the same
>> problem, because there, the fragmentation does not occur until
>> at the NBD layer (the raw layer has .bdrv_co_pwrite_zeroes, and
>> the NBD layer doesn't advertise max_pwrite_zeroes to constrain
>> things at the raw layer) - but that problem is latent and would
>> have the same problem with too many flushes without this patch
>> once the NBD layer implements support for using the new
>> NBD_CMD_WRITE_ZEROES and sets max_pwrite_zeroes to the same 32m
>> limit as recommended by the NBD protocol.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> Should this be moved before patch 2 so that we never get a regression in
> the first place?

Can do, although it will require some word-smithing to the commit message.
diff mbox

Patch

diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 351ed2a..47a8352 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -197,8 +197,10 @@  static int raw_open(BlockDriverState *bs, QDict *options, int flags,
                     Error **errp)
 {
     bs->sg = bs->file->bs->sg;
-    bs->supported_write_flags = BDRV_REQ_FUA;
-    bs->supported_zero_flags = BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP;
+    bs->supported_write_flags = BDRV_REQ_FUA &
+        bs->file->bs->supported_write_flags;
+    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
+        bs->file->bs->supported_zero_flags;

     if (bs->probed && !bdrv_is_read_only(bs)) {
         fprintf(stderr,