diff mbox series

[v4,7/8] file-posix: account discard operations

Message ID 1534844779-118784-8-git-send-email-anton.nefedov@virtuozzo.com (mailing list archive)
State New, archived
Headers show
Series discard blockstats | expand

Commit Message

Anton Nefedov Aug. 21, 2018, 9:46 a.m. UTC
This will help to identify how many of the user-issued discard operations
(accounted on a device level) have actually suceeded down on the host file
(even though the numbers will not be exactly the same if non-raw format
driver is used (e.g. qcow2 sending metadata discards)).

Note that these numbers will not include discards triggered by
write-zeroes + MAY_UNMAP calls.

Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
---
 block/file-posix.c | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

Comments

Kevin Wolf Oct. 4, 2018, 3:52 p.m. UTC | #1
Am 21.08.2018 um 11:46 hat Anton Nefedov geschrieben:
> This will help to identify how many of the user-issued discard operations
> (accounted on a device level) have actually suceeded down on the host file
> (even though the numbers will not be exactly the same if non-raw format
> driver is used (e.g. qcow2 sending metadata discards)).
> 
> Note that these numbers will not include discards triggered by
> write-zeroes + MAY_UNMAP calls.
> 
> Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>

Why not implement accounting at the BDS level for all drivers? Then we
can also reuse the existing BlockStats fields instead of duplicating
them into driver-specific new ones.

Kevin
Anton Nefedov Oct. 8, 2018, 1:47 p.m. UTC | #2
On 4/10/2018 6:52 PM, Kevin Wolf wrote:
> Am 21.08.2018 um 11:46 hat Anton Nefedov geschrieben:
>> This will help to identify how many of the user-issued discard operations
>> (accounted on a device level) have actually suceeded down on the host file
>> (even though the numbers will not be exactly the same if non-raw format
>> driver is used (e.g. qcow2 sending metadata discards)).
>>
>> Note that these numbers will not include discards triggered by
>> write-zeroes + MAY_UNMAP calls.
>>
>> Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
> 
> Why not implement accounting at the BDS level for all drivers? Then we
> can also reuse the existing BlockStats fields instead of duplicating
> them into driver-specific new ones.
> 
> Kevin
> 

I just wonder how useful is that?
Discards are interesting to see on the BDS level as they are optional.
Maybe one can be curious how much data is being read from backing
images. Anything else?

It feels like BDS level is complex enough as it is.
Accounting means we should keep that in mind in all the error paths,
with a risk to not account or to account twice (even more complicated
with bounce buffer fallback paths).

We could probably choose some anchor point to bind to, like

   - account at the very bottom only, i.e. right after drv->bdrv_x()
       (-) missing the cases where earlier sanity checks fired
       (-) easy to double-account fallback scenarios

   - account at tracked_request_begin/end
       (-) missing the cases where earlier sanity checks fired
       (-) accounting blk layer requests, and not the ones actually
           passed to the driver (which can be smaller in size due to
           BlockLimits - and can fail partially in discard case)

Maybe another option would be to keep BlockStats on BDS but still let
the drivers update them?

/Anton
diff mbox series

Patch

diff --git a/block/file-posix.c b/block/file-posix.c
index fe83cbf..c420f76 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -163,6 +163,11 @@  typedef struct BDRVRawState {
     bool has_fallocate;
     bool needs_alignment;
     bool check_cache_dropped;
+    struct {
+        int64_t discard_nb_ok;
+        int64_t discard_nb_failed;
+        int64_t discard_bytes_ok;
+    } stats;
 
     PRManager *pr_mgr;
 } BDRVRawState;
@@ -2575,12 +2580,25 @@  static void coroutine_fn raw_co_invalidate_cache(BlockDriverState *bs,
 #endif /* !__linux__ */
 }
 
+static void raw_account_discard(BDRVRawState *s, uint64_t nbytes, int ret)
+{
+    if (ret) {
+        s->stats.discard_nb_failed++;
+    } else {
+        s->stats.discard_nb_ok++;
+        s->stats.discard_bytes_ok += nbytes;
+    }
+}
+
 static coroutine_fn int
 raw_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
 {
     BDRVRawState *s = bs->opaque;
+    int ret;
 
-    return paio_submit_co(bs, s->fd, offset, NULL, bytes, QEMU_AIO_DISCARD);
+    ret = paio_submit_co(bs, s->fd, offset, NULL, bytes, QEMU_AIO_DISCARD);
+    raw_account_discard(s, bytes, ret);
+    return ret;
 }
 
 static int coroutine_fn raw_co_pwrite_zeroes(
@@ -3077,10 +3095,14 @@  hdev_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
 
     ret = fd_open(bs);
     if (ret < 0) {
+        raw_account_discard(s, bytes, ret);
         return ret;
     }
-    return paio_submit_co(bs, s->fd, offset, NULL, bytes,
-                          QEMU_AIO_DISCARD | QEMU_AIO_BLKDEV);
+
+    ret = paio_submit_co(bs, s->fd, offset, NULL, bytes,
+                         QEMU_AIO_DISCARD | QEMU_AIO_BLKDEV);
+    raw_account_discard(s, bytes, ret);
+    return ret;
 }
 
 static coroutine_fn int hdev_co_pwrite_zeroes(BlockDriverState *bs,