diff mbox series

[1/6] bio: optimize initialization of a bio

Message ID 20210811193533.766613-2-axboe@kernel.dk (mailing list archive)
State New, archived
Headers show
Series Enable bio recycling for polled IO | expand

Commit Message

Jens Axboe Aug. 11, 2021, 7:35 p.m. UTC
The memset() used is measurably slower in targeted benchmarks. Get rid
of it and fill in the bio manually, in a separate helper.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/bio.c | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

Comments

Christoph Hellwig Aug. 12, 2021, 6:51 a.m. UTC | #1
On Wed, Aug 11, 2021 at 01:35:28PM -0600, Jens Axboe wrote:
> The memset() used is measurably slower in targeted benchmarks. Get rid
> of it and fill in the bio manually, in a separate helper.

If you have some numbers if would be great to throw them in here.

> +static inline void __bio_init(struct bio *bio)

Why is this split from bio_init and what are the criteria where an
initialization goes?

> +	bio->bi_flags = bio->bi_ioprio = bio->bi_write_hint = 0;

Please keep each initialization on a separate line.
Jens Axboe Aug. 12, 2021, 3:29 p.m. UTC | #2
On 8/12/21 12:51 AM, Christoph Hellwig wrote:
> On Wed, Aug 11, 2021 at 01:35:28PM -0600, Jens Axboe wrote:
>> The memset() used is measurably slower in targeted benchmarks. Get rid
>> of it and fill in the bio manually, in a separate helper.
> 
> If you have some numbers if would be great to throw them in here.

It's about 1% of the overhead of the alloc after the cache, which
comes later in the series.

Percent│    return __underlying_memset(p, c, size);
       │      lea    0x8(%r8),%rdi
       │    bio_alloc_kiocb():
  2.18 │      cmove  %rax,%r9
       │    memset():
       │      mov    %r8,%rcx
       │      and    $0xfffffffffffffff8,%rdi
       │      movq   $0x0,(%r8)
       │      sub    %rdi,%rcx
       │      add    $0x60,%ecx
       │      shr    $0x3,%ecx
 55.02 │      rep    stos %rax,%es:(%rdi)

This is on AMD, might look different on Intel, the manual clear seems
like a nice win on both. As a minor detail, avoids things like
re-setting bio->bi_pool for cached entries, as it never changes.


>> +static inline void __bio_init(struct bio *bio)
> 
> Why is this split from bio_init and what are the criteria where an
> initialization goes?

Got rid of the helper.

>> +	bio->bi_flags = bio->bi_ioprio = bio->bi_write_hint = 0;
> 
> Please keep each initialization on a separate line.

Done
diff mbox series

Patch

diff --git a/block/bio.c b/block/bio.c
index 1fab762e079b..0b1025899131 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -238,6 +238,35 @@  static void bio_free(struct bio *bio)
 	}
 }
 
+static inline void __bio_init(struct bio *bio)
+{
+	bio->bi_next = NULL;
+	bio->bi_bdev = NULL;
+	bio->bi_opf = 0;
+	bio->bi_flags = bio->bi_ioprio = bio->bi_write_hint = 0;
+	bio->bi_status = 0;
+	bio->bi_iter.bi_sector = 0;
+	bio->bi_iter.bi_size = 0;
+	bio->bi_iter.bi_idx = 0;
+	bio->bi_iter.bi_bvec_done = 0;
+	bio->bi_end_io = NULL;
+	bio->bi_private = NULL;
+#ifdef CONFIG_BLK_CGROUP
+	bio->bi_blkg = NULL;
+	bio->bi_issue.value = 0;
+#ifdef CONFIG_BLK_CGROUP_IOCOST
+	bio->bi_iocost_cost = 0;
+#endif
+#endif
+#ifdef CONFIG_BLK_INLINE_ENCRYPTION
+	bio->bi_crypt_context = NULL;
+#endif
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+	bio->bi_integrity = NULL;
+#endif
+	bio->bi_vcnt = 0;
+}
+
 /*
  * Users of this function have their own bio allocation. Subsequently,
  * they must remember to pair any call to bio_init() with bio_uninit()
@@ -246,7 +275,7 @@  static void bio_free(struct bio *bio)
 void bio_init(struct bio *bio, struct bio_vec *table,
 	      unsigned short max_vecs)
 {
-	memset(bio, 0, sizeof(*bio));
+	__bio_init(bio);
 	atomic_set(&bio->__bi_remaining, 1);
 	atomic_set(&bio->__bi_cnt, 1);