Message ID | e751b3c536ace78f975b7d2553c22dbf6845a8d4.1711361340.git.ps@pks.im (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | reftable: improvements for the `binsearch()` mechanism | expand |
On 24/03/25 11:11AM, Patrick Steinhardt wrote: > When searching over restart points in a block we decode the key of each > of the records, which results in a memory allocation. This is quite > pointless though given that records it restart points will never use > prefix compression and thus store their keys verbatim in the block. > > Refactor the code so that we can avoid decoding the keys, which saves us > some allocations. Out of curiousity, do you have any benchmarks around this change and would that be something we would want to add to the commit message? -Justin > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > --- > reftable/block.c | 29 +++++++++++++++++++---------- > 1 file changed, 19 insertions(+), 10 deletions(-) > > diff --git a/reftable/block.c b/reftable/block.c > index ca80a05e21..8bb4e43cec 100644 > --- a/reftable/block.c > +++ b/reftable/block.c > @@ -287,23 +287,32 @@ static int restart_needle_less(size_t idx, void *_args) > .buf = args->reader->block.data + off, > .len = args->reader->block_len - off, > }; > - struct strbuf kth_restart_key = STRBUF_INIT; > - uint8_t unused_extra; > - int result, n; > + uint64_t prefix_len, suffix_len; > + uint8_t extra; > + int n; > > /* > - * TODO: The restart key is verbatim in the block, so we can in theory > - * avoid decoding the key and thus save some allocations. > + * Records at restart points are stored without prefix compression, so > + * there is no need to fully decode the record key here. This removes > + * the need for allocating memory. > */ > - n = reftable_decode_key(&kth_restart_key, &unused_extra, in); > - if (n < 0) { > + n = reftable_decode_keylen(in, &prefix_len, &suffix_len, &extra); > + if (n < 0 || prefix_len) { > args->error = 1; > return -1; > } > > - result = strbuf_cmp(&args->needle, &kth_restart_key); > - strbuf_release(&kth_restart_key); > - return result < 0; > + string_view_consume(&in, n); > + if (suffix_len > in.len) { > + args->error = 1; > + return -1; > + } > + > + n = memcmp(args->needle.buf, in.buf, > + args->needle.len < suffix_len ? args->needle.len : suffix_len); > + if (n) > + return n < 0; > + return args->needle.len < suffix_len; > } > > void block_iter_copy_from(struct block_iter *dest, struct block_iter *src) > -- > 2.44.GIT >
On Tue, Apr 02, 2024 at 11:47:16AM -0500, Justin Tobler wrote: > On 24/03/25 11:11AM, Patrick Steinhardt wrote: > > When searching over restart points in a block we decode the key of each > > of the records, which results in a memory allocation. This is quite > > pointless though given that records it restart points will never use > > prefix compression and thus store their keys verbatim in the block. > > > > Refactor the code so that we can avoid decoding the keys, which saves us > > some allocations. > > Out of curiousity, do you have any benchmarks around this change and > would that be something we would want to add to the commit message? I don't have a benchmark. The problem is that the difference isn't really measureable when doing a single seek, only, because seeks are simply too fast. The only usecase where I know that there are a ton of of record seeks are writes, but here the performance improvement is getting drowned out by everything else. You can try to measure allocations and indeed see a difference. But again, this is getting drowned out by the noise for writes. With my block reader refactorings (ps/reftable-block-iteration-optim) you can see the difference when iterating through refs. Before: HEAP SUMMARY: in use at exit: 13,603 bytes in 125 blocks total heap usage: 314 allocs, 189 frees, 106,035 bytes allocated After: HEAP SUMMARY: in use at exit: 13,603 bytes in 125 blocks total heap usage: 303 allocs, 178 frees, 105,763 bytes allocated But yeah, it's nothing that'd make you go "Oh, wow!". As said, it will add up when doing many seeks, but I didn't manage to find a proper benchamrk yet that would be worthy to make it into the commit message. Patrick > -Justin > > > > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > > --- > > reftable/block.c | 29 +++++++++++++++++++---------- > > 1 file changed, 19 insertions(+), 10 deletions(-) > > > > diff --git a/reftable/block.c b/reftable/block.c > > index ca80a05e21..8bb4e43cec 100644 > > --- a/reftable/block.c > > +++ b/reftable/block.c > > @@ -287,23 +287,32 @@ static int restart_needle_less(size_t idx, void *_args) > > .buf = args->reader->block.data + off, > > .len = args->reader->block_len - off, > > }; > > - struct strbuf kth_restart_key = STRBUF_INIT; > > - uint8_t unused_extra; > > - int result, n; > > + uint64_t prefix_len, suffix_len; > > + uint8_t extra; > > + int n; > > > > /* > > - * TODO: The restart key is verbatim in the block, so we can in theory > > - * avoid decoding the key and thus save some allocations. > > + * Records at restart points are stored without prefix compression, so > > + * there is no need to fully decode the record key here. This removes > > + * the need for allocating memory. > > */ > > - n = reftable_decode_key(&kth_restart_key, &unused_extra, in); > > - if (n < 0) { > > + n = reftable_decode_keylen(in, &prefix_len, &suffix_len, &extra); > > + if (n < 0 || prefix_len) { > > args->error = 1; > > return -1; > > } > > > > - result = strbuf_cmp(&args->needle, &kth_restart_key); > > - strbuf_release(&kth_restart_key); > > - return result < 0; > > + string_view_consume(&in, n); > > + if (suffix_len > in.len) { > > + args->error = 1; > > + return -1; > > + } > > + > > + n = memcmp(args->needle.buf, in.buf, > > + args->needle.len < suffix_len ? args->needle.len : suffix_len); > > + if (n) > > + return n < 0; > > + return args->needle.len < suffix_len; > > } > > > > void block_iter_copy_from(struct block_iter *dest, struct block_iter *src) > > -- > > 2.44.GIT > > > >
diff --git a/reftable/block.c b/reftable/block.c index ca80a05e21..8bb4e43cec 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -287,23 +287,32 @@ static int restart_needle_less(size_t idx, void *_args) .buf = args->reader->block.data + off, .len = args->reader->block_len - off, }; - struct strbuf kth_restart_key = STRBUF_INIT; - uint8_t unused_extra; - int result, n; + uint64_t prefix_len, suffix_len; + uint8_t extra; + int n; /* - * TODO: The restart key is verbatim in the block, so we can in theory - * avoid decoding the key and thus save some allocations. + * Records at restart points are stored without prefix compression, so + * there is no need to fully decode the record key here. This removes + * the need for allocating memory. */ - n = reftable_decode_key(&kth_restart_key, &unused_extra, in); - if (n < 0) { + n = reftable_decode_keylen(in, &prefix_len, &suffix_len, &extra); + if (n < 0 || prefix_len) { args->error = 1; return -1; } - result = strbuf_cmp(&args->needle, &kth_restart_key); - strbuf_release(&kth_restart_key); - return result < 0; + string_view_consume(&in, n); + if (suffix_len > in.len) { + args->error = 1; + return -1; + } + + n = memcmp(args->needle.buf, in.buf, + args->needle.len < suffix_len ? args->needle.len : suffix_len); + if (n) + return n < 0; + return args->needle.len < suffix_len; } void block_iter_copy_from(struct block_iter *dest, struct block_iter *src)
When searching over restart points in a block we decode the key of each of the records, which results in a memory allocation. This is quite pointless though given that records it restart points will never use prefix compression and thus store their keys verbatim in the block. Refactor the code so that we can avoid decoding the keys, which saves us some allocations. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- reftable/block.c | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-)