mbox series

[v2,0/4] reftable: fix out-of-memory errors on NonStop

Message ID 20241222-b4-pks-reftable-oom-fix-without-readers-v2-0-19550090d15a@pks.im (mailing list archive)
Headers show
Series reftable: fix out-of-memory errors on NonStop | expand

Message

Patrick Steinhardt Dec. 22, 2024, 7:24 a.m. UTC
Hi,

this small patch series fixes out-of-memory errors on NonStop with the
reftable backend. These errors are caused by zero-sized allocations,
which return `NULL` pointers on NonStop.

Changes in v2:

  - Some small touchups to commit messages.
  - Explain why it is safe to stop auto-compacting with less than two
    tables.
  - Adapt `reftable_stack_reload_once()` so that we only do the minimum
    changes required to fix issue.
  - Link to v1: https://lore.kernel.org/r/20241221-b4-pks-reftable-oom-fix-without-readers-v1-0-12db83a3267c@pks.im

Thanks!

Patrick

---
Patrick Steinhardt (4):
      reftable/stack: don't perform auto-compaction with less than two tables
      reftable/merged: fix zero-sized allocation when there are no readers
      reftable/stack: fix zero-sized allocation when there are no readers
      reftable/basics: return NULL on zero-sized allocations

 reftable/basics.c |  7 +++++++
 reftable/merged.c | 12 +++++++-----
 reftable/stack.c  | 27 +++++++++++++++++----------
 3 files changed, 31 insertions(+), 15 deletions(-)

Range-diff versus v1:

1:  9fcd452995 ! 1:  a7c5b7c52e reftable/stack: don't perform auto-compaction with less than two tables
    @@ Commit message
         from `reftable_stack_auto_compact()` in case we have less than two
         tables.
     
    -    This is mostly defense in depth: `stack_table_sizes_for_compaction()`
    -    may try to allocate a zero-byte object when there aren't any readers,
    -    and that may lead to a `NULL` pointer on some platforms like NonStop
    -    which causes us to bail out with an out-of-memory error.
    +    In the original, `stack_table_sizes_for_compaction()` yields an array
    +    that has the same length as the number of tables. This array is then
    +    passed on to `suggest_compaction_segment()`, which returns an empty
    +    segment in case we have less than two tables. The segment is then passed
    +    to `segment_size()`, which will return `0` because both start and end of
    +    the segment are `0`. And because we only call `stack_compact_range()` in
    +    case we have a positive segment size we don't perform auto-compaction at
    +    all. Consequently, this change does not result in a user-visible change
    +    in behaviour when called with a single table.
    +
    +    But when called with no tables this protects us against a potential
    +    out-of-memory error: `stack_table_sizes_for_compaction()` would try to
    +    allocate a zero-byte object when there aren't any tables, and that may
    +    lead to a `NULL` pointer on some platforms like NonStop which causes us
    +    to bail out with an out-of-memory error.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
2:  0703e520de ! 2:  1ca81345ce reftable/merged: fix zero-sized allocation when there are no readers
    @@ Metadata
      ## Commit message ##
         reftable/merged: fix zero-sized allocation when there are no readers
     
    -    It was reported [1c that Git started to fail with an out-of-memory error
    +    It was reported [1] that Git started to fail with an out-of-memory error
         when initializing repositories with the reftable backend on NonStop
         platforms. A bisect led to 802c0646ac (reftable/merged: handle
         allocation failures in `merged_table_init_iter()`, 2024-10-02), which
    @@ Commit message
         The root cause of this seems to be that NonStop returns a `NULL` pointer
         when doing a zero-sized allocation. This would've already happened
         before the above change, but we never noticed because we did not check
    -    the result. Now that we do we notice and thus return an out-of-memory
    -    error to the caller.
    +    the result. Now we do notice and thus return an out-of-memory error to
    +    the caller.
     
         Fix the issue by skipping the allocation altogether in case there are no
         readers.
3:  67baf67817 ! 3:  9be3c6e6c1 reftable/stack: fix zero-sized allocation when there are no readers
    @@ Commit message
         situation, and thus we return an error.
     
         Fix this by only allocating arrays when they have at least one entry.
    -    Refactor the code so that we don't try to access those arrays in case
    -    they are empty.
     
         Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
    @@ reftable/stack.c: static int reftable_stack_reload_once(struct reftable_stack *s
      	size_t reused_len = 0, reused_alloc = 0, names_len;
      	size_t new_readers_len = 0;
      	struct reftable_merged_table *new_merged = NULL;
    - 	struct reftable_buf table_path = REFTABLE_BUF_INIT;
    +@@ reftable/stack.c: static int reftable_stack_reload_once(struct reftable_stack *st,
      	int err = 0;
    --	size_t i;
    + 	size_t i;
      
     -	cur = stack_copy_readers(st, cur_len);
     -	if (!cur) {
    @@ reftable/stack.c: static int reftable_stack_reload_once(struct reftable_stack *s
      	}
      
      	names_len = names_length(names);
    --
    + 
     -	new_readers = reftable_calloc(names_len, sizeof(*new_readers));
     -	if (!new_readers) {
     -		err = REFTABLE_OUT_OF_MEMORY_ERROR;
    @@ reftable/stack.c: static int reftable_stack_reload_once(struct reftable_stack *s
     +		}
      	}
      
    --	while (*names) {
    -+	for (size_t i = 0; i < names_len; i++) {
    - 		struct reftable_reader *rd = NULL;
    --		const char *name = *names++;
    -+		const char *name = names[i];
    - 
    - 		/* this is linear; we assume compaction keeps the number of
    - 		   tables under control so this is not quadratic. */
    --		for (i = 0; reuse_open && i < cur_len; i++) {
    --			if (cur[i] && 0 == strcmp(cur[i]->name, name)) {
    --				rd = cur[i];
    --				cur[i] = NULL;
    -+		for (size_t j = 0; reuse_open && j < cur_len; j++) {
    -+			if (cur[j] && 0 == strcmp(cur[j]->name, name)) {
    -+				rd = cur[j];
    -+				cur[j] = NULL;
    - 
    - 				/*
    - 				 * When reloading the stack fails, we end up
    -@@ reftable/stack.c: static int reftable_stack_reload_once(struct reftable_stack *st,
    - 	 * file of such an open reader wouldn't have been possible to be
    - 	 * unlinked by the compacting process.
    - 	 */
    --	for (i = 0; i < cur_len; i++) {
    -+	for (size_t i = 0; i < cur_len; i++) {
    - 		if (cur[i]) {
    - 			const char *name = reader_name(cur[i]);
    - 
    -@@ reftable/stack.c: static int reftable_stack_reload_once(struct reftable_stack *st,
    - 	 * happen on the successful case, because on the unsuccessful one we
    - 	 * decrement their refcount via `new_readers`.
    - 	 */
    --	for (i = 0; i < reused_len; i++)
    -+	for (size_t i = 0; i < reused_len; i++)
    - 		reftable_reader_decref(reused[i]);
    - 
    - done:
    --	for (i = 0; i < new_readers_len; i++)
    -+	for (size_t i = 0; i < new_readers_len; i++)
    - 		reftable_reader_decref(new_readers[i]);
    - 	reftable_free(new_readers);
    - 	reftable_free(reused);
    + 	while (*names) {
4:  16ae8201b1 = 4:  8028e2cf68 reftable/basics: return NULL on zero-sized allocations

---
base-commit: ff795a5c5ed2e2d07c688c217a615d89e3f5733b
change-id: 20241220-b4-pks-reftable-oom-fix-without-readers-c7d8fda0694d

Comments

Junio C Hamano Dec. 22, 2024, 4:31 p.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> Changes in v2:
>
>   - Some small touchups to commit messages.
>   - Explain why it is safe to stop auto-compacting with less than two
>     tables.
>   - Adapt `reftable_stack_reload_once()` so that we only do the minimum
>     changes required to fix issue.
>   - Link to v1: https://lore.kernel.org/r/20241221-b4-pks-reftable-oom-fix-without-readers-v1-0-12db83a3267c@pks.im
>
> Thanks!

Thanks.  All four patches looked good.

Will merge to 'next' and then to 'master'.