diff mbox series

[v1,3/4] builtin/repack.c: change xwrite to write_in_full to allow large sizes.

Message ID 20240226220539.3494-4-randall.becker@nexbridge.ca (mailing list archive)
State Superseded
Headers show
Series Change xwrite() to write_in_full() in builtins. | expand

Commit Message

Randall S. Becker Feb. 26, 2024, 10:05 p.m. UTC
From: "Randall S. Becker" <rsbecker@nexbridge.com>

This change is required because some platforms do not support file writes of
arbitrary sizes (e.g, NonStop). xwrite ends up truncating the output to the
maximum single I/O size possible for the destination device. The result of
write_in_full() is also passed to the caller, which was previously ignored.

Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
---
 builtin/repack.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Junio C Hamano Feb. 26, 2024, 11:54 p.m. UTC | #1
"Randall S. Becker" <the.n.e.key@gmail.com> writes:

> From: "Randall S. Becker" <rsbecker@nexbridge.com>
>
> This change is required because some platforms do not support file writes of
> arbitrary sizes (e.g, NonStop). xwrite ends up truncating the output to the
> maximum single I/O size possible for the destination device. The result of
> write_in_full() is also passed to the caller, which was previously ignored.

This one smells more like a theoretical issue than realistic, in
that these writes are done only with .hexsz (either 40 or 64) bytes
oid string, or a single byte "\n", for either of which it is hard to
imagine that it is even remotely close to platform "maximum single
I/O size".

But we'd need to look for the error return anyway, so switching to
write_in_full() while we are doing so is also good.

> Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
> ---
>  builtin/repack.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/repack.c b/builtin/repack.c
> index ede36328a3..932d24c60b 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -307,6 +307,7 @@ static int write_oid(const struct object_id *oid,
>  		     struct packed_git *pack UNUSED,
>  		     uint32_t pos UNUSED, void *data)
>  {
> +	int err;
>  	struct child_process *cmd = data;
>  
>  	if (cmd->in == -1) {
> @@ -314,8 +315,12 @@ static int write_oid(const struct object_id *oid,
>  			die(_("could not start pack-objects to repack promisor objects"));
>  	}
>  
> -	xwrite(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz);
> -	xwrite(cmd->in, "\n", 1);
> +	err = write_in_full(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz);
> +	if (err <= 0)
> +		return err;
> +	err = write_in_full(cmd->in, "\n", 1);
> +	if (err <= 0)
> +		return err;
>  	return 0;
>  }
Jeff King Feb. 27, 2024, 8:20 a.m. UTC | #2
On Mon, Feb 26, 2024 at 05:05:37PM -0500, Randall S. Becker wrote:

> From: "Randall S. Becker" <rsbecker@nexbridge.com>
> 
> This change is required because some platforms do not support file writes of
> arbitrary sizes (e.g, NonStop). xwrite ends up truncating the output to the
> maximum single I/O size possible for the destination device. The result of
> write_in_full() is also passed to the caller, which was previously ignored.

These are going to be tiny compared to single-write() I/O limits, I'd
think, but in general we should be on guard for the OS returning short
reads (this is a pipe and so for most systems PIPE_BUF would guarantee
atomicity, I think, but IMHO it is simpler to just make things
obviously-correct by looping with write_in_full). So I'd be surprised if
this spot was the cause of a visible bug, but I think it's worth
changing regardless.

The error detection is a separate question, though. I think it is good
to check the result of the write here, as an error here means that the
child pack-objects misses some objects we wanted it to pack, which could
lead to a corrupt repository. But I don't think what you have here is
quite enough:

> @@ -314,8 +315,12 @@ static int write_oid(const struct object_id *oid,
>  			die(_("could not start pack-objects to repack promisor objects"));
>  	}
>  
> -	xwrite(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz);
> -	xwrite(cmd->in, "\n", 1);
> +	err = write_in_full(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz);
> +	if (err <= 0)
> +		return err;
> +	err = write_in_full(cmd->in, "\n", 1);
> +	if (err <= 0)
> +		return err;
>  	return 0;

OK, so we detect the error and return it to the caller. Who is the
caller? The only use of this function is in repack_promisor_objects(),
which calls:

        for_each_packed_object(write_oid, &cmd,
                               FOR_EACH_OBJECT_PROMISOR_ONLY);

So when we return the error, now for_each_packed_object() will stop
traversing, and propagate that error up to the caller. But as we can see
above, the caller ignores it!

So I think you'd either want to die directly (perhaps using
write_or_die). Or you'd need to additionally check the return from
for_each_packed_object(). That would also catch cases where that
function failed to open a pack (I'm not sure how important that is to
this code).

But as it is, your patch just causes a write error to truncate the list
of oids send to the child process (though that is probably not
materially different from the current behavior, as the subsequent calls
would presumably fail, too).

-Peff
Jeff King Feb. 27, 2024, 8:22 a.m. UTC | #3
On Tue, Feb 27, 2024 at 03:20:27AM -0500, Jeff King wrote:

> OK, so we detect the error and return it to the caller. Who is the
> caller? The only use of this function is in repack_promisor_objects(),
> which calls:
> 
>         for_each_packed_object(write_oid, &cmd,
>                                FOR_EACH_OBJECT_PROMISOR_ONLY);
> 
> So when we return the error, now for_each_packed_object() will stop
> traversing, and propagate that error up to the caller. But as we can see
> above, the caller ignores it!

Oh, one other thing I meant to mention: as the test failure you saw was
related to repacking, this seemed like a likely culprit. But the code is
only triggered when repacking promisor objects in a partial clone, and
it didn't look like the test you posted covered that (it was just about
cruft packs). So I would not expect this code to be run at all in the
failing test you saw.

-Peff
diff mbox series

Patch

diff --git a/builtin/repack.c b/builtin/repack.c
index ede36328a3..932d24c60b 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -307,6 +307,7 @@  static int write_oid(const struct object_id *oid,
 		     struct packed_git *pack UNUSED,
 		     uint32_t pos UNUSED, void *data)
 {
+	int err;
 	struct child_process *cmd = data;
 
 	if (cmd->in == -1) {
@@ -314,8 +315,12 @@  static int write_oid(const struct object_id *oid,
 			die(_("could not start pack-objects to repack promisor objects"));
 	}
 
-	xwrite(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz);
-	xwrite(cmd->in, "\n", 1);
+	err = write_in_full(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz);
+	if (err <= 0)
+		return err;
+	err = write_in_full(cmd->in, "\n", 1);
+	if (err <= 0)
+		return err;
 	return 0;
 }