diff mbox series

[1/3] packfile: factor out --pack_header argument parsing

Message ID 20250117125459.GA2893666@coredump.intra.peff.net (mailing list archive)
State Accepted
Commit 798e0f451661f81f4568dce4033cf1c9076f224f

Commit Message

Jeff King Jan. 17, 2025, 12:54 p.m. UTC
Both index-pack and unpack-objects accept a --pack_header argument. This
is an undocumented internal argument used by receive-pack and fetch to
pass along information about the header of the pack, which they've
already read from the incoming stream.

In preparation for a bugfix, let's factor the duplicated code into a
common helper.

The callers are still responsible for identifying the option. While this
could likewise be factored out, it is more flexible this way (e.g., if
they ever started using parse-options and wanted to handle both the
stuck and unstuck forms).

Likewise, the callers are responsible for reporting errors, though they
both just call die(). I've tweaked unpack-objects to match index-pack in
marking the error for translation.

Signed-off-by: Jeff King <peff@peff.net>
---
The generating side of this option is duplicated, too, between
receive-pack and fetch-pack. But it's so much simpler that I didn't
think it was worth factoring out. We could always do it later.

 builtin/index-pack.c     | 14 +++-----------
 builtin/unpack-objects.c | 16 ++++------------
 packfile.c               | 17 +++++++++++++++++
 packfile.h               |  6 ++++++
 4 files changed, 30 insertions(+), 23 deletions(-)

Comments

Junio C Hamano Jan. 17, 2025, 10:45 p.m. UTC | #1
Jeff King <peff@peff.net> writes:

>  			} else if (starts_with(arg, "--pack_header=")) {
> -				struct pack_header *hdr;
> -				char *c;
> -
> -				hdr = (struct pack_header *)input_buffer;
> -				hdr->hdr_signature = htonl(PACK_SIGNATURE);
> -				hdr->hdr_version = htonl(strtoul(arg + 14, &c, 10));

Interesting.  So the file-scope static input_buffer[] sits in the
BSS and happens to be well aligned not to cause the problem, but ...

> @@ -645,18 +646,9 @@ int cmd_unpack_objects(int argc,
>  				continue;
>  			}
>  			if (starts_with(arg, "--pack_header=")) {
> -				struct pack_header *hdr;
> -				char *c;
> -
> -				hdr = (struct pack_header *)buffer;
> -				hdr->hdr_signature = htonl(PACK_SIGNATURE);
> -				hdr->hdr_version = htonl(strtoul(arg + 14, &c, 10));

... the same file-scope static buffer[] that also sits in the BSS
was not well aligned by chance?

Otherwise these should be identical code.  Very interesting.

And of course the fix in the [2/3] is absolutely the right thing to
do.

Thanks.
Jeff King Jan. 18, 2025, 9:23 a.m. UTC | #2
On Fri, Jan 17, 2025 at 02:45:04PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >  			} else if (starts_with(arg, "--pack_header=")) {
> > -				struct pack_header *hdr;
> > -				char *c;
> > -
> > -				hdr = (struct pack_header *)input_buffer;
> > -				hdr->hdr_signature = htonl(PACK_SIGNATURE);
> > -				hdr->hdr_version = htonl(strtoul(arg + 14, &c, 10));
> 
> Interesting.  So the file-scope static input_buffer[] sits in the
> BSS and happens to be well aligned not to cause the problem, but ...

I suspect it _is_ a problem, but either:

  - The OP's test case was small enough to trigger unpack-objects, not
    index-pack. Possibly:

      git index-pack --stdin --pack_header=2,2 <no-header.pack

    would fail for them.

  - We simply got lucky with alignment based on the other things in BSS,
    the whim of the compiler, etc. But it is an accident waiting to
    happen.

-Peff
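
To make the alignment concern concrete: one common way to sidestep it is to fill in a properly aligned local struct and memcpy() the bytes into the destination, so the byte buffer can sit at any address. The sketch below is only an illustration under that assumption; it is not claimed to be what [2/3] does. The function name parse_pack_header_unaligned() is hypothetical, and struct pack_header and PACK_SIGNATURE are local stand-ins for git's definitions so that the snippet compiles on its own.

```c
#include <arpa/inet.h>  /* htonl */
#include <stdint.h>     /* uint32_t */
#include <stdlib.h>     /* strtoul */
#include <string.h>     /* memcpy */

/* Stand-ins for git's definitions, so the sketch compiles on its own. */
#define PACK_SIGNATURE 0x5041434b /* "PACK" */
struct pack_header {
	uint32_t hdr_signature;
	uint32_t hdr_version;
	uint32_t hdr_entries;
};

/*
 * Hypothetical variant of the helper: parse "<version>,<entries>" into
 * a local (and therefore aligned) struct, then copy the bytes out.
 * "out" may have any alignment. Returns 0 on success, -1 on bad input.
 */
int parse_pack_header_unaligned(const char *in, unsigned char *out,
				unsigned int *len)
{
	struct pack_header hdr;
	char *c;

	hdr.hdr_signature = htonl(PACK_SIGNATURE);
	hdr.hdr_version = htonl(strtoul(in, &c, 10));
	if (*c != ',')
		return -1;
	hdr.hdr_entries = htonl(strtoul(c + 1, &c, 10));
	if (*c)
		return -1;
	memcpy(out, &hdr, sizeof(hdr));
	*len = sizeof(hdr);
	return 0;
}
```

With this shape the callers would not need to change, since the signature matches the helper in this patch; only the store into the buffer differs.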
Koakuma Jan. 18, 2025, 4:57 p.m. UTC | #3
Jeff King <peff@peff.net> wrote:
> Junio C Hamano wrote:
> > Jeff King <peff@peff.net> writes:
> > Interesting. So the file-scope static input_buffer[] sits in the
> > BSS and happens to be well aligned not to cause the problem, but ...
> 
> I suspect it _is_ a problem, but either:
> 
>   - The OP's test case was small enough to trigger unpack-objects, not
>     index-pack. Possibly:
> 
>       git index-pack --stdin --pack_header=2,2 <no-header.pack
> 
>     would fail for them.
> 
>   - We simply got lucky with alignment based on the other things in BSS,
>     the whim of the compiler, etc. But it is an accident waiting to
>     happen.
> 
> -Peff

That particular command doesn't seem to crash for me right now, but, as you
said, it is probably just a lucky coincidence that the compiler happens to
place the buffer in aligned memory.
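
Since the absence of a crash does not by itself show that the buffer is aligned, a temporary check along these lines could confirm whether a given build got lucky. This is a rough sketch, not something from the series: is_aligned_for_pack_header() is a hypothetical name, struct pack_header is a local stand-in for git's definition, and it assumes a C11 compiler for alignof.

```c
#include <stdalign.h>   /* alignof (C11) */
#include <stdint.h>     /* uintptr_t, uint32_t */

/* Stand-in for git's struct, so the sketch compiles on its own. */
struct pack_header {
	uint32_t hdr_signature;
	uint32_t hdr_version;
	uint32_t hdr_entries;
};

/*
 * Hypothetical debugging aid: returns 1 if p is suitably aligned to be
 * accessed through a "struct pack_header *", and 0 otherwise.
 */
int is_aligned_for_pack_header(const void *p)
{
	return ((uintptr_t)p % alignof(struct pack_header)) == 0;
}
```

Calling it on input_buffer (index-pack) or buffer (unpack-objects) just before the cast would report whether that particular build placed the array on a suitable boundary.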

Patch

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 0b62b2589f..75b84f78f4 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1955,18 +1955,10 @@  int cmd_index_pack(int argc,
 					nr_threads = 1;
 				}
 			} else if (starts_with(arg, "--pack_header=")) {
-				struct pack_header *hdr;
-				char *c;
-
-				hdr = (struct pack_header *)input_buffer;
-				hdr->hdr_signature = htonl(PACK_SIGNATURE);
-				hdr->hdr_version = htonl(strtoul(arg + 14, &c, 10));
-				if (*c != ',')
-					die(_("bad %s"), arg);
-				hdr->hdr_entries = htonl(strtoul(c + 1, &c, 10));
-				if (*c)
+				if (parse_pack_header_option(arg + 14,
+							     input_buffer,
+							     &input_len) < 0)
 					die(_("bad %s"), arg);
-				input_len = sizeof(*hdr);
 			} else if (!strcmp(arg, "-v")) {
 				verbose = 1;
 			} else if (!strcmp(arg, "--progress-title")) {
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 2197d6d933..cf2bc5c531 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -18,6 +18,7 @@ 
 #include "progress.h"
 #include "decorate.h"
 #include "fsck.h"
+#include "packfile.h"
 
 static int dry_run, quiet, recover, has_errors, strict;
 static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict]";
@@ -645,18 +646,9 @@  int cmd_unpack_objects(int argc,
 				continue;
 			}
 			if (starts_with(arg, "--pack_header=")) {
-				struct pack_header *hdr;
-				char *c;
-
-				hdr = (struct pack_header *)buffer;
-				hdr->hdr_signature = htonl(PACK_SIGNATURE);
-				hdr->hdr_version = htonl(strtoul(arg + 14, &c, 10));
-				if (*c != ',')
-					die("bad %s", arg);
-				hdr->hdr_entries = htonl(strtoul(c + 1, &c, 10));
-				if (*c)
-					die("bad %s", arg);
-				len = sizeof(*hdr);
+				if (parse_pack_header_option(arg + 14,
+							     buffer, &len) < 0)
+					die(_("bad %s"), arg);
 				continue;
 			}
 			if (skip_prefix(arg, "--max-input-size=", &arg)) {
diff --git a/packfile.c b/packfile.c
index cc7ab6403a..2bf9e57330 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2315,3 +2315,20 @@  int is_promisor_object(struct repository *r, const struct object_id *oid)
 	}
 	return oidset_contains(&promisor_objects, oid);
 }
+
+int parse_pack_header_option(const char *in, unsigned char *out, unsigned int *len)
+{
+	struct pack_header *hdr;
+	char *c;
+
+	hdr = (struct pack_header *)out;
+	hdr->hdr_signature = htonl(PACK_SIGNATURE);
+	hdr->hdr_version = htonl(strtoul(in, &c, 10));
+	if (*c != ',')
+		return -1;
+	hdr->hdr_entries = htonl(strtoul(c + 1, &c, 10));
+	if (*c)
+		return -1;
+	*len = sizeof(*hdr);
+	return 0;
+}
diff --git a/packfile.h b/packfile.h
index 58104fa009..00ada7a938 100644
--- a/packfile.h
+++ b/packfile.h
@@ -216,4 +216,10 @@  int is_promisor_object(struct repository *r, const struct object_id *oid);
 int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
 	     size_t idx_size, struct packed_git *p);
 
+/*
+ * Parse a --pack_header option as accepted by index-pack and unpack-objects,
+ * turning it into the matching bytes we'd find in a pack.
+ */
+int parse_pack_header_option(const char *in, unsigned char *out, unsigned int *len);
+
 #endif
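
Both callers in the patch skip the option name with "arg + 14", where 14 is the length of "--pack_header=". As a small illustration of keeping that offset tied to the prefix string, here is a standalone sketch. The helper pack_header_option_value() is hypothetical and not part of the patch; in-tree, git's own skip_prefix() (visible in the unpack-objects context lines above) covers the same need.

```c
#include <stddef.h>  /* NULL, size_t */
#include <string.h>  /* strncmp */

/*
 * Hypothetical helper, not part of the patch: return a pointer to the
 * text after "--pack_header=", or NULL if arg is some other option.
 */
const char *pack_header_option_value(const char *arg)
{
	static const char prefix[] = "--pack_header=";
	size_t n = sizeof(prefix) - 1; /* 14, the offset used in the patch */

	if (strncmp(arg, prefix, n))
		return NULL;
	return arg + n;
}
```

A caller would then pass the returned pointer, when it is not NULL, as the "in" argument of parse_pack_header_option() instead of computing "arg + 14" by hand.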