diff mbox series

[2/2] fetch-pack: refactor writing promisor file

Message ID 20210112082159.2277214-2-chriscool@tuxfamily.org (mailing list archive)
State Accepted
Commit 33add2ad7d6921489aa0cafd4a865504c3709512
Headers show
Series [1/2] fetch-pack: rename helper to create_promisor_file() | expand

Commit Message

Christian Couder Jan. 12, 2021, 8:21 a.m. UTC
Let's replace the 2 different pieces of code that write a
promisor file in 'builtin/repack.c' and 'fetch-pack.c'
with a new function called 'write_promisor_file()' in
'pack-write.c' and 'pack.h'.

This might also help us in the future, if we want to put
back the ref names and associated hashes that were in
the promisor files we are repacking in 'builtin/repack.c'
as suggested by a NEEDSWORK comment just above the code
we are refactoring.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/repack.c |  8 +++-----
 fetch-pack.c     |  8 +-------
 pack-write.c     | 12 ++++++++++++
 pack.h           |  4 ++++
 4 files changed, 20 insertions(+), 12 deletions(-)

Comments

Taylor Blau Jan. 12, 2021, 7:49 p.m. UTC | #1
On Tue, Jan 12, 2021 at 09:21:59AM +0100, Christian Couder wrote:
> Let's replace the 2 different pieces of code that write a
> promisor file in 'builtin/repack.c' and 'fetch-pack.c'
> with a new function called 'write_promisor_file()' in
> 'pack-write.c' and 'pack.h'.

This is just great. Even though I'm sure that I have read that code in
builtin/repack.c a hundred times before, I for some reason always have
to remind myself what it is doing.

So, I'm very happy to see that it is being cleaned up behind a function
whose name describes its behavior clearly. And, if that function can be
used to remove some duplication in a sensible fashion along the way,
then all the better.

  Reviewed-by: Taylor Blau <me@ttaylorr.com>

Thanks,
Taylor
Junio C Hamano Jan. 12, 2021, 8:22 p.m. UTC | #2
Christian Couder <christian.couder@gmail.com> writes:

> Let's replace the 2 different pieces of code that write a
> promisor file in 'builtin/repack.c' and 'fetch-pack.c'
> with a new function called 'write_promisor_file()' in
> 'pack-write.c' and 'pack.h'.
>
> This might also help us in the future, if we want to put
> back the ref names and associated hashes that were in
> the promisor files we are repacking in 'builtin/repack.c'
> as suggested by a NEEDSWORK comment just above the code
> we are refactoring.

As soon as you say "might", my reading goes "Meh", but the real
issue/question I have about this is 

> +void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought)
> +{
> +	int i;
> +	FILE *output = xfopen(promisor_name, "w");
> +
> +	for (i = 0; i < nr_sought; i++)
> +		fprintf(output, "%s %s\n", oid_to_hex(&sought[i]->old_oid),
> +			sought[i]->name);
> +	fclose(output);
> +}

If this function is so useful to be factored out, it must have
potential use cases where callers would want to write out the
promisor file [*1*].  Is it reasonable to assume that all of these
callers have an array of refs, or even _know_ about what "ref" is?
There still is just one "real" caller to this helper after this
patch, so it is too early to tell.

Seeing that the declaration (below) is made in <pack.h>, I think it
is fair to assume all the callers would know what "struct oid" is,
but I do not get the feeling that "an array of ='struct ref' pointers"
is something we can expect callers to have commonly.  And expecting
and/or requiring the potential callers to have its data in an unusual
shape would be a barrier for the helper's adoption.

Let's not do this change (yet) before we see a new potential caller
or two and know what kind of API they want this helper to have.
Without knowing them, my gut reaction is that it would be more
widely usable if it took an array of "struct object_id" pointers,
but if we make this function to take "struct object_id **sought"
plus "int nr_sought", it would mean that the only real caller that
currently exists needs to prepare a separate array out of the array
of "struct ref" poihtners it has.  That is way too premature
generalization.


> diff --git a/pack.h b/pack.h
> index 9fc0945ac9..9ae640f417 100644
> --- a/pack.h
> +++ b/pack.h
> @@ -87,6 +87,10 @@ off_t write_pack_header(struct hashfile *f, uint32_t);
>  void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t);
>  char *index_pack_lockfile(int fd);
>  
> +struct ref;
> +
> +void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought);
> +
>  /*
>   * The "hdr" output buffer should be at least this big, which will handle sizes
>   * up to 2^67.


[Footnote]

*1* The "we just make sure the file exists by calling this function
with no information about any objects" case I do not count as an
interesting caller---it could just have been done with a simple
"touch".
Junio C Hamano Jan. 12, 2021, 8:59 p.m. UTC | #3
Christian Couder <christian.couder@gmail.com> writes:

> +void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought)
> +{
> +	int i;
> +	FILE *output = xfopen(promisor_name, "w");
> +
> +	for (i = 0; i < nr_sought; i++)
> +		fprintf(output, "%s %s\n", oid_to_hex(&sought[i]->old_oid),
> +			sought[i]->name);
> +	fclose(output);
> +}

Ah, nevermind.  This has its output format baked into the code and
it wants to show object name and which ref it is on, so as long as
we are happy with the format (do we read the contents and use it in
any way, by the way?), it is reasonable to expect any new caller to
have "struct ref" or an array of it.

Thanks.
Jeff King Jan. 13, 2021, 1:25 p.m. UTC | #4
On Tue, Jan 12, 2021 at 09:21:59AM +0100, Christian Couder wrote:

> Let's replace the 2 different pieces of code that write a
> promisor file in 'builtin/repack.c' and 'fetch-pack.c'
> with a new function called 'write_promisor_file()' in
> 'pack-write.c' and 'pack.h'.
> 
> This might also help us in the future, if we want to put
> back the ref names and associated hashes that were in
> the promisor files we are repacking in 'builtin/repack.c'
> as suggested by a NEEDSWORK comment just above the code
> we are refactoring.

I think the interface for the callers is much nicer. Not a new problem,
but one thing I did notice in the implementation:

> +void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought)
> +{
> +	int i;
> +	FILE *output = xfopen(promisor_name, "w");
> +
> +	for (i = 0; i < nr_sought; i++)
> +		fprintf(output, "%s %s\n", oid_to_hex(&sought[i]->old_oid),
> +			sought[i]->name);
> +	fclose(output);
> +}

We check errors on open via xfopen(), which is good. But we would not
notice any problems writing via fprintf or fclose. Is it worth doing
something like:

  err = ferror(output);
  err |= fclose(output);
  return err ? -1 : 0;

?

(As an aside, this ferror/fclose dance is awkward enough and has caused
us enough questions in the past that I wonder if it is worth
encapsulating into a wrapper).

The existing callers behave the same way (checking open, but not the
writes), so definitely not a regression. But the helper function may
provide an opportunity to make things more robust without adding a lot
of duplicated code.

-Peff
Taylor Blau Jan. 13, 2021, 3:20 p.m. UTC | #5
On Wed, Jan 13, 2021 at 08:25:40AM -0500, Jeff King wrote:
> > +void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought)
> > +{
> > +	int i;
> > +	FILE *output = xfopen(promisor_name, "w");
> > +
> > +	for (i = 0; i < nr_sought; i++)
> > +		fprintf(output, "%s %s\n", oid_to_hex(&sought[i]->old_oid),
> > +			sought[i]->name);
> > +	fclose(output);
> > +}
>
> We check errors on open via xfopen(), which is good. But we would not
> notice any problems writing via fprintf or fclose. Is it worth doing
> something like:
>
>   err = ferror(output);
>   err |= fclose(output);
>   return err ? -1 : 0;
>
> ?

I agree below that *not* doing this isn't a regression against the
current code, since it doesn't check either, but this could be done
relatively easily. It is appropriate for both callers of
write_promisor_file() to immediately die() if they get an error, so I
think that this is potentially worth doing.

> (As an aside, this ferror/fclose dance is awkward enough and has caused
> us enough questions in the past that I wonder if it is worth
> encapsulating into a wrapper).

From a quick grep through uses of ferror, there are a reasonable handful
of spots that I think could be improved if there was a ferror+fclose
helper, perhaps: xfclose().

> The existing callers behave the same way (checking open, but not the
> writes), so definitely not a regression. But the helper function may
> provide an opportunity to make things more robust without adding a lot
> of duplicated code.

Yep.

> -Peff

Thanks,
Taylor
diff mbox series

Patch

diff --git a/builtin/repack.c b/builtin/repack.c
index 279be11a16..2158b48f4c 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -14,6 +14,7 @@ 
 #include "object-store.h"
 #include "promisor-remote.h"
 #include "shallow.h"
+#include "pack.h"
 
 static int delta_base_offset = 1;
 static int pack_kept_objects = -1;
@@ -263,7 +264,7 @@  static void repack_promisor_objects(const struct pack_objects_args *args,
 	while (strbuf_getline_lf(&line, out) != EOF) {
 		struct string_list_item *item;
 		char *promisor_name;
-		int fd;
+
 		if (line.len != the_hash_algo->hexsz)
 			die(_("repack: Expecting full hex object ID lines only from pack-objects."));
 		item = string_list_append(names, line.buf);
@@ -281,10 +282,7 @@  static void repack_promisor_objects(const struct pack_objects_args *args,
 		 */
 		promisor_name = mkpathdup("%s-%s.promisor", packtmp,
 					  line.buf);
-		fd = open(promisor_name, O_CREAT|O_EXCL|O_WRONLY, 0600);
-		if (fd < 0)
-			die_errno(_("unable to create '%s'"), promisor_name);
-		close(fd);
+		write_promisor_file(promisor_name, NULL, 0);
 
 		item->util = (void *)(uintptr_t)populate_pack_exts(item->string);
 
diff --git a/fetch-pack.c b/fetch-pack.c
index c5fa4992a6..1eaedcb5dc 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -777,8 +777,6 @@  static void create_promisor_file(const char *keep_name,
 {
 	struct strbuf promisor_name = STRBUF_INIT;
 	int suffix_stripped;
-	FILE *output;
-	int i;
 
 	strbuf_addstr(&promisor_name, keep_name);
 	suffix_stripped = strbuf_strip_suffix(&promisor_name, ".keep");
@@ -787,11 +785,7 @@  static void create_promisor_file(const char *keep_name,
 		    keep_name);
 	strbuf_addstr(&promisor_name, ".promisor");
 
-	output = xfopen(promisor_name.buf, "w");
-	for (i = 0; i < nr_sought; i++)
-		fprintf(output, "%s %s\n", oid_to_hex(&sought[i]->old_oid),
-			sought[i]->name);
-	fclose(output);
+	write_promisor_file(promisor_name.buf, sought, nr_sought);
 
 	strbuf_release(&promisor_name);
 }
diff --git a/pack-write.c b/pack-write.c
index 3513665e1e..db3ff9980f 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -1,6 +1,7 @@ 
 #include "cache.h"
 #include "pack.h"
 #include "csum-file.h"
+#include "remote.h"
 
 void reset_pack_idx_option(struct pack_idx_option *opts)
 {
@@ -367,3 +368,14 @@  void finish_tmp_packfile(struct strbuf *name_buffer,
 
 	free((void *)idx_tmp_name);
 }
+
+void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought)
+{
+	int i;
+	FILE *output = xfopen(promisor_name, "w");
+
+	for (i = 0; i < nr_sought; i++)
+		fprintf(output, "%s %s\n", oid_to_hex(&sought[i]->old_oid),
+			sought[i]->name);
+	fclose(output);
+}
diff --git a/pack.h b/pack.h
index 9fc0945ac9..9ae640f417 100644
--- a/pack.h
+++ b/pack.h
@@ -87,6 +87,10 @@  off_t write_pack_header(struct hashfile *f, uint32_t);
 void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t);
 char *index_pack_lockfile(int fd);
 
+struct ref;
+
+void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought);
+
 /*
  * The "hdr" output buffer should be at least this big, which will handle sizes
  * up to 2^67.