Message ID | cover-0.3-00000000000-20210621T151357Z-avarab@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | bundle.c: remove "ref_list" in favor of string-list.c API | expand |
On Mon, Jun 21, 2021 at 05:16:11PM +0200, Ævar Arnfjörð Bjarmason wrote: > This v2 addresses an omission noted by Andrei Rybak[1]. I didn't > factor out the "name" into a variable in 2/3 for ease of reading > 3/3. That's now done. This all looks OK to me. I left a few small comments on the patches themselves. The UNLEAK() thing I suggested for patch 1 does make that patch much smaller and easier to read, but I suspect makes patch 3 harder (i.e., you are reusing the "cleanup" sections there to do the bundle header release. That could _also_ get UNLEAKed, but at some point it becomes more clear to actually clean up after ourselves, and I think patch 3 probably crosses that point). So I'm OK to ignore that. I would prefer to see the "die()" thing I mentioned there addressed, as well as the ternary thing from patch 3. But neither of them is incorrect as-is; it's just a style/preference thing. -Peff
Refactor the bundle API to use the string_list API instead of its own
version of a similar API. See [1] for v2.
Addresses comments by Jeff King about us being too overzelous in
trying not to leak memory (the 'die_no_repo' is gone), and other
flow/style comments of his.
I also added a bundle_header_init() function for use in transport.c,
and noticed a redundant call to string_list_clear() there.
1. https://lore.kernel.org/git/cover-0.3-00000000000-20210621T151357Z-avarab@gmail.com/
Ævar Arnfjörð Bjarmason (3):
bundle cmd: stop leaking memory from parse_options_cmd_bundle()
bundle.c: use a temporary variable for OIDs and names
bundle: remove "ref_list" in favor of string-list.c API
builtin/bundle.c | 74 ++++++++++++++++++++++++++++++------------------
bundle.c | 65 ++++++++++++++++++++++++++----------------
bundle.h | 21 +++++++-------
transport.c | 10 +++++--
4 files changed, 105 insertions(+), 65 deletions(-)
Range-diff against v2:
1: 6a8b20a7cf3 < -: ----------- upload-pack: run is_repository_shallow() before setup_revisions()
2: d88b2c04102 < -: ----------- revision.h: unify "disable_stdin" and "read_from_stdin"
3: d433d7b24a3 < -: ----------- pack-objects.c: do stdin parsing via revision.c's API
4: e59a06c3148 < -: ----------- pack-objects.c: make use of REV_INFO_STDIN_LINE_PROCESS
5: f4191088ac3 ! 1: 3d0d7a8e8b5 bundle cmd: stop leaking memory from parse_options_cmd_bundle()
@@ Commit message
about those fixes if valgrind runs cleanly at the end without any
leaks whatsoever.
+ An earlier version of this change went out of its way to not leak
+ memory on the die() codepaths here, but that was deemed too verbose to
+ worry about in a built-in that's dying anyway. The only reason we'd
+ need that is to appease a mode like SANITIZE=leak within the scope of
+ an entire test file.
+
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
## builtin/bundle.c ##
@@ builtin/bundle.c: static int cmd_bundle_create(int argc, const char **argv, cons
struct strvec pack_opts;
int version = -1;
-
-+ int die_no_repo = 0;
+ int ret;
struct option options[] = {
OPT_SET_INT('q', "quiet", &progress,
@@ builtin/bundle.c: static int cmd_bundle_create(int argc, const char **argv, cons
argc = parse_options_cmd_bundle(argc, argv, prefix,
builtin_bundle_create_usage, options, &bundle_file);
@@ builtin/bundle.c: static int cmd_bundle_create(int argc, const char **argv, const char *prefix) {
- if (progress && all_progress_implied)
- strvec_push(&pack_opts, "--all-progress-implied");
-- if (!startup_info->have_repository)
-+ if (!startup_info->have_repository) {
-+ die_no_repo = 1;
-+ goto cleanup;
-+ }
-+ ret = !!create_bundle(the_repository, bundle_file, argc, argv, &pack_opts, version);
-+cleanup:
-+ free(bundle_file);
-+ if (die_no_repo)
+ if (!startup_info->have_repository)
die(_("Need a repository to create a bundle."));
- return !!create_bundle(the_repository, bundle_file, argc, argv, &pack_opts, version);
++ ret = !!create_bundle(the_repository, bundle_file, argc, argv, &pack_opts, version);
++ free(bundle_file);
+ return ret;
}
@@ builtin/bundle.c: static int cmd_bundle_create(int argc, const char **argv, cons
struct bundle_header header;
int bundle_fd = -1;
-
-+ int die_no_repo = 0;
+ int ret;
struct option options[] = {
OPT_END()
@@ builtin/bundle.c: static int cmd_bundle_create(int argc, const char **argv, cons
memset(&header, 0, sizeof(header));
- if ((bundle_fd = read_bundle_header(bundle_file, &header)) < 0)
- return 1;
-- if (!startup_info->have_repository)
-- die(_("Need a repository to unbundle."));
-- return !!unbundle(the_repository, &header, bundle_fd, 0) ||
+ if ((bundle_fd = read_bundle_header(bundle_file, &header)) < 0) {
+ ret = 1;
+ goto cleanup;
+ }
-+ if (!startup_info->have_repository) {
-+ die_no_repo = 1;
-+ goto cleanup;
-+ }
+ if (!startup_info->have_repository)
+ die(_("Need a repository to unbundle."));
+- return !!unbundle(the_repository, &header, bundle_fd, 0) ||
+ ret = !!unbundle(the_repository, &header, bundle_fd, 0) ||
list_bundle_refs(&header, argc, argv);
+cleanup:
-+ if (die_no_repo)
-+ die(_("Need a repository to unbundle."));
+ free(bundle_file);
+ return ret;
}
6: f297fd0432a ! 2: e47646d3a98 bundle.c: use a temporary variable for OIDs and names
@@ bundle.c: int verify_bundle(struct repository *r,
for (i = 0; i < p->nr; i++) {
struct ref_list_entry *e = p->list + i;
- struct object *o = parse_object(r, &e->oid);
++ const char *name = e->name;
+ struct object_id *oid = &e->oid;
+ struct object *o = parse_object(r, oid);
if (o) {
o->flags |= PREREQ_MARK;
- add_pending_object(&revs, o, e->name);
-@@ bundle.c: int verify_bundle(struct repository *r,
+- add_pending_object(&revs, o, e->name);
++ add_pending_object(&revs, o, name);
+ continue;
}
if (++ret == 1)
error("%s", message);
- error("%s %s", oid_to_hex(&e->oid), e->name);
-+ error("%s %s", oid_to_hex(oid), e->name);
++ error("%s %s", oid_to_hex(oid), name);
}
if (revs.pending.nr != p->nr)
return ret;
@@ bundle.c: int verify_bundle(struct repository *r,
for (i = 0; i < p->nr; i++) {
struct ref_list_entry *e = p->list + i;
- struct object *o = parse_object(r, &e->oid);
++ const char *name = e->name;
+ struct object_id *oid = &e->oid;
+ struct object *o = parse_object(r, oid);
assert(o); /* otherwise we'd have returned early */
@@ bundle.c: int verify_bundle(struct repository *r,
if (++ret == 1)
error("%s", message);
- error("%s %s", oid_to_hex(&e->oid), e->name);
-+ error("%s %s", oid_to_hex(oid), e->name);
++ error("%s %s", oid_to_hex(oid), name);
}
/* Clean up objects used, as they will be reused. */
@@ bundle.c: int verify_bundle(struct repository *r,
## transport.c ##
@@ transport.c: static struct ref *get_refs_from_bundle(struct transport *transport,
+
for (i = 0; i < data->header.references.nr; i++) {
struct ref_list_entry *e = data->header.references.list + i;
- struct ref *ref = alloc_ref(e->name);
+- struct ref *ref = alloc_ref(e->name);
- oidcpy(&ref->old_oid, &e->oid);
++ const char *name = e->name;
++ struct ref *ref = alloc_ref(name);
+ struct object_id *oid = &e->oid;
+ oidcpy(&ref->old_oid, oid);
ref->next = result;
7: 887313d3b02 ! 3: f1066ee1b9a bundle: remove "ref_list" in favor of string-list.c API
@@ builtin/bundle.c: static int cmd_bundle_list_heads(int argc, const char **argv,
- struct bundle_header header;
+ struct bundle_header header = BUNDLE_HEADER_INIT;
int bundle_fd = -1;
- int die_no_repo = 0;
int ret;
+ struct option options[] = {
@@ builtin/bundle.c: static int cmd_bundle_unbundle(int argc, const char **argv, const char *prefix)
builtin_bundle_unbundle_usage, options, &bundle_file);
/* bundle internals use argv[1] as further parameters */
@@ builtin/bundle.c: static int cmd_bundle_unbundle(int argc, const char **argv, co
ret = 1;
goto cleanup;
@@ builtin/bundle.c: static int cmd_bundle_unbundle(int argc, const char **argv, const char *prefix)
- }
+ die(_("Need a repository to unbundle."));
ret = !!unbundle(the_repository, &header, bundle_fd, 0) ||
list_bundle_refs(&header, argc, argv);
+ bundle_header_release(&header);
cleanup:
- if (die_no_repo)
- die(_("Need a repository to unbundle."));
+ free(bundle_file);
+ return ret;
## bundle.c ##
@@ bundle.c: static struct {
@@ bundle.c: static struct {
-static void add_to_ref_list(const struct object_id *oid, const char *name,
- struct ref_list *list)
--{
++void bundle_header_init(struct bundle_header *header)
+ {
- ALLOC_GROW(list->list, list->nr + 1, list->alloc);
- oidcpy(&list->list[list->nr].oid, oid);
- list->list[list->nr].name = xstrdup(name);
- list->nr++;
--}
--
- static int parse_capability(struct bundle_header *header, const char *capability)
- {
- const char *arg;
-@@ bundle.c: static int parse_bundle_header(int fd, struct bundle_header *header,
- /* The bundle header ends with an empty line */
- while (!strbuf_getwholeline_fd(&buf, fd, '\n') &&
- buf.len && buf.buf[0] != '\n') {
-- struct object_id oid;
-+ struct object_id *oid;
- int is_prereq = 0;
- const char *p;
++ memset(header, 0, sizeof(*header));
++ string_list_init(&header->prerequisites, 1);
++ string_list_init(&header->references, 1);
++}
++
++void bundle_header_release(struct bundle_header *header)
++{
++ string_list_clear(&header->prerequisites, 1);
++ string_list_clear(&header->references, 1);
+ }
+ static int parse_capability(struct bundle_header *header, const char *capability)
@@ bundle.c: static int parse_bundle_header(int fd, struct bundle_header *header,
- * Prerequisites have object name that is optionally
- * followed by SP and subject line.
- */
-- if (parse_oid_hex_algop(buf.buf, &oid, &p, header->hash_algo) ||
-+ oid = xmalloc(sizeof(struct object_id));
-+ if (parse_oid_hex_algop(buf.buf, oid, &p, header->hash_algo) ||
- (*p && !isspace(*p)) ||
- (!is_prereq && !*p)) {
- if (report_path)
- error(_("unrecognized header: %s%s (%d)"),
- (is_prereq ? "-" : ""), buf.buf, (int)buf.len);
status = -1;
-+ free(oid);
break;
} else {
-- if (is_prereq)
++ struct object_id *dup = oiddup(&oid);
+ if (is_prereq)
- add_to_ref_list(&oid, "", &header->prerequisites);
-- else
++ string_list_append(&header->prerequisites, "")->util = dup;
+ else
- add_to_ref_list(&oid, p + 1, &header->references);
-+ const char *string = is_prereq ? "" : p + 1;
-+ struct string_list *list = is_prereq
-+ ? &header->prerequisites
-+ : &header->references;
-+ string_list_append(list, string)->util = oid;
++ string_list_append(&header->references, p + 1)->util = dup;
}
}
@@ bundle.c: int verify_bundle(struct repository *r,
repo_init_revisions(r, &revs, NULL);
for (i = 0; i < p->nr; i++) {
- struct ref_list_entry *e = p->list + i;
+- const char *name = e->name;
- struct object_id *oid = &e->oid;
+ struct string_list_item *e = p->items + i;
++ const char *name = e->string;
+ struct object_id *oid = e->util;
struct object *o = parse_object(r, oid);
if (o) {
o->flags |= PREREQ_MARK;
-- add_pending_object(&revs, o, e->name);
-+ add_pending_object(&revs, o, e->string);
- continue;
- }
- if (++ret == 1)
- error("%s", message);
-- error("%s %s", oid_to_hex(oid), e->name);
-+ error("%s %s", oid_to_hex(oid), e->string);
- }
- if (revs.pending.nr != p->nr)
- return ret;
@@ bundle.c: int verify_bundle(struct repository *r,
i--;
for (i = 0; i < p->nr; i++) {
- struct ref_list_entry *e = p->list + i;
+- const char *name = e->name;
- struct object_id *oid = &e->oid;
+ struct string_list_item *e = p->items + i;
++ const char *name = e->string;
+ const struct object_id *oid = e->util;
struct object *o = parse_object(r, oid);
assert(o); /* otherwise we'd have returned early */
if (o->flags & SHOWN)
- continue;
- if (++ret == 1)
- error("%s", message);
-- error("%s %s", oid_to_hex(oid), e->name);
-+ error("%s %s", oid_to_hex(oid), e->string);
- }
+@@ bundle.c: int verify_bundle(struct repository *r,
/* Clean up objects used, as they will be reused. */
for (i = 0; i < p->nr; i++) {
@@ bundle.c: int verify_bundle(struct repository *r,
r = &header->references;
printf_ln(Q_("The bundle contains this ref:",
-@@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
- return error(_("index-pack died"));
- return 0;
- }
-+
-+void bundle_header_release(struct bundle_header *header)
-+{
-+ string_list_clear(&header->prerequisites, 1);
-+ string_list_clear(&header->references, 1);
-+}
## bundle.h ##
@@
@@ bundle.h
+ .prerequisites = STRING_LIST_INIT_DUP, \
+ .references = STRING_LIST_INIT_DUP, \
+}
++void bundle_header_init(struct bundle_header *header);
++void bundle_header_release(struct bundle_header *header);
+
int is_bundle(const char *path, int quiet);
int read_bundle_header(const char *path, struct bundle_header *header);
int create_bundle(struct repository *r, const char *path,
-@@ bundle.h: int unbundle(struct repository *r, struct bundle_header *header,
- int bundle_fd, int flags);
- int list_bundle_refs(struct bundle_header *header,
- int argc, const char **argv);
-+void bundle_header_release(struct bundle_header *header);
-
- #endif
## transport.c ##
@@ transport.c: static struct ref *get_refs_from_bundle(struct transport *transport,
@@ transport.c: static struct ref *get_refs_from_bundle(struct transport *transport
for (i = 0; i < data->header.references.nr; i++) {
- struct ref_list_entry *e = data->header.references.list + i;
-- struct ref *ref = alloc_ref(e->name);
-- struct object_id *oid = &e->oid;
+- const char *name = e->name;
+ struct string_list_item *e = data->header.references.items + i;
-+ struct ref *ref = alloc_ref(e->string);
-+ const struct object_id *oid = e->util;
++ const char *name = e->string;
+ struct ref *ref = alloc_ref(name);
+- struct object_id *oid = &e->oid;
++ struct object_id *oid = e->util;
oidcpy(&ref->old_oid, oid);
ref->next = result;
result = ref;
- }
-+ string_list_clear(&data->header.references, 1);
- return result;
- }
-
@@ transport.c: static int close_bundle(struct transport *transport)
struct bundle_transport_data *data = transport->data;
if (data->fd > 0)
@@ transport.c: struct transport *transport_get(struct remote *remote, const char *
die(_("git-over-rsync is no longer supported"));
} else if (url_is_local_not_ssh(url) && is_file(url) && is_bundle(url, 1)) {
struct bundle_transport_data *data = xcalloc(1, sizeof(*data));
-+ string_list_init(&data->header.prerequisites, 1);
-+ string_list_init(&data->header.references, 1);
++ bundle_header_init(&data->header);
transport_check_allowed("file");
ret->data = data;
ret->vtable = &bundle_vtable;
On Wed, Jun 30, 2021 at 04:06:13PM +0200, Ævar Arnfjörð Bjarmason wrote: > Refactor the bundle API to use the string_list API instead of its own > version of a similar API. See [1] for v2. > > Addresses comments by Jeff King about us being too overzelous in > trying not to leak memory (the 'die_no_repo' is gone), and other > flow/style comments of his. > > I also added a bundle_header_init() function for use in transport.c, > and noticed a redundant call to string_list_clear() there. Thanks, all three look good to me. As an aside, having to have a separate bundle_header_init() and BUNDLE_HEADER_INIT is annoying (because they both must be kept up to date with each other), but quite common in our code base. I wonder if writing: void bundle_header_init(struct bundle_header *header) { struct bundle_header blank = BUNDLE_HEADER_INIT; memcpy(header, &blank, sizeof(*header)); } would let a smart enough compiler just init "header" in place without the extra copy (the performance of a single bundle_header almost certainly doesn't matter, but it might for other types). Just musing. ;) -Peff
On Wed, Jun 30, 2021 at 01:34:07PM -0400, Jeff King wrote: > As an aside, having to have a separate bundle_header_init() and > BUNDLE_HEADER_INIT is annoying (because they both must be kept up to > date with each other), but quite common in our code base. I wonder if > writing: > > void bundle_header_init(struct bundle_header *header) > { > struct bundle_header blank = BUNDLE_HEADER_INIT; > memcpy(header, &blank, sizeof(*header)); > } > > would let a smart enough compiler just init "header" in place without > the extra copy (the performance of a single bundle_header almost > certainly doesn't matter, but it might for other types). > > Just musing. ;) For my own curiosity, the answer is yes: https://godbolt.org/z/s54dc6ss9 With "gcc -O2" the memcpy goes away and we init "header" directly. If we want to start using this technique widely, I don't think it should be part of your series, though. This probably applies to quite a few data structures, so it would make more sense to have a series which converts several. -Peff
On Wed, Jun 30 2021, Jeff King wrote: > On Wed, Jun 30, 2021 at 01:34:07PM -0400, Jeff King wrote: > >> As an aside, having to have a separate bundle_header_init() and >> BUNDLE_HEADER_INIT is annoying (because they both must be kept up to >> date with each other), but quite common in our code base. I wonder if >> writing: >> >> void bundle_header_init(struct bundle_header *header) >> { >> struct bundle_header blank = BUNDLE_HEADER_INIT; >> memcpy(header, &blank, sizeof(*header)); >> } >> >> would let a smart enough compiler just init "header" in place without >> the extra copy (the performance of a single bundle_header almost >> certainly doesn't matter, but it might for other types). >> >> Just musing. ;) > > For my own curiosity, the answer is yes: https://godbolt.org/z/s54dc6ss9 > > With "gcc -O2" the memcpy goes away and we init "header" directly. > > If we want to start using this technique widely, I don't think it should > be part of your series, though. This probably applies to quite a few > data structures, so it would make more sense to have a series which > converts several. That's cool, yeah that would make quite a lot of code better. Thanks!
On Wed, Jun 30 2021, Ævar Arnfjörð Bjarmason wrote: > On Wed, Jun 30 2021, Jeff King wrote: > >> On Wed, Jun 30, 2021 at 01:34:07PM -0400, Jeff King wrote: >> >>> As an aside, having to have a separate bundle_header_init() and >>> BUNDLE_HEADER_INIT is annoying (because they both must be kept up to >>> date with each other), but quite common in our code base. I wonder if >>> writing: >>> >>> void bundle_header_init(struct bundle_header *header) >>> { >>> struct bundle_header blank = BUNDLE_HEADER_INIT; >>> memcpy(header, &blank, sizeof(*header)); >>> } >>> >>> would let a smart enough compiler just init "header" in place without >>> the extra copy (the performance of a single bundle_header almost >>> certainly doesn't matter, but it might for other types). >>> >>> Just musing. ;) >> >> For my own curiosity, the answer is yes: https://godbolt.org/z/s54dc6ss9 >> >> With "gcc -O2" the memcpy goes away and we init "header" directly. >> >> If we want to start using this technique widely, I don't think it should >> be part of your series, though. This probably applies to quite a few >> data structures, so it would make more sense to have a series which >> converts several. > > That's cool, yeah that would make quite a lot of code better. Thanks! Just for future reference: I submitted a small series to make use of this suggested idiom: https://lore.kernel.org/git/cover-0.5-00000000000-20210701T104855Z-avarab@gmail.com/