Message ID | Y7l4LsEQcDT9HZ21@coredump.intra.peff.net (mailing list archive) |
---|---|
Series | cleaning up read_object() family of functions |
On 1/7/2023 8:48 AM, Jeff King wrote:
> I often get confused about the difference between:
>
>   - read_object()
>   - read_object_file()
>   - read_object_file_extended()
>   - repo_read_object_file()
>
> Since Jonathan's recent cleanups from 9e59b38c88 (object-file: emit
> corruption errors when detected, 2022-12-14), these are mostly thin
> wrappers around each other and around oid_object_info_extended().
>
> This series shuffles things around a little more so that we are down to
> just read_object_file() and repo_read_object_file(). And the
> relationship there is pretty easy (and long-term we'd eventually merge
> them once everyone has a repository object).

I read the patches carefully and the translations look correct
and definitely help with this confusing mess of method names.

> It is a net reduction in lines, even though some of the callers end up a
> little longer (because they have to stuff pointers into an object_info
> struct). If that's too distasteful, the middle ground is to have a
> helper like:
>
>   void *foo(struct repository *r, const struct object_id *oid,
>             enum object_type *type, unsigned long *size,
>             unsigned flags)
>   {
>           struct object_info oi = OBJECT_INFO_INIT;
>           void *content;
>
>           oi.typep = type;
>           oi.sizep = size;
>           oi.contentp = &content;
>
>           if (oid_object_info_extended(r, oid, &oi, flags) < 0)
>                   return NULL;
>           return content;
>   }
>
> which is basically the same as read_object(), but makes it clear that
> you can pass OBJECT_INFO flags. The trouble is that I could not come up
> with a name for it that was not confusing. ;) So just having most places
> call oid_object_info_extended() directly seemed better. It would be nice
> if that function had a shorter name, too, but I left that for another
> day.

I did think that requiring callers to create their own object_info
structs (which takes at least four lines) would be too much, but
the number of new callers is so low that I think this is a fine place
to stop.

Thanks,
-Stolee

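The wrapper idea from the quoted cover letter can be tried outside of git. What follows is only a sketch under loud assumptions: "struct object_info" here is a cut-down stand-in with three fields, lookup_extended() is a hypothetical stub in place of oid_object_info_extended(), objects are identified by a plain int, and read_with_flags() is an invented name for the unnamed foo() helper.

```c
#include <stddef.h>

/* Stand-in for git's struct object_info; only the fields used here. */
struct object_info {
	int *typep;
	unsigned long *sizep;
	void **contentp;
};
#define OBJECT_INFO_INIT { 0 }

/* Hypothetical stub for oid_object_info_extended(): it knows one object, id 1. */
static int lookup_extended(int id, struct object_info *oi, unsigned flags)
{
	static char blob[] = "hello";

	(void)flags; /* a real lookup would act on the flags */
	if (id != 1)
		return -1;
	/* Fill in only the outputs the caller asked for. */
	if (oi->typep)
		*oi->typep = 3;
	if (oi->sizep)
		*oi->sizep = sizeof(blob) - 1;
	if (oi->contentp)
		*oi->contentp = blob;
	return 0;
}

/* The wrapper: stuff the pointers into the query struct, forward the flags. */
static void *read_with_flags(int id, int *type, unsigned long *size,
			     unsigned flags)
{
	struct object_info oi = OBJECT_INFO_INIT;
	void *content;

	oi.typep = type;
	oi.sizep = size;
	oi.contentp = &content; /* address of the local, not its value */

	if (lookup_extended(id, &oi, flags) < 0)
		return NULL;
	return content;
}
```

The four lines of setup inside read_with_flags() are exactly what each direct caller has to write once the wrapper is dropped, which is the trade-off being weighed in this message.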
On Mon, Jan 09, 2023 at 10:09:32AM -0500, Derrick Stolee wrote:

> I did think that requiring callers to create their own object_info
> structs (which takes at least four lines) would be too much, but
> the number of new callers is so low that I think this is a fine place
> to stop.

Yeah, that was my feeling. I do wonder if there's a way to make it
easier for callers of oid_object_info_extended(), but I couldn't come up
with anything that's nice enough to merit the complexity.

For example, here's an attempt to let the caller use designated
initializers to set up the query struct:

diff --git a/object-file.c b/object-file.c
index 80b08fc389..60ca75d755 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1700,13 +1700,12 @@ void *repo_read_object_file(struct repository *r,
 			    enum object_type *type,
 			    unsigned long *size)
 {
-	struct object_info oi = OBJECT_INFO_INIT;
 	unsigned flags = OBJECT_INFO_DIE_IF_CORRUPT | OBJECT_INFO_LOOKUP_REPLACE;
 	void *data;
+	struct object_info oi = OBJECT_INFO(.typep = type,
+					    .sizep = size,
+					    .contentp = &data);
 
-	oi.typep = type;
-	oi.sizep = size;
-	oi.contentp = &data;
 	if (oid_object_info_extended(r, oid, &oi, flags))
 		return NULL;
 
diff --git a/object-store.h b/object-store.h
index 1a713d89d7..e894cee61b 100644
--- a/object-store.h
+++ b/object-store.h
@@ -418,7 +418,8 @@ struct object_info {
  * Initializer for a "struct object_info" that wants no items. You may
  * also memset() the memory to all-zeroes.
  */
-#define OBJECT_INFO_INIT { 0 }
+#define OBJECT_INFO(...) { 0, __VA_ARGS__ }
+#define OBJECT_INFO_INIT OBJECT_INFO()
 
 /* Invoke lookup_replace_object() on the given hash */
 #define OBJECT_INFO_LOOKUP_REPLACE 1

But:

  - it actually triggers a gcc warning, since OBJECT_INFO(.typep = foo)
    sets typep twice (once for the default "0", and once by name). In
    this case the "0" is superfluous, since that's the default, and we
    could just do:

      #define OBJECT_INFO(...) { __VA_ARGS__ }
      #define OBJECT_INFO_INIT OBJECT_INFO(0)

    but I was hoping to find a general technique for object
    initializers.

  - it's not really that much shorter than the existing code. The real
    benefit of "data = read_object(oid, type, size)" is the implicit
    number and names of the parameters. And the way to get that is to
    provide an extra function.

So I think we are better off with the code that is longer but totally
obvious, unless we really want to add a function wrapper for common
queries as syntactic sugar.

-Peff

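The "{ __VA_ARGS__ }" variant mentioned in the first bullet can be checked in isolation. This is a minimal sketch, assuming a mock "struct query" whose field names mirror the thread but which is not git's struct object_info; QUERY(), QUERY_INIT and the helper functions are hypothetical names.

```c
#include <stddef.h>

/* Mock query struct; field names follow the thread, the type is not git's. */
struct query {
	int *typep;
	unsigned long *sizep;
	void **contentp;
};

/* Zero is the only default, so the macro just forwards its arguments. */
#define QUERY(...) { __VA_ARGS__ }
/* An empty "{ }" is not valid before C23, hence the explicit 0 here. */
#define QUERY_INIT QUERY(0)

/* Count how many outputs a query asks for. */
static int fields_requested(const struct query *q)
{
	return (q->typep != NULL) + (q->sizep != NULL) + (q->contentp != NULL);
}

/* Two fields are named; contentp is left to implicit zero-initialization. */
static int demo_named(void)
{
	int type;
	unsigned long size;
	struct query q = QUERY(.typep = &type, .sizep = &size);

	return fields_requested(&q);
}

/* The "wants no items" case. */
static int demo_empty(void)
{
	struct query q = QUERY_INIT;

	return fields_requested(&q);
}
```

Nothing is initialized twice here, so the duplicate-initializer warning does not come up. The cost is that the macro can no longer carry a non-zero default, which is the generality the message was hoping for.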
On 1/11/2023 1:26 PM, Jeff King wrote:
> On Mon, Jan 09, 2023 at 10:09:32AM -0500, Derrick Stolee wrote:
>
>> I did think that requiring callers to create their own object_info
>> structs (which takes at least four lines) would be too much, but
>> the number of new callers is so low that I think this is a fine place
>> to stop.
>
> Yeah, that was my feeling. I do wonder if there's a way to make it
> easier for callers of oid_object_info_extended(), but I couldn't come up
> with anything that's nice enough to merit the complexity.
>
> For example, here's an attempt to let the caller use designated
> initializers to set up the query struct:

> +	struct object_info oi = OBJECT_INFO(.typep = type,
> +					    .sizep = size,
> +					    .contentp = &data);

Your macro expansion creates this format:

	struct object_info oi = {
		.typep = type,
		.sizep = size,
		.contentp = &data,
	};

And even this expansion looks a bit better than the inline
updates:

> -	oi.typep = type;
> -	oi.sizep = size;
> -	oi.contentp = &data;

So maybe that's a preferred pattern that we could establish by
replacing the existing callers. It's also such a minor point that
I wouldn't say it's a high priority to do.

Thanks,
-Stolee

On Wed, Jan 11, 2023 at 03:17:58PM -0500, Derrick Stolee wrote:

> > For example, here's an attempt to let the caller use designated
> > initializers to set up the query struct:
>
> > +	struct object_info oi = OBJECT_INFO(.typep = type,
> > +					    .sizep = size,
> > +					    .contentp = &data);
>
> Your macro expansion creates this format:
>
> 	struct object_info oi = {
> 		.typep = type,
> 		.sizep = size,
> 		.contentp = &data,
> 	};
>
> And even this expansion looks a bit better than the inline
> updates:

There's a subtle assumption in the expanded initializer, though, which
is that everything not specified is OK to be zero-initialized. That
works for object_info, but not for arbitrary structs (which is why we
have these INIT macros in the first place).

-Peff

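The zero-initialization caveat is easy to reproduce with a type that has a non-zero invariant. The sketch below assumes a hypothetical "struct text" whose "buf" must always point at a valid string; it is loosely modelled on the strbuf discussion later in the thread but is not git's strbuf, and every name in it is invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical string type: "buf" must always point at a valid string. */
struct text {
	size_t len;
	const char *buf;
};

/* Shared empty string that a freshly initialized value points at. */
static const char text_empty[] = "";
#define TEXT_INIT { .buf = text_empty }

/* Relies on the invariant: dereferences buf without a NULL check. */
static size_t text_strlen(const struct text *t)
{
	return strlen(t->buf);
}

/* Reports whether a value satisfies the invariant TEXT_INIT sets up. */
static int text_is_valid(const struct text *t)
{
	return t->buf != NULL;
}
```

A value declared as "struct text t = TEXT_INIT;" is safe to hand to text_strlen(), while a bare designated initializer such as "{ .len = 0 }" leaves buf as NULL and fails text_is_valid(). That gap is what an INIT macro is there to close.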
On Wed, Jan 11 2023, Jeff King wrote:

> On Mon, Jan 09, 2023 at 10:09:32AM -0500, Derrick Stolee wrote:
>
>> I did think that requiring callers to create their own object_info
>> structs (which takes at least four lines) would be too much, but
>> the number of new callers is so low that I think this is a fine place
>> to stop.
>
> Yeah, that was my feeling. I do wonder if there's a way to make it
> easier for callers of oid_object_info_extended(), but I couldn't come up
> with anything that's nice enough to merit the complexity.
>
> For example, here's an attempt to let the caller use designated
> initializers to set up the query struct:
>
> diff --git a/object-file.c b/object-file.c
> index 80b08fc389..60ca75d755 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1700,13 +1700,12 @@ void *repo_read_object_file(struct repository *r,
>  			    enum object_type *type,
>  			    unsigned long *size)
>  {
> -	struct object_info oi = OBJECT_INFO_INIT;
>  	unsigned flags = OBJECT_INFO_DIE_IF_CORRUPT | OBJECT_INFO_LOOKUP_REPLACE;
>  	void *data;
> +	struct object_info oi = OBJECT_INFO(.typep = type,
> +					    .sizep = size,
> +					    .contentp = &data);
>  
> -	oi.typep = type;
> -	oi.sizep = size;
> -	oi.contentp = &data;
>  	if (oid_object_info_extended(r, oid, &oi, flags))
>  		return NULL;
>  
> diff --git a/object-store.h b/object-store.h
> index 1a713d89d7..e894cee61b 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -418,7 +418,8 @@ struct object_info {
>   * Initializer for a "struct object_info" that wants no items. You may
>   * also memset() the memory to all-zeroes.
>   */
> -#define OBJECT_INFO_INIT { 0 }
> +#define OBJECT_INFO(...) { 0, __VA_ARGS__ }
> +#define OBJECT_INFO_INIT OBJECT_INFO()
>  
>  /* Invoke lookup_replace_object() on the given hash */
>  #define OBJECT_INFO_LOOKUP_REPLACE 1
>
> But:
>
>   - it actually triggers a gcc warning, since OBJECT_INFO(.typep = foo)
>     sets typep twice (once for the default "0", and once by name). In
>     this case the "0" is superfluous, since that's the default, and we
>     could just do:
>
>       #define OBJECT_INFO(...) { __VA_ARGS__ }
>       #define OBJECT_INFO_INIT OBJECT_INFO(0)
>
>     but I was hoping to find a general technique for object
>     initializers.
>
>   - it's not really that much shorter than the existing code. The real
>     benefit of "data = read_object(oid, type, size)" is the implicit
>     number and names of the parameters. And the way to get that is to
>     provide an extra function.
>
> So I think we are better off with the code that is longer but totally
> obvious, unless we really want to add a function wrapper for common
> queries as syntactic sugar.
>
> -Peff

I agree that it's probably not worth it here, but I think you're just
tying yourself in knots in trying to define these macros in terms of
each other. This sort of thing will work if you just do:

diff --git a/object-store.h b/object-store.h
index e894cee61ba..bfcd2482dc5 100644
--- a/object-store.h
+++ b/object-store.h
@@ -418,8 +418,8 @@ struct object_info {
  * Initializer for a "struct object_info" that wants no items. You may
  * also memset() the memory to all-zeroes.
  */
-#define OBJECT_INFO(...) { 0, __VA_ARGS__ }
-#define OBJECT_INFO_INIT OBJECT_INFO()
+#define OBJECT_INFO_INIT { 0 }
+#define OBJECT_INFO(...) { __VA_ARGS__ }
 
 /* Invoke lookup_replace_object() on the given hash */
 #define OBJECT_INFO_LOOKUP_REPLACE 1

Which is just a twist on René's suggestion from [1], i.e.:

	#define CHILD_PROCESS_INIT_EX(...) { .args = STRVEC_INIT, __VA_ARGS__ }

In that case we always need to rely on the "args" being init'd, and the
GCC warning you note is a feature, its initialization is "private", and
you should never override it.

But likewise you don't need the "0" there, if the user provides an empty
list that's their own fault, they should use OBJECT_INFO_INIT instead.

If they do provide arguments it's an implementation detail how any
"default" arguments get init'd, if they're not clobbering any "private"
arguments we're OK.

So using an explicit "0" is the same as providing nothing in the
"*_ARGS()" case, in both cases we're just offloading that zero-init to
the language.

The only way I think you can dig yourself into a proper hole here is if
you're trying to support 0 or N args, as P99 shows that's possible, but
quite complex (and not worth it, IMO).

1. https://lore.kernel.org/git/749f6adc-928a-0978-e3a1-2ede9f07def0@web.de/

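The "private default plus caller designators" shape of CHILD_PROCESS_INIT_EX can be illustrated with a mock. This is a sketch only: "struct proc", its fields, and the PROC_* macros are hypothetical and merely imitate the pattern, they are not the run-command API.

```c
#include <stddef.h>

/* Hypothetical process description; "args" plays the private member. */
struct proc {
	const char *args; /* must never be NULL */
	int no_stdin;
	const char *dir;
};

#define PROC_ARGS_DEFAULT ""
/* Plain initializer: only the private default is set. */
#define PROC_INIT { .args = PROC_ARGS_DEFAULT }
/* Extended form: callers append their own designators after the default. */
#define PROC_INIT_EX(...) { .args = PROC_ARGS_DEFAULT, __VA_ARGS__ }

/* Encode which members are set: bit 0 args, bit 1 no_stdin, bit 2 dir. */
static int proc_summary(const struct proc *p)
{
	return (p->args != NULL) | (p->no_stdin ? 2 : 0) | (p->dir ? 4 : 0);
}

/* Caller names two public members; the private default is still applied. */
static int demo_ex(void)
{
	struct proc p = PROC_INIT_EX(.no_stdin = 1, .dir = "/tmp");

	return proc_summary(&p);
}

/* Caller wants nothing extra and uses the plain macro. */
static int demo_plain(void)
{
	struct proc p = PROC_INIT;

	return proc_summary(&p);
}
```

As long as callers never name ".args" themselves, no member is initialized twice, and a compiler warning about a repeated designator would point at a genuine misuse of the macro.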
On Thu, Jan 12, 2023 at 10:21:46AM +0100, Ævar Arnfjörð Bjarmason wrote:

> I agree that it's probably not worth it here, but I think you're just
> tying yourself in knots in trying to define these macros in terms of
> each other. This sort of thing will work if you just do:
>
> diff --git a/object-store.h b/object-store.h
> index e894cee61ba..bfcd2482dc5 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -418,8 +418,8 @@ struct object_info {
>   * Initializer for a "struct object_info" that wants no items. You may
>   * also memset() the memory to all-zeroes.
>   */
> -#define OBJECT_INFO(...) { 0, __VA_ARGS__ }
> -#define OBJECT_INFO_INIT OBJECT_INFO()
> +#define OBJECT_INFO_INIT { 0 }
> +#define OBJECT_INFO(...) { __VA_ARGS__ }

Right, that works because the initializer is just "0", which the
compiler can do for us implicitly. I agree it works here to omit, but as
a general solution, it doesn't.

> Which is just a twist on René's suggestion from [1], i.e.:
>
> 	#define CHILD_PROCESS_INIT_EX(...) { .args = STRVEC_INIT, __VA_ARGS__ }
>
> In that case we always need to rely on the "args" being init'd, and the
> GCC warning you note is a feature, its initialization is "private", and
> you should never override it.

Right, and it works here because you'd never want to init .args to
anything else (which I think is what you mean by "private"). But in the
general case the defaults can't set something that the caller might want
to override, because the compiler's warning doesn't know the difference
between "override" and "oops, you specified this twice".

It's mostly a non-issue because we tend to prefer 0-initialization when
possible, but I think as a general technique this is probably opening a
can of worms for little benefit.

-Peff

On Thu, Jan 12 2023, Jeff King wrote:

> On Thu, Jan 12, 2023 at 10:21:46AM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> I agree that it's probably not worth it here, but I think you're just
>> tying yourself in knots in trying to define these macros in terms of
>> each other. This sort of thing will work if you just do:
>>
>> diff --git a/object-store.h b/object-store.h
>> index e894cee61ba..bfcd2482dc5 100644
>> --- a/object-store.h
>> +++ b/object-store.h
>> @@ -418,8 +418,8 @@ struct object_info {
>>   * Initializer for a "struct object_info" that wants no items. You may
>>   * also memset() the memory to all-zeroes.
>>   */
>> -#define OBJECT_INFO(...) { 0, __VA_ARGS__ }
>> -#define OBJECT_INFO_INIT OBJECT_INFO()
>> +#define OBJECT_INFO_INIT { 0 }
>> +#define OBJECT_INFO(...) { __VA_ARGS__ }
>
> Right, that works because the initializer is just "0", which the
> compiler can do for us implicitly. I agree it works here to omit, but as
> a general solution, it doesn't.
>
>> Which is just a twist on René's suggestion from [1], i.e.:
>>
>> 	#define CHILD_PROCESS_INIT_EX(...) { .args = STRVEC_INIT, __VA_ARGS__ }
>>
>> In that case we always need to rely on the "args" being init'd, and the
>> GCC warning you note is a feature, its initialization is "private", and
>> you should never override it.
>
> Right, and it works here because you'd never want to init .args to
> anything else (which I think is what you mean by "private"). But in the
> general case the defaults can't set something that the caller might want
> to override, because the compiler's warning doesn't know the difference
> between "override" and "oops, you specified this twice".
>
> It's mostly a non-issue because we tend to prefer 0-initialization when
> possible, but I think as a general technique this is probably opening a
> can of worms for little benefit.

You're right in the general case, although I think that if we did
encounter such a use-case a perfectly good solution would be to just
suppress the GCC-specific warning with the relevant GCC-specific macro
magic, this being perfectly valid C, just something it (rightly, as it's
almost always a mistake) complains about.

But I can't think of a case where this would matter for us in practice.
We have members like "struct strbuf"'s "buf", which always needs to be
init'd, but never "maybe by the user", so the pattern above would work
there.

Then we have things like "strdup_strings" which we might imagine that
the user would override (with a hypothetical "struct string_list" that
took more arguments), but in those cases we could just add another init
macro, as "STRING_LIST_INIT_{DUP,NODUP}" does.

For any such member we could always just invert its boolean state, if it
came to that, couldn't we?

Anyway, I agree that it's not worth pursuing this in this case.

But I think it's a neat pattern that we might find use for sooner than
later for something else.

I don't think it's worth the churn to change it at this point (except
maybe with a sufficiently clever coccinelle rule), but I think it's
already "worth it" in the case of the run-command API, if we were adding
that code today under current constraints (i.e. being able to use C99
macro features).

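The "suppress the warning" route can be sketched as follows. Assumptions are stated up front: "struct opts", OPTS_INIT and the default depth of 50 are hypothetical, and the suppression relies on GCC's "#pragma GCC diagnostic" with "-Woverride-init", the warning GCC includes in -Wextra for a repeated initializer. In standard C the later initializer for the same member overrides the earlier one.

```c
/* Hypothetical options struct with a non-zero default a caller may replace. */
struct opts {
	int depth;
	int verbose;
};

#define OPTS_DEFAULT_DEPTH 50

/* Silence the repeated-designator warning for the code in this region. */
#if defined(__GNUC__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Woverride-init"
#endif

/* Default first; any designator the caller passes comes later and wins. */
#define OPTS_INIT(...) { .depth = OPTS_DEFAULT_DEPTH, __VA_ARGS__ }

/* Caller leaves depth alone, so the default applies. */
static struct opts opts_default(void)
{
	struct opts o = OPTS_INIT(.verbose = 1);

	return o;
}

/* Caller overrides depth: ".depth" appears twice after expansion. */
static struct opts opts_shallow(void)
{
	struct opts o = OPTS_INIT(.depth = 1);

	return o;
}

#if defined(__GNUC__)
#pragma GCC diagnostic pop
#endif
```

The downside raised earlier in the thread still holds: with the warning silenced, an accidental duplicate such as OPTS_INIT(.verbose = 1, .verbose = 0) is accepted just as quietly as an intended override.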
On Thu, Jan 12, 2023 at 05:22:04PM +0100, Ævar Arnfjörð Bjarmason wrote:

> We have members like "struct strbuf"'s "buf", which always needs to be
> init'd, but never "maybe by the user", so the pattern above would work
> there.

We've discussed in the past having a strbuf that points to an existing
buffer, over which it takes ownership. Or a const string that we'd leave
behind (but not free) if we needed to grow. In those cases you'd want to
pass in a buffer to the initializer.

Of course in the case of a strbuf those initializers would probably just
be totally separate from the regular slopbuf one, just because there's
not much else in a strbuf to initialize. You don't gain much from trying
to avoid repetition.

> Anyway, I agree that it's not worth pursuing this in this case.
>
> But I think it's a neat pattern that we might find use for sooner than
> later for something else.

I remain unconvinced. ;) Mostly just that the lines saved versus the
amount of magic and thought doesn't seem reasonable. But it's something
we can keep in mind as new opportunities show up.

-Peff