Message ID | patch-7.8-0a4fb01ae38-20210628T191634Z-avarab@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | serve: add "startup_config" callback | expand |
On Mon, Jun 28, 2021 at 09:19:24PM +0200, Ævar Arnfjörð Bjarmason wrote: > So let's support this common pattern as a "startup_config" callback, > making use of our recently added "call_{advertise,command}()" > functions. This allows us to simplify e.g. the "ensure_config_read()" > function added in 59e1205d167 (ls-refs: report unborn targets of > symrefs, 2021-02-05). > > We could read all the config for all the protocol capabilities, but > let's do it one callback at a time in anticipation that some won't be > called at all, and that some might be more expensive than others in > the future. Sadly I don't think this addresses my "v2 receive-pack" concern. The ls_refs_startup_config() function will get called after we've received a request for "ls-refs", which is good. But: > +static void read_startup_config(struct protocol_capability *command) > +{ > + if (!command->startup_config) > + return; > + if (command->have_startup_config++) > + return; > + git_config(command->startup_config, NULL); > +} ...we don't pass any context to the config callback here. I thought passing "command" might work, but looking at the ls_refs() function, it is the one who actually reads the pkt-lines that will tell us "hey, I'm doing an ls-refs for a push". So none of the serve() infrastructure can help us there; we need to read pkt-lines and _then_ read config. I dunno. Maybe the solution is for ls_refs() to just do a separate config call to pick up the operation-specific bits, like: diff --git a/ls-refs.c b/ls-refs.c index 88f6c3f60d..6ee70126aa 100644 --- a/ls-refs.c +++ b/ls-refs.c @@ -130,12 +130,13 @@ static void send_possibly_unborn_head(struct ls_refs_data *data) static int ls_refs_config(const char *var, const char *value, void *data) { + struct ls_refs_data *d = data; /* * We only serve fetches over v2 for now, so respect only "uploadpack" * config. This may need to eventually be expanded to "receive", but we * don't yet know how that information will be passed to ls-refs. */ - return parse_hide_refs_config(var, value, "uploadpack"); + return parse_hide_refs_config(var, value, d->for_push ? "receive" : "uploadpack"); } int ls_refs(struct repository *r, struct strvec *keys, @@ -147,7 +148,6 @@ int ls_refs(struct repository *r, struct strvec *keys, strvec_init(&data.prefixes); ensure_config_read(); - git_config(ls_refs_config, NULL); while (packet_reader_read(request) == PACKET_READ_NORMAL) { const char *arg = request->line; @@ -161,8 +161,12 @@ int ls_refs(struct repository *r, struct strvec *keys, strvec_push(&data.prefixes, out); else if (!strcmp("unborn", arg)) data.unborn = allow_unborn; + else if (!strcmp("for-push", arg)) /* imagine this exists */ + data.for_push = 1; } + git_config(ls_refs_config, &data); + if (request->status != PACKET_READ_FLUSH) die(_("expected flush after ls-refs arguments")); And then it is a separate thing entirely from your "serve will read config" code. It's arguably more confusing, because now config is read in two places, but that is already true because of this ensure_config_read() thing. The patch above obviously makes no sense until we're working on v2 receive-pack. But I think it illustrates that your patch here is not getting in the way (though technically I think that would also be true of your v1, I do like this version better). -Peff
On Thu, Jul 01, 2021 at 12:43:43PM -0400, Jeff King wrote: > I dunno. Maybe the solution is for ls_refs() to just do a separate > config call to pick up the operation-specific bits, like: By the way, I think both currently and after the patch I showed, ls_refs() has the same "bug" that we fixed for upload_pack_v2() a while ago: in a v2 world, a client could request "ls-refs" over and over, and each time we'd load the hiderefs config, appending duplicate config to the list each time. In practice this doesn't happen because unlike "fetch", which clients must do many rounds of, clients usually issue only a single ls-refs. So it may not be worth worrying too much about. I guess a malicious client could convince us to very slowly allocate an arbitrary amount of memory. -Peff
On Thu, Jul 01 2021, Jeff King wrote: > On Mon, Jun 28, 2021 at 09:19:24PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> So let's support this common pattern as a "startup_config" callback, >> making use of our recently added "call_{advertise,command}()" >> functions. This allows us to simplify e.g. the "ensure_config_read()" >> function added in 59e1205d167 (ls-refs: report unborn targets of >> symrefs, 2021-02-05). >> >> We could read all the config for all the protocol capabilities, but >> let's do it one callback at a time in anticipation that some won't be >> called at all, and that some might be more expensive than others in >> the future. > > Sadly I don't think this addresses my "v2 receive-pack" concern. The > ls_refs_startup_config() function will get called after we've received a > request for "ls-refs", which is good. But: I think it does in that you rightly objected to us moving all config to such a callback, because for some of it we don't have the information needed to look it up yet, we do that in the request handler. But for a lot of our config it's fine to do it early, hence "startup" config. Yes I've moved the ls-refs handling into the "startup" because /right now/ it's only handling fetches, it'll need to be moved out if and when we start handling pushes. But isn't it going to be obvious that we'll need to do that then? Since we'll have the example of upload-pack.c doing that exact thing? I.e. do you not want to have the "startup config" concept at all, or would just prefer to have the ls-refs part of it pre-emotively moved out of it in anticipation of handling pushes some day, even if we can do that on "startup" now? (More below) >> +static void read_startup_config(struct protocol_capability *command) >> +{ >> + if (!command->startup_config) >> + return; >> + if (command->have_startup_config++) >> + return; >> + git_config(command->startup_config, NULL); >> +} > > ...we don't pass any context to the config callback here. I thought > passing "command" might work, but looking at the ls_refs() function, it > is the one who actually reads the pkt-lines that will tell us "hey, I'm > doing an ls-refs for a push". > > So none of the serve() infrastructure can help us there; we need to read > pkt-lines and _then_ read config. > > I dunno. Maybe the solution is for ls_refs() to just do a separate > config call to pick up the operation-specific bits, like: > > diff --git a/ls-refs.c b/ls-refs.c > index 88f6c3f60d..6ee70126aa 100644 > --- a/ls-refs.c > +++ b/ls-refs.c > @@ -130,12 +130,13 @@ static void send_possibly_unborn_head(struct ls_refs_data *data) > > static int ls_refs_config(const char *var, const char *value, void *data) > { > + struct ls_refs_data *d = data; > /* > * We only serve fetches over v2 for now, so respect only "uploadpack" > * config. This may need to eventually be expanded to "receive", but we > * don't yet know how that information will be passed to ls-refs. > */ > - return parse_hide_refs_config(var, value, "uploadpack"); > + return parse_hide_refs_config(var, value, d->for_push ? "receive" : "uploadpack"); > } > > int ls_refs(struct repository *r, struct strvec *keys, > @@ -147,7 +148,6 @@ int ls_refs(struct repository *r, struct strvec *keys, > strvec_init(&data.prefixes); > > ensure_config_read(); > - git_config(ls_refs_config, NULL); > > while (packet_reader_read(request) == PACKET_READ_NORMAL) { > const char *arg = request->line; > @@ -161,8 +161,12 @@ int ls_refs(struct repository *r, struct strvec *keys, > strvec_push(&data.prefixes, out); > else if (!strcmp("unborn", arg)) > data.unborn = allow_unborn; > + else if (!strcmp("for-push", arg)) /* imagine this exists */ > + data.for_push = 1; > } > > + git_config(ls_refs_config, &data); > + > if (request->status != PACKET_READ_FLUSH) > die(_("expected flush after ls-refs arguments")); > > > And then it is a separate thing entirely from your "serve will read > config" code. It's arguably more confusing, because now config is read > in two places, but that is already true because of this > ensure_config_read() thing. This suggests to me you'd like to preemptively move it out of "startup", correct? Anyway, I can do that if that addresses your concern. I thought the v1 objection was mainly "the config flow won't work that way in all cases", which you're right that I incorrectly assumed. I just thought preemptively doing it for "ls-refs" wouldn't be needed, since we'd notice in testing such a feature that "do it once" would break in obvious ways for multi-requests, especially with the comment for the "startup_config" callback explicitly calling out that case. > The patch above obviously makes no sense until we're working on v2 > receive-pack. But I think it illustrates that your patch here is not > getting in the way (though technically I think that would also be true > of your v1, I do like this version better).
On Fri, Jul 02, 2021 at 02:55:38PM +0200, Ævar Arnfjörð Bjarmason wrote: > > Sadly I don't think this addresses my "v2 receive-pack" concern. The > > ls_refs_startup_config() function will get called after we've received a > > request for "ls-refs", which is good. But: > > I think it does in that you rightly objected to us moving all config to > such a callback, because for some of it we don't have the information > needed to look it up yet, we do that in the request handler. > > But for a lot of our config it's fine to do it early, hence "startup" > config. So I think your patch is OK, but just to lay out my thinking, it was this. The v2 serve code basically does: 1. Somebody calls serve(). 2. It reads a capability/operation name, then dispatches to the appropriate function. 3. That function reads operation-specific data, like options. Your first patch read config at step 1. This one reads at step 2. But from past discussions, the hide-refs config could only be read (or interpreted) after step 3, because that's when we'd see an option saying "this is for pushing". So anything except step 3 is "too late". But that doesn't stop us from having some early config and some operation-specific late config. > Yes I've moved the ls-refs handling into the "startup" because /right > now/ it's only handling fetches, it'll need to be moved out if and when > we start handling pushes. > > But isn't it going to be obvious that we'll need to do that then? Since > we'll have the example of upload-pack.c doing that exact thing? > > I.e. do you not want to have the "startup config" concept at all, or > would just prefer to have the ls-refs part of it pre-emotively moved out > of it in anticipation of handling pushes some day, even if we can do > that on "startup" now? Sorry to be so confusing. About halfway through writing my previous email I had realized "well, I guess we can have two phases of config-reading, and that's not so bad". So I'm OK with this direction. I don't think anything else needs to happen on top of your patch here. The "late" phase for ls-refs config reading already happens totally separately (it's in the git_config(ls_refs_config) call, which is already separate from the ensure_config_read() thing. And your patch leaves that line untouched. > > And then it is a separate thing entirely from your "serve will read > > config" code. It's arguably more confusing, because now config is read > > in two places, but that is already true because of this > > ensure_config_read() thing. > > This suggests to me you'd like to preemptively move it out of "startup", > correct? It's already out of the startup, so I think we are OK. My big question here was: is the concept of "startup config" and "other config we read after seeing the options" too confusing? But probably not. > I just thought preemptively doing it for "ls-refs" wouldn't be needed, > since we'd notice in testing such a feature that "do it once" would > break in obvious ways for multi-requests, especially with the comment > for the "startup_config" callback explicitly calling out that case. Yeah, we'd definitely notice. I was just wondering if we'd end up needing to back out these changes to accommodate it. But the two-phase thing solves that. -Peff
On Mon, Jun 28, 2021 at 9:19 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > So let's support this common pattern as a "startup_config" callback, > making use of our recently added "call_{advertise,command}()" I think this is an improvement over ensure_config_read(), but I don't understand the whole picture. If you want to control configuration and be sure it is set early, isn't it more natural to put this in struct serve_options? With your patch, there are still two routes to injecting configuration: 'struct serve_options' and the git config. IMO the former is more principled, and the control flow (serve_options is an argument to serve()) ensures that the options are read only once. > + /* > + * A git_config() callback that'll be called only once for the > + * lifetime of the process, possibly over many different Putting it in a struct supplied by the caller provides more flexibility for testing: you could test multiple behaviors from a single process, by calling serve multiple times supplying different arguments.
On Mon, Jun 28, 2021 at 9:19 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > /* Main serve loop for protocol version 2 */ > void serve(struct serve_options *options) side note: this is a very short name for a globally visible symbol. Would be nice if this were something like protocol_v2_serve_loop() . Then one could drop the comment.
diff --git a/ls-refs.c b/ls-refs.c index 88f6c3f60d8..02fbcfd9bde 100644 --- a/ls-refs.c +++ b/ls-refs.c @@ -7,37 +7,26 @@ #include "pkt-line.h" #include "config.h" -static int config_read; -static int advertise_unborn; -static int allow_unborn; +/* "unborn" is on by default if there's no lsrefs.unborn config */ +static int advertise_unborn = 1; +static int allow_unborn = 1; -static void ensure_config_read(void) +int ls_refs_startup_config(const char *var, const char *value, void *data) { - const char *str = NULL; - - if (config_read) - return; - - if (repo_config_get_string_tmp(the_repository, "lsrefs.unborn", &str)) { - /* - * If there is no such config, advertise and allow it by - * default. - */ - advertise_unborn = 1; - allow_unborn = 1; - } else { - if (!strcmp(str, "advertise")) { - advertise_unborn = 1; + if (!strcmp(var, "lsrefs.unborn")) { + if (!strcmp(value, "advertise")) { + /* Allowed and advertised by default */ + } else if (!strcmp(value, "allow")) { + advertise_unborn = 0; allow_unborn = 1; - } else if (!strcmp(str, "allow")) { - allow_unborn = 1; - } else if (!strcmp(str, "ignore")) { - /* do nothing */ + } else if (!strcmp(value, "ignore")) { + advertise_unborn = 0; + allow_unborn = 0; } else { - die(_("invalid value '%s' for lsrefs.unborn"), str); + die(_("invalid value '%s' for lsrefs.unborn"), value); } } - config_read = 1; + return 0; } /* @@ -145,8 +134,6 @@ int ls_refs(struct repository *r, struct strvec *keys, memset(&data, 0, sizeof(data)); strvec_init(&data.prefixes); - - ensure_config_read(); git_config(ls_refs_config, NULL); while (packet_reader_read(request) == PACKET_READ_NORMAL) { @@ -179,7 +166,6 @@ int ls_refs(struct repository *r, struct strvec *keys, int ls_refs_advertise(struct repository *r, struct strbuf *value) { if (value) { - ensure_config_read(); if (advertise_unborn) strbuf_addstr(value, "unborn"); } diff --git a/ls-refs.h b/ls-refs.h index a99e4be0bdd..5fcab0d1dca 100644 --- a/ls-refs.h +++ b/ls-refs.h @@ -6,6 +6,7 @@ struct strvec; struct packet_reader; int ls_refs(struct repository *r, struct strvec *keys, struct packet_reader *request); +int ls_refs_startup_config(const char *var, const char *value, void *data); int ls_refs_advertise(struct repository *r, struct strbuf *value); #endif /* LS_REFS_H */ diff --git a/serve.c b/serve.c index 6dbd05248b9..cce96dd5a81 100644 --- a/serve.c +++ b/serve.c @@ -33,6 +33,13 @@ static int object_format_advertise(struct repository *r, return 1; } +static int session_id_startup_config(const char *var, const char *value, void *data) +{ + if (!strcmp(var, "transfer.advertisesid")) + advertise_sid = git_config_bool(var, value); + return 0; +} + static int session_id_advertise(struct repository *r, struct strbuf *value) { if (!advertise_sid) @@ -54,6 +61,24 @@ struct protocol_capability { */ const char *name; + /* + * A git_config() callback that'll be called only once for the + * lifetime of the process, possibly over many different + * requests. Used for reading config that's expected to be + * static. + * + * The "command" or "advertise" callbacks themselves are + * expected to read config that needs to be more current than + * that, or which is dependent on request data. + */ + int (*startup_config)(const char *var, const char *value, void *data); + + /* + * A boolean to check if we've called our "startup_config" + * callback. + */ + int have_startup_config; + /* * Function queried to see if a capability should be advertised. * Optionally a value can be specified by adding it to 'value'. @@ -81,6 +106,7 @@ static struct protocol_capability capabilities[] = { }, { .name = "ls-refs", + .startup_config = ls_refs_startup_config, .advertise = ls_refs_advertise, .command = ls_refs, }, @@ -99,6 +125,7 @@ static struct protocol_capability capabilities[] = { }, { .name = "session-id", + .startup_config = session_id_startup_config, .advertise = session_id_advertise, }, { @@ -108,6 +135,15 @@ static struct protocol_capability capabilities[] = { }, }; +static void read_startup_config(struct protocol_capability *command) +{ + if (!command->startup_config) + return; + if (command->have_startup_config++) + return; + git_config(command->startup_config, NULL); +} + static int call_advertise(struct protocol_capability *command, struct repository *r, struct strbuf *value) { @@ -115,6 +151,8 @@ static int call_advertise(struct protocol_capability *command, struct strbuf sb = STRBUF_INIT; const char *msg; + read_startup_config(command); + strbuf_addf(&sb, "advertise/%s", command->name); trace2_region_enter("serve", sb.buf, r); ret = command->advertise(r, value); @@ -132,6 +170,8 @@ static int call_command(struct protocol_capability *command, int ret; struct strbuf sb = STRBUF_INIT; + read_startup_config(command); + strbuf_addf(&sb, "command/%s", command->name); trace2_region_enter("serve", sb.buf, r); ret = command->command(r, keys, request); @@ -338,8 +378,6 @@ static int process_request(void) /* Main serve loop for protocol version 2 */ void serve(struct serve_options *options) { - git_config_get_bool("transfer.advertisesid", &advertise_sid); - if (options->advertise_capabilities || !options->stateless_rpc) { /* serve by default supports v2 */ packet_write_fmt(1, "version 2\n");
Since the introduction of serve.c we've added git_config() callbacks and other config reading for capabilities in the following commits: - e20b4192a37 (upload-pack: support hidden refs with protocol v2, 2018-12-18) - 08450ef7918 (upload-pack: clear filter_options for each v2 fetch command, 2020-05-08) - 6b5b6e422ee (serve: advertise session ID in v2 capabilities, 2020-11-11) - 59e1205d167 (ls-refs: report unborn targets of symrefs, 2021-02-05) Of these 08450ef7918 fixed code that needed to read config on a per-request basis, whereas most of the config reading just wants to check if we've enabled one semi-static config variable or other. We'd like to re-read that value eventually, but from request-to-request it's OK if we retain the old one, and it isn't impacted by other request data. So let's support this common pattern as a "startup_config" callback, making use of our recently added "call_{advertise,command}()" functions. This allows us to simplify e.g. the "ensure_config_read()" function added in 59e1205d167 (ls-refs: report unborn targets of symrefs, 2021-02-05). We could read all the config for all the protocol capabilities, but let's do it one callback at a time in anticipation that some won't be called at all, and that some might be more expensive than others in the future. I'm not migrating over the code in the upload_pack_v2 function in upload-pack.c yet, that case is more complex since it deals with both v1 and v2. It will be dealt with in a code a subsequent commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- ls-refs.c | 42 ++++++++++++++---------------------------- ls-refs.h | 1 + serve.c | 42 ++++++++++++++++++++++++++++++++++++++++-- 3 files changed, 55 insertions(+), 30 deletions(-)