Message ID | 4cb4b69c-bd14-dfbd-6d06-59a7cd7e8c94@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | coccinelle: add and apply branch_get() rules | expand |
Rubén Justo <rjusto@gmail.com> writes: > There are three supported ways to obtain a "struct branch *" for the > currently checked out branch, in the current worktree, using the API > branch_get(): branch_get(NULL), branch_get("") and branch_get("HEAD"). > > The first one is the recommended [1][2] and optimal usage. Let's add > two coccinelle rules to convert the latter two into the first one. > > 1. f019d08ea6 (API documentation for remote.h, 2008-02-19) > > 2. d27eb356bf (remote: move doc to remote.h and refspec.h, 2019-11-17) Citing commits in the past is not an optimal way to justify a recommendation, though. It does not show that these recommendations made earlier are still current. Phrasing it this way Among them, the comment in remote.h for "struct branch" recommends branch_get(NULL) for HEAD. might be a way to say that it comes from the current source, and when future readers of "git log" stumbles on this commit, they will also understand why the change was made. > diff --git a/contrib/coccinelle/branch_get.cocci b/contrib/coccinelle/branch_get.cocci > new file mode 100644 > index 0000000000..3ec5b59723 > --- /dev/null > +++ b/contrib/coccinelle/branch_get.cocci > @@ -0,0 +1,10 @@ > +@@ > +@@ > +- branch_get("HEAD") > ++ branch_get(NULL) > +@@ > +@@ > +- branch_get("") > ++ branch_get(NULL) > + I am not sure about these rules. Noybody is passing "" to ask for HEAD in the current code. Neither $ git log -S'branch_get("")' shows anything. The first one does modify existing calls, but there are many calls to branch_get() that pass a computed value in a strbuf or a variable. Do we know they are not passing "HEAD" or ""? Stepping back a bit. What is the ultimate goal for this change? Are we going to insist that the currently checked out branch MUST be asked for by passing NULL and not "" or "HEAD" to branch_get()? If that is the goal, then it almost makes me wonder if we should just do the attached patch, plus your changes to the callers, without adding any new Coccinelle rules, and finding and fixing the fallouts by inspecting all the call graph that leads to branch_get(). If that is not the goal, and we will keep acepting NULL, "", and "HEAD" as equivalents, the value of updating callers with literal "HEAD" to pass NULL is rather dubious. To be fair, I do not think we would not see if "branch_get(NULL) must be the only way to ask for the current branch" is a good goal, until we at least try a little. Maybe during such a conversion that starts by erroring when the function is called with "HEAD" or "" (attached below), we might find that we need to change an existing caller (or many of them) to do something silly like this: return_type a_caller(const char *branch_name, ...) { - struct branch *branch = branch_get(branch_name); + struct branch *branch; + + if (branch_name && + (!strcmp(branch_name, "HEAD") || !*branch_name)) + branch = branch_get(NULL); + else + branch = branch_get(branch_name); ... use branch_name ... ... use branch ... in which case we may start doubtint the goal to always use NULL for "HEAD". So, I dunno. The patch does not make anything worse (other than adding two extra Coccinelle rules), but to me, the ultimate goal ("where does it want to take us") is unclear. remote.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git i/remote.c w/remote.c index 3a831cb530..3788dd3fa0 100644 --- i/remote.c +++ w/remote.c @@ -1829,8 +1829,10 @@ struct branch *branch_get(const char *name) struct branch *ret; read_config(the_repository); - if (!name || !*name || !strcmp(name, "HEAD")) + if (!name) ret = the_repository->remote_state->current_branch; + else if (!*name || !strcmp(name, "HEAD")) + BUG("use NULL for HEAD to call branch_get()"); else ret = make_branch(the_repository->remote_state, name, strlen(name));
Junio C Hamano <gitster@pobox.com> writes: > To be fair, I do not think we would not see if "branch_get(NULL) > must be the only way to ask for the current branch" is a good goal, > until we at least try a little. Maybe during such a conversion that Argh. My double negatives are horrible. What I meant is that we may not know until we try, at least a little bit. Sorry for being a bad writer.
On 07-abr-2023 08:55:53, Junio C Hamano wrote: > Rubén Justo <rjusto@gmail.com> writes: > > > There are three supported ways to obtain a "struct branch *" for the > > currently checked out branch, in the current worktree, using the API > > branch_get(): branch_get(NULL), branch_get("") and branch_get("HEAD"). > > > > The first one is the recommended [1][2] and optimal usage. Let's add > > two coccinelle rules to convert the latter two into the first one. > > > > 1. f019d08ea6 (API documentation for remote.h, 2008-02-19) > > > > 2. d27eb356bf (remote: move doc to remote.h and refspec.h, 2019-11-17) > > Citing commits in the past is not an optimal way to justify a > recommendation, though. Well, my intention is to state that the recommendation is not recent. Perhaps it is confusing to not state clearly that it is also current. > > diff --git a/contrib/coccinelle/branch_get.cocci b/contrib/coccinelle/branch_get.cocci > > new file mode 100644 > > index 0000000000..3ec5b59723 > > --- /dev/null > > +++ b/contrib/coccinelle/branch_get.cocci > > @@ -0,0 +1,10 @@ > > +@@ > > +@@ > > +- branch_get("HEAD") > > ++ branch_get(NULL) > > +@@ > > +@@ > > +- branch_get("") > > ++ branch_get(NULL) > > + > > I am not sure about these rules. Noybody is passing "" to ask for > HEAD in the current code. Neither > > $ git log -S'branch_get("")' I'm not sure if there is any path that might use "", but the consideration is there since introduced in cf818348f1 (Report information on branches from remote.h, 2007-09-10). > > shows anything. The first one does modify existing calls, but there > are many calls to branch_get() that pass a computed value in a > strbuf or a variable. Do we know they are not passing "HEAD" or ""? > > Stepping back a bit. What is the ultimate goal for this change? Of course, as you pointed out, there are usages where a computed value is used, perhaps coming from the user, which might end up specifying "HEAD". Those usages of branch_get() are not considered here. Not even indirect ones. Having said that, the goal in this change is to aid following, now and in the future, when using a literal with branch_get(), the recommendation we already have. Which, IMHO, is also the optimal usage. As a collateral, we save some cycles; either at runtime, avoiding the if (!strcmp("HEAD", "HEAD")) to the user; or better, at compile time, saving the compiler from optimizing out that strcmp. I have to admit I have this change in mind, not in the current form, but in the same direction, since my patches for builtin/branch.c, a few months ago. When, reviewing the use of branch_get() I was a bit confused. Thanks.
Rubén Justo <rjusto@gmail.com> writes: > On 07-abr-2023 08:55:53, Junio C Hamano wrote: >> Rubén Justo <rjusto@gmail.com> writes: >> >> > There are three supported ways to obtain a "struct branch *" for the >> > currently checked out branch, in the current worktree, using the API >> > branch_get(): branch_get(NULL), branch_get("") and branch_get("HEAD"). >> > >> > The first one is the recommended [1][2] and optimal usage. Let's add >> > two coccinelle rules to convert the latter two into the first one. >> > >> > 1. f019d08ea6 (API documentation for remote.h, 2008-02-19) >> > >> > 2. d27eb356bf (remote: move doc to remote.h and refspec.h, 2019-11-17) >> >> Citing commits in the past is not an optimal way to justify a >> recommendation, though. > > Well, my intention is to state that the recommendation is not recent. > Perhaps it is confusing to not state clearly that it is also current. No matter how long ago the recommendation was originally written, we should by default consider that anything that appears in the current set of sources is still current. If it is stale and there is a better recommendation, you of course are welcome to update it, and if you were writing such a patch, it may make sense to explain the situation like: In the comment for "struct branch" in remote.h, we recommend to use branch_get(NULL) to find out the branch currently checked out. This recommendation dates back to f019d08e (API documentation for remote.h, 2008-02-19) in a separate documentation but later moved by d27eb356 (remote: move doc to remote.h and refspec.h, 2019-11-17) to the current location. However, the recommendation is out of date because ... or something. But that is not what this patch is about, is it? I do not think you really gain anything by showing that they date back long time, without making your position clear between "yes, it is a very long-standing tradition and majority of the existing code conforms to it" and "this ancient recommendation is iffy, and I think it should be updated". What you need to justify this change is to say that the recommendation _is_ current, anyway, so I do not know why you are arguing against my suggestion to improve your proposed log message. >> Stepping back a bit. What is the ultimate goal for this change? > > Of course, as you pointed out, there are usages where a computed value > is used, perhaps coming from the user, which might end up specifying > "HEAD". Those usages of branch_get() are not considered here. Not even > indirect ones. That is what I found problematic, because I do not think this particular change will get us closer to the endgame of not feedling "" or "HEAD", if ... > I have to admit I have this change in mind, not in the current form, but > in the same direction, since my patches for builtin/branch.c, a few > months ago. When, reviewing the use of branch_get() I was a bit > confused. ... it is the ultimate goal. These Coccinelle rules would not help us fish out existing callers that receive string "HEAD" from their callers and pass them unmodified to call branch_get() and convert them to pass NULL, for example. And for doing something like that and encode that into another set of Coccinelle rules, it would take auditing more and more indirect callers that reach branch_get(), but if we were doing that, we can do the "passing HEAD or empty is a BUG" patch to protect the function from future callers mistakingly passing "HEAD" or "" without Coccinelle rules that are rather costly to the CI. The above assumes that it is a good thing to declare that NULL is the only permitted way to ask for the branch currently checked out. I am a bit skeptical about that, though.
On 08-abr-2023 15:45:54, Junio C Hamano wrote: > I do not know why you are > arguing against my suggestion to improve your proposed log message. Sorry, that's not my intention. The recommendation still stands and the message was not clear about it. > >> Stepping back a bit. What is the ultimate goal for this change? > > > > Of course, as you pointed out, there are usages where a computed value > > is used, perhaps coming from the user, which might end up specifying > > "HEAD". Those usages of branch_get() are not considered here. Not even > > indirect ones. > > That is what I found problematic, because I do not think this > particular change will get us closer to the endgame of not feedling > "" or "HEAD", if ... The objective in this patch is to avoid having in the codebase branch_get("HEAD") in favor of branch_get(NULL). Because that's what we recommend and, anyway, a smart compiler is going to optimize out that strcmp with two literals. Therefore, we follow the recommendations and save some compiler effort in the way. But, branch_get() cannot stop supporting a computed value that ends being "HEAD", as a way to refer to the current branch. However, maybe you are suggesting so...
On Thu, Apr 06 2023, Rubén Justo wrote: > There are three supported ways to obtain a "struct branch *" for the > currently checked out branch, in the current worktree, using the API > branch_get(): branch_get(NULL), branch_get("") and branch_get("HEAD"). > > The first one is the recommended [1][2] and optimal usage. Let's add > two coccinelle rules to convert the latter two into the first one. > > 1. f019d08ea6 (API documentation for remote.h, 2008-02-19) > > 2. d27eb356bf (remote: move doc to remote.h and refspec.h, 2019-11-17) I wondered why it is that we don't just make passing "HEAD" an error, and what I thought was the case is why: It's because we use this API both for "internal" callers like what you modify below, but also for passing e.g. a "HEAD" as an argv element directly to the API, and don't want every command-line interface to hardcode the "HEAD" == NULL. So that makes sense. But do we need to support "" at all? changing branch_get() so that we do: if (name && !*name) BUG("pass NULL, not \"\""); Passes all our tests, but perhaps we have insufficient coverage. > Signed-off-by: Rubén Justo <rjusto@gmail.com> > --- > builtin/fetch.c | 2 +- > builtin/pull.c | 8 ++++---- > contrib/coccinelle/branch_get.cocci | 10 ++++++++++ We've typically named these rules after the API itself, in this case this is in remote.c, maybe we can just add a remote.cocci? > 3 files changed, 15 insertions(+), 5 deletions(-) > create mode 100644 contrib/coccinelle/branch_get.cocci > > diff --git a/builtin/fetch.c b/builtin/fetch.c > index 7221e57f35..45d81c8e02 100644 > --- a/builtin/fetch.c > +++ b/builtin/fetch.c > @@ -1738,7 +1738,7 @@ static int do_fetch(struct transport *transport, > commit_fetch_head(&fetch_head); > > if (set_upstream) { > - struct branch *branch = branch_get("HEAD"); > + struct branch *branch = branch_get(NULL); > struct ref *rm; > struct ref *source_ref = NULL; I wonder if we shouldn't just change all of thes to a new inline helper with a more obvious name, perhaps current_branch()? > diff --git a/contrib/coccinelle/branch_get.cocci b/contrib/coccinelle/branch_get.cocci > new file mode 100644 > index 0000000000..3ec5b59723 > --- /dev/null > +++ b/contrib/coccinelle/branch_get.cocci > @@ -0,0 +1,10 @@ > +@@ > +@@ > +- branch_get("HEAD") > ++ branch_get(NULL) > + > +@@ > +@@ > +- branch_get("") > ++ branch_get(NULL) > + You don't need this duplication, see contrib/coccinelle/the_repository.cocci. I think this should do the trick, although it's untested: @@ @@ branch_get( ( - "HEAD" + NULL | - "" + NULL ) ) A rule structured like that makes it clear that we're not changing the name, but just the argument.
On 16-abr-2023 15:56:56, Ævar Arnfjörð Bjarmason wrote: > > On Thu, Apr 06 2023, Rubén Justo wrote: > > > There are three supported ways to obtain a "struct branch *" for the > > currently checked out branch, in the current worktree, using the API > > branch_get(): branch_get(NULL), branch_get("") and branch_get("HEAD"). > > > > The first one is the recommended [1][2] and optimal usage. Let's add > > two coccinelle rules to convert the latter two into the first one. > > > > 1. f019d08ea6 (API documentation for remote.h, 2008-02-19) > > > > 2. d27eb356bf (remote: move doc to remote.h and refspec.h, 2019-11-17) > > I wondered why it is that we don't just make passing "HEAD" an error, > and what I thought was the case is why: It's because we use this API > both for "internal" callers like what you modify below, but also for > passing e.g. a "HEAD" as an argv element directly to the API, and don't > want every command-line interface to hardcode the "HEAD" == NULL. So > that makes sense. > > But do we need to support "" at all? changing branch_get() so that we do: > > if (name && !*name) > BUG("pass NULL, not \"\""); > > Passes all our tests, but perhaps we have insufficient coverage. I haven't found a use case where it is used, but the consideration is there since it was introduced in cf818348f1 (Report information on branches from remote.h, 2007-09-10). We can introduce that BUG() and see what happens. But I'm not sure the change is worth the potential noise. > > > Signed-off-by: Rubén Justo <rjusto@gmail.com> > > --- > > builtin/fetch.c | 2 +- > > builtin/pull.c | 8 ++++---- > > contrib/coccinelle/branch_get.cocci | 10 ++++++++++ > > We've typically named these rules after the API itself, in this case > this is in remote.c, maybe we can just add a remote.cocci? Is the new rule worth the cost in the CI? That's what I'm thinking about now. Maybe adding the rule to the message for future reference is enough. > > > 3 files changed, 15 insertions(+), 5 deletions(-) > > create mode 100644 contrib/coccinelle/branch_get.cocci > > > > diff --git a/builtin/fetch.c b/builtin/fetch.c > > index 7221e57f35..45d81c8e02 100644 > > --- a/builtin/fetch.c > > +++ b/builtin/fetch.c > > @@ -1738,7 +1738,7 @@ static int do_fetch(struct transport *transport, > > commit_fetch_head(&fetch_head); > > > > if (set_upstream) { > > - struct branch *branch = branch_get("HEAD"); > > + struct branch *branch = branch_get(NULL); > > struct ref *rm; > > struct ref *source_ref = NULL; > > I wonder if we shouldn't just change all of thes to a new inline helper > with a more obvious name, perhaps current_branch()? I had the same idea and explored it a bit. I ended up thinking that I was introducing a _new_ way of using the API. So, I dunno. > > diff --git a/contrib/coccinelle/branch_get.cocci b/contrib/coccinelle/branch_get.cocci > > new file mode 100644 > > index 0000000000..3ec5b59723 > > --- /dev/null > > +++ b/contrib/coccinelle/branch_get.cocci > > @@ -0,0 +1,10 @@ > > +@@ > > +@@ > > +- branch_get("HEAD") > > ++ branch_get(NULL) > > + > > +@@ > > +@@ > > +- branch_get("") > > ++ branch_get(NULL) > > + > > You don't need this duplication, see > contrib/coccinelle/the_repository.cocci. > > I think this should do the trick, although it's untested: > > @@ > @@ > branch_get( > ( > - "HEAD" > + NULL > | > - "" > + NULL > ) > ) > > A rule structured like that makes it clear that we're not changing the > name, but just the argument. > Thanks.
diff --git a/builtin/fetch.c b/builtin/fetch.c index 7221e57f35..45d81c8e02 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -1738,7 +1738,7 @@ static int do_fetch(struct transport *transport, commit_fetch_head(&fetch_head); if (set_upstream) { - struct branch *branch = branch_get("HEAD"); + struct branch *branch = branch_get(NULL); struct ref *rm; struct ref *source_ref = NULL; diff --git a/builtin/pull.c b/builtin/pull.c index 56f679d94a..fbb1cbea0a 100644 --- a/builtin/pull.c +++ b/builtin/pull.c @@ -332,7 +332,7 @@ static const char *config_get_ff(void) */ static enum rebase_type config_get_rebase(int *rebase_unspecified) { - struct branch *curr_branch = branch_get("HEAD"); + struct branch *curr_branch = branch_get(NULL); const char *value; if (curr_branch) { @@ -437,7 +437,7 @@ static int get_only_remote(struct remote *remote, void *cb_data) */ static void NORETURN die_no_merge_candidates(const char *repo, const char **refspecs) { - struct branch *curr_branch = branch_get("HEAD"); + struct branch *curr_branch = branch_get(NULL); const char *remote = curr_branch ? curr_branch->remote_name : NULL; if (*refspecs) { @@ -710,7 +710,7 @@ static const char *get_upstream_branch(const char *remote) if (!rm) return NULL; - curr_branch = branch_get("HEAD"); + curr_branch = branch_get(NULL); if (!curr_branch) return NULL; @@ -774,7 +774,7 @@ static int get_rebase_fork_point(struct object_id *fork_point, const char *repo, struct child_process cp = CHILD_PROCESS_INIT; struct strbuf sb = STRBUF_INIT; - curr_branch = branch_get("HEAD"); + curr_branch = branch_get(NULL); if (!curr_branch) return -1; diff --git a/contrib/coccinelle/branch_get.cocci b/contrib/coccinelle/branch_get.cocci new file mode 100644 index 0000000000..3ec5b59723 --- /dev/null +++ b/contrib/coccinelle/branch_get.cocci @@ -0,0 +1,10 @@ +@@ +@@ +- branch_get("HEAD") ++ branch_get(NULL) + +@@ +@@ +- branch_get("") ++ branch_get(NULL) +
There are three supported ways to obtain a "struct branch *" for the currently checked out branch, in the current worktree, using the API branch_get(): branch_get(NULL), branch_get("") and branch_get("HEAD"). The first one is the recommended [1][2] and optimal usage. Let's add two coccinelle rules to convert the latter two into the first one. 1. f019d08ea6 (API documentation for remote.h, 2008-02-19) 2. d27eb356bf (remote: move doc to remote.h and refspec.h, 2019-11-17) Signed-off-by: Rubén Justo <rjusto@gmail.com> --- builtin/fetch.c | 2 +- builtin/pull.c | 8 ++++---- contrib/coccinelle/branch_get.cocci | 10 ++++++++++ 3 files changed, 15 insertions(+), 5 deletions(-) create mode 100644 contrib/coccinelle/branch_get.cocci