[2/5] refs: make `is_pseudoref_syntax()` stricter

Message ID	20240119142705.139374-3-karthik.188@gmail.com (mailing list archive)
State	Superseded
Headers	show Received: from mail-ed1-f50.google.com (mail-ed1-f50.google.com [209.85.208.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E71C453818 for <git@vger.kernel.org>; Fri, 19 Jan 2024 14:27:11 +0000 (UTC) From: Karthik Nayak <karthik.188@gmail.com> To: git@vger.kernel.org Cc: Karthik Nayak <karthik.188@gmail.com> Subject: [PATCH 2/5] refs: make `is_pseudoref_syntax()` stricter Date: Fri, 19 Jan 2024 15:27:02 +0100 Message-ID: <20240119142705.139374-3-karthik.188@gmail.com> In-Reply-To: <20240119142705.139374-1-karthik.188@gmail.com> References: <20240119142705.139374-1-karthik.188@gmail.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	for-each-ref: print all refs on empty string pattern \| expand [0/5] for-each-ref: print all refs on empty string pattern [1/5] refs: expose `is_pseudoref_syntax()` [2/5] refs: make `is_pseudoref_syntax()` stricter [3/5] refs: extract out `loose_fill_ref_dir_regular_file()` [4/5] refs: introduce `refs_for_each_all_refs()` [5/5] for-each-ref: avoid filtering on empty pattern

Karthik Nayak Jan. 19, 2024, 2:27 p.m. UTC

The `is_pseudoref_syntax()` function is used to determine if a
particular refname follows the pseudoref syntax. The pseudoref syntax is
loosely defined at this instance as all refs which are in caps and use
underscores. Most of the pseudorefs also have the "HEAD" suffix.

Using this information we make the `is_pseudoref_syntax()` function
stricter, by adding the check for "HEAD" suffix and for refs which don't
end with the HEAD suffix, matching them with a predetermined list.

This requires fixing up t1407 to use the "HEAD" suffix for creation of
pseudorefs.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 refs.c                        | 21 ++++++++++++++++++---
 t/t1407-worktree-ref-store.sh | 12 ++++++------
 2 files changed, 24 insertions(+), 9 deletions(-)

Junio C Hamano Jan. 19, 2024, 8:44 p.m. UTC | #1

Karthik Nayak <karthik.188@gmail.com> writes:

> Using this information we make the `is_pseudoref_syntax()` function
> stricter, by adding the check for "HEAD" suffix and for refs which don't
> end with the HEAD suffix, matching them with a predetermined list.

OK, so this partly answers my question on the previous step.  Before
making it more strict, the function worked only on the "syntax", so
a random string that can be a pseudo ref passed the check.

But stepping back a bit, if we call this function is_pseudoref_syntax(),
wouldn't it be what we want to have anyway?  You seem to want a
separate function called is_pseudoref() that rejects bogus uppercase
string "FOO_BAR" while accepting the known-good pseudoref you add
tests for, plus the ${FOO}_HEAD for any value of ${FOO} we currently
have and we may want to add in the future.

>  int is_pseudoref_syntax(const char *refname)
>  {
> +	/* TODO: move these pseudorefs to have _HEAD suffix */
> +	static const char *const irregular_pseudorefs[] = {
> +		"BISECT_EXPECTED_REV",
> +		"NOTES_MERGE_PARTIAL",
> +		"NOTES_MERGE_REF",
> +		"AUTO_MERGE"
> +	};
> +	size_t i;
>  	const char *c;
>  
>  	for (c = refname; *c; c++) {
> @@ -837,10 +845,17 @@ int is_pseudoref_syntax(const char *refname)
>  	}
>  
>  	/*
> -	 * HEAD is not a pseudoref, but it certainly uses the
> -	 * pseudoref syntax.
> +	 * Most pseudorefs end with _HEAD. HEAD itself is not a
> +	 * pseudoref, but it certainly uses the pseudoref syntax.
>  	 */
> -	return 1;
> +	if (ends_with(refname, "HEAD"))
> +		return 1;

I would imagine that at the final stage in which something like this
will be named is_pseudoref(), asking is_pseudoref("HEAD") would
return "No" (even though "is_pseudoref_syntax()", if the helper
function remains, may say "Yes" to "HEAD").  And this ends_with()
will use "_HEAD", instead of "HEAD".  But I am reading ahead of
myself, so let's keep going.

> +	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
> +		if (!strcmp(refname, irregular_pseudorefs[i]))
> +			return 1;
> +
> +	return 0;
>  }

Phillip Wood Jan. 22, 2024, 8:13 p.m. UTC | #2

Hi Karthik

On 19/01/2024 14:27, Karthik Nayak wrote:
> The `is_pseudoref_syntax()` function is used to determine if a
> particular refname follows the pseudoref syntax. The pseudoref syntax is
> loosely defined at this instance as all refs which are in caps and use
> underscores. Most of the pseudorefs also have the "HEAD" suffix.
> 
> Using this information we make the `is_pseudoref_syntax()` function
> stricter, by adding the check for "HEAD" suffix and for refs which don't
> end with the HEAD suffix, matching them with a predetermined list.
> 
> This requires fixing up t1407 to use the "HEAD" suffix for creation of
> pseudorefs.

I'm concerned that this change is a regression. is_pseudoref_syntax() is 
used by is_current_worktree_ref() and so scripts that create pseudorefs 
that do not conform to your new rules will break as git will no-longer 
consider the pseudorefs they create to be worktree specific. The list of 
hard coded exceptions also looks quite short, I can see MERGE_AUTOSTASH 
and BISECT_START are missing and there are probably others I've not 
thought of.

The commit message would be a good place to discuss why you're making 
this change, the implications of the change and any alternative 
approaches that you considered. As I understand it you're tying to get 
round the problem that the files backend stores pseudorefs mixed up with 
other non-ref files in $GIT_DIR. Another approach would be to read all 
the files whose name matches the pseudoref syntax and see if its 
contents looks like a valid ref skipping names like COMMIT_EDITMSG that 
we know are not pseudorefs.

Best Wishes

Phillip

> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
> ---
>   refs.c                        | 21 ++++++++++++++++++---
>   t/t1407-worktree-ref-store.sh | 12 ++++++------
>   2 files changed, 24 insertions(+), 9 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index 5999605230..b84e173762 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -829,6 +829,14 @@ int is_per_worktree_ref(const char *refname)
>   
>   int is_pseudoref_syntax(const char *refname)
>   {
> +	/* TODO: move these pseudorefs to have _HEAD suffix */
> +	static const char *const irregular_pseudorefs[] = {
> +		"BISECT_EXPECTED_REV",
> +		"NOTES_MERGE_PARTIAL",
> +		"NOTES_MERGE_REF",
> +		"AUTO_MERGE"
> +	};
> +	size_t i;
>   	const char *c;
>   
>   	for (c = refname; *c; c++) {
> @@ -837,10 +845,17 @@ int is_pseudoref_syntax(const char *refname)
>   	}
>   
>   	/*
> -	 * HEAD is not a pseudoref, but it certainly uses the
> -	 * pseudoref syntax.
> +	 * Most pseudorefs end with _HEAD. HEAD itself is not a
> +	 * pseudoref, but it certainly uses the pseudoref syntax.
>   	 */
> -	return 1;
> +	if (ends_with(refname, "HEAD"))
> +		return 1;
> +
> +	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
> +		if (!strcmp(refname, irregular_pseudorefs[i]))
> +			return 1;
> +
> +	return 0;
>   }
>   
>   static int is_current_worktree_ref(const char *ref) {
> diff --git a/t/t1407-worktree-ref-store.sh b/t/t1407-worktree-ref-store.sh
> index 05b1881c59..53592c95f3 100755
> --- a/t/t1407-worktree-ref-store.sh
> +++ b/t/t1407-worktree-ref-store.sh
> @@ -61,18 +61,18 @@ test_expect_success 'create_symref(FOO, refs/heads/main)' '
>   # PSEUDO-WT and refs/bisect/random do not create reflogs by default, so it is
>   # not testing a realistic scenario.
>   test_expect_success REFFILES 'for_each_reflog()' '
> -	echo $ZERO_OID > .git/logs/PSEUDO-MAIN &&
> +	echo $ZERO_OID >.git/logs/PSEUDO_MAIN_HEAD &&
>   	mkdir -p     .git/logs/refs/bisect &&
> -	echo $ZERO_OID > .git/logs/refs/bisect/random &&
> +	echo $ZERO_OID >.git/logs/refs/bisect/random &&
>   
> -	echo $ZERO_OID > .git/worktrees/wt/logs/PSEUDO-WT &&
> +	echo $ZERO_OID >.git/worktrees/wt/logs/PSEUDO_WT_HEAD &&
>   	mkdir -p     .git/worktrees/wt/logs/refs/bisect &&
> -	echo $ZERO_OID > .git/worktrees/wt/logs/refs/bisect/wt-random &&
> +	echo $ZERO_OID >.git/worktrees/wt/logs/refs/bisect/wt-random &&
>   
>   	$RWT for-each-reflog | cut -d" " -f 2- | sort >actual &&
>   	cat >expected <<-\EOF &&
>   	HEAD 0x1
> -	PSEUDO-WT 0x0
> +	PSEUDO_WT_HEAD 0x0
>   	refs/bisect/wt-random 0x0
>   	refs/heads/main 0x0
>   	refs/heads/wt-main 0x0
> @@ -82,7 +82,7 @@ test_expect_success REFFILES 'for_each_reflog()' '
>   	$RMAIN for-each-reflog | cut -d" " -f 2- | sort >actual &&
>   	cat >expected <<-\EOF &&
>   	HEAD 0x1
> -	PSEUDO-MAIN 0x0
> +	PSEUDO_MAIN_HEAD 0x0
>   	refs/bisect/random 0x0
>   	refs/heads/main 0x0
>   	refs/heads/wt-main 0x0

Junio C Hamano Jan. 22, 2024, 8:22 p.m. UTC | #3

Phillip Wood <phillip.wood123@gmail.com> writes:

> I'm concerned that this change is a regression. is_pseudoref_syntax()
> is used by is_current_worktree_ref() and so scripts that create
> pseudorefs that do not conform to your new rules will break as git
> will no-longer consider the pseudorefs they create to be worktree
> specific.

Ideally, when the exception list in the function becomes more
complete, those "pseudorefs" created by those scripts shouldn't
probably be created either as common or worktree specific thing
if they are not "pseudoref".

> The list of hard coded exceptions also looks quite short, I
> can see MERGE_AUTOSTASH and BISECT_START are missing and there are
> probably others I've not thought of.

I agree that it is something we need to fix.

> The commit message would be a good place to discuss why you're making
> this change, the implications of the change and any alternative
> approaches that you considered. As I understand it you're tying to get
> round the problem that the files backend stores pseudorefs mixed up
> with other non-ref files in $GIT_DIR.

Yup.  The rationale may want to be explained better.

> Another approach would be to
> read all the files whose name matches the pseudoref syntax and see if
> its contents looks like a valid ref skipping names like COMMIT_EDITMSG
> that we know are not pseudorefs.

In the longer term, I'd prefer to see a simpler rule, like "all-caps
or underscore string, ending with _HEAD and nothing else are the
pseudorefs but we have these small number of exceptions that are
grandfathered".

Thanks.

Phillip Wood Jan. 23, 2024, 11:03 a.m. UTC | #4

Hi Junio

On 22/01/2024 20:22, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> 
>> I'm concerned that this change is a regression. is_pseudoref_syntax()
>> is used by is_current_worktree_ref() and so scripts that create
>> pseudorefs that do not conform to your new rules will break as git
>> will no-longer consider the pseudorefs they create to be worktree
>> specific.
> 
> Ideally, when the exception list in the function becomes more
> complete, those "pseudorefs" created by those scripts shouldn't
> probably be created either as common or worktree specific thing
> if they are not "pseudoref".

I not sure I quite understand what you mean here. Are you saying that 
scripts should stop using "git update-ref" and "git rev-parse" for 
anything that does not match the new pseudoref syntax?

>> Another approach would be to
>> read all the files whose name matches the pseudoref syntax and see if
>> its contents looks like a valid ref skipping names like COMMIT_EDITMSG
>> that we know are not pseudorefs.
> 
> In the longer term, I'd prefer to see a simpler rule, like "all-caps
> or underscore string, ending with _HEAD and nothing else are the
> pseudorefs but we have these small number of exceptions that are
> grandfathered".

Hopefully such a rule would stop us adding pseudorefs that are really 
private state like MERGE_AUTOSTASH. I think that is good in the long 
term but isn't it is happening now with this patch without any warning 
to users? This patch changes the behavior of parse_worktree_ref() which 
the files backend uses to figure out the path it should use when reading 
and writing a ref.

Best Wishes

Phillip

Patrick Steinhardt Jan. 23, 2024, 11:16 a.m. UTC | #5

On Mon, Jan 22, 2024 at 12:22:49PM -0800, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> > The list of hard coded exceptions also looks quite short, I
> > can see MERGE_AUTOSTASH and BISECT_START are missing and there are
> > probably others I've not thought of.
> 
> I agree that it is something we need to fix.

I've taken a deeper look at BISECT_START because I previously missed it
in my conversion to make special refs become normal pseudo refs. But as
it turns out, BISECT_START is not a ref at all.

Depending on how you execute git-bisect(1), it will either contain the
object ID of the detached HEAD or the branch you're starting the bisect
from. This information is used to switch back to that state when you
abort the bisect. So far this sounds like a proper ref indeed. But in
case you're starting from a branch it will not be a symref that points
to this branch, but it will just contain the branch name. This is not a
valid format that could be read as a loose ref, and thus this file is
not a proper ref at all (except that sometimes it behaves like one when
starting from a detached HEAD).

My first hunch was to convert it so that it indeed always is a proper
ref. But thinking about it a bit more I'm less convinced that this is
sensible as it is deeply tied to the behaviour of git-bisect(1) and only
represents its internal state. I thus came to the conclusion that it is
more similar to the sequencer state that we have in ".git/rebase-merge"
and ".git/rebase-apply" than anything else.

So if we wanted to rectify this, I think the most sensible way to
address this would be to introduce a new ".git/bisect-state" directory
that contains all of git-bisect(1)'s state:

    - BISECT_TERMS -> bisect-state/terms
    - BISECT_LOG -> bisect-state/log
    - BISECT_START -> bisect-state/start
    - BISECT_RUN -> bisect-state/run
    - BISECT_FIRST_PARENT -> bisect-state/first-parent
    - BISECT_ANCESTORS_OK -> bisect-state/ancestors-ok

I think this would make for a much cleaner solution overall as things
are neatly contained. Cleaning up after a bisect would thus only require
a delete of ".git/bisect-state/" and we're done.

Of course, this would be a backwards-incompatible change. We could
transition to that newer schema by having newer Git versions recognize
both ways to store the state, but only ever write the new schema. But
I'm not sure whether it would ultimately be worth it.

Patrick

Karthik Nayak Jan. 23, 2024, 12:49 p.m. UTC | #6

Hello Phillip,

Phillip Wood <phillip.wood123@gmail.com> writes:
>
> Hopefully such a rule would stop us adding pseudorefs that are really
> private state like MERGE_AUTOSTASH. I think that is good in the long
> term but isn't it is happening now with this patch without any warning
> to users? This patch changes the behavior of parse_worktree_ref() which
> the files backend uses to figure out the path it should use when reading
> and writing a ref.
>

I do agree with the problem you're outlining here. Changing
`is_pseudoref_syntax()` does indeed break things since its also used by
`parse_worktree_ref()`.

I first thought I could get around this by adding the required missing
refs, but even that wouldn't work. Because BISECT_START has dual nature,
it act as a ref and also as file storing a branch name as Patrick
mentions in detail in his email [1]. Meaning if `is_pseudoref_syntax()`
identifies it as a pseudoref, it could be wrong and printing it as such
might not work. But we can't not match it because that is the current
expectation.

So there is no way to make `is_pseudoref_syntax()` stricter without
breaking backward compatibility. While we do want to reach that goal, we
have to go about in the other way around, that i.e.
1. Fix all pseudorefs to have the '_HEAD' suffix.
2. Move bisect files to '$GIT_DIR/bisect-state' (see [1] for more
details).
After this, we can safely make `is_pseudoref_syntax()` stricter.

Given this, I think for the next version, I'll do the following changes:
1. keep `is_pseudoref_syntax()` as is.
2. introduce `is_pseudoref()` which calls `is_pseudoref_syntax()` and
also checks the content of the file.
3. replace use of `is_pseudoref_syntax()` with `is_pseudoref()` in this
patch series.

[1]: https://public-inbox.org/git/20240119142705.139374-1-karthik.188@gmail.com/T/#m6e3790e30613fd68349708faaf5f4d9c704ba677

Phillip Wood Jan. 23, 2024, 4:30 p.m. UTC | #7

Hi Patrick

On 23/01/2024 11:16, Patrick Steinhardt wrote:
> On Mon, Jan 22, 2024 at 12:22:49PM -0800, Junio C Hamano wrote:
>> Phillip Wood <phillip.wood123@gmail.com> writes:
>>> The list of hard coded exceptions also looks quite short, I
>>> can see MERGE_AUTOSTASH and BISECT_START are missing and there are
>>> probably others I've not thought of.
>>
>> I agree that it is something we need to fix.
> 
> I've taken a deeper look at BISECT_START because I previously missed it
> in my conversion to make special refs become normal pseudo refs. But as
> it turns out, BISECT_START is not a ref at all.
> > Depending on how you execute git-bisect(1), it will either contain the
> object ID of the detached HEAD or the branch you're starting the bisect
> from. This information is used to switch back to that state when you
> abort the bisect. So far this sounds like a proper ref indeed. But in
> case you're starting from a branch it will not be a symref that points
> to this branch, but it will just contain the branch name. This is not a
> valid format that could be read as a loose ref, and thus this file is
> not a proper ref at all (except that sometimes it behaves like one when
> starting from a detached HEAD).

Thank you, I'd missed that

> My first hunch was to convert it so that it indeed always is a proper
> ref. But thinking about it a bit more I'm less convinced that this is
> sensible as it is deeply tied to the behaviour of git-bisect(1) and only
> represents its internal state. I thus came to the conclusion that it is
> more similar to the sequencer state that we have in ".git/rebase-merge"
> and ".git/rebase-apply" than anything else.
> 
> So if we wanted to rectify this, I think the most sensible way to
> address this would be to introduce a new ".git/bisect-state" directory
> that contains all of git-bisect(1)'s state:
> 
>      - BISECT_TERMS -> bisect-state/terms
>      - BISECT_LOG -> bisect-state/log
>      - BISECT_START -> bisect-state/start
>      - BISECT_RUN -> bisect-state/run
>      - BISECT_FIRST_PARENT -> bisect-state/first-parent
>      - BISECT_ANCESTORS_OK -> bisect-state/ancestors-ok
> 
> I think this would make for a much cleaner solution overall as things
> are neatly contained. Cleaning up after a bisect would thus only require
> a delete of ".git/bisect-state/" and we're done.
> 
> Of course, this would be a backwards-incompatible change. We could
> transition to that newer schema by having newer Git versions recognize
> both ways to store the state, but only ever write the new schema. But
> I'm not sure whether it would ultimately be worth it.

I think that is a really good suggestion, it would bring bisect into 
line with am, rebase, cherry-pick etc. which keep their state in a 
subdirectory rather than cluttering up .git.

Best Wishes

Phillip

> Patrick

Phillip Wood Jan. 23, 2024, 4:40 p.m. UTC | #8

Hi Karthik

On 23/01/2024 12:49, Karthik Nayak wrote:
> Hello Phillip,
> 
> Phillip Wood <phillip.wood123@gmail.com> writes:
> [...]
> So there is no way to make `is_pseudoref_syntax()` stricter without
> breaking backward compatibility. While we do want to reach that goal, we
> have to go about in the other way around, that i.e.
> 1. Fix all pseudorefs to have the '_HEAD' suffix.

I'm wary of renaming user facing pseudorefs like AUTO_MERGE. In the 
future we should try and avoid adding any without a "_HEAD" suffix

> 2. Move bisect files to '$GIT_DIR/bisect-state' (see [1] for more
> details).
> After this, we can safely make `is_pseudoref_syntax()` stricter.
> 
> Given this, I think for the next version, I'll do the following changes:
> 1. keep `is_pseudoref_syntax()` as is.
> 2. introduce `is_pseudoref()` which calls `is_pseudoref_syntax()` and
> also checks the content of the file.

We could perhaps make is_pseudoref() stricter from the start as we're 
adding a new function, it could have a list of exceptions to the "a 
pseudoref must end with '_HEAD'" rule like this patch. Being strict 
about the pseudorefs we list with "git for-each-ref" might encourage 
users to update any scripts they have that create "pseudorefs" that do 
not match the new rules without us making any changes that actually 
break those scripts.

> 3. replace use of `is_pseudoref_syntax()` with `is_pseudoref()` in this
> patch series.

That sounds like a good way forward to me, lets see if Junio agrees.

Best Wishes

Phillip

> [1]: https://public-inbox.org/git/20240119142705.139374-1-karthik.188@gmail.com/T/#m6e3790e30613fd68349708faaf5f4d9c704ba677

Junio C Hamano Jan. 23, 2024, 5:38 p.m. UTC | #9

Phillip Wood <phillip.wood123@gmail.com> writes:

> I not sure I quite understand what you mean here. Are you saying that
> scripts should stop using "git update-ref" and "git rev-parse" for
> anything that does not match the new pseudoref syntax?

Yes, I am saying that, and also that we should have been stricter on
what we accept and consider as pseudorefs---not just any file that
sits directly under $GIT_DIR/., but ideally we should have limited
to "^[A-Z]*_HEAD$" or something.  The idea I am floating is to see if
such a tightening can be done now without hearing too many screams.

Junio C Hamano Jan. 23, 2024, 5:44 p.m. UTC | #10

Patrick Steinhardt <ps@pks.im> writes:

> My first hunch was to convert it so that it indeed always is a proper
> ref. But thinking about it a bit more I'm less convinced that this is
> sensible as it is deeply tied to the behaviour of git-bisect(1) and only
> represents its internal state. I thus came to the conclusion that it is
> more similar to the sequencer state that we have in ".git/rebase-merge"
> and ".git/rebase-apply" than anything else.

Fair enough.

> So if we wanted to rectify this, I think the most sensible way to
> address this would be to introduce a new ".git/bisect-state" directory
> that contains all of git-bisect(1)'s state:
>
>     - BISECT_TERMS -> bisect-state/terms
>     - BISECT_LOG -> bisect-state/log
>     - BISECT_START -> bisect-state/start
>     - BISECT_RUN -> bisect-state/run
>     - BISECT_FIRST_PARENT -> bisect-state/first-parent
>     - BISECT_ANCESTORS_OK -> bisect-state/ancestors-ok
>
> I think this would make for a much cleaner solution overall as things
> are neatly contained. Cleaning up after a bisect would thus only require
> a delete of ".git/bisect-state/" and we're done.

And bisect-state/ needs to be marked as per-worktree hierarchy, I suppose.

> Of course, this would be a backwards-incompatible change.

As long as we ignore folks who switches versions of Git in the
middle of their "git bisect" session, we should be OK.

If we really cared the backward compatibility, the new version of
Git that knows and uses this new layout could notice these old-style
filenames and move them over to the new place under new names.  From
there, everything should work (including things like "git bisect log").

Thanks.

Junio C Hamano Jan. 23, 2024, 5:46 p.m. UTC | #11

Karthik Nayak <karthik.188@gmail.com> writes:

> Given this, I think for the next version, I'll do the following changes:
> 1. keep `is_pseudoref_syntax()` as is.
> 2. introduce `is_pseudoref()` which calls `is_pseudoref_syntax()` and
> also checks the content of the file.
> 3. replace use of `is_pseudoref_syntax()` with `is_pseudoref()` in this
> patch series.

The content check in 2. was something that was mentioned in an
earlier discussion, lack of which I completely missed during the
review of this current round.  Sounds very good to add that.

Thanks.

Patrick Steinhardt Jan. 24, 2024, 8:51 a.m. UTC | #12

On Tue, Jan 23, 2024 at 09:44:21AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > My first hunch was to convert it so that it indeed always is a proper
> > ref. But thinking about it a bit more I'm less convinced that this is
> > sensible as it is deeply tied to the behaviour of git-bisect(1) and only
> > represents its internal state. I thus came to the conclusion that it is
> > more similar to the sequencer state that we have in ".git/rebase-merge"
> > and ".git/rebase-apply" than anything else.
> 
> Fair enough.
> 
> > So if we wanted to rectify this, I think the most sensible way to
> > address this would be to introduce a new ".git/bisect-state" directory
> > that contains all of git-bisect(1)'s state:
> >
> >     - BISECT_TERMS -> bisect-state/terms
> >     - BISECT_LOG -> bisect-state/log
> >     - BISECT_START -> bisect-state/start
> >     - BISECT_RUN -> bisect-state/run
> >     - BISECT_FIRST_PARENT -> bisect-state/first-parent
> >     - BISECT_ANCESTORS_OK -> bisect-state/ancestors-ok
> >
> > I think this would make for a much cleaner solution overall as things
> > are neatly contained. Cleaning up after a bisect would thus only require
> > a delete of ".git/bisect-state/" and we're done.
> 
> And bisect-state/ needs to be marked as per-worktree hierarchy, I suppose.

Yes, "bisect-state/" would need to be stored in GIT_DIR, not COMMON_DIR.

> > Of course, this would be a backwards-incompatible change.
> 
> As long as we ignore folks who switches versions of Git in the
> middle of their "git bisect" session, we should be OK.
> 
> If we really cared the backward compatibility, the new version of
> Git that knows and uses this new layout could notice these old-style
> filenames and move them over to the new place under new names.  From
> there, everything should work (including things like "git bisect log").

We also have consider that there may be alternate implementations of Git
that would only know to handle the old layout. Those tools would be
broken in case we did such a migration, but they would be broken anyway
if the bisect was started via Git and not via the tool.

Anyway, I'll add this to our growing backlog of issues that we might
want to investigate once the reftable backend has been upstreamed. Which
of course shouldn't preclude anybody else from picking up this topic in
case they are interested.

Patrick

[2/5] refs: make `is_pseudoref_syntax()` stricter

Commit Message

Comments

Patch