Message ID | b5e7ddb1e30acb7e3871a189beb2c828b18f9e73.1714398019.git.ps@pks.im (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Clarify pseudo-ref terminology | expand |
Hi Patrick On 29/04/2024 14:41, Patrick Steinhardt wrote: > diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt > index d71b199955..4275918fa0 100644 > --- a/Documentation/glossary-content.txt > +++ b/Documentation/glossary-content.txt > @@ -497,20 +497,28 @@ exclude;; > unusual refs. > > [[def_pseudoref]]pseudoref:: > - Pseudorefs are a class of files under `$GIT_DIR` which behave > - like refs for the purposes of rev-parse, but which are treated > - specially by git. Pseudorefs both have names that are all-caps, > - and always start with a line consisting of a > - <<def_SHA1,SHA-1>> followed by whitespace. So, HEAD is not a > - pseudoref, because it is sometimes a symbolic ref. They might > - optionally contain some additional data. `MERGE_HEAD` and > - `CHERRY_PICK_HEAD` are examples. Unlike > - <<def_per_worktree_ref,per-worktree refs>>, these files cannot > - be symbolic refs, and never have reflogs. They also cannot be > - updated through the normal ref update machinery. Instead, > - they are updated by directly writing to the files. However, > - they can be read as if they were refs, so `git rev-parse > - MERGE_HEAD` will work. > + Pseudorefs are references that live in the root of the reference > + hierarchy, outside of the usual "refs/" hierarchy. Pseudorefs have an > + all-uppercase name and must end with a "_HEAD" suffix, for example > + "`BISECT_HEAD`". Other than that, pseudorefs behave the exact same as > + any other reference and can be both read and written via regular Git > + tooling. This changes the definition to allow pseudorefs to by symbolic refs. When is_pseudoref() was introduced Junio and I had a brief discussion about this restriction and he was not in favor of allowing pseudorefs to be symbolic refs [1]. Are there any practical implications of the changes in this patch for users running commands like "git log FETCH_HEAD" (I can't think of any off the top of my head but it would be good to have some reassurance on that point in the commit message) Best Wishes Phillip [1] https://lore.kernel.org/git/xmqq34u2q3zs.fsf@gitster.g/ > +<<def_special_ref>,Special refs>> are not pseudorefs. > ++ > +Due to historic reasons, Git has several irregular pseudo refs that do not > +follow above rules. The following list of irregular pseudo refs is exhaustive > +and shall not be extended in the future: > + > + - "`AUTO_MERGE`" > + > + - "`BISECT_EXPECTED_REV`" > + > + - "`NOTES_MERGE_PARTIAL`" > + > + - "`NOTES_MERGE_REF`" > + > + - "`MERGE_AUTOSTASH`" > > [[def_pull]]pull:: > Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and > diff --git a/refs.c b/refs.c > index c64f66bff9..567c6fc6ff 100644 > --- a/refs.c > +++ b/refs.c > @@ -905,6 +905,8 @@ int is_pseudoref(struct ref_store *refs, const char *refname) > > if (!is_pseudoref_syntax(refname)) > return 0; > + if (is_special_ref(refname)) > + return 0; > > if (ends_with(refname, "_HEAD")) { > refs_resolve_ref_unsafe(refs, refname, > diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh > index 948f1bb5f4..8c92fbde79 100755 > --- a/t/t6302-for-each-ref-filter.sh > +++ b/t/t6302-for-each-ref-filter.sh > @@ -52,6 +52,23 @@ test_expect_success '--include-root-refs pattern prints pseudorefs' ' > test_cmp expect actual > ' > > +test_expect_success '--include-root-refs pattern does not print special refs' ' > + test_when_finished "rm -rf repo" && > + git init repo && > + ( > + cd repo && > + test_commit initial && > + git rev-parse HEAD >.git/MERGE_HEAD && > + git for-each-ref --format="%(refname)" --include-root-refs >actual && > + cat >expect <<-EOF && > + HEAD > + $(git symbolic-ref HEAD) > + refs/tags/initial > + EOF > + test_cmp expect actual > + ) > +' > + > test_expect_success '--include-root-refs with other patterns' ' > cat >expect <<-\EOF && > HEAD
Patrick Steinhardt <ps@pks.im> writes: > +Due to historic reasons, Git has several irregular pseudo refs that do not > +follow above rules. The following list of irregular pseudo refs is exhaustive > +and shall not be extended in the future: I like this part of the patch the most ;-).
On 24/04/29 03:41PM, Patrick Steinhardt wrote: > diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt > index d71b199955..4275918fa0 100644 > --- a/Documentation/glossary-content.txt > +++ b/Documentation/glossary-content.txt > @@ -497,20 +497,28 @@ exclude;; > unusual refs. > > [[def_pseudoref]]pseudoref:: > - Pseudorefs are a class of files under `$GIT_DIR` which behave > - like refs for the purposes of rev-parse, but which are treated > - specially by git. Pseudorefs both have names that are all-caps, > - and always start with a line consisting of a > - <<def_SHA1,SHA-1>> followed by whitespace. So, HEAD is not a > - pseudoref, because it is sometimes a symbolic ref. They might We remove the example here about HEAD not being a pseudoref. This example seems helpful to indicate that a pseudoref cannot be a symbolic ref. Is this no longer the case and the change intended? > - optionally contain some additional data. `MERGE_HEAD` and > - `CHERRY_PICK_HEAD` are examples. Unlike > - <<def_per_worktree_ref,per-worktree refs>>, these files cannot > - be symbolic refs, and never have reflogs. They also cannot be > - updated through the normal ref update machinery. Instead, > - they are updated by directly writing to the files. However, > - they can be read as if they were refs, so `git rev-parse > - MERGE_HEAD` will work. > + Pseudorefs are references that live in the root of the reference > + hierarchy, outside of the usual "refs/" hierarchy. Pseudorefs have an > + all-uppercase name and must end with a "_HEAD" suffix, for example > + "`BISECT_HEAD`". Other than that, pseudorefs behave the exact same as > + any other reference and can be both read and written via regular Git > + tooling. Pseudorefs behaving the same and using the same tooling seems to contridict the previous documentation. I assume the previous information was out-of-date, but it might be nice to explain this in the message. > ++ > +<<def_special_ref>,Special refs>> are not pseudorefs. > ++ > +Due to historic reasons, Git has several irregular pseudo refs that do not > +follow above rules. The following list of irregular pseudo refs is exhaustive We seem to be inconsistent between using "pseudoref" and "pseudo ref". Not sure it we want to be consistent here. -Justin > +and shall not be extended in the future: > + > + - "`AUTO_MERGE`" > + > + - "`BISECT_EXPECTED_REV`" > + > + - "`NOTES_MERGE_PARTIAL`" > + > + - "`NOTES_MERGE_REF`" > + > + - "`MERGE_AUTOSTASH`" > > [[def_pull]]pull:: > Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and > diff --git a/refs.c b/refs.c > index c64f66bff9..567c6fc6ff 100644 > --- a/refs.c > +++ b/refs.c > @@ -905,6 +905,8 @@ int is_pseudoref(struct ref_store *refs, const char *refname) > > if (!is_pseudoref_syntax(refname)) > return 0; > + if (is_special_ref(refname)) > + return 0; > > if (ends_with(refname, "_HEAD")) { > refs_resolve_ref_unsafe(refs, refname, > diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh > index 948f1bb5f4..8c92fbde79 100755 > --- a/t/t6302-for-each-ref-filter.sh > +++ b/t/t6302-for-each-ref-filter.sh > @@ -52,6 +52,23 @@ test_expect_success '--include-root-refs pattern prints pseudorefs' ' > test_cmp expect actual > ' > > +test_expect_success '--include-root-refs pattern does not print special refs' ' > + test_when_finished "rm -rf repo" && > + git init repo && > + ( > + cd repo && > + test_commit initial && > + git rev-parse HEAD >.git/MERGE_HEAD && > + git for-each-ref --format="%(refname)" --include-root-refs >actual && > + cat >expect <<-EOF && > + HEAD > + $(git symbolic-ref HEAD) > + refs/tags/initial > + EOF > + test_cmp expect actual > + ) > +' > + > test_expect_success '--include-root-refs with other patterns' ' > cat >expect <<-\EOF && > HEAD > -- > 2.45.0-rc1 >
On Mon, Apr 29, 2024 at 05:52:41PM -0500, Justin Tobler wrote: > On 24/04/29 03:41PM, Patrick Steinhardt wrote: > > diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt > > index d71b199955..4275918fa0 100644 > > --- a/Documentation/glossary-content.txt > > +++ b/Documentation/glossary-content.txt > > @@ -497,20 +497,28 @@ exclude;; > > unusual refs. > > > > [[def_pseudoref]]pseudoref:: > > - Pseudorefs are a class of files under `$GIT_DIR` which behave > > - like refs for the purposes of rev-parse, but which are treated > > - specially by git. Pseudorefs both have names that are all-caps, > > - and always start with a line consisting of a > > - <<def_SHA1,SHA-1>> followed by whitespace. So, HEAD is not a > > - pseudoref, because it is sometimes a symbolic ref. They might > > We remove the example here about HEAD not being a pseudoref. This > example seems helpful to indicate that a pseudoref cannot be a symbolic > ref. Is this no longer the case and the change intended? I just don't see why we would want to have this restriction. Honestly, the more I think about this whole topic the more I want to go into the direction I've hinted at in the cover letter: drop "special refs" and define pseudo refs as either FETCH_HEAD or MERGE_HEAD. Everything else is just a normal ref, even though some of those may live in the root directory if they conform to a set of strict rules: - All upppercase characters plus underscores. - Must end with "_HEAD", except a list of known irregular root refs. I feel like the world would be better like this. > > - optionally contain some additional data. `MERGE_HEAD` and > > - `CHERRY_PICK_HEAD` are examples. Unlike > > - <<def_per_worktree_ref,per-worktree refs>>, these files cannot > > - be symbolic refs, and never have reflogs. They also cannot be > > - updated through the normal ref update machinery. Instead, > > - they are updated by directly writing to the files. However, > > - they can be read as if they were refs, so `git rev-parse > > - MERGE_HEAD` will work. > > + Pseudorefs are references that live in the root of the reference > > + hierarchy, outside of the usual "refs/" hierarchy. Pseudorefs have an > > + all-uppercase name and must end with a "_HEAD" suffix, for example > > + "`BISECT_HEAD`". Other than that, pseudorefs behave the exact same as > > + any other reference and can be both read and written via regular Git > > + tooling. > > Pseudorefs behaving the same and using the same tooling seems to > contridict the previous documentation. I assume the previous information > was out-of-date, but it might be nice to explain this in the message. Yes, and I actually want to change this. We never enforced restrictions for pseudorefs anyway, they can be symrefs just fine. And neither would I see any reason why that should be the case in the first place. > > ++ > > +<<def_special_ref>,Special refs>> are not pseudorefs. > > ++ > > +Due to historic reasons, Git has several irregular pseudo refs that do not > > +follow above rules. The following list of irregular pseudo refs is exhaustive > > We seem to be inconsistent between using "pseudoref" and "pseudo ref". > Not sure it we want to be consistent here. Makes sense. Patrick > -Justin > > > +and shall not be extended in the future: > > + > > + - "`AUTO_MERGE`" > > + > > + - "`BISECT_EXPECTED_REV`" > > + > > + - "`NOTES_MERGE_PARTIAL`" > > + > > + - "`NOTES_MERGE_REF`" > > + > > + - "`MERGE_AUTOSTASH`" > > > > [[def_pull]]pull:: > > Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and > > diff --git a/refs.c b/refs.c > > index c64f66bff9..567c6fc6ff 100644 > > --- a/refs.c > > +++ b/refs.c > > @@ -905,6 +905,8 @@ int is_pseudoref(struct ref_store *refs, const char *refname) > > > > if (!is_pseudoref_syntax(refname)) > > return 0; > > + if (is_special_ref(refname)) > > + return 0; > > > > if (ends_with(refname, "_HEAD")) { > > refs_resolve_ref_unsafe(refs, refname, > > diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh > > index 948f1bb5f4..8c92fbde79 100755 > > --- a/t/t6302-for-each-ref-filter.sh > > +++ b/t/t6302-for-each-ref-filter.sh > > @@ -52,6 +52,23 @@ test_expect_success '--include-root-refs pattern prints pseudorefs' ' > > test_cmp expect actual > > ' > > > > +test_expect_success '--include-root-refs pattern does not print special refs' ' > > + test_when_finished "rm -rf repo" && > > + git init repo && > > + ( > > + cd repo && > > + test_commit initial && > > + git rev-parse HEAD >.git/MERGE_HEAD && > > + git for-each-ref --format="%(refname)" --include-root-refs >actual && > > + cat >expect <<-EOF && > > + HEAD > > + $(git symbolic-ref HEAD) > > + refs/tags/initial > > + EOF > > + test_cmp expect actual > > + ) > > +' > > + > > test_expect_success '--include-root-refs with other patterns' ' > > cat >expect <<-\EOF && > > HEAD > > -- > > 2.45.0-rc1 > > > >
On Mon, Apr 29, 2024 at 04:12:37PM +0100, Phillip Wood wrote: > Hi Patrick > > On 29/04/2024 14:41, Patrick Steinhardt wrote: > > diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt > > index d71b199955..4275918fa0 100644 > > --- a/Documentation/glossary-content.txt > > +++ b/Documentation/glossary-content.txt > > @@ -497,20 +497,28 @@ exclude;; > > unusual refs. > > [[def_pseudoref]]pseudoref:: > > - Pseudorefs are a class of files under `$GIT_DIR` which behave > > - like refs for the purposes of rev-parse, but which are treated > > - specially by git. Pseudorefs both have names that are all-caps, > > - and always start with a line consisting of a > > - <<def_SHA1,SHA-1>> followed by whitespace. So, HEAD is not a > > - pseudoref, because it is sometimes a symbolic ref. They might > > - optionally contain some additional data. `MERGE_HEAD` and > > - `CHERRY_PICK_HEAD` are examples. Unlike > > - <<def_per_worktree_ref,per-worktree refs>>, these files cannot > > - be symbolic refs, and never have reflogs. They also cannot be > > - updated through the normal ref update machinery. Instead, > > - they are updated by directly writing to the files. However, > > - they can be read as if they were refs, so `git rev-parse > > - MERGE_HEAD` will work. > > + Pseudorefs are references that live in the root of the reference > > + hierarchy, outside of the usual "refs/" hierarchy. Pseudorefs have an > > + all-uppercase name and must end with a "_HEAD" suffix, for example > > + "`BISECT_HEAD`". Other than that, pseudorefs behave the exact same as > > + any other reference and can be both read and written via regular Git > > + tooling. > > This changes the definition to allow pseudorefs to by symbolic refs. When > is_pseudoref() was introduced Junio and I had a brief discussion about this > restriction and he was not in favor of allowing pseudorefs to be symbolic > refs [1]. So the reason why pseudorefs exist is that some refs behave like a ref sometimes, but not always. And in my book that really only applies to MERGE_HEAD and FETCH_HEAD, because those contain additional metadata that makes them not-a-ref. And for those I very much see that they should not ever be a symref. But everyhing else living in the root of the ref hierarchy is not special in any way, at least not in my opinion. We have never enforced that those cannot be symrefs, and it makes our terminology needlessly confusing. I think I'm going to reroll this patch series and go down the nuclear path that I've hinted at in the cover letter: - Pseudo refs can only be either FETCH_HEAD or MERGE_HEAD. - Refs starting with "refs/" are just plain normal refs. - Refs living in the root of the ref hierarchy need to conform to a set of strict rules, as Peff is starting to enforce in a separate patch series. These are just normal refs, as well, even though we may call them "root ref" in our tooling as they live in the root of the ref hierarchy. I just don't think that the current state makes sense to anybody. It's majorly confusing -- I've spent the last 8 months working in our refs code almost exclusively and still forget what's what. How are our users expected to understand this? > Are there any practical implications of the changes in this patch for users > running commands like "git log FETCH_HEAD" (I can't think of any off the top > of my head but it would be good to have some reassurance on that point in > the commit message) Not really, no. We have never been doing a good job at enforcing the difference between pseudo refs or normal refs anyway. Pseudo refs can be symrefs just fine, and our tooling won't complain. The only exception where I want us to become stricter is in how we enforce the syntax rules for root refs (which is handled by Peff in a separate patch series), and that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs. They should still resolve when you ask git-rev-parse(1), but when you iterate through refs they should not be surfaced as they _aren't_ refs. Patrick
Hi Patrick On 30/04/2024 08:30, Patrick Steinhardt wrote: > On Mon, Apr 29, 2024 at 04:12:37PM +0100, Phillip Wood wrote: > >> This changes the definition to allow pseudorefs to by symbolic refs. When >> is_pseudoref() was introduced Junio and I had a brief discussion about this >> restriction and he was not in favor of allowing pseudorefs to be symbolic >> refs [1]. > > So the reason why pseudorefs exist is that some refs behave like a ref > sometimes, but not always. And in my book that really only applies to > MERGE_HEAD and FETCH_HEAD, because those contain additional metadata > that makes them not-a-ref. And for those I very much see that they > should not ever be a symref. > > But everyhing else living in the root of the ref hierarchy is not > special in any way, at least not in my opinion. We have never enforced > that those cannot be symrefs, and it makes our terminology needlessly > confusing. I agree HEAD not being a pseudoref and having special refs as well as pseudorefs refs is confusing. I do have some sympathy for the argument that pseudorefs should not be symbolic refs though as AUTO_MERGE, CHERRY_PICK_HEAD, ORIG_HEAD etc. are all pointers to a commit and it would be a bug for them to be a symbolic ref. It is unfortunate that in the move away from assessing those refs as files we lost the check that they are not symbolic refs. > I think I'm going to reroll this patch series and go down the nuclear > path that I've hinted at in the cover letter: > > - Pseudo refs can only be either FETCH_HEAD or MERGE_HEAD. > > - Refs starting with "refs/" are just plain normal refs. > > - Refs living in the root of the ref hierarchy need to conform to a > set of strict rules, as Peff is starting to enforce in a separate > patch series. These are just normal refs, as well, even though we > may call them "root ref" in our tooling as they live in the root of > the ref hierarchy. That would certainly be simpler. > I just don't think that the current state makes sense to anybody. It's > majorly confusing -- I've spent the last 8 months working in our refs > code almost exclusively and still forget what's what. How are our users > expected to understand this? The current state is confusing but arguably there is a logic to the various distinctions - whether those distinctions are useful in practice is open to debate though. I wonder how much users really care about these distinctions and whether it affects their use of git. I was unaware of the distinction between HEAD and pseudorefs until I reviewed Karthik's for-each-ref series a couple of months ago and I don't think that lack of knowledge had caused me any trouble when using git. >> Are there any practical implications of the changes in this patch for users >> running commands like "git log FETCH_HEAD" (I can't think of any off the top >> of my head but it would be good to have some reassurance on that point in >> the commit message) > > Not really, no. We have never been doing a good job at enforcing the > difference between pseudo refs or normal refs anyway. Pseudo refs can be > symrefs just fine, and our tooling won't complain. The only exception > where I want us to become stricter is in how we enforce the syntax rules > for root refs (which is handled by Peff in a separate patch series), and > that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs. > They should still resolve when you ask git-rev-parse(1), but when you > iterate through refs they should not be surfaced as they _aren't_ refs. That's good Thanks Phillip
On Tue, Apr 30, 2024 at 09:30:05AM +0200, Patrick Steinhardt wrote: > So the reason why pseudorefs exist is that some refs behave like a ref > sometimes, but not always. And in my book that really only applies to > MERGE_HEAD and FETCH_HEAD, because those contain additional metadata > that makes them not-a-ref. And for those I very much see that they > should not ever be a symref. > > But everyhing else living in the root of the ref hierarchy is not > special in any way, at least not in my opinion. We have never enforced > that those cannot be symrefs, and it makes our terminology needlessly > confusing. > > I think I'm going to reroll this patch series and go down the nuclear > path that I've hinted at in the cover letter: > > - Pseudo refs can only be either FETCH_HEAD or MERGE_HEAD. > > - Refs starting with "refs/" are just plain normal refs. > > - Refs living in the root of the ref hierarchy need to conform to a > set of strict rules, as Peff is starting to enforce in a separate > patch series. These are just normal refs, as well, even though we > may call them "root ref" in our tooling as they live in the root of > the ref hierarchy. > > I just don't think that the current state makes sense to anybody. It's > majorly confusing -- I've spent the last 8 months working in our refs > code almost exclusively and still forget what's what. How are our users > expected to understand this? Yes, I very much agree with your final paragraph. I have been working on Git for 18 years, and am learning new things about pseudo and special refs in this thread. ;) (Admittedly, I think that distinction is new in the past few months). I think the "everything is a ref, even at the root" is the simplest thing for users. And the only rules they need to know are the syntactic ones: names start with "refs/" or are all-caps and underscore. But I do not see the value in them caring that HEAD can be a symref or that MERGE_HEAD cannot (nor the value in the code making such a distinction). My series does not enforce the "_HEAD" suffix (plus special cases) as a syntactic rule, but we could do that easily on top. That would help protect case-insensitive filesystems from the same shenanigans that my series aims for (e.g., "CONFIG" on such a system will still look at the "config" file). It is unfortunate to me that we even need to call out FETCH_HEAD and MERGE_HEAD. I know they are special within Git, and probably ref backends need to be aware (because they have to be able to carry extra data). But from a user's perspective they resolve in the normal way (unless you are trying to look at them in their special non-ref way). I guess the user must care that they will always be in the filesystem in order to access them in that special way, though. > > Are there any practical implications of the changes in this patch for users > > running commands like "git log FETCH_HEAD" (I can't think of any off the top > > of my head but it would be good to have some reassurance on that point in > > the commit message) > > Not really, no. We have never been doing a good job at enforcing the > difference between pseudo refs or normal refs anyway. Pseudo refs can be > symrefs just fine, and our tooling won't complain. The only exception > where I want us to become stricter is in how we enforce the syntax rules > for root refs (which is handled by Peff in a separate patch series), and > that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs. > They should still resolve when you ask git-rev-parse(1), but when you > iterate through refs they should not be surfaced as they _aren't_ refs. I actually would not even mind if they are surfaced when iterating with --include-root-refs. But then I am a little skeptical of the purpose of that feature in the first place. After all, the reason code shoves stuff into .git/FOO_HEAD is precisely because we don't want other stuff iterating over them, using them for reachability, and so on. That is why "--all" does not include them, for example. But I did not follow the development of the feature, so maybe I am missing some cool use case. -Peff
Jeff King <peff@peff.net> writes: > On Tue, Apr 30, 2024 at 09:30:05AM +0200, Patrick Steinhardt wrote: > >> So the reason why pseudorefs exist is that some refs behave like a ref >> sometimes, but not always. And in my book that really only applies to >> MERGE_HEAD and FETCH_HEAD, because those contain additional metadata >> that makes them not-a-ref. And for those I very much see that they >> should not ever be a symref. >> >> But everyhing else living in the root of the ref hierarchy is not >> special in any way, at least not in my opinion. We have never enforced >> that those cannot be symrefs, and it makes our terminology needlessly >> confusing. >> >> I think I'm going to reroll this patch series and go down the nuclear >> path that I've hinted at in the cover letter: >> >> - Pseudo refs can only be either FETCH_HEAD or MERGE_HEAD. >> >> - Refs starting with "refs/" are just plain normal refs. >> >> - Refs living in the root of the ref hierarchy need to conform to a >> set of strict rules, as Peff is starting to enforce in a separate >> patch series. These are just normal refs, as well, even though we >> may call them "root ref" in our tooling as they live in the root of >> the ref hierarchy. >> >> I just don't think that the current state makes sense to anybody. It's >> majorly confusing -- I've spent the last 8 months working in our refs >> code almost exclusively and still forget what's what. How are our users >> expected to understand this? > > Yes, I very much agree with your final paragraph. I have been working on > Git for 18 years, and am learning new things about pseudo and special > refs in this thread. ;) (Admittedly, I think that distinction is new in > the past few months). > > I think the "everything is a ref, even at the root" is the simplest > thing for users. And the only rules they need to know are the syntactic > ones: names start with "refs/" or are all-caps and underscore. But I do > not see the value in them caring that HEAD can be a symref or that > MERGE_HEAD cannot (nor the value in the code making such a distinction). > > My series does not enforce the "_HEAD" suffix (plus special cases) as a > syntactic rule, but we could do that easily on top. That would help > protect case-insensitive filesystems from the same shenanigans that my > series aims for (e.g., "CONFIG" on such a system will still look at the > "config" file). > > It is unfortunate to me that we even need to call out FETCH_HEAD and > MERGE_HEAD. I know they are special within Git, and probably ref > backends need to be aware (because they have to be able to carry extra > data). But from a user's perspective they resolve in the normal way > (unless you are trying to look at them in their special non-ref way). > I guess the user must care that they will always be in the filesystem in > order to access them in that special way, though. > >> > Are there any practical implications of the changes in this patch for users >> > running commands like "git log FETCH_HEAD" (I can't think of any off the top >> > of my head but it would be good to have some reassurance on that point in >> > the commit message) >> >> Not really, no. We have never been doing a good job at enforcing the >> difference between pseudo refs or normal refs anyway. Pseudo refs can be >> symrefs just fine, and our tooling won't complain. The only exception >> where I want us to become stricter is in how we enforce the syntax rules >> for root refs (which is handled by Peff in a separate patch series), and >> that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs. >> They should still resolve when you ask git-rev-parse(1), but when you >> iterate through refs they should not be surfaced as they _aren't_ refs. > > I actually would not even mind if they are surfaced when iterating with > --include-root-refs. But then I am a little skeptical of the purpose of > that feature in the first place. After all, the reason code shoves stuff > into .git/FOO_HEAD is precisely because we don't want other stuff > iterating over them, using them for reachability, and so on. That is why > "--all" does not include them, for example. > > But I did not follow the development of the feature, so maybe I am > missing some cool use case. > The use case was to allow us to look at these refs when working with the reftable backend. Currently there is no way to do that, with the files backend, well you could just read the files. So mostly a debugging usecase.
On Tue, Apr 30, 2024 at 10:59:36AM +0100, Phillip Wood wrote: > Hi Patrick > > On 30/04/2024 08:30, Patrick Steinhardt wrote: > > On Mon, Apr 29, 2024 at 04:12:37PM +0100, Phillip Wood wrote: > > > > > This changes the definition to allow pseudorefs to by symbolic refs. When > > > is_pseudoref() was introduced Junio and I had a brief discussion about this > > > restriction and he was not in favor of allowing pseudorefs to be symbolic > > > refs [1]. > > > > So the reason why pseudorefs exist is that some refs behave like a ref > > sometimes, but not always. And in my book that really only applies to > > MERGE_HEAD and FETCH_HEAD, because those contain additional metadata > > that makes them not-a-ref. And for those I very much see that they > > should not ever be a symref. > > > > But everyhing else living in the root of the ref hierarchy is not > > special in any way, at least not in my opinion. We have never enforced > > that those cannot be symrefs, and it makes our terminology needlessly > > confusing. > > I agree HEAD not being a pseudoref and having special refs as well as > pseudorefs refs is confusing. I do have some sympathy for the argument that > pseudorefs should not be symbolic refs though as AUTO_MERGE, > CHERRY_PICK_HEAD, ORIG_HEAD etc. are all pointers to a commit and it would > be a bug for them to be a symbolic ref. It is unfortunate that in the move > away from assessing those refs as files we lost the check that they are not > symbolic refs. While I agree that conceptually these should always be "regular" refs, I feel like that is higher-level logic that belongs into the respective subsystems that write those. I just don't see why the ref backend should care about the particular usecases that those higher-level subsystems have, and I can very much see that there might eventually be another subsystem that actually wants a specific ref to be a symref. No we could of course start to hard code all kinds of refs into the ref layer. But I think that this is the wrong way to go, and treating the ref store as just that, a generic store where you can store refs, without attaching specific meaning to any of the refs, is the proper way to go. > > I think I'm going to reroll this patch series and go down the nuclear > > path that I've hinted at in the cover letter: > > > > - Pseudo refs can only be either FETCH_HEAD or MERGE_HEAD. > > > > - Refs starting with "refs/" are just plain normal refs. > > > > - Refs living in the root of the ref hierarchy need to conform to a > > set of strict rules, as Peff is starting to enforce in a separate > > patch series. These are just normal refs, as well, even though we > > may call them "root ref" in our tooling as they live in the root of > > the ref hierarchy. > > That would certainly be simpler. > > > I just don't think that the current state makes sense to anybody. It's > > majorly confusing -- I've spent the last 8 months working in our refs > > code almost exclusively and still forget what's what. How are our users > > expected to understand this? > > The current state is confusing but arguably there is a logic to the various > distinctions - whether those distinctions are useful in practice is open to > debate though. I wonder how much users really care about these distinctions > and whether it affects their use of git. I was unaware of the distinction > between HEAD and pseudorefs until I reviewed Karthik's for-each-ref series a > couple of months ago and I don't think that lack of knowledge had caused me > any trouble when using git. There is some logic, that's true enough. I just don't think that anybody understands the logic. Patrick > > > Are there any practical implications of the changes in this patch for users > > > running commands like "git log FETCH_HEAD" (I can't think of any off the top > > > of my head but it would be good to have some reassurance on that point in > > > the commit message) > > > > Not really, no. We have never been doing a good job at enforcing the > > difference between pseudo refs or normal refs anyway. Pseudo refs can be > > symrefs just fine, and our tooling won't complain. The only exception > > where I want us to become stricter is in how we enforce the syntax rules > > for root refs (which is handled by Peff in a separate patch series), and > > that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs. > > They should still resolve when you ask git-rev-parse(1), but when you > > iterate through refs they should not be surfaced as they _aren't_ refs. > > That's good > > Thanks > > Phillip >
On Tue, Apr 30, 2024 at 06:23:10AM -0400, Jeff King wrote: > On Tue, Apr 30, 2024 at 09:30:05AM +0200, Patrick Steinhardt wrote: [snip] > > > Are there any practical implications of the changes in this patch for users > > > running commands like "git log FETCH_HEAD" (I can't think of any off the top > > > of my head but it would be good to have some reassurance on that point in > > > the commit message) > > > > Not really, no. We have never been doing a good job at enforcing the > > difference between pseudo refs or normal refs anyway. Pseudo refs can be > > symrefs just fine, and our tooling won't complain. The only exception > > where I want us to become stricter is in how we enforce the syntax rules > > for root refs (which is handled by Peff in a separate patch series), and > > that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs. > > They should still resolve when you ask git-rev-parse(1), but when you > > iterate through refs they should not be surfaced as they _aren't_ refs. > > I actually would not even mind if they are surfaced when iterating with > --include-root-refs. But then I am a little skeptical of the purpose of > that feature in the first place. After all, the reason code shoves stuff > into .git/FOO_HEAD is precisely because we don't want other stuff > iterating over them, using them for reachability, and so on. That is why > "--all" does not include them, for example. > > But I did not follow the development of the feature, so maybe I am > missing some cool use case. The thing is that once we start to surface pseudorefs (in the sense of these _really_ aren't refs) in ref-related tooling, users will want to treat them as a ref, as well. And that's just bound to happen with plumbing like `git for-each-ref`, where a user may rightfully expect that all output here can be treated like a normal ref. In fact though, I want to double down on restrictions regarding the pseudorefs FETCH_HEAD and MERGE_HEAD. While it's fair enough that those can be read like a ref, writing to them is a totally different thing. It does not make any sense to try and write such refs, and our abstractions aren't even prepared to write them correctly. They go through the ref backend, and thus the "reftable" backend would write them into the reftable stack instead of into the filesystem. Now you could argue that this should be fixed, but I don't think it is reasonable to expect the reftable backend to start writing loose refs for those pseudorefs. So I'd really like to stick with the current explanation that we have in the "special ref" glossary: pseudorefs must be written via the filesystem and can't ever go through the ref backends. Patrick
On Tue, Apr 30, 2024 at 05:07:19AM -0700, Karthik Nayak wrote: > Jeff King <peff@peff.net> writes: > > > On Tue, Apr 30, 2024 at 09:30:05AM +0200, Patrick Steinhardt wrote: [snip] > >> > Are there any practical implications of the changes in this patch for users > >> > running commands like "git log FETCH_HEAD" (I can't think of any off the top > >> > of my head but it would be good to have some reassurance on that point in > >> > the commit message) > >> > >> Not really, no. We have never been doing a good job at enforcing the > >> difference between pseudo refs or normal refs anyway. Pseudo refs can be > >> symrefs just fine, and our tooling won't complain. The only exception > >> where I want us to become stricter is in how we enforce the syntax rules > >> for root refs (which is handled by Peff in a separate patch series), and > >> that we start to not treat FETCH_HEAD and MERGE_HEAD as proper refs. > >> They should still resolve when you ask git-rev-parse(1), but when you > >> iterate through refs they should not be surfaced as they _aren't_ refs. > > > > I actually would not even mind if they are surfaced when iterating with > > --include-root-refs. But then I am a little skeptical of the purpose of > > that feature in the first place. After all, the reason code shoves stuff > > into .git/FOO_HEAD is precisely because we don't want other stuff > > iterating over them, using them for reachability, and so on. That is why > > "--all" does not include them, for example. > > > > But I did not follow the development of the feature, so maybe I am > > missing some cool use case. > > > > The use case was to allow us to look at these refs when working with > the reftable backend. Currently there is no way to do that, with the > files backend, well you could just read the files. So mostly a debugging > usecase. That's true for normal root refs, only, though. The pseudorefs (current special refs) can still be surfaced even if the for-each-ref machinery doesn't surface them because by definition, they always live in the filesystem. Patrick
Hello, On Monday, 29 April 2024 15:41:28 CEST Patrick Steinhardt wrote: > --- > Documentation/glossary-content.txt | 36 ++++++++++++++++++------------ > refs.c | 2 ++ > t/t6302-for-each-ref-filter.sh | 17 ++++++++++++++ > 3 files changed, 41 insertions(+), 14 deletions(-) > > diff --git a/Documentation/glossary-content.txt b/Documentation/glossary- content.txt > index d71b199955..4275918fa0 100644 > --- a/Documentation/glossary-content.txt > +++ b/Documentation/glossary-content.txt > @@ -497,20 +497,28 @@ exclude;; ... > + > + - "`AUTO_MERGE`" > + > + - "`BISECT_EXPECTED_REV`" > + > + - "`NOTES_MERGE_PARTIAL`" > + > + - "`NOTES_MERGE_REF`" > + > + - "`MERGE_AUTOSTASH`" Quoting the names seems overkill here, as they are already formatted with the back-quotes. The rendering as bold or monospace is enough. Regards, Jean-Noël
On Thu, May 09, 2024 at 07:29:18PM +0200, Jean-Noël AVILA wrote: > Hello, > > On Monday, 29 April 2024 15:41:28 CEST Patrick Steinhardt wrote: > > --- > > Documentation/glossary-content.txt | 36 ++++++++++++++++++------------ > > refs.c | 2 ++ > > t/t6302-for-each-ref-filter.sh | 17 ++++++++++++++ > > 3 files changed, 41 insertions(+), 14 deletions(-) > > > > diff --git a/Documentation/glossary-content.txt b/Documentation/glossary- > content.txt > > index d71b199955..4275918fa0 100644 > > --- a/Documentation/glossary-content.txt > > +++ b/Documentation/glossary-content.txt > > @@ -497,20 +497,28 @@ exclude;; > ... > > + > > + - "`AUTO_MERGE`" > > + > > + - "`BISECT_EXPECTED_REV`" > > + > > + - "`NOTES_MERGE_PARTIAL`" > > + > > + - "`NOTES_MERGE_REF`" > > + > > + - "`MERGE_AUTOSTASH`" > > Quoting the names seems overkill here, as they are already formatted with the > back-quotes. The rendering as bold or monospace is enough. > > Regards, > > Jean-Noël These don't exist in later versions of this patch series anymore. But there are other cases where I did the same in the current version, so let me drop that. Thanks! Patrick
diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt index d71b199955..4275918fa0 100644 --- a/Documentation/glossary-content.txt +++ b/Documentation/glossary-content.txt @@ -497,20 +497,28 @@ exclude;; unusual refs. [[def_pseudoref]]pseudoref:: - Pseudorefs are a class of files under `$GIT_DIR` which behave - like refs for the purposes of rev-parse, but which are treated - specially by git. Pseudorefs both have names that are all-caps, - and always start with a line consisting of a - <<def_SHA1,SHA-1>> followed by whitespace. So, HEAD is not a - pseudoref, because it is sometimes a symbolic ref. They might - optionally contain some additional data. `MERGE_HEAD` and - `CHERRY_PICK_HEAD` are examples. Unlike - <<def_per_worktree_ref,per-worktree refs>>, these files cannot - be symbolic refs, and never have reflogs. They also cannot be - updated through the normal ref update machinery. Instead, - they are updated by directly writing to the files. However, - they can be read as if they were refs, so `git rev-parse - MERGE_HEAD` will work. + Pseudorefs are references that live in the root of the reference + hierarchy, outside of the usual "refs/" hierarchy. Pseudorefs have an + all-uppercase name and must end with a "_HEAD" suffix, for example + "`BISECT_HEAD`". Other than that, pseudorefs behave the exact same as + any other reference and can be both read and written via regular Git + tooling. ++ +<<def_special_ref>,Special refs>> are not pseudorefs. ++ +Due to historic reasons, Git has several irregular pseudo refs that do not +follow above rules. The following list of irregular pseudo refs is exhaustive +and shall not be extended in the future: + + - "`AUTO_MERGE`" + + - "`BISECT_EXPECTED_REV`" + + - "`NOTES_MERGE_PARTIAL`" + + - "`NOTES_MERGE_REF`" + + - "`MERGE_AUTOSTASH`" [[def_pull]]pull:: Pulling a <<def_branch,branch>> means to <<def_fetch,fetch>> it and diff --git a/refs.c b/refs.c index c64f66bff9..567c6fc6ff 100644 --- a/refs.c +++ b/refs.c @@ -905,6 +905,8 @@ int is_pseudoref(struct ref_store *refs, const char *refname) if (!is_pseudoref_syntax(refname)) return 0; + if (is_special_ref(refname)) + return 0; if (ends_with(refname, "_HEAD")) { refs_resolve_ref_unsafe(refs, refname, diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh index 948f1bb5f4..8c92fbde79 100755 --- a/t/t6302-for-each-ref-filter.sh +++ b/t/t6302-for-each-ref-filter.sh @@ -52,6 +52,23 @@ test_expect_success '--include-root-refs pattern prints pseudorefs' ' test_cmp expect actual ' +test_expect_success '--include-root-refs pattern does not print special refs' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + git rev-parse HEAD >.git/MERGE_HEAD && + git for-each-ref --format="%(refname)" --include-root-refs >actual && + cat >expect <<-EOF && + HEAD + $(git symbolic-ref HEAD) + refs/tags/initial + EOF + test_cmp expect actual + ) +' + test_expect_success '--include-root-refs with other patterns' ' cat >expect <<-\EOF && HEAD
We have two refs which almost behave like a ref in many contexts, but aren't really: - MERGE_HEAD contains the list of parents during a merge. - FETCH_HEAD contains the list of fetched references after git-fetch(1) with some annotations. These references have been declared "special refs" in 8df4c5d205 (Documentation: add "special refs" to the glossary, 2024-01-19). Due to their "_HEAD" suffix, those special refs also almost look like a pseudo ref, even though they aren't. But because `is_pseudoref()` labels anything as a pseudo ref that ends with the `_HEAD` suffix, it will also happily label both of the above special refs as pseudo refs. This mis-labeling creates some weirdness and inconsistent behaviour across ref backends. As special refs are never stored via a ref backend, they theoretically speaking cannot know about special refs. But with the recent introduction of the `--include-root-refs` flag this isn't quite true anymore: the "files" backend will yield all refs that look like a pseudo ref or "HEAD" stored in the root directory. And given that both of the above look like pseudo refs, the "files" backend will list those, too. The "reftable" backend naturally cannot know about those, and teaching it to parse and yield these special refs very much feels like the wrong way to go. So, arguably, the better direction to go is to mark the "files" behaviour as a bug and stop yielding special refs there. Conceptually, this feels like the right thing to do, too. Special refs really aren't refs, they are a different file format that for some part may behave like a ref. If we were designing these special refs from scratch, we would have likely never named it anything like a "ref" at all. So let's double down on the path that the mentioned commit has started, which is to cleanly distinguish special refs and pseudo refs. Ideally, the proper way would be to return to the original meaning that pseudo refs really had: a ref that behaves like a ref for most of the part, but isn't really a ref. We would essentially replace the current "pseudoref" term with the "special ref" term. The consequence is that all refs except for FETCH_HEAD and MERGE_HEAD would be normal refs, regardless of whether they live in the root hierarchy or not. The way that pseudorefs are enforced now would then change to be a naming policy for refs, only. It's unclear though how sensible it would be to do such a large change to terminology now, which is why this commit does the next best thing. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- Documentation/glossary-content.txt | 36 ++++++++++++++++++------------ refs.c | 2 ++ t/t6302-for-each-ref-filter.sh | 17 ++++++++++++++ 3 files changed, 41 insertions(+), 14 deletions(-)