diff mbox series

[RFC] builtin/shortlog: explicitly set hash algo when there is no repo

Message ID 20241011183445.229228-1-wolf@oriole.systems (mailing list archive)
State Superseded
Headers show
Series [RFC] builtin/shortlog: explicitly set hash algo when there is no repo | expand

Commit Message

Wolfgang Müller Oct. 11, 2024, 6:34 p.m. UTC
Whilst git-shortlog(1) does not explicitly need any repository
information when run without reference to one, it still parses some of
its arguments with parse_revision_opt() which assumes that the hash
algorithm is set. However, in c8aed5e8da (repository: stop setting SHA1
as the default object hash, 2024-05-07) we stopped setting up a default
hash algorithm and instead require commands to set it up explicitly.

This was done for most other commands like in ab274909d4 (builtin/diff:
explicitly set hash algo when there is no repo, 2024-05-07) but was
missed for builtin/shortlog, making git-shortlog(1) segfault outside of
a repository when given arguments like --author that trigger a call to
parse_revision_opt().

Fix this for now by explicitly setting the hash algorithm to SHA1.

Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
---
I think there is a much cleaner patch here if one would look at
disentangling the parsing machinery split between cmd_shortlog() and
parse_revision_opt() such that the latter is only called if there is an
actual repository, but I'm not at all familiar enough with the codebase
to do that. Like the commit says some other commands were fixed like
this as well, so I thought to send this patch your way.

 builtin/shortlog.c | 8 ++++++++
 1 file changed, 8 insertions(+)


base-commit: 777489f9e09c8d0dd6b12f9d90de6376330577a2

Comments

Wolfgang Müller Oct. 15, 2024, 9:33 a.m. UTC | #1
On 2024-10-11 20:34, Wolfgang Müller wrote:
> Whilst git-shortlog(1) does not explicitly need any repository
> information when run without reference to one, it still parses some of
> its arguments with parse_revision_opt() which assumes that the hash
> algorithm is set. However, in c8aed5e8da (repository: stop setting SHA1
> as the default object hash, 2024-05-07) we stopped setting up a default
> hash algorithm and instead require commands to set it up explicitly.
> 
> This was done for most other commands like in ab274909d4 (builtin/diff:
> explicitly set hash algo when there is no repo, 2024-05-07) but was
> missed for builtin/shortlog, making git-shortlog(1) segfault outside of
> a repository when given arguments like --author that trigger a call to
> parse_revision_opt().
> 
> Fix this for now by explicitly setting the hash algorithm to SHA1.
> 
> Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
> ---
> I think there is a much cleaner patch here if one would look at
> disentangling the parsing machinery split between cmd_shortlog() and
> parse_revision_opt() such that the latter is only called if there is an
> actual repository, but I'm not at all familiar enough with the codebase
> to do that. Like the commit says some other commands were fixed like
> this as well, so I thought to send this patch your way.

Was this topic maybe missed? It's not a critical fix but would solve a
segfault that is pretty easy for a user to trigger.

Thanks!
Kristoffer Haugsbakk Oct. 15, 2024, 9:47 a.m. UTC | #2
On Tue, Oct 15, 2024, at 11:33, Wolfgang Müller wrote:
> > […]
>
> Was this topic maybe missed? It's not a critical fix but would solve a
> segfault that is pretty easy for a user to trigger.

I can confirm that this gives me a segfault:

```
# Directory which is not tracked
cd ~
git shortlog --author=Kristoffer
```

Maybe a few things could make the change more ready to be included:

1. A regression test for that segfault
2. A quickfix (which you indicate below the three-dash/hyphens) can be
   documented in the comment above the code with a `NEEDSWORK`
Taylor Blau Oct. 15, 2024, 7:54 p.m. UTC | #3
On Tue, Oct 15, 2024 at 11:33:35AM +0200, Wolfgang Müller wrote:
> On 2024-10-11 20:34, Wolfgang Müller wrote:
> > Whilst git-shortlog(1) does not explicitly need any repository
> > information when run without reference to one, it still parses some of
> > its arguments with parse_revision_opt() which assumes that the hash
> > algorithm is set. However, in c8aed5e8da (repository: stop setting SHA1
> > as the default object hash, 2024-05-07) we stopped setting up a default
> > hash algorithm and instead require commands to set it up explicitly.
> >
> > This was done for most other commands like in ab274909d4 (builtin/diff:
> > explicitly set hash algo when there is no repo, 2024-05-07) but was
> > missed for builtin/shortlog, making git-shortlog(1) segfault outside of
> > a repository when given arguments like --author that trigger a call to
> > parse_revision_opt().
> >
> > Fix this for now by explicitly setting the hash algorithm to SHA1.
> >
> > Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
> > ---
> > I think there is a much cleaner patch here if one would look at
> > disentangling the parsing machinery split between cmd_shortlog() and
> > parse_revision_opt() such that the latter is only called if there is an
> > actual repository, but I'm not at all familiar enough with the codebase
> > to do that. Like the commit says some other commands were fixed like
> > this as well, so I thought to send this patch your way.
>
> Was this topic maybe missed? It's not a critical fix but would solve a
> segfault that is pretty easy for a user to trigger.

Yeah. It looks like this was sent out on Friday, which was Junio's last
working day before vacation. I think we probably both assumed the other
picked it up ;-).

Thanks,
Taylor
Taylor Blau Oct. 15, 2024, 11:28 p.m. UTC | #4
On Tue, Oct 15, 2024 at 03:54:51PM -0400, Taylor Blau wrote:
> On Tue, Oct 15, 2024 at 11:33:35AM +0200, Wolfgang Müller wrote:
> > On 2024-10-11 20:34, Wolfgang Müller wrote:
> > > Whilst git-shortlog(1) does not explicitly need any repository
> > > information when run without reference to one, it still parses some of
> > > its arguments with parse_revision_opt() which assumes that the hash
> > > algorithm is set. However, in c8aed5e8da (repository: stop setting SHA1
> > > as the default object hash, 2024-05-07) we stopped setting up a default
> > > hash algorithm and instead require commands to set it up explicitly.
> > >
> > > This was done for most other commands like in ab274909d4 (builtin/diff:
> > > explicitly set hash algo when there is no repo, 2024-05-07) but was
> > > missed for builtin/shortlog, making git-shortlog(1) segfault outside of
> > > a repository when given arguments like --author that trigger a call to
> > > parse_revision_opt().
> > >
> > > Fix this for now by explicitly setting the hash algorithm to SHA1.
> > >
> > > Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
> > > ---
> > > I think there is a much cleaner patch here if one would look at
> > > disentangling the parsing machinery split between cmd_shortlog() and
> > > parse_revision_opt() such that the latter is only called if there is an
> > > actual repository, but I'm not at all familiar enough with the codebase
> > > to do that. Like the commit says some other commands were fixed like
> > > this as well, so I thought to send this patch your way.
> >
> > Was this topic maybe missed? It's not a critical fix but would solve a
> > segfault that is pretty easy for a user to trigger.
>
> Yeah. It looks like this was sent out on Friday, which was Junio's last
> working day before vacation. I think we probably both assumed the other
> picked it up ;-).

I picked this up today and applied it to 'seen', but ejected it before
pushing the result out, as this appears to break CI, presumably due to
the mixed declaration and code in the patch.

Thanks,
Taylor
Wolfgang Müller Oct. 16, 2024, 8:15 a.m. UTC | #5
On 2024-10-15 19:28, Taylor Blau wrote:
> I picked this up today and applied it to 'seen', but ejected it before
> pushing the result out, as this appears to break CI, presumably due to
> the mixed declaration and code in the patch.

Oh, I had missed setting DEVELOPER=1 in my builds so I didn't see this
failure myself. Will fix in the upcoming version.
Taylor Blau Oct. 16, 2024, 6:28 p.m. UTC | #6
On Wed, Oct 16, 2024 at 10:15:31AM +0200, Wolfgang Müller wrote:
> On 2024-10-15 19:28, Taylor Blau wrote:
> > I picked this up today and applied it to 'seen', but ejected it before
> > pushing the result out, as this appears to break CI, presumably due to
> > the mixed declaration and code in the patch.
>
> Oh, I had missed setting DEVELOPER=1 in my builds so I didn't see this
> failure myself. Will fix in the upcoming version.

Thanks for fixing.

Thanks,
Taylor
diff mbox series

Patch

diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index 3ed5c46078..6422f088b2 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -387,6 +387,14 @@  int cmd_shortlog(int argc,
 	struct rev_info rev;
 	int nongit = !startup_info->have_repository;
 
+	/*
+	 * Later on we'll call parse_revision_opt which relies on the hash
+	 * algorithm being set but since we are operating outside of a Git
+	 * repository we cannot determine one. For now default to SHA1.
+	 */
+	if (nongit && !the_hash_algo)
+		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
 	const struct option options[] = {
 		OPT_BIT('c', "committer", &log.groups,
 			N_("group by committer rather than author"),