diff mbox series

[PATCH/RFD] fix connection via git protocol

Message ID 5d4e0ce10f537b4bb795a70dd51db12ecaf0206d.1681556597.git.git@grubix.eu (mailing list archive)
State New, archived
Headers show
Series [PATCH/RFD] fix connection via git protocol | expand

Commit Message

Michael J Gruber April 15, 2023, 11:06 a.m. UTC
5579f44d2f ("treewide: remove unnecessary cache.h inclusion", 2023-04-11)
broke connections via git protocol because it removed the inclusion of
the default port macro. While some may consider this transport to be
deprecated, it still serves some purpose.

connect.c (no more chache.h) and daemon.c (which still includes cache.h)
are the only users of the macro. Hot fix the issue by copying the
definition to connect.c.

A real fix will identify a proper common header file (I couldn't) or
create a new one.

Signed-off-by: Michael J Gruber <git@grubix.eu>
---
 connect.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Elijah Newren April 16, 2023, 5:47 a.m. UTC | #1
On Sat, Apr 15, 2023 at 4:06 AM Michael J Gruber <git@grubix.eu> wrote:
>
> 5579f44d2f ("treewide: remove unnecessary cache.h inclusion", 2023-04-11)
> broke connections via git protocol because it removed the inclusion of
> the default port macro. While some may consider this transport to be
> deprecated, it still serves some purpose.

In particular the problem is that

	const char *port = STR(DEFAULT_GIT_PORT);

translates now to

	const char *port = "DEFAULT_GIT_PORT";

instead of

	const char *port = "9418";

Since both compile and nothing in the testsuite tests this, I just
missed this problem when making the other changes.

Sorry about this.

> connect.c (no more chache.h) and daemon.c (which still includes cache.h)
> are the only users of the macro. Hot fix the issue by copying the
> definition to connect.c.
>
> A real fix will identify a proper common header file (I couldn't) or
> create a new one.

I've got a patch that does precisely this that I just submitted as
part of my follow-on to the en/header-split-cache-h series.  I've included
that patch below in case Junio wants to advance it faster than the rest of
that series.

-- >8 --
Subject: [PATCH] protocol.h: move definition of DEFAULT_GIT_PORT from cache.h

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 cache.h    | 21 ---------------------
 daemon.c   |  1 +
 protocol.h | 21 +++++++++++++++++++++
 3 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/cache.h b/cache.h
index 2f21704da9e..71e2fe74c4f 100644
--- a/cache.h
+++ b/cache.h
@@ -39,27 +39,6 @@
 #define S_DIFFTREE_IFXMIN_NEQ	0x80000000
 
 
-/*
- * Intensive research over the course of many years has shown that
- * port 9418 is totally unused by anything else. Or
- *
- *	Your search - "port 9418" - did not match any documents.
- *
- * as www.google.com puts it.
- *
- * This port has been properly assigned for git use by IANA:
- * git (Assigned-9418) [I06-050728-0001].
- *
- *	git  9418/tcp   git pack transfer service
- *	git  9418/udp   git pack transfer service
- *
- * with Linus Torvalds <torvalds@osdl.org> as the point of
- * contact. September 2005.
- *
- * See http://www.iana.org/assignments/port-numbers
- */
-#define DEFAULT_GIT_PORT 9418
-
 /*
  * Basic data structures for the directory cache
  */
diff --git a/daemon.c b/daemon.c
index db8a31a6ea2..75c3c064574 100644
--- a/daemon.c
+++ b/daemon.c
@@ -4,6 +4,7 @@
 #include "config.h"
 #include "environment.h"
 #include "pkt-line.h"
+#include "protocol.h"
 #include "run-command.h"
 #include "setup.h"
 #include "strbuf.h"
diff --git a/protocol.h b/protocol.h
index cef1a4a01c7..de66bf80f84 100644
--- a/protocol.h
+++ b/protocol.h
@@ -1,6 +1,27 @@
 #ifndef PROTOCOL_H
 #define PROTOCOL_H
 
+/*
+ * Intensive research over the course of many years has shown that
+ * port 9418 is totally unused by anything else. Or
+ *
+ *	Your search - "port 9418" - did not match any documents.
+ *
+ * as www.google.com puts it.
+ *
+ * This port has been properly assigned for git use by IANA:
+ * git (Assigned-9418) [I06-050728-0001].
+ *
+ *	git  9418/tcp   git pack transfer service
+ *	git  9418/udp   git pack transfer service
+ *
+ * with Linus Torvalds <torvalds@osdl.org> as the point of
+ * contact. September 2005.
+ *
+ * See http://www.iana.org/assignments/port-numbers
+ */
+#define DEFAULT_GIT_PORT 9418
+
 enum protocol_version {
 	protocol_unknown_version = -1,
 	protocol_v0 = 0,
Elijah Newren April 16, 2023, 5:51 a.m. UTC | #2
On Sat, Apr 15, 2023 at 10:47 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Sat, Apr 15, 2023 at 4:06 AM Michael J Gruber <git@grubix.eu> wrote:
> >
<snip>
> > A real fix will identify a proper common header file (I couldn't) or
> > create a new one.
>
> I've got a patch that does precisely this that I just submitted as
> part of my follow-on to the en/header-split-cache-h series.  I've included
> that patch below in case Junio wants to advance it faster than the rest of
> that series.

Link to other follow-up series I took this patch from:
https://lore.kernel.org/git/003548de707f57cb9908b6dfbdf42954f668ee43.1681614206.git.gitgitgadget@gmail.com/
Michael J Gruber April 16, 2023, 11:21 a.m. UTC | #3
Elijah Newren venit, vidit, dixit 2023-04-16 07:51:03:
> On Sat, Apr 15, 2023 at 10:47 PM Elijah Newren <newren@gmail.com> wrote:
> >
> > On Sat, Apr 15, 2023 at 4:06 AM Michael J Gruber <git@grubix.eu> wrote:
> > >
> <snip>
> > > A real fix will identify a proper common header file (I couldn't) or
> > > create a new one.
> >
> > I've got a patch that does precisely this that I just submitted as
> > part of my follow-on to the en/header-split-cache-h series.  I've included
> > that patch below in case Junio wants to advance it faster than the rest of
> > that series.
> 
> Link to other follow-up series I took this patch from:
> https://lore.kernel.org/git/003548de707f57cb9908b6dfbdf42954f668ee43.1681614206.git.gitgitgadget@gmail.com/

Thanks, protocol.h looks like a good place. I wasn't sure whether
daemon.c wants to include it, but I'm fine with it.

The change which broke git protocol hit me basically at the same time at
which I upgraded my system (F37->F38) which made the first analysis
interesting ... Anyway, it's "released" on next only, and the fix is
simple, thus not urgent. git.git/next works nicely on Fedora 38 by the
way, no surprises with tool chain upgrades so far.

Apparantly, we don't test this protocol, and almost all our next users
don't use it either ;)

Cheers
Michael
Jeff King April 17, 2023, 7:38 a.m. UTC | #4
On Sat, Apr 15, 2023 at 10:47:35PM -0700, Elijah Newren wrote:

> On Sat, Apr 15, 2023 at 4:06 AM Michael J Gruber <git@grubix.eu> wrote:
> >
> > 5579f44d2f ("treewide: remove unnecessary cache.h inclusion", 2023-04-11)
> > broke connections via git protocol because it removed the inclusion of
> > the default port macro. While some may consider this transport to be
> > deprecated, it still serves some purpose.
> 
> In particular the problem is that
> 
> 	const char *port = STR(DEFAULT_GIT_PORT);
> 
> translates now to
> 
> 	const char *port = "DEFAULT_GIT_PORT";
> 
> instead of
> 
> 	const char *port = "9418";
> 
> Since both compile and nothing in the testsuite tests this, I just
> missed this problem when making the other changes.

Your fix looks obviously correct, but the much more interesting thing to
me is how surprising it is that neither the compiler nor tests caught
it.  The tests don't catch it because we never use the default port for
our daemon tests, since we don't want two scripts running in parallel to
conflict. And your example above shows what the compiler sees, but root
issue is this funky string-ification macro:

  #define STR_(s) # s
  #define STR(s) STR_(s)

The preprocessor doesn't know that we'll be confused if "s" isn't
resolved, and by the time the compiler sees it, it's a string already.

Obviously we could add a test that catches this at run-time, but we
should be able to do better (catch it earlier, and with less code).

My first thought was: why can't we just treat the port as an "int" in
the first place? The answer is mostly that getaddrinfo() expects it as a
string. It could even be a non-numeric service like "http" in theory
(and looked up in /etc/services; Debian's even has "git" in it!), but
our get_host_and_port() refuses to allow that. But even if we didn't
want to ever support non-numeric service names, it makes the code more
awkward (we have to format the port into an extra buffer).

This would work:

diff --git a/connect.c b/connect.c
index fd3179e545..1eba71e34c 100644
--- a/connect.c
+++ b/connect.c
@@ -753,7 +753,7 @@ static char *host_end(char **hoststart, int removebrackets)
 }
 
 #define STR_(s)	# s
-#define STR(s)	STR_(s)
+#define STR(s) (STR_(s) + BUILD_ASSERT_OR_ZERO(s))
 
 static void get_host_and_port(char **host, const char **port)
 {

The error message is a bit verbose, but it starts with:

  connect.c: In function ‘git_tcp_connect_sock’:
  connect.c:801:32: error: ‘DEFAULT_GIT_PORT’ undeclared (first use in this function)
  801 |         const char *port = STR(DEFAULT_GIT_PORT);
      |                                ^~~~~~~~~~~~~~~~

which seems OK in practice.

Another alternative is to just declare this STR() thing too clever, and
put:

  #define DEFAULT_GIT_PORT_STR "9418"

next to the int declaration. It's not like its going to change. But the
BUILD_ASSERT doesn't seem too bad to me.

-Peff
Junio C Hamano April 17, 2023, 4:29 p.m. UTC | #5
Elijah Newren <newren@gmail.com> writes:

>> 5579f44d2f ("treewide: remove unnecessary cache.h inclusion", 2023-04-11)
>> broke connections via git protocol because it removed the inclusion of
>> the default port macro. While some may consider this transport to be
>> deprecated, it still serves some purpose.
>
> In particular the problem is that
>
> 	const char *port = STR(DEFAULT_GIT_PORT);
>
> translates now to
>
> 	const char *port = "DEFAULT_GIT_PORT";
>
> instead of
>
> 	const char *port = "9418";

Wow, that is a bad one.  If people do want "DEFAULT_GIT_PORT", they
would never write STR(DEFAULT_GIT_PORT).  I wonder if we can have a
bit more clever STR() macro that catches this kind of mistake at the
compile time.

If this is worth fixing, the fix would probably be worth protecting
with a test or two, but the networking test with fixed port is not
something we can easily do without a sealed environment, so...

Thanks Michael for catching this.

> I've got a patch that does precisely this that I just submitted as
> part of my follow-on to the en/header-split-cache-h series.  I've included
> that patch below in case Junio wants to advance it faster than the rest of
> that series.

Yeah, burying it in a 24-patch series is a bit unfortunate.


> -- >8 --
> Subject: [PATCH] protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  cache.h    | 21 ---------------------
>  daemon.c   |  1 +
>  protocol.h | 21 +++++++++++++++++++++
>  3 files changed, 22 insertions(+), 21 deletions(-)
>
> diff --git a/cache.h b/cache.h
> index 2f21704da9e..71e2fe74c4f 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -39,27 +39,6 @@
>  #define S_DIFFTREE_IFXMIN_NEQ	0x80000000
>  
>  
> -/*
> - * Intensive research over the course of many years has shown that
> - * port 9418 is totally unused by anything else. Or
> - *
> - *	Your search - "port 9418" - did not match any documents.
> - *
> - * as www.google.com puts it.
> - *
> - * This port has been properly assigned for git use by IANA:
> - * git (Assigned-9418) [I06-050728-0001].
> - *
> - *	git  9418/tcp   git pack transfer service
> - *	git  9418/udp   git pack transfer service
> - *
> - * with Linus Torvalds <torvalds@osdl.org> as the point of
> - * contact. September 2005.
> - *
> - * See http://www.iana.org/assignments/port-numbers
> - */
> -#define DEFAULT_GIT_PORT 9418
> -
>  /*
>   * Basic data structures for the directory cache
>   */
> diff --git a/daemon.c b/daemon.c
> index db8a31a6ea2..75c3c064574 100644
> --- a/daemon.c
> +++ b/daemon.c
> @@ -4,6 +4,7 @@
>  #include "config.h"
>  #include "environment.h"
>  #include "pkt-line.h"
> +#include "protocol.h"
>  #include "run-command.h"
>  #include "setup.h"
>  #include "strbuf.h"
> diff --git a/protocol.h b/protocol.h
> index cef1a4a01c7..de66bf80f84 100644
> --- a/protocol.h
> +++ b/protocol.h
> @@ -1,6 +1,27 @@
>  #ifndef PROTOCOL_H
>  #define PROTOCOL_H
>  
> +/*
> + * Intensive research over the course of many years has shown that
> + * port 9418 is totally unused by anything else. Or
> + *
> + *	Your search - "port 9418" - did not match any documents.
> + *
> + * as www.google.com puts it.
> + *
> + * This port has been properly assigned for git use by IANA:
> + * git (Assigned-9418) [I06-050728-0001].
> + *
> + *	git  9418/tcp   git pack transfer service
> + *	git  9418/udp   git pack transfer service
> + *
> + * with Linus Torvalds <torvalds@osdl.org> as the point of
> + * contact. September 2005.
> + *
> + * See http://www.iana.org/assignments/port-numbers
> + */
> +#define DEFAULT_GIT_PORT 9418
> +
>  enum protocol_version {
>  	protocol_unknown_version = -1,
>  	protocol_v0 = 0,
Junio C Hamano April 17, 2023, 4:33 p.m. UTC | #6
Jeff King <peff@peff.net> writes:

> The preprocessor doesn't know that we'll be confused if "s" isn't
> resolved, and by the time the compiler sees it, it's a string already.
>
> Obviously we could add a test that catches this at run-time, but we
> should be able to do better (catch it earlier, and with less code).

Ah, great minds think alike.  I am glad you have already thought it
through.

> My first thought was: why can't we just treat the port as an "int" in
> the first place? The answer is mostly that getaddrinfo() expects it as a
> string. It could even be a non-numeric service like "http" in theory
> (and looked up in /etc/services; Debian's even has "git" in it!), but
> our get_host_and_port() refuses to allow that. But even if we didn't
> want to ever support non-numeric service names, it makes the code more
> awkward (we have to format the port into an extra buffer).
>
> This would work:
>
> diff --git a/connect.c b/connect.c
> index fd3179e545..1eba71e34c 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -753,7 +753,7 @@ static char *host_end(char **hoststart, int removebrackets)
>  }
>  
>  #define STR_(s)	# s
> -#define STR(s)	STR_(s)
> +#define STR(s) (STR_(s) + BUILD_ASSERT_OR_ZERO(s))

OOoooh.  Clever.  A pointer plus N indexes into an array, but if the
offset is N then the pointer is left intact so the caller does not
see the difference.

> ... But the
> BUILD_ASSERT doesn't seem too bad to me.

Indeed.
Junio C Hamano April 17, 2023, 6:31 p.m. UTC | #7
Junio C Hamano <gitster@pobox.com> writes:

>>  #define STR_(s)	# s
>> -#define STR(s)	STR_(s)
>> +#define STR(s) (STR_(s) + BUILD_ASSERT_OR_ZERO(s))
>
> OOoooh.  Clever.  A pointer plus N indexes into an array, but if the
> offset is N then the pointer is left intact so the caller does not

Sorry, but s/N/0/ obviously.

> see the difference.
>
>> ... But the
>> BUILD_ASSERT doesn't seem too bad to me.
>
> Indeed.
Elijah Newren April 18, 2023, 1:55 a.m. UTC | #8
On Mon, Apr 17, 2023 at 9:29 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> >> 5579f44d2f ("treewide: remove unnecessary cache.h inclusion", 2023-04-11)
> >> broke connections via git protocol because it removed the inclusion of
> >> the default port macro. While some may consider this transport to be
> >> deprecated, it still serves some purpose.
> >
> > In particular the problem is that
> >
> >       const char *port = STR(DEFAULT_GIT_PORT);
> >
> > translates now to
> >
> >       const char *port = "DEFAULT_GIT_PORT";
> >
> > instead of
> >
> >       const char *port = "9418";
>
> Wow, that is a bad one.  If people do want "DEFAULT_GIT_PORT", they
> would never write STR(DEFAULT_GIT_PORT).  I wonder if we can have a
> bit more clever STR() macro that catches this kind of mistake at the
> compile time.
>
> If this is worth fixing, the fix would probably be worth protecting
> with a test or two, but the networking test with fixed port is not
> something we can easily do without a sealed environment, so...
>
> Thanks Michael for catching this.

Yup.

> > I've got a patch that does precisely this that I just submitted as
> > part of my follow-on to the en/header-split-cache-h series.  I've included
> > that patch below in case Junio wants to advance it faster than the rest of
> > that series.
>
> Yeah, burying it in a 24-patch series is a bit unfortunate.

I didn't know it was a fix for anything when I wrote it; it was in the
24-patch series just as a further refactoring.  Then I found out after
this report and doing a little digging I found it might be considered
a good fix for the issue so I included it here too.

If you apply this patch directly on this series, I'm happy to drop it
from the other series or do whatever makes things easiest.
Elijah Newren April 18, 2023, 1:56 a.m. UTC | #9
On Mon, Apr 17, 2023 at 12:38 AM Jeff King <peff@peff.net> wrote:
>
> On Sat, Apr 15, 2023 at 10:47:35PM -0700, Elijah Newren wrote:
>
> > On Sat, Apr 15, 2023 at 4:06 AM Michael J Gruber <git@grubix.eu> wrote:
> > >
> > > 5579f44d2f ("treewide: remove unnecessary cache.h inclusion", 2023-04-11)
> > > broke connections via git protocol because it removed the inclusion of
> > > the default port macro. While some may consider this transport to be
> > > deprecated, it still serves some purpose.
> >
> > In particular the problem is that
> >
> >       const char *port = STR(DEFAULT_GIT_PORT);
> >
> > translates now to
> >
> >       const char *port = "DEFAULT_GIT_PORT";
> >
> > instead of
> >
> >       const char *port = "9418";
> >
> > Since both compile and nothing in the testsuite tests this, I just
> > missed this problem when making the other changes.
>
> Your fix looks obviously correct, but the much more interesting thing to
> me is how surprising it is that neither the compiler nor tests caught
> it.  The tests don't catch it because we never use the default port for
> our daemon tests, since we don't want two scripts running in parallel to
> conflict. And your example above shows what the compiler sees, but root
> issue is this funky string-ification macro:
>
>   #define STR_(s) # s
>   #define STR(s) STR_(s)
>
> The preprocessor doesn't know that we'll be confused if "s" isn't
> resolved, and by the time the compiler sees it, it's a string already.
>
> Obviously we could add a test that catches this at run-time, but we
> should be able to do better (catch it earlier, and with less code).
>
> My first thought was: why can't we just treat the port as an "int" in
> the first place? The answer is mostly that getaddrinfo() expects it as a
> string. It could even be a non-numeric service like "http" in theory
> (and looked up in /etc/services; Debian's even has "git" in it!), but
> our get_host_and_port() refuses to allow that. But even if we didn't
> want to ever support non-numeric service names, it makes the code more
> awkward (we have to format the port into an extra buffer).
>
> This would work:
>
> diff --git a/connect.c b/connect.c
> index fd3179e545..1eba71e34c 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -753,7 +753,7 @@ static char *host_end(char **hoststart, int removebrackets)
>  }
>
>  #define STR_(s)        # s
> -#define STR(s) STR_(s)
> +#define STR(s) (STR_(s) + BUILD_ASSERT_OR_ZERO(s))
>
>  static void get_host_and_port(char **host, const char **port)
>  {
>
> The error message is a bit verbose, but it starts with:
>
>   connect.c: In function ‘git_tcp_connect_sock’:
>   connect.c:801:32: error: ‘DEFAULT_GIT_PORT’ undeclared (first use in this function)
>   801 |         const char *port = STR(DEFAULT_GIT_PORT);
>       |                                ^~~~~~~~~~~~~~~~
>
> which seems OK in practice.

Seems pretty good to me.

> Another alternative is to just declare this STR() thing too clever, and
> put:
>
>   #define DEFAULT_GIT_PORT_STR "9418"
>
> next to the int declaration. It's not like its going to change. But the
> BUILD_ASSERT doesn't seem too bad to me.

Yeah, I like the BUILD_ASSERT.
Jeff King April 18, 2023, 3:39 a.m. UTC | #10
On Mon, Apr 17, 2023 at 09:33:57AM -0700, Junio C Hamano wrote:

> > diff --git a/connect.c b/connect.c
> > index fd3179e545..1eba71e34c 100644
> > --- a/connect.c
> > +++ b/connect.c
> > @@ -753,7 +753,7 @@ static char *host_end(char **hoststart, int removebrackets)
> >  }
> >  
> >  #define STR_(s)	# s
> > -#define STR(s)	STR_(s)
> > +#define STR(s) (STR_(s) + BUILD_ASSERT_OR_ZERO(s))
> 
> OOoooh.  Clever.  A pointer plus N indexes into an array, but if the
> offset is N then the pointer is left intact so the caller does not
> see the difference.
> 
> > ... But the
> > BUILD_ASSERT doesn't seem too bad to me.
> 
> Indeed.

So I started to write this up as a patch, but there's another subtle
thing going on.

The BUILD_ASSERT is actually checking two things: that the result
compiles (which is what we care about here), and that the expression it
evaluates is nonzero (which we don't).

So this would fail for example with:

  #define ZERO 0
  const char *x = STR(ZERO);

That is OK for our purposes here (a zero port does not make any sense).
But it feels a bit weird for a macro as generically named as STR(). At
least it's local to the one file. But maybe it should be PORT_TO_STR()
or something.

All of that makes me wonder if we wouldn't be just as happy with it as a
string in the first place. In three out of four locations that use it,
they want the string anyway (to feed to getaddrinfo). And in the final
one (git-daemon), we need to convert from the user's "--port" anyway, so
there's always some string-to-int parsing. And depending on the #ifdefs,
in most cases we turn it back into a string anyway to feed to...you
guessed it, getaddrinfo!

The exception is when NO_IPV6 is defined, in which we do want the
numeric value. But we could delay parsing until that point (and
otherwise let getaddrinfo handle, which seems more correct anyway).

Something like this (though I'd probably split it into a few patches to
reason about the motivation and implications of each):

diff --git a/cache.h b/cache.h
index 2f21704da9..2ece09a2b8 100644
--- a/cache.h
+++ b/cache.h
@@ -58,7 +58,7 @@
  *
  * See http://www.iana.org/assignments/port-numbers
  */
-#define DEFAULT_GIT_PORT 9418
+#define DEFAULT_GIT_PORT "9418"
 
 /*
  * Basic data structures for the directory cache
diff --git a/connect.c b/connect.c
index fd3179e545..189367604c 100644
--- a/connect.c
+++ b/connect.c
@@ -1,4 +1,5 @@
 #include "git-compat-util.h"
+#include "cache.h" /* or protocol.h after Elijah's patch */
 #include "config.h"
 #include "environment.h"
 #include "gettext.h"
@@ -752,9 +753,6 @@ static char *host_end(char **hoststart, int removebrackets)
 	return end;
 }
 
-#define STR_(s)	# s
-#define STR(s)	STR_(s)
-
 static void get_host_and_port(char **host, const char **port)
 {
 	char *colon, *end;
@@ -798,7 +796,7 @@ static int git_tcp_connect_sock(char *host, int flags)
 {
 	struct strbuf error_message = STRBUF_INIT;
 	int sockfd = -1;
-	const char *port = STR(DEFAULT_GIT_PORT);
+	const char *port = DEFAULT_GIT_PORT;
 	struct addrinfo hints, *ai0, *ai;
 	int gai;
 	int cnt = 0;
@@ -868,7 +866,7 @@ static int git_tcp_connect_sock(char *host, int flags)
 {
 	struct strbuf error_message = STRBUF_INIT;
 	int sockfd = -1;
-	const char *port = STR(DEFAULT_GIT_PORT);
+	const char *port = DEFAULT_GIT_PORT;
 	char *ep;
 	struct hostent *he;
 	struct sockaddr_in sa;
@@ -1020,7 +1018,7 @@ static int git_use_proxy(const char *host)
 
 static struct child_process *git_proxy_connect(int fd[2], char *host)
 {
-	const char *port = STR(DEFAULT_GIT_PORT);
+	const char *port = DEFAULT_GIT_PORT;
 	struct child_process *proxy;
 
 	get_host_and_port(&host, &port);
diff --git a/daemon.c b/daemon.c
index db8a31a6ea..ce692bad35 100644
--- a/daemon.c
+++ b/daemon.c
@@ -984,22 +984,20 @@ static const char *ip2str(int family, struct sockaddr *sin, socklen_t len)
 
 #ifndef NO_IPV6
 
-static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)
+static int setup_named_sock(char *listen_addr, const char *listen_port, struct socketlist *socklist)
 {
 	int socknum = 0;
-	char pbuf[NI_MAXSERV];
 	struct addrinfo hints, *ai0, *ai;
 	int gai;
 	long flags;
 
-	xsnprintf(pbuf, sizeof(pbuf), "%d", listen_port);
 	memset(&hints, 0, sizeof(hints));
 	hints.ai_family = AF_UNSPEC;
 	hints.ai_socktype = SOCK_STREAM;
 	hints.ai_protocol = IPPROTO_TCP;
 	hints.ai_flags = AI_PASSIVE;
 
-	gai = getaddrinfo(listen_addr, pbuf, &hints, &ai0);
+	gai = getaddrinfo(listen_addr, listen_port, &hints, &ai0);
 	if (gai) {
 		logerror("getaddrinfo() for %s failed: %s", listen_addr, gai_strerror(gai));
 		return 0;
@@ -1065,15 +1063,27 @@ static int setup_named_sock(char *listen_addr, int listen_port, struct socketlis
 
 #else /* NO_IPV6 */
 
-static int setup_named_sock(char *listen_addr, int listen_port, struct socketlist *socklist)
+static int parse_port(const char *s)
+{
+	unsigned long ret;
+	char *end;
+
+	ret = strtoul(s, &end, 0);
+	if (!ret || ret > 65535 || *end)
+		die(_("invalid listen port: %s"), s);
+
+	return ret;
+}
+
+static int setup_named_sock(char *listen_addr, const char *listen_port, struct socketlist *socklist)
 {
 	struct sockaddr_in sin;
 	int sockfd;
 	long flags;
 
 	memset(&sin, 0, sizeof sin);
 	sin.sin_family = AF_INET;
-	sin.sin_port = htons(listen_port);
+	sin.sin_port = parse_port(listen_port);
 
 	if (listen_addr) {
 		/* Well, host better be an IP address here. */
@@ -1122,7 +1132,7 @@ static int setup_named_sock(char *listen_addr, int listen_port, struct socketlis
 
 #endif
 
-static void socksetup(struct string_list *listen_addr, int listen_port, struct socketlist *socklist)
+static void socksetup(struct string_list *listen_addr, const char *listen_port, struct socketlist *socklist)
 {
 	if (!listen_addr->nr)
 		setup_named_sock(NULL, listen_port, socklist);
@@ -1133,7 +1143,7 @@ static void socksetup(struct string_list *listen_addr, int listen_port, struct s
 						   listen_port, socklist);
 
 			if (socknum == 0)
-				logerror("unable to allocate any listen sockets for host %s on port %u",
+				logerror("unable to allocate any listen sockets for host %s on port %s",
 					 listen_addr->items[i].string, listen_port);
 		}
 	}
@@ -1246,14 +1256,14 @@ static struct credentials *prepare_credentials(const char *user_name,
 }
 #endif
 
-static int serve(struct string_list *listen_addr, int listen_port,
+static int serve(struct string_list *listen_addr, const char *listen_port,
     struct credentials *cred)
 {
 	struct socketlist socklist = { NULL, 0, 0 };
 
 	socksetup(listen_addr, listen_port, &socklist);
 	if (socklist.nr == 0)
-		die("unable to allocate any listen sockets on port %u",
+		die("unable to allocate any listen sockets on port %s",
 		    listen_port);
 
 	drop_privileges(cred);
@@ -1265,7 +1275,7 @@ static int serve(struct string_list *listen_addr, int listen_port,
 
 int cmd_main(int argc, const char **argv)
 {
-	int listen_port = 0;
+	const char *listen_port = NULL;
 	struct string_list listen_addr = STRING_LIST_INIT_NODUP;
 	int serve_mode = 0, inetd_mode = 0;
 	const char *pid_file = NULL, *user_name = NULL, *group_name = NULL;
@@ -1282,13 +1292,8 @@ int cmd_main(int argc, const char **argv)
 			continue;
 		}
 		if (skip_prefix(arg, "--port=", &v)) {
-			char *end;
-			unsigned long n;
-			n = strtoul(v, &end, 0);
-			if (*v && !*end) {
-				listen_port = n;
-				continue;
-			}
+			listen_port = v;
+			continue;
 		}
 		if (!strcmp(arg, "--serve")) {
 			serve_mode = 1;
@@ -1439,7 +1444,7 @@ int cmd_main(int argc, const char **argv)
 
 	if (inetd_mode && (listen_port || (listen_addr.nr > 0)))
 		die("--listen= and --port= are incompatible with --inetd");
-	else if (listen_port == 0)
+	else if (!listen_port)
 		listen_port = DEFAULT_GIT_PORT;
 
 	if (group_name && !user_name)
Jeff King April 18, 2023, 3:40 a.m. UTC | #11
On Mon, Apr 17, 2023 at 06:56:42PM -0700, Elijah Newren wrote:

> > The error message is a bit verbose, but it starts with:
> >
> >   connect.c: In function ‘git_tcp_connect_sock’:
> >   connect.c:801:32: error: ‘DEFAULT_GIT_PORT’ undeclared (first use in this function)
> >   801 |         const char *port = STR(DEFAULT_GIT_PORT);
> >       |                                ^~~~~~~~~~~~~~~~
> >
> > which seems OK in practice.
> 
> Seems pretty good to me.

The rest of it is less nice:
  connect.c: In function ‘git_tcp_connect_sock’:
  connect.c:801:32: error: ‘DEFAULT_GIT_PORT’ undeclared (first use in this function)
    801 |         const char *port = STR(DEFAULT_GIT_PORT);
        |                                ^~~~~~~~~~~~~~~~
  git-compat-util.h:93:31: note: in definition of macro ‘BUILD_ASSERT_OR_ZERO’
     93 |         (sizeof(char [1 - 2*!(cond)]) - 1)
        |                               ^~~~
  connect.c:801:28: note: in expansion of macro ‘STR’
    801 |         const char *port = STR(DEFAULT_GIT_PORT);
        |                            ^~~
  connect.c:801:32: note: each undeclared identifier is reported only once for each function it appears in
    801 |         const char *port = STR(DEFAULT_GIT_PORT);
        |                                ^~~~~~~~~~~~~~~~
  git-compat-util.h:93:31: note: in definition of macro ‘BUILD_ASSERT_OR_ZERO’
     93 |         (sizeof(char [1 - 2*!(cond)]) - 1)
        |                               ^~~~
  connect.c:801:28: note: in expansion of macro ‘STR’
    801 |         const char *port = STR(DEFAULT_GIT_PORT);
        |                            ^~~

but that's kind of the nature of BUILD_ASSERT (it's even worse if the
code compiles but the assertion is false, since the first thing is that
funky sizeof() line). I guess it's the best we can do. The point is that
these aren't supposed to happen very often.

-Peff
Junio C Hamano April 18, 2023, 8:47 p.m. UTC | #12
Jeff King <peff@peff.net> writes:

> All of that makes me wonder if we wouldn't be just as happy with it as a
> string in the first place. In three out of four locations that use it,
> they want the string anyway (to feed to getaddrinfo). And in the final
> one (git-daemon), we need to convert from the user's "--port" anyway, so
> there's always some string-to-int parsing. And depending on the #ifdefs,
> in most cases we turn it back into a string anyway to feed to...you
> guessed it, getaddrinfo!
>
> The exception is when NO_IPV6 is defined, in which we do want the
> numeric value. But we could delay parsing until that point (and
> otherwise let getaddrinfo handle, which seems more correct anyway).
>
> Something like this (though I'd probably split it into a few patches to
> reason about the motivation and implications of each):

The updated code keeps the "port" as a string and turns it into a
short integer only when it is needed, which is nice.  In the future,
parse_port() may even want to be extended to call something like
getservbyname(), to allow something silly like

	#define DEFAULT_GIT_PORT "git"
Junio C Hamano April 18, 2023, 9 p.m. UTC | #13
Elijah Newren <newren@gmail.com> writes:

> I didn't know it was a fix for anything when I wrote it; it was in the
> 24-patch series just as a further refactoring.  Then I found out after
> this report and doing a little digging I found it might be considered
> a good fix for the issue so I included it here too.

Yup, let's queue it at the tip of (and as a part of) the base series
with a bit of explanation.  How does this look?

----- >8 --------- >8 --------- >8 --------- >8 -----
From: Elijah Newren <newren@gmail.com>
Date: Sun, 16 Apr 2023 03:03:05 +0000
Subject: [PATCH] protocol.h: move definition of DEFAULT_GIT_PORT from cache.h

Michael J Gruber noticed that connection via the git:// protocol no
longer worked after a recent header clean-up.  This was caused by
funny interaction of few gotchas.  First, a necessary definition

	#define DEFAULT_GIT_PORT 9418

was made invisible to a place where

	const char *port = STR(DEFAULT_GIT_PORT);

was expecting to turn the integer into "9418" with a clever STR()
macro, and ended up stringifying it to

	const char *port = "DEFAULT_GIT_PORT";

without giving any chance to compilers to notice such a mistake.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 cache.h    | 21 ---------------------
 daemon.c   |  1 +
 protocol.h | 21 +++++++++++++++++++++
 3 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/cache.h b/cache.h
index 2f21704da9..71e2fe74c4 100644
--- a/cache.h
+++ b/cache.h
@@ -39,27 +39,6 @@
 #define S_DIFFTREE_IFXMIN_NEQ	0x80000000
 
 
-/*
- * Intensive research over the course of many years has shown that
- * port 9418 is totally unused by anything else. Or
- *
- *	Your search - "port 9418" - did not match any documents.
- *
- * as www.google.com puts it.
- *
- * This port has been properly assigned for git use by IANA:
- * git (Assigned-9418) [I06-050728-0001].
- *
- *	git  9418/tcp   git pack transfer service
- *	git  9418/udp   git pack transfer service
- *
- * with Linus Torvalds <torvalds@osdl.org> as the point of
- * contact. September 2005.
- *
- * See http://www.iana.org/assignments/port-numbers
- */
-#define DEFAULT_GIT_PORT 9418
-
 /*
  * Basic data structures for the directory cache
  */
diff --git a/daemon.c b/daemon.c
index db8a31a6ea..75c3c06457 100644
--- a/daemon.c
+++ b/daemon.c
@@ -4,6 +4,7 @@
 #include "config.h"
 #include "environment.h"
 #include "pkt-line.h"
+#include "protocol.h"
 #include "run-command.h"
 #include "setup.h"
 #include "strbuf.h"
diff --git a/protocol.h b/protocol.h
index cef1a4a01c..de66bf80f8 100644
--- a/protocol.h
+++ b/protocol.h
@@ -1,6 +1,27 @@
 #ifndef PROTOCOL_H
 #define PROTOCOL_H
 
+/*
+ * Intensive research over the course of many years has shown that
+ * port 9418 is totally unused by anything else. Or
+ *
+ *	Your search - "port 9418" - did not match any documents.
+ *
+ * as www.google.com puts it.
+ *
+ * This port has been properly assigned for git use by IANA:
+ * git (Assigned-9418) [I06-050728-0001].
+ *
+ *	git  9418/tcp   git pack transfer service
+ *	git  9418/udp   git pack transfer service
+ *
+ * with Linus Torvalds <torvalds@osdl.org> as the point of
+ * contact. September 2005.
+ *
+ * See http://www.iana.org/assignments/port-numbers
+ */
+#define DEFAULT_GIT_PORT 9418
+
 enum protocol_version {
 	protocol_unknown_version = -1,
 	protocol_v0 = 0,
Eric Sunshine April 18, 2023, 9:24 p.m. UTC | #14
On Tue, Apr 18, 2023 at 5:06 PM Junio C Hamano <gitster@pobox.com> wrote:
> From: Elijah Newren <newren@gmail.com>
> Subject: [PATCH] protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
>
> Michael J Gruber noticed that connection via the git:// protocol no
> longer worked after a recent header clean-up.

A link to Michael's email might be useful for future readers of this
commit message.

    Michale J Gruber noticed[1] that connection...

    [1]: https://lore.kernel.org/git/5d4e0ce10f537b4bb795a70dd51db12ecaf0206d.1681556597.git.git@grubix.eu/

> This was caused by
> funny interaction of few gotchas.  First, a necessary definition
>
>         #define DEFAULT_GIT_PORT 9418
>
> was made invisible to a place where
>
>         const char *port = STR(DEFAULT_GIT_PORT);
>
> was expecting to turn the integer into "9418" with a clever STR()
> macro, and ended up stringifying it to
>
>         const char *port = "DEFAULT_GIT_PORT";
>
> without giving any chance to compilers to notice such a mistake.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Perhaps an additional tailer would be appropriate?

    Reported-by: Michael J Gruber <git@grubix.eu>
Elijah Newren April 19, 2023, 3:15 a.m. UTC | #15
On Tue, Apr 18, 2023 at 2:00 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > I didn't know it was a fix for anything when I wrote it; it was in the
> > 24-patch series just as a further refactoring.  Then I found out after
> > this report and doing a little digging I found it might be considered
> > a good fix for the issue so I included it here too.
>
> Yup, let's queue it at the tip of (and as a part of) the base series
> with a bit of explanation.  How does this look?
>
> ----- >8 --------- >8 --------- >8 --------- >8 -----
> From: Elijah Newren <newren@gmail.com>
> Date: Sun, 16 Apr 2023 03:03:05 +0000
> Subject: [PATCH] protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
>
> Michael J Gruber noticed that connection via the git:// protocol no
> longer worked after a recent header clean-up.  This was caused by
> funny interaction of few gotchas.  First, a necessary definition
>
>         #define DEFAULT_GIT_PORT 9418
>
> was made invisible to a place where
>
>         const char *port = STR(DEFAULT_GIT_PORT);
>
> was expecting to turn the integer into "9418" with a clever STR()
> macro, and ended up stringifying it to
>
>         const char *port = "DEFAULT_GIT_PORT";
>
> without giving any chance to compilers to notice such a mistake.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Looks great!

> ---
>  cache.h    | 21 ---------------------
>  daemon.c   |  1 +
>  protocol.h | 21 +++++++++++++++++++++
>  3 files changed, 22 insertions(+), 21 deletions(-)
>
> diff --git a/cache.h b/cache.h
> index 2f21704da9..71e2fe74c4 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -39,27 +39,6 @@
>  #define S_DIFFTREE_IFXMIN_NEQ  0x80000000
>
>
> -/*
> - * Intensive research over the course of many years has shown that
> - * port 9418 is totally unused by anything else. Or
> - *
> - *     Your search - "port 9418" - did not match any documents.
> - *
> - * as www.google.com puts it.
> - *
> - * This port has been properly assigned for git use by IANA:
> - * git (Assigned-9418) [I06-050728-0001].
> - *
> - *     git  9418/tcp   git pack transfer service
> - *     git  9418/udp   git pack transfer service
> - *
> - * with Linus Torvalds <torvalds@osdl.org> as the point of
> - * contact. September 2005.
> - *
> - * See http://www.iana.org/assignments/port-numbers
> - */
> -#define DEFAULT_GIT_PORT 9418
> -
>  /*
>   * Basic data structures for the directory cache
>   */
> diff --git a/daemon.c b/daemon.c
> index db8a31a6ea..75c3c06457 100644
> --- a/daemon.c
> +++ b/daemon.c
> @@ -4,6 +4,7 @@
>  #include "config.h"
>  #include "environment.h"
>  #include "pkt-line.h"
> +#include "protocol.h"
>  #include "run-command.h"
>  #include "setup.h"
>  #include "strbuf.h"
> diff --git a/protocol.h b/protocol.h
> index cef1a4a01c..de66bf80f8 100644
> --- a/protocol.h
> +++ b/protocol.h
> @@ -1,6 +1,27 @@
>  #ifndef PROTOCOL_H
>  #define PROTOCOL_H
>
> +/*
> + * Intensive research over the course of many years has shown that
> + * port 9418 is totally unused by anything else. Or
> + *
> + *     Your search - "port 9418" - did not match any documents.
> + *
> + * as www.google.com puts it.
> + *
> + * This port has been properly assigned for git use by IANA:
> + * git (Assigned-9418) [I06-050728-0001].
> + *
> + *     git  9418/tcp   git pack transfer service
> + *     git  9418/udp   git pack transfer service
> + *
> + * with Linus Torvalds <torvalds@osdl.org> as the point of
> + * contact. September 2005.
> + *
> + * See http://www.iana.org/assignments/port-numbers
> + */
> +#define DEFAULT_GIT_PORT 9418
> +
>  enum protocol_version {
>         protocol_unknown_version = -1,
>         protocol_v0 = 0,
> --
> 2.40.0-352-g667fcf4e15
>
>
>
Elijah Newren April 19, 2023, 3:16 a.m. UTC | #16
On Tue, Apr 18, 2023 at 2:25 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Tue, Apr 18, 2023 at 5:06 PM Junio C Hamano <gitster@pobox.com> wrote:
> > From: Elijah Newren <newren@gmail.com>
> > Subject: [PATCH] protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
> >
> > Michael J Gruber noticed that connection via the git:// protocol no
> > longer worked after a recent header clean-up.
>
> A link to Michael's email might be useful for future readers of this
> commit message.
>
>     Michale J Gruber noticed[1] that connection...
>
>     [1]: https://lore.kernel.org/git/5d4e0ce10f537b4bb795a70dd51db12ecaf0206d.1681556597.git.git@grubix.eu/
>
> > This was caused by
> > funny interaction of few gotchas.  First, a necessary definition
> >
> >         #define DEFAULT_GIT_PORT 9418
> >
> > was made invisible to a place where
> >
> >         const char *port = STR(DEFAULT_GIT_PORT);
> >
> > was expecting to turn the integer into "9418" with a clever STR()
> > macro, and ended up stringifying it to
> >
> >         const char *port = "DEFAULT_GIT_PORT";
> >
> > without giving any chance to compilers to notice such a mistake.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > Signed-off-by: Junio C Hamano <gitster@pobox.com>
>
> Perhaps an additional tailer would be appropriate?
>
>     Reported-by: Michael J Gruber <git@grubix.eu>

These both look like good additions too.
Junio C Hamano April 24, 2023, 8:57 p.m. UTC | #17
Eric Sunshine <sunshine@sunshineco.com> writes:

> A link to Michael's email might be useful for future readers of this
> commit message.
>
>     Michale J Gruber noticed[1] that connection...

I don't think so.  At least, I tried to write the proposed log
message in such a way that it is not needed.

I encourage people not to add URLs just so that they can be lazy and
omit summarizing the discussion for readers so that they do not have
to refer to external material in order to understand "git log"
output.  And I was trying to show by example.

Was there anything you couldn't read out of what I wrote that is
necessary to understand the issue the commit tried to address
without reading the original discussion?
diff mbox series

Patch

diff --git a/connect.c b/connect.c
index 5d8036197d..64f89e33cf 100644
--- a/connect.c
+++ b/connect.c
@@ -20,6 +20,8 @@ 
 #include "alias.h"
 #include "bundle-uri.h"
 
+#define DEFAULT_GIT_PORT 9418
+
 static char *server_capabilities_v1;
 static struct strvec server_capabilities_v2 = STRVEC_INIT;
 static const char *next_server_feature_value(const char *feature, int *len, int *offset);