diff mbox series

[v5,6/6] agent: advertise OS name via agent capability

Message ID 20250214123734.1403120-7-usmanakinyemi202@gmail.com (mailing list archive)
State Superseded
Headers show
Series extend agent capability to include OS name | expand

Commit Message

Usman Akinyemi Feb. 14, 2025, 12:36 p.m. UTC
As some issues that can happen with a Git client can be operating system
specific, it can be useful for a server to know which OS a client is
using. In the same way it can be useful for a client to know which OS
a server is using.

Our current agent capability is in the form of "package/version" (e.g.,
"git/1.8.3.1"). Let's extend it to include the operating system name (os)
i.e in the form "package/version os" (e.g., "git/1.8.3.1 Linux").

Including OS details in the agent capability simplifies implementation,
maintains backward compatibility, avoids introducing a new capability,
encourages adoption across Git-compatible software, and enhances
debugging by providing complete environment information without affecting
functionality. The operating system name is retrieved using the 'sysname'
field of the `uname(2)` system call or its equivalent.

However, there are differences between `uname(1)` (command-line utility)
and `uname(2)` (system call) outputs on Windows. These discrepancies
complicate testing on Windows platforms. For example:
  - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\
  .2024-02-14.20:17.UTC.x86_64
  - `uname(2)` output: Windows.10.0.20348

On Windows, uname(2) is not actually system-supplied but is instead
already faked up by Git itself. We could have overcome the test issue
on Windows by implementing a new `uname` subcommand in `test-tool`
using uname(2), but except uname(2), which would be tested against
itself, there would be nothing platform specific, so it's just simpler
to disable the tests on Windows.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 Documentation/gitprotocol-v2.txt | 13 ++++++++-----
 t/t5701-git-serve.sh             | 16 +++++++++++++++-
 t/test-lib-functions.sh          |  8 ++++++++
 version.c                        | 29 ++++++++++++++++++++++++++++-
 version.h                        |  3 +++
 5 files changed, 62 insertions(+), 7 deletions(-)

Comments

Junio C Hamano Feb. 14, 2025, 10:07 p.m. UTC | #1
Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> As some issues that can happen with a Git client can be operating system
> specific, it can be useful for a server to know which OS a client is
> using. In the same way it can be useful for a client to know which OS
> a server is using.
>
> Our current agent capability is in the form of "package/version" (e.g.,
> "git/1.8.3.1"). Let's extend it to include the operating system name (os)
> i.e in the form "package/version os" (e.g., "git/1.8.3.1 Linux").

Shouldn't this be "git/1.8.3.1-Linux" or something to avoid SP?  The
capability list in protocol v1 is on a single line that is whitespace 
separated (cf. connect.c:parse_feature_value()) without any escape
mechanism.

	Side note.  Does it pose a security hole, when we can set
	agent to any value?  I do not think so, as it controls what
	this end sends to the other.  If you are attacker in control
	of your own agent string to be sent to the other end, and
	use a string with a whitespace in it after "agent=" to claim
	that you support a capability you actually don't, that is
	not a new way to attack the other side available to you---you
	can write your own Git client to talk to the other side to
	send such a bogus capablity list anyway.

> diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
> index 1652fef3ae..f4831a8787 100644
> --- a/Documentation/gitprotocol-v2.txt
> +++ b/Documentation/gitprotocol-v2.txt
> @@ -184,11 +184,14 @@ form `agent=X`) to notify the client that the server is running version
>  the `agent` capability with a value `Y` (in the form `agent=Y`) in its
>  request to the server (but it MUST NOT do so if the server did not
>  advertise the agent capability). The `X` and `Y` strings may contain any
> -printable ASCII characters except space (i.e., the byte range 32 < x <
> -127), and are typically of the form "package/version" (e.g.,
> -"git/1.8.3.1"). The agent strings are purely informative for statistics
> -and debugging purposes, and MUST NOT be used to programmatically assume
> -the presence or absence of particular features.
> +printable ASCII characters (i.e., the byte range 31 < x < 127), and are

Patches 1 & 2 redacted non-printables and SP separately, because SP
is considered printable.  With this change you are allowing SP to be
passed without getting redacted?  I do not think it is a good idea
(see above).

While I'd prefer to keep the range the same as before, i.e. "any
printable ASCII characters except space", "33 <= x <= 126" may be
more readily recognisable that we are doing something unusual, as
"32 <= x <= 126" is fairly easily recognisable as "ASCII printable".

> +typically of the form "package/version os" (e.g., "git/1.8.3.1 Linux")

So, I'd suggest using something other than " " between "version" and
"os".  Dot (as if the byte there were redacted) or slash or dash or
whatever, anything that is not whitespace.

> +where `os` is the operating system name (e.g., "Linux"). `X` and `Y` can
> +be configured using the GIT_USER_AGENT environment variable and it takes
> +priority. The `os` is retrieved using the 'sysname' field of the `uname(2)`
> +system call or its equivalent. The agent strings are purely informative for
> +statistics and debugging purposes, and MUST NOT be used to programmatically
> +assume the presence or absence of particular features.

Other than these nits, I find the above very well done.

As to the additional implementation of git_user_agent_sanitized(),
except for that same "do we really want SP there?" question, I see
nothing questionable there, either.

Overall very nicely done and presented.

Thanks.
Usman Akinyemi Feb. 15, 2025, 3:29 p.m. UTC | #2
On Sat, Feb 15, 2025 at 3:37 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
Hi Junio,
>
> > As some issues that can happen with a Git client can be operating system
> > specific, it can be useful for a server to know which OS a client is
> > using. In the same way it can be useful for a client to know which OS
> > a server is using.
> >
> > Our current agent capability is in the form of "package/version" (e.g.,
> > "git/1.8.3.1"). Let's extend it to include the operating system name (os)
> > i.e in the form "package/version os" (e.g., "git/1.8.3.1 Linux").
>
> Shouldn't this be "git/1.8.3.1-Linux" or something to avoid SP?  The
> capability list in protocol v1 is on a single line that is whitespace
> separated (cf. connect.c:parse_feature_value()) without any escape
> mechanism.
Yeah, I almost missed this function. Thanks for pointing it out.
>
>         Side note.  Does it pose a security hole, when we can set
>         agent to any value?  I do not think so, as it controls what
>         this end sends to the other.  If you are attacker in control
>         of your own agent string to be sent to the other end, and
>         use a string with a whitespace in it after "agent=" to claim
>         that you support a capability you actually don't, that is
>         not a new way to attack the other side available to you---you
>         can write your own Git client to talk to the other side to
>         send such a bogus capablity list anyway.
Thanks for this explanation.
>
> > diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
> > index 1652fef3ae..f4831a8787 100644
> > --- a/Documentation/gitprotocol-v2.txt
> > +++ b/Documentation/gitprotocol-v2.txt
> > @@ -184,11 +184,14 @@ form `agent=X`) to notify the client that the server is running version
> >  the `agent` capability with a value `Y` (in the form `agent=Y`) in its
> >  request to the server (but it MUST NOT do so if the server did not
> >  advertise the agent capability). The `X` and `Y` strings may contain any
> > -printable ASCII characters except space (i.e., the byte range 32 < x <
> > -127), and are typically of the form "package/version" (e.g.,
> > -"git/1.8.3.1"). The agent strings are purely informative for statistics
> > -and debugging purposes, and MUST NOT be used to programmatically assume
> > -the presence or absence of particular features.
> > +printable ASCII characters (i.e., the byte range 31 < x < 127), and are
>
> Patches 1 & 2 redacted non-printables and SP separately, because SP
> is considered printable.  With this change you are allowing SP to be
> passed without getting redacted?  I do not think it is a good idea
> (see above).
>
> While I'd prefer to keep the range the same as before, i.e. "any
> printable ASCII characters except space", "33 <= x <= 126" may be
> more readily recognisable that we are doing something unusual, as
> "32 <= x <= 126" is fairly easily recognisable as "ASCII printable".
>
> > +typically of the form "package/version os" (e.g., "git/1.8.3.1 Linux")
>
> So, I'd suggest using something other than " " between "version" and
> "os".  Dot (as if the byte there were redacted) or slash or dash or
> whatever, anything that is not whitespace.
Yeah, Noted. Thanks.
>
> > +where `os` is the operating system name (e.g., "Linux"). `X` and `Y` can
> > +be configured using the GIT_USER_AGENT environment variable and it takes
> > +priority. The `os` is retrieved using the 'sysname' field of the `uname(2)`
> > +system call or its equivalent. The agent strings are purely informative for
> > +statistics and debugging purposes, and MUST NOT be used to programmatically
> > +assume the presence or absence of particular features.
>
> Other than these nits, I find the above very well done.
>
> As to the additional implementation of git_user_agent_sanitized(),
> except for that same "do we really want SP there?" question, I see
> nothing questionable there, either.
>
> Overall very nicely done and presented.
Thank you.
>
> Thanks.
diff mbox series

Patch

diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
index 1652fef3ae..f4831a8787 100644
--- a/Documentation/gitprotocol-v2.txt
+++ b/Documentation/gitprotocol-v2.txt
@@ -184,11 +184,14 @@  form `agent=X`) to notify the client that the server is running version
 the `agent` capability with a value `Y` (in the form `agent=Y`) in its
 request to the server (but it MUST NOT do so if the server did not
 advertise the agent capability). The `X` and `Y` strings may contain any
-printable ASCII characters except space (i.e., the byte range 32 < x <
-127), and are typically of the form "package/version" (e.g.,
-"git/1.8.3.1"). The agent strings are purely informative for statistics
-and debugging purposes, and MUST NOT be used to programmatically assume
-the presence or absence of particular features.
+printable ASCII characters (i.e., the byte range 31 < x < 127), and are
+typically of the form "package/version os" (e.g., "git/1.8.3.1 Linux")
+where `os` is the operating system name (e.g., "Linux"). `X` and `Y` can
+be configured using the GIT_USER_AGENT environment variable and it takes
+priority. The `os` is retrieved using the 'sysname' field of the `uname(2)`
+system call or its equivalent. The agent strings are purely informative for
+statistics and debugging purposes, and MUST NOT be used to programmatically
+assume the presence or absence of particular features.
 
 ls-refs
 ~~~~~~~
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 4c24a188b9..4f0b053c4a 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -8,13 +8,19 @@  export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 . ./test-lib.sh
 
 test_expect_success 'setup to generate files with expected content' '
-	printf "agent=git/%s\n" "$(git version | cut -d" " -f3)" >agent_capability &&
+	printf "agent=git/%s" "$(git version | cut -d" " -f3)" >agent_capability &&
 
 	test_oid_cache <<-EOF &&
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
 
+	if test_have_prereq WINDOWS
+	then
+		printf "agent=FAKE\n" >agent_capability
+	else
+		printf " %s\n" $(uname -s | test_redact_non_printables) >>agent_capability
+	fi &&
 	cat >expect.base <<-EOF &&
 	version 2
 	$(cat agent_capability)
@@ -31,6 +37,10 @@  test_expect_success 'setup to generate files with expected content' '
 test_expect_success 'test capability advertisement' '
 	cat expect.base expect.trailer >expect &&
 
+	if test_have_prereq WINDOWS
+	then
+		GIT_USER_AGENT=FAKE && export GIT_USER_AGENT
+	fi &&
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
 		--advertise-capabilities >out &&
 	test-tool pkt-line unpack <out >actual &&
@@ -361,6 +371,10 @@  test_expect_success 'test capability advertisement with uploadpack.advertiseBund
 	    expect.extra \
 	    expect.trailer >expect &&
 
+	if test_have_prereq WINDOWS
+	then
+		GIT_USER_AGENT=FAKE && export GIT_USER_AGENT
+	fi &&
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
 		--advertise-capabilities >out &&
 	test-tool pkt-line unpack <out >actual &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78e054ab50..3465904323 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -2007,3 +2007,11 @@  test_trailing_hash () {
 		test-tool hexdump |
 		sed "s/ //g"
 }
+
+# Trim and replace each character with ascii code below 32 or above
+# 127 (included) using a dot '.' character.
+# Octal intervals \001-\040 and \177-\377
+# correspond to decimal intervals 1-32 and 127-255
+test_redact_non_printables () {
+    tr -d "\n\r" | tr "[\001-\040][\177-\377]" "."
+}
diff --git a/version.c b/version.c
index d95221a72a..027ebc82b4 100644
--- a/version.c
+++ b/version.c
@@ -1,8 +1,9 @@ 
+#define USE_THE_REPOSITORY_VARIABLE
+
 #include "git-compat-util.h"
 #include "version.h"
 #include "version-def.h"
 #include "strbuf.h"
-#include "sane-ctype.h"
 #include "gettext.h"
 
 const char git_version_string[] = GIT_VERSION;
@@ -34,6 +35,27 @@  const char *git_user_agent(void)
 	return agent;
 }
 
+/*
+  Retrieve, sanitize and cache operating system info for subsequent
+  calls. Return a pointer to the sanitized operating system info
+  string.
+*/
+static const char *os_info(void)
+{
+	static const char *os = NULL;
+
+	if (!os) {
+		struct strbuf buf = STRBUF_INIT;
+
+		get_uname_info(&buf, 0);
+		/* Sanitize the os information immediately */
+		redact_non_printables(&buf);
+		os = strbuf_detach(&buf, NULL);
+	}
+
+	return os;
+}
+
 const char *git_user_agent_sanitized(void)
 {
 	static const char *agent = NULL;
@@ -43,6 +65,11 @@  const char *git_user_agent_sanitized(void)
 
 		strbuf_addstr(&buf, git_user_agent());
 		redact_non_printables(&buf);
+
+		if (!getenv("GIT_USER_AGENT")) {
+			strbuf_addch(&buf, ' ');
+			strbuf_addstr(&buf, os_info());
+		}
 		agent = strbuf_detach(&buf, NULL);
 	}
 
diff --git a/version.h b/version.h
index 5eb586c0bd..bbde6d371a 100644
--- a/version.h
+++ b/version.h
@@ -1,6 +1,8 @@ 
 #ifndef VERSION_H
 #define VERSION_H
 
+struct repository;
+
 extern const char git_version_string[];
 extern const char git_built_from_commit_string[];
 
@@ -14,4 +16,5 @@  const char *git_user_agent_sanitized(void);
 */
 int get_uname_info(struct strbuf *buf, unsigned int full);
 
+
 #endif /* VERSION_H */