Message ID | 20250214123734.1403120-7-usmanakinyemi202@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | extend agent capability to include OS name | expand |
Usman Akinyemi <usmanakinyemi202@gmail.com> writes: > As some issues that can happen with a Git client can be operating system > specific, it can be useful for a server to know which OS a client is > using. In the same way it can be useful for a client to know which OS > a server is using. > > Our current agent capability is in the form of "package/version" (e.g., > "git/1.8.3.1"). Let's extend it to include the operating system name (os) > i.e in the form "package/version os" (e.g., "git/1.8.3.1 Linux"). Shouldn't this be "git/1.8.3.1-Linux" or something to avoid SP? The capability list in protocol v1 is on a single line that is whitespace separated (cf. connect.c:parse_feature_value()) without any escape mechanism. Side note. Does it pose a security hole, when we can set agent to any value? I do not think so, as it controls what this end sends to the other. If you are attacker in control of your own agent string to be sent to the other end, and use a string with a whitespace in it after "agent=" to claim that you support a capability you actually don't, that is not a new way to attack the other side available to you---you can write your own Git client to talk to the other side to send such a bogus capablity list anyway. > diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt > index 1652fef3ae..f4831a8787 100644 > --- a/Documentation/gitprotocol-v2.txt > +++ b/Documentation/gitprotocol-v2.txt > @@ -184,11 +184,14 @@ form `agent=X`) to notify the client that the server is running version > the `agent` capability with a value `Y` (in the form `agent=Y`) in its > request to the server (but it MUST NOT do so if the server did not > advertise the agent capability). The `X` and `Y` strings may contain any > -printable ASCII characters except space (i.e., the byte range 32 < x < > -127), and are typically of the form "package/version" (e.g., > -"git/1.8.3.1"). The agent strings are purely informative for statistics > -and debugging purposes, and MUST NOT be used to programmatically assume > -the presence or absence of particular features. > +printable ASCII characters (i.e., the byte range 31 < x < 127), and are Patches 1 & 2 redacted non-printables and SP separately, because SP is considered printable. With this change you are allowing SP to be passed without getting redacted? I do not think it is a good idea (see above). While I'd prefer to keep the range the same as before, i.e. "any printable ASCII characters except space", "33 <= x <= 126" may be more readily recognisable that we are doing something unusual, as "32 <= x <= 126" is fairly easily recognisable as "ASCII printable". > +typically of the form "package/version os" (e.g., "git/1.8.3.1 Linux") So, I'd suggest using something other than " " between "version" and "os". Dot (as if the byte there were redacted) or slash or dash or whatever, anything that is not whitespace. > +where `os` is the operating system name (e.g., "Linux"). `X` and `Y` can > +be configured using the GIT_USER_AGENT environment variable and it takes > +priority. The `os` is retrieved using the 'sysname' field of the `uname(2)` > +system call or its equivalent. The agent strings are purely informative for > +statistics and debugging purposes, and MUST NOT be used to programmatically > +assume the presence or absence of particular features. Other than these nits, I find the above very well done. As to the additional implementation of git_user_agent_sanitized(), except for that same "do we really want SP there?" question, I see nothing questionable there, either. Overall very nicely done and presented. Thanks.
On Sat, Feb 15, 2025 at 3:37 AM Junio C Hamano <gitster@pobox.com> wrote: > > Usman Akinyemi <usmanakinyemi202@gmail.com> writes: Hi Junio, > > > As some issues that can happen with a Git client can be operating system > > specific, it can be useful for a server to know which OS a client is > > using. In the same way it can be useful for a client to know which OS > > a server is using. > > > > Our current agent capability is in the form of "package/version" (e.g., > > "git/1.8.3.1"). Let's extend it to include the operating system name (os) > > i.e in the form "package/version os" (e.g., "git/1.8.3.1 Linux"). > > Shouldn't this be "git/1.8.3.1-Linux" or something to avoid SP? The > capability list in protocol v1 is on a single line that is whitespace > separated (cf. connect.c:parse_feature_value()) without any escape > mechanism. Yeah, I almost missed this function. Thanks for pointing it out. > > Side note. Does it pose a security hole, when we can set > agent to any value? I do not think so, as it controls what > this end sends to the other. If you are attacker in control > of your own agent string to be sent to the other end, and > use a string with a whitespace in it after "agent=" to claim > that you support a capability you actually don't, that is > not a new way to attack the other side available to you---you > can write your own Git client to talk to the other side to > send such a bogus capablity list anyway. Thanks for this explanation. > > > diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt > > index 1652fef3ae..f4831a8787 100644 > > --- a/Documentation/gitprotocol-v2.txt > > +++ b/Documentation/gitprotocol-v2.txt > > @@ -184,11 +184,14 @@ form `agent=X`) to notify the client that the server is running version > > the `agent` capability with a value `Y` (in the form `agent=Y`) in its > > request to the server (but it MUST NOT do so if the server did not > > advertise the agent capability). The `X` and `Y` strings may contain any > > -printable ASCII characters except space (i.e., the byte range 32 < x < > > -127), and are typically of the form "package/version" (e.g., > > -"git/1.8.3.1"). The agent strings are purely informative for statistics > > -and debugging purposes, and MUST NOT be used to programmatically assume > > -the presence or absence of particular features. > > +printable ASCII characters (i.e., the byte range 31 < x < 127), and are > > Patches 1 & 2 redacted non-printables and SP separately, because SP > is considered printable. With this change you are allowing SP to be > passed without getting redacted? I do not think it is a good idea > (see above). > > While I'd prefer to keep the range the same as before, i.e. "any > printable ASCII characters except space", "33 <= x <= 126" may be > more readily recognisable that we are doing something unusual, as > "32 <= x <= 126" is fairly easily recognisable as "ASCII printable". > > > +typically of the form "package/version os" (e.g., "git/1.8.3.1 Linux") > > So, I'd suggest using something other than " " between "version" and > "os". Dot (as if the byte there were redacted) or slash or dash or > whatever, anything that is not whitespace. Yeah, Noted. Thanks. > > > +where `os` is the operating system name (e.g., "Linux"). `X` and `Y` can > > +be configured using the GIT_USER_AGENT environment variable and it takes > > +priority. The `os` is retrieved using the 'sysname' field of the `uname(2)` > > +system call or its equivalent. The agent strings are purely informative for > > +statistics and debugging purposes, and MUST NOT be used to programmatically > > +assume the presence or absence of particular features. > > Other than these nits, I find the above very well done. > > As to the additional implementation of git_user_agent_sanitized(), > except for that same "do we really want SP there?" question, I see > nothing questionable there, either. > > Overall very nicely done and presented. Thank you. > > Thanks.
diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt index 1652fef3ae..f4831a8787 100644 --- a/Documentation/gitprotocol-v2.txt +++ b/Documentation/gitprotocol-v2.txt @@ -184,11 +184,14 @@ form `agent=X`) to notify the client that the server is running version the `agent` capability with a value `Y` (in the form `agent=Y`) in its request to the server (but it MUST NOT do so if the server did not advertise the agent capability). The `X` and `Y` strings may contain any -printable ASCII characters except space (i.e., the byte range 32 < x < -127), and are typically of the form "package/version" (e.g., -"git/1.8.3.1"). The agent strings are purely informative for statistics -and debugging purposes, and MUST NOT be used to programmatically assume -the presence or absence of particular features. +printable ASCII characters (i.e., the byte range 31 < x < 127), and are +typically of the form "package/version os" (e.g., "git/1.8.3.1 Linux") +where `os` is the operating system name (e.g., "Linux"). `X` and `Y` can +be configured using the GIT_USER_AGENT environment variable and it takes +priority. The `os` is retrieved using the 'sysname' field of the `uname(2)` +system call or its equivalent. The agent strings are purely informative for +statistics and debugging purposes, and MUST NOT be used to programmatically +assume the presence or absence of particular features. ls-refs ~~~~~~~ diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh index 4c24a188b9..4f0b053c4a 100755 --- a/t/t5701-git-serve.sh +++ b/t/t5701-git-serve.sh @@ -8,13 +8,19 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME . ./test-lib.sh test_expect_success 'setup to generate files with expected content' ' - printf "agent=git/%s\n" "$(git version | cut -d" " -f3)" >agent_capability && + printf "agent=git/%s" "$(git version | cut -d" " -f3)" >agent_capability && test_oid_cache <<-EOF && wrong_algo sha1:sha256 wrong_algo sha256:sha1 EOF + if test_have_prereq WINDOWS + then + printf "agent=FAKE\n" >agent_capability + else + printf " %s\n" $(uname -s | test_redact_non_printables) >>agent_capability + fi && cat >expect.base <<-EOF && version 2 $(cat agent_capability) @@ -31,6 +37,10 @@ test_expect_success 'setup to generate files with expected content' ' test_expect_success 'test capability advertisement' ' cat expect.base expect.trailer >expect && + if test_have_prereq WINDOWS + then + GIT_USER_AGENT=FAKE && export GIT_USER_AGENT + fi && GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \ --advertise-capabilities >out && test-tool pkt-line unpack <out >actual && @@ -361,6 +371,10 @@ test_expect_success 'test capability advertisement with uploadpack.advertiseBund expect.extra \ expect.trailer >expect && + if test_have_prereq WINDOWS + then + GIT_USER_AGENT=FAKE && export GIT_USER_AGENT + fi && GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \ --advertise-capabilities >out && test-tool pkt-line unpack <out >actual && diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 78e054ab50..3465904323 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -2007,3 +2007,11 @@ test_trailing_hash () { test-tool hexdump | sed "s/ //g" } + +# Trim and replace each character with ascii code below 32 or above +# 127 (included) using a dot '.' character. +# Octal intervals \001-\040 and \177-\377 +# correspond to decimal intervals 1-32 and 127-255 +test_redact_non_printables () { + tr -d "\n\r" | tr "[\001-\040][\177-\377]" "." +} diff --git a/version.c b/version.c index d95221a72a..027ebc82b4 100644 --- a/version.c +++ b/version.c @@ -1,8 +1,9 @@ +#define USE_THE_REPOSITORY_VARIABLE + #include "git-compat-util.h" #include "version.h" #include "version-def.h" #include "strbuf.h" -#include "sane-ctype.h" #include "gettext.h" const char git_version_string[] = GIT_VERSION; @@ -34,6 +35,27 @@ const char *git_user_agent(void) return agent; } +/* + Retrieve, sanitize and cache operating system info for subsequent + calls. Return a pointer to the sanitized operating system info + string. +*/ +static const char *os_info(void) +{ + static const char *os = NULL; + + if (!os) { + struct strbuf buf = STRBUF_INIT; + + get_uname_info(&buf, 0); + /* Sanitize the os information immediately */ + redact_non_printables(&buf); + os = strbuf_detach(&buf, NULL); + } + + return os; +} + const char *git_user_agent_sanitized(void) { static const char *agent = NULL; @@ -43,6 +65,11 @@ const char *git_user_agent_sanitized(void) strbuf_addstr(&buf, git_user_agent()); redact_non_printables(&buf); + + if (!getenv("GIT_USER_AGENT")) { + strbuf_addch(&buf, ' '); + strbuf_addstr(&buf, os_info()); + } agent = strbuf_detach(&buf, NULL); } diff --git a/version.h b/version.h index 5eb586c0bd..bbde6d371a 100644 --- a/version.h +++ b/version.h @@ -1,6 +1,8 @@ #ifndef VERSION_H #define VERSION_H +struct repository; + extern const char git_version_string[]; extern const char git_built_from_commit_string[]; @@ -14,4 +16,5 @@ const char *git_user_agent_sanitized(void); */ int get_uname_info(struct strbuf *buf, unsigned int full); + #endif /* VERSION_H */
As some issues that can happen with a Git client can be operating system specific, it can be useful for a server to know which OS a client is using. In the same way it can be useful for a client to know which OS a server is using. Our current agent capability is in the form of "package/version" (e.g., "git/1.8.3.1"). Let's extend it to include the operating system name (os) i.e in the form "package/version os" (e.g., "git/1.8.3.1 Linux"). Including OS details in the agent capability simplifies implementation, maintains backward compatibility, avoids introducing a new capability, encourages adoption across Git-compatible software, and enhances debugging by providing complete environment information without affecting functionality. The operating system name is retrieved using the 'sysname' field of the `uname(2)` system call or its equivalent. However, there are differences between `uname(1)` (command-line utility) and `uname(2)` (system call) outputs on Windows. These discrepancies complicate testing on Windows platforms. For example: - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\ .2024-02-14.20:17.UTC.x86_64 - `uname(2)` output: Windows.10.0.20348 On Windows, uname(2) is not actually system-supplied but is instead already faked up by Git itself. We could have overcome the test issue on Windows by implementing a new `uname` subcommand in `test-tool` using uname(2), but except uname(2), which would be tested against itself, there would be nothing platform specific, so it's just simpler to disable the tests on Windows. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> --- Documentation/gitprotocol-v2.txt | 13 ++++++++----- t/t5701-git-serve.sh | 16 +++++++++++++++- t/test-lib-functions.sh | 8 ++++++++ version.c | 29 ++++++++++++++++++++++++++++- version.h | 3 +++ 5 files changed, 62 insertions(+), 7 deletions(-)