diff mbox series

[1/4] version: refactor redact_non_printables()

Message ID 20250106103713.1452035-2-usmanakinyemi202@gmail.com (mailing list archive)
State New
Headers show
Series Introduce os-version Capability with Configurable Options | expand

Commit Message

Usman Akinyemi Jan. 6, 2025, 10:30 a.m. UTC
The git_user_agent_sanitized() function performs some sanitizing to
avoid special characters being sent over the line and possibly messing
up with the protocol or with the parsing on the other side.

Let's extract this sanitizing into a new redact_non_printables() function,
as we will want to reuse it in a following patch.

For now the new redact_non_printables() function is still static as
it's only needed locally.

While at it, let's also make a few small improvements:
  - use 'size_t' for 'i' instead of 'int',
  - move the declaration of 'i' inside the 'for ( ... )',
  - use strbuf_detach() to explicitly detach the string contained by
    the 'buf' strbuf.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 version.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

Comments

Eric Sunshine Jan. 6, 2025, 10:35 p.m. UTC | #1
On Mon, Jan 6, 2025 at 5:37 AM Usman Akinyemi
<usmanakinyemi202@gmail.com> wrote:
> The git_user_agent_sanitized() function performs some sanitizing to
> avoid special characters being sent over the line and possibly messing
> up with the protocol or with the parsing on the other side.
>
> Let's extract this sanitizing into a new redact_non_printables() function,
> as we will want to reuse it in a following patch.
>
> For now the new redact_non_printables() function is still static as
> it's only needed locally.
>
> While at it, let's also make a few small improvements:
>   - use 'size_t' for 'i' instead of 'int',
>   - move the declaration of 'i' inside the 'for ( ... )',

Regarding the above two items...

>   - use strbuf_detach() to explicitly detach the string contained by
>     the 'buf' strbuf.
>
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> ---
> diff --git a/version.c b/version.c
> @@ -6,6 +6,20 @@
> +static void redact_non_printables(struct strbuf *buf)
> +{
> +       strbuf_trim(buf);
> +       for (size_t i = 0; i < buf->len; i++) {
> +               if (buf->buf[i] <= 32 || buf->buf[i] >= 127)
> +                       buf->buf[i] = '.';
> +       }
> +}
> @@ -27,12 +41,8 @@ const char *git_user_agent_sanitized(void)
>                 strbuf_addstr(&buf, git_user_agent());
> -               strbuf_trim(&buf);
> -               for (size_t i = 0; i < buf.len; i++) {

... the original code appears to have already been using `size_t` and
declaring the loop variable inside the `for` statement, despite what
the commit message says. So, is the commit message out of date? Or are
the patches out of order? Or something else?

> -                       if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
> -                               buf.buf[i] = '.';
> -               }
> -               agent = buf.buf;
> +               redact_non_printables(&buf);
> +               agent = strbuf_detach(&buf, NULL);
Usman Akinyemi Jan. 8, 2025, 12:58 p.m. UTC | #2
On Tue, Jan 7, 2025 at 4:05 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Mon, Jan 6, 2025 at 5:37 AM Usman Akinyemi
> <usmanakinyemi202@gmail.com> wrote:
> > The git_user_agent_sanitized() function performs some sanitizing to
> > avoid special characters being sent over the line and possibly messing
> > up with the protocol or with the parsing on the other side.
> >
> > Let's extract this sanitizing into a new redact_non_printables() function,
> > as we will want to reuse it in a following patch.
> >
> > For now the new redact_non_printables() function is still static as
> > it's only needed locally.
> >
> > While at it, let's also make a few small improvements:
> >   - use 'size_t' for 'i' instead of 'int',
> >   - move the declaration of 'i' inside the 'for ( ... )',
>
> Regarding the above two items...
>
> >   - use strbuf_detach() to explicitly detach the string contained by
> >     the 'buf' strbuf.
> >
> > Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> > Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> > ---
> > diff --git a/version.c b/version.c
> > @@ -6,6 +6,20 @@
> > +static void redact_non_printables(struct strbuf *buf)
> > +{
> > +       strbuf_trim(buf);
> > +       for (size_t i = 0; i < buf->len; i++) {
> > +               if (buf->buf[i] <= 32 || buf->buf[i] >= 127)
> > +                       buf->buf[i] = '.';
> > +       }
> > +}
> > @@ -27,12 +41,8 @@ const char *git_user_agent_sanitized(void)
> >                 strbuf_addstr(&buf, git_user_agent());
> > -               strbuf_trim(&buf);
> > -               for (size_t i = 0; i < buf.len; i++) {
>
> ... the original code appears to have already been using `size_t` and
> declaring the loop variable inside the `for` statement, despite what
> the commit message says. So, is the commit message out of date? Or are
> the patches out of order? Or something else?
I just investigated what happened. Another commit already added it and
I did a rebase on top of the "master".
I did not notice it at all. The commit message is out of date.

I will update it in the next iteration.
Thank you very much.
Usman.
>
> > -                       if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
> > -                               buf.buf[i] = '.';
> > -               }
> > -               agent = buf.buf;
> > +               redact_non_printables(&buf);
> > +               agent = strbuf_detach(&buf, NULL);
diff mbox series

Patch

diff --git a/version.c b/version.c
index 4d763ab48d..78f025c808 100644
--- a/version.c
+++ b/version.c
@@ -6,6 +6,20 @@ 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
 
+/*
+ * Trim and replace each character with ascii code below 32 or above
+ * 127 (included) using a dot '.' character.
+ * TODO: ensure consecutive non-printable characters are only replaced once
+*/
+static void redact_non_printables(struct strbuf *buf)
+{
+	strbuf_trim(buf);
+	for (size_t i = 0; i < buf->len; i++) {
+		if (buf->buf[i] <= 32 || buf->buf[i] >= 127)
+			buf->buf[i] = '.';
+	}
+}
+
 const char *git_user_agent(void)
 {
 	static const char *agent = NULL;
@@ -27,12 +41,8 @@  const char *git_user_agent_sanitized(void)
 		struct strbuf buf = STRBUF_INIT;
 
 		strbuf_addstr(&buf, git_user_agent());
-		strbuf_trim(&buf);
-		for (size_t i = 0; i < buf.len; i++) {
-			if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
-				buf.buf[i] = '.';
-		}
-		agent = buf.buf;
+		redact_non_printables(&buf);
+		agent = strbuf_detach(&buf, NULL);
 	}
 
 	return agent;