[v2,5/7] quote: add sq_quote_argv_pretty_ltrim

Message ID 5059776248b6686faaff37c97aa63d0212579cd8.1565273938.git.gitgitgadget@gmail.com
State: New
Series: trace2: clean up formatting in perf target format

Commit Message

Jeff Hostetler via GitGitGadget Aug. 8, 2019, 2:19 p.m. UTC
From: Jeff Hostetler <jeffhost@microsoft.com>

Create version of sq_quote_argv_pretty() that does not
insert a leading space before argv[0].

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 quote.c | 11 +++++++++++
 quote.h |  1 +
 2 files changed, 12 insertions(+)

Comments

Junio C Hamano Aug. 8, 2019, 6:05 p.m. UTC | #1
"Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Create version of sq_quote_argv_pretty() that does not
> insert a leading space before argv[0].
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  quote.c | 11 +++++++++++
>  quote.h |  1 +
>  2 files changed, 12 insertions(+)

I am OK with the basic idea, but I am somewhat unhappy about this
particular patch for two reasons:

 - If we were to keep this as a part of proper API in the longer
   term, the current sq_quote_argv_pretty() should be rewritten to
   use this to avoid repetition (e.g. as long as !!*argv, add a SP
   and then call this new thing);

 - something_ltrim() sounds as if you munge what is passed to you
   and chop off the left end, but that is not what this does.

Now, what is the right name for this new thing?  What does it do?

It looks to me that it appends each element of argv[], quoting it as
needed, and with SP in between.  So the right name for the family of
these functions should be around "append", which is the primary thing
they do, with "quoted" somewhere.

Having made the primary purpose of the helper clearer leads me to
wonder if "do not add SP before the first element, i.e. argv[0]", is
really what we want.  If we always clear the *dst strbuf before
starting to serialize argv[] into it, then the behaviour would make
sense, but we do not---we are "appending".

As long as we are appending, would we be better off doing something
sillily magical like this instead, I have to wonder?

	void sq_append_strings_quoted(struct strbuf *buf, const char **av)
	{
		int i;

		for (i = 0; av[i]; i++) {
			if (buf->len)
				strbuf_addch(buf, ' ');
			sq_quote_buf_pretty(buf, av[i]);
		}
	}

That is, "if we are appending to an existing string, have SP to
separate the first element from that existing string; treat the
remaining elements the same way (if the buffer is empty, there is no
point adding SP at the beginning)".
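
To make the proposed behaviour concrete, here is a minimal, self-contained sketch. It assumes the loop is meant to quote av[i] on each iteration. `struct buf`, `buf_addstr()` and `quote_pretty()` are hypothetical stand-ins for git's strbuf and sq_quote_buf_pretty(), not the real implementations, and the quoting rule is deliberately simplified.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size stand-in for git's struct strbuf. */
struct buf {
	char s[256];
	size_t len;
};

static void buf_addstr(struct buf *b, const char *src)
{
	size_t n = strlen(src);

	if (b->len + n >= sizeof(b->s))
		abort(); /* this sketch does not grow the buffer */
	memcpy(b->s + b->len, src, n + 1); /* copies the NUL as well */
	b->len += n;
}

/* Simplified quoting (assumption): quote only strings containing a space. */
static void quote_pretty(struct buf *b, const char *src)
{
	int needs_quote = strchr(src, ' ') != NULL;

	if (needs_quote)
		buf_addstr(b, "'");
	buf_addstr(b, src);
	if (needs_quote)
		buf_addstr(b, "'");
}

/* The proposed loop: add SP whenever the buffer already has content. */
static void append_strings_quoted(struct buf *b, const char **av)
{
	int i;

	for (i = 0; av[i]; i++) {
		if (b->len)
			buf_addstr(b, " ");
		quote_pretty(b, av[i]);
	}
}
```

With an empty buffer and { "git", "log", "two words", NULL } this yields "git log 'two words'". If the buffer already holds "argv:[", the result is "argv:[ git log 'two words'", with a SP after the bracket.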

I may have found a long-standing bug in sq_quote_buf_pretty(), by
the way.  What does it produce when *src is an empty string of
length 0?  It does not add anything to dst, but shouldn't we be
adding two single-quotes (i.e. an empty string inside sq pair)?
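
The empty-string case can be illustrated with a small sketch. `quote_pretty()` below only models the behaviour described above (copy verbatim when nothing needs quoting), with an assumed set of safe characters and no escaping of embedded quotes; `quote_pretty_fixed()` is one possible fix, not necessarily the one that will be applied. All names are hypothetical.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size stand-in for git's struct strbuf. */
struct buf {
	char s[256];
	size_t len;
};

static void buf_addstr(struct buf *b, const char *src)
{
	size_t n = strlen(src);

	if (b->len + n >= sizeof(b->s))
		abort(); /* this sketch does not grow the buffer */
	memcpy(b->s + b->len, src, n + 1);
	b->len += n;
}

/* Models the described behaviour: "" falls through and adds nothing. */
static void quote_pretty(struct buf *b, const char *src)
{
	const char *p;

	for (p = src; *p; p++) {
		if (!isalnum((unsigned char)*p) && !strchr("+,-./:=@_^", *p)) {
			buf_addstr(b, "'");
			buf_addstr(b, src); /* embedded quotes not escaped here */
			buf_addstr(b, "'");
			return;
		}
	}
	buf_addstr(b, src); /* no quoting needed */
}

/* Possible fix: represent the empty string as a pair of single quotes. */
static void quote_pretty_fixed(struct buf *b, const char *src)
{
	if (!*src) {
		buf_addstr(b, "''");
		return;
	}
	quote_pretty(b, src);
}

/* Join args with single spaces, using the given quoting function. */
static void join_quoted(struct buf *b, const char **args,
			void (*quote)(struct buf *, const char *))
{
	int i;

	for (i = 0; args[i]; i++) {
		if (i > 0)
			buf_addstr(b, " ");
		quote(b, args[i]);
	}
}
```

For { "commit", "-m", "", NULL } the first variant produces "commit -m " (the empty argument vanishes), while the second produces "commit -m ''".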

> diff --git a/quote.c b/quote.c
> index 7f2aa6faa4..7cad8798ac 100644
> --- a/quote.c
> +++ b/quote.c
> @@ -94,6 +94,17 @@ void sq_quote_argv_pretty(struct strbuf *dst, const char **argv)
>  	}
>  }
>  
> +void sq_quote_argv_pretty_ltrim(struct strbuf *dst, const char **argv)
> +{
> +	int i;
> +
> +	for (i = 0; argv[i]; i++) {
> +		if (i > 0)
> +			strbuf_addch(dst, ' ');
> +		sq_quote_buf_pretty(dst, argv[i]);
> +	}
> +}
> +
>  static char *sq_dequote_step(char *arg, char **next)
>  {
>  	char *dst = arg;
> diff --git a/quote.h b/quote.h
> index fb08dc085c..3b3d041a61 100644
> --- a/quote.h
> +++ b/quote.h
> @@ -40,6 +40,7 @@ void sq_quotef(struct strbuf *, const char *fmt, ...);
>   */
>  void sq_quote_buf_pretty(struct strbuf *, const char *src);
>  void sq_quote_argv_pretty(struct strbuf *, const char **argv);
> +void sq_quote_argv_pretty_ltrim(struct strbuf *, const char **argv);
>  
>  /* This unwraps what sq_quote() produces in place, but returns
>   * NULL if the input does not look like what sq_quote would have
Jeff Hostetler Aug. 8, 2019, 7:04 p.m. UTC | #2
On 8/8/2019 2:05 PM, Junio C Hamano wrote:
> "Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Create version of sq_quote_argv_pretty() that does not
>> insert a leading space before argv[0].
>>
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> ---
>>   quote.c | 11 +++++++++++
>>   quote.h |  1 +
>>   2 files changed, 12 insertions(+)
> 
> I am OK with the basic idea, but I am somewhat unhappy about this
> particular patch for two reasons:
> 
>   - If we were to keep this as a part of proper API in the longer
>     term, the current sq_quote_argv_pretty() should be rewritten to
>     use this to avoid repetition (e.g. as long as !!*argv, add a SP
>     and then call this new thing);
> 
>   - something_ltrim() sounds as if you munge what is passed to you
>     and chop off the left end, but that is not what this does.
> 
> Now, what is the right name for this new thing?  What does it do?

I struggled with the proper name for this.  I even thought about
adding a 3rd arg to the current function to indicate whether to have
the leading SP before argv[0], but wasn't sure if that would be too
disruptive.

> 
> It looks to me that it appends each element of argv[], quoting it as
> needed, and with SP in between.  So the right name for the family of
> these functions should be around "append", which is the primary thing
> they do, with "quoted" somewhere.
> 
> Having made the primary purpose of the helper clearer leads me to
> wonder if "do not add SP before the first element, i.e. argv[0]", is
> really what we want.  If we always clear the *dst strbuf before
> starting to serialize argv[] into it, then the behaviour would make
> sense, but we do not---we are "appending".
> 
> As long as we are appending, would we be better off doing something
> sillily magical like this instead, I have to wonder?
> 
> 	void sq_append_strings_quoted(struct strbuf *buf, const char **av)
> 	{
> 		int i;
> 
> 		for (i = 0; av[i]; i++) {
> 			if (buf->len)
> 				strbuf_addch(buf, ' ');
> 			sq_quote_buf_pretty(buf, av[i]);
> 		}
> 	}
> 
> That is, "if we are appending to an existing string, have SP to
> separate the first element from that existing string; treat the
> remaining elements the same way (if the buffer is empty, there is no
> point adding SP at the beginning)".

I don't think that would do what we want.  We don't know what the
caller's expectations are.  In my uses in commits 6/7 and 7/7 I
already added the leading chars I wanted in the strbuf before calling
sq_quote_argv_pretty_ltrim() and assumed the output would be a true
append.  For example:

+	strbuf_addf(&buf_payload, "alias:%s argv:[", alias);
+	sq_quote_argv_pretty_ltrim(&buf_payload, argv);
+	strbuf_addch(&buf_payload, ']');

I like your suggestion of putting my new function in the _append_
category.  I think I'll add the 3rd arg to this and then it will
be completely specified and I can get rid of the _ltrim suffix.

I'll re-roll this.

> 
> I may have found a long-standing bug in sq_quote_buf_pretty(), by
> the way.  What does it produce when *src is an empty string of
> length 0?  It does not add anything to dst, but shouldn't we be
> adding two single-quotes (i.e. an empty string inside sq pair)?

I would think so.  I did a quick grep and most of the calls looked
guarded, so I don't think this is urgent.  I'll address this in a
separate commit shortly.

Thanks
Jeff


> 
>> diff --git a/quote.c b/quote.c
>> index 7f2aa6faa4..7cad8798ac 100644
>> --- a/quote.c
>> +++ b/quote.c
>> @@ -94,6 +94,17 @@ void sq_quote_argv_pretty(struct strbuf *dst, const char **argv)
>>   	}
>>   }
>>   
>> +void sq_quote_argv_pretty_ltrim(struct strbuf *dst, const char **argv)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; argv[i]; i++) {
>> +		if (i > 0)
>> +			strbuf_addch(dst, ' ');
>> +		sq_quote_buf_pretty(dst, argv[i]);
>> +	}
>> +}
>> +
>>   static char *sq_dequote_step(char *arg, char **next)
>>   {
>>   	char *dst = arg;
>> diff --git a/quote.h b/quote.h
>> index fb08dc085c..3b3d041a61 100644
>> --- a/quote.h
>> +++ b/quote.h
>> @@ -40,6 +40,7 @@ void sq_quotef(struct strbuf *, const char *fmt, ...);
>>    */
>>   void sq_quote_buf_pretty(struct strbuf *, const char *src);
>>   void sq_quote_argv_pretty(struct strbuf *, const char **argv);
>> +void sq_quote_argv_pretty_ltrim(struct strbuf *, const char **argv);
>>   
>>   /* This unwraps what sq_quote() produces in place, but returns
>>    * NULL if the input does not look like what sq_quote would have
Junio C Hamano Aug. 8, 2019, 8:01 p.m. UTC | #3
Jeff Hostetler <git@jeffhostetler.com> writes:

>> That is, "if we are appending to an existing string, have SP to
>> separate the first element from that existing string; treat the
>> remaining elements the same way (if the buffer is empty, there is no
>> point adding SP at the beginning)".
>
> I don't think that would do what we want.

I know that there are current callers of quote_argv_pretty that
either (1) expect that they will always get the leading SP for free,
or (2) have to work around the unwanted SP (basically, the places you
changed from quote_argv_pretty to quote_argv_pretty_ltrim in your
series).  But I wanted to see if we can come up with a single helper
whose behaviour is easy to explain and understand, and that both
existing and new callers can adopt.  I also wanted to see whether the
resulting codebase would become easy to understand and maintain
overall, and whether that would give us the ideal longer-term
direction.

>> I may have found a long-standing bug in sq_quote_buf_pretty(), by
>> the way.  What does it produce when *src is an empty string of
>> length 0?  It does not add anything to dst, but shouldn't we be
>> adding two single-quotes (i.e. an empty string inside sq pair)?
>
> I would think so.  I did a quick grep and most of the calls looked
> guarded, so I don't think this is urgent.  I'll address this in a
> separate commit shortly.

Thanks.
René Scharfe Aug. 8, 2019, 10:49 p.m. UTC | #4
Am 08.08.19 um 21:04 schrieb Jeff Hostetler:
> On 8/8/2019 2:05 PM, Junio C Hamano wrote:
>> Having made the primary purpose of the helper clearer leads me to
>> wonder if "do not add SP before the first element, i.e. argv[0]", is
>> really what we want.  If we always clear the *dst strbuf before
>> starting to serialize argv[] into it, then the behaviour would make
>> sense, but we do not---we are "appending".
>>
>> As long as we are appending, would we be better off doing something
>> sillily magical like this instead, I have to wonder?
>>
>>     void sq_append_strings_quoted(struct strbuf *buf, const char **av)
>>     {
>>         int i;
>>
>>         for (i = 0; av[i]; i++) {
>>             if (buf->len)
>>                 strbuf_addch(buf, ' ');
>>             sq_quote_buf_pretty(buf, av[i]);
>>         }
>>     }
>>
>> That is, "if we are appending to an existing string, have SP to
>> separate the first element from that existing string; treat the
>> remaining elements the same way (if the buffer is empty, there is no
>> point adding SP at the beginning)".
>
> I don't think that would do what we want.  We don't know what the
> caller's expectations are.  In my uses in commits 6/7 and 7/7 I
> already added the leading chars I wanted in the strbuf before calling
> sq_quote_argv_pretty_ltrim() and assumed the output would be a true
> append.  For example:
>
> +    strbuf_addf(&buf_payload, "alias:%s argv:[", alias);
> +    sq_quote_argv_pretty_ltrim(&buf_payload, argv);
> +    strbuf_addch(&buf_payload, ']');
>
> I like your suggestion of putting my new function in the _append_
> category.  I think I'll add the 3rd arg to this and then it will
> be completely specified and I can get rid of the _ltrim suffix.

Two observations:

If callers want to add something before a joined delimited list, they
already can with a strbuf_add* call.  No need to add that feature to
a function that joins lists.

And repetitions of repetitions (loops) are boring.

Apologies in advance for any coffee stains on your monitor, but
here's how I would start, probably followed by attempts to inline the
functions that become trivial wrappers:

---
 quote.c  | 18 ++++--------------
 strbuf.c | 20 +++++++++++++-------
 strbuf.h |  8 ++++++++
 3 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/quote.c b/quote.c
index 7f2aa6faa4..f422188852 100644
--- a/quote.c
+++ b/quote.c
@@ -74,24 +74,14 @@ void sq_quotef(struct strbuf *dst, const char *fmt, ...)

 void sq_quote_argv(struct strbuf *dst, const char **argv)
 {
-	int i;
-
-	/* Copy into destination buffer. */
-	strbuf_grow(dst, 255);
-	for (i = 0; argv[i]; ++i) {
-		strbuf_addch(dst, ' ');
-		sq_quote_buf(dst, argv[i]);
-	}
+	strbuf_addch(dst, ' ');
+	strbuf_map_join_argv(dst, argv, sq_quote_buf, " ");
 }

 void sq_quote_argv_pretty(struct strbuf *dst, const char **argv)
 {
-	int i;
-
-	for (i = 0; argv[i]; i++) {
-		strbuf_addch(dst, ' ');
-		sq_quote_buf_pretty(dst, argv[i]);
-	}
+	strbuf_addch(dst, ' ');
+	strbuf_map_join_argv(dst, argv, sq_quote_buf_pretty, " ");
 }

 static char *sq_dequote_step(char *arg, char **next)
diff --git a/strbuf.c b/strbuf.c
index d30f916858..d337853b53 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -304,17 +304,23 @@ void strbuf_addbuf(struct strbuf *sb, const struct strbuf *sb2)
 	strbuf_setlen(sb, sb->len + sb2->len);
 }

+void strbuf_map_join_argv(struct strbuf *sb, const char **argv,
+			  void (*fn)(struct strbuf *, const char *),
+			  const char *separator)
+{
+	while (*argv) {
+		fn(sb, *argv++);
+		if (*argv)
+			strbuf_addstr(sb, separator);
+	}
+}
+
 const char *strbuf_join_argv(struct strbuf *buf,
 			     int argc, const char **argv, char delim)
 {
-	if (!argc)
-		return buf->buf;
+	char separator[] = { delim, '\0' };

-	strbuf_addstr(buf, *argv);
-	while (--argc) {
-		strbuf_addch(buf, delim);
-		strbuf_addstr(buf, *(++argv));
-	}
+	strbuf_map_join_argv(buf, argv, strbuf_addstr, separator);

 	return buf->buf;
 }
diff --git a/strbuf.h b/strbuf.h
index f62278a0be..7adeff94a7 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -297,6 +297,14 @@ static inline void strbuf_addstr(struct strbuf *sb, const char *s)
  */
 void strbuf_addbuf(struct strbuf *sb, const struct strbuf *sb2);

+/**
+ * Apply `fn` to `sb` and each element of the NULL-terminated array
+ * `argv`. Add `separator` between these invocations.
+ */
+void strbuf_map_join_argv(struct strbuf *sb, const char **argv,
+			  void (*fn)(struct strbuf *, const char *),
+			  const char *separator);
+
 /**
  * Join the arguments into a buffer. `delim` is put between every
  * two arguments.
--
2.22.0
Jeff Hostetler Aug. 9, 2019, 5:13 p.m. UTC | #5
On 8/8/2019 6:49 PM, René Scharfe wrote:
> Am 08.08.19 um 21:04 schrieb Jeff Hostetler:
>> On 8/8/2019 2:05 PM, Junio C Hamano wrote:
>>> Having made the primary purpose of the helper clearer leads me to
>>> wonder if "do not add SP before the first element, i.e. argv[0]", is
>>> really what we want.  If we always clear the *dst strbuf before
>>> starting to serialize argv[] into it, then the behaviour would make
>>> sense, but we do not---we are "appending".
>>>
>>> As long as we are appending, would we be better off doing something
>>> sillily magical like this instead, I have to wonder?
>>>
>>>      void sq_append_strings_quoted(struct strbuf *buf, const char **av)
>>>      {
>>>          int i;
>>>
>>>          for (i = 0; av[i]; i++) {
>>>              if (buf->len)
>>>                  strbuf_addch(buf, ' ');
>>>              sq_quote_buf_pretty(buf, av[i]);
>>>          }
>>>      }
>>>
>>> That is, "if we are appending to an existing string, have SP to
>>> separate the first element from that existing string; treat the
>>> remaining elements the same way (if the buffer is empty, there is no
>>> point adding SP at the beginning)".
>>
>> I don't think that would do what we want.  We don't know what the
>> caller's expectations are.  In my uses in commits 6/7 and 7/7 I
>> already added the leading chars I wanted in the strbuf before calling
>> sq_quote_argv_pretty_ltrim() and assumed the output would be a true
>> append.  For example:
>>
>> +    strbuf_addf(&buf_payload, "alias:%s argv:[", alias);
>> +    sq_quote_argv_pretty_ltrim(&buf_payload, argv);
>> +    strbuf_addch(&buf_payload, ']');
>>
>> I like your suggestion of putting my new function in the _append_
>> category.  I think I'll add the 3rd arg to this and then it will
>> be completely specified and I can get rid of the _ltrim suffix.
> 
> Two observations:
> 
> If callers want to add something before a joined delimited list, they
> already can with a strbuf_add* call.  No need to add that feature to
> a function that joins lists.
> 
> And repetitions of repetitions (loops) are boring.
> 
> Apologies in advance for any coffee stains on your monitor, but
> here's how I would start, probably followed by attempts to inline the
> functions that become trivial wrappers:


Um, yeah, I must say that I didn't expect the conversation to turn to
map-style functions and a change in design styles.  I think it would be
better to have that conversation in a different patch series and not mix
it with my trace2 janitoring.

I'm going to push a V3 that does just the minimum to have a sq_ function
that joins the args with a space delimiter (and without the leading
space) and re-write the existing function to call it after adding the
legacy leading space.  This will let existing callers continue to work
as is.  And they can be converted if/when anyone wants to dig into them.
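
A rough sketch of that plan, under stated assumptions: the function names are placeholders rather than the names V3 will use, and `struct buf`, `buf_addstr()` and `quote_pretty()` are simplified stand-ins for strbuf and sq_quote_buf_pretty().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size stand-in for git's struct strbuf. */
struct buf {
	char s[256];
	size_t len;
};

static void buf_addstr(struct buf *b, const char *src)
{
	size_t n = strlen(src);

	if (b->len + n >= sizeof(b->s))
		abort(); /* this sketch does not grow the buffer */
	memcpy(b->s + b->len, src, n + 1);
	b->len += n;
}

/* Simplified quoting (assumption): quote only strings containing a space. */
static void quote_pretty(struct buf *b, const char *src)
{
	int needs_quote = strchr(src, ' ') != NULL;

	if (needs_quote)
		buf_addstr(b, "'");
	buf_addstr(b, src);
	if (needs_quote)
		buf_addstr(b, "'");
}

/* New helper (placeholder name): join with SP, no leading SP. */
static void append_quote_argv_pretty(struct buf *dst, const char **args)
{
	int i;

	for (i = 0; args[i]; i++) {
		if (i > 0)
			buf_addstr(dst, " ");
		quote_pretty(dst, args[i]);
	}
}

/* Existing behaviour kept for legacy callers: SP before args[0] too. */
static void quote_argv_pretty(struct buf *dst, const char **args)
{
	if (args[0]) /* the old loop added nothing for an empty array */
		buf_addstr(dst, " ");
	append_quote_argv_pretty(dst, args);
}
```

A caller that builds its own prefix, such as "alias:st argv:[", then gets "alias:st argv:[status -s]" from the new helper, while legacy callers still see " status -s".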


> 
> ---
>   quote.c  | 18 ++++--------------
>   strbuf.c | 20 +++++++++++++-------
>   strbuf.h |  8 ++++++++
>   3 files changed, 25 insertions(+), 21 deletions(-)
> 
> diff --git a/quote.c b/quote.c
> index 7f2aa6faa4..f422188852 100644
> --- a/quote.c
> +++ b/quote.c
> @@ -74,24 +74,14 @@ void sq_quotef(struct strbuf *dst, const char *fmt, ...)
> 
>   void sq_quote_argv(struct strbuf *dst, const char **argv)
>   {
> -	int i;
> -
> -	/* Copy into destination buffer. */
> -	strbuf_grow(dst, 255);
> -	for (i = 0; argv[i]; ++i) {
> -		strbuf_addch(dst, ' ');
> -		sq_quote_buf(dst, argv[i]);
> -	}
> +	strbuf_addch(dst, ' ');
> +	strbuf_map_join_argv(dst, argv, sq_quote_buf, " ");
>   }
> 
>   void sq_quote_argv_pretty(struct strbuf *dst, const char **argv)
>   {
> -	int i;
> -
> -	for (i = 0; argv[i]; i++) {
> -		strbuf_addch(dst, ' ');
> -		sq_quote_buf_pretty(dst, argv[i]);
> -	}
> +	strbuf_addch(dst, ' ');

If I'm reading this correctly, this has slightly different behavior
than the original version.  Perhaps:

	if (argv[0])
		strbuf_addch(dst, ' ');

> +	strbuf_map_join_argv(dst, argv, sq_quote_buf_pretty, " ");
>   }
> 
>   static char *sq_dequote_step(char *arg, char **next)
> diff --git a/strbuf.c b/strbuf.c
> index d30f916858..d337853b53 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -304,17 +304,23 @@ void strbuf_addbuf(struct strbuf *sb, const struct strbuf *sb2)
>   	strbuf_setlen(sb, sb->len + sb2->len);
>   }
> 
> +void strbuf_map_join_argv(struct strbuf *sb, const char **argv,
> +			  void (*fn)(struct strbuf *, const char *),
> +			  const char *separator)
> +{
> +	while (*argv) {
> +		fn(sb, *argv++);
> +		if (*argv)
> +			strbuf_addstr(sb, separator);
> +	}
> +}
> +
>   const char *strbuf_join_argv(struct strbuf *buf,
>   			     int argc, const char **argv, char delim)
>   {
> -	if (!argc)
> -		return buf->buf;
> +	char separator[] = { delim, '\0' };
> 
> -	strbuf_addstr(buf, *argv);
> -	while (--argc) {
> -		strbuf_addch(buf, delim);
> -		strbuf_addstr(buf, *(++argv));
> -	}
> +	strbuf_map_join_argv(buf, argv, strbuf_addstr, separator);
> 
>   	return buf->buf;
>   }
> diff --git a/strbuf.h b/strbuf.h
> index f62278a0be..7adeff94a7 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -297,6 +297,14 @@ static inline void strbuf_addstr(struct strbuf *sb, const char *s)
>    */
>   void strbuf_addbuf(struct strbuf *sb, const struct strbuf *sb2);
> 
> +/**
> + * Apply `fn` to `sb` and each element of the NULL-terminated array
> + * `argv`. Add `separator` between these invocations.
> + */
> +void strbuf_map_join_argv(struct strbuf *sb, const char **argv,
> +			  void (*fn)(struct strbuf *, const char *),
> +			  const char *separator);
> +
>   /**
>    * Join the arguments into a buffer. `delim` is put between every
>    * two arguments.
> --
> 2.22.0
>
René Scharfe Aug. 9, 2019, 6:01 p.m. UTC | #6
Am 09.08.19 um 19:13 schrieb Jeff Hostetler:
> On 8/8/2019 6:49 PM, René Scharfe wrote:
>>   void sq_quote_argv_pretty(struct strbuf *dst, const char **argv)
>>   {
>> -    int i;
>> -
>> -    for (i = 0; argv[i]; i++) {
>> -        strbuf_addch(dst, ' ');
>> -        sq_quote_buf_pretty(dst, argv[i]);
>> -    }
>> +    strbuf_addch(dst, ' ');
>
> If I'm reading this correctly, this has slightly different behavior
> than the original version.  Perhaps:
>
>     if (argv[0])
>         strbuf_addch(dst, ' ');

Oh, yes, thanks for spotting this.

René
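
The difference spotted here only shows up for an empty argv. The sketch below reproduces it with a stand-in for the proposed strbuf_map_join_argv(); `struct buf`, `buf_addstr()`, `quote_pretty()` and both wrapper names are hypothetical and simplified, not git's code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size stand-in for git's struct strbuf. */
struct buf {
	char s[256];
	size_t len;
};

static void buf_addstr(struct buf *b, const char *src)
{
	size_t n = strlen(src);

	if (b->len + n >= sizeof(b->s))
		abort(); /* this sketch does not grow the buffer */
	memcpy(b->s + b->len, src, n + 1);
	b->len += n;
}

/* Simplified quoting (assumption): quote only strings containing a space. */
static void quote_pretty(struct buf *b, const char *src)
{
	int needs_quote = strchr(src, ' ') != NULL;

	if (needs_quote)
		buf_addstr(b, "'");
	buf_addstr(b, src);
	if (needs_quote)
		buf_addstr(b, "'");
}

/* Stand-in for the proposed map-join: fn on each element, separator between. */
static void map_join_argv(struct buf *sb, const char **args,
			  void (*fn)(struct buf *, const char *),
			  const char *separator)
{
	while (*args) {
		fn(sb, *args++);
		if (*args)
			buf_addstr(sb, separator);
	}
}

/* As first posted: the leading SP is added even when args is empty. */
static void quote_argv_unguarded(struct buf *dst, const char **args)
{
	buf_addstr(dst, " ");
	map_join_argv(dst, args, quote_pretty, " ");
}

/* With the suggested guard: matches the original loop for empty args. */
static void quote_argv_guarded(struct buf *dst, const char **args)
{
	if (args[0])
		buf_addstr(dst, " ");
	map_join_argv(dst, args, quote_pretty, " ");
}
```

For { "a", "b", NULL } both wrappers append " a b". For an array holding only NULL, the unguarded one appends a lone " " while the guarded one appends nothing, which is what the original loop did.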

Patch

diff --git a/quote.c b/quote.c
index 7f2aa6faa4..7cad8798ac 100644
--- a/quote.c
+++ b/quote.c
@@ -94,6 +94,17 @@  void sq_quote_argv_pretty(struct strbuf *dst, const char **argv)
 	}
 }
 
+void sq_quote_argv_pretty_ltrim(struct strbuf *dst, const char **argv)
+{
+	int i;
+
+	for (i = 0; argv[i]; i++) {
+		if (i > 0)
+			strbuf_addch(dst, ' ');
+		sq_quote_buf_pretty(dst, argv[i]);
+	}
+}
+
 static char *sq_dequote_step(char *arg, char **next)
 {
 	char *dst = arg;
diff --git a/quote.h b/quote.h
index fb08dc085c..3b3d041a61 100644
--- a/quote.h
+++ b/quote.h
@@ -40,6 +40,7 @@  void sq_quotef(struct strbuf *, const char *fmt, ...);
  */
 void sq_quote_buf_pretty(struct strbuf *, const char *src);
 void sq_quote_argv_pretty(struct strbuf *, const char **argv);
+void sq_quote_argv_pretty_ltrim(struct strbuf *, const char **argv);
 
 /* This unwraps what sq_quote() produces in place, but returns
  * NULL if the input does not look like what sq_quote would have