diff mbox series

[2/6] quote_path: give flags parameter to quote_path()

Message ID 20200908205224.4126551-3-gitster@pobox.com (mailing list archive)
State Superseded
Headers show
Series quote_path() clean-ups | expand

Commit Message

Junio C Hamano Sept. 8, 2020, 8:52 p.m. UTC
The quote_path() function computes a path (relative to its base
directory) and c-quotes the result if necessary.  Teach it to take a
flags parameter to allow its behaviour to be enriched later.

No behaviour change intended.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/clean.c | 22 +++++++++++-----------
 builtin/grep.c  |  2 +-
 quote.c         |  4 ++--
 quote.h         |  2 +-
 wt-status.c     | 24 ++++++++++++------------
 5 files changed, 27 insertions(+), 27 deletions(-)

Comments

Jeff King Sept. 10, 2020, 12:21 p.m. UTC | #1
On Tue, Sep 08, 2020 at 01:52:20PM -0700, Junio C Hamano wrote:

> -char *quote_path(const char *in, const char *prefix, struct strbuf *out)
> +char *quote_path(const char *in, const char *prefix, struct strbuf *out, unsigned flags)
>  {
>  	struct strbuf sb = STRBUF_INIT;
>  	const char *rel = relative_path(in, prefix, &sb);
>  	strbuf_reset(out);
> -	quote_c_style_counted(rel, strlen(rel), out, NULL, 0);
> +	quote_c_style_counted(rel, strlen(rel), out, NULL, flags);
>  	strbuf_release(&sb);

Here we take "flags", but then we pass it along in place of
quote_c_style_counted()'s "no_dq" parameter. That seems unlikely to be
the right thing (and indeed, it's reverted in the next commit).

> --- a/quote.h
> +++ b/quote.h
> @@ -72,7 +72,7 @@ void write_name_quoted_relative(const char *name, const char *prefix,
>  				FILE *fp, int terminator);
>  
>  /* quote path as relative to the given prefix */
> -char *quote_path(const char *in, const char *prefix, struct strbuf *out);
> +char *quote_path(const char *in, const char *prefix, struct strbuf *out, unsigned);

The meaning of the last parameter is left a bit vague. :) Maybe worth
calling it "unsigned flags" (I don't mind omitting parameter names when
they are obvious, but I don't think this counts).

-Peff
Junio C Hamano Sept. 10, 2020, 3:04 p.m. UTC | #2
Jeff King <peff@peff.net> writes:

> On Tue, Sep 08, 2020 at 01:52:20PM -0700, Junio C Hamano wrote:
>
>> -char *quote_path(const char *in, const char *prefix, struct strbuf *out)
>> +char *quote_path(const char *in, const char *prefix, struct strbuf *out, unsigned flags)
>>  {
>>  	struct strbuf sb = STRBUF_INIT;
>>  	const char *rel = relative_path(in, prefix, &sb);
>>  	strbuf_reset(out);
>> -	quote_c_style_counted(rel, strlen(rel), out, NULL, 0);
>> +	quote_c_style_counted(rel, strlen(rel), out, NULL, flags);
>>  	strbuf_release(&sb);
>
> Here we take "flags", but then we pass it along in place of
> quote_c_style_counted()'s "no_dq" parameter. That seems unlikely to be
> the right thing (and indeed, it's reverted in the next commit).

You are seeing a rebase artifact.  It needs to be corrected.  Thanks
for noticing.

As you seem to have guessed correctly, the initial plan was to have
the new logic in quote-c-style-counted and make anything at higher
level of the callchain just relay the "SP must be quoted" bit, and
the final two patches that are no longer necessary in the current
series were placed much earlier in an earlier draft of the series
as preparatory steps for that endgame.

But it turned out that the main loop of quote-c-style-counted needs
a surgery that is a bit larger than I would have liked, so the final
series consumes the "SP must be quoted" bit in quote_path() without
telling quote-c-style-counted about it.

The problem with the main loop of quote-style-counted I saw was that
the next_quote_pos() is designed to return the position of the byte
that must be prefixed with a backslash, and the machinery to help it
(namely, cq_must_quote() and cq_lookup[]) are written with that in
mind quite firmly.  The handling of SP we would be optionally adding
to the mix is somewhat different---it forces the end result to be
enclosed within a dq-pair, but it byitself does not need backslash
quoting.  Which means cq_lookup[] table needs to somehow encode a
bit per byte that says "byte with this value itself does not need to
be backslash-quoted, but the entire thing needs to be placed in a
dq-pair", and/or next_quote_pos() needs be able to say "here is a
byte to cause us to enclose the whole thing in a dq-pair, but the
byte itself need not be backslash-quoted".

Of course none of the above becomes unnecessary if we scan the whole
string for SP before the main loop in quote-c-style-counted, but the
function was written to process the input in a single pass and such
a change defeats its design.  If we need to do it in two passes, we
can have the caller do so anyway, at least for now.  That thinking
lead to the final organization of the series, with two steps that
used to be preparatory for passing the flag down thru to the bottom
layer rebased out as a discardable appendix at the end.

Thanks.
Junio C Hamano Sept. 10, 2020, 3:17 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> Of course none of the above becomes unnecessary if we scan the whole
> string for SP before the main loop in quote-c-style-counted, but the
> function was written to process the input in a single pass and such
> a change defeats its design.  If we need to do it in two passes, we
> can have the caller do so anyway, at least for now.  That thinking
> lead to the final organization of the series, with two steps that
> used to be preparatory for passing the flag down thru to the bottom
> layer rebased out as a discardable appendix at the end.

Actually, this made me realize that another variant is possible.
It might be easier to read, or it might not.  Since I cannot tell
without actually writing one, let's see ...

Instead of scanning the output from quote-c-style-counted for SP in
the caller, the caller can do something like this.

/* quote path as relative to the given prefix */
char *quote_path(const char *in, const char *prefix, struct strbuf *out, unsigned flags)
{
	struct strbuf sb = STRBUF_INIT;
	const char *rel = relative_path(in, prefix, &sb);
	int add_dq_ourselves = !!(flags & QUOTE_PATH_QUOTE_SP) && !!strchr(rel, ' ');

	strbuf_reset(out);
        if (add_dq_ourselves)
        	strbuf_addch(out, '"');
	quote_c_style_counted(rel, strlen(rel), out, NULL,
        		      add_dq_ourselves ? CQUOTE_NODQ : 0);
        if (add_dq_ourselves)
        	strbuf_addch(out, '"');
	strbuf_release(&sb);

	return out->buf;
}

That is, we tell quote-c-style-counted not to put dq-pair around its
output whether it needs to do the backslash quoting, if we know we
want the result inside dq-pair anyway (iow, when the caller wants us
to treat SP specially and we do see SP in the input).

I don't know if this is easier to follow or not.  I do think so
right now but that is only because it is still fresh in my brain.
Jeff King Sept. 10, 2020, 8:26 p.m. UTC | #4
On Thu, Sep 10, 2020 at 08:17:32AM -0700, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Of course none of the above becomes unnecessary if we scan the whole
> > string for SP before the main loop in quote-c-style-counted, but the
> > function was written to process the input in a single pass and such
> > a change defeats its design.  If we need to do it in two passes, we
> > can have the caller do so anyway, at least for now.  That thinking
> > lead to the final organization of the series, with two steps that
> > used to be preparatory for passing the flag down thru to the bottom
> > layer rebased out as a discardable appendix at the end.
> 
> Actually, this made me realize that another variant is possible.
> It might be easier to read, or it might not.  Since I cannot tell
> without actually writing one, let's see ...

Vger seems to be delivering slowly and out-of-order the last day or two,
so I got rather confused to receive this after seeing your v2. :)

> I don't know if this is easier to follow or not.  I do think so
> right now but that is only because it is still fresh in my brain.

I do think it is easier to read than the original.

One minor nit with your analysis, though: the current code is actually
two-pass already. One pass finds the next quoted character, but then we
have to take another pass to copy it into place. That second pass can be
done with a memcpy(), which helps.

If you know you are quoting, you can do a true single-pass
character-by-character. But of course part of our task is to find out
_if_ we are quoting. And even if that were not the case, on modern
processors it is not always true that single-pass is going to be faster.
This code is definitely not such a hot-spot that it's worth doing that
kind of micro-optimization.

  Aside: We _do_ have spots where that is not true. When I looked at
  replacing xdiff's hash the sticking point was that we compute the
  newlines _and_ hash in a single pass. Most "fast" hash functions are
  optimized to take bigger sequences of data, but splitting out the
  newline-finding eliminated any gains.

-Peff
diff mbox series

Patch

diff --git a/builtin/clean.c b/builtin/clean.c
index dee44fff6e..687ab473c2 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -162,7 +162,7 @@  static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 	if ((force_flag & REMOVE_DIR_KEEP_NESTED_GIT) &&
 	    is_nonbare_repository_dir(path)) {
 		if (!quiet) {
-			quote_path(path->buf, prefix, &quoted);
+			quote_path(path->buf, prefix, &quoted, 0);
 			printf(dry_run ?  _(msg_would_skip_git_dir) : _(msg_skip_git_dir),
 					quoted.buf);
 		}
@@ -177,7 +177,7 @@  static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 		res = dry_run ? 0 : rmdir(path->buf);
 		if (res) {
 			int saved_errno = errno;
-			quote_path(path->buf, prefix, &quoted);
+			quote_path(path->buf, prefix, &quoted, 0);
 			errno = saved_errno;
 			warning_errno(_(msg_warn_remove_failed), quoted.buf);
 			*dir_gone = 0;
@@ -202,7 +202,7 @@  static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 			if (remove_dirs(path, prefix, force_flag, dry_run, quiet, &gone))
 				ret = 1;
 			if (gone) {
-				quote_path(path->buf, prefix, &quoted);
+				quote_path(path->buf, prefix, &quoted, 0);
 				string_list_append(&dels, quoted.buf);
 			} else
 				*dir_gone = 0;
@@ -210,11 +210,11 @@  static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 		} else {
 			res = dry_run ? 0 : unlink(path->buf);
 			if (!res) {
-				quote_path(path->buf, prefix, &quoted);
+				quote_path(path->buf, prefix, &quoted, 0);
 				string_list_append(&dels, quoted.buf);
 			} else {
 				int saved_errno = errno;
-				quote_path(path->buf, prefix, &quoted);
+				quote_path(path->buf, prefix, &quoted, 0);
 				errno = saved_errno;
 				warning_errno(_(msg_warn_remove_failed), quoted.buf);
 				*dir_gone = 0;
@@ -238,7 +238,7 @@  static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 			*dir_gone = 1;
 		else {
 			int saved_errno = errno;
-			quote_path(path->buf, prefix, &quoted);
+			quote_path(path->buf, prefix, &quoted, 0);
 			errno = saved_errno;
 			warning_errno(_(msg_warn_remove_failed), quoted.buf);
 			*dir_gone = 0;
@@ -266,7 +266,7 @@  static void pretty_print_dels(void)
 	struct column_options copts;
 
 	for_each_string_list_item(item, &del_list) {
-		qname = quote_path(item->string, NULL, &buf);
+		qname = quote_path(item->string, NULL, &buf, 0);
 		string_list_append(&list, qname);
 	}
 
@@ -753,7 +753,7 @@  static int ask_each_cmd(void)
 	for_each_string_list_item(item, &del_list) {
 		/* Ctrl-D should stop removing files */
 		if (!eof) {
-			qname = quote_path(item->string, NULL, &buf);
+			qname = quote_path(item->string, NULL, &buf, 0);
 			/* TRANSLATORS: Make sure to keep [y/N] as is */
 			printf(_("Remove %s [y/N]? "), qname);
 			if (git_read_line_interactively(&confirm) == EOF) {
@@ -1047,19 +1047,19 @@  int cmd_clean(int argc, const char **argv, const char *prefix)
 			if (remove_dirs(&abs_path, prefix, rm_flags, dry_run, quiet, &gone))
 				errors++;
 			if (gone && !quiet) {
-				qname = quote_path(item->string, NULL, &buf);
+				qname = quote_path(item->string, NULL, &buf, 0);
 				printf(dry_run ? _(msg_would_remove) : _(msg_remove), qname);
 			}
 		} else {
 			res = dry_run ? 0 : unlink(abs_path.buf);
 			if (res) {
 				int saved_errno = errno;
-				qname = quote_path(item->string, NULL, &buf);
+				qname = quote_path(item->string, NULL, &buf, 0);
 				errno = saved_errno;
 				warning_errno(_(msg_warn_remove_failed), qname);
 				errors++;
 			} else if (!quiet) {
-				qname = quote_path(item->string, NULL, &buf);
+				qname = quote_path(item->string, NULL, &buf, 0);
 				printf(dry_run ? _(msg_would_remove) : _(msg_remove), qname);
 			}
 		}
diff --git a/builtin/grep.c b/builtin/grep.c
index 9a91ad643a..c8037388c6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -319,7 +319,7 @@  static void grep_source_name(struct grep_opt *opt, const char *filename,
 	}
 
 	if (opt->relative && opt->prefix_length)
-		quote_path(filename + tree_name_len, opt->prefix, out);
+		quote_path(filename + tree_name_len, opt->prefix, out, 0);
 	else
 		quote_c_style(filename + tree_name_len, out, NULL, 0);
 
diff --git a/quote.c b/quote.c
index 7bb519c1a7..03ca85afb0 100644
--- a/quote.c
+++ b/quote.c
@@ -352,12 +352,12 @@  void write_name_quoted_relative(const char *name, const char *prefix,
 }
 
 /* quote path as relative to the given prefix */
-char *quote_path(const char *in, const char *prefix, struct strbuf *out)
+char *quote_path(const char *in, const char *prefix, struct strbuf *out, unsigned flags)
 {
 	struct strbuf sb = STRBUF_INIT;
 	const char *rel = relative_path(in, prefix, &sb);
 	strbuf_reset(out);
-	quote_c_style_counted(rel, strlen(rel), out, NULL, 0);
+	quote_c_style_counted(rel, strlen(rel), out, NULL, flags);
 	strbuf_release(&sb);
 
 	return out->buf;
diff --git a/quote.h b/quote.h
index 837cb42a71..db7140b3c7 100644
--- a/quote.h
+++ b/quote.h
@@ -72,7 +72,7 @@  void write_name_quoted_relative(const char *name, const char *prefix,
 				FILE *fp, int terminator);
 
 /* quote path as relative to the given prefix */
-char *quote_path(const char *in, const char *prefix, struct strbuf *out);
+char *quote_path(const char *in, const char *prefix, struct strbuf *out, unsigned);
 
 /* quoting as a string literal for other languages */
 void perl_quote_buf(struct strbuf *sb, const char *src);
diff --git a/wt-status.c b/wt-status.c
index 6b87592856..d6ca7bd52c 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -336,7 +336,7 @@  static void wt_longstatus_print_unmerged_data(struct wt_status *s,
 		memset(padding, ' ', label_width);
 	}
 
-	one = quote_path(it->string, s->prefix, &onebuf);
+	one = quote_path(it->string, s->prefix, &onebuf, 0);
 	status_printf(s, color(WT_STATUS_HEADER, s), "\t");
 
 	how = wt_status_unmerged_status_string(d->stagemask);
@@ -402,8 +402,8 @@  static void wt_longstatus_print_change_data(struct wt_status *s,
 	if (d->rename_status == status)
 		one_name = d->rename_source;
 
-	one = quote_path(one_name, s->prefix, &onebuf);
-	two = quote_path(two_name, s->prefix, &twobuf);
+	one = quote_path(one_name, s->prefix, &onebuf, 0);
+	two = quote_path(two_name, s->prefix, &twobuf, 0);
 
 	status_printf(s, color(WT_STATUS_HEADER, s), "\t");
 	what = wt_status_diff_status_string(status);
@@ -964,7 +964,7 @@  static void wt_longstatus_print_other(struct wt_status *s,
 		struct string_list_item *it;
 		const char *path;
 		it = &(l->items[i]);
-		path = quote_path(it->string, s->prefix, &buf);
+		path = quote_path(it->string, s->prefix, &buf, 0);
 		if (column_active(s->colopts)) {
 			string_list_append(&output, path);
 			continue;
@@ -1848,7 +1848,7 @@  static void wt_shortstatus_unmerged(struct string_list_item *it,
 	} else {
 		struct strbuf onebuf = STRBUF_INIT;
 		const char *one;
-		one = quote_path(it->string, s->prefix, &onebuf);
+		one = quote_path(it->string, s->prefix, &onebuf, 0);
 		printf(" %s\n", one);
 		strbuf_release(&onebuf);
 	}
@@ -1877,7 +1877,7 @@  static void wt_shortstatus_status(struct string_list_item *it,
 		const char *one;
 
 		if (d->rename_source) {
-			one = quote_path(d->rename_source, s->prefix, &onebuf);
+			one = quote_path(d->rename_source, s->prefix, &onebuf, 0);
 			if (*one != '"' && strchr(one, ' ') != NULL) {
 				putchar('"');
 				strbuf_addch(&onebuf, '"');
@@ -1886,7 +1886,7 @@  static void wt_shortstatus_status(struct string_list_item *it,
 			printf("%s -> ", one);
 			strbuf_release(&onebuf);
 		}
-		one = quote_path(it->string, s->prefix, &onebuf);
+		one = quote_path(it->string, s->prefix, &onebuf, 0);
 		if (*one != '"' && strchr(one, ' ') != NULL) {
 			putchar('"');
 			strbuf_addch(&onebuf, '"');
@@ -1905,7 +1905,7 @@  static void wt_shortstatus_other(struct string_list_item *it,
 	} else {
 		struct strbuf onebuf = STRBUF_INIT;
 		const char *one;
-		one = quote_path(it->string, s->prefix, &onebuf);
+		one = quote_path(it->string, s->prefix, &onebuf, 0);
 		color_fprintf(s->fp, color(WT_STATUS_UNTRACKED, s), "%s", sign);
 		printf(" %s\n", one);
 		strbuf_release(&onebuf);
@@ -2222,9 +2222,9 @@  static void wt_porcelain_v2_print_changed_entry(
 		 */
 		sep_char = '\t';
 		eol_char = '\n';
-		path = quote_path(it->string, s->prefix, &buf);
+		path = quote_path(it->string, s->prefix, &buf, 0);
 		if (d->rename_source)
-			path_from = quote_path(d->rename_source, s->prefix, &buf_from);
+			path_from = quote_path(d->rename_source, s->prefix, &buf_from, 0);
 	}
 
 	if (path_from)
@@ -2310,7 +2310,7 @@  static void wt_porcelain_v2_print_unmerged_entry(
 	if (s->null_termination)
 		path_index = it->string;
 	else
-		path_index = quote_path(it->string, s->prefix, &buf_index);
+		path_index = quote_path(it->string, s->prefix, &buf_index, 0);
 
 	fprintf(s->fp, "%c %s %s %06o %06o %06o %06o %s %s %s %s%c",
 			unmerged_prefix, key, submodule_token,
@@ -2343,7 +2343,7 @@  static void wt_porcelain_v2_print_other(
 		path = it->string;
 		eol_char = '\0';
 	} else {
-		path = quote_path(it->string, s->prefix, &buf);
+		path = quote_path(it->string, s->prefix, &buf, 0);
 		eol_char = '\n';
 	}