diff mbox

quote arguments in xtrace output

Message ID e50519c3-4055-ab2a-b9f3-0776369e4106@inlv.org (mailing list archive)
State RFC
Delegated to: Herbert Xu
Headers show

Commit Message

Martijn Dekker Feb. 27, 2017, 6:22 a.m. UTC
The xtrace (set -x) output in dash is a bit of a pain, because arguments
containing whitespace aren't quoted. This can it impossible to tell
which bit belongs to which argument:

$ dash -c 'set -x; printf "%s\n" one "two three" four'
+ printf %s\n one two three four
one
two three
four

Another disadvantage is you cannot simply copy and paste the commands
from this xtrace output to repeat them for debugging purposes.

I wrote the attached patch which fixes this. I've been testing it for
about a week and had no issues.

The patch changes the following:

- Since we don't want every command name and argument quoted but only
those containing non-shell-safe characters, single_quote() has acquired
an extra argument indicating whether quoting should be conditional upon
the string containing non-shell-safe characters (1) or unconditional
(0). Shell-safe characters are defined as this set:
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz%+,./:=@_^!-

- Quoting of variable assignments preceding commands needs to be handled
specially; in order for the output to be suitable for re-entry into the
shell, only the part after the "=" must be quoted. For this purpose, I
changed eprintlist() into two functions, eprintvarlist() to print the
variable assignments and eprintarglist() to print the rest.

Hope this is useful,

- Martijn

Comments

Harald van Dijk Feb. 27, 2017, 7:24 p.m. UTC | #1
On 27/02/2017 07:22, Martijn Dekker wrote:
> The xtrace (set -x) output in dash is a bit of a pain, because arguments
> containing whitespace aren't quoted. This can it impossible to tell
> which bit belongs to which argument:

Agreed.

> $ dash -c 'set -x; printf "%s\n" one "two three" four'
> + printf %s\n one two three four
> one
> two three
> four
>
> Another disadvantage is you cannot simply copy and paste the commands
> from this xtrace output to repeat them for debugging purposes.
>
> I wrote the attached patch which fixes this. I've been testing it for
> about a week and had no issues.
>
> The patch changes the following:
>
> - Since we don't want every command name and argument quoted but only
> those containing non-shell-safe characters, single_quote() has acquired
> an extra argument indicating whether quoting should be conditional upon
> the string containing non-shell-safe characters (1) or unconditional
> (0). Shell-safe characters are defined as this set:
> 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz%+,./:=@_^!-
>
> - Quoting of variable assignments preceding commands needs to be handled
> specially; in order for the output to be suitable for re-entry into the
> shell, only the part after the "=" must be quoted. For this purpose, I
> changed eprintlist() into two functions, eprintvarlist() to print the
> variable assignments and eprintarglist() to print the rest.

The general approach looks nice. I think this misses a few cases though 
if you're aiming to make it suitable for re-entry:

   $ src/dash -c 'set -x; \!'
   + !
   src/dash: 1: !: not found

   $ src/dash -c 'set -x; \a=b'
   + a=b
   src/dash: 1: a=b: not found

   $ src/dash -c 'set -x; \if'
   + if
   src/dash: 1: if: not found

I could create binaries by those names to get rid of the errors. 
Regardless, in none these cases would the original command be 
re-executed if the set -x output were entered into the shell.

Aside from the first, bash doesn't quote them either. It may be okay to 
leave this as it is and acknowledge that it doesn't handle all cases, as 
long as it handles enough of them to be an improvement.

Depending on what happens with 
<http://austingroupbugs.net/view.php?id=249>, if support for $'...' will 
become required in a future version of POSIX, it may additionally make 
sense to use that syntax for control characters already right now.

It feels like the set of shell-safe characters should be able to use the 
generated syntax.[ch] files, but none of the existing categories quite 
matches.

If the modification to single_quote can handle all cases (perhaps 
quoting unnecessarily if it's unclear), then the conditional parameter 
could be removed and applied unconditionally: as far as I can tell, 
there are no existing calls that require the single quotes for simple words.

> Hope this is useful,
>
> - Martijn

It looks useful to me. :)

Cheers,
Harald van Dijk
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Martijn Dekker Feb. 27, 2017, 8:08 p.m. UTC | #2
Op 27-02-17 om 20:24 schreef Harald van Dijk:
> The general approach looks nice. I think this misses a few cases though
> if you're aiming to make it suitable for re-entry:
> 
>   $ src/dash -c 'set -x; \!'
>   + !
>   src/dash: 1: !: not found
> 
>   $ src/dash -c 'set -x; \a=b'
>   + a=b
>   src/dash: 1: a=b: not found
> 
>   $ src/dash -c 'set -x; \if'
>   + if
>   src/dash: 1: if: not found
> 
> I could create binaries by those names to get rid of the errors.
> Regardless, in none these cases would the original command be
> re-executed if the set -x output were entered into the shell.

Good point.

The first and second cases would be fixed by simply removing "!" and "="
from SHELLSAFECHARS. Not a big loss.

Shell reserved words (a.k.a. shell keywords) such as "if" are harder to
fix. The single_quote() routine, if told to quote conditionally, would
have to iterate through the internal list of shell reserved words and
quote the string if it matches one of those. I will see if I can make
that happen.

Actually, after testing bash, yash, zsh, ksh93 and mksh as well as dash,
it looks like no current shell manages to quote strings that would
otherwise be shell keywords in xtrace output, with the exception of the
"!" keyword. So fixing this would make dash's xtrace output better than
any other shell's. :)

Since shell reserved words are part of the grammar, they never show up
in xtrace output, so at least the output is unambiguous: you know
anything that looks like a reserved word isn't.

> It may be okay to leave this as it is and acknowledge that it doesn't
> handle all cases, as long as it handles enough of them to be an
> improvement.

Yes. I would still remove "!" and "=" from SHELLSAFECHARS though; at
least that way you won't get output that looks like a negation or
assignment but isn't.

> If the modification to single_quote can handle all cases (perhaps
> quoting unnecessarily if it's unclear), then the conditional
> parameter could be removed and applied unconditionally: as far as I
> can tell, there are no existing calls that require the single quotes
> for simple words.

True, but this would change the output of something like:

$ dash -c 'alias stuff=xyz; trap true EXIT; alias; trap'

from:

stuff='xyz'
trap -- 'true' EXIT

to:

stuff=xyz
trap -- true EXIT

While there is technically nothing wrong with this, I think it's nicer
to leave those quoted unconditionally, because that facilitates editing
the output of those commands to feed it back to the shell.

- M.

--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Harald van Dijk Feb. 27, 2017, 9:34 p.m. UTC | #3
On 27/02/2017 21:08, Martijn Dekker wrote:
> Op 27-02-17 om 20:24 schreef Harald van Dijk:
>> If the modification to single_quote can handle all cases (perhaps
>> quoting unnecessarily if it's unclear), then the conditional
>> parameter could be removed and applied unconditionally: as far as I
>> can tell, there are no existing calls that require the single quotes
>> for simple words.
>
> True, but this would change the output of something like:
>
> $ dash -c 'alias stuff=xyz; trap true EXIT; alias; trap'
>
> from:
>
> stuff='xyz'
> trap -- 'true' EXIT
>
> to:
>
> stuff=xyz
> trap -- true EXIT
>
> While there is technically nothing wrong with this, I think it's nicer
> to leave those quoted unconditionally, because that facilitates editing
> the output of those commands to feed it back to the shell.

Fair point.

Looking at other shells:

             | bash | ksh93 | zsh | posh | dash | dash (patched)
    alias    | 2    | 1     | 1   | -    | 2    | 2
     trap    | 2    | 1     | 1   | 1    | 2    | 2
   export -p | 2    | 1     | 1   | 0    | 2    | 2
readonly -p | 2    | 1     | 1   | 0    | 2    | 2
      set    | 1    | 1     | 1   | -    | 2    | 2
      set -x | 1    | 1     | 1   | 0    | 0    | 1

Here, 0 means "never quoted", 1 means "quoted when containing special 
characters", and 2 means "always quoted".

Although there is a bit of variation in other shells, your approach 
brings dash closer to bash, which I believe is considered a goal for 
dash, all else being equal. My suggestion would bring dash closer to 
bash in one command, but further from bash in four others, so unless 
there is a good reason for it, I think I agree that that shouldn't be done.

Cheers,
Harald van Dijk
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff -ur dash-0.5.9.1.orig/src/alias.c dash-0.5.9.1/src/alias.c
--- dash-0.5.9.1.orig/src/alias.c	2014-09-28 10:19:32.000000000 +0200
+++ dash-0.5.9.1/src/alias.c	2017-02-23 03:01:57.000000000 +0100
@@ -197,7 +197,7 @@ 
 
 void
 printalias(const struct alias *ap) {
-	out1fmt("%s=%s\n", ap->name, single_quote(ap->val));
+	out1fmt("%s=%s\n", ap->name, single_quote(ap->val, 0));
 }
 
 STATIC struct alias **
diff -ur dash-0.5.9.1.orig/src/eval.c dash-0.5.9.1/src/eval.c
--- dash-0.5.9.1.orig/src/eval.c	2016-09-02 16:12:23.000000000 +0200
+++ dash-0.5.9.1/src/eval.c	2017-02-23 03:14:56.000000000 +0100
@@ -95,7 +95,8 @@ 
 STATIC int evalbltin(const struct builtincmd *, int, char **, int);
 STATIC int evalfun(struct funcnode *, int, char **, int);
 STATIC void prehash(union node *);
-STATIC int eprintlist(struct output *, struct strlist *, int);
+STATIC int eprintvarlist(struct output *, struct strlist *, int);
+STATIC void eprintarglist(struct output *, struct strlist *, int);
 STATIC int bltincmd(int, char **);
 
 
@@ -786,8 +787,8 @@ 
 		out = &preverrout;
 		outstr(expandstr(ps4val()), out);
 		sep = 0;
-		sep = eprintlist(out, varlist.list, sep);
-		eprintlist(out, arglist.list, sep);
+		sep = eprintvarlist(out, varlist.list, sep);
+		eprintarglist(out, arglist.list, sep);
 		outcslow('\n', out);
 #ifdef FLUSHERR
 		flushout(out);
@@ -1107,16 +1108,35 @@ 
 
 
 STATIC int
-eprintlist(struct output *out, struct strlist *sp, int sep)
+eprintvarlist(struct output *out, struct strlist *sp, int sep)
 {
 	while (sp) {
 		const char *p;
+		int i;
 
-		p = " %s" + (1 - sep);
+		if (sep)
+			outfmt(out, " ");
 		sep |= 1;
-		outfmt(out, p, sp->text);
+		i = 0;
+		while (sp->text[i] != '=' && sp->text[i] != '\0')
+			outfmt(out, "%c", sp->text[i++]);
+		if (sp->text[i] == '=')
+			outfmt(out, "=%s", single_quote(sp->text+i+1, 1));
 		sp = sp->next;
 	}
 
 	return sep;
 }
+
+STATIC void
+eprintarglist(struct output *out, struct strlist *sp, int sep)
+{
+	while (sp) {
+		const char *p;
+
+		p = " %s" + (1 - sep);
+		sep |= 1;
+		outfmt(out, p, single_quote(sp->text, 1));
+		sp = sp->next;
+	}
+}
diff -ur dash-0.5.9.1.orig/src/mystring.c dash-0.5.9.1/src/mystring.c
--- dash-0.5.9.1.orig/src/mystring.c	2014-09-28 10:19:32.000000000 +0200
+++ dash-0.5.9.1/src/mystring.c	2017-02-23 03:10:48.000000000 +0100
@@ -184,13 +184,19 @@ 
 
 /*
  * Produce a possibly single quoted string suitable as input to the shell.
- * The return string is allocated on the stack.
+ * If 'conditional' is nonzero, quoting is only done if the string contains
+ * non-shellsafe characters; if it is zero, quoting is always done.
+ * If quoting was done, the return string is allocated on the stack,
+ * otherwise a pointer to the original string is returned.
  */
 
 char *
-single_quote(const char *s) {
+single_quote(const char *s, int conditional) {
 	char *p;
 
+	if (conditional && s[strspn(s, SHELLSAFECHARS)] == '\0')
+		return (char *)s;
+
 	STARTSTACKSTR(p);
 
 	do {
diff -ur dash-0.5.9.1.orig/src/mystring.h dash-0.5.9.1/src/mystring.h
--- dash-0.5.9.1.orig/src/mystring.h	2014-09-28 10:19:32.000000000 +0200
+++ dash-0.5.9.1/src/mystring.h	2017-02-23 03:01:57.000000000 +0100
@@ -54,10 +54,13 @@ 
 intmax_t atomax10(const char *);
 int number(const char *);
 int is_number(const char *);
-char *single_quote(const char *);
+char *single_quote(const char *, int);
 char *sstrdup(const char *);
 int pstrcmp(const void *, const void *);
 const char *const *findstring(const char *, const char *const *, size_t);
 
 #define equal(s1, s2)	(strcmp(s1, s2) == 0)
 #define scopy(s1, s2)	((void)strcpy(s2, s1))
+
+/* Characters that don't need quoting before re-entry into the shell */
+#define SHELLSAFECHARS "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz%+,./:=@_^!-"
diff -ur dash-0.5.9.1.orig/src/trap.c dash-0.5.9.1/src/trap.c
--- dash-0.5.9.1.orig/src/trap.c	2016-09-02 16:12:23.000000000 +0200
+++ dash-0.5.9.1/src/trap.c	2017-02-23 03:01:57.000000000 +0100
@@ -107,7 +107,7 @@ 
 			if (trap[signo] != NULL) {
 				out1fmt(
 					"trap -- %s %s\n",
-					single_quote(trap[signo]),
+					single_quote(trap[signo], 0),
 					signal_names[signo]
 				);
 			}
diff -ur dash-0.5.9.1.orig/src/var.c dash-0.5.9.1/src/var.c
--- dash-0.5.9.1.orig/src/var.c	2014-10-07 16:30:35.000000000 +0200
+++ dash-0.5.9.1/src/var.c	2017-02-23 03:01:57.000000000 +0100
@@ -417,7 +417,7 @@ 
 		p = strchrnul(*ep, '=');
 		q = nullstr;
 		if (*p)
-			q = single_quote(++p);
+			q = single_quote(++p, 0);
 
 		out1fmt("%s%s%.*s%s\n", prefix, sep, (int)(p - *ep), *ep, q);
 	}