dash bug: double-quoted "\" breaks glob protection for next char

Message ID 041881f9-9084-4083-345a-8f85792b48ef@gigawatt.nl (mailing list archive)
State Superseded
Delegated to: Herbert Xu

Commit Message

Harald van Dijk March 4, 2018, 9:29 p.m. UTC
On 3/4/18 9:08 PM, Martijn Dekker wrote:
> On 04-03-18 at 16:46, Harald van Dijk wrote:
>> FreeBSD sh also prints a blank line here.
> [...]
>> Like above, FreeBSD sh behaves like ksh.
> 
> I stand corrected.
> 
> Is there any port of FreeBSD sh to other operating systems? It would be
> much more convenient for me to include it in my tests if I didn't have
> to launch a FreeBSD VM and rsync & run the test scripts separately.

None that I know of. Running the test script over ssh might be slightly
less difficult, but that is nowhere near as easy as a port. The source
code contains several very FreeBSD-specific bits.

>> Yes, the inconsistency should be fixed. Either it should be treated as
>> quoted or as unquoted, but not quoted-unless-it-comes-from-a-variable. I
>> have no strong feelings on which it should be.
> 
> Neither do I, so I would default to the behaviour that both pre-exists
> in dash and corresponds with the majority of other shells.

I went for the behaviour that required the fewest changes for now, which 
is to treat them as unquoted. If it is agreed that it should be quoted, 
it requires some additional (minor) complications in the parser, because 
the existing state would no longer be sufficient to determine whether } 
should end the substitution. But yes, I agree that given how long dash 
has treated this as quoted, it makes sense to keep that, unless there's 
a compelling reason not to.

> [...]
>>> $ src/dash -c 'printf "%s\n" "${$+\}}"'
>>> \}
>>>
>>> Expected output: }  (no backslash), as in bash 4, yash, ksh93, pdksh,
>>> mksh, zsh. In other words: it should be possible to escape a '}' with a
>>> backslash within the parameter expansion, even if the expansion is
>>> quoted.
>>>
>>> POSIX ref.: 2.6.2 Parameter Expansion
>>> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02
>>>
>>> | Any '}' escaped by a <backslash> or within a quoted string, and
>>> | characters in embedded arithmetic expansions, command substitutions,
>>> | and variable expansions, shall not be examined in determining the
>>> | matching '}'.
>>
>> I believe this actually requires dash's behaviour. This says the first }
>> isn't examined in determining the matching '}', but only that: it just
>> says the parameter expansion expression is $+\}. It doesn't say the
>> backslash is removed.
> 
> I believe the word "escaped" implies that removal. If a '}' is escaped
> by a backslash, it's implied that the backslash is removed as this
> escaping is parsed, just as it's implied that quotes are removed from a
> quoted string.

That's not implied, that's stated:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_07

> The quote characters ( <backslash>, single-quote, and double-quote) that were present in the original word shall be removed unless they have themselves been quoted.

In this case, the backslash was quoted, so this doesn't apply.

>> I agree that it would be much better to print } here though.
> 
> All other current shells except bosh (schilytools sh) agree, too -- even
> FreeBSD sh, and I checked it this time.

For the simple cases, shells agree that the backslash is removed:

   ${x+\}}
   "${x+\}}"
   <<EOF / ${x+\}} / EOF

And with this patch, dash behaves the same for the simple cases.

They aren't in agreement on the more complicated cases though:

In ${x+"\}"}, most shells keep the backslash. ksh and FreeBSD sh remove 
it. I think it makes most sense to keep it, because the general rule for 
\ in double-quoted strings is that it's only removed if the following 
character would have been special. This patch removes it.

In "${x+"\}"}", of the shells that treat the \} as quoted, most shells 
keep the backslash. bash removes it. I think it makes most sense to keep 
the backslash for the same reason as above.

In <<EOF / ${x+"\}"} / EOF, bash and ksh remove the backslash, posh and 
zsh keep it, and FreeBSD sh treats the " as a literal. Again I think it 
makes most sense to keep the backslash (but remove the "). This patch 
removes it.

Another pre-existing dash parsing bug that I have now run into is
$(( ( $(( 1 )) ) )). This should expand to 1, but gives a hard error in
dash, again due to the non-recursive nature of dash's parser. A small
generalisation of what I've been doing so far could handle this, so it
makes sense to me to try to achieve this at the same time.

Cheers,
Harald van Dijk

Patch

diff --git a/src/Makefile.am b/src/Makefile.am
index 139355e..525f8ef 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -66,7 +66,7 @@  syntax.c syntax.h: mksyntax
 signames.c: mksignames
 	./$^
 
-mksyntax: token.h
+mksyntax: parser.h token.h
 
 $(HELPERS): %: %.c
 	$(COMPILE_FOR_BUILD) -o $@ $<
diff --git a/src/TOUR b/src/TOUR
index 056e79b..f6a4641 100644
--- a/src/TOUR
+++ b/src/TOUR
@@ -150,6 +150,7 @@  special codes defined in parser.h.  The special codes are:
         CTLVAR              Variable substitution
         CTLENDVAR           End of variable substitution
         CTLBACKQ            Command substitution
+        CTLBACKQ|CTLQUOTE   Command substitution inside double quotes
         CTLESC              Escape next character
 
 A variable substitution contains the following elements:
@@ -169,13 +170,17 @@  stitution.  The possible types are:
         VSASSIGN            ${var=text}
         VSASSIGN|VSNUL      ${var=text}
 
-The name of the variable comes next, terminated by an equals
-sign.  If the type is not VSNORMAL, then the text field in the
-substitution follows, terminated by a CTLENDVAR byte.
+In addition, the type field will have the VSQUOTE flag set if the
+variable is enclosed in double quotes, or VSARITH set if the variable
+appears inside an $((...)) arithmetic expansion.  The name of the
+variable comes next, terminated by an equals sign.  If the type is not
+VSNORMAL, then the text field in the substitution follows, ter-
+minated by a CTLENDVAR byte.
 
 Commands in back quotes are parsed and stored in a linked list.
 The locations of these commands in the string are indicated by
-the CTLBACKQ character.
+CTLBACKQ and CTLBACKQ|CTLQUOTE characters, depending upon whether
+the back quotes were enclosed in double quotes.
 
 The character CTLESC escapes the next character, so that in case
 any of the CTL characters mentioned above appear in the input,
diff --git a/src/expand.c b/src/expand.c
index 2a50830..c498711 100644
--- a/src/expand.c
+++ b/src/expand.c
@@ -83,7 +83,7 @@ 
 #define RMESCAPE_HEAP	0x10	/* Malloc strings instead of stalloc */
 
 /* Add CTLESC when necessary. */
-#define QUOTES_ESC	(EXP_FULL | EXP_CASE | EXP_QPAT)
+#define QUOTES_ESC	(EXP_FULL | EXP_CASE)
 /* Do not skip NUL characters. */
 #define QUOTES_KEEPNUL	EXP_TILDE
 
@@ -112,12 +112,12 @@  static struct arglist exparg;
 
 STATIC void argstr(char *, int);
 STATIC char *exptilde(char *, char *, int);
-STATIC void expbackq(union node *, int);
+STATIC void expbackq(union node *, int, int);
 STATIC const char *subevalvar(char *, char *, int, int, int, int, int);
 STATIC char *evalvar(char *, int);
 STATIC size_t strtodest(const char *, const char *, int);
 STATIC void memtodest(const char *, size_t, const char *, int);
-STATIC ssize_t varvalue(char *, int, int, int *);
+STATIC ssize_t varvalue(char *, int, int, int);
 STATIC void expandmeta(struct strlist *, int);
 #ifdef HAVE_GLOB
 STATIC void addglob(const glob_t *);
@@ -243,22 +243,23 @@  argstr(char *p, int flag)
 		CTLESC,
 		CTLVAR,
 		CTLBACKQ,
+		CTLBACKQ | CTLQUOTE,
 		CTLENDARI,
 		0
 	};
 	const char *reject = spclchars;
-	int c;
-	int breakall = (flag & (EXP_WORD | EXP_QUOTED)) == EXP_WORD;
-	int inquotes;
+	int c = 0;
+	int quotes = flag & QUOTES_ESC;
 	size_t length;
 	int startloc;
+	int prev;
+	int dolatstrhack;
 
 	if (!(flag & EXP_VARTILDE)) {
 		reject += 2;
 	} else if (flag & EXP_VARTILDE2) {
 		reject++;
 	}
-	inquotes = 0;
 	length = 0;
 	if (flag & EXP_TILDE) {
 		char *q;
@@ -273,6 +274,7 @@  start:
 	startloc = expdest - (char *)stackblock();
 	for (;;) {
 		length += strcspn(p + length, reject);
+		prev = c;
 		c = (signed char)p[length];
 		if (c && (!(c & 0x80) || c == CTLENDARI)) {
 			/* c == '=' || c == ':' || c == CTLENDARI */
@@ -282,7 +284,7 @@  start:
 			int newloc;
 			expdest = stnputs(p, length, expdest);
 			newloc = expdest - (char *)stackblock();
-			if (breakall && !inquotes && newloc > startloc) {
+			if ((flag & (EXP_WORD | EXP_QUOTED)) == EXP_WORD && newloc > startloc) {
 				recordregion(startloc, newloc, 0);
 			}
 			startloc = newloc;
@@ -316,15 +318,9 @@  start:
 		case CTLENDVAR: /* ??? */
 			goto breakloop;
 		case CTLQUOTEMARK:
-			inquotes ^= EXP_QUOTED;
-			/* "$@" syntax adherence hack */
-			if (inquotes && !memcmp(p, dolatstr + 1,
-						DOLATSTRLEN - 1)) {
-				p = evalvar(p + 1, flag | inquotes) + 1;
-				goto start;
-			}
+			flag ^= EXP_QUOTED;
 addquote:
-			if (flag & QUOTES_ESC) {
+			if (quotes) {
 				p--;
 				length++;
 				startloc++;
@@ -333,27 +329,26 @@  addquote:
 		case CTLESC:
 			startloc++;
 			length++;
-
-			/*
-			 * Quoted parameter expansion pattern: remove quote
-			 * unless inside inner quotes or we have a literal
-			 * backslash.
-			 */
-			if (((flag | inquotes) & (EXP_QPAT | EXP_QUOTED)) ==
-			    EXP_QPAT && *p != '\\')
-				break;
-
 			goto addquote;
 		case CTLVAR:
-			p = evalvar(p, flag | inquotes);
+			/* "$@" syntax adherence hack */
+			dolatstrhack = !memcmp(p, dolatstr+1, DOLATSTRLEN-1) && !shellparam.nparam && quotes;
+			p = evalvar(p, flag);
+			if (dolatstrhack && prev == (char)CTLQUOTEMARK && *p == (char)CTLQUOTEMARK) {
+				expdest--;
+				flag ^= EXP_QUOTED;
+				p++;
+			}
 			goto start;
 		case CTLBACKQ:
-			expbackq(argbackq->n, flag | inquotes);
+			c = 0;
+		case CTLBACKQ|CTLQUOTE:
+			expbackq(argbackq->n, c, quotes);
 			argbackq = argbackq->next;
 			goto start;
 		case CTLENDARI:
 			p--;
-			expari(flag | inquotes);
+			expari(quotes);
 			goto start;
 		}
 	}
@@ -449,11 +444,12 @@  removerecordregions(int endoff)
  * evaluate, place result in (backed up) result, adjust string position.
  */
 void
-expari(int flag)
+expari(int quotes)
 {
 	struct stackmark sm;
 	char *p, *start;
 	int begoff;
+	char flag;
 	int len;
 	intmax_t result;
 
@@ -468,42 +464,24 @@  expari(int flag)
 	p = expdest;
 	pushstackmark(&sm, p - start);
 	*--p = '\0';
-	p--;
-	do {
-		int esc;
-
-		while (*p != (char)CTLARI) {
-			p--;
-#ifdef DEBUG
-			if (p < start) {
-				sh_error("missing CTLARI (shouldn't happen)");
-			}
-#endif
-		}
-
-		esc = esclen(start, p);
-		if (!(esc % 2)) {
-			break;
-		}
-
-		p -= esc + 1;
-	} while (1);
-
+	p = (char *)findstartchar(start, p, CTLARI, CTLENDARI);
 	begoff = p - start;
 
 	removerecordregions(begoff);
 
+	flag = p[1] & VSSYNTAX;
+
 	expdest = p;
 
-	if (likely(flag & QUOTES_ESC))
-		rmescapes(p + 1);
+	if (likely(quotes))
+		rmescapes(p + 2);
 
-	result = arith(p + 1);
+	result = arith(p + 2);
 	popstackmark(&sm);
 
 	len = cvtnum(result);
 
-	if (likely(!(flag & EXP_QUOTED)))
+	if (likely(!flag))
 		recordregion(begoff, begoff + len, 0);
 }
 
@@ -513,7 +491,7 @@  expari(int flag)
  */
 
 STATIC void
-expbackq(union node *cmd, int flag)
+expbackq(union node *cmd, int quoted, int quotes)
 {
 	struct backcmd in;
 	int i;
@@ -521,7 +499,7 @@  expbackq(union node *cmd, int flag)
 	char *p;
 	char *dest;
 	int startloc;
-	char const *syntax = flag & EXP_QUOTED ? DQSYNTAX : BASESYNTAX;
+	char const *syntax = quoted ? DQSYNTAX : BASESYNTAX;
 	struct stackmark smark;
 
 	INTOFF;
@@ -535,7 +513,7 @@  expbackq(union node *cmd, int flag)
 	if (i == 0)
 		goto read;
 	for (;;) {
-		memtodest(p, i, syntax, flag & QUOTES_ESC);
+		memtodest(p, i, syntax, quotes);
 read:
 		if (in.fd < 0)
 			break;
@@ -562,7 +540,7 @@  read:
 		STUNPUTC(dest);
 	expdest = dest;
 
-	if (!(flag & EXP_QUOTED))
+	if (!quoted)
 		recordregion(startloc, dest - (char *)stackblock(), 0);
 	TRACE(("evalbackq: size=%d: \"%.*s\"\n",
 		(dest - (char *)stackblock()) - startloc,
@@ -639,9 +617,8 @@  scanright(
 }
 
 STATIC const char *
-subevalvar(char *p, char *str, int strloc, int subtype, int startloc, int varflags, int flag)
+subevalvar(char *p, char *str, int strloc, int subtype, int startloc, int varflags, int quotes)
 {
-	int quotes = flag & QUOTES_ESC;
 	char *startp;
 	char *loc;
 	struct nodelist *saveargbackq = argbackq;
@@ -651,8 +628,7 @@  subevalvar(char *p, char *str, int strloc, int subtype, int startloc, int varfla
 	char *(*scan)(char *, char *, char *, char *, int , int);
 
 	argstr(p, EXP_TILDE | (subtype != VSASSIGN && subtype != VSQUESTION ?
-			       (flag & (EXP_QUOTED | EXP_QPAT) ?
-			        EXP_QPAT : EXP_CASE) : 0));
+			       EXP_CASE : 0));
 	STPUTC('\0', expdest);
 	argbackq = saveargbackq;
 	startp = stackblock() + startloc;
@@ -722,22 +698,25 @@  evalvar(char *p, int flag)
 	int startloc;
 	ssize_t varlen;
 	int easy;
+	int quotes;
 	int quoted;
 
+	quotes = flag & QUOTES_ESC;
 	varflags = *p++;
 	subtype = varflags & VSTYPE;
 
 	if (!subtype)
 		sh_error("Bad substitution");
 
-	quoted = flag & EXP_QUOTED;
+	quoted = varflags & VSQUOTE;
 	var = p;
 	easy = (!quoted || (*var == '@' && shellparam.nparam));
+
 	startloc = expdest - (char *)stackblock();
 	p = strchr(p, '=') + 1;
 
 again:
-	varlen = varvalue(var, varflags, flag, &quoted);
+	varlen = varvalue(var, varflags, flag, quoted);
 	if (varflags & VSNUL)
 		varlen--;
 
@@ -749,7 +728,8 @@  again:
 	if (subtype == VSMINUS) {
 vsplus:
 		if (varlen < 0) {
-			argstr(p, flag | EXP_TILDE | EXP_WORD);
+			argstr(p, flag | EXP_TILDE | EXP_WORD |
+				  (quoted ? EXP_QUOTED : 0));
 			goto end;
 		}
 		goto record;
@@ -759,8 +739,7 @@  vsplus:
 		if (varlen >= 0)
 			goto record;
 
-		subevalvar(p, var, 0, subtype, startloc, varflags,
-			   flag & ~QUOTES_ESC);
+		subevalvar(p, var, 0, subtype, startloc, varflags, 0);
 		varflags &= ~VSNUL;
 		/* 
 		 * Remove any recorded regions beyond 
@@ -806,7 +785,7 @@  record:
 		STPUTC('\0', expdest);
 		patloc = expdest - (char *)stackblock();
 		if (subevalvar(p, NULL, patloc, subtype,
-			       startloc, varflags, flag) == 0) {
+			       startloc, varflags, quotes) == 0) {
 			int amount = expdest - (
 				(char *)stackblock() + patloc - 1
 			);
@@ -823,7 +802,7 @@  end:
 		for (;;) {
 			if ((c = (signed char)*p++) == CTLESC)
 				p++;
-			else if (c == CTLBACKQ) {
+			else if (c == CTLBACKQ || c == (CTLBACKQ|CTLQUOTE)) {
 				if (varlen >= 0)
 					argbackq = argbackq->next;
 			} else if (c == CTLVAR) {
@@ -887,7 +866,7 @@  strtodest(p, syntax, quotes)
  */
 
 STATIC ssize_t
-varvalue(char *name, int varflags, int flags, int *quotedp)
+varvalue(char *name, int varflags, int flags, int quoted)
 {
 	int num;
 	char *p;
@@ -896,7 +875,6 @@  varvalue(char *name, int varflags, int flags, int *quotedp)
 	char sepc;
 	char **ap;
 	char const *syntax;
-	int quoted = *quotedp;
 	int subtype = varflags & VSTYPE;
 	int discard = subtype == VSPLUS || subtype == VSLENGTH;
 	int quotes = (discard ? 0 : (flags & QUOTES_ESC)) | QUOTES_KEEPNUL;
@@ -942,7 +920,6 @@  numvar:
 		sep |= ifsset() ? ifsval()[0] : ' ';
 param:
 		sepc = sep;
-		*quotedp = !sepc;
 		if (!(ap = shellparam.p))
 			return -1;
 		while ((p = *ap++)) {
@@ -1644,7 +1621,6 @@  char *
 _rmescapes(char *str, int flag)
 {
 	char *p, *q, *r;
-	unsigned inquotes;
 	int notescaped;
 	int globbing;
 
@@ -1674,24 +1650,23 @@  _rmescapes(char *str, int flag)
 			q = mempcpy(q, str, len);
 		}
 	}
-	inquotes = 0;
 	globbing = flag & RMESCAPE_GLOB;
 	notescaped = globbing;
 	while (*p) {
 		if (*p == (char)CTLQUOTEMARK) {
-			inquotes = ~inquotes;
 			p++;
 			notescaped = globbing;
 			continue;
 		}
+		if (*p == '\\') {
+			/* naked back slash */
+			notescaped = 0;
+			goto copy;
+		}
 		if (*p == (char)CTLESC) {
 			p++;
 			if (notescaped)
 				*q++ = '\\';
-		} else if (*p == '\\' && !inquotes) {
-			/* naked back slash */
-			notescaped = 0;
-			goto copy;
 		}
 		notescaped = globbing;
 copy:
diff --git a/src/expand.h b/src/expand.h
index 26dc5b4..90f5328 100644
--- a/src/expand.h
+++ b/src/expand.h
@@ -55,7 +55,6 @@  struct arglist {
 #define	EXP_VARTILDE	0x4	/* expand tildes in an assignment */
 #define	EXP_REDIR	0x8	/* file glob for a redirection (1 match only) */
 #define EXP_CASE	0x10	/* keeps quotes around for CASE pattern */
-#define EXP_QPAT	0x20	/* pattern in quoted parameter expansion */
 #define EXP_VARTILDE2	0x40	/* expand tildes after colons only */
 #define EXP_WORD	0x80	/* expand word in parameter expansion */
 #define EXP_QUOTED	0x100	/* expand word in double quotes */
diff --git a/src/jobs.c b/src/jobs.c
index 4f02e38..6ba6b48 100644
--- a/src/jobs.c
+++ b/src/jobs.c
@@ -1375,7 +1375,6 @@  cmdputs(const char *s)
 	char *nextc;
 	signed char c;
 	int subtype = 0;
-	int quoted = 0;
 	static const char vstype[VSTYPE + 1][4] = {
 		"", "}", "-", "+", "?", "=",
 		"%", "%%", "#", "##",
@@ -1397,11 +1396,11 @@  cmdputs(const char *s)
 				str = "${";
 			goto dostr;
 		case CTLENDVAR:
-			str = "\"}" + !(quoted & 1);
-			quoted >>= 1;
+			str = "}";
 			subtype = 0;
 			goto dostr;
 		case CTLBACKQ:
+		case CTLBACKQ|CTLQUOTE:
 			str = "$(...)";
 			goto dostr;
 		case CTLARI:
@@ -1411,14 +1410,11 @@  cmdputs(const char *s)
 			str = "))";
 			goto dostr;
 		case CTLQUOTEMARK:
-			quoted ^= 1;
 			c = '"';
 			break;
 		case '=':
 			if (subtype == 0)
 				break;
-			if ((subtype & VSTYPE) != VSNORMAL)
-				quoted <<= 1;
 			str = vstype[subtype & VSTYPE];
 			if (subtype & VSNUL)
 				c = ':';
@@ -1446,9 +1442,6 @@  dostr:
 			USTPUTC(c, nextc);
 		}
 	}
-	if (quoted & 1) {
-		USTPUTC('"', nextc);
-	}
 	*nextc = 0;
 	cmdnextc = nextc;
 }
diff --git a/src/mksyntax.c b/src/mksyntax.c
index a23c18c..41c9ceb 100644
--- a/src/mksyntax.c
+++ b/src/mksyntax.c
@@ -145,7 +145,8 @@  main(int argc, char **argv)
 		fprintf(hfile, "/* %s */\n", is_entry[i].comment);
 	}
 	putc('\n', hfile);
-	fprintf(hfile, "#define SYNBASE %d\n", 130);
+	fprintf(hfile, "#define SYNBASE %d\n", 131);
+	fprintf(hfile, "#define PVSSYNTAX %d\n", -131);
 	fprintf(hfile, "#define PEOF %d\n\n", -130);
 	fprintf(hfile, "#define PEOA %d\n\n", -129);
 	putc('\n', hfile);
@@ -158,6 +159,7 @@  main(int argc, char **argv)
 	putc('\n', hfile);
 
 	/* Generate the syntax tables. */
+	fputs("#include \"parser.h\"\n\n", cfile);
 	fputs("#include \"shell.h\"\n", cfile);
 	fputs("#include \"syntax.h\"\n\n", cfile);
 	init();
@@ -170,7 +172,8 @@  main(int argc, char **argv)
 	add("$", "CVAR");
 	add("}", "CENDVAR");
 	add("<>();&| \t", "CSPCL");
-	syntax[1] = "CSPCL";
+	syntax[0] = "0";
+	syntax[2] = "CSPCL";
 	print("basesyntax");
 	init();
 	fputs("\n/* syntax table used when in double quotes */\n", cfile);
@@ -182,6 +185,7 @@  main(int argc, char **argv)
 	add("}", "CENDVAR");
 	/* ':/' for tilde expansion, '-' for [a\-x] pattern ranges */
 	add("!*?[=~:/-]", "CCTL");
+	syntax[0] = "VSQUOTE";
 	print("dqsyntax");
 	init();
 	fputs("\n/* syntax table used when in single quotes */\n", cfile);
@@ -189,6 +193,7 @@  main(int argc, char **argv)
 	add("'", "CENDQUOTE");
 	/* ':/' for tilde expansion, '-' for [a\-x] pattern ranges */
 	add("!*?[=~:/-]\\", "CCTL");
+	syntax[0] = "0";
 	print("sqsyntax");
 	init();
 	fputs("\n/* syntax table used when in arithmetic */\n", cfile);
@@ -199,6 +204,7 @@  main(int argc, char **argv)
 	add("}", "CENDVAR");
 	add("(", "CLP");
 	add(")", "CRP");
+	syntax[0] = "VSARITH";
 	print("arisyntax");
 	filltable("0");
 	fputs("\n/* character classification table */\n", cfile);
@@ -223,7 +229,7 @@  filltable(char *dftval)
 {
 	int i;
 
-	for (i = 0 ; i < 258; i++)
+	for (i = 0 ; i < 259; i++)
 		syntax[i] = dftval;
 }
 
@@ -238,10 +244,10 @@  init(void)
 	int ctl;
 
 	filltable("CWORD");
-	syntax[0] = "CEOF";
-	syntax[1] = "CIGN";
+	syntax[1] = "CEOF";
+	syntax[2] = "CIGN";
 	for (ctl = CTL_FIRST; ctl <= CTL_LAST; ctl++ )
-		syntax[130 + ctl] = "CCTL";
+		syntax[131 + ctl] = "CCTL";
 }
 
 
@@ -253,7 +259,7 @@  static void
 add(char *p, char *type)
 {
 	while (*p)
-		syntax[(signed char)*p++ + 130] = type;
+		syntax[(signed char)*p++ + 131] = type;
 }
 
 
@@ -271,7 +277,7 @@  print(char *name)
 	fprintf(hfile, "extern const char %s[];\n", name);
 	fprintf(cfile, "const char %s[] = {\n", name);
 	col = 0;
-	for (i = 0 ; i < 258; i++) {
+	for (i = 0 ; i < 259; i++) {
 		if (i == 0) {
 			fputs("      ", cfile);
 		} else if ((i & 03) == 0) {
diff --git a/src/mystring.c b/src/mystring.c
index 0106bd2..a0d5e47 100644
--- a/src/mystring.c
+++ b/src/mystring.c
@@ -60,8 +60,7 @@ 
 char nullstr[1];		/* zero length string */
 const char spcstr[] = " ";
 const char snlfmt[] = "%s\n";
-const char dolatstr[] = { CTLQUOTEMARK, CTLVAR, VSNORMAL, '@', '=',
-			  CTLQUOTEMARK, '\0' };
+const char dolatstr[] = { CTLVAR, VSNORMAL|VSQUOTE, '@', '=', '\0' };
 const char qchars[] = { CTLESC, CTLQUOTEMARK, 0 };
 const char illnum[] = "Illegal number: %s";
 const char homestr[] = "HOME";
diff --git a/src/mystring.h b/src/mystring.h
index 083ea98..3a82f05 100644
--- a/src/mystring.h
+++ b/src/mystring.h
@@ -40,7 +40,7 @@ 
 extern const char snlfmt[];
 extern const char spcstr[];
 extern const char dolatstr[];
-#define DOLATSTRLEN 6
+#define DOLATSTRLEN 4
 extern const char qchars[];
 extern const char illnum[];
 extern const char homestr[];
diff --git a/src/parser.c b/src/parser.c
index 382658e..f8c95bc 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -876,24 +876,16 @@  readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 	size_t len;
 	struct nodelist *bqlist;
 	int quotef;
-	int dblquote;
+	int nhere;
 	int varnest;	/* levels of variables expansion */
-	int arinest;	/* levels of arithmetic expansion */
 	int parenlevel;	/* levels of parens in arithmetic */
-	int dqvarnest;	/* levels of variables expansion within double quotes */
 	int oldstyle;
-	/* syntax before arithmetic */
-	char const *uninitialized_var(prevsyntax);
 
-	dblquote = 0;
-	if (syntax == DQSYNTAX)
-		dblquote = 1;
+	nhere = eofmark && syntax == SQSYNTAX;
 	quotef = 0;
 	bqlist = NULL;
 	varnest = 0;
-	arinest = 0;
 	parenlevel = 0;
-	dqvarnest = 0;
 
 	STARTSTACKSTR(out);
 	loop: {	/* for each line, until end of word */
@@ -922,7 +914,7 @@  readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 				USTPUTC(c, out);
 				break;
 			case CCTL:
-				if (eofmark == NULL || dblquote)
+				if (!nhere)
 					USTPUTC(CTLESC, out);
 				USTPUTC(c, out);
 				break;
@@ -937,13 +929,17 @@  readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 					nlprompt();
 				} else {
 					if (
-						dblquote &&
+						syntax != BASESYNTAX &&
 						c != '\\' && c != '`' &&
 						c != '$' && (
 							c != '"' ||
 							eofmark != NULL
+						) && (
+							c != '}' ||
+							!varnest
 						)
 					) {
+						USTPUTC(CTLESC, out);
 						USTPUTC('\\', out);
 					}
 					USTPUTC(CTLESC, out);
@@ -960,16 +956,12 @@  quotemark:
 				break;
 			case CDQUOTE:
 				syntax = DQSYNTAX;
-				dblquote = 1;
 				goto quotemark;
 			case CENDQUOTE:
 				if (eofmark && !varnest)
 					USTPUTC(c, out);
 				else {
-					if (dqvarnest == 0) {
-						syntax = BASESYNTAX;
-						dblquote = 0;
-					}
+					syntax = BASESYNTAX;
 					quotef++;
 					goto quotemark;
 				}
@@ -979,14 +971,18 @@  quotemark:
 				break;
 			case CENDVAR:	/* '}' */
 				if (varnest > 0) {
-					varnest--;
-					if (dqvarnest > 0) {
-						dqvarnest--;
+					const char *startchar = findstartchar((char *)stackblock(), out, CTLVAR, CTLENDVAR);
+					char vstype = startchar[1] & VSTYPE;
+					char vssyntax = startchar[1] & VSSYNTAX;
+					const char *prevsyntax = vssyntax == (char)VSARITH ? ARISYNTAX : vssyntax == (char)VSQUOTE ? DQSYNTAX : BASESYNTAX;
+					if (syntax == (prevsyntax == BASESYNTAX || (vstype >= VSTRIM_FIRST && vstype <= VSTRIM_LAST) ? BASESYNTAX : DQSYNTAX)) {
+						syntax = prevsyntax;
+						varnest--;
+						USTPUTC(CTLENDVAR, out);
+						break;
 					}
-					USTPUTC(CTLENDVAR, out);
-				} else {
-					USTPUTC(c, out);
 				}
+				USTPUTC(c, out);
 				break;
 			case CLP:	/* '(' in arithmetic */
 				parenlevel++;
@@ -999,8 +995,9 @@  quotemark:
 				} else {
 					if (pgetc() == ')') {
 						USTPUTC(CTLENDARI, out);
-						if (!--arinest)
-							syntax = prevsyntax;
+
+						char type = findstartchar((char *)stackblock(), out - 1, CTLARI, CTLENDARI)[1] & VSSYNTAX;
+						syntax = type == (char)VSARITH ? ARISYNTAX : type == (char)VSQUOTE ? DQSYNTAX : BASESYNTAX;
 					} else {
 						/*
 						 * unbalanced parens
@@ -1289,12 +1286,13 @@  varname:
 badsub:
 			pungetc();
 		}
-		*((char *)stackblock() + typeloc) = subtype;
+		const char *prevsyntax = syntax;
 		if (subtype != VSNORMAL) {
 			varnest++;
-			if (dblquote)
-				dqvarnest++;
+			syntax = syntax == BASESYNTAX || (subtype >= VSTRIM_FIRST && subtype <= VSTRIM_LAST) ? BASESYNTAX : DQSYNTAX;
 		}
+		subtype |= prevsyntax[PVSSYNTAX];
+		*((char *)stackblock() + typeloc) = subtype;
 		STPUTC('=', out);
 	}
 	goto parsesub_return;
@@ -1352,7 +1350,7 @@  parsebackq: {
 					continue;
 				}
                                 if (pc != '\\' && pc != '`' && pc != '$'
-                                    && (!dblquote || pc != '"'))
+                                    && (syntax == BASESYNTAX || pc != '"'))
                                         STPUTC('\\', pout);
 				if (pc > PEOA) {
 					break;
@@ -1416,7 +1414,10 @@  done:
 		memcpy(out, str, savelen);
 		STADJUST(savelen, out);
 	}
-	USTPUTC(CTLBACKQ, out);
+	if (syntax != BASESYNTAX)
+		USTPUTC(CTLBACKQ | CTLQUOTE, out);
+	else
+		USTPUTC(CTLBACKQ, out);
 	if (oldstyle)
 		goto parsebackq_oldreturn;
 	else
@@ -1428,11 +1429,9 @@  done:
  */
 parsearith: {
 
-	if (++arinest == 1) {
-		prevsyntax = syntax;
-		syntax = ARISYNTAX;
-	}
 	USTPUTC(CTLARI, out);
+	USTPUTC(VSTYPE | syntax[PVSSYNTAX], out);
+	syntax = ARISYNTAX;
 	goto parsearith_return;
 }
 
@@ -1466,6 +1465,39 @@  endofname(const char *name)
 }
 
 
+const char *
+findstartchar(const char *start, const char *p, char open, char close) {
+	int nest = 1;
+	const char *q;
+	for (;; ) {
+		int d;
+
+		--p;
+
+#if DEBUG
+		if (p < start)
+			sh_error("missing start char (shouldn't happen)");
+#endif
+
+		if (*p == open) {
+			if ((p[1] & VSTYPE) == VSNORMAL)
+				continue;
+
+			d = -1;
+		checkescapes:
+			for (q = p; q != start && q[-1] == (char)CTLESC; q--)
+				;
+
+			if ((p - q) % 2 == 0 && !(nest += d))
+				return p;
+		} else if (*p == close) {
+			d = 1;
+			goto checkescapes;
+		}
+	}
+}
+
+
 /*
  * Called when an unexpected token is read during the parse.  The argument
  * is the token that is expected, or -1 if more than one type of token can
@@ -1540,7 +1572,7 @@  expandstr(const char *ps)
 	n.narg.text = wordtext;
 	n.narg.backquote = backquotelist;
 
-	expandarg(&n, NULL, EXP_QUOTED);
+	expandarg(&n, NULL, 0);
 	return stackblock();
 }
 
diff --git a/src/parser.h b/src/parser.h
index 2875cce..d239043 100644
--- a/src/parser.h
+++ b/src/parser.h
@@ -42,14 +42,19 @@ 
 #define CTLVAR -126		/* variable defn */
 #define CTLENDVAR -125
 #define CTLBACKQ -124
+#define CTLQUOTE 01		/* ored with CTLBACKQ code if in quotes */
+/*	CTLBACKQ | CTLQUOTE == -123 */
 #define	CTLARI -122		/* arithmetic expression */
 #define	CTLENDARI -121
 #define	CTLQUOTEMARK -120
 #define	CTL_LAST -120		/* last 'special' character */
 
-/* variable substitution byte (follows CTLVAR) */
+/* variable substitution byte (follows CTLVAR), values picked to be distinct from control characters */
 #define VSTYPE	0x0f		/* type of variable substitution */
 #define VSNUL	0x10		/* colon--treat the empty string as unset */
+#define VSSYNTAX 0xc0
+#define VSQUOTE 0x40		/* inside double quotes--suppress splitting */
+#define VSARITH 0xc0		/* inside $((...)) arithmetic */
 
 /* values of VSTYPE field */
 #define VSNORMAL	0x1		/* normal variable:  $var or ${var} */
@@ -57,10 +62,12 @@ 
 #define VSPLUS		0x3		/* ${var+text} */
 #define VSQUESTION	0x4		/* ${var?message} */
 #define VSASSIGN	0x5		/* ${var=text} */
+#define VSTRIM_FIRST 0x6
 #define VSTRIMRIGHT	0x6		/* ${var%pattern} */
 #define VSTRIMRIGHTMAX 	0x7		/* ${var%%pattern} */
 #define VSTRIMLEFT	0x8		/* ${var#pattern} */
 #define VSTRIMLEFTMAX	0x9		/* ${var##pattern} */
+#define VSTRIM_LAST 0x9
 #define VSLENGTH	0xa		/* ${#var} */
 
 /* values of checkkwd variable */
@@ -88,6 +95,7 @@  const char *getprompt(void *);
 const char *const *findkwd(const char *);
 char *endofname(const char *);
 const char *expandstr(const char *);
+const char *findstartchar(const char *, const char *, char, char);
 
 static inline int
 goodname(const char *p)
diff --git a/src/redir.c b/src/redir.c
index f96a76b..527b3be 100644
--- a/src/redir.c
+++ b/src/redir.c
@@ -304,7 +304,7 @@  openhere(union node *redir)
 
 	p = redir->nhere.doc->narg.text;
 	if (redir->type == NXHERE) {
-		expandarg(redir->nhere.doc, NULL, EXP_QUOTED);
+		expandarg(redir->nhere.doc, NULL, 0);
 		p = stackblock();
 	}
 
diff --git a/src/show.c b/src/show.c
index 4a049e9..839a40a 100644
--- a/src/show.c
+++ b/src/show.c
@@ -222,6 +222,7 @@  sharg(union node *arg, FILE *fp)
 		     putc('}', fp);
 		     break;
 		case CTLBACKQ:
+		case CTLBACKQ|CTLQUOTE:
 			putc('$', fp);
 			putc('(', fp);
 			shtree(bqlist->n, -1, NULL, fp);
@@ -314,6 +315,7 @@  trstring(char *s)
 		case CTLESC:  c = 'e';  goto backslash;
 		case CTLVAR:  c = 'v';  goto backslash;
 		case CTLBACKQ:  c = 'q';  goto backslash;
+		case CTLBACKQ|CTLQUOTE:  c = 'Q';  goto backslash;
 backslash:	  putc('\\', tracefile);
 			putc(c, tracefile);
 			break;