dash bug: double-quoted "\" breaks glob protection for next char

Message ID ecb3af24-cdf1-8b59-523d-8ab2673dbd5c@gigawatt.nl (mailing list archive)
State Superseded
Delegated to: Herbert Xu

Commit Message

Harald van Dijk Feb. 18, 2018, 10:50 p.m. UTC
On 2/14/18 11:50 PM, Harald van Dijk wrote:
> On 2/14/18 10:44 PM, Harald van Dijk wrote:
>> On 2/14/18 9:03 PM, Harald van Dijk wrote:
>>> On 13/02/2018 14:53, Denys Vlasenko wrote:
>>>> $ >'\zzzz'
>>>> $ >'\wwww'
>>>> $ dash -c 'echo "\*"'
>>>> \wwww \zzzz
>>>
>>> [...]
>>>
>>> Currently:
>>>
>>> $ dash -c 'foo=a; echo "<${foo#[a\]]}>"'
>>> <>
>>>
>>> This is what I expect, and also what bash, ksh and posh do.
>>>
>>> With your patch:
>>>
>>> $ dash -c 'foo=a; echo "<${foo#[a\]]}>"'
>>> <a>
>>
>> Does the attached look right as an alternative? It treats a quoted 
>> backslash the same way as if it were preceded by CTLESC in _rmescapes. 
>> It passes your test case and mine, but I'll do more extensive testing.
> 
> It causes preglob's string to potentially grow larger than the original. 
> When called with RMESCAPE_ALLOC, that can be handled by increasing the 
> buffer size, but preglob also gets called without RMESCAPE_ALLOC to 
> modify a string in-place. That's never going to work with this approach. 
> Back to the drawing board...

There is a way to make it work: ensure sufficient memory is always 
available. Rather than inserting CTLESC, which caused problems, the 
parser can insert CTLQUOTEMARK+CTLQUOTEMARK. It's effectively a no-op 
here, but it reserves the space that later escaping of the backslash may 
need. I'm currently testing the attached.
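
A minimal sketch of why reserved bytes make in-place rewriting safe, 
assuming a simplified model rather than dash's actual _rmescapes: PAD 
and escape_in_place are hypothetical names, and PAD merely stands in for 
CTLQUOTEMARK. The pass drops padding bytes and doubles each backslash, so 
the write cursor stays behind the read cursor only if padding precedes 
every backslash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical marker byte standing in for CTLQUOTEMARK (value is illustrative). */
#define PAD ((char)0x01)

/*
 * Rewrite buf in place: drop every PAD byte and emit each backslash as two
 * backslashes. Returns the new string length.
 *
 * Assumption of this sketch: the producer reserved at least one PAD byte
 * somewhere before each backslash, so the output never outgrows the input.
 */
size_t escape_in_place(char *buf)
{
	char *r = buf;	/* read cursor */
	char *w = buf;	/* write cursor, must never pass r */

	while (*r) {
		if (*r == PAD) {
			/* Padding is a no-op: consume it, freeing one byte of slack. */
			r++;
			continue;
		}
		if (*r == '\\') {
			/* Spend one byte of slack on the extra escape character. */
			*w++ = '\\';
			/* Fires if no slack was reserved, before unread input is clobbered. */
			assert(w <= r);
		}
		*w++ = *r++;
	}
	*w = '\0';
	return (size_t)(w - buf);
}
```

Given the four bytes PAD PAD \ * the buffer ends up holding \\* (three 
bytes), one byte shorter than the input. Given only \ * with no padding, 
the assertion fires, which is the in-place growth problem described in 
the quoted message.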

To be honest, FreeBSD sh's approach, keeping a syntax stack to detect 
characters' meaning reliably at parse time, feels more elegant to me 
right now, but that requires invasive and therefore risky changes to 
dash's code.

Cheers,
Harald van Dijk
Patch

diff --git a/src/expand.c b/src/expand.c
index 2a50830..af88a69 100644
--- a/src/expand.c
+++ b/src/expand.c
@@ -1686,12 +1686,17 @@  _rmescapes(char *str, int flag)
 		}
 		if (*p == (char)CTLESC) {
 			p++;
-			if (notescaped)
-				*q++ = '\\';
-		} else if (*p == '\\' && !inquotes) {
-			/* naked back slash */
-			notescaped = 0;
-			goto copy;
+			goto escape;
+		} else if (*p == '\\') {
+			if (inquotes) {
+escape:
+				if (notescaped)
+					*q++ = '\\';
+			} else {
+				/* naked back slash */
+				notescaped = 0;
+				goto copy;
+			}
 		}
 		notescaped = globbing;
 copy:
diff --git a/src/parser.c b/src/parser.c
index 382658e..bb16a46 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -944,6 +944,9 @@  readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 							eofmark != NULL
 						)
 					) {
+						/* Reserve extra memory in case this backslash will require later escaping. */
+						USTPUTC(CTLQUOTEMARK, out);
+						USTPUTC(CTLQUOTEMARK, out);
 						USTPUTC('\\', out);
 					}
 					USTPUTC(CTLESC, out);