diff mbox series

[v2,2/2] block/rbd: Add an escape-aware strchr helper

Message ID 20210401210150.2127670-3-ckuehl@redhat.com (mailing list archive)
State New, archived
Headers show
Series Fix segfault in qemu_rbd_parse_filename | expand

Commit Message

Connor Kuehl April 1, 2021, 9:01 p.m. UTC
Sometimes the parser needs to further split a token it has collected
from the token input stream. Right now, it does a cursory check to see
if the relevant characters appear in the token to determine if it should
break it down further.

However, qemu_rbd_next_tok() will escape characters as it removes tokens
from the token stream and plain strchr() won't. This can make the
initial strchr() check slightly misleading since it implies
qemu_rbd_next_tok() will find the token and split on it, except the
reality is that qemu_rbd_next_tok() will pass over it if it is escaped.

Use a custom strchr to avoid mixing escaped and unescaped string
operations.

Reported-by: Han Han <hhan@redhat.com>
Fixes: https://bugzilla.redhat.com/1873913
Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
---
 block/rbd.c                | 20 ++++++++++++++++++--
 tests/qemu-iotests/231     |  4 ++++
 tests/qemu-iotests/231.out |  3 +++
 3 files changed, 25 insertions(+), 2 deletions(-)

Comments

Max Reitz April 6, 2021, 2:24 p.m. UTC | #1
On 01.04.21 23:01, Connor Kuehl wrote:
> Sometimes the parser needs to further split a token it has collected
> from the token input stream. Right now, it does a cursory check to see
> if the relevant characters appear in the token to determine if it should
> break it down further.
> 
> However, qemu_rbd_next_tok() will escape characters as it removes tokens
> from the token stream and plain strchr() won't. This can make the
> initial strchr() check slightly misleading since it implies
> qemu_rbd_next_tok() will find the token and split on it, except the
> reality is that qemu_rbd_next_tok() will pass over it if it is escaped.
> 
> Use a custom strchr to avoid mixing escaped and unescaped string
> operations.
> 
> Reported-by: Han Han <hhan@redhat.com>
> Fixes: https://bugzilla.redhat.com/1873913
> Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
> ---
>   block/rbd.c                | 20 ++++++++++++++++++--
>   tests/qemu-iotests/231     |  4 ++++
>   tests/qemu-iotests/231.out |  3 +++
>   3 files changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/block/rbd.c b/block/rbd.c
> index 9071a00e3f..c0e4d4a952 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c
> @@ -134,6 +134,22 @@ static char *qemu_rbd_next_tok(char *src, char delim, char **p)
>       return src;
>   }
>   
> +static char *qemu_rbd_strchr(char *src, char delim)
> +{
> +    char *p;
> +
> +    for (p = src; *p; ++p) {
> +        if (*p == delim) {
> +            return p;
> +        }
> +        if (*p == '\\') {
> +            ++p;
> +        }
> +    }
> +
> +    return NULL;
> +}
> +

So I thought you could make qemu_rbd_do_next_tok() to do this.  (I 
didn’t say you should, but bear with me.)  That would be possible by 
giving it a new parameter (e.g. @find), and if that is set, return @end 
if *end == delim after the loop, and NULL otherwise.

Now, if you add wrapper functions to make it nice, there’s not much more 
difference in lines added compared to just adding a new function, but it 
does mean your function should basically be the same as 
qemu_rbd_next_tok(), except that no splitting happens, that there is no 
*p, and that @end is returned instead of @src.

So there is one difference, and that is that qemu_rbd_next_tok() has 
this condition to skip escaped characters:

     if (*end == '\\' && end[1] != '\0') {

where qemu_rbd_strchr() has only:

     if (*p == '\\') {

And I think qemu_rbd_next_tok() is right; if the string in question has 
a trailing backslash, qemu_rbd_strchr() will ignore the final NUL and 
continue searching past the end of the string.

Max
Connor Kuehl April 9, 2021, 2:05 p.m. UTC | #2
On 4/6/21 9:24 AM, Max Reitz wrote:
> On 01.04.21 23:01, Connor Kuehl wrote:
>> [..]
>> diff --git a/block/rbd.c b/block/rbd.c
>> index 9071a00e3f..c0e4d4a952 100644
>> --- a/block/rbd.c
>> +++ b/block/rbd.c
>> @@ -134,6 +134,22 @@ static char *qemu_rbd_next_tok(char *src, char 
>> delim, char **p)
>>       return src;
>>   }
>> +static char *qemu_rbd_strchr(char *src, char delim)
>> +{
>> +    char *p;
>> +
>> +    for (p = src; *p; ++p) {
>> +        if (*p == delim) {
>> +            return p;
>> +        }
>> +        if (*p == '\\') {
>> +            ++p;
>> +        }
>> +    }
>> +
>> +    return NULL;
>> +}
>> +
> 
> So I thought you could make qemu_rbd_do_next_tok() to do this.  (I 
> didn’t say you should, but bear with me.)  That would be possible by 
> giving it a new parameter (e.g. @find), and if that is set, return @end 
> if *end == delim after the loop, and NULL otherwise.
> 
> Now, if you add wrapper functions to make it nice, there’s not much more 
> difference in lines added compared to just adding a new function, but it 
> does mean your function should basically be the same as 
> qemu_rbd_next_tok(), except that no splitting happens, that there is no 
> *p, and that @end is returned instead of @src.

Do you have a strong preference for this? I agree that 
qemu_rbd_next_tok() could grow this functionality, but I think it'd be 
simpler to keep it separate in the form of qemu_rbd_strchr().

> 
> So there is one difference, and that is that qemu_rbd_next_tok() has 
> this condition to skip escaped characters:
> 
>      if (*end == '\\' && end[1] != '\0') {
> 
> where qemu_rbd_strchr() has only:
> 
>      if (*p == '\\') {
> 
> And I think qemu_rbd_next_tok() is right; if the string in question has 
> a trailing backslash, qemu_rbd_strchr() will ignore the final NUL and 
> continue searching past the end of the string.

Aha, good catch. I'll fix this up.

Thank you,

Connor
Max Reitz April 9, 2021, 2:19 p.m. UTC | #3
On 09.04.21 16:05, Connor Kuehl wrote:
> On 4/6/21 9:24 AM, Max Reitz wrote:
>> On 01.04.21 23:01, Connor Kuehl wrote:
>>> [..]
>>> diff --git a/block/rbd.c b/block/rbd.c
>>> index 9071a00e3f..c0e4d4a952 100644
>>> --- a/block/rbd.c
>>> +++ b/block/rbd.c
>>> @@ -134,6 +134,22 @@ static char *qemu_rbd_next_tok(char *src, char 
>>> delim, char **p)
>>>       return src;
>>>   }
>>> +static char *qemu_rbd_strchr(char *src, char delim)
>>> +{
>>> +    char *p;
>>> +
>>> +    for (p = src; *p; ++p) {
>>> +        if (*p == delim) {
>>> +            return p;
>>> +        }
>>> +        if (*p == '\\') {
>>> +            ++p;
>>> +        }
>>> +    }
>>> +
>>> +    return NULL;
>>> +}
>>> +
>>
>> So I thought you could make qemu_rbd_do_next_tok() to do this.  (I 
>> didn’t say you should, but bear with me.)  That would be possible by 
>> giving it a new parameter (e.g. @find), and if that is set, return 
>> @end if *end == delim after the loop, and NULL otherwise.
>>
>> Now, if you add wrapper functions to make it nice, there’s not much 
>> more difference in lines added compared to just adding a new function, 
>> but it does mean your function should basically be the same as 
>> qemu_rbd_next_tok(), except that no splitting happens, that there is 
>> no *p, and that @end is returned instead of @src.
> 
> Do you have a strong preference for this? I agree that 
> qemu_rbd_next_tok() could grow this functionality, but I think it'd be 
> simpler to keep it separate in the form of qemu_rbd_strchr().

Oh, no, no.  I mostly said this so it would be clear why both functions 
should basically have the same structure, i.e. why a difference in 
structure might be a sign that something’s wrong.

Sorry if I came across as too verbose.

>> So there is one difference, and that is that qemu_rbd_next_tok() has 
>> this condition to skip escaped characters:
>>
>>      if (*end == '\\' && end[1] != '\0') {
>>
>> where qemu_rbd_strchr() has only:
>>
>>      if (*p == '\\') {
>>
>> And I think qemu_rbd_next_tok() is right; if the string in question 
>> has a trailing backslash, qemu_rbd_strchr() will ignore the final NUL 
>> and continue searching past the end of the string.
> 
> Aha, good catch. I'll fix this up.

Thanks!

Max
diff mbox series

Patch

diff --git a/block/rbd.c b/block/rbd.c
index 9071a00e3f..c0e4d4a952 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -134,6 +134,22 @@  static char *qemu_rbd_next_tok(char *src, char delim, char **p)
     return src;
 }
 
+static char *qemu_rbd_strchr(char *src, char delim)
+{
+    char *p;
+
+    for (p = src; *p; ++p) {
+        if (*p == delim) {
+            return p;
+        }
+        if (*p == '\\') {
+            ++p;
+        }
+    }
+
+    return NULL;
+}
+
 static void qemu_rbd_unescape(char *src)
 {
     char *p;
@@ -171,7 +187,7 @@  static void qemu_rbd_parse_filename(const char *filename, QDict *options,
     qemu_rbd_unescape(found_str);
     qdict_put_str(options, "pool", found_str);
 
-    if (strchr(p, '@')) {
+    if (qemu_rbd_strchr(p, '@')) {
         image_name = qemu_rbd_next_tok(p, '@', &p);
 
         found_str = qemu_rbd_next_tok(p, ':', &p);
@@ -181,7 +197,7 @@  static void qemu_rbd_parse_filename(const char *filename, QDict *options,
         image_name = qemu_rbd_next_tok(p, ':', &p);
     }
     /* Check for namespace in the image_name */
-    if (strchr(image_name, '/')) {
+    if (qemu_rbd_strchr(image_name, '/')) {
         found_str = qemu_rbd_next_tok(image_name, '/', &image_name);
         qemu_rbd_unescape(found_str);
         qdict_put_str(options, "namespace", found_str);
diff --git a/tests/qemu-iotests/231 b/tests/qemu-iotests/231
index 0f66d0ca36..8e6c6447c1 100755
--- a/tests/qemu-iotests/231
+++ b/tests/qemu-iotests/231
@@ -55,6 +55,10 @@  _filter_conf()
 $QEMU_IMG info "json:{'file.driver':'rbd','file.filename':'rbd:rbd/bogus:conf=${BOGUS_CONF}'}" 2>&1 | _filter_conf
 $QEMU_IMG info "json:{'file.driver':'rbd','file.pool':'rbd','file.image':'bogus','file.conf':'${BOGUS_CONF}'}" 2>&1 | _filter_conf
 
+# Regression test: the qemu-img invocation is expected to fail, but it should
+# not seg fault the parser.
+$QEMU_IMG create "rbd:rbd/aa\/bb:conf=${BOGUS_CONF}" 1M 2>&1 | _filter_conf
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/231.out b/tests/qemu-iotests/231.out
index 747dd221bb..a785a6e859 100644
--- a/tests/qemu-iotests/231.out
+++ b/tests/qemu-iotests/231.out
@@ -4,4 +4,7 @@  unable to get monitor info from DNS SRV with service name: ceph-mon
 qemu-img: Could not open 'json:{'file.driver':'rbd','file.filename':'rbd:rbd/bogus:conf=BOGUS_CONF'}': error connecting: No such file or directory
 unable to get monitor info from DNS SRV with service name: ceph-mon
 qemu-img: Could not open 'json:{'file.driver':'rbd','file.pool':'rbd','file.image':'bogus','file.conf':'BOGUS_CONF'}': error connecting: No such file or directory
+Formatting 'rbd:rbd/aa\/bb:conf=BOGUS_CONF', fmt=raw size=1048576
+unable to get monitor info from DNS SRV with service name: ceph-mon
+qemu-img: rbd:rbd/aa\/bb:conf=BOGUS_CONF: error connecting: No such file or directory
 *** done