Message ID | 20210401210150.2127670-3-ckuehl@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | Fix segfault in qemu_rbd_parse_filename |
On 01.04.21 23:01, Connor Kuehl wrote:
> Sometimes the parser needs to further split a token it has collected
> from the token input stream. Right now, it does a cursory check to see
> if the relevant characters appear in the token to determine if it should
> break it down further.
>
> However, qemu_rbd_next_tok() will escape characters as it removes tokens
> from the token stream and plain strchr() won't. This can make the
> initial strchr() check slightly misleading since it implies
> qemu_rbd_next_tok() will find the token and split on it, except the
> reality is that qemu_rbd_next_tok() will pass over it if it is escaped.
>
> Use a custom strchr to avoid mixing escaped and unescaped string
> operations.
>
> Reported-by: Han Han <hhan@redhat.com>
> Fixes: https://bugzilla.redhat.com/1873913
> Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
> ---
>  block/rbd.c                | 20 ++++++++++++++++++--
>  tests/qemu-iotests/231     |  4 ++++
>  tests/qemu-iotests/231.out |  3 +++
>  3 files changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/block/rbd.c b/block/rbd.c
> index 9071a00e3f..c0e4d4a952 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c
> @@ -134,6 +134,22 @@ static char *qemu_rbd_next_tok(char *src, char delim, char **p)
>      return src;
>  }
>
> +static char *qemu_rbd_strchr(char *src, char delim)
> +{
> +    char *p;
> +
> +    for (p = src; *p; ++p) {
> +        if (*p == delim) {
> +            return p;
> +        }
> +        if (*p == '\\') {
> +            ++p;
> +        }
> +    }
> +
> +    return NULL;
> +}
> +

So I thought you could make qemu_rbd_do_next_tok() to do this. (I
didn’t say you should, but bear with me.) That would be possible by
giving it a new parameter (e.g. @find), and if that is set, return @end
if *end == delim after the loop, and NULL otherwise.

Now, if you add wrapper functions to make it nice, there’s not much more
difference in lines added compared to just adding a new function, but it
does mean your function should basically be the same as
qemu_rbd_next_tok(), except that no splitting happens, that there is no
*p, and that @end is returned instead of @src.

So there is one difference, and that is that qemu_rbd_next_tok() has
this condition to skip escaped characters:

    if (*end == '\\' && end[1] != '\0') {

where qemu_rbd_strchr() has only:

    if (*p == '\\') {

And I think qemu_rbd_next_tok() is right; if the string in question has
a trailing backslash, qemu_rbd_strchr() will ignore the final NUL and
continue searching past the end of the string.

Max
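The trailing-backslash problem described in this reply can be sketched as follows. This is a minimal illustration, not the follow-up patch itself: the name rbd_strchr_escaped is hypothetical, and the only change from the posted qemu_rbd_strchr() is the extra p[1] != '\0' guard that the message quotes from qemu_rbd_next_tok().

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a corrected qemu_rbd_strchr(): find the first
 * unescaped occurrence of delim in src, or return NULL if there is none.
 */
static char *rbd_strchr_escaped(char *src, char delim)
{
    char *p;

    for (p = src; *p; ++p) {
        if (*p == delim) {
            return p;
        }
        /*
         * Skip the character after a backslash, but only if there is one.
         * Without the p[1] check, a trailing backslash would step onto the
         * terminating NUL, and the loop's ++p would then move past it.
         */
        if (*p == '\\' && p[1] != '\0') {
            ++p;
        }
    }

    return NULL;
}
```

With this guard, a string that ends in a single backslash terminates the loop at the NUL as usual, and an escaped delimiter such as the one in "aa\/bb" is still passed over.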
On 4/6/21 9:24 AM, Max Reitz wrote:
> On 01.04.21 23:01, Connor Kuehl wrote:
>> [..]
>> diff --git a/block/rbd.c b/block/rbd.c
>> index 9071a00e3f..c0e4d4a952 100644
>> --- a/block/rbd.c
>> +++ b/block/rbd.c
>> @@ -134,6 +134,22 @@ static char *qemu_rbd_next_tok(char *src, char
>> delim, char **p)
>>       return src;
>>   }
>> +static char *qemu_rbd_strchr(char *src, char delim)
>> +{
>> +    char *p;
>> +
>> +    for (p = src; *p; ++p) {
>> +        if (*p == delim) {
>> +            return p;
>> +        }
>> +        if (*p == '\\') {
>> +            ++p;
>> +        }
>> +    }
>> +
>> +    return NULL;
>> +}
>> +
>
> So I thought you could make qemu_rbd_do_next_tok() to do this.  (I
> didn’t say you should, but bear with me.)  That would be possible by
> giving it a new parameter (e.g. @find), and if that is set, return @end
> if *end == delim after the loop, and NULL otherwise.
>
> Now, if you add wrapper functions to make it nice, there’s not much more
> difference in lines added compared to just adding a new function, but it
> does mean your function should basically be the same as
> qemu_rbd_next_tok(), except that no splitting happens, that there is no
> *p, and that @end is returned instead of @src.

Do you have a strong preference for this? I agree that
qemu_rbd_next_tok() could grow this functionality, but I think it'd be
simpler to keep it separate in the form of qemu_rbd_strchr().

>
> So there is one difference, and that is that qemu_rbd_next_tok() has
> this condition to skip escaped characters:
>
>     if (*end == '\\' && end[1] != '\0') {
>
> where qemu_rbd_strchr() has only:
>
>     if (*p == '\\') {
>
> And I think qemu_rbd_next_tok() is right; if the string in question has
> a trailing backslash, qemu_rbd_strchr() will ignore the final NUL and
> continue searching past the end of the string.

Aha, good catch. I'll fix this up.

Thank you,

Connor
On 09.04.21 16:05, Connor Kuehl wrote:
> On 4/6/21 9:24 AM, Max Reitz wrote:
>> On 01.04.21 23:01, Connor Kuehl wrote:
>>> [..]
>>> diff --git a/block/rbd.c b/block/rbd.c
>>> index 9071a00e3f..c0e4d4a952 100644
>>> --- a/block/rbd.c
>>> +++ b/block/rbd.c
>>> @@ -134,6 +134,22 @@ static char *qemu_rbd_next_tok(char *src, char
>>> delim, char **p)
>>>       return src;
>>>   }
>>> +static char *qemu_rbd_strchr(char *src, char delim)
>>> +{
>>> +    char *p;
>>> +
>>> +    for (p = src; *p; ++p) {
>>> +        if (*p == delim) {
>>> +            return p;
>>> +        }
>>> +        if (*p == '\\') {
>>> +            ++p;
>>> +        }
>>> +    }
>>> +
>>> +    return NULL;
>>> +}
>>> +
>>
>> So I thought you could make qemu_rbd_do_next_tok() to do this.  (I
>> didn’t say you should, but bear with me.)  That would be possible by
>> giving it a new parameter (e.g. @find), and if that is set, return
>> @end if *end == delim after the loop, and NULL otherwise.
>>
>> Now, if you add wrapper functions to make it nice, there’s not much
>> more difference in lines added compared to just adding a new function,
>> but it does mean your function should basically be the same as
>> qemu_rbd_next_tok(), except that no splitting happens, that there is
>> no *p, and that @end is returned instead of @src.
>
> Do you have a strong preference for this? I agree that
> qemu_rbd_next_tok() could grow this functionality, but I think it'd be
> simpler to keep it separate in the form of qemu_rbd_strchr().

Oh, no, no.  I mostly said this so it would be clear why both functions
should basically have the same structure, i.e. why a difference in
structure might be a sign that something’s wrong.  Sorry if I came
across as too verbose.

>> So there is one difference, and that is that qemu_rbd_next_tok() has
>> this condition to skip escaped characters:
>>
>>      if (*end == '\\' && end[1] != '\0') {
>>
>> where qemu_rbd_strchr() has only:
>>
>>      if (*p == '\\') {
>>
>> And I think qemu_rbd_next_tok() is right; if the string in question
>> has a trailing backslash, qemu_rbd_strchr() will ignore the final NUL
>> and continue searching past the end of the string.
>
> Aha, good catch. I'll fix this up.

Thanks!

Max
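The point about both functions sharing one structure can be illustrated with a single scanning helper that takes a find flag, in the spirit of the @find parameter mentioned earlier in the thread. This is only a sketch of that idea under stated assumptions, not code from the series: rbd_scan, rbd_next_tok, and rbd_strchr are hypothetical names, and the split behaviour (NUL-terminate the token, store the remainder in *p, return src) is modelled on what the thread says about qemu_rbd_next_tok().

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical shared helper. Scans src for the first unescaped delim.
 * - find == true:  return a pointer to the delimiter, or NULL; src is
 *                  left unmodified and p is not used.
 * - find == false: split src at the delimiter, store the remainder in *p
 *                  (NULL if no delimiter was found), and return src.
 */
static char *rbd_scan(char *src, char delim, char **p, bool find)
{
    char *end;

    for (end = src; *end; ++end) {
        if (*end == delim) {
            break;
        }
        /* Skip an escaped character, but never the terminating NUL. */
        if (*end == '\\' && end[1] != '\0') {
            end++;
        }
    }

    if (find) {
        return *end == delim ? end : NULL;
    }

    *p = NULL;
    if (*end == delim) {
        *p = end + 1;
        *end = '\0';
    }
    return src;
}

/* Thin wrappers so callers keep the two familiar entry points. */
static char *rbd_next_tok(char *src, char delim, char **p)
{
    return rbd_scan(src, delim, p, false);
}

static char *rbd_strchr(char *src, char delim)
{
    return rbd_scan(src, delim, NULL, true);
}
```

Because both wrappers run the same loop, the escape handling cannot drift apart, which is the kind of divergence the review caught. Keeping a separate qemu_rbd_strchr(), as the author prefers, is equally workable as long as its escape check matches the tokenizer's.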
diff --git a/block/rbd.c b/block/rbd.c
index 9071a00e3f..c0e4d4a952 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -134,6 +134,22 @@ static char *qemu_rbd_next_tok(char *src, char delim, char **p)
     return src;
 }
 
+static char *qemu_rbd_strchr(char *src, char delim)
+{
+    char *p;
+
+    for (p = src; *p; ++p) {
+        if (*p == delim) {
+            return p;
+        }
+        if (*p == '\\') {
+            ++p;
+        }
+    }
+
+    return NULL;
+}
+
 static void qemu_rbd_unescape(char *src)
 {
     char *p;
@@ -171,7 +187,7 @@ static void qemu_rbd_parse_filename(const char *filename, QDict *options,
     qemu_rbd_unescape(found_str);
     qdict_put_str(options, "pool", found_str);
 
-    if (strchr(p, '@')) {
+    if (qemu_rbd_strchr(p, '@')) {
         image_name = qemu_rbd_next_tok(p, '@', &p);
 
         found_str = qemu_rbd_next_tok(p, ':', &p);
@@ -181,7 +197,7 @@ static void qemu_rbd_parse_filename(const char *filename, QDict *options,
         image_name = qemu_rbd_next_tok(p, ':', &p);
     }
     /* Check for namespace in the image_name */
-    if (strchr(image_name, '/')) {
+    if (qemu_rbd_strchr(image_name, '/')) {
         found_str = qemu_rbd_next_tok(image_name, '/', &image_name);
         qemu_rbd_unescape(found_str);
         qdict_put_str(options, "namespace", found_str);
diff --git a/tests/qemu-iotests/231 b/tests/qemu-iotests/231
index 0f66d0ca36..8e6c6447c1 100755
--- a/tests/qemu-iotests/231
+++ b/tests/qemu-iotests/231
@@ -55,6 +55,10 @@ _filter_conf()
 $QEMU_IMG info "json:{'file.driver':'rbd','file.filename':'rbd:rbd/bogus:conf=${BOGUS_CONF}'}" 2>&1 | _filter_conf
 $QEMU_IMG info "json:{'file.driver':'rbd','file.pool':'rbd','file.image':'bogus','file.conf':'${BOGUS_CONF}'}" 2>&1 | _filter_conf
 
+# Regression test: the qemu-img invocation is expected to fail, but it should
+# not seg fault the parser.
+$QEMU_IMG create "rbd:rbd/aa\/bb:conf=${BOGUS_CONF}" 1M 2>&1 | _filter_conf
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/231.out b/tests/qemu-iotests/231.out
index 747dd221bb..a785a6e859 100644
--- a/tests/qemu-iotests/231.out
+++ b/tests/qemu-iotests/231.out
@@ -4,4 +4,7 @@ unable to get monitor info from DNS SRV with service name: ceph-mon
 qemu-img: Could not open 'json:{'file.driver':'rbd','file.filename':'rbd:rbd/bogus:conf=BOGUS_CONF'}': error connecting: No such file or directory
 unable to get monitor info from DNS SRV with service name: ceph-mon
 qemu-img: Could not open 'json:{'file.driver':'rbd','file.pool':'rbd','file.image':'bogus','file.conf':'BOGUS_CONF'}': error connecting: No such file or directory
+Formatting 'rbd:rbd/aa\/bb:conf=BOGUS_CONF', fmt=raw size=1048576
+unable to get monitor info from DNS SRV with service name: ceph-mon
+qemu-img: rbd:rbd/aa\/bb:conf=BOGUS_CONF: error connecting: No such file or directory
 *** done
Sometimes the parser needs to further split a token it has collected
from the token input stream. Right now, it does a cursory check to see
if the relevant characters appear in the token to determine if it should
break it down further.

However, qemu_rbd_next_tok() will escape characters as it removes tokens
from the token stream and plain strchr() won't. This can make the
initial strchr() check slightly misleading since it implies
qemu_rbd_next_tok() will find the token and split on it, except the
reality is that qemu_rbd_next_tok() will pass over it if it is escaped.

Use a custom strchr to avoid mixing escaped and unescaped string
operations.

Reported-by: Han Han <hhan@redhat.com>
Fixes: https://bugzilla.redhat.com/1873913
Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
---
 block/rbd.c                | 20 ++++++++++++++++++--
 tests/qemu-iotests/231     |  4 ++++
 tests/qemu-iotests/231.out |  3 +++
 3 files changed, 25 insertions(+), 2 deletions(-)
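The mismatch this commit message describes can be reproduced in isolation. The sketch below is an illustration under assumptions, not QEMU code: next_tok_sketch is a simplified stand-in for qemu_rbd_next_tok() (its escape check is the one quoted in the review thread, and setting the remainder to NULL when no delimiter is found is assumed from the upstream function), and check_disagrees is a hypothetical helper that reports when plain strchr() sees a delimiter that the escape-aware tokenizer passes over.

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified stand-in for qemu_rbd_next_tok(): split src at the first
 * unescaped delim. *rest is NULL when no such delimiter exists.
 */
static char *next_tok_sketch(char *src, char delim, char **rest)
{
    char *end;

    *rest = NULL;
    for (end = src; *end; ++end) {
        if (*end == delim) {
            break;
        }
        if (*end == '\\' && end[1] != '\0') {
            ++end;
        }
    }
    if (*end == delim) {
        *rest = end + 1;
        *end = '\0';
    }
    return src;
}

/*
 * Hypothetical helper: returns 1 when strchr() finds delim in token but
 * the tokenizer does not split on it, i.e. the pre-check was misleading.
 */
static int check_disagrees(char *token, char delim)
{
    char *rest;
    int strchr_found = strchr(token, delim) != NULL;

    next_tok_sketch(token, delim, &rest);
    return strchr_found && rest == NULL;
}
```

For the regression-test image name "aa\/bb", check_disagrees() reports a disagreement: strchr() finds the slash, the tokenizer skips it as escaped, and the remainder pointer stays NULL. In the namespace branch shown in the diff that remainder is written back into image_name, which is presumably the NULL that later code trips over.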