Message ID | ad7c89bbae34155566ae7c9ca2cb501f21c7d585.1724330921.git.javi.merino@cloud.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [XEN,v1] libxl: Fix nul-termination of the return value of libxl_xen_console_read_line() | expand |
On 22.08.2024 15:13, Javi Merino wrote: > When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)" > call in main_dmesg(). ASAN reports a heap buffer overflow: an > off-by-one access to cr->buffer. > > The readconsole sysctl copies up to count characters into the buffer, > but it does not add a null character at the end. Despite the > documentation of libxl_xen_console_read_line(), line_r is not > nul-terminated if 16384 characters were copied to the buffer. > > Fix this by making count one less that the size of the allocated > buffer so that the last byte is always null. > > Reported-by: Edwin Török <edwin.torok@cloud.com> > Signed-off-by: Javi Merino <javi.merino@cloud.com> Perhaps wants a Fixes: tag as well? > --- a/tools/libs/light/libxl_console.c > +++ b/tools/libs/light/libxl_console.c > @@ -779,7 +779,7 @@ libxl_xen_console_reader * > cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader)); > cr->buffer = libxl__zalloc(NOGC, size); > cr->size = size; > - cr->count = size; > + cr->count = size - 1; > cr->clear = clear; > cr->incremental = 1; While this looks to be addressing the issue at hand, I still wonder: Why does the "count" field exist at all? It's certainly odd to be set right here (when the buffer actually is empty). It's used solely in libxl_xen_console_read_line(), so could be a local variable there. Then, further, I wonder why struct libxl__xen_console_reader lives in libxl_internal.h - it's used solely in libxl_console.c. Finally - why libxl__zalloc() when libxl_xen_console_read_line() clears the buffer anyway? Jan
On 22/08/2024 2:13 pm, Javi Merino wrote: > When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)" > call in main_dmesg(). ASAN reports a heap buffer overflow: an > off-by-one access to cr->buffer. > > The readconsole sysctl copies up to count characters into the buffer, > but it does not add a null character at the end. Despite the > documentation of libxl_xen_console_read_line(), line_r is not > nul-terminated if 16384 characters were copied to the buffer. > > Fix this by making count one less that the size of the allocated > buffer so that the last byte is always null. > > Reported-by: Edwin Török <edwin.torok@cloud.com> > Signed-off-by: Javi Merino <javi.merino@cloud.com> > --- > tools/libs/light/libxl_console.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/tools/libs/light/libxl_console.c b/tools/libs/light/libxl_console.c > index a563c9d3c7f9..fa28e2139453 100644 > --- a/tools/libs/light/libxl_console.c > +++ b/tools/libs/light/libxl_console.c > @@ -779,7 +779,7 @@ libxl_xen_console_reader * > cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader)); > cr->buffer = libxl__zalloc(NOGC, size); > cr->size = size; > - cr->count = size; > + cr->count = size - 1; > cr->clear = clear; > cr->incremental = 1; > This looks like it will fix the ASAN issue, but I think a better fix would be: - printf("%s", line); + printf("%.*s", cr->count, line); because otherwise there's a world of sharp corners once Xen has wrapped the buffer for the first time. Which brings us a lot of other WTFs in this code... First, libxl_console.c describes it's functionality in terms of lines, and line_reader() in the API. Yet it's not lines, it's a 16k buffer with generally multi-line content. It's too late to fix the naming, but we could at least rewrite the comments not to be blatant lies. Just out of context above the hunk is: unsigned int size = 16384; which isn't accurate. The size of Xen's console ring can be changed at compile time (like XenServer does), and/or by command line variable. Because the dmesg ring is never full (it just keeps producing and overwriting tail data), it's impossible to get a clean copy except in a single hypercall; the incremental/offset parameters are an illusion, and do not function correctly if a new printk() has occurred between adjacent hypercalls. And that is why setting count to size - 1 probably isn't wise. It means that, even in the ideal case where Xen's ringbuffer is 16k, you've still got to make 2 hypercalls to get the content. ~Andrew
On Fri, Aug 23, 2024 at 08:31:50AM +0200, Jan Beulich wrote: > On 22.08.2024 15:13, Javi Merino wrote: > > When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)" > > call in main_dmesg(). ASAN reports a heap buffer overflow: an > > off-by-one access to cr->buffer. > > > > The readconsole sysctl copies up to count characters into the buffer, > > but it does not add a null character at the end. Despite the > > documentation of libxl_xen_console_read_line(), line_r is not > > nul-terminated if 16384 characters were copied to the buffer. > > > > Fix this by making count one less that the size of the allocated > > buffer so that the last byte is always null. > > > > Reported-by: Edwin Török <edwin.torok@cloud.com> > > Signed-off-by: Javi Merino <javi.merino@cloud.com> > > Perhaps wants a Fixes: tag as well? Seems to be: Fixes: 4024bae739cc ("xl: Add subcommand 'xl dmesg'") > > --- a/tools/libs/light/libxl_console.c > > +++ b/tools/libs/light/libxl_console.c > > @@ -779,7 +779,7 @@ libxl_xen_console_reader * > > cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader)); > > cr->buffer = libxl__zalloc(NOGC, size); > > cr->size = size; > > - cr->count = size; > > + cr->count = size - 1; > > cr->clear = clear; > > cr->incremental = 1; > > While this looks to be addressing the issue at hand, I still wonder: Why > does the "count" field exist at all? It's certainly odd to be set right > here (when the buffer actually is empty). It's used solely in > libxl_xen_console_read_line(), so could be a local variable there. It isn't only odd to have "count" field, it is also used the wrong way. Once "cr->count" is set to 0, the next call to libxl_xen_console_read_line() will simply return an empty NULL, even if in the mean time more data is available to read from the string. Also, if the last call to libxl_xen_console_read_line() was shorter than the buffer size, none of the following call will use the full size of the buffer. This isn't really a problem for `xl dmesg`, but could be if we want to implement "--follow" option for example. So Javi, could you remove that `cr->count` field and use a local variable instead? And a comment in the code might be useful about why there's a "-1". But I think a better way to address the issue would be to actually set '\0' at the end of the string after the call to xc_readconsolering(), and remove the not very useful memset(0). That way, it will be more explicite as to why we need "-1". > Then, further, I wonder why struct libxl__xen_console_reader lives in > libxl_internal.h - it's used solely in libxl_console.c. History? It actually use to live in libxl.h, 14 yr ago. > Finally - why libxl__zalloc() when libxl_xen_console_read_line() clears > the buffer anyway? Overuse of libxl__zalloc when it was use to replace the open coding that was used to allocate "cr". While it doesn't seems necessary, I don't think it hurt to keep it there. Thanks,
On Fri, Aug 23, 2024 at 10:34:05AM +0100, Andrew Cooper wrote: > On 22/08/2024 2:13 pm, Javi Merino wrote: > This looks like it will fix the ASAN issue, but I think a better fix > would be: > > - printf("%s", line); > + printf("%.*s", cr->count, line); nul-terminated string are safer to manipulate, and cr->count isn't available outside of libxl, so that's a no go. > because otherwise there's a world of sharp corners once Xen has wrapped > the buffer for the first time. If wrapped buffer are visible to libxl or `xl dmesg`, that a bug in xen, in XEN_SYSCTL_readconsole. It doesn't looks like it's possible to know when the console ring buffer wrapped. > > Which brings us a lot of other WTFs in this code... > > First, libxl_console.c describes it's functionality in terms of lines, > and line_reader() in the API. Yet it's not lines, it's a 16k buffer > with generally multi-line content. It's too late to fix the naming, but > we could at least rewrite the comments not to be blatant lies. > > > Just out of context above the hunk is: > > unsigned int size = 16384; > > which isn't accurate. The size of Xen's console ring can be changed at > compile time (like XenServer does), and/or by command line variable. This buffer size isn't link to Xen's console ring size, both value don't need to match. > Because the dmesg ring is never full (it just keeps producing and > overwriting tail data), it's impossible to get a clean copy except in a > single hypercall; the incremental/offset parameters are an illusion, and > do not function correctly if a new printk() has occurred between > adjacent hypercalls. Well, let's hope there's no more that "ring_size - size" is been written to the ring in between two hypercall. There's doesn't seems to be anything that can be done to work around this with this hypercall (beside using a huge buffer size). I think xenconsoled can read the console ring buffer, but does so continuously, so if `xl dmesg` have some shortcoming, it might be fine. > And that is why setting count to size - 1 probably isn't wise. It means > that, even in the ideal case where Xen's ringbuffer is 16k, you've still > got to make 2 hypercalls to get the content. Well, I've run `xl dmesg` on one machine, it seems to have taken 9 hypercall to get everything, and there's still the first few line of the boot available. So if the ringbuffer was indeed 16k, it would not be ideal at all... as lots of stuff would have been overwritten. Cheers,
On Fri, Aug 23, 2024 at 10:14:32AM GMT, Anthony PERARD wrote: > On Fri, Aug 23, 2024 at 08:31:50AM +0200, Jan Beulich wrote: > > On 22.08.2024 15:13, Javi Merino wrote: > > > When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)" > > > call in main_dmesg(). ASAN reports a heap buffer overflow: an > > > off-by-one access to cr->buffer. > > > > > > The readconsole sysctl copies up to count characters into the buffer, > > > but it does not add a null character at the end. Despite the > > > documentation of libxl_xen_console_read_line(), line_r is not > > > nul-terminated if 16384 characters were copied to the buffer. > > > > > > Fix this by making count one less that the size of the allocated > > > buffer so that the last byte is always null. > > > > > > Reported-by: Edwin Török <edwin.torok@cloud.com> > > > Signed-off-by: Javi Merino <javi.merino@cloud.com> > > > > Perhaps wants a Fixes: tag as well? > > Seems to be: > Fixes: 4024bae739cc ("xl: Add subcommand 'xl dmesg'") Yes, I'll add it. > > > --- a/tools/libs/light/libxl_console.c > > > +++ b/tools/libs/light/libxl_console.c > > > @@ -779,7 +779,7 @@ libxl_xen_console_reader * > > > cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader)); > > > cr->buffer = libxl__zalloc(NOGC, size); > > > cr->size = size; > > > - cr->count = size; > > > + cr->count = size - 1; > > > cr->clear = clear; > > > cr->incremental = 1; > > > > While this looks to be addressing the issue at hand, I still wonder: Why > > does the "count" field exist at all? It's certainly odd to be set right > > here (when the buffer actually is empty). It's used solely in > > libxl_xen_console_read_line(), so could be a local variable there. True. I just wanted to fix the actual bug, but I guess while we are at it we can improve this piece of code. > It isn't only odd to have "count" field, it is also used the wrong way. > Once "cr->count" is set to 0, the next call to > libxl_xen_console_read_line() will simply return an empty NULL, even if > in the mean time more data is available to read from the string. > Also, if the last call to libxl_xen_console_read_line() was shorter than > the buffer size, none of the following call will use the full size of > the buffer. This isn't really a problem for `xl dmesg`, but could be if > we want to implement "--follow" option for example. > > So Javi, could you remove that `cr->count` field and use a local > variable instead? Sure, I'll do that. > And a comment in the code might be useful about why there's a "-1". > > But I think a better way to address the issue would be to actually set > '\0' at the end of the string after the call to xc_readconsolering(), > and remove the not very useful memset(0). That way, it will be more > explicite as to why we need "-1". Right, I will get rid of the memset(0) and just nul-terminate the string. > > Then, further, I wonder why struct libxl__xen_console_reader lives in > > libxl_internal.h - it's used solely in libxl_console.c. > > History? It actually use to live in libxl.h, 14 yr ago. > > > Finally - why libxl__zalloc() when libxl_xen_console_read_line() clears > > the buffer anyway? > > Overuse of libxl__zalloc when it was use to replace the open coding that > was used to allocate "cr". While it doesn't seems necessary, I don't > think it hurt to keep it there. It used to be a malloc() that was changed to a zalloc() in 39eaabdf4131 (libxl: don't leak buf in libxl_xen_console_read_start error handling, 2013-12-03). The comment in the commit message is wrong, the malloc+memset was being done for libxl_xen_console_reader (the struct) not for the buffer. It doesn't hurt to keep it, but if I'm making changes in this area, I may as well just make it a libxl__malloc() . Cheers, Javi
On Fri, Aug 23, 2024 at 10:34:05AM GMT, Andrew Cooper wrote: > On 22/08/2024 2:13 pm, Javi Merino wrote: > > When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)" > > call in main_dmesg(). ASAN reports a heap buffer overflow: an > > off-by-one access to cr->buffer. > > > > The readconsole sysctl copies up to count characters into the buffer, > > but it does not add a null character at the end. Despite the > > documentation of libxl_xen_console_read_line(), line_r is not > > nul-terminated if 16384 characters were copied to the buffer. > > > > Fix this by making count one less that the size of the allocated > > buffer so that the last byte is always null. > > > > Reported-by: Edwin Török <edwin.torok@cloud.com> > > Signed-off-by: Javi Merino <javi.merino@cloud.com> > > --- > > tools/libs/light/libxl_console.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/tools/libs/light/libxl_console.c b/tools/libs/light/libxl_console.c > > index a563c9d3c7f9..fa28e2139453 100644 > > --- a/tools/libs/light/libxl_console.c > > +++ b/tools/libs/light/libxl_console.c > > @@ -779,7 +779,7 @@ libxl_xen_console_reader * > > cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader)); > > cr->buffer = libxl__zalloc(NOGC, size); > > cr->size = size; > > - cr->count = size; > > + cr->count = size - 1; > > cr->clear = clear; > > cr->incremental = 1; > > > > This looks like it will fix the ASAN issue, but I think a better fix > would be: > > - printf("%s", line); > + printf("%.*s", cr->count, line); > > because otherwise there's a world of sharp corners once Xen has wrapped > the buffer for the first time. > > > Which brings us a lot of other WTFs in this code... > > First, libxl_console.c describes it's functionality in terms of lines, > and line_reader() in the API. Yet it's not lines, it's a 16k buffer > with generally multi-line content. It's too late to fix the naming, but > we could at least rewrite the comments not to be blatant lies. > > > Just out of context above the hunk is: > > unsigned int size = 16384; > > which isn't accurate. The size of Xen's console ring can be changed at > compile time (like XenServer does), and/or by command line variable. > > Because the dmesg ring is never full (it just keeps producing and > overwriting tail data), it's impossible to get a clean copy except in a > single hypercall; the incremental/offset parameters are an illusion, and > do not function correctly if a new printk() has occurred between > adjacent hypercalls. > > And that is why setting count to size - 1 probably isn't wise. It means > that, even in the ideal case where Xen's ringbuffer is 16k, you've still > got to make 2 hypercalls to get the content. You are right, having to do 2 hypercalls in the ideal case is not good. I'm going to get rid of cr->count, and make libxl_xen_console_read_line() honour its documentation of returning a nul-terminated string. I'll just make the allocation one char bigger, which was the fix I originally had. Cheers, Javi
diff --git a/tools/libs/light/libxl_console.c b/tools/libs/light/libxl_console.c index a563c9d3c7f9..fa28e2139453 100644 --- a/tools/libs/light/libxl_console.c +++ b/tools/libs/light/libxl_console.c @@ -779,7 +779,7 @@ libxl_xen_console_reader * cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader)); cr->buffer = libxl__zalloc(NOGC, size); cr->size = size; - cr->count = size; + cr->count = size - 1; cr->clear = clear; cr->incremental = 1;
When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)" call in main_dmesg(). ASAN reports a heap buffer overflow: an off-by-one access to cr->buffer. The readconsole sysctl copies up to count characters into the buffer, but it does not add a null character at the end. Despite the documentation of libxl_xen_console_read_line(), line_r is not nul-terminated if 16384 characters were copied to the buffer. Fix this by making count one less that the size of the allocated buffer so that the last byte is always null. Reported-by: Edwin Török <edwin.torok@cloud.com> Signed-off-by: Javi Merino <javi.merino@cloud.com> --- tools/libs/light/libxl_console.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)