diff mbox series

[XEN,v1] libxl: Fix nul-termination of the return value of libxl_xen_console_read_line()

Message ID ad7c89bbae34155566ae7c9ca2cb501f21c7d585.1724330921.git.javi.merino@cloud.com (mailing list archive)
State New
Headers show
Series [XEN,v1] libxl: Fix nul-termination of the return value of libxl_xen_console_read_line() | expand

Commit Message

Javi Merino Aug. 22, 2024, 1:13 p.m. UTC
When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)"
call in main_dmesg().  ASAN reports a heap buffer overflow: an
off-by-one access to cr->buffer.

The readconsole sysctl copies up to count characters into the buffer,
but it does not add a null character at the end.  Despite the
documentation of libxl_xen_console_read_line(), line_r is not
nul-terminated if 16384 characters were copied to the buffer.

Fix this by making count one less that the size of the allocated
buffer so that the last byte is always null.

Reported-by: Edwin Török <edwin.torok@cloud.com>
Signed-off-by: Javi Merino <javi.merino@cloud.com>
---
 tools/libs/light/libxl_console.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jan Beulich Aug. 23, 2024, 6:31 a.m. UTC | #1
On 22.08.2024 15:13, Javi Merino wrote:
> When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)"
> call in main_dmesg().  ASAN reports a heap buffer overflow: an
> off-by-one access to cr->buffer.
> 
> The readconsole sysctl copies up to count characters into the buffer,
> but it does not add a null character at the end.  Despite the
> documentation of libxl_xen_console_read_line(), line_r is not
> nul-terminated if 16384 characters were copied to the buffer.
> 
> Fix this by making count one less that the size of the allocated
> buffer so that the last byte is always null.
> 
> Reported-by: Edwin Török <edwin.torok@cloud.com>
> Signed-off-by: Javi Merino <javi.merino@cloud.com>

Perhaps wants a Fixes: tag as well?

> --- a/tools/libs/light/libxl_console.c
> +++ b/tools/libs/light/libxl_console.c
> @@ -779,7 +779,7 @@ libxl_xen_console_reader *
>      cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader));
>      cr->buffer = libxl__zalloc(NOGC, size);
>      cr->size = size;
> -    cr->count = size;
> +    cr->count = size - 1;
>      cr->clear = clear;
>      cr->incremental = 1;

While this looks to be addressing the issue at hand, I still wonder: Why
does the "count" field exist at all? It's certainly odd to be set right
here (when the buffer actually is empty). It's used solely in
libxl_xen_console_read_line(), so could be a local variable there.

Then, further, I wonder why struct libxl__xen_console_reader lives in
libxl_internal.h - it's used solely in libxl_console.c.

Finally - why libxl__zalloc() when libxl_xen_console_read_line() clears
the buffer anyway?

Jan
Andrew Cooper Aug. 23, 2024, 9:34 a.m. UTC | #2
On 22/08/2024 2:13 pm, Javi Merino wrote:
> When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)"
> call in main_dmesg().  ASAN reports a heap buffer overflow: an
> off-by-one access to cr->buffer.
>
> The readconsole sysctl copies up to count characters into the buffer,
> but it does not add a null character at the end.  Despite the
> documentation of libxl_xen_console_read_line(), line_r is not
> nul-terminated if 16384 characters were copied to the buffer.
>
> Fix this by making count one less that the size of the allocated
> buffer so that the last byte is always null.
>
> Reported-by: Edwin Török <edwin.torok@cloud.com>
> Signed-off-by: Javi Merino <javi.merino@cloud.com>
> ---
>  tools/libs/light/libxl_console.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/libs/light/libxl_console.c b/tools/libs/light/libxl_console.c
> index a563c9d3c7f9..fa28e2139453 100644
> --- a/tools/libs/light/libxl_console.c
> +++ b/tools/libs/light/libxl_console.c
> @@ -779,7 +779,7 @@ libxl_xen_console_reader *
>      cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader));
>      cr->buffer = libxl__zalloc(NOGC, size);
>      cr->size = size;
> -    cr->count = size;
> +    cr->count = size - 1;
>      cr->clear = clear;
>      cr->incremental = 1;
>  

This looks like it will fix the ASAN issue, but I think a better fix
would be:

-        printf("%s", line);
+       printf("%.*s", cr->count, line);

because otherwise there's a world of sharp corners once Xen has wrapped
the buffer for the first time.


Which brings us a lot of other WTFs in this code...

First, libxl_console.c describes it's functionality in terms of lines,
and line_reader() in the API.  Yet it's not lines, it's a 16k buffer
with generally multi-line content.  It's too late to fix the naming, but
we could at least rewrite the comments not to be blatant lies.


Just out of context above the hunk is:

    unsigned int size = 16384;

which isn't accurate.  The size of Xen's console ring can be changed at
compile time (like XenServer does), and/or by command line variable.

Because the dmesg ring is never full (it just keeps producing and
overwriting tail data), it's impossible to get a clean copy except in a
single hypercall; the incremental/offset parameters are an illusion, and
do not function correctly if a new printk() has occurred between
adjacent hypercalls.

And that is why setting count to size - 1 probably isn't wise.  It means
that, even in the ideal case where Xen's ringbuffer is 16k, you've still
got to make 2 hypercalls to get the content.

~Andrew
Anthony PERARD Aug. 23, 2024, 10:14 a.m. UTC | #3
On Fri, Aug 23, 2024 at 08:31:50AM +0200, Jan Beulich wrote:
> On 22.08.2024 15:13, Javi Merino wrote:
> > When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)"
> > call in main_dmesg().  ASAN reports a heap buffer overflow: an
> > off-by-one access to cr->buffer.
> > 
> > The readconsole sysctl copies up to count characters into the buffer,
> > but it does not add a null character at the end.  Despite the
> > documentation of libxl_xen_console_read_line(), line_r is not
> > nul-terminated if 16384 characters were copied to the buffer.
> > 
> > Fix this by making count one less that the size of the allocated
> > buffer so that the last byte is always null.
> > 
> > Reported-by: Edwin Török <edwin.torok@cloud.com>
> > Signed-off-by: Javi Merino <javi.merino@cloud.com>
> 
> Perhaps wants a Fixes: tag as well?

Seems to be:
Fixes: 4024bae739cc ("xl: Add subcommand 'xl dmesg'")

> > --- a/tools/libs/light/libxl_console.c
> > +++ b/tools/libs/light/libxl_console.c
> > @@ -779,7 +779,7 @@ libxl_xen_console_reader *
> >      cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader));
> >      cr->buffer = libxl__zalloc(NOGC, size);
> >      cr->size = size;
> > -    cr->count = size;
> > +    cr->count = size - 1;
> >      cr->clear = clear;
> >      cr->incremental = 1;
> 
> While this looks to be addressing the issue at hand, I still wonder: Why
> does the "count" field exist at all? It's certainly odd to be set right
> here (when the buffer actually is empty). It's used solely in
> libxl_xen_console_read_line(), so could be a local variable there.

It isn't only odd to have "count" field, it is also used the wrong way.
Once "cr->count" is set to 0, the next call to
libxl_xen_console_read_line() will simply return an empty NULL, even if
in the mean time more data is available to read from the string.
Also, if the last call to libxl_xen_console_read_line() was shorter than
the buffer size, none of the following call will use the full size of
the buffer. This isn't really a problem for `xl dmesg`, but could be if
we want to implement "--follow" option for example.

So Javi, could you remove that `cr->count` field and use a local
variable instead?

And a comment in the code might be useful about why there's a "-1".

But I think a better way to address the issue would be to actually set
'\0' at the end of the string after the call to xc_readconsolering(),
and remove the not very useful memset(0). That way, it will be more
explicite as to why we need "-1".

> Then, further, I wonder why struct libxl__xen_console_reader lives in
> libxl_internal.h - it's used solely in libxl_console.c.

History? It actually use to live in libxl.h, 14 yr ago.

> Finally - why libxl__zalloc() when libxl_xen_console_read_line() clears
> the buffer anyway?

Overuse of libxl__zalloc when it was use to replace the open coding that
was used to allocate "cr". While it doesn't seems necessary, I don't
think it hurt to keep it there.

Thanks,
Anthony PERARD Aug. 23, 2024, 10:39 a.m. UTC | #4
On Fri, Aug 23, 2024 at 10:34:05AM +0100, Andrew Cooper wrote:
> On 22/08/2024 2:13 pm, Javi Merino wrote:
> This looks like it will fix the ASAN issue, but I think a better fix
> would be:
> 
> -        printf("%s", line);
> +       printf("%.*s", cr->count, line);

nul-terminated string are safer to manipulate, and cr->count isn't
available outside of libxl, so that's a no go.

> because otherwise there's a world of sharp corners once Xen has wrapped
> the buffer for the first time.

If wrapped buffer are visible to libxl or `xl dmesg`, that a bug in xen,
in XEN_SYSCTL_readconsole. It doesn't looks like it's possible to know
when the console ring buffer wrapped.

> 
> Which brings us a lot of other WTFs in this code...
> 
> First, libxl_console.c describes it's functionality in terms of lines,
> and line_reader() in the API.  Yet it's not lines, it's a 16k buffer
> with generally multi-line content.  It's too late to fix the naming, but
> we could at least rewrite the comments not to be blatant lies.
> 
> 
> Just out of context above the hunk is:
> 
>     unsigned int size = 16384;
> 
> which isn't accurate.  The size of Xen's console ring can be changed at
> compile time (like XenServer does), and/or by command line variable.

This buffer size isn't link to Xen's console ring size, both value don't
need to match.

> Because the dmesg ring is never full (it just keeps producing and
> overwriting tail data), it's impossible to get a clean copy except in a
> single hypercall; the incremental/offset parameters are an illusion, and
> do not function correctly if a new printk() has occurred between
> adjacent hypercalls.

Well, let's hope there's no more that "ring_size - size" is been written
to the ring in between two hypercall. There's doesn't seems to be
anything that can be done to work around this with this hypercall
(beside using a huge buffer size).

I think xenconsoled can read the console ring buffer, but does so
continuously, so if `xl dmesg` have some shortcoming, it might be fine.

> And that is why setting count to size - 1 probably isn't wise.  It means
> that, even in the ideal case where Xen's ringbuffer is 16k, you've still
> got to make 2 hypercalls to get the content.

Well, I've run `xl dmesg` on one machine, it seems to have taken 9
hypercall to get everything, and there's still the first few line of the
boot available. So if the ringbuffer was indeed 16k, it would not be
ideal at all... as lots of stuff would have been overwritten.

Cheers,
Javi Merino Aug. 23, 2024, 12:53 p.m. UTC | #5
On Fri, Aug 23, 2024 at 10:14:32AM GMT, Anthony PERARD wrote:
> On Fri, Aug 23, 2024 at 08:31:50AM +0200, Jan Beulich wrote:
> > On 22.08.2024 15:13, Javi Merino wrote:
> > > When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)"
> > > call in main_dmesg().  ASAN reports a heap buffer overflow: an
> > > off-by-one access to cr->buffer.
> > >
> > > The readconsole sysctl copies up to count characters into the buffer,
> > > but it does not add a null character at the end.  Despite the
> > > documentation of libxl_xen_console_read_line(), line_r is not
> > > nul-terminated if 16384 characters were copied to the buffer.
> > >
> > > Fix this by making count one less that the size of the allocated
> > > buffer so that the last byte is always null.
> > >
> > > Reported-by: Edwin Török <edwin.torok@cloud.com>
> > > Signed-off-by: Javi Merino <javi.merino@cloud.com>
> >
> > Perhaps wants a Fixes: tag as well?
> 
> Seems to be:
> Fixes: 4024bae739cc ("xl: Add subcommand 'xl dmesg'")

Yes, I'll add it.

> > > --- a/tools/libs/light/libxl_console.c
> > > +++ b/tools/libs/light/libxl_console.c
> > > @@ -779,7 +779,7 @@ libxl_xen_console_reader *
> > >      cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader));
> > >      cr->buffer = libxl__zalloc(NOGC, size);
> > >      cr->size = size;
> > > -    cr->count = size;
> > > +    cr->count = size - 1;
> > >      cr->clear = clear;
> > >      cr->incremental = 1;
> >
> > While this looks to be addressing the issue at hand, I still wonder: Why
> > does the "count" field exist at all? It's certainly odd to be set right
> > here (when the buffer actually is empty). It's used solely in
> > libxl_xen_console_read_line(), so could be a local variable there.

True.  I just wanted to fix the actual bug, but I guess while we are
at it we can improve this piece of code.

> It isn't only odd to have "count" field, it is also used the wrong way.
> Once "cr->count" is set to 0, the next call to
> libxl_xen_console_read_line() will simply return an empty NULL, even if
> in the mean time more data is available to read from the string.
> Also, if the last call to libxl_xen_console_read_line() was shorter than
> the buffer size, none of the following call will use the full size of
> the buffer. This isn't really a problem for `xl dmesg`, but could be if
> we want to implement "--follow" option for example.
> 
> So Javi, could you remove that `cr->count` field and use a local
> variable instead?

Sure, I'll do that.

> And a comment in the code might be useful about why there's a "-1".
> 
> But I think a better way to address the issue would be to actually set
> '\0' at the end of the string after the call to xc_readconsolering(),
> and remove the not very useful memset(0). That way, it will be more
> explicite as to why we need "-1".

Right, I will get rid of the memset(0) and just nul-terminate the string.

> > Then, further, I wonder why struct libxl__xen_console_reader lives in
> > libxl_internal.h - it's used solely in libxl_console.c.
> 
> History? It actually use to live in libxl.h, 14 yr ago.
>
> > Finally - why libxl__zalloc() when libxl_xen_console_read_line() clears
> > the buffer anyway?
> 
> Overuse of libxl__zalloc when it was use to replace the open coding that
> was used to allocate "cr". While it doesn't seems necessary, I don't
> think it hurt to keep it there.

It used to be a malloc() that was changed to a zalloc() in
39eaabdf4131 (libxl: don't leak buf in libxl_xen_console_read_start
error handling, 2013-12-03).  The comment in the commit message is
wrong, the malloc+memset was being done for libxl_xen_console_reader
(the struct) not for the buffer.

It doesn't hurt to keep it, but if I'm making changes in this area, I
may as well just make it a libxl__malloc() .

Cheers,
Javi
Javi Merino Aug. 23, 2024, 12:57 p.m. UTC | #6
On Fri, Aug 23, 2024 at 10:34:05AM GMT, Andrew Cooper wrote:
> On 22/08/2024 2:13 pm, Javi Merino wrote:
> > When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)"
> > call in main_dmesg().  ASAN reports a heap buffer overflow: an
> > off-by-one access to cr->buffer.
> >
> > The readconsole sysctl copies up to count characters into the buffer,
> > but it does not add a null character at the end.  Despite the
> > documentation of libxl_xen_console_read_line(), line_r is not
> > nul-terminated if 16384 characters were copied to the buffer.
> >
> > Fix this by making count one less that the size of the allocated
> > buffer so that the last byte is always null.
> >
> > Reported-by: Edwin Török <edwin.torok@cloud.com>
> > Signed-off-by: Javi Merino <javi.merino@cloud.com>
> > ---
> >  tools/libs/light/libxl_console.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/libs/light/libxl_console.c b/tools/libs/light/libxl_console.c
> > index a563c9d3c7f9..fa28e2139453 100644
> > --- a/tools/libs/light/libxl_console.c
> > +++ b/tools/libs/light/libxl_console.c
> > @@ -779,7 +779,7 @@ libxl_xen_console_reader *
> >      cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader));
> >      cr->buffer = libxl__zalloc(NOGC, size);
> >      cr->size = size;
> > -    cr->count = size;
> > +    cr->count = size - 1;
> >      cr->clear = clear;
> >      cr->incremental = 1;
> >  
> 
> This looks like it will fix the ASAN issue, but I think a better fix
> would be:
> 
> -        printf("%s", line);
> +       printf("%.*s", cr->count, line);
> 
> because otherwise there's a world of sharp corners once Xen has wrapped
> the buffer for the first time.
> 
> 
> Which brings us a lot of other WTFs in this code...
> 
> First, libxl_console.c describes it's functionality in terms of lines,
> and line_reader() in the API.  Yet it's not lines, it's a 16k buffer
> with generally multi-line content.  It's too late to fix the naming, but
> we could at least rewrite the comments not to be blatant lies.
> 
> 
> Just out of context above the hunk is:
> 
>     unsigned int size = 16384;
> 
> which isn't accurate.  The size of Xen's console ring can be changed at
> compile time (like XenServer does), and/or by command line variable.
> 
> Because the dmesg ring is never full (it just keeps producing and
> overwriting tail data), it's impossible to get a clean copy except in a
> single hypercall; the incremental/offset parameters are an illusion, and
> do not function correctly if a new printk() has occurred between
> adjacent hypercalls.
> 
> And that is why setting count to size - 1 probably isn't wise.  It means
> that, even in the ideal case where Xen's ringbuffer is 16k, you've still
> got to make 2 hypercalls to get the content.

You are right, having to do 2 hypercalls in the ideal case is not
good.  I'm going to get rid of cr->count, and make
libxl_xen_console_read_line() honour its documentation of returning a
nul-terminated string.  I'll just make the allocation one char bigger,
which was the fix I originally had.

Cheers,
Javi
diff mbox series

Patch

diff --git a/tools/libs/light/libxl_console.c b/tools/libs/light/libxl_console.c
index a563c9d3c7f9..fa28e2139453 100644
--- a/tools/libs/light/libxl_console.c
+++ b/tools/libs/light/libxl_console.c
@@ -779,7 +779,7 @@  libxl_xen_console_reader *
     cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader));
     cr->buffer = libxl__zalloc(NOGC, size);
     cr->size = size;
-    cr->count = size;
+    cr->count = size - 1;
     cr->clear = clear;
     cr->incremental = 1;