diff mbox

[1/2] qga-win: prevent crash when executing fsinfo command

Message ID 20180626151038.24771-2-sameeh@daynix.com (mailing list archive)
State New, archived
Headers show

Commit Message

Sameeh Jubran June 26, 2018, 3:10 p.m. UTC
From: Sameeh Jubran <sjubran@redhat.com>

The fsinfo command is currently implemented for Windows only and it's disk
parameter can be enabled by adding the define "CONFIG_QGA_NTDDSCSI" to the qga
code. When enabled and executed the qemu-ga crashed with the following message:

------------------------------------------------
File qapi/qapi-visit-core.c, Line 49

Expression: !(v->type & VISITOR_OUTPUT) || *obj)
------------------------------------------------

After some digging, turns out that the GuestPCIAddress is null and the
qapi visitor doesn't like that, so we can always allocate it instead and
initiate all it's members to -1.

Signed-off-by: Sameeh Jubran <sjubran@redhat.com>
---
 qga/commands-win32.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Philippe Mathieu-Daudé June 26, 2018, 4:11 p.m. UTC | #1
On 06/26/2018 12:10 PM, Sameeh Jubran wrote:
> From: Sameeh Jubran <sjubran@redhat.com>
> 
> The fsinfo command is currently implemented for Windows only and it's disk
> parameter can be enabled by adding the define "CONFIG_QGA_NTDDSCSI" to the qga
> code. When enabled and executed the qemu-ga crashed with the following message:
> 
> ------------------------------------------------
> File qapi/qapi-visit-core.c, Line 49
> 
> Expression: !(v->type & VISITOR_OUTPUT) || *obj)
> ------------------------------------------------

Due to these multiple lines starting with '---', this patch confuses
Thunderbird + colorediffs extension in a funny way...

> 
> After some digging, turns out that the GuestPCIAddress is null and the
> qapi visitor doesn't like that, so we can always allocate it instead and
> initiate all it's members to -1.
> 
> Signed-off-by: Sameeh Jubran <sjubran@redhat.com>

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

> ---
>  qga/commands-win32.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/qga/commands-win32.c b/qga/commands-win32.c
> index 2d48394748..c5f1c884e1 100644
> --- a/qga/commands-win32.c
> +++ b/qga/commands-win32.c
> @@ -485,6 +485,11 @@ static GuestPCIAddress *get_pci_info(char *guid, Error **errp)
>      char *buffer = NULL;
>      GuestPCIAddress *pci = NULL;
>      char *name = g_strdup(&guid[4]);
> +    pci = g_malloc0(sizeof(*pci));
> +    pci->domain = -1;
> +    pci->slot = -1;
> +    pci->function = -1;
> +    pci->bus = -1;
>  
>      if (!QueryDosDevice(name, dev_name, ARRAY_SIZE(dev_name))) {
>          error_setg_win32(errp, GetLastError(), "failed to get dos device name");
> @@ -556,7 +561,6 @@ static GuestPCIAddress *get_pci_info(char *guid, Error **errp)
>  
>          func = addr & 0x0000FFFF;
>          dev = (addr >> 16) & 0x0000FFFF;
> -        pci = g_malloc0(sizeof(*pci));
>          pci->domain = dev;
>          pci->slot = slot;
>          pci->function = func;
>
Eric Blake June 28, 2018, 9:44 p.m. UTC | #2
On 06/26/2018 10:10 AM, Sameeh Jubran wrote:
> From: Sameeh Jubran <sjubran@redhat.com>
> 
> The fsinfo command is currently implemented for Windows only and it's disk
> parameter can be enabled by adding the define "CONFIG_QGA_NTDDSCSI" to the qga
> code. When enabled and executed the qemu-ga crashed with the following message:
> 
> ------------------------------------------------
> File qapi/qapi-visit-core.c, Line 49
> 
> Expression: !(v->type & VISITOR_OUTPUT) || *obj)
> ------------------------------------------------
> 
> After some digging, turns out that the GuestPCIAddress is null and the
> qapi visitor doesn't like that, so we can always allocate it instead and
> initiate all it's members to -1.

Adding Markus for a qapi back-compat question.

Is faking an invalid address better than making the output optional instead?

> +++ b/qga/commands-win32.c
> @@ -485,6 +485,11 @@ static GuestPCIAddress *get_pci_info(char *guid, Error **errp)
>       char *buffer = NULL;
>       GuestPCIAddress *pci = NULL;
>       char *name = g_strdup(&guid[4]);
> +    pci = g_malloc0(sizeof(*pci));
> +    pci->domain = -1;
> +    pci->slot = -1;
> +    pci->function = -1;
> +    pci->bus = -1;

Right now, we have:

##
# @GuestDiskAddress:
#
# @pci-controller: controller's PCI address
# @bus-type: bus type
# @bus: bus id
# @target: target id
# @unit: unit id
#
# Since: 2.2
##
{ 'struct': 'GuestDiskAddress',
   'data': {'pci-controller': 'GuestPCIAddress',
            'bus-type': 'GuestDiskBusType',
            'bus': 'int', 'target': 'int', 'unit': 'int'} }

and the problem you ran into is that under certain conditions, we have 
no idea what to populate in GuestPCIAddress.  Your patch populates 
garbage instead; but I'm wondering if it would be better to instead mark 
pci-controller as optional, where code that CAN populate it would set 
has_pci_controller=true, and the code that crashed will now work by 
leaving the struct NULL (and has_pci_controller=false).  But that 
removes output that the client may expect to be present, hence, this is 
a back-compat question of the best way to approach this.
Sameeh Jubran June 28, 2018, 11:33 p.m. UTC | #3
On Fri, Jun 29, 2018 at 12:44 AM, Eric Blake <eblake@redhat.com> wrote:

> On 06/26/2018 10:10 AM, Sameeh Jubran wrote:
>
>> From: Sameeh Jubran <sjubran@redhat.com>
>>
>> The fsinfo command is currently implemented for Windows only and it's disk
>> parameter can be enabled by adding the define "CONFIG_QGA_NTDDSCSI" to
>> the qga
>> code. When enabled and executed the qemu-ga crashed with the following
>> message:
>>
>> ------------------------------------------------
>> File qapi/qapi-visit-core.c, Line 49
>>
>> Expression: !(v->type & VISITOR_OUTPUT) || *obj)
>> ------------------------------------------------
>>
>> After some digging, turns out that the GuestPCIAddress is null and the
>> qapi visitor doesn't like that, so we can always allocate it instead and
>> initiate all it's members to -1.
>>
>
> Adding Markus for a qapi back-compat question.
>
> Is faking an invalid address better than making the output optional
> instead?
>
> +++ b/qga/commands-win32.c
>> @@ -485,6 +485,11 @@ static GuestPCIAddress *get_pci_info(char *guid,
>> Error **errp)
>>       char *buffer = NULL;
>>       GuestPCIAddress *pci = NULL;
>>       char *name = g_strdup(&guid[4]);
>> +    pci = g_malloc0(sizeof(*pci));
>> +    pci->domain = -1;
>> +    pci->slot = -1;
>> +    pci->function = -1;
>> +    pci->bus = -1;
>>
>
> Right now, we have:
>
> ##
> # @GuestDiskAddress:
> #
> # @pci-controller: controller's PCI address
> # @bus-type: bus type
> # @bus: bus id
> # @target: target id
> # @unit: unit id
> #
> # Since: 2.2
> ##
> { 'struct': 'GuestDiskAddress',
>   'data': {'pci-controller': 'GuestPCIAddress',
>            'bus-type': 'GuestDiskBusType',
>            'bus': 'int', 'target': 'int', 'unit': 'int'} }
>
> and the problem you ran into is that under certain conditions, we have no
> idea what to populate in GuestPCIAddress.  Your patch populates garbage
> instead; but I'm wondering if it would be better to instead mark
> pci-controller as optional, where code that CAN populate it would set
> has_pci_controller=true, and the code that crashed will now work by leaving
> the struct NULL (and has_pci_controller=false).  But that removes output
> that the client may expect to be present, hence, this is a back-compat
> question of the best way to approach this.

Since all of the fields are ints, I believe that "-1" is very informative
even though I don't like this approach but it does preserve backward
compatibility as you've already clarified.

I don't want to assume anything but this bug has been laying around for
quite a while and no one complained, moreover,  the disk option is not
enabled by default in upstream code so I don't think that backward
compatibility is actually disturbed but then again this is just an
assumption.

One more thing to consider is the second patch of this series which
actually was the root cause of why we didn't allocate the " GuestDiskAddress"
struct which is some registry keys being absent for the specific device
which would leave us with two options:
1. Don not return anything for all of the fields (leave it null)
2. Return partial information
I think that the second option is preferable for obvious reasons.

I am not that familiar with schema files, but it is possible to make int
fields optional too?

I can live with either approaches, maintainers, it's your call :)

Thanks for the review!

>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
>
Michael Roth July 9, 2018, 10:58 p.m. UTC | #4
Quoting Sameeh Jubran (2018-06-28 18:33:58)
> 
> 
> On Fri, Jun 29, 2018 at 12:44 AM, Eric Blake <eblake@redhat.com> wrote:
> 
>     On 06/26/2018 10:10 AM, Sameeh Jubran wrote:
> 
>         From: Sameeh Jubran <sjubran@redhat.com>
> 
>         The fsinfo command is currently implemented for Windows only and it's
>         disk
>         parameter can be enabled by adding the define "CONFIG_QGA_NTDDSCSI" to
>         the qga
>         code. When enabled and executed the qemu-ga crashed with the following
>         message:
> 
>         ------------------------------------------------
>         File qapi/qapi-visit-core.c, Line 49
> 
>         Expression: !(v->type & VISITOR_OUTPUT) || *obj)
>         ------------------------------------------------
> 
>         After some digging, turns out that the GuestPCIAddress is null and the
>         qapi visitor doesn't like that, so we can always allocate it instead
>         and
>         initiate all it's members to -1.
> 
> 
>     Adding Markus for a qapi back-compat question.
> 
>     Is faking an invalid address better than making the output optional
>     instead?
> 
> 
>         +++ b/qga/commands-win32.c
>         @@ -485,6 +485,11 @@ static GuestPCIAddress *get_pci_info(char *guid,
>         Error **errp)
>               char *buffer = NULL;
>               GuestPCIAddress *pci = NULL;
>               char *name = g_strdup(&guid[4]);
>         +    pci = g_malloc0(sizeof(*pci));
>         +    pci->domain = -1;
>         +    pci->slot = -1;
>         +    pci->function = -1;
>         +    pci->bus = -1;
> 
> 
>     Right now, we have:
> 
>     ##
>     # @GuestDiskAddress:
>     #
>     # @pci-controller: controller's PCI address
>     # @bus-type: bus type
>     # @bus: bus id
>     # @target: target id
>     # @unit: unit id
>     #
>     # Since: 2.2
>     ##
>     { 'struct': 'GuestDiskAddress',
>       'data': {'pci-controller': 'GuestPCIAddress',
>                'bus-type': 'GuestDiskBusType',
>                'bus': 'int', 'target': 'int', 'unit': 'int'} }
> 
>     and the problem you ran into is that under certain conditions, we have no
>     idea what to populate in GuestPCIAddress.  Your patch populates garbage
>     instead; but I'm wondering if it would be better to instead mark
>     pci-controller as optional, where code that CAN populate it would set
>     has_pci_controller=true, and the code that crashed will now work by leaving
>     the struct NULL (and has_pci_controller=false).  But that removes output
>     that the client may expect to be present, hence, this is a back-compat
>     question of the best way to approach this.
> 
> Since all of the fields are ints, I believe that "-1" is very informative even
> though I don't like this approach but it does preserve backward compatibility 
> as you've already clarified.
> 
> I don't want to assume anything but this bug has been laying around for quite a
> while and no one complained, moreover,  the disk option is not enabled by
> default in upstream code so I don't think that backward compatibility is
> actually disturbed but then again this is just an assumption.
> 
> One more thing to consider is the second patch of this series which actually
> was the root cause of why we didn't allocate the " GuestDiskAddress" struct
> which is some registry keys being absent for the specific device which would
> leave us with two options:
> 1. Don not return anything for all of the fields (leave it null)
> 2. Return partial information
> I think that the second option is preferable for obvious reasons.
> 
> I am not that familiar with schema files, but it is possible to make int fields
> optional too?
> 
> I can live with either approaches, maintainers, it's your call :)

IMO both approaches potentially break clients, but making the field
optional in this case would be likely to trigger a segfault for any code
that relies on GuestPCIAddress being populated, whereas -1 would likely
just propagate through the stack as a signed integer as the schema calls
for. Lack of -1/undefined handling further up the stack could still
cause breakage but the severity seems lesser. I'm inclined to go this
route for 2.13. If there are no objections I plan to include this in a
pull for 2.13-rc1.

> 
> Thanks for the review!
> 
>    
>     --
>     Eric Blake, Principal Software Engineer
>     Red Hat, Inc.           +1-919-301-3266
>     Virtualization:  qemu.org | libvirt.org
> 
> 
> 
> 
> --
> Respectfully,
> Sameeh Jubran
> Linkedin
> Software Engineer @ Daynix.
Sameeh Jubran July 16, 2018, 11:42 a.m. UTC | #5
I've found another bug which is related to this one, where pci_controller
might also be null:
in the function "build_guest_disk_info" in "qga/commands-win32.c"

if (bus == BusTypeScsi || bus == BusTypeAta || bus == BusTypeRAID

#if (_WIN32_WINNT >= 0x0600)

/* This bus type is not supported before Windows Server 2003 SP1 */

|| bus == BusTypeSas

#endif

) {

/* We are able to use the same ioctls for different bus types

* according to Microsoft docs

* https://technet.microsoft.com/en-us/library/ee851589(v=ws.10).aspx */

if (DeviceIoControl(vol_h, IOCTL_SCSI_GET_ADDRESS, NULL, 0, scsi_ad,

sizeof(SCSI_ADDRESS), &len, NULL)) {

disk->unit = addr.Lun;

disk->target = addr.TargetId;

disk->bus = addr.PathId;

disk->pci_controller = get_pci_info(name, errp);

}

/* We do not set error in this case, because we still have enough

* information about volume. */

} else {

disk->pci_controller = NULL;

}

This can be easily prevented by always calling "get_pci_info(name, errp);",
is this a separate patch, or should be combined with this one? what do you
think?

On Tue, Jul 10, 2018 at 1:58 AM, Michael Roth <mdroth@linux.vnet.ibm.com>
wrote:

> Quoting Sameeh Jubran (2018-06-28 18:33:58)
> >
> >
> > On Fri, Jun 29, 2018 at 12:44 AM, Eric Blake <eblake@redhat.com> wrote:
> >
> >     On 06/26/2018 10:10 AM, Sameeh Jubran wrote:
> >
> >         From: Sameeh Jubran <sjubran@redhat.com>
> >
> >         The fsinfo command is currently implemented for Windows only and
> it's
> >         disk
> >         parameter can be enabled by adding the define
> "CONFIG_QGA_NTDDSCSI" to
> >         the qga
> >         code. When enabled and executed the qemu-ga crashed with the
> following
> >         message:
> >
> >         ------------------------------------------------
> >         File qapi/qapi-visit-core.c, Line 49
> >
> >         Expression: !(v->type & VISITOR_OUTPUT) || *obj)
> >         ------------------------------------------------
> >
> >         After some digging, turns out that the GuestPCIAddress is null
> and the
> >         qapi visitor doesn't like that, so we can always allocate it
> instead
> >         and
> >         initiate all it's members to -1.
> >
> >
> >     Adding Markus for a qapi back-compat question.
> >
> >     Is faking an invalid address better than making the output optional
> >     instead?
> >
> >
> >         +++ b/qga/commands-win32.c
> >         @@ -485,6 +485,11 @@ static GuestPCIAddress *get_pci_info(char
> *guid,
> >         Error **errp)
> >               char *buffer = NULL;
> >               GuestPCIAddress *pci = NULL;
> >               char *name = g_strdup(&guid[4]);
> >         +    pci = g_malloc0(sizeof(*pci));
> >         +    pci->domain = -1;
> >         +    pci->slot = -1;
> >         +    pci->function = -1;
> >         +    pci->bus = -1;
> >
> >
> >     Right now, we have:
> >
> >     ##
> >     # @GuestDiskAddress:
> >     #
> >     # @pci-controller: controller's PCI address
> >     # @bus-type: bus type
> >     # @bus: bus id
> >     # @target: target id
> >     # @unit: unit id
> >     #
> >     # Since: 2.2
> >     ##
> >     { 'struct': 'GuestDiskAddress',
> >       'data': {'pci-controller': 'GuestPCIAddress',
> >                'bus-type': 'GuestDiskBusType',
> >                'bus': 'int', 'target': 'int', 'unit': 'int'} }
> >
> >     and the problem you ran into is that under certain conditions, we
> have no
> >     idea what to populate in GuestPCIAddress.  Your patch populates
> garbage
> >     instead; but I'm wondering if it would be better to instead mark
> >     pci-controller as optional, where code that CAN populate it would set
> >     has_pci_controller=true, and the code that crashed will now work by
> leaving
> >     the struct NULL (and has_pci_controller=false).  But that removes
> output
> >     that the client may expect to be present, hence, this is a
> back-compat
> >     question of the best way to approach this.
> >
> > Since all of the fields are ints, I believe that "-1" is very
> informative even
> > though I don't like this approach but it does preserve backward
> compatibility
> > as you've already clarified.
> >
> > I don't want to assume anything but this bug has been laying around for
> quite a
> > while and no one complained, moreover,  the disk option is not enabled by
> > default in upstream code so I don't think that backward compatibility is
> > actually disturbed but then again this is just an assumption.
> >
> > One more thing to consider is the second patch of this series which
> actually
> > was the root cause of why we didn't allocate the " GuestDiskAddress"
> struct
> > which is some registry keys being absent for the specific device which
> would
> > leave us with two options:
> > 1. Don not return anything for all of the fields (leave it null)
> > 2. Return partial information
> > I think that the second option is preferable for obvious reasons.
> >
> > I am not that familiar with schema files, but it is possible to make int
> fields
> > optional too?
> >
> > I can live with either approaches, maintainers, it's your call :)
>
> IMO both approaches potentially break clients, but making the field
> optional in this case would be likely to trigger a segfault for any code
> that relies on GuestPCIAddress being populated, whereas -1 would likely
> just propagate through the stack as a signed integer as the schema calls
> for. Lack of -1/undefined handling further up the stack could still
> cause breakage but the severity seems lesser. I'm inclined to go this
> route for 2.13. If there are no objections I plan to include this in a
> pull for 2.13-rc1.
>
> >
> > Thanks for the review!
> >
> >
> >     --
> >     Eric Blake, Principal Software Engineer
> >     Red Hat, Inc.           +1-919-301-3266
> >     Virtualization:  qemu.org | libvirt.org
> >
> >
> >
> >
> > --
> > Respectfully,
> > Sameeh Jubran
> > Linkedin
> > Software Engineer @ Daynix.
>
>
diff mbox

Patch

diff --git a/qga/commands-win32.c b/qga/commands-win32.c
index 2d48394748..c5f1c884e1 100644
--- a/qga/commands-win32.c
+++ b/qga/commands-win32.c
@@ -485,6 +485,11 @@  static GuestPCIAddress *get_pci_info(char *guid, Error **errp)
     char *buffer = NULL;
     GuestPCIAddress *pci = NULL;
     char *name = g_strdup(&guid[4]);
+    pci = g_malloc0(sizeof(*pci));
+    pci->domain = -1;
+    pci->slot = -1;
+    pci->function = -1;
+    pci->bus = -1;
 
     if (!QueryDosDevice(name, dev_name, ARRAY_SIZE(dev_name))) {
         error_setg_win32(errp, GetLastError(), "failed to get dos device name");
@@ -556,7 +561,6 @@  static GuestPCIAddress *get_pci_info(char *guid, Error **errp)
 
         func = addr & 0x0000FFFF;
         dev = (addr >> 16) & 0x0000FFFF;
-        pci = g_malloc0(sizeof(*pci));
         pci->domain = dev;
         pci->slot = slot;
         pci->function = func;