Message ID | 20180626151038.24771-2-sameeh@daynix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 06/26/2018 12:10 PM, Sameeh Jubran wrote: > From: Sameeh Jubran <sjubran@redhat.com> > > The fsinfo command is currently implemented for Windows only and it's disk > parameter can be enabled by adding the define "CONFIG_QGA_NTDDSCSI" to the qga > code. When enabled and executed the qemu-ga crashed with the following message: > > ------------------------------------------------ > File qapi/qapi-visit-core.c, Line 49 > > Expression: !(v->type & VISITOR_OUTPUT) || *obj) > ------------------------------------------------ Due to these multiple lines starting with '---', this patch confuses Thunderbird + colorediffs extension in a funny way... > > After some digging, turns out that the GuestPCIAddress is null and the > qapi visitor doesn't like that, so we can always allocate it instead and > initiate all it's members to -1. > > Signed-off-by: Sameeh Jubran <sjubran@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> > --- > qga/commands-win32.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/qga/commands-win32.c b/qga/commands-win32.c > index 2d48394748..c5f1c884e1 100644 > --- a/qga/commands-win32.c > +++ b/qga/commands-win32.c > @@ -485,6 +485,11 @@ static GuestPCIAddress *get_pci_info(char *guid, Error **errp) > char *buffer = NULL; > GuestPCIAddress *pci = NULL; > char *name = g_strdup(&guid[4]); > + pci = g_malloc0(sizeof(*pci)); > + pci->domain = -1; > + pci->slot = -1; > + pci->function = -1; > + pci->bus = -1; > > if (!QueryDosDevice(name, dev_name, ARRAY_SIZE(dev_name))) { > error_setg_win32(errp, GetLastError(), "failed to get dos device name"); > @@ -556,7 +561,6 @@ static GuestPCIAddress *get_pci_info(char *guid, Error **errp) > > func = addr & 0x0000FFFF; > dev = (addr >> 16) & 0x0000FFFF; > - pci = g_malloc0(sizeof(*pci)); > pci->domain = dev; > pci->slot = slot; > pci->function = func; >
On 06/26/2018 10:10 AM, Sameeh Jubran wrote: > From: Sameeh Jubran <sjubran@redhat.com> > > The fsinfo command is currently implemented for Windows only and it's disk > parameter can be enabled by adding the define "CONFIG_QGA_NTDDSCSI" to the qga > code. When enabled and executed the qemu-ga crashed with the following message: > > ------------------------------------------------ > File qapi/qapi-visit-core.c, Line 49 > > Expression: !(v->type & VISITOR_OUTPUT) || *obj) > ------------------------------------------------ > > After some digging, turns out that the GuestPCIAddress is null and the > qapi visitor doesn't like that, so we can always allocate it instead and > initiate all it's members to -1. Adding Markus for a qapi back-compat question. Is faking an invalid address better than making the output optional instead? > +++ b/qga/commands-win32.c > @@ -485,6 +485,11 @@ static GuestPCIAddress *get_pci_info(char *guid, Error **errp) > char *buffer = NULL; > GuestPCIAddress *pci = NULL; > char *name = g_strdup(&guid[4]); > + pci = g_malloc0(sizeof(*pci)); > + pci->domain = -1; > + pci->slot = -1; > + pci->function = -1; > + pci->bus = -1; Right now, we have: ## # @GuestDiskAddress: # # @pci-controller: controller's PCI address # @bus-type: bus type # @bus: bus id # @target: target id # @unit: unit id # # Since: 2.2 ## { 'struct': 'GuestDiskAddress', 'data': {'pci-controller': 'GuestPCIAddress', 'bus-type': 'GuestDiskBusType', 'bus': 'int', 'target': 'int', 'unit': 'int'} } and the problem you ran into is that under certain conditions, we have no idea what to populate in GuestPCIAddress. Your patch populates garbage instead; but I'm wondering if it would be better to instead mark pci-controller as optional, where code that CAN populate it would set has_pci_controller=true, and the code that crashed will now work by leaving the struct NULL (and has_pci_controller=false). But that removes output that the client may expect to be present, hence, this is a back-compat question of the best way to approach this.
On Fri, Jun 29, 2018 at 12:44 AM, Eric Blake <eblake@redhat.com> wrote: > On 06/26/2018 10:10 AM, Sameeh Jubran wrote: > >> From: Sameeh Jubran <sjubran@redhat.com> >> >> The fsinfo command is currently implemented for Windows only and it's disk >> parameter can be enabled by adding the define "CONFIG_QGA_NTDDSCSI" to >> the qga >> code. When enabled and executed the qemu-ga crashed with the following >> message: >> >> ------------------------------------------------ >> File qapi/qapi-visit-core.c, Line 49 >> >> Expression: !(v->type & VISITOR_OUTPUT) || *obj) >> ------------------------------------------------ >> >> After some digging, turns out that the GuestPCIAddress is null and the >> qapi visitor doesn't like that, so we can always allocate it instead and >> initiate all it's members to -1. >> > > Adding Markus for a qapi back-compat question. > > Is faking an invalid address better than making the output optional > instead? > > +++ b/qga/commands-win32.c >> @@ -485,6 +485,11 @@ static GuestPCIAddress *get_pci_info(char *guid, >> Error **errp) >> char *buffer = NULL; >> GuestPCIAddress *pci = NULL; >> char *name = g_strdup(&guid[4]); >> + pci = g_malloc0(sizeof(*pci)); >> + pci->domain = -1; >> + pci->slot = -1; >> + pci->function = -1; >> + pci->bus = -1; >> > > Right now, we have: > > ## > # @GuestDiskAddress: > # > # @pci-controller: controller's PCI address > # @bus-type: bus type > # @bus: bus id > # @target: target id > # @unit: unit id > # > # Since: 2.2 > ## > { 'struct': 'GuestDiskAddress', > 'data': {'pci-controller': 'GuestPCIAddress', > 'bus-type': 'GuestDiskBusType', > 'bus': 'int', 'target': 'int', 'unit': 'int'} } > > and the problem you ran into is that under certain conditions, we have no > idea what to populate in GuestPCIAddress. Your patch populates garbage > instead; but I'm wondering if it would be better to instead mark > pci-controller as optional, where code that CAN populate it would set > has_pci_controller=true, and the code that crashed will now work by leaving > the struct NULL (and has_pci_controller=false). But that removes output > that the client may expect to be present, hence, this is a back-compat > question of the best way to approach this. Since all of the fields are ints, I believe that "-1" is very informative even though I don't like this approach but it does preserve backward compatibility as you've already clarified. I don't want to assume anything but this bug has been laying around for quite a while and no one complained, moreover, the disk option is not enabled by default in upstream code so I don't think that backward compatibility is actually disturbed but then again this is just an assumption. One more thing to consider is the second patch of this series which actually was the root cause of why we didn't allocate the " GuestDiskAddress" struct which is some registry keys being absent for the specific device which would leave us with two options: 1. Don not return anything for all of the fields (leave it null) 2. Return partial information I think that the second option is preferable for obvious reasons. I am not that familiar with schema files, but it is possible to make int fields optional too? I can live with either approaches, maintainers, it's your call :) Thanks for the review! > > -- > Eric Blake, Principal Software Engineer > Red Hat, Inc. +1-919-301-3266 > Virtualization: qemu.org | libvirt.org >
Quoting Sameeh Jubran (2018-06-28 18:33:58) > > > On Fri, Jun 29, 2018 at 12:44 AM, Eric Blake <eblake@redhat.com> wrote: > > On 06/26/2018 10:10 AM, Sameeh Jubran wrote: > > From: Sameeh Jubran <sjubran@redhat.com> > > The fsinfo command is currently implemented for Windows only and it's > disk > parameter can be enabled by adding the define "CONFIG_QGA_NTDDSCSI" to > the qga > code. When enabled and executed the qemu-ga crashed with the following > message: > > ------------------------------------------------ > File qapi/qapi-visit-core.c, Line 49 > > Expression: !(v->type & VISITOR_OUTPUT) || *obj) > ------------------------------------------------ > > After some digging, turns out that the GuestPCIAddress is null and the > qapi visitor doesn't like that, so we can always allocate it instead > and > initiate all it's members to -1. > > > Adding Markus for a qapi back-compat question. > > Is faking an invalid address better than making the output optional > instead? > > > +++ b/qga/commands-win32.c > @@ -485,6 +485,11 @@ static GuestPCIAddress *get_pci_info(char *guid, > Error **errp) > char *buffer = NULL; > GuestPCIAddress *pci = NULL; > char *name = g_strdup(&guid[4]); > + pci = g_malloc0(sizeof(*pci)); > + pci->domain = -1; > + pci->slot = -1; > + pci->function = -1; > + pci->bus = -1; > > > Right now, we have: > > ## > # @GuestDiskAddress: > # > # @pci-controller: controller's PCI address > # @bus-type: bus type > # @bus: bus id > # @target: target id > # @unit: unit id > # > # Since: 2.2 > ## > { 'struct': 'GuestDiskAddress', > 'data': {'pci-controller': 'GuestPCIAddress', > 'bus-type': 'GuestDiskBusType', > 'bus': 'int', 'target': 'int', 'unit': 'int'} } > > and the problem you ran into is that under certain conditions, we have no > idea what to populate in GuestPCIAddress. Your patch populates garbage > instead; but I'm wondering if it would be better to instead mark > pci-controller as optional, where code that CAN populate it would set > has_pci_controller=true, and the code that crashed will now work by leaving > the struct NULL (and has_pci_controller=false). But that removes output > that the client may expect to be present, hence, this is a back-compat > question of the best way to approach this. > > Since all of the fields are ints, I believe that "-1" is very informative even > though I don't like this approach but it does preserve backward compatibility > as you've already clarified. > > I don't want to assume anything but this bug has been laying around for quite a > while and no one complained, moreover, the disk option is not enabled by > default in upstream code so I don't think that backward compatibility is > actually disturbed but then again this is just an assumption. > > One more thing to consider is the second patch of this series which actually > was the root cause of why we didn't allocate the " GuestDiskAddress" struct > which is some registry keys being absent for the specific device which would > leave us with two options: > 1. Don not return anything for all of the fields (leave it null) > 2. Return partial information > I think that the second option is preferable for obvious reasons. > > I am not that familiar with schema files, but it is possible to make int fields > optional too? > > I can live with either approaches, maintainers, it's your call :) IMO both approaches potentially break clients, but making the field optional in this case would be likely to trigger a segfault for any code that relies on GuestPCIAddress being populated, whereas -1 would likely just propagate through the stack as a signed integer as the schema calls for. Lack of -1/undefined handling further up the stack could still cause breakage but the severity seems lesser. I'm inclined to go this route for 2.13. If there are no objections I plan to include this in a pull for 2.13-rc1. > > Thanks for the review! > > > -- > Eric Blake, Principal Software Engineer > Red Hat, Inc. +1-919-301-3266 > Virtualization: qemu.org | libvirt.org > > > > > -- > Respectfully, > Sameeh Jubran > Linkedin > Software Engineer @ Daynix.
I've found another bug which is related to this one, where pci_controller might also be null: in the function "build_guest_disk_info" in "qga/commands-win32.c" if (bus == BusTypeScsi || bus == BusTypeAta || bus == BusTypeRAID #if (_WIN32_WINNT >= 0x0600) /* This bus type is not supported before Windows Server 2003 SP1 */ || bus == BusTypeSas #endif ) { /* We are able to use the same ioctls for different bus types * according to Microsoft docs * https://technet.microsoft.com/en-us/library/ee851589(v=ws.10).aspx */ if (DeviceIoControl(vol_h, IOCTL_SCSI_GET_ADDRESS, NULL, 0, scsi_ad, sizeof(SCSI_ADDRESS), &len, NULL)) { disk->unit = addr.Lun; disk->target = addr.TargetId; disk->bus = addr.PathId; disk->pci_controller = get_pci_info(name, errp); } /* We do not set error in this case, because we still have enough * information about volume. */ } else { disk->pci_controller = NULL; } This can be easily prevented by always calling "get_pci_info(name, errp);", is this a separate patch, or should be combined with this one? what do you think? On Tue, Jul 10, 2018 at 1:58 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote: > Quoting Sameeh Jubran (2018-06-28 18:33:58) > > > > > > On Fri, Jun 29, 2018 at 12:44 AM, Eric Blake <eblake@redhat.com> wrote: > > > > On 06/26/2018 10:10 AM, Sameeh Jubran wrote: > > > > From: Sameeh Jubran <sjubran@redhat.com> > > > > The fsinfo command is currently implemented for Windows only and > it's > > disk > > parameter can be enabled by adding the define > "CONFIG_QGA_NTDDSCSI" to > > the qga > > code. When enabled and executed the qemu-ga crashed with the > following > > message: > > > > ------------------------------------------------ > > File qapi/qapi-visit-core.c, Line 49 > > > > Expression: !(v->type & VISITOR_OUTPUT) || *obj) > > ------------------------------------------------ > > > > After some digging, turns out that the GuestPCIAddress is null > and the > > qapi visitor doesn't like that, so we can always allocate it > instead > > and > > initiate all it's members to -1. > > > > > > Adding Markus for a qapi back-compat question. > > > > Is faking an invalid address better than making the output optional > > instead? > > > > > > +++ b/qga/commands-win32.c > > @@ -485,6 +485,11 @@ static GuestPCIAddress *get_pci_info(char > *guid, > > Error **errp) > > char *buffer = NULL; > > GuestPCIAddress *pci = NULL; > > char *name = g_strdup(&guid[4]); > > + pci = g_malloc0(sizeof(*pci)); > > + pci->domain = -1; > > + pci->slot = -1; > > + pci->function = -1; > > + pci->bus = -1; > > > > > > Right now, we have: > > > > ## > > # @GuestDiskAddress: > > # > > # @pci-controller: controller's PCI address > > # @bus-type: bus type > > # @bus: bus id > > # @target: target id > > # @unit: unit id > > # > > # Since: 2.2 > > ## > > { 'struct': 'GuestDiskAddress', > > 'data': {'pci-controller': 'GuestPCIAddress', > > 'bus-type': 'GuestDiskBusType', > > 'bus': 'int', 'target': 'int', 'unit': 'int'} } > > > > and the problem you ran into is that under certain conditions, we > have no > > idea what to populate in GuestPCIAddress. Your patch populates > garbage > > instead; but I'm wondering if it would be better to instead mark > > pci-controller as optional, where code that CAN populate it would set > > has_pci_controller=true, and the code that crashed will now work by > leaving > > the struct NULL (and has_pci_controller=false). But that removes > output > > that the client may expect to be present, hence, this is a > back-compat > > question of the best way to approach this. > > > > Since all of the fields are ints, I believe that "-1" is very > informative even > > though I don't like this approach but it does preserve backward > compatibility > > as you've already clarified. > > > > I don't want to assume anything but this bug has been laying around for > quite a > > while and no one complained, moreover, the disk option is not enabled by > > default in upstream code so I don't think that backward compatibility is > > actually disturbed but then again this is just an assumption. > > > > One more thing to consider is the second patch of this series which > actually > > was the root cause of why we didn't allocate the " GuestDiskAddress" > struct > > which is some registry keys being absent for the specific device which > would > > leave us with two options: > > 1. Don not return anything for all of the fields (leave it null) > > 2. Return partial information > > I think that the second option is preferable for obvious reasons. > > > > I am not that familiar with schema files, but it is possible to make int > fields > > optional too? > > > > I can live with either approaches, maintainers, it's your call :) > > IMO both approaches potentially break clients, but making the field > optional in this case would be likely to trigger a segfault for any code > that relies on GuestPCIAddress being populated, whereas -1 would likely > just propagate through the stack as a signed integer as the schema calls > for. Lack of -1/undefined handling further up the stack could still > cause breakage but the severity seems lesser. I'm inclined to go this > route for 2.13. If there are no objections I plan to include this in a > pull for 2.13-rc1. > > > > > Thanks for the review! > > > > > > -- > > Eric Blake, Principal Software Engineer > > Red Hat, Inc. +1-919-301-3266 > > Virtualization: qemu.org | libvirt.org > > > > > > > > > > -- > > Respectfully, > > Sameeh Jubran > > Linkedin > > Software Engineer @ Daynix. > >
diff --git a/qga/commands-win32.c b/qga/commands-win32.c index 2d48394748..c5f1c884e1 100644 --- a/qga/commands-win32.c +++ b/qga/commands-win32.c @@ -485,6 +485,11 @@ static GuestPCIAddress *get_pci_info(char *guid, Error **errp) char *buffer = NULL; GuestPCIAddress *pci = NULL; char *name = g_strdup(&guid[4]); + pci = g_malloc0(sizeof(*pci)); + pci->domain = -1; + pci->slot = -1; + pci->function = -1; + pci->bus = -1; if (!QueryDosDevice(name, dev_name, ARRAY_SIZE(dev_name))) { error_setg_win32(errp, GetLastError(), "failed to get dos device name"); @@ -556,7 +561,6 @@ static GuestPCIAddress *get_pci_info(char *guid, Error **errp) func = addr & 0x0000FFFF; dev = (addr >> 16) & 0x0000FFFF; - pci = g_malloc0(sizeof(*pci)); pci->domain = dev; pci->slot = slot; pci->function = func;