diff mbox

[RFC] qemu: fix hot remove assigned device

Message ID 715D42877B251141A38726ABF5CABF2C054596A9FF@pdsmsx503.ccr.corp.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Han, Weidong June 10, 2009, 7:45 a.m. UTC
Gerd Hoffmann wrote:
> On 06/09/09 16:51, Paul Brook wrote:
>> On Tuesday 09 June 2009, Han, Weidong wrote:
>>> Paul Brook wrote:
>>>> On Monday 08 June 2009, Weidong Han wrote:
>>>>> When hot remove an assigned device, segmentation fault was
>>>>> triggered by qemu_free(&pci_dev->qdev) in pci_unregister_device().
>>>>> pci_register_device() doesn't initialize or set pci_dev->qdev.
>>>>> For an assigned device, qdev variable isn't touched at all. So
>>>>> segmentation fault happens when to free a non-initialized qdev.
>>>> Better would be to just disable hot remove for devices still using
>>>> the legacy pci_register_device API.
>>> PCI passthrough uses pci_register_device to register assigned
>>> device to qemu. Is there newer API to do so?
>> 
>> Yes. See e.g. LSI scsi emulation.
> 
> Well.  Except that you can't (yet) register pci config read/write
> callbacks using the qdev-based API.
> 

So pci passthrough have to use pci_register_device now. I cooked a new patch (post below) to fix this issue. Thanks.


When hot remove an assigned device, segmentation fault was triggered
by qdev_free(&pci_dev->qdev) in pci_unregister_device().
pci_register_device() doesn't initialize or set pci_dev->qdev. For an
assigned device, qdev variable isn't touched at all. So segmentation
fault happens when to free a non-initialized qdev.

This patch adds a parameter in pci_unregister_device to check if
it's an assigned device. For assgined device, free pci_dev instead of
qdev. No changes for other devices.


Signed-off-by: Weidong Han <weidong.han@intel.com>
---
 hw/device-assignment.c |    2 +-
 hw/pci-hotplug.c       |    2 +-
 hw/pci.c               |    8 ++++++--
 hw/pci.h               |    2 +-
 4 files changed, 9 insertions(+), 5 deletions(-)

Comments

Avi Kivity June 10, 2009, 8:06 a.m. UTC | #1
Han, Weidong wrote:
>  
> -int pci_unregister_device(PCIDevice *pci_dev)
> +int pci_unregister_device(PCIDevice *pci_dev, int assigned)
>  {
>      int ret = 0;
>  
> @@ -377,7 +377,11 @@ int pci_unregister_device(PCIDevice *pci_dev)
>      qemu_free_irqs(pci_dev->irq);
>      pci_irq_index--;
>      pci_dev->bus->devices[pci_dev->devfn] = NULL;
> -    qdev_free(&pci_dev->qdev);
> +
> +    if (assigned)
> +        qemu_free(pci_dev);
> +    else
> +        qdev_free(&pci_dev->qdev);
>      return 0;
>  }
>   

Can you check pci_dev->qdev instead of assigned?  A little less ugly.
Han, Weidong June 10, 2009, 8:31 a.m. UTC | #2
Avi Kivity wrote:
> Han, Weidong wrote:
>> 
>> -int pci_unregister_device(PCIDevice *pci_dev)
>> +int pci_unregister_device(PCIDevice *pci_dev, int assigned)  {
>>      int ret = 0;
>> 
>> @@ -377,7 +377,11 @@ int pci_unregister_device(PCIDevice *pci_dev)
>>      qemu_free_irqs(pci_dev->irq);
>>      pci_irq_index--;
>>      pci_dev->bus->devices[pci_dev->devfn] = NULL;
>> -    qdev_free(&pci_dev->qdev);
>> +
>> +    if (assigned)
>> +        qemu_free(pci_dev);
>> +    else
>> +        qdev_free(&pci_dev->qdev);
>>      return 0;
>>  }
>> 
> 
> Can you check pci_dev->qdev instead of assigned?  A little less ugly.

I tried to find an easy and clean way to check it, but I found the members of struct PCIDevice and DeviceState were not suitable for this checking, and qdev is not pointer in struct PCIDevice.

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity June 10, 2009, 8:42 a.m. UTC | #3
Han, Weidong wrote:
> I tried to find an easy and clean way to check it, but I found the members of struct PCIDevice and DeviceState were not suitable for this checking, and qdev is not pointer in struct PCIDevice.
>   

Yes, of course.  I applied the patch, thanks.
Gerd Hoffmann June 10, 2009, 8:49 a.m. UTC | #4
On 06/10/09 10:31, Han, Weidong wrote:
> Avi Kivity wrote:
>> Can you check pci_dev->qdev instead of assigned?  A little less
>> ugly.
>
> I tried to find an easy and clean way to check it, but I found the
> members of struct PCIDevice and DeviceState were not suitable for
> this checking, and qdev is not pointer in struct PCIDevice.

Well, certain DeviceState pointers (type for example) are set only in 
case the device was created using the qdev api.  I think you certainly 
should get away without adding a new parameter to pci_unregister_device.

Also I've just sent a patch to the qemu-devel fixing the qdev register 
API for pci devices, allowing to register config space callbacks.  Once 
this is merged you can switch pci passthrough to the new qdev API.

cheers,
   Gerd

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Han, Weidong June 10, 2009, 8:55 a.m. UTC | #5
Gerd Hoffmann wrote:
> On 06/10/09 10:31, Han, Weidong wrote:
>> Avi Kivity wrote:
>>> Can you check pci_dev->qdev instead of assigned?  A little less
>>> ugly.
>> 
>> I tried to find an easy and clean way to check it, but I found the
>> members of struct PCIDevice and DeviceState were not suitable for
>> this checking, and qdev is not pointer in struct PCIDevice.
> 
> Well, certain DeviceState pointers (type for example) are set only in
> case the device was created using the qdev api.  I think you certainly
> should get away without adding a new parameter to
> pci_unregister_device. 
> 
> Also I've just sent a patch to the qemu-devel fixing the qdev register
> API for pci devices, allowing to register config space callbacks. 
> Once this is merged you can switch pci passthrough to the new qdev
> API. 
> 

Good. Extending the qdev APIs is the most elegant solution. Thanks.

Regards,
Weidong
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index 624d15a..65920d0 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -581,7 +581,7 @@  static void free_assigned_device(AssignedDevInfo *adev)
             dev->real_device.config_fd = 0;
         }
 
-        pci_unregister_device(&dev->dev);
+        pci_unregister_device(&dev->dev, 1);
 #ifdef KVM_CAP_IRQ_ROUTING
         free_dev_irq_entries(dev);
 #endif
diff --git a/hw/pci-hotplug.c b/hw/pci-hotplug.c
index d7c8b84..18a4912 100644
--- a/hw/pci-hotplug.c
+++ b/hw/pci-hotplug.c
@@ -271,6 +271,6 @@  void pci_device_hot_remove_success(int pcibus, int slot)
         break;
     }
 
-    pci_unregister_device(d);
+    pci_unregister_device(d, 0);
 }
 
diff --git a/hw/pci.c b/hw/pci.c
index 25581a4..35fd064 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -363,7 +363,7 @@  static void pci_unregister_io_regions(PCIDevice *pci_dev)
     }
 }
 
-int pci_unregister_device(PCIDevice *pci_dev)
+int pci_unregister_device(PCIDevice *pci_dev, int assigned)
 {
     int ret = 0;
 
@@ -377,7 +377,11 @@  int pci_unregister_device(PCIDevice *pci_dev)
     qemu_free_irqs(pci_dev->irq);
     pci_irq_index--;
     pci_dev->bus->devices[pci_dev->devfn] = NULL;
-    qdev_free(&pci_dev->qdev);
+
+    if (assigned)
+        qemu_free(pci_dev);
+    else
+        qdev_free(&pci_dev->qdev);
     return 0;
 }
 
diff --git a/hw/pci.h b/hw/pci.h
index f2dccb5..f658e78 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -199,7 +199,7 @@  PCIDevice *pci_register_device(PCIBus *bus, const char *name,
                                int instance_size, int devfn,
                                PCIConfigReadFunc *config_read,
                                PCIConfigWriteFunc *config_write);
-int pci_unregister_device(PCIDevice *pci_dev);
+int pci_unregister_device(PCIDevice *pci_dev, int assigned);
 
 void pci_register_io_region(PCIDevice *pci_dev, int region_num,
                             uint32_t size, int type,